---
name: go-memory-optimization
description: This skill should be used when optimizing Go code for memory efficiency, reducing GC pressure, implementing object pooling, analyzing escape behavior, choosing between fixed-size arrays and slices, designing worker pools, or profiling memory allocations. Provides comprehensive knowledge of Go's memory model, stack vs heap allocation, sync.Pool patterns, goroutine reuse, and GC tuning.
---
This skill provides guidance on optimizing Go programs for memory efficiency and reduced garbage collection overhead. Topics include stack allocation semantics, fixed-size types, escape analysis, object pooling, goroutine management, and GC tuning.
Prefer allocations in this order (fastest to slowest): stack allocation, reuse from a `sync.Pool`, then fresh heap allocation.
Focus memory optimization efforts on hot paths: code that runs per request, per event, or per loop iteration, where allocation rates are highest.
Avoid premature optimization. Profile first with `go tool pprof` to identify actual bottlenecks.
Arrays with known compile-time size can be stack-allocated, avoiding heap entirely:
```go
// HEAP: slice header + backing array escape to the heap
func processSlice() []byte {
	data := make([]byte, 32)
	// ... use data
	return data // escapes
}

// STACK: a fixed-size array stays on the stack if it doesn't escape
func processArray() {
	var data [32]byte // stack-allocated
	// ... use data
} // automatically cleaned up when the frame returns
```
Define types with explicit sizes for protocol fields, cryptographic values, and identifiers:
```go
import "encoding/hex"

// Binary types enforce length and enable stack allocation
type EventID [32]byte   // SHA-256 hash
type Pubkey [32]byte    // Schnorr public key
type Signature [64]byte // Schnorr signature

// Methods operate on value receivers when size permits
func (id EventID) Hex() string {
	return hex.EncodeToString(id[:])
}

func (id EventID) IsZero() bool {
	return id == EventID{} // efficient zero-value comparison
}
```
| Size | Recommendation |
|---|---|
| ≤64 bytes | Pass by value, stack-friendly |
| 65-128 bytes | Consider context; value for read-only, pointer for mutation |
| >128 bytes | Pass by pointer to avoid copy overhead |
Convert fixed arrays to slices only at API boundaries:
```go
type Hash [32]byte

func (h Hash) Bytes() []byte {
	return h[:] // creates a slice header; the array stays on the stack if h does
}

// Prefer methods that accept arrays directly
func VerifySignature(pubkey Pubkey, msg []byte, sig Signature) bool {
	// pubkey and sig are stack-allocated in the caller
	// ... verification logic elided
	return false
}
```
Variables "escape" to the heap when the compiler cannot prove their lifetime is bounded by the stack frame. Check escape behavior with:
```shell
go build -gcflags="-m -m" ./...
```
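The compiler prints one diagnostic per decision; typical output resembles the following (file paths and line numbers are illustrative):

```
./main.go:6:2: moved to heap: x
./main.go:12:9: x escapes to heap
./main.go:18:13: make([]byte, n) escapes to heap
```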
Common causes of escapes:

```go
// 1. Returning pointers to local variables
func escapesReturn() *int {
	x := 42
	return &x // x escapes
}

// 2. Storing in interface{}
func escapesInterface(x int) interface{} {
	return x // x escapes (boxed)
}

// 3. Closures capturing by reference
func escapesClosure() func() int {
	x := 42
	return func() int { return x } // x escapes
}

// 4. Slice/map with capacity unknown at compile time
func escapesDynamic(n int) []byte {
	return make([]byte, n) // escapes (size unknown at compile time)
}

// 5. Sending pointers to channels
func escapesChannel(ch chan *int) {
	x := 42
	ch <- &x // x escapes
}
```
Patterns that avoid escapes:

```go
// 1. Accept pointers, don't return them
func fillHash(result *[32]byte) {
	// the caller owns the memory; the function fills it
	copy(result[:], computeHash()) // computeHash defined elsewhere
}

// 2. Use fixed-size arrays
func useFixedArray() {
	var buf [1024]byte // known size, stack-allocated
	process(buf[:])
}

// 3. Preallocate with known capacity
func usePrealloc() {
	buf := make([]byte, 0, 1024) // may stay on the stack
	// ... append up to 1024 bytes
}

// 4. Avoid interface{} on hot paths
func noBoxing(x int) int {
	return x * 2 // no boxing
}
```
```go
var bufferPool = sync.Pool{
	New: func() interface{} {
		return make([]byte, 0, 4096)
	},
}

func processRequest(data []byte) {
	buf := bufferPool.Get().([]byte)
	buf = buf[:0] // reset length, keep capacity
	// Note: defer evaluates its argument here, so Put receives this
	// slice header; if later appends grow buf past its capacity, the
	// grown buffer is lost. Storing a slice in a Pool also allocates
	// on the interface conversion; pooling *[]byte (below) avoids both.
	defer bufferPool.Put(buf)
	// use buf...
}
```
```go
type BufferPool struct {
	pool sync.Pool
	size int
}

func NewBufferPool(size int) *BufferPool {
	return &BufferPool{
		pool: sync.Pool{
			New: func() interface{} {
				b := make([]byte, size)
				return &b
			},
		},
		size: size,
	}
}

func (p *BufferPool) Get() *[]byte {
	return p.pool.Get().(*[]byte)
}

func (p *BufferPool) Put(b *[]byte) {
	if b == nil || cap(*b) < p.size {
		return // don't pool undersized buffers
	}
	*b = (*b)[:p.size] // reset to full size
	p.pool.Put(b)
}
```
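A quick usage sketch for the sized pool (the type is repeated from above so the snippet runs standalone):

```go
package main

import (
	"fmt"
	"sync"
)

type BufferPool struct {
	pool sync.Pool
	size int
}

func NewBufferPool(size int) *BufferPool {
	p := &BufferPool{size: size}
	p.pool.New = func() interface{} {
		b := make([]byte, size)
		return &b
	}
	return p
}

func (p *BufferPool) Get() *[]byte { return p.pool.Get().(*[]byte) }

func (p *BufferPool) Put(b *[]byte) {
	if b == nil || cap(*b) < p.size {
		return // don't pool undersized buffers
	}
	*b = (*b)[:p.size] // reset to full size
	p.pool.Put(b)
}

func main() {
	pool := NewBufferPool(4096)
	b := pool.Get() // *[]byte with len == cap == 4096
	n := copy(*b, "payload")
	fmt.Println(len(*b), n) // 4096 7
	pool.Put(b) // reset and returned for reuse
}
```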
```go
// BAD: pool of pointers to small values (overhead exceeds benefit)
var intPool = sync.Pool{New: func() interface{} { return new(int) }}

// BAD: not resetting state before Put
bufPool.Put(buf) // may still contain sensitive or stale data

// BAD: pooling objects with goroutine-local state
var connPool = sync.Pool{...} // connections are stateful

// BAD: assuming pooled objects persist (the GC may clear pools at any time)
pool.Put(obj)
// ... a GC cycle runs ...
obj2 := pool.Get() // may be a brand-new object, not the one put in
```
| Use Case | Pool? | Reason |
|---|---|---|
| Buffers in HTTP handlers | Yes | High allocation rate, short lifetime |
| Encoder/decoder state | Yes | Expensive to initialize |
| Small values (<64 bytes) | No | Pointer overhead exceeds benefit |
| Long-lived objects | No | Pools are for short-lived reuse |
| Objects with cleanup needs | No | Pool provides no finalization |
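As a concrete instance of the "yes" rows above, a `bytes.Buffer` pool with reset-before-Put (a sketch, not part of the skill's API):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

var bufPool = sync.Pool{
	// New returns *bytes.Buffer: pooling a pointer avoids an extra
	// allocation when the value is converted to interface{}.
	New: func() interface{} { return new(bytes.Buffer) },
}

func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // always reset before reuse; stale data may be sensitive
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies, so the buffer is safe to pool
}

func main() {
	fmt.Println(render("world")) // hello, world
}
```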
```go
type WorkerPool struct {
	jobs    chan func()
	workers int
	wg      sync.WaitGroup
}

func NewWorkerPool(workers, queueSize int) *WorkerPool {
	p := &WorkerPool{
		jobs:    make(chan func(), queueSize),
		workers: workers,
	}
	p.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go p.worker()
	}
	return p
}

func (p *WorkerPool) worker() {
	defer p.wg.Done()
	for job := range p.jobs {
		job()
	}
}

func (p *WorkerPool) Submit(job func()) {
	p.jobs <- job
}

func (p *WorkerPool) Shutdown() {
	close(p.jobs)
	p.wg.Wait()
}
```
```go
type Semaphore struct {
	sem chan struct{}
}

func NewSemaphore(n int) *Semaphore {
	return &Semaphore{sem: make(chan struct{}, n)}
}

func (s *Semaphore) Acquire() { s.sem <- struct{}{} }
func (s *Semaphore) Release() { <-s.sem }

// Usage
sem := NewSemaphore(runtime.GOMAXPROCS(0))
for _, item := range items {
	sem.Acquire()
	go func(it Item) {
		defer sem.Release()
		process(it)
	}(item)
}
```
| Metric | Spawn per request | Worker pool |
|---|---|---|
| Goroutine creation | O(n) | O(workers) |
| Stack allocation | 2KB × n | 2KB × workers |
| Scheduler overhead | Higher | Lower |
| GC pressure | Higher | Lower |
```go
// 1. Reuse buffers across iterations
buf := make([]byte, 0, 4096)
for _, item := range items {
	buf = buf[:0] // reset length without reallocating
	buf = processItem(buf, item)
}

// 2. Preallocate slices with known capacity
result := make([]Item, 0, len(input)) // avoids append reallocations
for _, in := range input {
	result = append(result, transform(in))
}

// 3. Inline value fields instead of pointer fields
type Event struct {
	ID        [32]byte // value field, not *[32]byte
	Pubkey    [32]byte // single allocation for the entire struct
	Signature [64]byte
	Content   string // only the string data lives on the heap
}

// 4. String interning for repeated values
var kindStrings = map[int]string{
	0: "set_metadata",
	1: "text_note",
	// ...
}
```
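A lookup helper over that table might look like this (the `kindString` name and its fallback path are assumptions, repeated with the map so the sketch runs standalone):

```go
package main

import (
	"fmt"
	"strconv"
)

var kindStrings = map[int]string{
	0: "set_metadata",
	1: "text_note",
}

// kindString returns the interned name for known kinds, allocating
// a fresh string only when the kind is unknown.
func kindString(kind int) string {
	if s, ok := kindStrings[kind]; ok {
		return s // interned: zero allocations
	}
	return "kind_" + strconv.Itoa(kind) // allocates only for unknown kinds
}

func main() {
	fmt.Println(kindString(1))  // text_note
	fmt.Println(kindString(42)) // kind_42
}
```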
```go
import "runtime/debug"

func init() {
	// GOGC: target heap growth percentage (default 100).
	// Lower = more frequent GC, less memory.
	// Higher = less frequent GC, more memory.
	debug.SetGCPercent(50) // collect when the heap grows 50%

	// GOMEMLIMIT: soft memory limit (Go 1.19+).
	// The GC becomes more aggressive as the limit approaches.
	debug.SetMemoryLimit(512 << 20) // 512 MB limit
}
```
Environment variables:

```shell
GOGC=50            # more aggressive GC
GOMEMLIMIT=512MiB  # soft memory limit
GODEBUG=gctrace=1  # GC trace output
```
Note: arenas are experimental (Go 1.20, behind `GOEXPERIMENT=arenas`) and the experiment is on hold; the API may change or be removed.

```go
//go:build goexperiment.arenas

import "arena"

func processLargeDataset(data []byte) Result {
	a := arena.NewArena()
	defer a.Free() // bulk-free every allocation from this arena

	// All allocations from the arena are freed together
	items := arena.MakeSlice[Item](a, 0, 1000)
	// ... process data into items

	// Copy the result out of arena memory before Free runs
	return copyResult(items)
}
```
```go
import (
	"os"
	"runtime"
	"runtime/pprof"
)

func captureHeapProfile() {
	f, _ := os.Create("heap.prof")
	defer f.Close()
	runtime.GC() // run a collection first for an accurate picture
	pprof.WriteHeapProfile(f)
}
```
```shell
go tool pprof -http=:8080 heap.prof
go tool pprof -alloc_space heap.prof   # total allocations
go tool pprof -inuse_space heap.prof   # current usage
```
```go
func BenchmarkAllocation(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		result := processData(input)
		_ = result
	}
}
```
Output interpretation:

```
BenchmarkAllocation-8   1000000   1234 ns/op   256 B/op   3 allocs/op
                                               ↑          ↑
                                               bytes/op   allocations/op
```
```go
import (
	"fmt"
	"runtime"
	"time"
)

func printMemStats() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("Alloc:      %d MB\n", m.Alloc/1024/1024)
	fmt.Printf("TotalAlloc: %d MB\n", m.TotalAlloc/1024/1024)
	fmt.Printf("Sys:        %d MB\n", m.Sys/1024/1024)
	fmt.Printf("NumGC:      %d\n", m.NumGC)
	// PauseNs is a circular buffer; the most recent pause is at (NumGC+255)%256
	fmt.Printf("GCPause:    %v\n", time.Duration(m.PauseNs[(m.NumGC+255)%256]))
}
```
For detailed code examples and patterns, see references/patterns.md, which covers:
- Profiling with `go tool pprof`
- Escape analysis (`-gcflags="-m"`)
- Benchmarking with the `-benchmem` flag