
Plan: Moxie Allocation Instrumentation + Smesh Memory Fix

Version 3 (final). All review feedback resolved.

Problem

Smesh web app has runaway memory growth. WASM target uses gc_leaking (bump allocator, never frees). ~15 unbounded maps in web/wasm/app/main.mx, DOM handle leaks in jsruntime/dom.mjs, no eviction anywhere. Every allocation is permanent for the lifetime of the WASM instance.

Key Architecture Facts

Phase 1: Compiler Instrumentation (LLVM path)

1A: Wire -print-allocs into the compiler

Flag already exists (main.go:453, compileopts/options.go:41) but is not connected to the compiler backend.

File: /home/mleku/s/moxie/compiler/compiler.go

At each allocation site, when containing function matches PrintAllocs regex, emit LLVM IR call to runtime.logAlloc(siteID, size) after the runtime.alloc call.

Four sites:

  1. *ssa.Alloc (line 2172) - struct/pointer heap allocation
  2. *ssa.MakeSlice (line 2430) - slice allocation
  3. *ssa.MakeMap (compiler/map.go) - map allocation
  4. *ssa.MakeChan (compiler/channel.go) - channel allocation

1B: Runtime counter array

New file: /home/mleku/s/moxie/src/runtime/alloc_trace.mx

var allocCounters [N]uint64    // indexed by site ID, N set by compiler
var allocSizes [N]uint64       // cumulative bytes per site
var allocSiteNames [N]string   // populated at init by compiler-generated code
var nextSiteID int32

func logAlloc(siteID int32, size uintptr) {
    allocCounters[siteID]++
    allocSizes[siteID] += uint64(size)
}

Cost: two array-element increments per allocation (count and cumulative bytes), a handful of instructions. No string formatting and no JS bridge crossing during execution.
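A minimal sketch of how the counters might be read back once a run finishes, assuming a small fixed `numSites` in place of the compiler-provided N. The `siteStat` type, the `topSitesByBytes` helper and the site names are hypothetical and only illustrate ranking sites after the fact.

```go
package main

import (
	"fmt"
	"sort"
)

const numSites = 4 // stands in for the compiler-provided N

var (
	allocCounters  [numSites]uint64
	allocSizes     [numSites]uint64
	allocSiteNames = [numSites]string{"a.mx:10", "a.mx:20", "b.mx:5", "b.mx:9"}
)

// logAlloc mirrors the runtime hook: two increments, no formatting.
func logAlloc(siteID int32, size uintptr) {
	allocCounters[siteID]++
	allocSizes[siteID] += uint64(size)
}

// siteStat is a hypothetical per-site summary row.
type siteStat struct {
	Name  string
	Count uint64
	Bytes uint64
}

// topSitesByBytes returns up to n sites that allocated at least once,
// ordered by cumulative bytes, largest first.
func topSitesByBytes(n int) []siteStat {
	var stats []siteStat
	for i := range allocCounters {
		if allocCounters[i] == 0 {
			continue
		}
		stats = append(stats, siteStat{allocSiteNames[i], allocCounters[i], allocSizes[i]})
	}
	sort.Slice(stats, func(i, j int) bool { return stats[i].Bytes > stats[j].Bytes })
	if len(stats) > n {
		stats = stats[:n]
	}
	return stats
}

func main() {
	logAlloc(0, 64)
	logAlloc(2, 4096)
	logAlloc(0, 64)
	for _, s := range topSitesByBytes(2) {
		fmt.Printf("%s count=%d bytes=%d\n", s.Name, s.Count, s.Bytes)
	}
	// b.mx:5 count=1 bytes=4096
	// a.mx:10 count=2 bytes=128
}
```

All sorting and formatting happens at read time, which keeps the hot path limited to the two increments.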

1C: Jsbridge export for WASM target

1D: JS-side instrumentation (parallel with 1A-1C)

Guarded by globalThis.__moxie_instrument = true:

Phase 2: Instrumented Smesh Build

# Instrumented app WASM
MOXIEROOT=../moxie GOWORK=off ../moxie/moxie build -target wasm -print-allocs "web/wasm/app" \
  -o web/static/app.wasm ./web/wasm/app/

# Instrumented relay
MOXIEROOT=../moxie ../moxie/moxie build -print-allocs ".*" -o /tmp/smesh-instrumented .

Add a SMESH_TEST_HEADERS=1 env var check to main.mx so that COOP/COEP headers are emitted in test mode only.

Phase 3: Selenium Test Harness

3A: Memory profiling fixtures in test/conftest.py

3B: Test scenarios in test/test_memory_profile.py

  1. Baseline - load app, record initial totalAlloc and mallocs
  2. Feed scroll 500 events - measure bytes/event, mallocs/event
  3. Profile cycling x20 - verify rotate-map bounds working set
  4. Thread open/close x50 - verify threadEvents cleanup
  5. Idle 30 seconds - subscriptions explicitly CLOSEd, 5s settling period, then measure rate
  6. DOM handle count - assert _elements.size < 3000 throughout all scenarios
  7. Heap threshold - push past threshold, verify graceful restart

3C: Idle measurement definition

"Idle" = relay-proxy has no active subscriptions + 5s settling period since last EOSE. Test sends explicit CLOSE for all sub IDs ("feed", "ntf", etc.), waits 5s, then measures for 30s. Test relay is controlled (seeded dataset, no push after CLOSE). Assertion: idle_rate < 1024 bytes/sec.

3D: Output format

JSON per test run with initial/final heap, delta bytes, mallocs, top allocation sites by count/bytes, DOM handle count.

Phase 4: Fix the Leaks

4A: Rotate-map on all unbounded caches

Uniform pattern, no LRU data structure needed. No reflect, no generics. ~15 lines per cache group.

| Cache group | Maps | Max entries | Rotate together |
|---|---|---|---|
| Event data | eventCache, seenEvents, noteElements, noteReactionMap, noteRepostCounts, noteZapSeen, actionFetchedIDs | 2000 | Yes (event ID keyed) |
| Author metadata | authorNames, authorPics, authorContent, authorTs, authorRelays, fetchedK0, fetchedK10k | 500 | Yes (pubkey keyed) |
| Embeds | embedCallbacks, embedRequested, embedSubIDs | 200 | Yes |
| Reply cache | replyCache, replyAvatarCache, replyLineCache, replyPending, replyNeedName, replyQueue | 500 | Yes |
| Profile wrappers | profileWrappers, profileWrapOrder | 10 | Yes |
| Thread | threadEvents | N/A | Cleared in closeNoteThread() already |

Important: All *Old map vars must be initialized to empty maps at startup (not nil). On first rotation, range over nil map is safe (zero iterations) but silently drops all current handles without releasing them. Initialize alongside their primary counterparts.
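The rotate-map pattern from 4A can be sketched for a single cache as follows. This is an illustration under assumptions, not the code in main.mx: the accessor names `getAuthorName` and `putAuthorName` are hypothetical, the promote-on-read step is one possible policy, and the limit is shrunk to 2 so that the rotation is visible.

```go
package main

import "fmt"

// maxAuthorNames is tiny for demonstration; the plan's table uses 500.
const maxAuthorNames = 2

// Both generations start as empty (non-nil) maps.
var (
	authorNames      = map[string]string{}
	authorNamesOld   = map[string]string{}
	authorNamesCount int
)

// putAuthorName inserts into the current generation. When the current
// generation is full, it becomes the old one and the previous old
// generation is dropped.
func putAuthorName(pk, name string) {
	if _, ok := authorNames[pk]; !ok {
		if authorNamesCount >= maxAuthorNames {
			authorNamesOld = authorNames
			authorNames = map[string]string{}
			authorNamesCount = 0
		}
		authorNamesCount++
	}
	authorNames[pk] = name
}

// getAuthorName checks the current generation first, then the old one.
// A hit in the old generation is promoted so recently used entries survive
// the next rotation.
func getAuthorName(pk string) (string, bool) {
	if v, ok := authorNames[pk]; ok {
		return v, true
	}
	if v, ok := authorNamesOld[pk]; ok {
		putAuthorName(pk, v)
		return v, true
	}
	return "", false
}

func main() {
	putAuthorName("a", "Alice")
	putAuthorName("b", "Bob")
	putAuthorName("c", "Carol")  // rotates: old = {a, b}, current = {c}
	_, okA := getAuthorName("a") // found in old, promoted to current
	putAuthorName("d", "Dave")   // rotates: old = {c, a}, current = {d}
	_, okB := getAuthorName("b") // b was only in the dropped generation
	fmt.Println(okA, okB)        // true false
}
```

At most two generations exist at any time, so the number of live entries per cache stays below twice the limit without any per-entry bookkeeping.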

4B: DOM handle release on rotation

func rotateEventCaches() {
    // Release DOM handles from entries about to be dropped:
    // iterate noteElementsOld and collect handles NOT also in current noteElements.
    var ids []int32
    for id, el := range noteElementsOld {
        if _, ok := noteElements[id]; !ok {
            // Assumes dom.Element is an int32-backed handle.
            ids = append(ids, int32(el))
        }
    }
    if len(ids) > 0 {
        dom.ReleaseAll(ids)  // single jsbridge call, bulk
    }
    // Rotate all event-keyed maps together.
    eventCacheOld = eventCache
    eventCache = map[string]*nostr.Event{}
    noteElementsOld = noteElements
    noteElements = map[string]dom.Element{}
    seenEventsOld = seenEvents
    seenEvents = map[string]bool{}
    // ... same for reaction/repost/zap maps
    eventCacheCount = 0
}

4C: switchPage teardown

Insert teardownPage(activePage) in switchPage() before activePage = name (line 816). Each page case sends CLOSE messages to relay-proxy and trims page-specific caches.
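A rough sketch of the teardown ordering, under several assumptions: the page-to-subscription mapping in `pageSubs` is invented for illustration (only the sub IDs "feed" and "ntf" appear in this plan), the CLOSE wire format is taken to be a NIP-01 style `["CLOSE", <sub id>]` array, and `send` stands in for whatever transport reaches relay-proxy. Cache trimming is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pageSubs maps a page name to the subscription IDs it owns.
// The mapping is hypothetical.
var pageSubs = map[string][]string{
	"feed":          {"feed"},
	"notifications": {"ntf"},
}

var activePage = "feed"

// closeMessages builds one CLOSE message per subscription owned by page.
func closeMessages(page string) []string {
	var out []string
	for _, id := range pageSubs[page] {
		// Marshalling a []string cannot fail, so the error is ignored.
		b, _ := json.Marshal([]string{"CLOSE", id})
		out = append(out, string(b))
	}
	return out
}

// switchPage tears down the outgoing page before activePage changes, so
// the teardown still knows which page it is closing.
func switchPage(name string, send func(string)) {
	for _, m := range closeMessages(activePage) {
		send(m)
	}
	activePage = name
}

func main() {
	switchPage("notifications", func(m string) { fmt.Println("send:", m) })
	fmt.Println("active:", activePage)
	// send: ["CLOSE","feed"]
	// active: notifications
}
```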

4D: Proactive WASM restart

Strategy: option (c) - tune threshold high enough that restarts are rare.

Phase 5: Regression Gate

5A: Budget assertions

5B: CI integration

test-memory target in Makefile: builds instrumented WASM, runs memory tests.

Execution Order

1A (wire -print-allocs in compiler/compiler.go)
  -> 1B (runtime counter array in alloc_trace.mx)
    -> 1C (jsbridge export)
      -> 2 (instrumented build)
        -> 3 (selenium measurement)
          -> 4A-4C (fixes based on data)
            -> 4D (proactive restart)
              -> 5 (regression gate)

1D (JS-side dom.mjs instrumentation) -- parallel with 1A-1C, independent

Files Modified

| File | Change |
|---|---|
| /home/mleku/s/moxie/compiler/compiler.go | Wire PrintAllocs: emit logAlloc call at 4 allocation sites |
| /home/mleku/s/moxie/compiler/map.go | Wire PrintAllocs for MakeMap |
| /home/mleku/s/moxie/compiler/channel.go | Wire PrintAllocs for MakeChan |
| /home/mleku/s/moxie/src/runtime/alloc_trace.mx | New: counter array + GetAllocCounters |
| /home/mleku/s/moxie/jsruntime/dom.mjs | Threshold logging on _elements.size + releaseAll() batch function |
| /home/mleku/s/smesh/web/wasm/app/main.mx | Rotate-map caches, switchPage teardown, jsbridge for mem_stats |
| /home/mleku/s/smesh/web/static/wasm-host.mjs | Expose __moxie_mem_stats, __moxie_read_alloc_counters; proactive restart |
| /home/mleku/s/smesh/main.mx | COOP/COEP headers in test mode |
| /home/mleku/s/smesh/test/conftest.py | Memory profiling fixtures |
| /home/mleku/s/smesh/test/test_memory_profile.py | New: memory measurement tests |
| /home/mleku/s/smesh/Makefile | test-memory target |