iskradb

triple-store lattice database - C=3 Bethe lattice with Bnoun/Bverb/Bmodifier branches, B-tree indexed per branch

git clone https://git.smesh.lol/iskradb.git

iskradb

Multi-branch B-tree lattice. Eight independent ordered trees share a single record pool and cross-link via stable record indices.

Built in Moxie. Module: git.smesh.lol/iskradb.

Architecture

Eight branches

The lattice has C=8 branches, one per semantic axis:

ConstantIndexAxisAlias
Bsemantic0ontological category (animacy, abstractness)-
Bgrammatical1syntactic role (nominal/verbal/function)Bnoun
Bcooccur2PVN co-occurrence context-
Bmorphology3morphological state (tense/polarity/aspect)Bverb
Bpragmatic4domain/register contextBmodifier
Bphonology5phonological (dormant)-
Bvalency6argument count-
Bregister7social register-

Bnoun, Bverb, Bmodifier are aliases retained for backward compatibility. Each branch is an independent B-tree with coordination number C=8 (each node holds up to 8 keys).

Node layout (200 bytes)

Keys     [8]Key       128 bytes   8 × 128-bit SipHash keys
RecPtrs  [8]uint32     32 bytes   record index per key slot
Children [9]uint32     36 bytes   child node indices
Mult     uint8          1 byte    expansion multiplier (compact)
Flags    uint8          1 byte    bits 4-7: key count, bits 1-3: branch, bit 0: deleted
_pad     [2]byte

Record layout (48 bytes)

DataFile uint32    4 bytes   overflow file selector / flag bits
DataOff  uint32    4 bytes   offset into string pool (overflow path)
DataLen  uint32    4 bytes   data length
Link     [2]uint32 8 bytes   cross-branch record indices
Branch   uint8     1 byte    packed POS + register + domain + special
_rpad    [3]byte
Inline   [24]byte  24 bytes  inline data store (up to 23 bytes + 1 length byte)

Records are stable across B-tree splits. Nodes move during splits and compaction; records do not. Cross-branch links (Link[0], Link[1]) use record indices and therefore never go stale.

Key

type Key [2]uint64   // 128-bit SipHash

Keys are opaque hashes. The default key derivation is:

lattice.HashKey(data []byte) Key   // SipHash(defaultSeed, data)

Callers that need domain separation prepend a domain byte before hashing (e.g. transdb prepends lang || coord_8bytes_LE before the word bytes).


Storage modes

In-memory

t := lattice.NewTree(cap int) *Tree

Allocates node and record slices in memory. No flush/close needed. Used for build pipelines, tests, and consolidation passes.

Flat binary (transdb format)

Not part of the iskradb API directly - transdb's ingest.SaveFlat/LoadFlat serialize to a flat binary:

[4B] nodeCount
[4B] recCount
[C×4B] roots
[4B] poolLen
[nodeCount × 200B] nodes
[recCount × 48B] records
[poolLen B] string pool
[recCount × 20B] RecKey entries: recIdx(4) + key(16)

Disk-backed (page cache)

t, err := lattice.Create(path string) (*Tree, error)
t, err := lattice.Open(path string) (*Tree, error)
err := t.Flush() error
err := t.Close() error

Uses a PageCache with dirty-page tracking. B-tree operations are identical; the cache layer intercepts node and record access.


Core operations

Insert

t.Insert(branch Branch, key Key, rec Record) uint32          // returns node index
t.InsertRec(branch Branch, key Key, rec Record) uint32       // returns record index

InsertRec resets Link[0] and Link[1] to NullRec after copy. Wire links after insertion.

Lookup

t.Lookup(branch Branch, key Key) *Record                     // returns nil if not found
t.LookupRecIdx(branch Branch, key Key) uint32                // returns NullRec if not found
t.GetRecord(idx uint32) *Record
t.GetNode(idx uint32) *Node

LookupRecIdx is the primary lookup path. Get the record index, then GetRecord to access fields including Link pointers.

Delete

t.Delete(branch Branch, key Key)

Marks the node deleted; does not reclaim memory immediately. Call Compact to reclaim.

Walk

t.Walk(branch Branch, key Key) (nodeIdx uint32, pos int, found bool)
t.WalkPath(branch Branch, key Key) []uint32   // node indices from root to target

Cross-branch links

Records carry two link slots (Link[0], Link[1]) pointing to record indices in other branches. Because record indices are stable across splits, links remain valid after any structural change to the B-tree.

CrossWalk

t.CrossWalk(startRecIdx uint32, pattern WalkPattern, visit func(uint32, *Record) bool)

Follows links according to a WalkPattern. Built-in patterns:

PatNounVerbMod  // noun → verb → modifier → stop
PatNounMod      // noun → modifier → stop
PatVerbNoun     // verb → noun → stop
PatFull         // noun → verb → modifier → noun (cycle, max 3 hops)

branchToLinkIndex determines which Link slot corresponds to a given transition:

  • noun → verb: Link[0]; noun → modifier: Link[1]
  • verb → noun: Link[0]; verb → modifier: Link[1]
  • modifier → noun: Link[0]; modifier → verb: Link[1]

Triple insert

t.InsertTriple(nounKey, verbKey, modKey Key, nRec, vRec, mRec Record)

Inserts three records (one per branch) and wires their cross-links.


TritPath

type TritPath uint32   // up to 16 trits, 2 bits each

Encodes a path through the lattice coordinate space. Trit 0 = branch selector (1=noun, 2=verb, 3=modifier). Trits 1-15 encode the path within the branch.

MakeTritPath(trits ...uint8) TritPath
BranchTritPath(b Branch) TritPath
p.Depth() int
p.Branch() uint8
p.Trit(i int) uint8
Parent(p TritPath) TritPath
Child(p TritPath, dir uint8) TritPath
LCA(a, b TritPath) TritPath     // longest common prefix
Distance(a, b TritPath) int     // hop count via LCA
Resolve(p TritPath, mult uint8) TritPath  // expand path with multiplier

Used for coordinate-space navigation and subtree addressing. Not required for basic insert/lookup.


Transfer operations

SubLattice

A SubLattice is a self-contained sub-tree extracted from a parent, with remapped record indices and preserved cross-links.

type SubLattice struct {
    Tree      *Tree
    OriginKey Key
    SrcBranch Branch
}

Extract

sl := t.ExtractBranch(branch Branch) *SubLattice
sl := t.ExtractSubtree(rootNodeIdx uint32, branch Branch, filter ExportFilter) *SubLattice
sl := t.ExtractLinked(startRecIdx uint32) *SubLattice

ExtractLinked follows Link pointers from startRecIdx across branches and collects all reachable records. Cross-links are remapped to new indices in the extracted tree.

ExportFilter is func(nodeIdx uint32, n *Node, recIdx uint32, r *Record) bool — return false to exclude a record from the extract.

Save/Load

err := sl.SaveFile(path string) error
sl, err := lattice.LoadSubLattice(path string) (*SubLattice, error)

Graft

inserted := t.Graft(sl *SubLattice) int   // returns count of inserted records

Merges a SubLattice into an existing tree. Deduplicates on key: if a key already exists in the target branch, the record is skipped. Cross-links are remapped to target indices after all records are inserted.


Compaction

t.NeedsCompact() bool
reclaimed := t.Compact() int

Reclaims free nodes after deletions by moving live nodes into the gaps and updating all child pointers. Also resolves any pending expansion entries. Safe to call at any time; the tree remains fully operational before and after.


RecKey map

t.RecKey map[uint32]Key   // record index → key

Maintained automatically on every Insert/InsertRec. Enables reverse lookup (record index → key) without B-tree traversal. Used by consolidation passes that need to re-insert records into a fresh tree while preserving keys.


Usage in transdb

transdb uses iskradb as its core storage engine. Key conventions:

Key derivation: SipHash(defaultSeed, [lang(1) || coord(8 LE) || word(...)]) where coord is a 64-bit packed bitfield encoding semantic category, morphological state, grammatical role, and context axes.

Active branches: Bnoun (nouns/adjectives), Bverb (verbs, conjugated forms at morph coord), Bmodifier (particles, function words). Bcooccur is repurposed as metadata storage for lang descriptors, particle role maps, and verb class registrations.

Record.Branch byte: repurposed from the lattice's POS-branch encoding to carry additional metadata (PackBranch(pos, reg, dom, spec) packs POS in bits 0-2, register in bits 3-4, domain in bits 5-6, honorific in bit 7). For Bcooccur metadata records, the Branch byte carries role codes, class codes, or descriptor bits directly.

Record.DataFile bits: transdb packs morph state (bits 1-5) and semantic flags (bits 6-21) into the DataFile field of lattice records, using the field as dense auxiliary storage rather than a file pointer (since transdb uses a flat binary format, not the disk-backed page cache).

Cross-links: rec.Link[0] on a JA verb record points to the primary EN translation record. rec.Link[1] points to a secondary translation. The TranslateWithClusters pipeline traverses these links during cluster translation.

files