Pubkey Graph System

Overview

The pubkey graph system provides efficient social graph queries by creating bidirectional, direction-aware edges between events and pubkeys in the ORLY relay.

Architecture

1. Pubkey Serial Assignment

Purpose: Compress 32-byte pubkeys to 5-byte serials for space efficiency.

Tables:

Space Savings: Each graph edge saves 27 bytes per pubkey reference (32 → 5 bytes).
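
As an illustration of the 5-byte serial, the sketch below packs an integer into five bytes and back. The big-endian byte order and the function names `encodeSerial` and `decodeSerial` are assumptions made for this example, not taken from the relay's code.

```go
package main

import "fmt"

// serialLen is the width of a pubkey serial in bytes, as stated above.
const serialLen = 5

// maxSerial is the largest value that fits in 5 bytes (2^40 - 1).
const maxSerial = uint64(1)<<40 - 1

// encodeSerial packs n into 5 bytes. Big-endian order is an assumption.
func encodeSerial(n uint64) ([]byte, error) {
	if n > maxSerial {
		return nil, fmt.Errorf("serial %d does not fit in %d bytes", n, serialLen)
	}
	b := make([]byte, serialLen)
	for i := serialLen - 1; i >= 0; i-- {
		b[i] = byte(n)
		n >>= 8
	}
	return b, nil
}

// decodeSerial reverses encodeSerial.
func decodeSerial(b []byte) (uint64, error) {
	if len(b) != serialLen {
		return 0, fmt.Errorf("want %d bytes, got %d", serialLen, len(b))
	}
	var n uint64
	for _, c := range b {
		n = n<<8 | uint64(c)
	}
	return n, nil
}

func main() {
	b, _ := encodeSerial(258)
	n, _ := decodeSerial(b)
	// Prints the encoded bytes, the round-tripped value, and the bytes
	// saved per reference compared with a raw 32-byte pubkey.
	fmt.Println(b, n, 32-serialLen)
}
```

Five bytes give roughly 1.1 trillion distinct serials before the space is exhausted.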

2. Graph Edge Storage

Bidirectional edges with metadata:

EventPubkeyGraph (Forward)

epg|event_serial(5)|pubkey_serial(5)|kind(2)|direction(1) = 16 bytes

PubkeyEventGraph (Reverse)

peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5) = 16 bytes
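
The two layouts can be sketched as key builders. This is a minimal example, assuming a 3-byte ASCII prefix (`epg` or `peg`) and big-endian integers; the helper names `putSerial`, `epgKey`, and `pegKey` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putSerial writes the low 40 bits of n as 5 big-endian bytes into dst.
func putSerial(dst []byte, n uint64) {
	dst[0] = byte(n >> 32)
	binary.BigEndian.PutUint32(dst[1:5], uint32(n))
}

// epgKey builds epg|event_serial(5)|pubkey_serial(5)|kind(2)|direction(1).
func epgKey(eventSerial, pubkeySerial uint64, kind uint16, dir byte) []byte {
	k := make([]byte, 16)
	copy(k[0:3], "epg")
	putSerial(k[3:8], eventSerial)
	putSerial(k[8:13], pubkeySerial)
	binary.BigEndian.PutUint16(k[13:15], kind)
	k[15] = dir
	return k
}

// pegKey builds peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5).
func pegKey(pubkeySerial uint64, kind uint16, dir byte, eventSerial uint64) []byte {
	k := make([]byte, 16)
	copy(k[0:3], "peg")
	putSerial(k[3:8], pubkeySerial)
	binary.BigEndian.PutUint16(k[8:10], kind)
	k[10] = dir
	putSerial(k[11:16], eventSerial)
	return k
}

func main() {
	// Forward and reverse author edges for event 7, pubkey 42, kind 1.
	fmt.Printf("%x\n", epgKey(7, 42, 1, 0))
	fmt.Printf("%x\n", pegKey(42, 1, 0, 7))
}
```

Placing kind and direction before the event serial in the reverse key is what lets a scan narrow by kind and direction without decoding events.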

3. Direction Byte

The direction byte distinguishes relationship types:

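| Value | Direction | From Event Perspective | From Pubkey Perspective |
|-------|-----------|------------------------|-------------------------|
| 0 | Author | This pubkey is the event author | I am the author of this event |
| 1 | P-Tag Out | Event references this pubkey | (not used in reverse) |
| 2 | P-Tag In | (not used in forward) | I am referenced by this event |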

Location in keys:
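
Reading the two 16-byte layouts with a 3-byte prefix, the direction byte would sit at offset 15 in an epg key and at offset 10 in a peg key. The sketch below derives those offsets and reads the byte back; the constant and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Direction values from the table above; the Go names are hypothetical.
const (
	DirAuthor  byte = 0
	DirPTagOut byte = 1
	DirPTagIn  byte = 2
)

// Offsets derived from the 16-byte layouts, assuming a 3-byte prefix.
const (
	epgDirOffset = 3 + 5 + 5 + 2 // last byte of a forward key
	pegDirOffset = 3 + 5 + 2     // between kind and event serial
)

// directionOf returns the direction byte of a 16-byte epg or peg key.
func directionOf(key []byte) (byte, error) {
	if len(key) != 16 {
		return 0, errors.New("graph key must be 16 bytes")
	}
	switch string(key[:3]) {
	case "epg":
		return key[epgDirOffset], nil
	case "peg":
		return key[pegDirOffset], nil
	}
	return 0, errors.New("unknown key prefix")
}

func main() {
	key := make([]byte, 16)
	copy(key, "peg")
	key[pegDirOffset] = DirPTagIn
	d, err := directionOf(key)
	fmt.Println(d, err)
}
```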

Graph Edge Creation

When an event is saved:

  1. Extract pubkeys:
     - Event author: ev.Pubkey
     - P-tags: all ["p", "<hex-pubkey>", ...] tags
  2. Get or create serials: each unique pubkey gets a monotonic 5-byte serial.
  3. Create bidirectional edges:

For the author (pubkey = event author):

epg|event_serial|author_serial|kind|0 (author edge)
peg|author_serial|kind|0|event_serial (is-author edge)

For each p-tag (referenced pubkey):

epg|event_serial|ptag_serial|kind|1 (outbound reference)
peg|ptag_serial|kind|2|event_serial (inbound reference)
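
The edge set for one event can be sketched as follows. The `Edge` struct and `edgesFor` function are hypothetical, and serials are assumed to be already assigned; real storage would write 16-byte keys rather than structs.

```go
package main

import "fmt"

// Edge is a hypothetical in-memory form of one graph key.
type Edge struct {
	Table     string // "epg" (forward) or "peg" (reverse)
	Event     uint64 // event serial
	Pubkey    uint64 // pubkey serial
	Kind      uint16
	Direction byte
}

// edgesFor returns the forward and reverse edges for one saved event.
// pTags must already be deduplicated and must not contain the author.
func edgesFor(event, author uint64, kind uint16, pTags []uint64) []Edge {
	edges := []Edge{
		{"epg", event, author, kind, 0}, // author edge
		{"peg", event, author, kind, 0}, // is-author edge
	}
	for _, p := range pTags {
		edges = append(edges,
			Edge{"epg", event, p, kind, 1}, // outbound reference
			Edge{"peg", event, p, kind, 2}, // inbound reference
		)
	}
	return edges
}

func main() {
	// Event 100 of kind 1, authored by serial 1, tagging serials 2 and 3.
	for _, e := range edgesFor(100, 1, 1, []uint64{2, 3}) {
		fmt.Printf("%+v\n", e)
	}
}
```

An event with N unique pubkeys (author included) yields 2N edges.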

Query Patterns

Find all events authored by a pubkey

Prefix scan: peg|pubkey_serial|*|0|*
Filter: direction == 0 (author)

Find all events mentioning a pubkey (inbound p-tags)

Prefix scan: peg|pubkey_serial|*|2|*
Filter: direction == 2 (p-tag inbound)

Find all kind-1 events mentioning a pubkey

Prefix scan: peg|pubkey_serial|0x0001|2|*
Exact match: kind == 1, direction == 2

Find all pubkeys referenced by an event (outbound p-tags)

Prefix scan: epg|event_serial|*|*|1
Filter: direction == 1 (p-tag outbound)

Find the author of an event

Prefix scan: epg|event_serial|*|*|0
Filter: direction == 0 (author)
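
The reverse-index patterns above can be sketched over an in-memory key list. The linear loop stands in for a store's prefix iterator, and the byte layout (3-byte ASCII prefix, big-endian integers) and the names `pegKey`, `put40`, and `scanPeg` are assumptions for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// put40 writes the low 40 bits of n as 5 big-endian bytes into dst.
func put40(dst []byte, n uint64) {
	dst[0] = byte(n >> 32)
	binary.BigEndian.PutUint32(dst[1:5], uint32(n))
}

// pegKey builds peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5).
func pegKey(pubkey uint64, kind uint16, dir byte, event uint64) []byte {
	k := make([]byte, 16)
	copy(k, "peg")
	put40(k[3:8], pubkey)
	binary.BigEndian.PutUint16(k[8:10], kind)
	k[10] = dir
	put40(k[11:16], event)
	return k
}

// scanPeg returns event serials for pubkey whose edge has direction dir.
// With a kind, the prefix is peg|pubkey|kind|dir (exact match on both).
// Without one, the prefix is peg|pubkey and direction is filtered per key.
func scanPeg(keys [][]byte, pubkey uint64, kind *uint16, dir byte) []uint64 {
	prefix := pegKey(pubkey, 0, 0, 0)[:8]
	if kind != nil {
		prefix = pegKey(pubkey, *kind, dir, 0)[:11]
	}
	var events []uint64
	for _, k := range keys {
		if len(k) != 16 || !bytes.HasPrefix(k, prefix) || k[10] != dir {
			continue
		}
		ev := uint64(k[11])<<32 | uint64(binary.BigEndian.Uint32(k[12:16]))
		events = append(events, ev)
	}
	return events
}

func main() {
	keys := [][]byte{
		pegKey(42, 1, 0, 10), // pubkey 42 authored kind-1 event 10
		pegKey(42, 1, 2, 11), // kind-1 event 11 mentions pubkey 42
		pegKey(42, 7, 2, 12), // kind-7 event 12 mentions pubkey 42
		pegKey(43, 1, 2, 13), // a different pubkey
	}
	kindOne := uint16(1)
	fmt.Println(scanPeg(keys, 42, nil, 2))      // all mentions
	fmt.Println(scanPeg(keys, 42, &kindOne, 2)) // kind-1 mentions only
}
```

Because kind precedes direction in the key, a direction-only query cannot be expressed as a single contiguous prefix; it scans all of the pubkey's edges and filters, whereas a kind-plus-direction query touches only matching keys.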

Implementation Details

Thread Safety

The GetOrCreatePubkeySerial function handles concurrent callers as follows:

  1. Opens a read transaction to check for an existing serial.
  2. If none is found, gets the next sequence number.
  3. Opens a write transaction and checks again, to handle race conditions.
  4. Returns the existing serial if another goroutine created it concurrently.
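
The double-check pattern in these steps can be sketched with an in-memory stand-in. Here a map and a `sync.RWMutex` replace the database's read and write transactions, and an atomic counter replaces the sequence; `serialStore` and `getOrCreate` are hypothetical names, not the relay's implementation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serialStore is an in-memory stand-in for the relay's database.
type serialStore struct {
	next    uint64 // last allocated serial; first field for atomic alignment
	mu      sync.RWMutex
	serials map[[32]byte]uint64
}

func newSerialStore() *serialStore {
	return &serialStore{serials: make(map[[32]byte]uint64)}
}

// getOrCreate returns the serial for pk, allocating one if none exists.
func (s *serialStore) getOrCreate(pk [32]byte) uint64 {
	// Step 1: read-only check.
	s.mu.RLock()
	ser, ok := s.serials[pk]
	s.mu.RUnlock()
	if ok {
		return ser
	}
	// Step 2: reserve the next sequence number.
	candidate := atomic.AddUint64(&s.next, 1)
	// Step 3: write path, checking again under the exclusive lock.
	s.mu.Lock()
	defer s.mu.Unlock()
	if ser, ok := s.serials[pk]; ok {
		// Step 4: another goroutine won the race; candidate is discarded,
		// which leaves a gap in the sequence.
		return ser
	}
	s.serials[pk] = candidate
	return candidate
}

func main() {
	s := newSerialStore()
	var pk [32]byte
	pk[0] = 0xAB
	results := make([]uint64, 8)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = s.getOrCreate(pk)
		}(i)
	}
	wg.Wait()
	fmt.Println(results) // every entry holds the same serial
}
```

In this sketch a losing goroutine burns a sequence number, so serials stay monotonic but are not guaranteed to be contiguous.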

Deduplication

The save-event function deduplicates pubkeys before creating serials:
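
A minimal sketch of such a deduplication step, assuming pubkeys are held as 32-byte arrays; `uniquePubkeys` is a hypothetical name and not the relay's function.

```go
package main

import "fmt"

// uniquePubkeys returns the author followed by each distinct p-tag pubkey
// in first-seen order. The author is never repeated.
func uniquePubkeys(author [32]byte, pTags [][32]byte) [][32]byte {
	seen := map[[32]byte]struct{}{author: {}}
	out := [][32]byte{author}
	for _, pk := range pTags {
		if _, dup := seen[pk]; dup {
			continue
		}
		seen[pk] = struct{}{}
		out = append(out, pk)
	}
	return out
}

func main() {
	a, b, c := [32]byte{1}, [32]byte{2}, [32]byte{3}
	// b appears twice and a is the author, so three pubkeys remain.
	fmt.Println(len(uniquePubkeys(a, [][32]byte{b, c, b, a})))
}
```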

Edge Cases

  1. Author in p-tags: Only creates author edge (direction=0), skips duplicate p-tag edge
  2. Invalid p-tags: Silently skipped if hex decode fails or length != 32 bytes
  3. No p-tags: Only author edge is created
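
These edge cases can be sketched as a p-tag filter. Tags are assumed to be string slices of the form ["p", "<hex-pubkey>", ...]; `pTagPubkeys` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"strings"
)

// pTagPubkeys extracts the p-tag pubkeys that should get p-tag edges.
func pTagPubkeys(author []byte, tags [][]string) [][]byte {
	var out [][]byte
	for _, t := range tags {
		if len(t) < 2 || t[0] != "p" {
			continue // not a p-tag
		}
		pk, err := hex.DecodeString(t[1])
		if err != nil || len(pk) != 32 {
			continue // invalid p-tag: silently skipped
		}
		if bytes.Equal(pk, author) {
			continue // author already has a direction=0 edge
		}
		out = append(out, pk)
	}
	return out
}

func main() {
	author := bytes.Repeat([]byte{0xaa}, 32)
	tags := [][]string{
		{"p", strings.Repeat("bb", 32)}, // valid
		{"p", strings.Repeat("aa", 32)}, // same as author
		{"p", "zz"},                     // not hex
		{"p", "abcd"},                   // valid hex, wrong length
		{"e", strings.Repeat("cc", 32)}, // not a p-tag
	}
	fmt.Println(len(pTagPubkeys(author, tags)))
}
```

This sketch does not deduplicate repeated p-tags; that would happen in a separate step before serials are assigned.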

Performance Characteristics

Space Efficiency

Per event with N unique pubkeys:

Example: Event with author + 10 p-tags:
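
The edge storage can be estimated from the 16-byte key size given earlier. The sketch below counts edge keys only; it assumes values are empty, ignores serial-table entries, and compares against a hypothetical layout that stores the full 32-byte pubkey in each key.

```go
package main

import "fmt"

const (
	edgeKeyBytes = 16          // epg or peg key with a 5-byte pubkey serial
	rawEdgeBytes = 16 - 5 + 32 // same key holding a full 32-byte pubkey
)

// edgeBytes returns the key bytes for an event with n unique pubkeys:
// one forward and one reverse edge per pubkey.
func edgeBytes(n, perKey int) int {
	return 2 * n * perKey
}

func main() {
	n := 1 + 10 // author + 10 distinct p-tags
	withSerials := edgeBytes(n, edgeKeyBytes)
	withRawKeys := edgeBytes(n, rawEdgeBytes)
	fmt.Println(withSerials, withRawKeys, withRawKeys-withSerials)
}
```

For 11 unique pubkeys this comes to 22 edges and 352 bytes of keys, against 946 bytes with raw pubkeys: a saving of 27 bytes on each of the 22 edges.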

Query Performance

  1. Pubkey lookup: O(1) hash lookup via 8-byte truncated hash
  2. Serial generation: O(1) atomic increment
  3. Graph queries: Sequential scan with prefix optimization
  4. Kind filtering: Built into key ordering, no event decoding needed

Testing

Comprehensive tests verify:

Future Query APIs

The graph structure supports efficient queries for:

  1. Social Graph Queries:
     - Who does Alice follow? (p-tags authored by Alice)
     - Who follows Bob? (p-tags referencing Bob)
     - Common connections between Alice and Bob
  2. Event Discovery:
     - All replies to Alice's events (kind-1 events with p-tag to Alice)
     - All events Alice has replied to (kind-1 events by Alice with p-tags)
     - Quote reposts, mentions, reactions by event kind
  3. Analytics:
     - Most-mentioned pubkeys (count p-tag-in edges)
     - Most active authors (count author edges)
     - Interaction patterns by kind

Migration Notes

This is a new index that:

To backfill existing events, run a migration that:

  1. Iterates all events
  2. Extracts pubkeys and creates serials
  3. Creates graph edges for each event
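
The three backfill steps can be sketched over in-memory data. The `Event` and `Edge` types, the string pubkeys, and the `backfill` function are all hypothetical simplifications; a real migration would iterate the database and write keys in batches.

```go
package main

import "fmt"

// Event is a hypothetical, reduced view of a stored event.
type Event struct {
	Serial uint64
	Kind   uint16
	Author string   // pubkey
	PTags  []string // referenced pubkeys, already validated
}

// Edge mirrors one epg or peg key in readable form.
type Edge struct {
	Table     string
	Event     uint64
	Pubkey    uint64
	Kind      uint16
	Direction byte
}

// backfill walks every event, assigns serials to unseen pubkeys, and
// returns the serial map together with the edges it would write.
func backfill(events []Event) (map[string]uint64, []Edge) {
	serials := make(map[string]uint64)
	serialOf := func(pk string) uint64 {
		if s, ok := serials[pk]; ok {
			return s
		}
		s := uint64(len(serials) + 1)
		serials[pk] = s
		return s
	}
	var edges []Edge
	for _, ev := range events { // step 1: iterate all events
		a := serialOf(ev.Author) // step 2: serial for the author
		edges = append(edges, // step 3: author edges
			Edge{"epg", ev.Serial, a, ev.Kind, 0},
			Edge{"peg", ev.Serial, a, ev.Kind, 0})
		seen := map[string]bool{ev.Author: true}
		for _, pk := range ev.PTags {
			if seen[pk] {
				continue // duplicate p-tag or author in p-tags
			}
			seen[pk] = true
			p := serialOf(pk) // step 2: serial for the p-tag pubkey
			edges = append(edges, // step 3: p-tag edges
				Edge{"epg", ev.Serial, p, ev.Kind, 1},
				Edge{"peg", ev.Serial, p, ev.Kind, 2})
		}
	}
	return serials, edges
}

func main() {
	serials, edges := backfill([]Event{
		{Serial: 1, Kind: 1, Author: "alice", PTags: []string{"bob", "bob", "alice"}},
		{Serial: 2, Kind: 1, Author: "bob"},
	})
	fmt.Println(len(serials), len(edges))
}
```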