The new pubkey graph indexes can significantly accelerate certain Nostr query patterns, particularly those involving #p tag filters. This document analyzes the optimization opportunities and implementation strategy.
Filter: {"#p": ["<hex-pubkey>"], "kinds": [1]}
Index Used: TagKind (tkc)
tkc|p|value_hash(8)|kind(2)|timestamp(8)|serial(5) = 27 bytes per entry
Process:
tkc|p|<hash>|0001|<timestamp range>|*
Index Used: PubkeyEventGraph (peg)
peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5) = 16 bytes per entry
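To make the size difference between the two layouts concrete, the following minimal sketch builds one key of each kind and compares their lengths. It assumes a 3-byte ASCII prefix, a 1-byte tag letter, and big-endian field encoding; the helper names `putUint40`, `tkcKey`, and `pegKey` are hypothetical and are not taken from the relay's code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUint40 appends the low 40 bits of v as 5 big-endian bytes.
// Assumption: serials are stored big-endian; the document only gives the width.
func putUint40(dst []byte, v uint64) []byte {
	return append(dst, byte(v>>32), byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// tkcKey builds tkc|p|value_hash(8)|kind(2)|timestamp(8)|serial(5).
func tkcKey(valueHash uint64, kind uint16, ts uint64, serial uint64) []byte {
	k := append([]byte("tkc"), 'p')
	k = binary.BigEndian.AppendUint64(k, valueHash)
	k = binary.BigEndian.AppendUint16(k, kind)
	k = binary.BigEndian.AppendUint64(k, ts)
	return putUint40(k, serial)
}

// pegKey builds peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5).
func pegKey(pubkeySerial uint64, kind uint16, direction byte, eventSerial uint64) []byte {
	k := []byte("peg")
	k = putUint40(k, pubkeySerial)
	k = binary.BigEndian.AppendUint16(k, kind)
	k = append(k, direction)
	return putUint40(k, eventSerial)
}

func main() {
	t := tkcKey(0xdeadbeef, 1, 1700000000, 42)
	p := pegKey(7, 1, 2, 42)
	saving := 100 * float64(len(t)-len(p)) / float64(len(t))
	fmt.Printf("tkc=%d bytes, peg=%d bytes, saving=%.0f%%\n", len(t), len(p), saving)
}
```

The 11-byte difference (27 versus 16 bytes) is where the roughly 41% key-size reduction quoted for the graph index comes from.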
Process:
pks|pubkey_hash(8)|* → 5-byte serial
peg|<serial>|0001|2|* (direction=2 for inbound p-tags)
Advantages:
Filter: {"#p": ["<pubkey>"], "kinds": [1]}
Current: tkc index
Optimized: peg index
Example: "Find all text notes (kind-1) mentioning Alice"
// Current: tkc|p|hash(alice)|0001|timestamp|serial
// Optimized: peg|serial(alice)|0001|2|serial
Performance Gain: ~50% faster (smaller keys, exact match, no hash)
Filter: {"#p": ["<alice>", "<bob>", "<carol>"]}
Current: 3 separate tc- scans with union
Optimized: 3 separate peg scans with union
Performance Gain: ~40% faster (smaller indexes)
Filter: {"#p": ["<alice>", "<bob>"], "kinds": [1, 6, 7]}
Current: 6 separate tkc scans (3 kinds × 2 pubkeys)
Optimized: 6 separate peg scans with 41% smaller keys
Performance Gain: ~45% faster
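The scan count in the multi-kind case is the product of pubkeys and kinds. A minimal sketch of how those scan prefixes might be generated is shown here. It assumes pubkeys are already resolved to serials and that fields are big-endian; `pegPrefix` and `pegScanPrefixes` are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// directionInbound is the direction byte the document gives for inbound p-tags.
const directionInbound = 2

// pegPrefix builds the scan prefix peg|pubkey_serial(5)|kind(2)|direction(1).
// Assumption: the 5-byte serial and 2-byte kind are big-endian.
func pegPrefix(pubkeySerial uint64, kind uint16) []byte {
	p := []byte("peg")
	p = append(p,
		byte(pubkeySerial>>32), byte(pubkeySerial>>24),
		byte(pubkeySerial>>16), byte(pubkeySerial>>8), byte(pubkeySerial))
	p = binary.BigEndian.AppendUint16(p, kind)
	return append(p, directionInbound)
}

// pegScanPrefixes returns one prefix per (pubkey, kind) pair.
func pegScanPrefixes(pubkeySerials []uint64, kinds []uint16) [][]byte {
	out := make([][]byte, 0, len(pubkeySerials)*len(kinds))
	for _, s := range pubkeySerials {
		for _, k := range kinds {
			out = append(out, pegPrefix(s, k))
		}
	}
	return out
}

func main() {
	// Two pubkeys (hypothetical serials 101 and 102) and kinds 1, 6, 7.
	for _, p := range pegScanPrefixes([]uint64{101, 102}, []uint16{1, 6, 7}) {
		fmt.Printf("%x\n", p)
	}
}
```

Each prefix is 11 bytes, so a range scan over it matches every event serial for that pubkey, kind, and direction; the results of the six scans are then unioned.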
Filter: {"authors": ["<alice>"], "#p": ["<bob>"]}
Current: Uses TagPubkey (tpc) index
Potential Optimization: Could use graph to find events where Alice is author AND Bob is mentioned
peg|serial(alice)|*|0|* (Alice's authored events)
Recommendation: Keep using existing tpc index for this case
Create query-for-ptag-graph.go that:
- Resolves pubkey hex to serials via GetPubkeySerial
- Builds peg index ranges
Conditions for optimization:
- Filter has #p tags
- Filter has kinds (optional but beneficial)
- Filter has no authors (use existing indexes)
Modify GetIndexesFromFilter or create a query planner that:
Cost estimation:
- Graph: O(log(pubkeys)) + O(matching_events)
- Tag: O(log(tag_values)) + O(matching_events)
- Graph is cheaper when pubkeys < tag_values (usually true)
The existing query cache should work transparently:
query-for-ptag-graph.go
package database
// QueryPTagGraph uses the pubkey graph index for efficient p-tag queries.
func (d *D) QueryPTagGraph(f *filter.F) (serials types.Uint40s, err error) {
	// Extract p-tags from filter
	// Resolve pubkey hex → serials
	// Build peg index ranges
	// Scan and return results
	return
}
Update the query dispatcher to try graph optimization first:
func (d *D) GetSerialsFromFilter(f *filter.F) (sers types.Uint40s, err error) {
	// Try p-tag graph optimization
	if canUsePTagGraph(f) {
		if sers, err = d.QueryPTagGraph(f); err == nil {
			return
		}
		// Fall through to traditional indexes on error
	}
	// Existing logic...
}
func canUsePTagGraph(f *filter.F) bool {
	// Has p-tags?
	if f.Tags == nil || f.Tags.Len() == 0 {
		return false
	}
	hasPTags := false
	for _, t := range *f.Tags {
		if len(t.Key()) >= 1 && t.Key()[0] == 'p' {
			hasPTags = true
			break
		}
	}
	if !hasPTags {
		return false
	}
	// No authors filter (that would need different index)
	if f.Authors != nil && f.Authors.Len() > 0 {
		return false
	}
	return true
}
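The eligibility check and the fall-through behaviour of the dispatcher can be exercised in isolation. The sketch below does so with a simplified, hypothetical `simpleFilter` type standing in for `filter.F`, and with stub query functions in place of the real index scans; none of these names come from the relay's code.

```go
package main

import (
	"errors"
	"fmt"
)

// errGraphUnavailable is a hypothetical error used to simulate a failed graph query.
var errGraphUnavailable = errors.New("graph index unavailable")

// simpleFilter is a hypothetical stand-in for filter.F with only the fields the check needs.
type simpleFilter struct {
	Authors []string
	Kinds   []uint16
	Tags    map[string][]string // tag letter -> values, e.g. "p" -> pubkeys
}

// canUseGraph mirrors the stated conditions: at least one p-tag value and no authors.
func canUseGraph(f simpleFilter) bool {
	return len(f.Tags["p"]) > 0 && len(f.Authors) == 0
}

// queryFn is the shape shared by the stubbed graph and traditional queries.
type queryFn func(simpleFilter) ([]uint64, error)

// dispatch tries the graph path when eligible and falls back to the
// traditional path if the filter is ineligible or the graph query fails.
func dispatch(f simpleFilter, graph, traditional queryFn) (serials []uint64, path string, err error) {
	if canUseGraph(f) {
		if serials, err = graph(f); err == nil {
			return serials, "graph", nil
		}
	}
	serials, err = traditional(f)
	return serials, "traditional", err
}

func main() {
	graphOK := func(simpleFilter) ([]uint64, error) { return []uint64{1, 2}, nil }
	graphDown := func(simpleFilter) ([]uint64, error) { return nil, errGraphUnavailable }
	traditional := func(simpleFilter) ([]uint64, error) { return []uint64{1, 2}, nil }

	mentions := simpleFilter{Kinds: []uint16{1}, Tags: map[string][]string{"p": {"<alice>"}}}
	withAuthor := simpleFilter{Authors: []string{"<alice>"}, Tags: map[string][]string{"p": {"<bob>"}}}

	_, path, _ := dispatch(mentions, graphOK, traditional)
	fmt.Println("p-tag + kind:", path)
	_, path, _ = dispatch(mentions, graphDown, traditional)
	fmt.Println("graph error:", path)
	_, path, _ = dispatch(withAuthor, graphOK, traditional)
	fmt.Println("authors + p-tag:", path)
}
```

One design point this makes visible: a graph query that succeeds with zero results is treated as authoritative, so the fallback only protects against errors, not against an incompletely backfilled graph.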
- Measure: p-tag query latency
- Compare: Tag index vs Graph index
- Expected: 2-3x speedup

- Measure: p-tag + kind query latency
- Compare: TagKind index vs Graph index
- Expected: 3-4x speedup

- Measure: Multiple p-tag queries (fan-out)
- Compare: Multiple tag scans vs graph scans
- Expected: 4-5x speedup
func BenchmarkPTagQuery(b *testing.B) {
	// Setup: Create 1M events, 10K pubkeys
	// Filter: {"#p": ["<alice>"], "kinds": [1]}
	b.Run("TagIndex", func(b *testing.B) {
		// Use existing tag index
	})
	b.Run("GraphIndex", func(b *testing.B) {
		// Use new graph index
	})
}
Per event with N p-tags:
Mitigation:
ORLY_ENABLE_PTAG_GRAPH=true
If enabling graph indexes on existing relay:
# Run migration to backfill graph from existing events
./orly migrate --backfill-ptag-graph
# Or via SQL-style approach:
# For each event:
# - Extract pubkeys (author + p-tags)
# - Create serials if not exist
# - Insert graph edges
Estimated time: at 10K events/second, backfilling 100M events takes about 10,000 seconds (~2.8 hours)
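The backfill estimate is simple division, and a small helper makes it easy to redo for other relay sizes. This is a sketch only: it assumes a constant sustained throughput, which a real migration is unlikely to hold exactly, and `backfillDuration` is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

// backfillDuration estimates total backfill time at a constant rate.
// Integer division truncates any fractional second.
func backfillDuration(events, eventsPerSecond int64) time.Duration {
	return time.Duration(events/eventsPerSecond) * time.Second
}

func main() {
	// 100M events at 10K events/second.
	fmt.Println(backfillDuration(100_000_000, 10_000)) // prints 2h46m40s
}
```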
Instead of always using graph, use cost-based selection:
Rationale: Tag index can be faster for very broad queries due to:
query-for-ptag-graph.go with basic optimization
{"kinds": [1, 6, 7], "#p": ["<my-pubkey>"]}
Use Case: "Show me mentions and replies"
Speedup: 3-4x
{"kinds": [3], "#p": ["<alice>", "<bob>", "<carol>"]}
Use Case: "Who follows these people?" (kind-3 contact lists)
Speedup: 2-3x
{"kinds": [7], "#p": ["<my-pubkey>"]}
Use Case: "Show me reactions to my events"
Speedup: 4-5x
{"kinds": [9735], "#p": ["<my-pubkey>"]}
Use Case: "Show me zaps sent to me"
Speedup: 3-4x