
Event-Driven Vertex Management Implementation Summary

What Was Implemented

This document summarizes the event-driven vertex management system for maintaining NostrUser nodes and social graph relationships in the ORLY Neo4j backend.

Architecture Overview

Dual Node System

The implementation uses two separate node types:

  1. Event and Author nodes (NIP-01 Support)

     - Created by SaveEvent() for ALL events
     - Used for standard Nostr REQ filter queries
     - Supports kinds, authors, tags, time ranges, etc.
     - Always maintained: this is the core relay functionality

  2. NostrUser nodes (Social Graph / WoT)

     - Created by SocialEventProcessor for kinds 0, 3, 1984, 10000
     - Used for social graph queries and Web of Trust metrics
     - Connected by FOLLOWS, MUTES, REPORTS relationships
     - Event traceable: every relationship links back to the event that created it

This dual approach ensures NIP-01 queries continue to work while adding social graph capabilities.

Files Created/Modified

New Files

  1. [EVENT_PROCESSING_SPEC.md](./EVENT_PROCESSING_SPEC.md) (120 KB)

     - Complete specification for event-driven vertex management
     - Processing workflows for each event kind
     - Diff computation algorithms
     - Cypher query patterns
     - Implementation architecture with Go code examples
     - Testing strategy and performance considerations

  2. [social-event-processor.go](./social-event-processor.go) (18 KB)

     - SocialEventProcessor struct and methods
     - processContactList(): handles kind 3 (follows)
     - processMuteList(): handles kind 10000 (mutes)
     - processReport(): handles kind 1984 (reports)
     - processProfileMetadata(): handles kind 0 (profiles)
     - Helper functions for diff computation and p-tag extraction
     - Batch processing support

  3. [WOT_SPEC.md](./WOT_SPEC.md) (40 KB)

     - Web of Trust data model specification
     - Trust metrics definitions (GrapeRank, PageRank, etc.)
     - Deployment modes (lean vs. full)
     - Example Cypher queries

  4. [ADDITIONAL_REQUIREMENTS.md](./ADDITIONAL_REQUIREMENTS.md) (30 KB)

     - 50+ missing implementation details identified
     - Organized into 12 categories
     - Phased implementation roadmap
     - Research needed items flagged

Modified Files

  1. [schema.go](./schema.go)

     - Added ProcessedSocialEvent node constraint
     - Added indexes for social event processing
     - Maintained all existing NIP-01 schema (Event, Author, Tag, Marker)
     - Added WoT schema (NostrUser, WoT metrics nodes)
     - Updated dropAll() to handle new schema elements

  2. [save-event.go](./save-event.go)

     - Integrated SocialEventProcessor call after base event save
     - Social processing is non-blocking (errors logged but don't fail save)
     - Processes kinds 0, 3, 1984, 10000 automatically
     - NIP-01 functionality preserved

  3. [README.md](./README.md)

     - Added WoT features to feature list
     - Documented new specification files
     - Added file structure section
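The save-event.go integration listed above can be sketched roughly as follows. This is a minimal illustration, not the actual ORLY code: the `Event` type, `saveWithSocial`, `isSocialKind`, and `errDemo` are hypothetical names, and the two steps are passed in as plain functions so the example runs without Neo4j.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Event is a pared-down, hypothetical stand-in for the relay's event type.
type Event struct {
	ID   string
	Kind int
}

// errDemo is a placeholder error used only by this example.
var errDemo = errors.New("social processing failed")

// isSocialKind reports whether a kind feeds the social graph:
// 0 (profile), 3 (contacts), 1984 (report), 10000 (mute list).
func isSocialKind(kind int) bool {
	switch kind {
	case 0, 3, 1984, 10000:
		return true
	}
	return false
}

// saveWithSocial saves the base event first, then runs social processing.
// A social-processing failure is logged and swallowed; only a failure of
// the base save is returned to the caller.
func saveWithSocial(ev Event, saveBase, processSocial func(Event) error) error {
	if err := saveBase(ev); err != nil {
		return fmt.Errorf("save base event: %w", err)
	}
	if !isSocialKind(ev.Kind) {
		return nil
	}
	if err := processSocial(ev); err != nil {
		log.Printf("ERROR: failed to process social event kind %d: %v", ev.Kind, err)
	}
	return nil
}

func main() {
	succeed := func(Event) error { return nil }
	fail := func(Event) error { return errDemo }

	// The social step fails, but the save as a whole still succeeds.
	err := saveWithSocial(Event{ID: "abc123", Kind: 3}, succeed, fail)
	fmt.Println("save error:", err)
}
```

Passing the two steps as functions keeps the ordering and error policy testable in isolation; the real integration presumably calls the processor directly.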

Data Model

Node Types

ProcessedSocialEvent (Tracking Node)

(:ProcessedSocialEvent {
  event_id: string,           // Hex event ID
  event_kind: int,            // 0, 3, 1984, or 10000
  pubkey: string,             // Author pubkey
  created_at: timestamp,      // Event timestamp
  processed_at: timestamp,    // When relay processed it
  relationship_count: int,    // How many relationships created
  superseded_by: string|null  // If replaced by newer event
})

NostrUser (Social Graph Vertex)

(:NostrUser {
  pubkey: string,             // Hex pubkey (unique)
  npub: string,               // Bech32 npub
  name: string,               // Profile name
  about: string,              // Profile about
  picture: string,            // Profile picture URL
  nip05: string,              // NIP-05 identifier
  // ... other profile fields
  // ... trust metrics (future)
})

Relationship Types (With Event Traceability)

FOLLOWS

(:NostrUser)-[:FOLLOWS {
  created_by_event: string,   // Event ID that created this
  created_at: timestamp,      // Event timestamp
  relay_received_at: timestamp
}]->(:NostrUser)

MUTES

(:NostrUser)-[:MUTES {
  created_by_event: string,
  created_at: timestamp,
  relay_received_at: timestamp
}]->(:NostrUser)

REPORTS

(:NostrUser)-[:REPORTS {
  created_by_event: string,
  created_at: timestamp,
  relay_received_at: timestamp,
  report_type: string         // NIP-56 type (spam, illegal, etc.)
}]->(:NostrUser)

Event Processing Flow

Kind 3 (Contact List) - Follow Relationships

1. Receive kind 3 event
2. Check if event already exists (base Event node)
   └─ If exists: return (already processed)
3. Save base Event + Author nodes (NIP-01)
4. Social processing:
   a. Check for existing kind 3 from this pubkey
   b. If new event is older: skip (don't replace newer with older)
   c. Extract p-tags from new event → new_follows[]
   d. Query old FOLLOWS relationships → old_follows[]
   e. Compute diff:
      - added_follows = new_follows - old_follows
      - removed_follows = old_follows - new_follows
   f. Transaction:
      - Mark old ProcessedSocialEvent as superseded
      - Create new ProcessedSocialEvent
      - DELETE removed FOLLOWS relationships
      - CREATE added FOLLOWS relationships
   g. Log: "processed contact list: added=X, removed=Y"

Example:

Event 1: Alice follows [Bob, Charlie]
  → Creates FOLLOWS to Bob and Charlie

Event 2: Alice follows [Bob, Dave] (newer)
  → Diff: added=[Dave], removed=[Charlie]
  → Marks Event 1 as superseded
  → Deletes FOLLOWS to Charlie
  → Creates FOLLOWS to Dave
  → Result: Alice follows [Bob, Dave]
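The diff step (4e) and the Alice example above can be reproduced with a small set-difference helper. This is a sketch under assumptions: `diffFollows` is a hypothetical name, and short names stand in for hex pubkeys. The helper in social-event-processor.go may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// diffFollows returns the pubkeys present only in newFollows (added) and
// only in oldFollows (removed). Duplicates collapse because both inputs are
// loaded into sets first. Results are sorted for deterministic output.
func diffFollows(oldFollows, newFollows []string) (added, removed []string) {
	oldSet := make(map[string]struct{}, len(oldFollows))
	for _, pk := range oldFollows {
		oldSet[pk] = struct{}{}
	}
	newSet := make(map[string]struct{}, len(newFollows))
	for _, pk := range newFollows {
		newSet[pk] = struct{}{}
	}
	for pk := range newSet {
		if _, ok := oldSet[pk]; !ok {
			added = append(added, pk)
		}
	}
	for pk := range oldSet {
		if _, ok := newSet[pk]; !ok {
			removed = append(removed, pk)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	// Event 1: Alice follows [bob, charlie]; Event 2: [bob, dave].
	added, removed := diffFollows(
		[]string{"bob", "charlie"},
		[]string{"bob", "dave"},
	)
	fmt.Println("added:", added, "removed:", removed)
	// prints: added: [dave] removed: [charlie]
}
```

Only the added and removed sets need to touch the graph, so an unchanged follow such as Bob costs no write.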

Kind 10000 (Mute List) - Mute Relationships

Same pattern as kind 3, but creates/updates MUTES relationships.

Kind 1984 (Reporting) - Report Relationships

Unlike kinds 3 and 10000, kind 1984 is NOT replaceable.

1. Receive kind 1984 event
2. Save base Event + Author nodes
3. Social processing:
   a. Extract p-tag (reported user) and report type
   b. Create ProcessedSocialEvent node
   c. Create REPORTS relationship (new, not replacing old)

Multiple Reports: The same user can report the same target multiple times. Each report creates a separate REPORTS relationship.
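Step 3a can be sketched as follows. NIP-56 places the report type in the third element of the tag being reported, so a p-tag looks like `["p", <pubkey>, <report-type>]`. The function names `extractReport` and `isHexPubkey` are hypothetical, and skipping malformed pubkeys is an assumption about how invalid tags are handled.

```go
package main

import (
	"fmt"
	"strings"
)

// isHexPubkey reports whether s looks like a Nostr public key:
// exactly 64 lowercase hexadecimal characters.
func isHexPubkey(s string) bool {
	if len(s) != 64 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// extractReport returns the reported pubkey and report type from the first
// well-formed "p" tag. The report type is empty when the tag has no third
// element. ok is false when no usable p-tag exists.
func extractReport(tags [][]string) (pubkey, reportType string, ok bool) {
	for _, tag := range tags {
		if len(tag) < 2 || tag[0] != "p" || !isHexPubkey(tag[1]) {
			continue
		}
		if len(tag) >= 3 {
			reportType = tag[2]
		}
		return tag[1], reportType, true
	}
	return "", "", false
}

func main() {
	target := strings.Repeat("ab", 32) // placeholder 64-character pubkey
	tags := [][]string{
		{"e", "ignored-event-id"},
		{"p", target, "spam"},
	}
	pubkey, reportType, ok := extractReport(tags)
	fmt.Println(ok, reportType, pubkey == target)
}
```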

Kind 0 (Profile Metadata) - User Properties

1. Receive kind 0 event
2. Save base Event + Author nodes
3. Social processing:
   a. Parse JSON content
   b. MERGE NostrUser (create if not exists)
   c. SET profile properties (name, about, picture, etc.)

Note: Kind 0 is replaceable, but we don't compute a diff; we simply overwrite all profile fields.
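The kind 0 steps above might look like the sketch below, which parses the JSON content and builds parameters for a MERGE/SET query. The `Profile` struct, `parseProfile`, `profileParams`, and the query text are illustrative assumptions, not the code in social-event-processor.go. Only the four profile fields named in the data model are mapped.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Profile holds the kind 0 content fields mirrored onto NostrUser.
// Unknown JSON keys are ignored by encoding/json.
type Profile struct {
	Name    string `json:"name"`
	About   string `json:"about"`
	Picture string `json:"picture"`
	NIP05   string `json:"nip05"`
}

// upsertProfileQuery creates the NostrUser if needed and overwrites its
// profile properties (no diff is computed).
const upsertProfileQuery = `
MERGE (u:NostrUser {pubkey: $pubkey})
SET u.name = $name, u.about = $about, u.picture = $picture, u.nip05 = $nip05`

// parseProfile decodes the content field of a kind 0 event.
func parseProfile(content string) (Profile, error) {
	var p Profile
	if err := json.Unmarshal([]byte(content), &p); err != nil {
		return Profile{}, fmt.Errorf("parse kind 0 content: %w", err)
	}
	return p, nil
}

// profileParams builds the parameter map for upsertProfileQuery.
func profileParams(pubkey string, p Profile) map[string]any {
	return map[string]any{
		"pubkey":  pubkey,
		"name":    p.Name,
		"about":   p.About,
		"picture": p.Picture,
		"nip05":   p.NIP05,
	}
}

func main() {
	content := `{"name":"alice","about":"hello","nip05":"alice@example.com"}`
	p, err := parseProfile(content)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(upsertProfileQuery)
	fmt.Println(profileParams("abc123", p))
}
```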

Key Design Decisions

1. Event Traceability

All relationships include `created_by_event` property.

This enables:

2. Separate Event Tracking Node (ProcessedSocialEvent)

Instead of marking Event nodes as processed, we use a separate ProcessedSocialEvent node.

Why?

3. Superseded Chain

When a replaceable event is updated:

This creates a chain: Event1 → Event2 → Event3 (current)

Benefits:

4. Non-Blocking Social Processing

Social event processing errors are logged but don't fail the base event save.

Rationale:

5. Dual Node System (Event/Author vs. NostrUser)

Why not merge Author and NostrUser?

Query Examples

Get Current Follows for User

MATCH (user:NostrUser {pubkey: $pubkey})-[f:FOLLOWS]->(followed:NostrUser)
WHERE NOT EXISTS {
  MATCH (old:ProcessedSocialEvent {event_id: f.created_by_event})
  WHERE old.superseded_by IS NOT NULL
}
RETURN followed.pubkey, followed.name

Get Mutual Follows (Friends)

MATCH (user:NostrUser {pubkey: $pubkey})-[:FOLLOWS]->(friend:NostrUser)
-[:FOLLOWS]->(user)
RETURN friend.pubkey, friend.name

Count Reports by Type

MATCH (reported:NostrUser {pubkey: $pubkey})<-[r:REPORTS]-()
RETURN r.report_type, count(*) as count
ORDER BY count DESC

Follow Graph History

MATCH (evt:ProcessedSocialEvent {pubkey: $pubkey, event_kind: 3})
RETURN evt.event_id, evt.created_at, evt.relationship_count, evt.superseded_by
ORDER BY evt.created_at ASC

Testing Plan

Unit Tests

  1. Diff computation

     - Empty lists
     - All new, all removed
     - Mixed additions and removals
     - Duplicate handling

  2. Event ordering

     - Newer replaces older
     - Older rejected
     - Same timestamp

  3. P-tag extraction

     - Valid tags
     - Invalid pubkeys
     - Empty lists
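The event-ordering cases above can be exercised against a small comparison helper such as the one below. The names `eventRef` and `shouldReplace` are hypothetical. The same-timestamp rule follows NIP-01 (the event with the lowest ID in lexical order is retained); whether the ORLY processor applies that tie-break is an assumption here.

```go
package main

import "fmt"

// eventRef carries the two fields needed to order replaceable events.
type eventRef struct {
	ID        string // hex event ID
	CreatedAt int64  // Unix timestamp in seconds
}

// shouldReplace reports whether candidate should supersede current.
// A newer timestamp wins; on a tie the lexically lower ID wins.
func shouldReplace(current, candidate eventRef) bool {
	if candidate.CreatedAt != current.CreatedAt {
		return candidate.CreatedAt > current.CreatedAt
	}
	return candidate.ID < current.ID
}

func main() {
	current := eventRef{ID: "bb", CreatedAt: 100}

	cases := []struct {
		name      string
		candidate eventRef
	}{
		{"newer replaces older", eventRef{ID: "cc", CreatedAt: 200}},
		{"older rejected", eventRef{ID: "cc", CreatedAt: 50}},
		{"same timestamp, lower id", eventRef{ID: "aa", CreatedAt: 100}},
		{"same timestamp, higher id", eventRef{ID: "cc", CreatedAt: 100}},
	}
	for _, c := range cases {
		fmt.Printf("%s: replace=%v\n", c.name, shouldReplace(current, c.candidate))
	}
}
```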

Integration Tests

  1. Contact list updates

     - Initial list
     - Add follows
     - Remove follows
     - Replace entire list
     - Empty list (unfollow all)

  2. Multiple users

     - Alice follows Bob
     - Bob follows Charlie
     - Verify independent updates

  3. Replaceable event semantics

     - Older event arrives after newer
     - Verify rejection

  4. Report accumulation

     - Multiple reports from same user
     - Multiple users reporting same target
     - Different report types

Performance Tests

  1. Large contact lists

     - 1000+ follows
     - Diff computation time
     - Transaction time

  2. High volume

     - 1000 events/sec
     - Graph write throughput
     - Query performance

Next Steps

Phase 1: Basic Testing (Current)

Phase 2: Optimization

Phase 3: Trust Metrics (Future)

Phase 4: Query Extensions (Future)

Configuration

No configuration is currently needed: social event processing is automatic for kinds 0, 3, 1984, and 10000.

Future configuration options:

# Enable/disable social graph processing
ORLY_SOCIAL_GRAPH_ENABLED=true

# Which event kinds to process
ORLY_SOCIAL_GRAPH_KINDS=0,3,1984,10000

# Batch size for initial sync
ORLY_SOCIAL_GRAPH_BATCH_SIZE=1000

# Enable WoT metrics computation (future)
ORLY_WOT_ENABLED=false
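If the `ORLY_SOCIAL_GRAPH_KINDS` option is adopted, parsing it could look like the sketch below. The option is listed above as a future setting, so the function name `parseKinds`, the default value, and the validation rules are assumptions rather than existing behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseKinds converts a comma-separated list such as "0,3,1984,10000" into
// event kinds. Blank entries are skipped. Values outside the NIP-01 kind
// range (0 to 65535) are rejected.
func parseKinds(raw string) ([]int, error) {
	var kinds []int
	for _, field := range strings.Split(raw, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		k, err := strconv.Atoi(field)
		if err != nil {
			return nil, fmt.Errorf("invalid kind %q: %w", field, err)
		}
		if k < 0 || k > 65535 {
			return nil, fmt.Errorf("kind %d out of range", k)
		}
		kinds = append(kinds, k)
	}
	return kinds, nil
}

func main() {
	raw := os.Getenv("ORLY_SOCIAL_GRAPH_KINDS")
	if raw == "" {
		raw = "0,3,1984,10000" // assumed default: the kinds processed today
	}
	kinds, err := parseKinds(raw)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("social graph kinds:", kinds)
}
```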

Monitoring

Metrics to Track

Logs to Monitor

INFO: processed contact list: author=abc123, added=5, removed=2, total=50
INFO: processed mute list: author=abc123, added=1, removed=0
INFO: processed report: reporter=abc123, reported=def456, type=spam
ERROR: failed to process social event kind 3: <error details>

Known Limitations

  1. No event deletion support yet

     - Kind 5 (event deletion) not implemented
     - Relationships persist even if event deleted

  2. No encrypted tag support

     - Kind 10000 can have encrypted tags (private mutes)
     - Currently only processes public tags

  3. No temporal queries yet

     - Can't query "who did Alice follow on 2024-01-01?"
     - Superseded chain exists but query API not implemented

  4. No batch import tool

     - Processing events one at a time
     - Need tool for initial sync from existing relay

  5. No trust metrics computation

     - NostrUser nodes created but metrics not calculated
     - Requires Phase 3 implementation

Performance Characteristics

Expected Performance

Bottlenecks

  1. Diff computation - O(n) where n = max(old_size, new_size)
  2. Graph writes - Neo4j transaction overhead
  3. Multiple network round trips - Could batch queries

Optimization Opportunities

  1. Use APOC procedures for batch operations
  2. Cache frequently accessed user data
  3. Parallelize independent social event processing
  4. Use Neo4j Graph Data Science library for trust metrics
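One way to cut round trips is to send added follows in fixed-size batches, one query per batch. The sketch below uses plain Cypher `UNWIND` rather than APOC, and it is an assumption about how batching could be done, not the current implementation. `chunk` and `batchFollowQuery` are hypothetical names.

```go
package main

import "fmt"

// batchFollowQuery creates FOLLOWS relationships for a whole batch of
// target pubkeys in one statement. Parameters: $author, $pubkeys,
// $event_id, $created_at.
const batchFollowQuery = `
MERGE (a:NostrUser {pubkey: $author})
WITH a
UNWIND $pubkeys AS pk
MERGE (b:NostrUser {pubkey: pk})
MERGE (a)-[f:FOLLOWS]->(b)
ON CREATE SET f.created_by_event = $event_id, f.created_at = $created_at`

// chunk splits items into consecutive slices of at most size elements.
// It returns nil when size is not positive.
func chunk(items []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	added := []string{"pk1", "pk2", "pk3", "pk4", "pk5"}
	for i, batch := range chunk(added, 2) {
		// Each batch would be passed as $pubkeys to batchFollowQuery.
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```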

Summary

This implementation provides a solid foundation for social graph management in the ORLY Neo4j backend:

✅ Event traceability - All relationships link to source events
✅ Diff-based updates - Efficient replaceable event handling
✅ NIP-01 compatible - Standard queries still work
✅ Extensible - Ready for trust metrics computation
✅ Well-documented - Comprehensive specs and code comments

The system is ready for testing and deployment in a pre-alpha environment.