NEO4J_BACKEND.md raw

Neo4j Database Backend for ORLY Relay

Overview

The Neo4j database backend provides a graph-native storage solution for the ORLY Nostr relay. Unlike traditional key-value or document stores, Neo4j is optimized for relationship-heavy queries, making it an ideal fit for Nostr's social graph and event reference patterns.

Architecture

Core Components

  1. Main Database File (pkg/neo4j/neo4j.go)

- Implements the database.Database interface
- Manages Neo4j driver connection and lifecycle
- Uses Badger for metadata storage (markers, identity, subscriptions)
- Registers with the database factory via init()

  2. Schema Management (pkg/neo4j/schema.go)

- Defines Neo4j constraints and indexes using Cypher
- Creates unique constraints on Event IDs and Author pubkeys
- Indexes for optimal query performance (kind, created_at, tags)

  3. Query Engine (pkg/neo4j/query-events.go)

- Translates Nostr REQ filters to Cypher queries
- Leverages graph traversal for tag relationships
- Supports prefix matching for IDs and pubkeys
- Parameterized queries for security and performance

  4. Event Storage (pkg/neo4j/save-event.go)

- Stores events as nodes with properties
- Creates graph relationships:
  - AUTHORED_BY: Event → Author
  - REFERENCES: Event → Event (e-tags)
  - MENTIONS: Event → Author (p-tags)
  - TAGGED_WITH: Event → Tag

Graph Schema

Node Types

Event Node

(:Event {
  id: string,           // Hex-encoded event ID (32 bytes)
  serial: int,          // Sequential serial number
  kind: int,            // Event kind
  created_at: int,      // Unix timestamp
  content: string,      // Event content
  sig: string,          // Hex-encoded signature
  pubkey: string,       // Hex-encoded author pubkey
  tags: string          // JSON-encoded tags array
})
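As an illustration of this property mapping, here is a small Python sketch (illustrative only, not the relay's Go code) that flattens a NIP-01 event JSON onto the node properties; the tags array is JSON-encoded because Neo4j property values cannot hold nested lists:

```python
import json

def event_to_node_props(event: dict, serial: int) -> dict:
    """Flatten a NIP-01 event into (:Event) node properties.

    Nested structures are not valid Neo4j property values, so the
    tags array is stored as a JSON string; queryable tag data lives
    in TAGGED_WITH relationships instead.
    """
    return {
        "id": event["id"],
        "serial": serial,
        "kind": event["kind"],
        "created_at": event["created_at"],
        "content": event["content"],
        "sig": event["sig"],
        "pubkey": event["pubkey"],
        "tags": json.dumps(event["tags"]),
    }
```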

Author Node

(:Author {
  pubkey: string        // Hex-encoded pubkey (unique)
})

Tag Node

(:Tag {
  type: string,         // Tag type (e.g., "t", "d")
  value: string         // Tag value
})

Word Node (NIP-50 search index)

(:Word {
  hash: string,         // 8-byte truncated SHA-256, hex-encoded (16 chars)
  text: string          // Normalized lowercase word (e.g., "bitcoin")
})

Marker Node (for metadata)

(:Marker {
  key: string,          // Unique key
  value: string         // Hex-encoded value
})

Relationships

- (:Event)-[:AUTHORED_BY]->(:Author)
- (:Event)-[:REFERENCES]->(:Event) (e-tags)
- (:Event)-[:MENTIONS]->(:Author) (p-tags)
- (:Event)-[:TAGGED_WITH]->(:Tag)
- (:Event)-[:HAS_WORD]->(:Word) (NIP-50 search index)

How Nostr REQ Messages Are Implemented

Filter to Cypher Translation

The query engine in query-events.go translates Nostr filters to Cypher queries:

1. ID Filters

{"ids": ["abc123..."]}

Becomes:

MATCH (e:Event)
WHERE e.id = $id_0

For prefix matching (partial IDs):

WHERE e.id STARTS WITH $id_0

2. Author Filters

{"authors": ["pubkey1...", "pubkey2..."]}

Becomes:

MATCH (e:Event)
WHERE e.pubkey IN $authors

3. Kind Filters

{"kinds": [1, 7]}

Becomes:

MATCH (e:Event)
WHERE e.kind IN $kinds

4. Time Range Filters

{"since": 1234567890, "until": 1234567900}

Becomes:

MATCH (e:Event)
WHERE e.created_at >= $since AND e.created_at <= $until

5. Tag Filters (Graph Advantage!)

{"#t": ["bitcoin", "nostr"]}

Becomes:

MATCH (e:Event)
OPTIONAL MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE t0.type = $tagType_0 AND t0.value IN $tagValues_0

This leverages Neo4j's native graph traversal for efficient tag queries!

6. NIP-50 Word Search

{"search": "bitcoin lightning"}

Becomes:

MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.hash IN $wordHashes
WITH e, count(DISTINCT w) AS matchCount
RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags, matchCount
ORDER BY matchCount DESC, e.created_at DESC
LIMIT $limit

Search can be combined with any other filter (kinds, authors, since/until, tags). See the NIP-50 Search section below for full details.

7. Combined Filters

{
  "kinds": [1],
  "authors": ["abc..."],
  "#p": ["xyz..."],
  "limit": 50
}

Becomes:

MATCH (e:Event)
OPTIONAL MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE e.kind IN $kinds
  AND e.pubkey IN $authors
  AND t0.type = $tagType_0
  AND t0.value IN $tagValues_0
RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags
ORDER BY e.created_at DESC
LIMIT $limit

Query Execution Flow

  1. Parse Filter: Extract IDs, authors, kinds, times, tags
  2. Build Cypher: Construct parameterized query with MATCH/WHERE clauses
  3. Execute: Run via ExecuteRead() with read-only session
  4. Parse Results: Convert Neo4j records to Nostr events
  5. Return: Send events back to client
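The flow above can be sketched in miniature. This is illustrative Python, not the relay's Go code (the real builder lives in query-events.go); the 500-row default limit here is an assumption:

```python
def build_cypher(flt: dict) -> tuple:
    """Assemble a parameterized Cypher query from a simplified Nostr filter.

    Handles kinds/authors/since/until; the real engine also handles ids,
    tag relationships, and search.
    """
    where, params = [], {}
    if "kinds" in flt:
        where.append("e.kind IN $kinds")
        params["kinds"] = flt["kinds"]
    if "authors" in flt:
        where.append("e.pubkey IN $authors")
        params["authors"] = flt["authors"]
    if "since" in flt:
        where.append("e.created_at >= $since")
        params["since"] = flt["since"]
    if "until" in flt:
        where.append("e.created_at <= $until")
        params["until"] = flt["until"]
    query = "MATCH (e:Event)"
    if where:
        query += "\nWHERE " + " AND ".join(where)
    query += ("\nRETURN e.id, e.kind, e.created_at, e.content,"
              " e.sig, e.pubkey, e.tags"
              "\nORDER BY e.created_at DESC\nLIMIT $limit")
    params["limit"] = flt.get("limit", 500)  # assumed default cap
    return query, params
```

Every value travels as a query parameter, never string-interpolated, which is what makes the generated Cypher safe and cacheable by Neo4j's query planner.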

Configuration

All configuration is centralized in app/config/config.go and visible via ./orly help.

Important: All environment variables must be defined in app/config/config.go. Do not use os.Getenv() directly in package code. Database backends receive configuration via the database.DatabaseConfig struct.

Environment Variables

# Neo4j Connection
ORLY_NEO4J_URI="bolt://localhost:7687"
ORLY_NEO4J_USER="neo4j"
ORLY_NEO4J_PASSWORD="password"

# Database Type Selection
ORLY_DB_TYPE="neo4j"

# Data Directory (for Badger metadata storage)
ORLY_DATA_DIR="~/.local/share/ORLY"

# Neo4j Driver Tuning (Memory Management)
ORLY_NEO4J_MAX_CONN_POOL=25       # Max connections (default: 25, driver default: 100)
ORLY_NEO4J_FETCH_SIZE=1000        # Records per fetch batch (default: 1000, -1=all)
ORLY_NEO4J_MAX_TX_RETRY_SEC=30    # Max transaction retry time in seconds
ORLY_NEO4J_QUERY_RESULT_LIMIT=10000  # Max results per query (0=unlimited)

Example Docker Compose Setup

version: '3.8'
services:
  neo4j:
    image: neo4j:5.15
    ports:
      - "7474:7474"  # HTTP
      - "7687:7687"  # Bolt
    environment:
      - NEO4J_AUTH=neo4j/password
      - NEO4J_PLUGINS=["apoc"]
      # Memory tuning for production
      - NEO4J_server_memory_heap_initial__size=512m
      - NEO4J_server_memory_heap_max__size=1g
      - NEO4J_server_memory_pagecache_size=512m
      # Transaction memory limits (prevent runaway queries)
      - NEO4J_dbms_memory_transaction_total__max=256m
      - NEO4J_dbms_memory_transaction_max=64m
      # Query timeout
      - NEO4J_dbms_transaction_timeout=30s
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs

  orly:
    build: .
    ports:
      - "3334:3334"
    environment:
      - ORLY_DB_TYPE=neo4j
      - ORLY_NEO4J_URI=bolt://neo4j:7687
      - ORLY_NEO4J_USER=neo4j
      - ORLY_NEO4J_PASSWORD=password
      # Driver tuning for memory management
      - ORLY_NEO4J_MAX_CONN_POOL=25
      - ORLY_NEO4J_FETCH_SIZE=1000
      - ORLY_NEO4J_QUERY_RESULT_LIMIT=10000
    depends_on:
      - neo4j

volumes:
  neo4j_data:
  neo4j_logs:

Performance Considerations

Advantages Over Badger/DGraph

  1. Native Graph Queries: Tag relationships and social graph traversals are native operations
  2. Optimized Indexes: Automatic index usage for constrained properties
  3. Efficient Joins: Relationship traversals are O(1) lookups
  4. Query Planner: Neo4j's query planner optimizes complex multi-filter queries

Tuning Recommendations

  1. Indexes: The schema creates indexes for:

- Event ID (unique constraint + index)
- Event kind
- Event created_at
- Composite: kind + created_at
- Tag type + value

  2. Cache Configuration: Configure Neo4j's page cache and heap size (see Memory Tuning below)
  3. Query Limits: The relay automatically enforces ORLY_NEO4J_QUERY_RESULT_LIMIT (default: 10000) to prevent unbounded queries from exhausting memory

Memory Tuning

Neo4j runs as a separate process (typically in Docker), so memory management involves both the relay driver settings and Neo4j server configuration.

Understanding Memory Layers

  1. ORLY Relay Process (~35MB RSS typical)

- Go driver connection pool
- Query result buffering
- Controlled by ORLY_NEO4J_* environment variables

  2. Neo4j Server Process (512MB-4GB+ depending on data)

- JVM heap for Java objects
- Page cache for graph data
- Transaction memory for query execution
- Controlled by NEO4J_* environment variables

Relay Driver Tuning (ORLY side)

| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_MAX_CONN_POOL | 25 | Max connections in pool. Lower = less memory, but may bottleneck under high load. Driver default is 100. |
| ORLY_NEO4J_FETCH_SIZE | 1000 | Records fetched per batch. Lower = less memory per query, more round trips. Set to -1 for all (risky). |
| ORLY_NEO4J_MAX_TX_RETRY_SEC | 30 | Max seconds to retry failed transactions. |
| ORLY_NEO4J_QUERY_RESULT_LIMIT | 10000 | Hard cap on results per query. Prevents unbounded queries. Set to 0 for unlimited (not recommended). |

Recommended settings for memory-constrained environments:

ORLY_NEO4J_MAX_CONN_POOL=10
ORLY_NEO4J_FETCH_SIZE=500
ORLY_NEO4J_QUERY_RESULT_LIMIT=5000

Neo4j Server Tuning (Docker/neo4j.conf)

JVM Heap Memory - For Java objects and query processing:

# Docker environment variables
NEO4J_server_memory_heap_initial__size=512m
NEO4J_server_memory_heap_max__size=1g

# neo4j.conf equivalent
server.memory.heap.initial_size=512m
server.memory.heap.max_size=1g

Page Cache - For caching graph data from disk:

# Docker
NEO4J_server_memory_pagecache_size=512m

# neo4j.conf
server.memory.pagecache.size=512m

Transaction Memory Limits - Prevent runaway queries:

# Docker
NEO4J_dbms_memory_transaction_total__max=256m   # Global limit across all transactions
NEO4J_dbms_memory_transaction_max=64m           # Per-transaction limit

# neo4j.conf
dbms.memory.transaction.total.max=256m
db.memory.transaction.max=64m

Query Timeout - Kill long-running queries:

# Docker
NEO4J_dbms_transaction_timeout=30s

# neo4j.conf
dbms.transaction.timeout=30s

Memory Sizing Guidelines

| Deployment Size | Heap | Page Cache | Total Neo4j | ORLY Pool |
|---|---|---|---|---|
| Development | 512m | 256m | ~1GB | 10 |
| Small relay (<100k events) | 1g | 512m | ~2GB | 25 |
| Medium relay (<1M events) | 2g | 1g | ~4GB | 50 |
| Large relay (>1M events) | 4g | 2g | ~8GB | 100 |

Formula for Page Cache:

Page Cache = Data Size on Disk × 1.2

Use neo4j-admin server memory-recommendation inside the container to get tailored recommendations.
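The rule of thumb and the sizing table combine into a small helper. This Python sketch is illustrative only; the thresholds are copied from the table above, not from the relay:

```python
def page_cache_mb(data_size_mb: float) -> int:
    """Rule of thumb from above: page cache = data size on disk x 1.2."""
    return round(data_size_mb * 1.2)

def suggest_tier(event_count: int) -> dict:
    """Look up heap / page cache / connection pool from the sizing table."""
    if event_count < 100_000:
        return {"heap": "1g", "pagecache": "512m", "pool": 25}
    if event_count < 1_000_000:
        return {"heap": "2g", "pagecache": "1g", "pool": 50}
    return {"heap": "4g", "pagecache": "2g", "pool": 100}
```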

Monitoring Memory Usage

Check Neo4j memory from relay logs:

# Driver config is logged at startup
grep "connecting to neo4j" /path/to/orly.log
# Output: connecting to neo4j at bolt://... (pool=25, fetch=1000, txRetry=30s)

Check Neo4j server memory:

# Inside Neo4j container
docker exec neo4j neo4j-admin server memory-recommendation

# Or query via Cypher
CALL dbms.listPools() YIELD pool, heapMemoryUsed, heapMemoryUsedBytes
RETURN pool, heapMemoryUsed

Monitor transaction memory:

CALL dbms.listTransactions()
YIELD transactionId, currentQuery, allocatedBytes
RETURN transactionId, currentQuery, allocatedBytes
ORDER BY allocatedBytes DESC

Implementation Details

Replaceable Events

Replaceable events (kinds 0, 3, 10000-19999) are handled in WouldReplaceEvent():

MATCH (e:Event {kind: $kind, pubkey: $pubkey})
WHERE e.created_at < $createdAt
RETURN e.serial, e.created_at

Older events are deleted before saving the new one.
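The kind ranges involved reduce to simple predicates. This sketch mirrors the distinction the relay's Go code makes; it is illustrative, not the actual implementation:

```python
def is_replaceable(kind: int) -> bool:
    """Replaceable kinds: 0, 3, and 10000-19999 (newest event wins)."""
    return kind in (0, 3) or 10000 <= kind <= 19999

def is_param_replaceable(kind: int) -> bool:
    """Parameterized replaceable kinds: 30000-39999 (also matched on d-tag)."""
    return 30000 <= kind <= 39999
```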

Parameterized Replaceable Events

For kinds 30000-39999, we also match on the d-tag:

MATCH (e:Event {kind: $kind, pubkey: $pubkey})-[:TAGGED_WITH]->(t:Tag {type: 'd', value: $dValue})
WHERE e.created_at < $createdAt
RETURN e.serial
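The d-tag value used in that match is the first d tag on the event. A minimal extraction sketch in Python (assuming, per the usual NIP-33 convention, that a missing d-tag means the empty identifier):

```python
def d_tag_value(tags: list) -> str:
    """Return the first d-tag value, or "" when the event has none."""
    for tag in tags:
        if len(tag) >= 2 and tag[0] == "d":
            return tag[1]
    return ""
```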

Event Deletion (NIP-09)

Delete events (kind 5) are processed via graph traversal:

MATCH (target:Event {id: $targetId})
MATCH (delete:Event {kind: 5})-[:REFERENCES]->(target)
WHERE delete.pubkey = $pubkey OR delete.pubkey IN $admins
RETURN delete.id

Only same-author or admin deletions are allowed.
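That permission rule reduces to a single predicate, sketched here in Python to mirror the WHERE clause above:

```python
def may_delete(deleter_pubkey: str, target_author: str, admins: set) -> bool:
    """A kind-5 delete applies only when its author wrote the target
    event or is a relay admin, matching the Cypher WHERE clause."""
    return deleter_pubkey == target_author or deleter_pubkey in admins
```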

Comparison with Other Backends

| Feature | Badger | DGraph | Neo4j |
|---|---|---|---|
| Storage Type | Key-value | Graph (distributed) | Graph (native) |
| Query Language | Custom indexes | DQL | Cypher |
| Tag Queries | Index lookups | Graph traversal | Native relationships |
| Scaling | Single-node | Distributed | Cluster/Causal cluster |
| Memory Usage | Low | Medium | High |
| Setup Complexity | Minimal | Medium | Medium |
| Best For | Small relays | Large distributed | Relationship-heavy |

Development Guide

Adding New Indexes

  1. Update schema.go with new index definition
  2. Add to applySchema() function
  3. Restart relay to apply schema changes

Example:

CREATE FULLTEXT INDEX event_content_fulltext IF NOT EXISTS
FOR (e:Event) ON EACH [e.content]
OPTIONS {indexConfig: {`fulltext.analyzer`: 'english'}}

Custom Queries

To add custom query methods:

  1. Add method to query-events.go
  2. Build Cypher query with parameterization
  3. Use ExecuteRead() or ExecuteWrite() as appropriate
  4. Parse results with parseEventsFromResult()

Testing

Due to Neo4j dependency, tests require a running Neo4j instance:

# Start Neo4j via Docker
docker run -d --name neo4j-test \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/test \
  neo4j:5.15

# Run tests
ORLY_NEO4J_URI="bolt://localhost:7687" \
ORLY_NEO4J_USER="neo4j" \
ORLY_NEO4J_PASSWORD="test" \
go test ./pkg/neo4j/...

# Cleanup
docker rm -f neo4j-test

Bolt+S External Access (Remote Cypher Queries)

What Is Bolt+S?

Bolt+S is Neo4j's encrypted Bolt protocol — the same wire protocol used by cypher-shell, Neo4j Browser, and client drivers, but wrapped in TLS. By default, the ORLY relay's Neo4j instance only listens on localhost and is not accessible from the network. Enabling bolt+s exposes Neo4j on a public port with TLS encryption, allowing external tools to run Cypher queries against the relay's graph database.

This is useful for:

- Running analytics or debugging queries from tools like cypher-shell and Neo4j Browser
- Building custom search and analytics applications on the relay's graph

How It Works

The ORLY relay manages Neo4j's bolt+s configuration through its web admin UI:

  1. The relay reads and writes neo4j.conf directly (path configured via ORLY_NEO4J_CONF_PATH)
  2. When an owner toggles bolt+s on/off, the relay modifies the relevant TLS and connector settings in neo4j.conf
  3. The relay then restarts Neo4j via a configurable shell command (default: sudo systemctl restart neo4j)
  4. The web UI displays the resulting bolt+s:// connection URI

The admin UI is only visible when ORLY_DB_TYPE=neo4j and only accessible to users with owner-level permissions.

Prerequisites

Before enabling bolt+s, you need:

  1. Neo4j installed and running as a systemd service (or equivalent)
  2. TLS certificates — typically from Let's Encrypt, but any valid cert/key pair works
  3. The relay user must be able to restart Neo4j — via passwordless sudo
  4. Neo4j must be able to read the TLS certificates — file permissions
  5. The bolt port must be open in your firewall

Step-by-Step Setup

1. Set the TLS Certificate Directory

Tell the relay where your TLS certificates live. For Let's Encrypt:

ORLY_NEO4J_TLS_CERT_DIR=/etc/letsencrypt/live/relay.example.com

This directory must contain:

If you're using Let's Encrypt with Caddy or certbot, these files are created automatically. The relay configures Neo4j to use these exact filenames.

2. Grant the Relay User Permission to Restart Neo4j

The relay needs to run sudo systemctl restart neo4j after modifying the config. Create a sudoers rule so this works without a password:

echo 'mleku ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart neo4j' | sudo tee /etc/sudoers.d/orly-neo4j
sudo chmod 440 /etc/sudoers.d/orly-neo4j

Replace mleku with the user running the relay. This grants permission only for systemctl restart neo4j — nothing else.

3. Ensure Neo4j Can Read the TLS Certificates

Let's Encrypt certificates are typically owned by root with restricted permissions. Neo4j (which runs as the neo4j user) needs read access:

sudo chmod -R 755 /etc/letsencrypt/live/ /etc/letsencrypt/archive/

Alternatively, copy the certificates to a directory Neo4j owns:

sudo mkdir -p /var/lib/neo4j/certificates/bolt
sudo cp /etc/letsencrypt/live/relay.example.com/privkey.pem /var/lib/neo4j/certificates/bolt/
sudo cp /etc/letsencrypt/live/relay.example.com/fullchain.pem /var/lib/neo4j/certificates/bolt/
sudo chown -R neo4j:neo4j /var/lib/neo4j/certificates/bolt

If you copy certificates, set ORLY_NEO4J_TLS_CERT_DIR=/var/lib/neo4j/certificates/bolt and remember to update the copies when certificates renew.

4. Open the Bolt Port in Your Firewall

The default bolt port is 7687. If you're using ufw:

sudo ufw allow 7687/tcp

If you're using a cloud provider's firewall (AWS security groups, DigitalOcean firewall, etc.), add an inbound rule for TCP port 7687.

To use a non-default port:

ORLY_NEO4J_BOLT_PORT=7688

5. Restart the Relay

The relay reads ORLY_NEO4J_TLS_CERT_DIR at startup. After setting it, restart the relay:

sudo systemctl restart orly

6. Enable Bolt+S via the Web UI

  1. Log in to the relay's admin dashboard with an owner account
  2. Click the Neo4j tab in the sidebar
  3. Click Load Configuration to see the current bolt+s status
  4. Toggle the Bolt+S switch to enabled
  5. Click Apply & Restart Neo4j
  6. The connection URI (e.g., bolt+s://relay.example.com:7687) appears once enabled

Environment Variables

| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_CONF_PATH | /etc/neo4j/neo4j.conf | Path to neo4j.conf. The relay reads and writes this file to manage bolt+s settings. |
| ORLY_NEO4J_RESTART_CMD | sudo systemctl restart neo4j | Shell command to restart Neo4j after config changes. Must succeed without interactive input. |
| ORLY_NEO4J_TLS_CERT_DIR | (empty) | Directory containing privkey.pem and fullchain.pem. Must be set before bolt+s can be enabled. |
| ORLY_NEO4J_BOLT_PORT | 7687 | The external port Neo4j listens on for bolt connections. Used in both neo4j.conf and the displayed connection URI. |

What the Relay Modifies in neo4j.conf

When enabling bolt+s, the relay sets these keys in neo4j.conf:

# Require TLS for all bolt connections
dbms.connector.bolt.tls_level=REQUIRED

# Listen on all interfaces (not just localhost)
dbms.connector.bolt.listen_address=0.0.0.0:7687

# Enable the bolt SSL policy
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=/etc/letsencrypt/live/relay.example.com
dbms.ssl.policy.bolt.private_key=privkey.pem
dbms.ssl.policy.bolt.public_certificate=fullchain.pem
dbms.ssl.policy.bolt.client_auth=NONE

When disabling, tls_level is set to DISABLED, bolt.enabled to false, and the certificate settings are commented out. The relay handles both Neo4j 4.x (dbms.connector.bolt.*) and 5.x (server.bolt.*) config key formats.

Connecting to the Database

Once bolt+s is enabled, connect with any Neo4j client:

cypher-shell:

cypher-shell -a bolt+s://relay.example.com:7687 -u neo4j -p <password>

Neo4j Browser:

  1. Open Neo4j Browser (Desktop or web)
  2. Enter connection URL: bolt+s://relay.example.com:7687
  3. Enter credentials (same as ORLY_NEO4J_USER / ORLY_NEO4J_PASSWORD)

Neo4j Python driver:

from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    "bolt+s://relay.example.com:7687",
    auth=("neo4j", "password")
)

with driver.session() as session:
    result = session.run("MATCH (e:Event) RETURN count(e) AS total")
    print(result.single()["total"])

Neo4j JavaScript driver:

import neo4j from "neo4j-driver";

const driver = neo4j.driver(
    "bolt+s://relay.example.com:7687",
    neo4j.auth.basic("neo4j", "password")
);

const session = driver.session();
const result = await session.run("MATCH (a:Author) RETURN count(a) AS total");
console.log(result.records[0].get("total"));
await session.close();

Example Queries

Once connected, you can query the relay's graph directly:

// Count all events
MATCH (e:Event) RETURN count(e) AS total;

// Find the most referenced authors
MATCH (e:Event)-[:MENTIONS]->(a:Author)
RETURN a.pubkey, count(e) AS mentions
ORDER BY mentions DESC LIMIT 20;

// Social graph: who follows whom (kind 3 contact lists)
MATCH (e:Event {kind: 3})-[:AUTHORED_BY]->(a:Author)
MATCH (e)-[:MENTIONS]->(followed:Author)
RETURN a.pubkey AS follower, followed.pubkey AS follows
LIMIT 100;

// Event references (reply chains)
MATCH path = (reply:Event)-[:REFERENCES*1..5]->(root:Event)
WHERE root.id = $eventId
RETURN path;

// Tag popularity
MATCH (e:Event)-[:TAGGED_WITH]->(t:Tag {type: 't'})
RETURN t.value, count(e) AS usage
ORDER BY usage DESC LIMIT 50;

Security Considerations

For external access, consider creating a dedicated read-only user instead of sharing the relay's Neo4j credentials:

CREATE USER reader SET PASSWORD 'readonly123' SET PASSWORD CHANGE NOT REQUIRED;
GRANT ROLE reader TO reader;
GRANT READ {*} ON GRAPH * TO reader;

Troubleshooting

"TLS cert dir must be set before enabling bolt+s"

Set ORLY_NEO4J_TLS_CERT_DIR in your environment and restart the relay. The toggle is disabled until this is configured.

"Config updated but Neo4j restart failed"

The relay modified neo4j.conf successfully but couldn't restart Neo4j. Check:

  1. Sudoers: Verify the relay user can run the restart command:

sudo -n systemctl restart neo4j

If this prompts for a password, the sudoers rule is missing or incorrect.

  2. Neo4j service exists:

systemctl status neo4j

  3. Restart command override: If Neo4j isn't managed by systemd, set a custom command:

ORLY_NEO4J_RESTART_CMD="sudo /usr/local/bin/restart-neo4j.sh"

Cannot connect after enabling bolt+s

  1. Check Neo4j is running:

systemctl status neo4j
journalctl -u neo4j -n 50

  2. Check Neo4j logs for TLS errors — common causes:

- Certificate files not readable by the neo4j user
- Certificate and key don't match
- Certificate expired

  3. Check the port is open:

ss -tlnp | grep 7687

  4. Check firewall:

sudo ufw status | grep 7687

  5. Test connectivity from the client machine:

openssl s_client -connect relay.example.com:7687

Certificate renewal

If you're using Let's Encrypt, certificates renew automatically but Neo4j needs a restart to pick up the new files. Add a renewal hook:

# /etc/letsencrypt/renewal-hooks/deploy/restart-neo4j.sh
#!/bin/bash
systemctl restart neo4j

Make the hook executable:

sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/restart-neo4j.sh

Rolling back (disabling bolt+s)

Toggle bolt+s off in the web UI and click Apply & Restart Neo4j. This sets tls_level=DISABLED, comments out the certificate settings, and restarts Neo4j. Bolt connections will revert to unencrypted localhost-only access.

You can also manually revert by editing neo4j.conf:

sudo sed -i 's/^dbms.connector.bolt.tls_level=REQUIRED/dbms.connector.bolt.tls_level=DISABLED/' /etc/neo4j/neo4j.conf
sudo systemctl restart neo4j

API Endpoints

The bolt+s management feature exposes three HTTP endpoints:

| Endpoint | Method | Auth | Description |
|---|---|---|---|
| /api/neo4j/config | GET | None | Returns { "db_type": "neo4j" }. Used by the UI to decide whether to show the Neo4j tab. |
| /api/neo4j/bolt | GET | Owner (NIP-98) | Returns current bolt+s status, config path, port, cert dir, and connection URI. |
| /api/neo4j/bolt/toggle | POST | Owner (NIP-98) | Accepts { "enabled": true/false }. Modifies neo4j.conf and restarts Neo4j. |

Cypher Query Proxy (HTTP Endpoint)

ORLY binds to the internal Neo4j bolt connection and multiplexes incoming Cypher queries through its HTTP endpoint at POST /api/neo4j/cypher. Authentication uses NIP-98 — developers sign requests with their Nostr identity. No SSL certificates, no Neo4j credentials, no extra ports. A developer only needs their nsec.

This is the primary way to query the Neo4j graph. ORLY sits between the client and Neo4j, handling authentication, read-only enforcement, and timeouts. Neo4j never needs to be exposed to the network at all — it listens on localhost, and ORLY multiplexes access through its existing HTTP port.

Architecture

ORLY maintains the bolt connection to Neo4j internally. When a query arrives over HTTP, ORLY validates the NIP-98 signature, checks the query is read-only, and forwards it to Neo4j through the bolt connection. The result comes back as JSON over HTTP.

In split IPC mode (Cloudron deployment), the relay process proxies through gRPC to the database process that holds the bolt connection:

Developer (nsec) → HTTP + NIP-98 → ORLY relay → gRPC → orly-db-neo4j → bolt → Neo4j

In direct mode (ORLY_DB_TYPE=neo4j), the relay holds the bolt connection itself:

Developer (nsec) → HTTP + NIP-98 → ORLY relay → bolt → Neo4j

Either way, the developer experience is the same: sign an HTTP request with your nsec, get JSON back.

Why NIP-98 Instead of SSL

Traditional Neo4j access via bolt+s requires:

- TLS certificates (and keeping them renewed)
- Neo4j credentials distributed to each developer
- An extra open port in the firewall

NIP-98 replaces all of that. The developer already has a Nostr identity (nsec/npub). ORLY checks the signed event in the Authorization header and verifies the pubkey is in ORLY_OWNERS. No certificates to generate, no credentials to distribute, no ports to open. The Nostr identity is both the authentication and authorization mechanism.

Configuration

| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_CYPHER_ENABLED | false | Enable the Cypher query endpoint |
| ORLY_NEO4J_CYPHER_TIMEOUT | 30 | Default query timeout in seconds (max 120) |
| ORLY_NEO4J_CYPHER_MAX_ROWS | 10000 | Max result rows returned per query (0 = unlimited) |

Request Format

# Using ORLY's NIP-98 debugging tool (nurl)
# The only credential needed is your nsec
NOSTR_SECRET_KEY=nsec1... ./nurl -X POST \
  -d '{"query": "MATCH (n) RETURN count(n) AS cnt"}' \
  https://relay.example.com/api/neo4j/cypher

JSON body:

{
  "query": "MATCH (e:Event)-[:AUTHORED_BY]->(a:Author) WHERE a.pubkey = $pk RETURN e.kind, count(*) AS cnt ORDER BY cnt DESC",
  "params": {"pk": "abc123..."},
  "timeout": 30
}

Response Format

The response is a JSON array of record objects. Each object has one key per column in the Cypher RETURN clause:

[
  {"e.kind": 1, "cnt": 4523},
  {"e.kind": 0, "cnt": 892},
  {"e.kind": 3, "cnt": 341}
]

Neo4j node, relationship, and path types are converted to JSON-safe maps with metadata prefixed by _:

{
  "_type": "node",
  "_id": "4:abc:123",
  "_labels": ["Event"],
  "id": "aabb...",
  "kind": 1,
  "created_at": 1700000000
}

Read-Only Validation

The proxy rejects queries containing write operations: CREATE, MERGE, SET, DELETE, REMOVE, DETACH, DROP, and CALL. Comments (// and /* */) are stripped before checking. This is a safety net — the NIP-98 owner-only gate is the primary access control, not the keyword filter.
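A sketch of that check (illustrative Python; the relay implements this in Go, and the exact tokenization used here is an assumption):

```python
import re

WRITE_KEYWORDS = {"CREATE", "MERGE", "SET", "DELETE",
                  "REMOVE", "DETACH", "DROP", "CALL"}

def is_read_only(query: str) -> bool:
    """Strip // and /* */ comments, then reject any write keyword.

    A safety net only: hiding CREATE inside a comment does not bypass
    the check because comments are removed before keywords are scanned.
    """
    stripped = re.sub(r"//[^\n]*", "", query)
    stripped = re.sub(r"/\*.*?\*/", "", stripped, flags=re.DOTALL)
    tokens = re.findall(r"[A-Za-z_]+", stripped.upper())
    return not any(tok in WRITE_KEYWORDS for tok in tokens)
```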

Error Responses

| Status | Meaning |
|---|---|
| 401 | Missing or invalid NIP-98 authentication |
| 403 | Authenticated user is not a relay owner |
| 404 | Endpoint disabled (ORLY_NEO4J_CYPHER_ENABLED=false) |
| 400 | Malformed JSON body or missing query field |
| 500 | Query execution error (timeout, syntax error, write rejection) |
| 503 | Database backend does not support Cypher (not Neo4j/gRPC-to-Neo4j) |

Use Cases

NIP-50 Word Search

The Neo4j backend supports full-text word search via NIP-50, using a graph-based inverted index. This is the same approach used by the Badger backend — words are stored as graph nodes linked to events, enabling efficient set-intersection queries via graph traversal.

How It Works

When an event is saved, its content and tag values are tokenized into individual words. Each unique word becomes a (:Word) node in the graph, linked to the event via a (:Event)-[:HAS_WORD]->(:Word) relationship. Searching for words is then a graph traversal from Word nodes back to Events.

Indexing pipeline:

  1. Event content is split into words (Unicode-aware word boundaries)
  2. Tag values from all tags (subjects, hashtags, etc.) are also tokenized
  3. Each word is normalized to lowercase ASCII (decorative Unicode → ASCII via NFKD)
  4. Words shorter than 2 characters are discarded
  5. URLs, Nostr URIs (nostr:...), and 64-character hex strings are skipped entirely
  6. Each word is hashed with SHA-256, truncated to 8 bytes, and hex-encoded (16 chars)
  7. A (:Word {hash, text}) node is MERGE'd (created if new, reused if existing)
  8. A [:HAS_WORD] relationship is created from the event to the word

The hash serves as the unique key (~10^19 possible values — far beyond all human language words). The text property stores the readable word for debugging and direct Cypher queries.
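The hash derivation is easy to reproduce. A Python sketch (note the real pipeline also folds decorative Unicode to ASCII before hashing, which this sketch omits):

```python
import hashlib

def word_hash(word: str) -> str:
    """First 8 bytes of SHA-256 of the lowercased word, hex-encoded
    to 16 characters (the (:Word).hash property)."""
    return hashlib.sha256(word.lower().encode("utf-8")).digest()[:8].hex()
```

This is handy when debugging the index: compute the hash locally, then look up the matching (:Word) node by its hash property.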

Deployment: Upgrading to v0.60.5+

To enable word search on an existing Neo4j deployment:

  1. Update the relay binary to v0.60.5 or later
  2. Restart the relay — no config changes needed
  3. Wait for migration v6 to complete

On startup, the relay automatically:

- Checks for the (:Migration {version: "v6"}) marker node
- If absent, backfills Word nodes and HAS_WORD relationships for all existing events

Monitoring migration progress — watch the relay logs:

[INFO] applying migration v6: Backfill Word nodes and HAS_WORD relationships for NIP-50 search
[INFO] backfilling word index for 50000 events...
[INFO] backfilled word index: 500/50000 events processed
[INFO] backfilled word index: 1000/50000 events processed
...
[INFO] word index backfill complete: 50000 events processed
[INFO] migration v6 completed successfully

The migration processes events in batches of 500. For a relay with 50,000 events, expect it to take a few minutes. For a relay with millions of events, it may take longer, but it runs non-blocking: the relay serves requests during the migration, and search results will simply be incomplete until it finishes.

Verifying migration completion — check for the Migration marker node:

MATCH (m:Migration {version: "v6"})
RETURN m.version, m.description, m.applied_at

Verifying word index health:

// Count indexed events and unique words
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
RETURN count(DISTINCT e) AS events_indexed,
       count(DISTINCT w) AS unique_words

// Check a specific word exists
MATCH (w:Word {text: "bitcoin"})
RETURN w.hash, w.text

Querying via NIP-50 (Nostr Protocol)

Clients send a standard NIP-50 REQ message with a search field in the filter:

["REQ", "sub-id", {"search": "bitcoin lightning", "limit": 20}]

The relay tokenizes the search string using the same pipeline as indexing (lowercase, ASCII normalization, hash lookup), then traverses the word graph to find matching events.

Search can be combined with any standard NIP-01 filter:

["REQ", "sub-id", {
  "search": "bitcoin",
  "kinds": [1],
  "authors": ["abc123..."],
  "since": 1700000000,
  "#t": ["cryptocurrency"],
  "limit": 50
}]

All filter clauses are applied as additional WHERE conditions on the search query, so the index is used for both word matching and filter narrowing in a single graph traversal.

Relevance Scoring

Results are ranked using a 50/50 blend of match relevance and recency:

score = 0.5 × (matchCount / maxMatchCount) + 0.5 × (recencyRatio)

Where:

- matchCount: the number of distinct search terms the event matches
- maxMatchCount: the highest matchCount among the matched events
- recencyRatio: how recent the event's created_at is relative to the other matches (newer is closer to 1)

This means an event matching all search terms AND being recent ranks highest, while an event matching fewer terms or being older ranks lower. Events with equal scores are ordered by created_at DESC.
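Numerically, the blend works like this (a Python sketch of the formula above; it assumes recencyRatio is pre-normalized to the range [0, 1]):

```python
def blended_score(match_count: int, max_match_count: int,
                  recency_ratio: float) -> float:
    """50/50 blend of term-match relevance and recency.

    match_count / max_match_count normalizes relevance to [0, 1];
    recency_ratio is assumed already normalized to [0, 1].
    """
    return 0.5 * (match_count / max_match_count) + 0.5 * recency_ratio
```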

Example: Multi-Term Search

For "bitcoin lightning network":

Querying via Bolt+S (Direct Cypher)

If bolt+s is enabled (see Bolt+S External Access), you can query the word index directly with Cypher. This is useful for analytics, debugging, and building custom search experiences outside the Nostr protocol.

Find Events Containing a Word

MATCH (e:Event)-[:HAS_WORD]->(w:Word {text: "bitcoin"})
RETURN e.id, e.content, e.created_at
ORDER BY e.created_at DESC
LIMIT 20

Multi-Word Intersection (AND Logic)

Find events containing ALL specified words:

WITH ["bitcoin", "lightning", "network"] AS searchTerms
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.text IN searchTerms
WITH e, count(DISTINCT w) AS matchCount, size(searchTerms) AS totalTerms
WHERE matchCount = totalTerms
RETURN e.id, e.content, e.created_at
ORDER BY e.created_at DESC
LIMIT 20

Multi-Word Union with Ranking (OR Logic)

Find events containing ANY of the words, ranked by how many they match:

WITH ["bitcoin", "lightning", "network"] AS searchTerms
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.text IN searchTerms
WITH e, count(DISTINCT w) AS matchCount
RETURN e.id, e.content, e.created_at, matchCount
ORDER BY matchCount DESC, e.created_at DESC
LIMIT 50

Word Co-Occurrence Analysis

Find words that frequently appear alongside "bitcoin":

MATCH (e:Event)-[:HAS_WORD]->(w1:Word {text: "bitcoin"})
MATCH (e)-[:HAS_WORD]->(w2:Word)
WHERE w2.text <> "bitcoin"
RETURN w2.text AS related_word, count(e) AS co_occurrences
ORDER BY co_occurrences DESC
LIMIT 20

Author's Vocabulary Profile

Find the most-used words by a specific author:

MATCH (e:Event {pubkey: $pubkey})-[:HAS_WORD]->(w:Word)
RETURN w.text, count(e) AS usage
ORDER BY usage DESC
LIMIT 50

Trending Words Over Time

Find words that appeared most in the last 24 hours:

MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE e.created_at > (timestamp() / 1000 - 86400)
RETURN w.text, count(e) AS mentions
ORDER BY mentions DESC
LIMIT 30

Hash-Based Queries

For programmatic access, you can query by hash directly (avoiding normalization edge cases):

MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.hash IN ["a1b2c3d4e5f6a7b8", "f8e7d6c5b4a39281"]
RETURN e.id, e.content, count(DISTINCT w) AS matchCount
ORDER BY matchCount DESC
LIMIT 20

The hash is the first 8 bytes of SHA-256(lowercase_word), hex-encoded to 16 characters.

What Gets Indexed (and What Doesn't)

| Content Type | Indexed? | Example |
|---|---|---|
| Regular words | Yes | "Bitcoin is great" → bitcoin, is, great |
| Tag values | Yes | ["t", "cryptocurrency"] → cryptocurrency |
| Subject tags | Yes | ["subject", "decentralized finance"] → decentralized, finance |
| URLs | No | https://example.com/page → skipped entirely |
| Nostr URIs | No | nostr:npub1abc... → skipped entirely |
| Hex strings (64 chars) | No | Event IDs, pubkeys → skipped |
| Single characters | No | "a", "I" → too short (< 2 chars) |
| Mixed case | Normalized | "BITCOIN", "Bitcoin" → bitcoin |
| Decorative Unicode | Normalized | "𝗕𝗶𝘁𝗰𝗼𝗶𝗻" → bitcoin |
| Emoji | No | Emoji are not alphabetic words |
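The rules in this table can be approximated in a few lines. This is an illustrative Python sketch, not the relay's Go tokenizer (which may differ in edge cases), and it omits the stopword filtering added in v0.60.6:

```python
import re
import unicodedata

def tokenize(text: str) -> list:
    """Approximate the indexing rules: skip URLs and nostr: URIs,
    fold decorative Unicode to ASCII, lowercase, drop words shorter
    than 2 characters and 64-char hex strings (IDs/pubkeys)."""
    words = []
    for raw in text.split():
        if raw.startswith(("http://", "https://", "nostr:")):
            continue  # URLs and Nostr URIs are skipped entirely
        # NFKD folds e.g. mathematical bold letters down to plain ASCII
        folded = unicodedata.normalize("NFKD", raw)
        folded = folded.encode("ascii", "ignore").decode()
        for word in re.findall(r"[a-z0-9]+", folded.lower()):
            if len(word) < 2:
                continue  # single characters are too short
            if len(word) == 64 and all(c in "0123456789abcdef" for c in word):
                continue  # event IDs / pubkeys
            words.append(word)
    return words
```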

Performance Characteristics

The word search index uses Neo4j's native graph traversal rather than a Lucene-based fulltext index. This has specific trade-offs:

Strengths:

- Word matching and standard filter narrowing run in a single graph traversal
- Multi-word queries are simple set intersections over HAS_WORD relationships
- No separate full-text index to configure or keep in sync

Considerations:

- Exact-word matching only: no stemming, fuzzy matching, or phrase queries
- Every unique word adds a node and every occurrence a relationship, so the index grows with vocabulary and event volume

Migration Details (v6)

The v6 migration backfills Word nodes for all events that existed before upgrading:

  1. Finds all events without any HAS_WORD relationships
  2. Processes them in batches of 500
  3. For each event: tokenizes content + all tag values
  4. Creates Word nodes and HAS_WORD relationships using UNWIND batch pattern
  5. Logs progress every batch

The migration is idempotent — it only processes events missing word relationships, so it's safe to restart mid-migration. If the relay crashes during migration, it will resume from where it left off on the next startup.

The migration marker is stored as a (:Migration {version: "v6"}) node. Once this node exists, the migration is skipped on subsequent startups.

Future Enhancements

  1. ~~Full-text Search: Leverage Neo4j's full-text indexes for content search~~ — Done in v0.60.5 (graph-based word index via NIP-50)
  2. Graph Analytics: Implement social graph metrics (centrality, communities)
  3. ~~Advanced Queries: Support NIP-50 search via Cypher full-text capabilities~~ — Done in v0.60.5
  4. Clustering: Deploy Neo4j cluster for high availability
  5. APOC Procedures: Utilize APOC library for advanced graph algorithms
  6. Caching Layer: Implement query result caching similar to Badger backend
  7. ~~Stopword Filtering: Optionally skip indexing high-frequency words ("the", "is", "and") to reduce graph size~~ — Done in v0.60.6 (common English function words filtered during tokenization)

Troubleshooting

Connection Issues

# Test connectivity
cypher-shell -a bolt://localhost:7687 -u neo4j -p password

# Check Neo4j logs
docker logs neo4j

Performance Issues

// View query execution plan
EXPLAIN MATCH (e:Event) WHERE e.kind = 1 RETURN e LIMIT 10

// Profile query performance
PROFILE MATCH (e:Event)-[:AUTHORED_BY]->(a:Author) RETURN e, a LIMIT 10

Schema Issues

// List all constraints
SHOW CONSTRAINTS

// List all indexes
SHOW INDEXES

// Drop and recreate schema
DROP CONSTRAINT event_id_unique IF EXISTS
CREATE CONSTRAINT event_id_unique FOR (e:Event) REQUIRE e.id IS UNIQUE

References

License

This Neo4j backend implementation follows the same license as the ORLY relay project.