The Neo4j database backend provides a graph-native storage solution for the ORLY Nostr relay. Unlike traditional key-value or document stores, Neo4j is optimized for relationship-heavy queries, making it an ideal fit for Nostr's social graph and event reference patterns.
- Implements the database.Database interface
- Manages Neo4j driver connection and lifecycle
- Uses Badger for metadata storage (markers, identity, subscriptions)
- Registers with the database factory via init()
- Defines Neo4j constraints and indexes using Cypher
- Creates unique constraints on Event IDs and Author pubkeys
- Creates indexes for optimal query performance (kind, created_at, tags)
- Translates Nostr REQ filters to Cypher queries
- Leverages graph traversal for tag relationships
- Supports prefix matching for IDs and pubkeys
- Uses parameterized queries for security and performance
- Stores events as nodes with properties
- Creates graph relationships:
- AUTHORED_BY: Event → Author
- REFERENCES: Event → Event (e-tags)
- MENTIONS: Event → Author (p-tags)
- TAGGED_WITH: Event → Tag
Event Node
(:Event {
  id: string,          // Hex-encoded event ID (32 bytes)
  serial: int,         // Sequential serial number
  kind: int,           // Event kind
  created_at: int,     // Unix timestamp
  content: string,     // Event content
  sig: string,         // Hex-encoded signature
  pubkey: string,      // Hex-encoded author pubkey
  tags: string         // JSON-encoded tags array
})
Author Node
(:Author {
  pubkey: string       // Hex-encoded pubkey (unique)
})
Tag Node
(:Tag {
  type: string,        // Tag type (e.g., "t", "d")
  value: string        // Tag value
})
Word Node (NIP-50 search index)
(:Word {
  hash: string,        // 8-byte truncated SHA-256, hex-encoded (16 chars)
  text: string         // Normalized lowercase word (e.g., "bitcoin")
})
Marker Node (for metadata)
(:Marker {
  key: string,         // Unique key
  value: string        // Hex-encoded value
})
(:Event)-[:AUTHORED_BY]->(:Author) - Event authorship
(:Event)-[:REFERENCES]->(:Event) - Event references (e-tags)
(:Event)-[:MENTIONS]->(:Author) - Author mentions (p-tags)
(:Event)-[:TAGGED_WITH]->(:Tag) - Generic tag associations
(:Event)-[:HAS_WORD]->(:Word) - Word search index (NIP-50)

The query engine in query-events.go translates Nostr filters to Cypher queries:
{"ids": ["abc123..."]}
Becomes:
MATCH (e:Event)
WHERE e.id = $id_0
For prefix matching (partial IDs):
WHERE e.id STARTS WITH $id_0
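The exact-vs-prefix decision can be sketched as a small query-builder helper. This is a hypothetical `id_clause` function, not the relay's actual code; the assumption is that a full event ID is exactly 64 hex characters, so anything shorter is treated as a prefix:

```python
def id_clause(ids):
    """Build a Cypher WHERE fragment and parameter map for an 'ids' filter.

    Full 64-char hex IDs use equality; shorter strings become prefix
    matches. A sketch of the approach, not the relay's implementation.
    """
    clauses, params = [], {}
    for i, event_id in enumerate(ids):
        key = f"id_{i}"
        params[key] = event_id
        if len(event_id) == 64:               # full event ID
            clauses.append(f"e.id = ${key}")
        else:                                  # partial ID -> prefix match
            clauses.append(f"e.id STARTS WITH ${key}")
    return "(" + " OR ".join(clauses) + ")", params

clause, params = id_clause(["abc123", "f" * 64])
# clause -> "(e.id STARTS WITH $id_0 OR e.id = $id_1)"
```

Keeping values in the parameter map (rather than interpolating them into the query string) is what makes the generated Cypher safe and cache-friendly.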
{"authors": ["pubkey1...", "pubkey2..."]}
Becomes:
MATCH (e:Event)
WHERE e.pubkey IN $authors
{"kinds": [1, 7]}
Becomes:
MATCH (e:Event)
WHERE e.kind IN $kinds
{"since": 1234567890, "until": 1234567900}
Becomes:
MATCH (e:Event)
WHERE e.created_at >= $since AND e.created_at <= $until
{"#t": ["bitcoin", "nostr"]}
Becomes:
MATCH (e:Event)
OPTIONAL MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE t0.type = $tagType_0 AND t0.value IN $tagValues_0
This leverages Neo4j's native graph traversal for efficient tag queries!
{"search": "bitcoin lightning"}
Becomes:
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.hash IN $wordHashes
WITH e, count(DISTINCT w) AS matchCount
RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags, matchCount
ORDER BY matchCount DESC, e.created_at DESC
LIMIT $limit
Search can be combined with any other filter (kinds, authors, since/until, tags). See the NIP-50 Search section below for full details.
{
"kinds": [1],
"authors": ["abc..."],
"#p": ["xyz..."],
"limit": 50
}
Becomes:
MATCH (e:Event)
OPTIONAL MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE e.kind IN $kinds
AND e.pubkey IN $authors
AND t0.type = $tagType_0
AND t0.value IN $tagValues_0
RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags
ORDER BY e.created_at DESC
LIMIT $limit
- ExecuteRead() with read-only session

All configuration is centralized in app/config/config.go and visible via ./orly help.
Important: All environment variables must be defined in app/config/config.go. Do not use os.Getenv() directly in package code. Database backends receive configuration via the database.DatabaseConfig struct.
# Neo4j Connection
ORLY_NEO4J_URI="bolt://localhost:7687"
ORLY_NEO4J_USER="neo4j"
ORLY_NEO4J_PASSWORD="password"
# Database Type Selection
ORLY_DB_TYPE="neo4j"
# Data Directory (for Badger metadata storage)
ORLY_DATA_DIR="~/.local/share/ORLY"
# Neo4j Driver Tuning (Memory Management)
ORLY_NEO4J_MAX_CONN_POOL=25 # Max connections (default: 25, driver default: 100)
ORLY_NEO4J_FETCH_SIZE=1000 # Records per fetch batch (default: 1000, -1=all)
ORLY_NEO4J_MAX_TX_RETRY_SEC=30 # Max transaction retry time in seconds
ORLY_NEO4J_QUERY_RESULT_LIMIT=10000 # Max results per query (0=unlimited)
version: '3.8'

services:
  neo4j:
    image: neo4j:5.15
    ports:
      - "7474:7474"  # HTTP
      - "7687:7687"  # Bolt
    environment:
      - NEO4J_AUTH=neo4j/password
      - NEO4J_PLUGINS=["apoc"]
      # Memory tuning for production
      - NEO4J_server_memory_heap_initial__size=512m
      - NEO4J_server_memory_heap_max__size=1g
      - NEO4J_server_memory_pagecache_size=512m
      # Transaction memory limits (prevent runaway queries)
      - NEO4J_dbms_memory_transaction_total__max=256m
      - NEO4J_dbms_memory_transaction_max=64m
      # Query timeout
      - NEO4J_dbms_transaction_timeout=30s
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs

  orly:
    build: .
    ports:
      - "3334:3334"
    environment:
      - ORLY_DB_TYPE=neo4j
      - ORLY_NEO4J_URI=bolt://neo4j:7687
      - ORLY_NEO4J_USER=neo4j
      - ORLY_NEO4J_PASSWORD=password
      # Driver tuning for memory management
      - ORLY_NEO4J_MAX_CONN_POOL=25
      - ORLY_NEO4J_FETCH_SIZE=1000
      - ORLY_NEO4J_QUERY_RESULT_LIMIT=10000
    depends_on:
      - neo4j

volumes:
  neo4j_data:
  neo4j_logs:
- Event ID (unique constraint + index)
- Event kind
- Event created_at
- Composite: kind + created_at
- Tag type + value
- ORLY_NEO4J_QUERY_RESULT_LIMIT (default: 10000) to prevent unbounded queries from exhausting memory

Neo4j runs as a separate process (typically in Docker), so memory management involves both the relay driver settings and Neo4j server configuration.
- Go driver connection pool
- Query result buffering
- Controlled by ORLY_NEO4J_* environment variables
- JVM heap for Java objects
- Page cache for graph data
- Transaction memory for query execution
- Controlled by NEO4J_* environment variables
| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_MAX_CONN_POOL | 25 | Max connections in pool. Lower = less memory, but may bottleneck under high load. Driver default is 100. |
| ORLY_NEO4J_FETCH_SIZE | 1000 | Records fetched per batch. Lower = less memory per query, more round trips. Set to -1 to fetch all (risky). |
| ORLY_NEO4J_MAX_TX_RETRY_SEC | 30 | Max seconds to retry failed transactions. |
| ORLY_NEO4J_QUERY_RESULT_LIMIT | 10000 | Hard cap on results per query. Prevents unbounded queries. Set to 0 for unlimited (not recommended). |
Recommended settings for memory-constrained environments:
ORLY_NEO4J_MAX_CONN_POOL=10
ORLY_NEO4J_FETCH_SIZE=500
ORLY_NEO4J_QUERY_RESULT_LIMIT=5000
JVM Heap Memory - For Java objects and query processing:
# Docker environment variables
NEO4J_server_memory_heap_initial__size=512m
NEO4J_server_memory_heap_max__size=1g
# neo4j.conf equivalent
server.memory.heap.initial_size=512m
server.memory.heap.max_size=1g
Page Cache - For caching graph data from disk:
# Docker
NEO4J_server_memory_pagecache_size=512m
# neo4j.conf
server.memory.pagecache.size=512m
Transaction Memory Limits - Prevent runaway queries:
# Docker
NEO4J_dbms_memory_transaction_total__max=256m # Global limit across all transactions
NEO4J_dbms_memory_transaction_max=64m # Per-transaction limit
# neo4j.conf
dbms.memory.transaction.total.max=256m
db.memory.transaction.max=64m
Query Timeout - Kill long-running queries:
# Docker
NEO4J_dbms_transaction_timeout=30s
# neo4j.conf
dbms.transaction.timeout=30s
| Deployment Size | Heap | Page Cache | Total Neo4j | ORLY Pool |
|---|---|---|---|---|
| Development | 512m | 256m | ~1GB | 10 |
| Small relay (<100k events) | 1g | 512m | ~2GB | 25 |
| Medium relay (<1M events) | 2g | 1g | ~4GB | 50 |
| Large relay (>1M events) | 4g | 2g | ~8GB | 100 |
Formula for Page Cache:
Page Cache = Data Size on Disk × 1.2
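A worked instance of the rule of thumb (the helper name is illustrative):

```python
def recommended_page_cache_mb(data_size_mb: float) -> float:
    """Rule of thumb from above: page cache = on-disk data size x 1.2."""
    return data_size_mb * 1.2

# A ~2 GB (2048 MB) store suggests roughly a 2.4 GB page cache.
suggested = recommended_page_cache_mb(2048)
```

The 20% headroom covers index structures and growth between tuning passes; for precise numbers, prefer the `neo4j-admin` recommendation below.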
Use neo4j-admin server memory-recommendation inside the container to get tailored recommendations.
Check Neo4j memory from relay logs:
# Driver config is logged at startup
grep "connecting to neo4j" /path/to/orly.log
# Output: connecting to neo4j at bolt://... (pool=25, fetch=1000, txRetry=30s)
Check Neo4j server memory:
# Inside Neo4j container
docker exec neo4j neo4j-admin server memory-recommendation
# Or query via Cypher
CALL dbms.listPools() YIELD pool, heapMemoryUsed, heapMemoryUsedBytes
RETURN pool, heapMemoryUsed
Monitor transaction memory:
CALL dbms.listTransactions()
YIELD transactionId, currentQuery, allocatedBytes
RETURN transactionId, currentQuery, allocatedBytes
ORDER BY allocatedBytes DESC
Replaceable events (kinds 0, 3, 10000-19999) are handled in WouldReplaceEvent():
MATCH (e:Event {kind: $kind, pubkey: $pubkey})
WHERE e.created_at < $createdAt
RETURN e.serial, e.created_at
Older events are deleted before saving the new one.
For kinds 30000-39999, we also match on the d-tag:
MATCH (e:Event {kind: $kind, pubkey: $pubkey})-[:TAGGED_WITH]->(t:Tag {type: 'd', value: $dValue})
WHERE e.created_at < $createdAt
RETURN e.serial
Delete events (kind 5) are processed via graph traversal:
MATCH (target:Event {id: $targetId})
MATCH (delete:Event {kind: 5})-[:REFERENCES]->(target)
WHERE delete.pubkey = $pubkey OR delete.pubkey IN $admins
RETURN delete.id
Only same-author or admin deletions are allowed.
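The authorization rule reduces to a one-line predicate. This is a sketch; `may_delete` and its arguments are hypothetical names, not the relay's API:

```python
def may_delete(delete_event_pubkey: str, target_pubkey: str, admins: set) -> bool:
    """A kind-5 delete event may remove a target event only if it comes
    from the same author as the target, or from a relay admin."""
    return delete_event_pubkey == target_pubkey or delete_event_pubkey in admins

assert may_delete("alice", "alice", set())            # same author
assert may_delete("admin1", "alice", {"admin1"})      # admin override
assert not may_delete("mallory", "alice", {"admin1"})  # rejected
```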
| Feature | Badger | DGraph | Neo4j |
|---|---|---|---|
| Storage Type | Key-value | Graph (distributed) | Graph (native) |
| Query Language | Custom indexes | DQL | Cypher |
| Tag Queries | Index lookups | Graph traversal | Native relationships |
| Scaling | Single-node | Distributed | Cluster/Causal cluster |
| Memory Usage | Low | Medium | High |
| Setup Complexity | Minimal | Medium | Medium |
| Best For | Small relays | Large distributed | Relationship-heavy |
- Add new constraints or indexes in the applySchema() function

Example:
CREATE FULLTEXT INDEX event_content_fulltext IF NOT EXISTS
FOR (e:Event) ON EACH [e.content]
OPTIONS {indexConfig: {`fulltext.analyzer`: 'english'}}
To add custom query methods:
- Use ExecuteRead() or ExecuteWrite() as appropriate
- Parse results with parseEventsFromResult()

Due to the Neo4j dependency, tests require a running Neo4j instance:
# Start Neo4j via Docker
docker run -d --name neo4j-test \
-p 7687:7687 \
-e NEO4J_AUTH=neo4j/test \
neo4j:5.15
# Run tests
ORLY_NEO4J_URI="bolt://localhost:7687" \
ORLY_NEO4J_USER="neo4j" \
ORLY_NEO4J_PASSWORD="test" \
go test ./pkg/neo4j/...
# Cleanup
docker rm -f neo4j-test
Bolt+S is Neo4j's encrypted Bolt protocol — the same wire protocol used by cypher-shell, Neo4j Browser, and client drivers, but wrapped in TLS. By default, the ORLY relay's Neo4j instance only listens on localhost and is not accessible from the network. Enabling bolt+s exposes Neo4j on a public port with TLS encryption, allowing external tools to run Cypher queries against the relay's graph database.
This is useful for analytics, debugging, and building custom tools on top of the relay's graph.
The ORLY relay manages Neo4j's bolt+s configuration through its web admin UI:
- Reads and writes neo4j.conf directly (path configured via ORLY_NEO4J_CONF_PATH)
- Applies the changes to neo4j.conf and restarts Neo4j (sudo systemctl restart neo4j)
- Displays the bolt+s:// connection URI once enabled

The admin UI is only visible when ORLY_DB_TYPE=neo4j and only accessible to users with owner-level permissions.
Before enabling bolt+s, you need:
- TLS certificates for your domain (e.g. from Let's Encrypt)
- A sudoers rule allowing the relay to restart Neo4j
- Firewall access to the bolt port (default 7687)
Tell the relay where your TLS certificates live. For Let's Encrypt:
ORLY_NEO4J_TLS_CERT_DIR=/etc/letsencrypt/live/relay.example.com
This directory must contain:
- privkey.pem — the private key
- fullchain.pem — the certificate chain

If you're using Let's Encrypt with Caddy or certbot, these files are created automatically. The relay configures Neo4j to use these exact filenames.
The relay needs to run sudo systemctl restart neo4j after modifying the config. Create a sudoers rule so this works without a password:
echo 'mleku ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart neo4j' | sudo tee /etc/sudoers.d/orly-neo4j
sudo chmod 440 /etc/sudoers.d/orly-neo4j
Replace mleku with the user running the relay. This grants permission only for systemctl restart neo4j — nothing else.
Let's Encrypt certificates are typically owned by root with restricted permissions. Neo4j (which runs as the neo4j user) needs read access:
sudo chmod -R 755 /etc/letsencrypt/live/ /etc/letsencrypt/archive/
Alternatively, copy the certificates to a directory Neo4j owns:
sudo mkdir -p /var/lib/neo4j/certificates/bolt
sudo cp /etc/letsencrypt/live/relay.example.com/privkey.pem /var/lib/neo4j/certificates/bolt/
sudo cp /etc/letsencrypt/live/relay.example.com/fullchain.pem /var/lib/neo4j/certificates/bolt/
sudo chown -R neo4j:neo4j /var/lib/neo4j/certificates/bolt
If you copy certificates, set ORLY_NEO4J_TLS_CERT_DIR=/var/lib/neo4j/certificates/bolt and remember to update the copies when certificates renew.
The default bolt port is 7687. If you're using ufw:
sudo ufw allow 7687/tcp
If you're using a cloud provider's firewall (AWS security groups, DigitalOcean firewall, etc.), add an inbound rule for TCP port 7687.
To use a non-default port:
ORLY_NEO4J_BOLT_PORT=7688
The relay reads ORLY_NEO4J_TLS_CERT_DIR at startup. After setting it, restart the relay:
sudo systemctl restart orly
- The connection URI (bolt+s://relay.example.com:7687) appears once enabled

| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_CONF_PATH | /etc/neo4j/neo4j.conf | Path to neo4j.conf. The relay reads and writes this file to manage bolt+s settings. |
| ORLY_NEO4J_RESTART_CMD | sudo systemctl restart neo4j | Shell command to restart Neo4j after config changes. Must succeed without interactive input. |
| ORLY_NEO4J_TLS_CERT_DIR | (empty) | Directory containing privkey.pem and fullchain.pem. Must be set before bolt+s can be enabled. |
| ORLY_NEO4J_BOLT_PORT | 7687 | The external port Neo4j listens on for bolt connections. Used in both neo4j.conf and the displayed connection URI. |
When enabling bolt+s, the relay sets these keys in neo4j.conf:
# Require TLS for all bolt connections
dbms.connector.bolt.tls_level=REQUIRED
# Listen on all interfaces (not just localhost)
dbms.connector.bolt.listen_address=0.0.0.0:7687
# Enable the bolt SSL policy
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=/etc/letsencrypt/live/relay.example.com
dbms.ssl.policy.bolt.private_key=privkey.pem
dbms.ssl.policy.bolt.public_certificate=fullchain.pem
dbms.ssl.policy.bolt.client_auth=NONE
When disabling, tls_level is set to DISABLED, bolt.enabled to false, and the certificate settings are commented out. The relay handles both Neo4j 4.x (dbms.connector.bolt.*) and 5.x (server.bolt.*) config key formats.
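A sketch of how such a version-aware key translation might look. The mapping table here is an assumption for illustration; the relay's actual handling of 4.x vs 5.x keys may differ:

```python
# Assumed mapping of Neo4j 4.x bolt config keys to their 5.x equivalents.
KEY_MAP_4_TO_5 = {
    "dbms.connector.bolt.tls_level": "server.bolt.tls_level",
    "dbms.connector.bolt.listen_address": "server.bolt.listen_address",
}

def normalize_key(key: str, major_version: int) -> str:
    """Return the config key in the form the target Neo4j version expects.
    Unknown keys pass through unchanged."""
    if major_version >= 5:
        return KEY_MAP_4_TO_5.get(key, key)
    return key
```

Writing changes through a translation layer like this lets one code path manage both config formats.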
Once bolt+s is enabled, connect with any Neo4j client:
cypher-shell:
cypher-shell -a bolt+s://relay.example.com:7687 -u neo4j -p <password>
Neo4j Browser:
- Connect URL: bolt+s://relay.example.com:7687
- Username / password: the Neo4j credentials (ORLY_NEO4J_USER / ORLY_NEO4J_PASSWORD)

Neo4j Python driver:
from neo4j import GraphDatabase
driver = GraphDatabase.driver(
"bolt+s://relay.example.com:7687",
auth=("neo4j", "password")
)
with driver.session() as session:
result = session.run("MATCH (e:Event) RETURN count(e) AS total")
print(result.single()["total"])
Neo4j JavaScript driver:
import neo4j from "neo4j-driver";
const driver = neo4j.driver(
"bolt+s://relay.example.com:7687",
neo4j.auth.basic("neo4j", "password")
);
const session = driver.session();
const result = await session.run("MATCH (a:Author) RETURN count(a) AS total");
console.log(result.records[0].get("total"));
await session.close();
Once connected, you can query the relay's graph directly:
// Count all events
MATCH (e:Event) RETURN count(e) AS total;

// Find the most referenced authors
MATCH (e:Event)-[:MENTIONS]->(a:Author)
RETURN a.pubkey, count(e) AS mentions
ORDER BY mentions DESC LIMIT 20;

// Social graph: who follows whom (kind 3 contact lists)
MATCH (e:Event {kind: 3})-[:AUTHORED_BY]->(a:Author)
MATCH (e)-[:MENTIONS]->(followed:Author)
RETURN a.pubkey AS follower, followed.pubkey AS follows
LIMIT 100;

// Event references (reply chains)
MATCH path = (reply:Event)-[:REFERENCES*1..5]->(root:Event)
WHERE root.id = $eventId
RETURN path;

// Tag popularity
MATCH (e:Event)-[:TAGGED_WITH]->(t:Tag {type: 't'})
RETURN t.value, count(e) AS usage
ORDER BY usage DESC LIMIT 50;
CREATE USER reader SET PASSWORD 'readonly123' SET PASSWORD CHANGE NOT REQUIRED;
GRANT ROLE reader TO reader;
GRANT READ {*} ON GRAPH * TO reader;
Set ORLY_NEO4J_TLS_CERT_DIR in your environment and restart the relay. The toggle is disabled until this is configured.
The relay modified neo4j.conf successfully but couldn't restart Neo4j. Check:
```bash
sudo -n systemctl restart neo4j
```
If this prompts for a password, the sudoers rule is missing or incorrect.
Check the service status:

```bash
systemctl status neo4j
```
If your setup needs a different restart mechanism, point ORLY_NEO4J_RESTART_CMD at a wrapper script:

```bash
ORLY_NEO4J_RESTART_CMD="sudo /usr/local/bin/restart-neo4j.sh"
```
If Neo4j fails to start after the config change, inspect the service logs:

```bash
systemctl status neo4j
journalctl -u neo4j -n 50
```
- Certificate files not readable by the neo4j user
- Certificate and key don't match
- Certificate expired
Check that Neo4j is listening on the bolt port:

```bash
ss -tlnp | grep 7687
```
Check the firewall:

```bash
sudo ufw status | grep 7687
```
Test the TLS handshake from outside:

```bash
openssl s_client -connect relay.example.com:7687
```
If you're using Let's Encrypt, certificates renew automatically but Neo4j needs a restart to pick up the new files. Add a renewal hook:
# /etc/letsencrypt/renewal-hooks/deploy/restart-neo4j.sh
#!/bin/bash
systemctl restart neo4j
sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/restart-neo4j.sh
Toggle bolt+s off in the web UI and click Apply & Restart Neo4j. This sets tls_level=DISABLED, comments out the certificate settings, and restarts Neo4j. Bolt connections will revert to unencrypted localhost-only access.
You can also manually revert by editing neo4j.conf:
sudo sed -i 's/^dbms.connector.bolt.tls_level=REQUIRED/dbms.connector.bolt.tls_level=DISABLED/' /etc/neo4j/neo4j.conf
sudo systemctl restart neo4j
The bolt+s management feature exposes three HTTP endpoints:
| Endpoint | Method | Auth | Description |
|---|---|---|---|
| /api/neo4j/config | GET | None | Returns { "db_type": "neo4j" }. Used by the UI to decide whether to show the Neo4j tab. |
| /api/neo4j/bolt | GET | Owner (NIP-98) | Returns current bolt+s status, config path, port, cert dir, and connection URI. |
| /api/neo4j/bolt/toggle | POST | Owner (NIP-98) | Accepts { "enabled": true/false }. Modifies neo4j.conf and restarts Neo4j. |
ORLY binds to the internal Neo4j bolt connection and multiplexes incoming Cypher queries through its HTTP endpoint at POST /api/neo4j/cypher. Authentication uses NIP-98 — developers sign requests with their Nostr identity. No SSL certificates, no Neo4j credentials, no extra ports. A developer only needs their nsec.
This is the primary way to query the Neo4j graph. ORLY sits between the client and Neo4j, handling authentication, read-only enforcement, and timeouts. Neo4j never needs to be exposed to the network at all — it listens on localhost, and ORLY multiplexes access through its existing HTTP port.
ORLY maintains the bolt connection to Neo4j internally. When a query arrives over HTTP, ORLY validates the NIP-98 signature, checks the query is read-only, and forwards it to Neo4j through the bolt connection. The result comes back as JSON over HTTP.
In split IPC mode (Cloudron deployment), the relay process proxies through gRPC to the database process that holds the bolt connection:
Developer (nsec) → HTTP + NIP-98 → ORLY relay → gRPC → orly-db-neo4j → bolt → Neo4j
In direct mode (ORLY_DB_TYPE=neo4j), the relay holds the bolt connection itself:
Developer (nsec) → HTTP + NIP-98 → ORLY relay → bolt → Neo4j
Either way, the developer experience is the same: sign an HTTP request with your nsec, get JSON back.
Traditional Neo4j access via bolt+s requires:
NIP-98 replaces all of that. The developer already has a Nostr identity (nsec/npub). ORLY checks the signed event in the Authorization header and verifies the pubkey is in ORLY_OWNERS. No certificates to generate, no credentials to distribute, no ports to open. The Nostr identity is both the authentication and authorization mechanism.
| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_CYPHER_ENABLED | false | Enable the Cypher query endpoint |
| ORLY_NEO4J_CYPHER_TIMEOUT | 30 | Default query timeout in seconds (max 120) |
| ORLY_NEO4J_CYPHER_MAX_ROWS | 10000 | Max result rows returned per query (0 = unlimited) |
# Using ORLY's NIP-98 debugging tool (nurl)
# The only credential needed is your nsec
NOSTR_SECRET_KEY=nsec1... ./nurl -X POST \
-d '{"query": "MATCH (n) RETURN count(n) AS cnt"}' \
https://relay.example.com/api/neo4j/cypher
JSON body:
{
"query": "MATCH (e:Event)-[:AUTHORED_BY]->(a:Author) WHERE a.pubkey = $pk RETURN e.kind, count(*) AS cnt ORDER BY cnt DESC",
"params": {"pk": "abc123..."},
"timeout": 30
}
- query (required) — Cypher query string. Must be read-only.
- params (optional) — Named parameters for the query. Values can be strings, numbers, booleans, or arrays.
- timeout (optional) — Query timeout in seconds. Capped at 120s regardless of the value provided.

The response is a JSON array of record objects. Each object has one key per column in the Cypher RETURN clause:
[
{"e.kind": 1, "cnt": 4523},
{"e.kind": 0, "cnt": 892},
{"e.kind": 3, "cnt": 341}
]
Neo4j node, relationship, and path types are converted to JSON-safe maps with metadata prefixed by _:
{
"_type": "node",
"_id": "4:abc:123",
"_labels": ["Event"],
"id": "aabb...",
"kind": 1,
"created_at": 1700000000
}
The proxy rejects queries containing write operations: CREATE, MERGE, SET, DELETE, REMOVE, DETACH, DROP, and CALL. Comments (// and /* */) are stripped before checking. This is a safety net — the NIP-98 owner-only gate is the primary access control, not the keyword filter.
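A minimal sketch of this style of read-only check (the `is_read_only` helper is hypothetical; the relay's real implementation may differ in detail):

```python
import re

# Write/DDL keywords rejected by the proxy, per the description above.
WRITE_KEYWORDS = {"CREATE", "MERGE", "SET", "DELETE",
                  "REMOVE", "DETACH", "DROP", "CALL"}

def is_read_only(query: str) -> bool:
    """Strip // and /* */ comments, then scan tokens for write keywords."""
    stripped = re.sub(r"//[^\n]*", " ", query)
    stripped = re.sub(r"/\*.*?\*/", " ", stripped, flags=re.DOTALL)
    tokens = re.findall(r"[A-Za-z_]+", stripped.upper())
    return WRITE_KEYWORDS.isdisjoint(tokens)

assert is_read_only("MATCH (e:Event) RETURN count(e)")
assert not is_read_only("MATCH (e) DETACH DELETE e")
assert not is_read_only("/* sneaky */ CREATE (n:X)")  # comments don't hide writes
```

Keyword filters like this are inherently coarse (they also reject read-only queries that merely mention a keyword in a string literal), which is why the doc treats it as a safety net behind the NIP-98 gate rather than the primary control.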
| Status | Meaning |
|---|---|
| 401 | Missing or invalid NIP-98 authentication |
| 403 | Authenticated user is not a relay owner |
| 404 | Endpoint disabled (ORLY_NEO4J_CYPHER_ENABLED=false) |
| 400 | Malformed JSON body or missing query field |
| 500 | Query execution error (timeout, syntax error, write rejection) |
| 503 | Database backend does not support Cypher (not Neo4j/gRPC-to-Neo4j) |
The Neo4j backend supports full-text word search via NIP-50, using a graph-based inverted index. This is the same approach used by the Badger backend — words are stored as graph nodes linked to events, enabling efficient set-intersection queries via graph traversal.
When an event is saved, its content and tag values are tokenized into individual words. Each unique word becomes a (:Word) node in the graph, linked to the event via a (:Event)-[:HAS_WORD]->(:Word) relationship. Searching for words is then a graph traversal from Word nodes back to Events.
Indexing pipeline:
- URLs, Nostr URIs (nostr:...), and 64-character hex strings are skipped entirely
- Each word's (:Word {hash, text}) node is MERGE'd (created if new, reused if existing)
- A [:HAS_WORD] relationship is created from the event to the word

The hash serves as the unique key (~10^19 possible values — far beyond all human language words). The text property stores the readable word for debugging and direct Cypher queries.
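The pipeline can be sketched as a simplified tokenizer. The exact normalization and skip rules in the relay are assumptions here, but the examples mirror the indexing behavior described in this section:

```python
import re
import unicodedata

def tokenize(content: str) -> list:
    """Sketch of the word-indexing pipeline: split, skip URLs/URIs,
    normalize case and decorative Unicode, skip hex IDs and short words."""
    words = []
    for raw in content.split():
        # URLs and nostr: URIs are skipped entirely
        if raw.startswith(("http://", "https://", "nostr:")):
            continue
        # NFKC folds decorative Unicode (e.g. bold math letters) to ASCII
        word = unicodedata.normalize("NFKC", raw).lower()
        word = re.sub(r"[^a-z0-9]", "", word)
        # 64-char hex strings (event IDs, pubkeys) are skipped
        if len(word) == 64 and re.fullmatch(r"[0-9a-f]{64}", word):
            continue
        if len(word) >= 2:  # single characters are too short
            words.append(word)
    return words

assert tokenize("Bitcoin is great https://example.com/page") == ["bitcoin", "is", "great"]
assert tokenize("f" * 64) == []  # looks like an event ID, skipped
```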
To enable word search on an existing Neo4j deployment:
On startup, the relay automatically:
- Creates the Word uniqueness constraint and index
- Applies migration v6 to backfill Word nodes and HAS_WORD relationships for existing events

Monitoring migration progress — watch the relay logs:
[INFO] applying migration v6: Backfill Word nodes and HAS_WORD relationships for NIP-50 search
[INFO] backfilling word index for 50000 events...
[INFO] backfilled word index: 500/50000 events processed
[INFO] backfilled word index: 1000/50000 events processed
...
[INFO] word index backfill complete: 50000 events processed
[INFO] migration v6 completed successfully
The migration processes events in batches of 500. For a relay with 50,000 events, expect it to take a few minutes. For a relay with millions of events, it may take longer, but it runs non-blocking: the relay serves requests during migration, and search results will simply be incomplete until it finishes.
Verifying migration completion — check for the Migration marker node:
MATCH (m:Migration {version: "v6"})
RETURN m.version, m.description, m.applied_at
Verifying word index health:
// Count indexed events and unique words
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
RETURN count(DISTINCT e) AS events_indexed,
       count(DISTINCT w) AS unique_words

// Check a specific word exists
MATCH (w:Word {text: "bitcoin"})
RETURN w.hash, w.text
Clients send a standard NIP-50 REQ message with a search field in the filter:
["REQ", "sub-id", {"search": "bitcoin lightning", "limit": 20}]
The relay tokenizes the search string using the same pipeline as indexing (lowercase, ASCII normalization, hash lookup), then traverses the word graph to find matching events.
Search can be combined with any standard NIP-01 filter:
["REQ", "sub-id", {
"search": "bitcoin",
"kinds": [1],
"authors": ["abc123..."],
"since": 1700000000,
"#t": ["cryptocurrency"],
"limit": 50
}]
All filter clauses are applied as additional WHERE conditions on the search query, so the index is used for both word matching and filter narrowing in a single graph traversal.
Results are ranked using a 50/50 blend of match relevance and recency:
score = 0.5 × (matchCount / maxMatchCount) + 0.5 × (recencyRatio)
Where:
- matchCount = number of search terms matched by this event
- maxMatchCount = highest match count across all results
- recencyRatio = (event.created_at - oldest) / (newest - oldest)

This means an event matching all search terms AND being recent ranks highest, while an event matching fewer terms or being older ranks lower. Events with equal scores are ordered by created_at DESC.
For "bitcoin lightning network":
- An event matching all three words ranks above one matching only one or two of them
- Among events with the same match count, the more recent event ranks higher
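The blend can be sketched numerically (a hypothetical `blended_scores` helper taking (match_count, created_at) pairs; the formula follows the 50/50 definition above):

```python
def blended_scores(results):
    """Apply the 50/50 relevance/recency blend to a result set.
    `results` is a list of (match_count, created_at) pairs."""
    max_match = max(m for m, _ in results)
    oldest = min(t for _, t in results)
    newest = max(t for _, t in results)
    span = max(newest - oldest, 1)  # guard against all-equal timestamps
    return [0.5 * (m / max_match) + 0.5 * ((t - oldest) / span)
            for m, t in results]

# Three results: 3-of-3 match & newest, 1-of-3 match & newest, 3-of-3 match & oldest.
scores = blended_scores([(3, 1700000000), (1, 1700000000), (3, 1600000000)])
# scores[0] == 1.0: full match AND most recent takes the top score.
```

Note how the second result (one matching word, but recent) outscores the third (full match, but old): recency carries half the weight.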
If bolt+s is enabled (see Bolt+S External Access), you can query the word index directly with Cypher. This is useful for analytics, debugging, and building custom search experiences outside the Nostr protocol.
MATCH (e:Event)-[:HAS_WORD]->(w:Word {text: "bitcoin"})
RETURN e.id, e.content, e.created_at
ORDER BY e.created_at DESC
LIMIT 20
Find events containing ALL specified words:
WITH ["bitcoin", "lightning", "network"] AS searchTerms
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.text IN searchTerms
WITH e, count(DISTINCT w) AS matchCount, size(searchTerms) AS totalTerms
WHERE matchCount = totalTerms
RETURN e.id, e.content, e.created_at
ORDER BY e.created_at DESC
LIMIT 20
Find events containing ANY of the words, ranked by how many they match:
WITH ["bitcoin", "lightning", "network"] AS searchTerms
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.text IN searchTerms
WITH e, count(DISTINCT w) AS matchCount
RETURN e.id, e.content, e.created_at, matchCount
ORDER BY matchCount DESC, e.created_at DESC
LIMIT 50
Find words that frequently appear alongside "bitcoin":
MATCH (e:Event)-[:HAS_WORD]->(w1:Word {text: "bitcoin"})
MATCH (e)-[:HAS_WORD]->(w2:Word)
WHERE w2.text <> "bitcoin"
RETURN w2.text AS related_word, count(e) AS co_occurrences
ORDER BY co_occurrences DESC
LIMIT 20
Find the most-used words by a specific author:
MATCH (e:Event {pubkey: $pubkey})-[:HAS_WORD]->(w:Word)
RETURN w.text, count(e) AS usage
ORDER BY usage DESC
LIMIT 50
Find words that appeared most in the last 24 hours:
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE e.created_at > (timestamp() / 1000 - 86400)
RETURN w.text, count(e) AS mentions
ORDER BY mentions DESC
LIMIT 30
For programmatic access, you can query by hash directly (avoiding normalization edge cases):
MATCH (e:Event)-[:HAS_WORD]->(w:Word)
WHERE w.hash IN ["a1b2c3d4e5f6a7b8", "f8e7d6c5b4a39281"]
RETURN e.id, e.content, count(DISTINCT w) AS matchCount
ORDER BY matchCount DESC
LIMIT 20
The hash is the first 8 bytes of SHA-256(lowercase_word), hex-encoded to 16 characters.
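Computing a hash for the $wordHashes parameter takes a couple of standard-library lines (the helper name is illustrative):

```python
import hashlib

def word_hash(word: str) -> str:
    """First 8 bytes of SHA-256 of the lowercased word, hex-encoded
    (16 characters), matching the Word node's hash property."""
    return hashlib.sha256(word.lower().encode()).hexdigest()[:16]

h = word_hash("Bitcoin")   # same hash as word_hash("bitcoin")
```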
| Content Type | Indexed? | Example |
|---|---|---|
| Regular words | Yes | "Bitcoin is great" → bitcoin, is, great |
| Tag values | Yes | ["t", "cryptocurrency"] → cryptocurrency |
| Subject tags | Yes | ["subject", "decentralized finance"] → decentralized, finance |
| URLs | No | https://example.com/page → skipped entirely |
| Nostr URIs | No | nostr:npub1abc... → skipped entirely |
| Hex strings (64 chars) | No | Event IDs, pubkeys → skipped |
| Single characters | No | "a", "I" → too short (< 2 chars) |
| Mixed case | Normalized | "BITCOIN", "Bitcoin" → bitcoin |
| Decorative Unicode | Normalized | "𝗕𝗶𝘁𝗰𝗼𝗶𝗻" → bitcoin |
| Emoji | No | Emoji are not alphabetic words |
The word search index uses Neo4j's native graph traversal rather than a Lucene-based fulltext index. This has specific trade-offs:

Strengths:
- Exact word matching via graph set intersection, with no separate fulltext index to maintain
- The same indexing approach as the Badger backend, so search behaves consistently across backends
- Word nodes are queryable directly with Cypher for analytics and debugging

Considerations:
- Exact-match only: no stemming or fuzzy matching (searching "running" does not match "run")
- Words shorter than 2 characters and emoji are not indexed
The v6 migration backfills Word nodes for all events that existed before upgrading:
- Scans for events missing HAS_WORD relationships
- Processes them in batches of 500 using the UNWIND batch pattern

The migration is idempotent — it only processes events missing word relationships, so it's safe to restart mid-migration. If the relay crashes during migration, it will resume from where it left off on the next startup.
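The batching itself is simple to sketch (hypothetical helper; the size of 500 matches the batch size stated earlier in this section):

```python
def batches(serials, size=500):
    """Yield fixed-size batches of event serials, mirroring the UNWIND
    batch pattern: each batch becomes one write transaction."""
    for i in range(0, len(serials), size):
        yield serials[i:i + size]

chunks = list(batches(list(range(1234)), size=500))
# -> three batches of 500, 500, and 234 events
```

Fixed-size batches keep each transaction's memory bounded, which matters given the per-transaction limits configured on the Neo4j server.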
The migration marker is stored as a (:Migration {version: "v6"}) node. Once this node exists, the migration is skipped on subsequent startups.
# Test connectivity
cypher-shell -a bolt://localhost:7687 -u neo4j -p password
# Check Neo4j logs
docker logs neo4j
// View query execution plan
EXPLAIN MATCH (e:Event) WHERE e.kind = 1 RETURN e LIMIT 10
// Profile query performance
PROFILE MATCH (e:Event)-[:AUTHORED_BY]->(a:Author) RETURN e, a LIMIT 10
// List all constraints
SHOW CONSTRAINTS
// List all indexes
SHOW INDEXES
// Drop and recreate schema
DROP CONSTRAINT event_id_unique IF EXISTS
CREATE CONSTRAINT event_id_unique FOR (e:Event) REQUIRE e.id IS UNIQUE
This Neo4j backend implementation follows the same license as the ORLY relay project.