What is NodeDB
NodeDB is a single Rust binary that provides seven data engines — Vector, Graph, Document (schemaless + strict), Columnar (with Timeseries and Spatial profiles), Key-Value, and Full-Text Search — each built with purpose-specific data structures. All engines share the same storage, memory, and query planner. Cross-engine queries execute in one process with zero network hops.
The Problem
Modern applications don't fit in one database. A healthcare app needs patient records (relational), medical imaging embeddings (vector), care team relationships (graph), device telemetry (timeseries), and offline-first sync for field workers (CRDT). The industry answer is a polyglot stack: PostgreSQL for relational, Qdrant for vectors, Neo4j for graphs, ClickHouse for timeseries, Redis for caching. Each system has its own protocol, deployment, and failure modes. Cross-database queries require application-level joins.
Some databases claim to solve this by bolting on capabilities — "graph" as recursive JOINs, "timeseries" without columnar compression or continuous aggregation. The features exist in name but not in performance.
Seven Engines
Vector — HNSW index with SQ8, PQ, and IVF-PQ quantization. SIMD-accelerated distance math. Adaptive bitmap pre-filtering.
Graph — CSR adjacency index with 13 native algorithms (PageRank, WCC, Louvain, SSSP, etc.) and a Cypher-subset MATCH pattern engine. GraphRAG fusion with vector search.
Document — Two modes per collection. Schemaless: MessagePack blobs with CRDT sync. Strict: Binary Tuples with O(1) field extraction and 3-4x cache density over BSON.
Columnar — Per-column compression (ALP, FastLanes, FSST, Gorilla, LZ4). Block statistics and predicate pushdown. Three profiles: Plain (general analytics), Timeseries (append-only, retention, continuous aggregation), Spatial (R*-tree, geohash, H3).
Key-Value — Hash-indexed O(1) point lookups with typed value fields. Native TTL, secondary indexes, atomic INCR/CAS, and predicate-filtered scans. SQL-queryable and joinable.
Full-Text Search — Block-Max WAND optimized BM25 with 16 Snowball stemmers, 27-language stop words, CJK bigram tokenization, posting compression, fuzzy matching, and native hybrid vector fusion.
CRDT — Loro-backed conflict-free replicated data types. AP on the edge, CP in the cloud. SQL constraint validation at sync time with compensation hints.
Three Deployment Modes
Origin (server) — Full distributed database. Multi-Raft consensus, Thread-per-Core Data Plane with io_uring, PostgreSQL-compatible SQL over pgwire. Horizontal scaling with automatic shard rebalancing.
Origin (local) — Same binary, single-node. No cluster overhead. Like running PostgreSQL locally.
NodeDB-Lite (embedded) — In-process library for phones, browsers (WASM), and desktops. All seven engines run locally with sub-millisecond reads. CRDT sync to Origin via WebSocket.
PostgreSQL Compatible
Connect with psql, any PostgreSQL driver, or ORM. Six wire protocols are available:
- pgwire — PostgreSQL wire protocol (port 6432)
- HTTP — REST, SSE, WebSocket (port 6480)
- NDB — Native MessagePack protocol (port 6433)
- RESP — Redis-compatible KV protocol (optional)
- ILP — InfluxDB Line Protocol for timeseries ingest (optional)
- Sync — WebSocket sync for NodeDB-Lite clients (port 9090)
What NodeDB Replaces
The combination of PostgreSQL + pgvector + Redis + Neo4j + ClickHouse + Elasticsearch — unified into one binary with shared storage and zero network hops between engines.