IoT stacks typically span a time-series database for sensor data, an object store or array database for bulk telemetry, and a synchronisation layer for edge devices. NodeDB covers all three: the Timeseries engine handles streaming ingestion, the Array engine stores multi-dimensional batch telemetry (replacing separate InfluxDB and TileDB deployments), and NodeDB-Lite runs the same engines embedded on edge hardware, syncing CRDT deltas to Origin when connectivity allows.
Engines used
| Engine | Role |
| --- | --- |
| Timeseries | Streaming sensor ingestion, continuous aggregates, retention |
| Array | ND telemetry tiles (spectrometry, radar, camera frames), batch analytics |
| Document (schemaless) | Device registry, configuration, alert rules with CRDT sync |
| Key-Value | Last-known device state, command queues |
Streaming sensor ingestion — Timeseries engine
The Timeseries engine is append-only, with a TIME_KEY column driving partition-by-time and block skip, plus per-collection retention.
CREATE COLLECTION sensor_readings (
ts TIMESTAMP TIME_KEY,
device_id VARCHAR,
metric VARCHAR,
value FLOAT,
unit VARCHAR
) WITH (engine='timeseries', partition_by='1d', retention='180d');
-- Bulk import (NDJSON / JSON array / CSV auto-detected)
COPY sensor_readings FROM '/var/spool/readings.ndjson';
-- Or INSERT for low-volume devices
INSERT INTO sensor_readings (ts, device_id, metric, value, unit) VALUES
(now(), 'device-001', 'temperature', 23.4, 'C'),
(now(), 'device-001', 'humidity', 61.2, '%'),
(now(), 'device-001', 'co2_ppm', 412, 'ppm');
-- 15-minute summaries
SELECT
time_bucket('15m', ts) AS bucket,
device_id,
avg(CASE WHEN metric = 'temperature' THEN value END) AS avg_temp,
max(CASE WHEN metric = 'co2_ppm' THEN value END) AS max_co2
FROM sensor_readings
WHERE ts >= now() - INTERVAL '24 hours'
GROUP BY bucket, device_id
ORDER BY bucket DESC;
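The `time_bucket('15m', ts)` call floors each timestamp to the start of its 15-minute bucket, so all readings in the same window group together. A minimal Python sketch of that flooring (an illustration, not NodeDB's implementation):

```python
from datetime import datetime, timedelta, timezone

def time_bucket(width: timedelta, ts: datetime) -> datetime:
    """Floor ts to the start of its bucket, like time_bucket('15m', ts)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    seconds = int((ts - epoch).total_seconds())
    w = int(width.total_seconds())
    return epoch + timedelta(seconds=(seconds // w) * w)

reading = datetime(2025, 1, 1, 10, 37, 42, tzinfo=timezone.utc)
print(time_bucket(timedelta(minutes=15), reading))  # 2025-01-01 10:30:00+00:00
```

Because bucketing is pure integer division on the epoch offset, it is stable across queries: the same reading always lands in the same bucket.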
For line-protocol producers (Telegraf, Vector), enable the ILP listener (ports.ilp = 8086) and push directly:
echo "env,device=device-001 temperature=23.4,humidity=61.2 1735689600000000000" | nc localhost 8086
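Each line-protocol record is `measurement,tags fields timestamp`, with nanosecond timestamps. A small Python sketch of building one record for the listener above (tag and field ordering here is my own choice; the protocol accepts any order):

```python
def ilp_line(measurement: str, tags: dict, fields: dict, ts_ns: int) -> str:
    """Format one line-protocol record: measurement,tag=v field=v timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = ilp_line("env", {"device": "device-001"},
                {"temperature": 23.4, "humidity": 61.2}, 1735689600000000000)
print(line)  # env,device=device-001 humidity=61.2,temperature=23.4 1735689600000000000
```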
Multi-dimensional telemetry — Array engine
Instruments like spectrometers, LiDAR scanners, and thermal cameras produce ND arrays of readings. The Array engine stores them in compressed Z-order tiles with per-tile ND-MBR statistics — replacing a separate TileDB or Zarr deployment. It uses its own DDL family (CREATE ARRAY), not CREATE COLLECTION.
-- A 3D spectrometry array: (device, wavelength, time). Dimensions are
-- integer-typed with half-open domains [lo, hi).
CREATE ARRAY spectral_readings
DIMS (
device_id INT64 DOMAIN [0, 100000),
wavelength_nm INT32 DOMAIN [300, 1100),
ts_epoch INT64 DOMAIN [0, 9223372036854775807)
)
ATTRS (intensity FLOAT32, noise_floor FLOAT32)
TILE_EXTENTS (1, 128, 3600)
WITH (cell_order = 'Z-ORDER', audit_retain_ms = 7776000000); -- 90 days
-- Insert cells from a scan session
INSERT INTO ARRAY spectral_readings (device_id, wavelength_nm, ts_epoch, intensity, noise_floor) VALUES
($dev, 500, $t, 0.41, 0.02),
($dev, 501, $t, 0.43, 0.02),
($dev, 502, $t, 0.40, 0.02);
-- Force in-memory tiles to durable storage
SELECT ARRAY_FLUSH('spectral_readings');
-- Slice: wavelengths 500–600 nm for one device over a time window
SELECT wavelength_nm, avg(intensity) AS avg_intensity
FROM ARRAY_SLICE(
'spectral_readings',
{ device_id: [$dev, $dev + 1), wavelength_nm: [500, 600), ts_epoch: [$t0, $t1) },
['intensity']
)
GROUP BY wavelength_nm
ORDER BY wavelength_nm;
-- Reduce a dimension: total intensity per wavelength, collapsing time
SELECT * FROM ARRAY_AGG('spectral_readings', 'intensity', 'SUM', 'ts_epoch');
Edge deployment — NodeDB-Lite
Edge gateways run NodeDB-Lite: the full engine set in an embeddable library with no network dependencies. Data is written locally and synced to Origin via Loro CRDT deltas when connectivity is available.
-- On the edge device (NodeDB-Lite, embedded in a Rust/C/Swift process)
-- Local timeseries writes survive network outages
INSERT INTO local_readings (ts, sensor_id, metric, value)
VALUES (now(), $sensor_id, 'vibration_g', $g);
-- Device configuration arrives via a shape subscription scoped to this device
SUBSCRIBE SHAPE ON device_config WHERE device_id = $me;
SELECT value FROM device_config WHERE key = 'alert_thresholds';
// iOS / embedded Swift
let db = NodeDbLite.open(path: "edge.db")
db.execute("INSERT INTO local_readings ...")
db.sync(url: "wss://origin.example.com/sync", token: authToken)
Sync on reconnect
CRDT sync is transparent — the edge process doesn't implement retry logic, conflict resolution, or delta tracking. Declare a conflict_policy on the collection (lww or field_merge) and the engine produces and ships the deltas; Origin validates SQL constraints at Raft commit and replies with a CompensationHint if a write loses.
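Under `lww`, each replicated value carries a timestamp and a writer id; on merge, the higher timestamp wins, and the writer id breaks ties so concurrent writes resolve identically on every node. A minimal sketch of that rule (my own illustration of last-writer-wins, not NodeDB's merge code):

```python
from typing import NamedTuple

class Versioned(NamedTuple):
    value: object
    ts: int     # write timestamp (e.g. hybrid logical clock)
    node: str   # deterministic tie-breaker for equal timestamps

def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-writer-wins: higher timestamp wins; ties break on node id."""
    return max(a, b, key=lambda v: (v.ts, v.node))

edge = Versioned({"status": "offline"}, ts=100, node="edge-7")
origin = Versioned({"status": "online"}, ts=105, node="origin")
print(lww_merge(edge, origin).value)  # {'status': 'online'}
```

Because the merge is commutative and deterministic, edge and Origin converge to the same value regardless of the order in which deltas arrive.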
Last-known state — Key-Value engine
The KV engine stores the current state of every device for sub-millisecond dashboard reads without scanning the timeseries collection.
CREATE COLLECTION device_state (key TEXT PRIMARY KEY) WITH (engine='kv');
-- Updated on every sensor publish
UPSERT INTO device_state { key: $device_id, last_seen: now(), status: $status, battery_pct: $batt };
-- Dashboard: current state of many devices at once
SELECT * FROM device_state WHERE key = ANY($device_ids);
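The KV pattern here is overwrite-on-publish: every sensor publish replaces the device's single row, so a dashboard read is a point lookup rather than a scan over history. Sketched in Python (illustrative semantics only):

```python
device_state: dict = {}  # key -> last-known row, one row per device

def on_publish(device_id: str, status: str, battery_pct: float, now: int) -> None:
    """Overwrite the device's row on every publish; reads never touch history."""
    device_state[device_id] = {
        "last_seen": now,
        "status": status,
        "battery_pct": battery_pct,
    }

on_publish("device-001", "ok", 87.0, now=1)
on_publish("device-001", "ok", 86.5, now=2)  # newer publish replaces the old row
print(device_state["device-001"]["battery_pct"])  # 86.5
```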
Alerting via LIVE SELECT
Trigger downstream alert handlers the moment a sensor value crosses a threshold — no polling loop.
-- Subscribe to readings that breach the CO2 alert threshold
LIVE SELECT device_id, value, ts FROM sensor_readings
WHERE metric = 'co2_ppm'
AND value > 1000;
-- The handler receives each qualifying row immediately and fans out to
-- PagerDuty, SMS, or a dashboard socket. Cancel with: CANCEL LIVE SELECT <id>;
Retention and downsampling
Raw readings age out via the collection's retention. A continuous aggregate maintains hourly summaries incrementally: no cron job, no separate pipeline.
CREATE CONTINUOUS AGGREGATE sensor_hourly ON sensor_readings AS
SELECT
time_bucket('1h', ts) AS bucket,
device_id,
metric,
avg(value) AS avg_val,
min(value) AS min_val,
max(value) AS max_val
FROM sensor_readings
GROUP BY bucket, device_id, metric
WITH (refresh_interval = '5m');
Why not InfluxDB + TileDB?
A line-protocol time-series database can't store multi-dimensional instrument output, so you bolt on TileDB or Zarr, and now telemetry lives in two systems with two query languages and no shared identity. NodeDB keeps streaming readings and ND array tiles in one process: a query can prefilter cells by the same surrogate-identity bitmaps the rest of the engines use, and the same engine set runs embedded at the edge.