Quotas

NodeDB enforces resource quotas across four hierarchical levels: global cluster ceiling, per-database budget, per-tenant budget within each database, and engine-internal usage. Quotas protect the cluster from noisy neighbors, burst overages, and fairness violations.

Four-Level Hierarchy

Resources flow through a four-level authorization gate before reaching the Data Plane:

Global ceiling (cluster-wide)
  ↓
Database budget (per-database, set at CREATE)
  ↓
Tenant budget (per-tenant within database)
  ↓
Engine internal usage (per-engine within tenant)

Enforcement points (in admission order):

  1. Tenant quota check — Rate limiting and concurrency per (database, tenant) pair
  2. Database quota check — Rate limiting and concurrency per database
  3. Global pressure check — Cluster-level backpressure if request queues overflow
  4. Memory allocation — Hierarchical reservation from global → database → tenant → engine
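The admission order above can be sketched as a single function. This is an illustrative model only: the `Bucket` class, `admit` signature, and the `MEMORY_DENIED` placeholder are assumptions, not NodeDB's actual internals.

```python
from dataclasses import dataclass

@dataclass
class Bucket:
    """A generic limit counter standing in for a rate/concurrency/memory quota."""
    limit: int
    used: int = 0

    def try_take(self, n: int = 1) -> bool:
        if self.used + n > self.limit:
            return False
        self.used += n
        return True

def admit(tenant_qps: Bucket, db_qps: Bucket, queue_fill: float,
          memory: Bucket, bytes_needed: int) -> str:
    # 1. Tenant quota: per-(database, tenant) rate and concurrency
    if not tenant_qps.try_take():
        return "TENANT_QUOTA_EXCEEDED"
    # 2. Database quota: per-database rate and concurrency
    if not db_qps.try_take():
        return "DATABASE_QUOTA_EXCEEDED"
    # 3. Global pressure: cluster-level backpressure on overflowing queues
    if queue_fill > 0.95:
        return "SERVER_OVERLOAD"
    # 4. Memory: the real system reserves hierarchically
    #    (global -> database -> tenant -> engine); one bucket stands in here.
    if not memory.try_take(bytes_needed):
        return "MEMORY_DENIED"  # placeholder; the real error code isn't specified here
    return "ADMITTED"
```

Checks run most-specific first, so a tenant that exhausts its own budget is rejected before it can count against the database or cluster.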

Database-Level Quotas

Set quotas when a database is created, or alter them afterwards:

CREATE DATABASE sales WITH QUOTA (
    max_memory_bytes    = 1073741824,  -- 1 GB
    max_storage_bytes   = 10737418240, -- 10 GB
    max_qps             = 1000,
    max_connections     = 100,
    cache_weight        = 2,
    priority_class      = 'critical',
    maintenance_cpu_pct = 25
);

ALTER DATABASE sales SET QUOTA (max_qps = 2000, cache_weight = 3);

Quota Fields

| Field | Meaning | Default | Notes |
|---|---|---|---|
| max_memory_bytes | RAM ceiling | Unlimited (within global) | Per-database L0 + index memory |
| max_storage_bytes | Durable storage ceiling | Unlimited | All engines, all shards combined |
| max_qps | Queries per second | Unlimited | Hard limit; returns DATABASE_QUOTA_EXCEEDED when exceeded |
| max_connections | Concurrent connection cap | 1000 | Per-database; new logins rejected when full |
| cache_weight | Relative LRU cache share | 1 | 0–100; higher weight = larger cache allocation |
| priority_class | WAL fsync priority | standard | One of critical, standard, bulk; see Priority Classes |
| maintenance_cpu_pct | Background work budget | 25 | Percent of core time for compaction, HNSW maintenance, etc. |

Tenant-Level Quotas

Tenants inherit quotas from the database by default, but the database budget can be further subdivided per tenant:

ALTER TENANT marketing IN DATABASE sales SET QUOTA (
    max_memory_bytes  = 536870912,  -- 512 MB (half the database)
    max_qps           = 500,         -- half the database's 1000
    max_connections   = 50
);

SHOW TENANT QUOTA FOR marketing IN DATABASE sales;

Sum-of-tenant constraint: The sum of all tenant quotas within a database cannot exceed the database quota. NodeDB enforces this at write time:

-- The second statement fails with QUOTA_OVERCOMMIT: 600 + 1000 = 1600
-- would exceed the database max_qps of 1000
ALTER TENANT team_a IN DATABASE sales SET QUOTA (max_qps = 600);   -- OK
ALTER TENANT team_b IN DATABASE sales SET QUOTA (max_qps = 1000);  -- ERROR
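The write-time check behind this is simple to model. A minimal sketch, assuming a single quota dimension; the function name and dict shape are hypothetical:

```python
def set_tenant_qps(db_max_qps: int, tenant_qps: dict, tenant: str, qps: int) -> dict:
    """Apply a tenant quota update, rejecting it if the per-tenant limits
    would sum past the database cap (the sum-of-tenant constraint)."""
    proposed = {**tenant_qps, tenant: qps}
    if sum(proposed.values()) > db_max_qps:
        raise ValueError("QUOTA_OVERCOMMIT")
    return proposed
```

Replaying the example: setting team_a to 600 succeeds; setting team_b to 1000 then fails because 600 + 1000 = 1600 exceeds the database's max_qps of 1000.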

Inspecting Quotas and Usage

Check what's set and what's being used:

SHOW DATABASE QUOTA FOR sales;
SHOW DATABASE USAGE FOR sales;

SHOW TENANT QUOTA FOR marketing IN DATABASE sales;
SHOW TENANT USAGE FOR marketing IN DATABASE sales;

Quota output columns: database, max_memory_bytes, max_storage_bytes, max_qps, max_connections, cache_weight, priority_class, maintenance_cpu_pct

Usage output columns: database, memory_bytes, storage_bytes, qps_current, qps_p99, active_connections, maintenance_cpu_seconds

Priority Classes and WAL Commitment

Write-ahead log (WAL) fsync is the most expensive step on the write path. NodeDB batches writes into three independent priority groups so critical databases don't wait behind bulk workloads:

| Priority | Behavior | Use Case |
|---|---|---|
| critical | Own fsync group, committed first | Production payment system, real-time analytics |
| standard | Default batch group (most databases) | User-facing API, transactional workloads |
| bulk | Extended timeout, lower fsync rate | Batch ETL, daily reports, backfill |

Set priority at database creation or alter:

CREATE DATABASE critical_payments WITH QUOTA (priority_class = 'critical');
ALTER DATABASE bulk_processing SET QUOTA (priority_class = 'bulk');

A write to a critical database blocks until its fsync completes; a write to bulk waits longer but doesn't delay critical or standard commits.
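The grouping can be sketched as follows. This is an assumed structure for illustration; NodeDB's actual WAL batcher is not described here, and the class name is invented.

```python
class WalBatcher:
    """Collect pending commits into independent per-priority fsync groups (sketch)."""
    ORDER = ("critical", "standard", "bulk")  # flush order

    def __init__(self):
        self.pending = {p: [] for p in self.ORDER}

    def enqueue(self, priority: str, record: bytes) -> None:
        self.pending[priority].append(record)

    def flush(self):
        # Each priority is its own fsync batch: critical commits first, and a
        # bulk backlog never delays the critical or standard groups.
        batches = []
        for p in self.ORDER:
            if self.pending[p]:
                batches.append((p, self.pending[p]))
                self.pending[p] = []
        return batches
```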

Weighted-Fair Queue on the SPSC Bridge

Each Data Plane core has a request ring buffer. To prevent one database from saturating an entire core, requests are scheduled via deficit round-robin (DRR) weighted by priority_class:

  • critical databases get first pick each scheduling cycle
  • standard databases get next
  • bulk databases get the remainder

If one database saturates its share of the core, it throttles only its own writes. Co-resident databases stay responsive.
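A deficit round-robin pass over per-database queues might look like the sketch below. The quantum values are illustrative only; NodeDB's real weights and request costs are not documented here.

```python
from collections import deque

# Illustrative quanta per priority_class; not NodeDB's actual weights.
QUANTUM = {"critical": 4, "standard": 2, "bulk": 1}

class DrrScheduler:
    """Deficit round-robin across databases sharing one core's request ring."""
    def __init__(self):
        self.queues = {}    # database -> deque of pending requests
        self.deficit = {}   # database -> unspent quantum from prior cycles
        self.priority = {}  # database -> priority_class

    def enqueue(self, db: str, priority: str, request) -> None:
        self.queues.setdefault(db, deque()).append(request)
        self.deficit.setdefault(db, 0)
        self.priority[db] = priority

    def cycle(self):
        """One scheduling pass: higher-priority databases are visited first."""
        served = []
        order = sorted(self.queues,
                       key=lambda d: QUANTUM[self.priority[d]], reverse=True)
        for db in order:
            queue = self.queues[db]
            if not queue:
                continue
            self.deficit[db] += QUANTUM[self.priority[db]]
            while queue and self.deficit[db] >= 1:  # each request costs 1 unit
                served.append(queue.popleft())
                self.deficit[db] -= 1
            if not queue:
                self.deficit[db] = 0  # standard DRR: drop deficit when queue drains
        return served
```

In each cycle a critical database drains up to its (larger) quantum before a bulk neighbor gets its single unit, so a saturated bulk database only delays itself.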

Document Cache — Per-Database Allocation

The in-memory document cache is shared across all databases proportional to their cache_weight:

-- This database gets 10x the cache share of others
ALTER DATABASE hot_reads SET QUOTA (cache_weight = 10);

When the cache fills, NodeDB evicts entries from the database with the highest current-vs-weight overshoot. A hot database cannot evict a cold database below its proportional fair share.
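Victim selection under this policy can be sketched as below; the function name and argument shapes are hypothetical, but the rule matches the text: evict from whoever most overshoots their weighted fair share, and never push anyone below it.

```python
def eviction_victim(usage: dict, weights: dict, capacity: int):
    """Pick the database whose cache usage most exceeds its weighted fair
    share of total capacity; return None if no database is over its share."""
    total_weight = sum(weights.values())

    def overshoot(db: str) -> float:
        fair_share = capacity * weights[db] / total_weight
        return usage[db] - fair_share

    over = [db for db in usage if overshoot(db) > 0]
    return max(over, key=overshoot) if over else None
```

With weights {hot: 10, cold: 1} and a capacity of 110, hot's fair share is 100 bytes and cold's is 10, so a hot database using 105 is the victim while a cold one at 5 is untouchable.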

Background Task Budget

Maintenance tasks (vector HNSW link cleanup, graph edge sweeps, timeseries segment compaction, array tile compaction, FTS LSM compaction) are CPU-hungry. Each database has a quota on how much core time maintenance can consume per minute:

ALTER DATABASE large_vector_search SET QUOTA (maintenance_cpu_pct = 50);  -- 50% of core time

The scheduler tracks CPU-seconds spent in maintenance per database per minute. Tasks that would exceed the cap are deferred to the next window. This prevents one database's compaction from starving interactive queries in another.
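The windowed budget works roughly like this sketch (class name invented; clock values passed explicitly to keep the example deterministic):

```python
class MaintenanceBudget:
    """Per-database maintenance CPU budget over a fixed window (sketch).
    maintenance_cpu_pct = 50 on a 60 s window allows 30 CPU-seconds per core."""
    def __init__(self, cpu_pct: float, window_s: float = 60.0):
        self.budget_s = window_s * cpu_pct / 100.0
        self.window_s = window_s
        self.window_start = 0.0
        self.spent_s = 0.0

    def try_spend(self, now: float, cpu_seconds: float) -> bool:
        if now - self.window_start >= self.window_s:
            # Roll over to a fresh window; previously deferred tasks can retry.
            self.window_start, self.spent_s = now, 0.0
        if self.spent_s + cpu_seconds > self.budget_s:
            return False  # over cap: defer this task to the next window
        self.spent_s += cpu_seconds
        return True
```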

Hierarchical Rate Limiting

Requests are bucketed at four scopes (most-specific first; first to deny wins):

user:{user_id} → org:{org_id} → tenant:{tenant_id} → database:{database_id}

The database bucket has capacity equal to max_qps. A request hitting any bucket's rate limit returns DATABASE_QUOTA_EXCEEDED or TENANT_QUOTA_EXCEEDED depending on which bucket triggered.

Error Codes

Quota enforcement produces these errors:

| Error | Trigger |
|---|---|
| TENANT_QUOTA_EXCEEDED | Tenant rate limit or concurrency exhausted |
| DATABASE_QUOTA_EXCEEDED | Database rate limit or concurrency exhausted |
| SERVER_OVERLOAD | Global cluster backpressure (queue > 95%) |
| QUOTA_OVERCOMMIT | Sum of tenant quotas > database quota (or database > global) |
| TENANT_VECTOR_DIM_EXCEEDED | Vector dimension exceeds tenant max_vector_dim |
| TENANT_GRAPH_DEPTH_EXCEEDED | Graph traversal depth exceeds tenant max_graph_depth |

See the error codes reference for full details.

Metrics and Observability

All quota-related metrics are labeled by database and (where applicable) tenant:

| Metric | Type | Labels |
|---|---|---|
| nodedb_database_qps | gauge | database="..." |
| nodedb_database_memory_bytes | gauge | database="..." |
| nodedb_database_storage_bytes | gauge | database="..." |
| nodedb_database_active_connections | gauge | database="..." |
| nodedb_database_bridge_queue_depth | gauge | database="..." |
| nodedb_database_wal_commit_latency_p99 | histogram | database="..." |
| nodedb_database_maintenance_cpu_seconds | counter | database="..." |
| nodedb_tenant_qps | gauge | database="...", tenant="..." |
| nodedb_tenant_memory_bytes | gauge | database="...", tenant="..." |
| nodedb_tenant_storage_bytes | gauge | database="...", tenant="..." |

Scrape these metrics with Prometheus and alert on growth in the _p99 latency metrics or on bridge queue depth above 85%.