Plain HTTP. JSON in, JSON out.
API endpoints live under /api/v1; /health and /metrics sit at the root. JSON request bodies, JSON responses.
Bearer token in the Authorization header when JWT is enabled.
Every endpoint below was exercised against a live server while writing this page.
Endpoint summary
| Group | Method · Path | What it does |
|---|---|---|
| Health | GET /health | Liveness + DB check |
| Metrics | GET /metrics | Performance metrics (JSON) |
| Metrics | GET /metrics/prometheus | Prometheus text exposition for Prometheus / Grafana scraping |
| Config | POST /api/v1/configure | Create or reconfigure a queue |
| Messages | POST /api/v1/push | Push 1..N messages |
| Messages | GET /api/v1/pop/queue/:q | Pop from any partition of :q |
| Messages | GET /api/v1/pop/queue/:q/partition/:p | Pop from a specific partition |
| Messages | GET /api/v1/pop | Pop by namespace / task filter |
| Messages | POST /api/v1/ack | Ack a single message |
| Messages | POST /api/v1/ack/batch | Ack many messages at once |
| Advanced | POST /api/v1/transaction | Atomic ack + push bundle |
| Advanced | POST /api/v1/lease/:id/extend | Extend a message lease |
| Resources | GET /api/v1/resources/overview | Cluster-wide stats |
| Resources | GET /api/v1/resources/queues | List queues |
| Resources | GET / DELETE /api/v1/resources/queues/:q | Inspect / delete a queue |
| Resources | GET /api/v1/resources/namespaces · /tasks | List tags |
| Resources | GET /api/v1/messages · /messages/:tx | Inspect raw messages |
| Resources | GET /api/v1/dlq | DLQ messages with filters |
| Status | GET /api/v1/status · /status/queues · /status/analytics | Dashboard data |
| Status | GET /api/v1/consumer-groups | Group lag & state |
Health
Returns {"status":"healthy","database":"connected","version":"…",…}. Always public, never gated by auth.
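A readiness probe should gate on both fields, not just an HTTP 200. A minimal sketch (the version value is illustrative):

```python
import json

def is_healthy(body: str) -> bool:
    """True only if the service is up AND its database is reachable."""
    doc = json.loads(body)
    return doc.get("status") == "healthy" and doc.get("database") == "connected"

# Payload in the documented shape; the version string is made up.
sample = '{"status": "healthy", "database": "connected", "version": "1.2.3"}'
print(is_healthy(sample))  # True
```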
Metrics & Prometheus
Two endpoints, both public (in JWT_SKIP_PATHS by default), neither needs a bearer token.
JSON metrics
Lightweight JSON snapshot intended for ad-hoc inspection and the bundled SDKs'
Admin.metrics() helper. Reports the live DB pool stats plus a stub for
uptime / requests / messages — for a real telemetry surface use
/metrics/prometheus below.
Prometheus exposition
Returns text/plain; version=0.0.4; charset=utf-8 in the standard
Prometheus text exposition format. Each scrape:
- reads in-process state for per-replica live gauges (process / db pool / threadpools / file buffer / sidecar / shared-state) — no DB hit;
- performs one DB call to queen.get_prometheus_metrics_v1() for cluster-wide series (lifetime totals, latest-bucket per-queue and per-worker, DLQ depth).
Cluster-singleton series (lifetime totals, DLQ depth) carry scope="cluster". They are the same on every replica — query them with max(...), never sum(...).
Per-replica series carry no extra label; Prometheus adds instance and job from your scrape config.
Per-queue series add queue; per-worker series add hostname, worker_id, pid.
Metric inventory
| Family | Type | Labels | Source |
|---|---|---|---|
| queen_uptime_seconds | gauge | — | live |
| queen_cpu_user_microseconds_total · queen_cpu_system_microseconds_total | counter | — | live (latest sample) |
| queen_process_resident_memory_bytes · queen_process_virtual_memory_bytes | gauge | — | live |
| queen_db_pool_size · _idle · _active | gauge | — | live |
| queen_threadpool_size · queen_threadpool_queue_size | gauge | pool="db\|system" | live |
| queen_response_registry_size | gauge | worker_id | live |
| queen_file_buffer_pending · _failed · _db_healthy | gauge | — | live |
| queen_maintenance_mode_enabled | gauge | — | live (TTL-cached) |
| queen_sidecar_op_count · _latency_microseconds · _items | gauge | op="push\|pop\|ack" | live (latest sample) |
| queen_queue_backoff_groups · _queues · _avg_interval_milliseconds | gauge | — | live |
| queen_shared_state_enabled | gauge | — | live |
| queen_qc_cache_size · _hits_total · _misses_total | gauge / counter | — | live (when shared-state enabled) |
| queen_servers | gauge | state="alive\|dead" | live (when shared-state enabled) |
| queen_transport_messages_total | counter | dir="sent\|received\|dropped" | live (when shared-state enabled) |
| queen_cluster_push_requests_total · _pop_requests_total · _ack_requests_total · _transactions_total | counter | scope="cluster" | DB · worker_metrics_summary |
| queen_cluster_push_messages_total · _pop_messages_total · _ack_messages_total | counter | scope="cluster" | DB |
| queen_cluster_ack_total | counter | scope, result="success\|failed" | DB |
| queen_cluster_db_errors_total · queen_cluster_dlq_total | counter | scope="cluster" | DB |
| queen_queue_pop_messages_per_minute | gauge | queue | DB · last bucket of queue_lag_metrics |
| queen_queue_pop_lag_milliseconds | gauge | queue, stat="avg\|max" | DB |
| queen_queue_push_requests_per_minute · _push_messages_per_minute · _pop_empty_per_minute · _transactions_per_minute | gauge | queue | DB |
| queen_queue_ack_per_minute | gauge | queue, result="success\|failed" | DB |
| queen_queue_parked_consumers · queen_queue_metrics_age_seconds | gauge | queue | DB |
| queen_worker_event_loop_lag_milliseconds · queen_worker_lag_milliseconds | gauge | hostname, worker_id, pid, stat="avg\|max" | DB · last bucket of worker_metrics |
| queen_worker_free_slots · queen_worker_job_queue_size | gauge | hostname, worker_id, pid, stat="avg\|min\|max" | DB |
| queen_worker_db_connections · queen_worker_backoff_size · queen_worker_jobs_done_per_minute | gauge | hostname, worker_id, pid | DB |
| queen_worker_requests_per_minute | gauge | hostname, worker_id, pid, op="push\|pop\|ack\|transaction" | DB |
| queen_worker_messages_per_minute | gauge | hostname, worker_id, pid, op="push\|pop\|ack" | DB |
| queen_worker_ack_per_minute | gauge | hostname, worker_id, pid, result="success\|failed" | DB |
| queen_worker_db_errors_per_minute · queen_worker_dlq_per_minute · queen_worker_metrics_age_seconds | gauge | hostname, worker_id, pid | DB |
| queen_dlq_depth | gauge | scope="cluster" | DB · dead_letter_queue |
| queen_dlq_depth_by_queue | gauge | queue | DB |
*_per_minute are gauges, not counters
The per-queue and per-worker *_per_minute series carry the
delta for the most recent minute bucket, already aggregated
server-side. Use the value directly as messages-per-minute or divide by 60 for
per-second. Do not wrap them in rate() — that would
compute the rate of a rate.
Use rate() only on the *_total counter families
(queen_cluster_*_total, queen_cpu_*_microseconds_total,
queen_qc_cache_*_total, queen_transport_messages_total).
Prometheus scrape config
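A minimal static scrape job (a sketch: job name, hostnames, and port are placeholders; substitute your own deployment targets):

```yaml
# prometheus.yml — illustrative; host:port values are placeholders
scrape_configs:
  - job_name: queen
    metrics_path: /metrics/prometheus
    scrape_interval: 15s
    static_configs:
      - targets: ["queen-1:8080", "queen-2:8080"]  # one target per replica
```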
Kubernetes pod annotations
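Under the conventional annotation-based discovery pattern (this assumes your Prometheus has a kubernetes_sd relabel config that honours these keys; the port is a placeholder):

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: /metrics/prometheus
    prometheus.io/port: "8080"   # placeholder: the container port you expose
```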
Useful PromQL recipes
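A few recipes consistent with the gauge/counter split described above (illustrative; adapt label matchers to your deployment):

```promql
# Cluster-wide push throughput (true counter, so rate() applies)
sum(rate(queen_cluster_push_messages_total[5m]))

# DLQ depth: cluster-singleton series, so max(), never sum()
max(queen_dlq_depth)

# Per-queue pop throughput in messages/sec (already a per-minute gauge)
queen_queue_pop_messages_per_minute / 60

# Worst event-loop lag per worker
max by (worker_id) (queen_worker_event_loop_lag_milliseconds{stat="max"})
```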
Configure a queue
| Option | Default | Notes |
|---|---|---|
| leaseTime | 300 | Seconds before a popped message returns to the queue if not acked. |
| retryLimit | 3 | Retries before DLQ. |
| retryDelay | 1000 | Milliseconds between retries. |
| priority | 0 | Higher-priority queues are drained first by multi-queue consumers. |
| maxSize | 10000 | Max in-flight messages. |
| delayedProcessing | 0 | Seconds to delay before a message becomes available. |
| windowBuffer | 0 | Seconds to coalesce per partition (window aggregation). |
| retentionSeconds | 0 | Pending-message retention. 0 = forever. |
| completedRetentionSeconds | 0 | Completed-message retention. |
| encryptionEnabled | false | Requires the QUEEN_ENCRYPTION_KEY env var. |
| deadLetterQueue / dlqAfterMaxRetries | false | Enable the DLQ + auto-route to it on retry exhaustion. |
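An illustrative request body for POST /api/v1/configure (a sketch: the exact envelope, e.g. whether options nest under an options key, may differ by version; option names match the table above):

```json
{
  "queue": "orders",
  "options": {
    "leaseTime": 300,
    "retryLimit": 3,
    "retryDelay": 1000,
    "deadLetterQueue": true,
    "dlqAfterMaxRetries": true
  }
}
```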
Push
partition defaults to "Default". transactionId
is server-generated if omitted; provide your own for idempotent pushes.
Optional: server-side push buffering (QoS 0)
For very high-rate fire-and-forget pushes, ask the server to batch-flush:
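The exact knobs are server-specific; as a purely hypothetical illustration of the shape such a request might take (the buffered and flushIntervalMs names are invented here, not confirmed by this page):

```json
{
  "queue": "clickstream",
  "buffered": true,
  "flushIntervalMs": 100,
  "items": [{ "partition": "Default", "data": { "event": "view" } }]
}
```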
Pop
| Query param | Default | Notes |
|---|---|---|
| batch | 1 | Max messages to return. With partitions>1 this is a global cap across all claimed partitions, not per-partition. |
| partitions | 1 | Wildcard pop only: claim up to N partitions in a single call. 1 = legacy single-partition behaviour. Ignored on the /partition/:p route. |
| wait | false | Long-polling: hold the connection until a message arrives or the timeout elapses. |
| timeout | 30000 | Long-polling timeout in ms. |
| consumerGroup | __QUEUE_MODE__ | Empty group ⇒ shared queue mode. |
| autoAck | false | Mark messages completed on delivery (one round-trip). |
| subscriptionMode | (server default) | new to skip historical messages on a brand-new group. |
| subscriptionFrom | — | ISO 8601 timestamp to replay from. |
Response shape:
Each message carries its own partitionId, partition,
leaseId, and consumerGroup; these per-message fields are
authoritative for ACK and lease-renew calls. The top-level partitionId
/ partition reflect the first claimed partition (kept
for back-compat with single-partition pops). partitionsClaimed is the
number of distinct partitions the response actually drained.
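An illustrative two-partition response in the shape described above (all values are made up, and messages likely carry further fields not shown here):

```json
{
  "partitionId": 12,
  "partition": "p-0",
  "partitionsClaimed": 2,
  "leaseId": "lease-abc",
  "messages": [
    { "partitionId": 12, "partition": "p-0", "leaseId": "lease-abc",
      "consumerGroup": "billing", "transactionId": "tx-1", "data": { "sku": "A-1" } },
    { "partitionId": 17, "partition": "p-3", "leaseId": "lease-abc",
      "consumerGroup": "billing", "transactionId": "tx-2", "data": { "sku": "B-2" } }
  ]
}
```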
Multi-partition pop (partitions=N)
Drain up to N partitions in a single round-trip. Designed for queues with many partitions where each partition only has a handful of new messages per polling interval; there, claiming one partition per call wastes bandwidth. The server walks eligible partitions in scan order, claims each one via a non-blocking advisory lock, and accumulates messages until either:
- the combined message count reaches batch, or
- the number of claimed partitions reaches partitions, or
- no more eligible partitions remain.
All claimed partitions share a single leaseId, so a
single POST /api/v1/lease/:leaseId/extend call extends every claimed
partition's lease atomically. Only valid on the wildcard route
(/api/v1/pop/queue/:queue); the per-partition route silently treats
partitions>1 as 1.
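Putting the pop parameters together: claim up to 8 partitions but at most 100 messages in total, long-polling for up to 30 s (the queue name is illustrative):

```
GET /api/v1/pop/queue/orders?batch=100&partitions=8&wait=true&timeout=30000
```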
Ack
transactionId is unique within a partition, not globally. You
must include the partitionId from the pop response so the right message
is acked.
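A sketch of deriving an ack body from a popped message. Pairing transactionId with partitionId follows the rule above; the "status" field name is an assumption for illustration:

```python
def ack_payload(message: dict, success: bool = True) -> dict:
    """transactionId alone is ambiguous across partitions, so always
    carry partitionId through from the pop response."""
    return {
        "transactionId": message["transactionId"],
        "partitionId": message["partitionId"],
        "status": "completed" if success else "failed",  # field name assumed
    }

popped = {"transactionId": "tx-1", "partitionId": 12, "data": {"sku": "A-1"}}
ack = ack_payload(popped)  # POST this to /api/v1/ack
```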
Transaction (atomic ack + push)
All operations execute in a single PostgreSQL transaction. Either everything commits, or nothing does. Use this for exactly-once message handoff between queues.
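An illustrative bundle: ack the input message and push the derived output atomically (the ack / push envelope keys are assumptions, not confirmed by this page):

```json
{
  "ack": [
    { "transactionId": "tx-1", "partitionId": 12 }
  ],
  "push": [
    { "queue": "invoices", "partition": "Default", "data": { "orderId": 42 } }
  ]
}
```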
Extend a lease
Use the leaseId returned by pop. Most clients renew leases for you when
you set renewLease(true, intervalMillis) on the consumer.
For multi-partition pops (partitions=N), all claimed partitions share
one leaseId; a single extend call updates every partition's lease in
one shot. No need to call extend per partition.
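For example, with the leaseId from the pop response (placeholder id; whether a body can carry an explicit new duration is not specified here):

```
POST /api/v1/lease/lease-abc/extend
```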
Resources
Status & analytics
These are what the dashboard renders. Useful for embedding Queen metrics into your own ops dashboards.
Dead-letter queue
Each entry includes the original message data, the errorMessage,
retryCount, original timestamps, and the consumer group it failed under.
Consumer groups
Returns all groups (named groups + the implicit __QUEUE_MODE__) with
member counts, total offset lag, max time-lag, and per-queue/partition detail. State
is one of Stable, Lagging, or Dead.
Auth header
When JWT is enabled (JWT_ENABLED=true), every request that isn't on the
public list (/health, /metrics,
/metrics/prometheus, dashboard) needs:
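The standard bearer scheme, as noted at the top of this page:

```
Authorization: Bearer <token>
```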
See Server setup → JWT for HS256 / RS256 / EdDSA configuration and the role-based gating model.
