HTTP API

Plain HTTP. JSON in, JSON out.

All requests live under /api/v1. JSON request bodies, JSON responses. Bearer token in the Authorization header when JWT is enabled. Every endpoint below was exercised against a live server while writing this page.

Endpoint summary

Group	Method · Path	What it does
Health	GET `/health`	Liveness + DB check
Metrics	GET `/metrics`	Performance metrics (JSON)
Metrics	GET `/metrics/prometheus`	Prometheus text exposition for Prometheus / Grafana scraping
Config	POST `/api/v1/configure`	Create or reconfigure a queue
Messages	POST `/api/v1/push`	Push 1..N messages
Messages	GET `/api/v1/pop/queue/:q`	Pop from any partition of `:q`
Messages	GET `/api/v1/pop/queue/:q/partition/:p`	Pop from a specific partition
Messages	GET `/api/v1/pop`	Pop by namespace / task filter
Messages	POST `/api/v1/ack`	Ack a single message
Messages	POST `/api/v1/ack/batch`	Ack many messages at once
Advanced	POST `/api/v1/transaction`	Atomic ack + push bundle
Advanced	POST `/api/v1/lease/:id/extend`	Extend a message lease
Resources	GET `/api/v1/resources/overview`	Cluster-wide stats
Resources	GET `/api/v1/resources/queues`	List queues
Resources	GET / DELETE `/api/v1/resources/queues/:q`	Inspect / delete a queue
Resources	GET `/api/v1/resources/namespaces` · `/tasks`	List tags
Resources	GET `/api/v1/messages` · `/messages/:tx`	Inspect raw messages
Resources	GET `/api/v1/dlq`	DLQ messages with filters
Status	GET `/api/v1/status` · `/status/queues` · `/status/analytics`	Dashboard data
Status	GET `/api/v1/consumer-groups`	Group lag & state

Health

GET/health

request

curl http://localhost:6632/health

Returns {"status":"healthy","database":"connected","version":"…",…}. Always public, never gated by auth.

Metrics & Prometheus

Two endpoints, both public (in JWT_SKIP_PATHS by default), neither needs a bearer token.

JSON metrics

GET/metrics

request

curl http://localhost:6632/metrics

Lightweight JSON snapshot intended for ad-hoc inspection and the bundled SDKs' Admin.metrics() helper. Reports the live DB pool stats plus a stub for uptime / requests / messages — for a real telemetry surface use /metrics/prometheus below.

Prometheus exposition

GET/metrics/prometheus

scrape

curl http://localhost:6632/metrics/prometheus

Returns text/plain; version=0.0.4; charset=utf-8 in the standard Prometheus text exposition format. Each scrape:

reads in-process state for per-replica live gauges (process / db pool / threadpools / file buffer / sidecar / shared-state) — no DB hit;
performs one DB call to queen.get_prometheus_metrics_v1() for cluster-wide series (lifetime totals, latest-bucket per-queue and per-worker, DLQ depth).

Series labelling rules

Cluster-singleton series (lifetime totals, DLQ depth) carry scope="cluster". They are the same on every replica — query them with max(...), never sum(...).

Per-replica series carry no extra label; Prometheus adds instance and job from your scrape config.

Per-queue series add queue; per-worker series add hostname, worker_id, pid.

Metric inventory

Family	Type	Labels	Source
`queen_uptime_seconds`	gauge	—	live
`queen_cpu_user_microseconds_total` · `queen_cpu_system_microseconds_total`	counter	—	live (latest sample)
`queen_process_resident_memory_bytes` · `queen_process_virtual_memory_bytes`	gauge	—	live
`queen_db_pool_size` · `_idle` · `_active`	gauge	—	live
`queen_threadpool_size` · `queen_threadpool_queue_size`	gauge	`pool="db\|system"`	live
`queen_response_registry_size`	gauge	`worker_id`	live
`queen_file_buffer_pending` · `_failed` · `_db_healthy`	gauge	—	live
`queen_maintenance_mode_enabled`	gauge	—	live (TTL-cached)
`queen_sidecar_op_count` · `_latency_microseconds` · `_items`	gauge	`op="push\|pop\|ack"`	live (latest sample)
`queen_queue_backoff_groups` · `_queues` · `_avg_interval_milliseconds`	gauge	—	live
`queen_shared_state_enabled`	gauge	—	live
`queen_qc_cache_size` · `_hits_total` · `_misses_total`	gauge / counter	—	live (when shared-state enabled)
`queen_servers`	gauge	`state="alive\|dead"`	live (when shared-state enabled)
`queen_transport_messages_total`	counter	`dir="sent\|received\|dropped"`	live (when shared-state enabled)
`queen_cluster_push_requests_total` · `_pop_requests_total` · `_ack_requests_total` · `_transactions_total`	counter	`scope="cluster"`	DB · `worker_metrics_summary`
`queen_cluster_push_messages_total` · `_pop_messages_total` · `_ack_messages_total`	counter	`scope="cluster"`	DB
`queen_cluster_ack_total`	counter	`scope, result="success\|failed"`	DB
`queen_cluster_db_errors_total` · `queen_cluster_dlq_total`	counter	`scope="cluster"`	DB
`queen_queue_pop_messages_per_minute`	gauge	`queue`	DB · last bucket of `queue_lag_metrics`
`queen_queue_pop_lag_milliseconds`	gauge	`queue, stat="avg\|max"`	DB
`queen_queue_push_requests_per_minute` · `_push_messages_per_minute` · `_pop_empty_per_minute` · `_transactions_per_minute`	gauge	`queue`	DB
`queen_queue_ack_per_minute`	gauge	`queue, result="success\|failed"`	DB
`queen_queue_parked_consumers` · `queen_queue_metrics_age_seconds`	gauge	`queue`	DB
`queen_worker_event_loop_lag_milliseconds` · `queen_worker_lag_milliseconds`	gauge	`hostname, worker_id, pid, stat="avg\|max"`	DB · last bucket of `worker_metrics`
`queen_worker_free_slots` · `queen_worker_job_queue_size`	gauge	`hostname, worker_id, pid, stat="avg\|min\|max"`	DB
`queen_worker_db_connections` · `queen_worker_backoff_size` · `queen_worker_jobs_done_per_minute`	gauge	`hostname, worker_id, pid`	DB
`queen_worker_requests_per_minute`	gauge	`hostname, worker_id, pid, op="push\|pop\|ack\|transaction"`	DB
`queen_worker_messages_per_minute`	gauge	`hostname, worker_id, pid, op="push\|pop\|ack"`	DB
`queen_worker_ack_per_minute`	gauge	`hostname, worker_id, pid, result="success\|failed"`	DB
`queen_worker_db_errors_per_minute` · `queen_worker_dlq_per_minute` · `queen_worker_metrics_age_seconds`	gauge	`hostname, worker_id, pid`	DB
`queen_dlq_depth`	gauge	`scope="cluster"`	DB · `dead_letter_queue`
`queen_dlq_depth_by_queue`	gauge	`queue`	DB

*_per_minute are gauges, not counters

The per-queue and per-worker *_per_minute series carry the delta for the most recent minute bucket, already aggregated server-side. Use the value directly as messages-per-minute or divide by 60 for per-second. Do not wrap them in rate() — that would compute the rate of a rate.

Use rate() only on the *_total counter families (queen_cluster_*_total, queen_cpu_*_microseconds_total, queen_qc_cache_*_total, queen_transport_messages_total).

Prometheus scrape config

prometheus.yml

scrape_configs:
  - job_name: queen
    metrics_path: /metrics/prometheus
    scrape_interval: 15s
    static_configs:
      - targets:
          - 'queen-host-1:6632'
          - 'queen-host-2:6632'
          - 'queen-host-3:6632'

Kubernetes pod annotations

helm values

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/path:   "/metrics/prometheus"
  prometheus.io/port:   "6632"

Useful PromQL recipes

throughput & lag

# Cluster-wide push throughput (msg/s)
max(rate(queen_cluster_push_messages_total[1m]))

# Per-queue pop throughput (msg/s) using the per-minute gauge
sum by (queue) (queen_queue_pop_messages_per_minute) / 60

# Per-queue p100 pop lag in seconds
max by (queue) (queen_queue_pop_lag_milliseconds{stat="max"}) / 1000

# DB connection saturation per replica
queen_db_pool_active / queen_db_pool_size

# Worker imbalance: max event-loop lag across workers
max by (instance) (queen_worker_event_loop_lag_milliseconds{stat="max"})

# Cluster ack failure ratio over 5m
max(rate(queen_cluster_ack_total{result="failed"}[5m]))
  / clamp_min(max(rate(queen_cluster_ack_total[5m])), 1)

# Stale metrics alert: bucket older than 3 minutes
queen_queue_metrics_age_seconds > 180

# DLQ depth alert
queen_dlq_depth > 0

Configure a queue

POST/api/v1/configure

request

curl -X POST http://localhost:6632/api/v1/configure \
  -H 'Content-Type: application/json' \
  -d '{
    "queue":"orders",
    "namespace":"billing",
    "task":"process-invoice",
    "options":{
      "leaseTime":300,
      "retryLimit":3,
      "priority":5,
      "retentionSeconds":3600,
      "completedRetentionSeconds":86400,
      "encryptionEnabled":false,
      "deadLetterQueue":true,
      "dlqAfterMaxRetries":true
    }
  }'

Option	Default	Notes
`leaseTime`	300	Seconds before a popped message returns to the queue if not acked.
`retryLimit`	3	Retries before DLQ.
`retryDelay`	1000	Milliseconds between retries.
`priority`	0	Higher priority queues drained first by multi-queue consumers.
`maxSize`	10000	Max in-flight messages.
`delayedProcessing`	0	Seconds to delay before a message becomes available.
`windowBuffer`	0	Seconds to coalesce per partition (window aggregation).
`retentionSeconds`	0	Pending message retention. `0` = forever.
`completedRetentionSeconds`	0	Completed message retention.
`encryptionEnabled`	false	Requires `QUEEN_ENCRYPTION_KEY` env.
`deadLetterQueue` / `dlqAfterMaxRetries`	false	Enable + auto-route to DLQ on retry exhaustion.

Push

POST/api/v1/push

request

curl -X POST http://localhost:6632/api/v1/push \
  -H 'Content-Type: application/json' \
  -d '{
    "items":[
      {
        "queue":"orders",
        "partition":"user-123",
        "payload":{"orderId":1,"amount":99.99},
        "transactionId":"order-1-v1",
        "traceId":"019dc9-..."
      }
    ]
  }'

partition defaults to "Default". transactionId is server-generated if omitted; provide your own for idempotent pushes.

Optional: server-side push buffering (QoS 0)

For very high-rate fire-and-forget pushes, ask the server to batch-flush:

QoS 0 buffering

curl -X POST http://localhost:6632/api/v1/push \
  -H 'Content-Type: application/json' \
  -d '{
    "items":[{"queue":"events","payload":{"type":"login"}}],
    "bufferMs":100,
    "bufferMax":100
  }'

Pop

GET/api/v1/pop/queue/:queue

GET/api/v1/pop/queue/:queue/partition/:partition

GET/api/v1/pop?namespace=&task=

Query param	Default	Notes
`batch`	1	Max messages to return. With `partitions>1` this is a global cap across all claimed partitions, not per-partition.
`partitions`	1	Wildcard pop only: claim up to N partitions in a single call. `1` = legacy single-partition behaviour. Ignored on the `/partition/:p` route.
`wait`	false	Long-polling: hold connection until a message arrives or `timeout` elapses.
`timeout`	30000	Long-polling timeout in ms.
`consumerGroup`	__QUEUE_MODE__	Empty group ⇒ shared queue mode.
`autoAck`	false	Mark messages completed on delivery (one round-trip).
`subscriptionMode`	(server default)	`new` to skip historical messages on a brand-new group.
`subscriptionFrom`	,	ISO 8601 timestamp to replay from.

blocking pop with consumer group

curl 'http://localhost:6632/api/v1/pop/queue/orders?consumerGroup=processors&batch=10&wait=true&timeout=30000'

Response shape:

response (excerpt)

{
  "success": true,
  "queue":         "orders",
  "partition":     "Default",
  "partitionId":   "c0715-...",
  "leaseId":       "019dc9-...",
  "consumerGroup": "__QUEUE_MODE__",
  "messages": [
    {
      "transactionId": "019dc9-...",
      "partitionId":   "c0715-...",
      "partition":     "Default",
      "leaseId":       "019dc9-...",
      "consumerGroup": "__QUEUE_MODE__",
      "data":          { "orderId": 1, "amount": 99.99 },
      "createdAt":     "2026-04-26T10:59:28.267966Z",
      "retryCount":    0,
      "producerSub":   null
    }
  ],
  "partitionsClaimed": 1
}

Each message carries its own partitionId, partition, leaseId, and consumerGroup, these per-message fields are authoritative for ACK and lease-renew calls. The top-level partitionId / partition reflect the first claimed partition (kept for back-compat with single-partition pops). partitionsClaimed is the number of distinct partitions the response actually drained.

Multi-partition pop (`partitions=N`)

Drain up to N partitions in a single round-trip. Designed for queues with many partitions where each partition only has a handful of new messages per polling interval, claiming one partition per call wastes bandwidth. The server walks eligible partitions in scan order, claims each one via a non-blocking advisory lock, and accumulates messages until either:

the combined message count reaches batch, or
partitions partitions have been claimed, or
no more eligible partitions remain.

All claimed partitions share a single leaseId, so a single POST /api/v1/lease/:leaseId/extend call extends every claimed partition's lease atomically. Only valid on the wildcard route (/api/v1/pop/queue/:queue); the per-partition route silently treats partitions>1 as 1.

multi-partition pop: 200 messages from up to 50 partitions

curl 'http://localhost:6632/api/v1/pop/queue/events?batch=200&partitions=50&wait=true'

response (multi-partition, excerpt)

{
  "success": true,
  "queue":         "events",
  "partition":     "p7",                    // first claimed partition (back-compat)
  "partitionId":   "8f...",                 // first claimed partitionId (back-compat)
  "leaseId":       "worker-abc",            // shared across all claimed partitions
  "consumerGroup": "__QUEUE_MODE__",
  "messages": [
    {"transactionId": "...", "partitionId": "8f...", "partition": "p7",  "leaseId": "worker-abc", "data": {...}, ...},
    {"transactionId": "...", "partitionId": "12...", "partition": "p12", "leaseId": "worker-abc", "data": {...}, ...},
    {"transactionId": "...", "partitionId": "5e...", "partition": "p3",  "leaseId": "worker-abc", "data": {...}, ...}
  ],
  "partitionsClaimed": 8
}

Ack

POST/api/v1/ack

partitionId is required

transactionId is unique within a partition, not globally. You must include the partitionId from the pop response so the right message is acked.

completed

curl -X POST http://localhost:6632/api/v1/ack \
  -H 'Content-Type: application/json' \
  -d '{
    "transactionId":"<tx>",
    "partitionId":"<pid>",
    "status":"completed"
  }'

failed (will retry)

curl -X POST http://localhost:6632/api/v1/ack \
  -H 'Content-Type: application/json' \
  -d '{
    "transactionId":"<tx>",
    "partitionId":"<pid>",
    "status":"failed",
    "error":"timeout calling downstream",
    "consumerGroup":"processors"
  }'

POST/api/v1/ack/batch

batch

curl -X POST http://localhost:6632/api/v1/ack/batch \
  -H 'Content-Type: application/json' \
  -d '{
    "consumerGroup":"processors",
    "acknowledgments":[
      {"transactionId":"tx-1","partitionId":"pid-1","status":"completed"},
      {"transactionId":"tx-2","partitionId":"pid-2","status":"failed","error":"validation"}
    ]
  }'

Transaction (atomic ack + push)

POST/api/v1/transaction

two-stage pipeline

curl -X POST http://localhost:6632/api/v1/transaction \
  -H 'Content-Type: application/json' \
  -d '{
    "operations":[
      {
        "type":"ack",
        "transactionId":"<input-tx>",
        "partitionId":"<input-pid>",
        "status":"completed",
        "consumerGroup":"analytics"
      },
      {
        "type":"push",
        "items":[
          {"queue":"next-stage","partition":"p2","payload":{"derived":true}}
        ]
      }
    ]
  }'

All operations execute in a single PostgreSQL transaction. Either everything commits, or nothing does. Use this for exactly-once message handoff between queues.

Extend a lease

POST/api/v1/lease/:leaseId/extend

request

curl -X POST http://localhost:6632/api/v1/lease/<leaseId>/extend \
  -H 'Content-Type: application/json' \
  -d '{"seconds":60}'

Use the leaseId returned by pop. Most clients renew leases for you when you set renewLease(true, intervalMillis) on the consumer.

For multi-partition pops (partitions=N), all claimed partitions share one leaseId, a single extend call updates every partition's lease in one shot. No need to call extend per partition.

Resources

GET/api/v1/resources/overview

GET/api/v1/resources/queues

GET/api/v1/resources/queues/:queue

DELETE/api/v1/resources/queues/:queue

GET/api/v1/resources/namespaces

GET/api/v1/resources/tasks

GET/api/v1/messages?queue=&partition=&status=&limit=

GET/api/v1/messages/:transactionId

Status & analytics

GET/api/v1/status

GET/api/v1/status/queues

GET/api/v1/status/queues/:queue

GET/api/v1/status/queues/:queue/messages

GET/api/v1/status/analytics?interval=hour&from=&to=

These are what the dashboard renders. Useful for embedding Queen metrics into your own ops dashboards.

Dead-letter queue

GET/api/v1/dlq?queue=&consumerGroup=&partition=&from=&to=&limit=&offset=

filter by queue + date range

curl 'http://localhost:6632/api/v1/dlq?queue=orders&from=2025-12-01&to=2025-12-31&limit=100'

Each entry includes the original message data, the errorMessage, retryCount, original timestamps, and the consumer group it failed under.

Consumer groups

GET/api/v1/consumer-groups

Returns all groups (named groups + the implicit __QUEUE_MODE__) with member counts, total offset lag, max time-lag, and per-queue/partition detail. State is one of Stable, Lagging, or Dead.

Auth header

When JWT is enabled (JWT_ENABLED=true), every request that isn't on the public list (/health, /metrics, /metrics/prometheus, dashboard) needs:

JWT bearer

curl http://localhost:6632/api/v1/resources/queues \
  -H "Authorization: Bearer $QUEEN_TOKEN"

See Server setup → JWT for HS256 / RS256 / EdDSA configuration and the role-based gating model.

Plain HTTP. JSON in, JSON out.

Endpoint summary

Health

Metrics & Prometheus

JSON metrics

Prometheus exposition

Metric inventory

Prometheus scrape config

Kubernetes pod annotations

Useful PromQL recipes

Configure a queue

Push

Optional: server-side push buffering (QoS 0)

Pop

Multi-partition pop (partitions=N)

Ack

Transaction (atomic ack + push)

Extend a lease

Resources

Status & analytics

Dead-letter queue

Consumer groups

Auth header

Multi-partition pop (`partitions=N`)