Queen MQ is a message queue backed by PostgreSQL, with the partition count as a continuous knob. Slide it toward thousands and you get Kafka-shaped per-entity FIFO ordering without preallocating anything: one chat per partition, one user per partition, one workflow per partition. Slide it toward ten and you get RabbitMQ-shaped competing consumers at higher throughput, with per-shard ordering as a free bonus. Same engine, same SDK, same durability tier (synchronous_commit=on). One Docker container plus the Postgres you already run. Sized for workloads where one beefy Postgres is enough.
Numbers come from our benchmark page, including a 4-stage real-client pipeline (producer → worker → fan-out × 2). Use the sizing calculator to convert your target msg/s into a Postgres vCPU budget. Above ~200k msg/s sustained, or for multi-region replication, you want Kafka.
Kafka partitions are physical shards. They're brilliant for log-shipping, replication, and shovelling huge volumes through a streaming pipeline. They're awkward when you want one ordered lane per business entity: one chat per partition, one user per partition, one workflow per partition. Kafka makes you preallocate a fixed number of partitions and hash-mod keys into them, which means unrelated entities share a partition and one slow consumer can hold up many.
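The hash-mod scheme is easy to see in a few lines. This uses a toy hash, not Kafka's actual partitioner, but the pigeonhole effect is the same:

```javascript
// Toy illustration of fixed-partition hashing (NOT Kafka's real partitioner):
// with N preallocated shards, unrelated entity keys inevitably share a shard.
function toyPartition(key, numPartitions) {
  let h = 0
  for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) >>> 0
  return h % numPartitions
}

const numPartitions = 10
const perShard = new Map()
for (let i = 0; i < 1000; i++) {
  const shard = toyPartition(`chat-${i}`, numPartitions)
  perShard.set(shard, (perShard.get(shard) ?? 0) + 1)
}

// 1000 chats over 10 shards: some shard holds ~100 unrelated chats,
// and a slow consumer on that shard delays every one of them.
const busiest = Math.max(...perShard.values())
console.log(`busiest shard holds ${busiest} unrelated chats`)
```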
Queen reframes the partition as a logical ordering scope in a PostgreSQL-backed queue. You can have tens of thousands per queue, created on first push, with no preallocation. Slow processing on one chat doesn't slow another. It's not a Kafka replacement; it's a different shape, suited to a different range of workloads.
Queen was built at Smartness to power Smartchat, an AI guest-messaging platform: 100k+ concurrent chat conversations, AI translation steps, agent replies. One ordered partition per chat means one slow translation no longer freezes the others.
The script below is examples/base.js from the repo. It creates a queue, pushes a message, consumes it with a consumer group, then atomically acknowledges the input and pushes a derived message into a second queue: exactly-once across both operations.
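In sketch form, that flow looks roughly like this. It is not the repo file verbatim: the pipeline is parameterized on a client object exposing the fluent verbs the SDKs share, and option names such as leaseMs and retryLimit are illustrative assumptions.

```javascript
// Condensed sketch of the base.js flow -- NOT the repo file verbatim.
// The client is any object exposing the fluent verbs
// (queue().partition().group(), transaction().ack().push().commit());
// option names like leaseMs / retryLimit are illustrative assumptions.
async function runPipeline(client) {
  // Create/configure the input queue: lease time, retries, retention in one call.
  await client.queue('incoming').configure({ leaseMs: 30_000, retryLimit: 3 })

  // Push into an ordered lane: everything in partition 'p1' stays FIFO.
  await client.queue('incoming').partition('p1').push([{ data: { hello: 'world' } }])

  // Consume with a consumer group; each group keeps its own offset.
  const [msg] = await client.queue('incoming').group('workers').pop()

  // Atomically ack the input and push a derived message to a second queue:
  // one PostgreSQL transaction, exactly-once across both operations.
  await client.transaction().ack(msg).push([{ queue: 'derived', data: msg.data }]).commit()
  return msg
}
```

The transaction() chain is the key line: the ack and the push commit or roll back together.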
Lease time, retry limit, retention, optional encryption, optional DLQ. One POST /api/v1/configure call.
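That single call might look like the following. /api/v1/configure is the real endpoint; every JSON field name below is an illustrative assumption, not the documented schema.

```shell
# Hypothetical configure request -- field names are illustrative assumptions.
# QUEEN_URL is your server address, e.g. http://localhost:8080
curl -X POST "$QUEEN_URL/api/v1/configure" \
  -H 'Content-Type: application/json' \
  -d '{
    "queue": "incoming",
    "leaseMs": 30000,
    "retryLimit": 3,
    "retentionHours": 72,
    "encryption": false,
    "dlq": true
  }'
```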
Push to partition('p1') and every message in that lane is processed in order. Lanes are cheap: make one per user, tenant, or chat.
Each group has its own offset. Workers in a group share the load; separate groups all see every message.
Messages are leased to one consumer at a time. Renew leases for long jobs; failed ones retry, then go to the DLQ.
ack(input).push([output]).commit(). Wired straight into a PostgreSQL transaction, exactly-once across queues.
Pushes are written to a local file buffer when PostgreSQL is unreachable, then replayed automatically when it recovers. No lost messages.
The example above is JavaScript, but every Queen client speaks the same fluent verbs:
queue(name).partition(p).group(g).push() / .pop() / .consume().
All shipping today, not on a roadmap:
JavaScript (Node 22+ & browser, npm install queen-mq) ·
Python (3.8+, pip install queen-mq) ·
Go (1.24+) ·
PHP / Laravel (8.3+, composer require smartpricing/queen-mq) ·
C++ (header-only, C++17).
Or skip SDKs entirely and call the raw HTTP API.
See the full matrix →
Per-user, per-tenant, per-chat: tens of thousands of ordered lanes per queue. Slow processing on one lane doesn't slow another.
Process from beginning, only new messages, or replay from a timestamp. Each group has its own offset.
Atomic ack + push across queues, in one PostgreSQL transaction. Exactly-once between Queen operations; downstream effects are still your responsibility.
Server holds the connection until a message arrives. Inter-instance UDP wakeup keeps fan-out fast.
Configurable retry limit. Failed messages land in the DLQ with the error message and full payload.
If PostgreSQL is unreachable, pushes spill to a local file buffer and replay on recovery.
Trace a message across queues, transformations, retries, and consumer groups. Visualize timelines in the dashboard.
Native /metrics/prometheus exposition with per-queue, per-worker, and DLQ series. One DB call per scrape; ready for Grafana out of the box.
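A minimal Prometheus scrape job for that endpoint might look like this; the target address is a placeholder for your deployment.

```yaml
# prometheus.yml fragment -- the target host/port is a placeholder.
scrape_configs:
  - job_name: queen
    metrics_path: /metrics/prometheus
    static_configs:
      - targets: ['queen:8080']   # replace with your Queen server address
```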
HS256, RS256, and EdDSA. Read-only / read-write / admin role tiers. Per-message producerSub stamped from the JWT.
Real-time queues, message browser, analytics, trace explorer, DLQ management, all served by the same C++ binary.
Update-heavy tables ship with FILLFACTOR=50 and tuned autovacuum knobs to stay HOT-update-friendly. Advisory locks replace row contention on the hot path. Retention windows on messages_consumed keep the message log bounded.
One Docker container is enough for most setups. For HA, run a StatefulSet behind a headless service: pods coordinate via UDP peer wake-ups and the UDPSYNC shared-state cache, with affinity routing in the clients.
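In Kubernetes terms, that HA setup might be sketched like this. The image name, port, and environment variable are assumptions; only the headless-Service-plus-StatefulSet shape comes from the text above.

```yaml
# Hypothetical HA sketch -- image name and env vars are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: queen
spec:
  clusterIP: None        # headless: stable per-pod DNS for UDP peer wake-ups
  selector:
    app: queen
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: queen
spec:
  serviceName: queen
  replicas: 3
  selector:
    matchLabels:
      app: queen
  template:
    metadata:
      labels:
        app: queen
    spec:
      containers:
        - name: queen
          image: queen-mq:latest           # placeholder image name
          env:
            - name: PG_CONNECTION_STRING   # assumed env var name
              value: postgres://queen:secret@postgres:5432/queen
```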
Queen is good at a specific shape of workload, not at every queue use case. Picking infrastructure honestly matters more than picking the “winner” of every benchmark. Here's the truth, in three columns:
Pick RabbitMQ when you need its routing layer: topic exchanges with wildcard bindings (e.g. orders.*.created), headers exchanges, message priorities, alternate exchanges on unroutable messages.

Note: for the typical RabbitMQ workload (competing consumers with persistent messages and acks), Queen at low partition count gets you the same shape at higher throughput, lower RAM (~70 MB vs 2–5 GB), and a simpler protocol. The reasons left to pick RabbitMQ are mostly protocol and ecosystem, not raw queue mechanics.
Most production message-queue workloads at most companies are below 100k msg/s and need ordering of some sort. Queen targets that median workload, not Kafka's upper tail and not RabbitMQ's specific routing strengths. Our benchmarks show Queen, Kafka, and RabbitMQ head-to-head on identical hardware so you can decide for yourself, with numbers we measured rather than wishful thinking.
If your messages do real work (a DB write, an API call, an LLM inference), then your worker fleet is the bottleneck, not the broker. At 20 ms of work per message, one worker handles 50 msg/s. Saturating Kafka's 1.5M msg/s requires 30,000 worker processes (~$150k/month of consumer fleet). Most companies have 100–2,000 workers and run at 5k–100k msg/s of actual demand, well within Queen's broker envelope. In that regime, the broker comparison stops mattering, and Queen wins on operational cost, durability semantics, and Postgres-transactional integration. The benchmarks page has the full math.
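The arithmetic is worth spelling out:

```javascript
// Back-of-envelope worker math from the paragraph above.
const workMs = 20                          // real work per message
const perWorker = 1000 / workMs            // messages per second per worker
const kafkaCeiling = 1_500_000             // benchmark-grade broker throughput
const workersToSaturate = kafkaCeiling / perWorker

console.log(perWorker)           // 50 msg/s per worker
console.log(workersToSaturate)   // 30000 workers just to feed the broker

// A typical fleet, by the same math:
const fleet = 2000
console.log(fleet * perWorker)   // 100000 msg/s of actual demand
```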
Run PostgreSQL + Queen, then push a message and pop it back. No SDK required.
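A minimal local setup might be sketched as a compose file like this; the Queen image name, port, and env vars are assumptions, so check the repo's README for the real ones.

```yaml
# Hypothetical docker-compose sketch -- image name, port, and env vars
# are assumptions, not the documented setup.
services:
  postgres:
    image: postgres:17
    environment:
      POSTGRES_USER: queen
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: queen
  queen:
    image: queen-mq:latest            # placeholder image name
    depends_on: [postgres]
    environment:
      PG_CONNECTION_STRING: postgres://queen:secret@postgres:5432/queen
    ports:
      - '8080:8080'                   # placeholder port
```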