Queen MQ

← All benchmarks

Behavior under overload · Queen 0.16 (pushser) · June 2026

What happens when you push faster than you can consume.

Every real system eventually runs behind: a traffic spike, a slow downstream, a deploy, or simply fan-out amplifying every message N times. The question that matters then isn't "what's the peak rate." It's what does the queue do when producers outrun consumers? So we built a realistic application workload: two consumer-group fan-out, explicit per-message ack, 20 ms handlers, transient failures + a dead-letter path, and a transactional ack-and-forward pipeline. Then we deliberately drove push above the sustainable consume rate for a full hour on a two-machine setup. This is the over-provisioned-push story.

Headline

Duration 1.0 h push held above consume
Messages pushed 43.2 M 12k/s, flat the whole hour
Push errors 0 producers never blocked
Lost messages 0 backlog ≠ loss
Delivered (fan-out) 78.3 M in order, to both groups
Backlog at peak 8.7 M bounded & self-draining
Broker CPU ~1.3 / 32 cores a thin pipe under lag
Restarts / crashes 0 no death-spiral
Verdict. Pushed ~1.5k msg/s above what it could drain for a full hour, Queen 0.16 did exactly what infrastructure should do under sustained lag: accepted every push (0 errors), lost nothing, preserved per-partition order, kept the backlog bounded and in memory, and drained itself cleanly once the load eased, all on a single Postgres node with the broker idling at ~1.3 cores. The queue absorbed the lag instead of failing under it.

The scenario: a realistic application, not a pipe

Unlike a raw throughput test, this models how an app actually uses a queue. Every knob below is on at once, on a dedicated broker machine with the load generator on a separate machine over a private network, so the broker+Postgres host has all 32 cores to itself.

What Queen does when it's behind

Sustained over-rate is where many queues misbehave: they push back on producers, drop messages, reorder under contention, or spiral as their storage outgrows RAM. Across the full hour at 2× fan-out demand, Queen 0.16 did none of that:

Throughput & backpressure

Push is closed-loop at the target; consume runs at whatever Postgres can sustain. The difference is the backlog, and it behaves exactly as a durable buffer should.

Push (in)

12.0k/s Held for60 min Total43,200,340 Errors0

Consume (out, 2 groups)

~22.5k/s Demand (2×)24.0k/s Delivered78,287,947 Deficit~1.5k/s

Backlog (the lag)

8.74M Growthlinear, bounded After loaddrained Lost0
End-to-end latency under lag: queue-wait p50 ~1 s, completion p50 ~4 s, exactly the depth of the buffer you've chosen to let build. The instant producers ease below the consume rate, that drains away. Latency under overload is a tunable (your target rate vs. consumer fleet), not a failure.

Correctness: messy workload, clean books

Volume

43.2M Pushed43,200,340 Delivered78,287,947 Groups2 (fan-out)

Errors

0 Push0 Pop0 Ack0

Reliability paths

DLQ ✓ Redeliveries302,643 Poison → DLQas designed Restarts0
Across 43.2M pushes and 78.3M deliveries, with 1% transient failures and a poison-message DLQ path active, the broker recorded zero push, pop, or ack errors. Failed messages were redelivered (at-least-once), poison messages reached the dead-letter queue, and per-partition order held throughout.

Storage & self-recovery: backpressure, not a spiral

The classic failure mode for a queue-on-a-database under sustained lag is the live table outgrowing RAM, after which inserts start faulting cold index pages from disk and throughput collapses. Queen 0.16's parallel retention (and a generous max_wal_size) keep that from happening.

Live table

17 → 4GB Peak (1h)~17 GB vs cache< 24 GB After drain~4 GB

Host memory

~34GB of64 GB OOMnone Cold readsnone

Where the work goes

PG ~18/32 Postgres~17-20 cores Broker~1.3 cores PG conns306, steady
The backlog lived entirely in cache and reclaimed itself. The live table tracked the pending backlog up to ~17 GB (under the 24 GB shared_buffers, so reads never went cold) while completed messages were retention-pruned continuously. When producers stopped, the pending-retention window reclaimed the backlog and the table fell back to ~4 GB on its own.

Transactions kept up, even while the queue lagged

Exactly-once handoff is the deliberately heavy path: an atomic ack-and-push has to serialize its partition (advisory lock + commit-ordered stamp) so the ack and the downstream push commit together. That cost is the guarantee.

Atomic ack + push

1.75M Committed1,747,085 Errors7 (4e-6) Stage-2 pending0

The point

Even while the primary queue was running a sustained backlog, the transactional pipeline stayed fully drained (stage-2 pending held at zero) and committed 1.75M exactly-once handoffs with a 0.0004% error rate. The serializing, commit-bound path was the reliable part of the system under stress, not a bottleneck that fell over.

Configuration: exactly what ran

Broker: Queen 0.16 (pushser)

Host32 vCPU · 64 GiB · dedicated
Imagesmartnessai/queen-mq:0.16
NUM_WORKERS10
DB_POOL_SIZE50 (+250 sidecar)
RETENTION_PARALLELISM8
Networkprivate VPC (loader off-host)

PostgreSQL 18

shared_buffers24 GB
effective_cache_size48 GB
max_wal_size96 GB
max_connections400
work_mem / maint32 MB / 1 GB
synchronous_commiton

Loader: goload app-mode

Host16 vCPU · 32 GiB · separate VM
Target push12,000 msg/s (closed-loop)
Producers300 · push-batch 20 · single-writer/session
Partitions5,000 sessions · Zipf skew 1.1
Consumer groups2 (fan-out)
Consumers400 / group (800 total)
Popbatch 100 · 10 partitions/pop · long-poll
Handler20 ms ±50% · 1% slow @ 200 ms
Failures1% transient · 0.1% poison → DLQ (retry 5)
Transactions5% of group-0 → stage-2 (atomic ack+push)
Payload256-2,048 B (variable JSON)
Lease / retention30 s · completed 120 s · pending 3600 s
Duration3,600 s
Why 12k? For this realistic profile (explicit ack + 2× fan-out), the flat-sustainable push rate on this single Postgres node is ~11k msg/s. The throughput ceiling is Postgres CPU on the pop/ack procedures, not the broker. Running at 12k puts the system deliberately ~10% over that line so we can watch it carry a real backlog. To sustain a higher flat rate you reduce per-message database work (server-side auto-ack to fuse pop+ack, or fewer consumer groups), not add consumers.

Reproduce it

The load generator is goload in app-mode (built on the official Go client), in benchmark-queen/. The exact run:

goload -mode app -url http://<broker>:6632 \
  -queue appq -stage2-queue appq-stage2 \
  -target-rate 12000 -producers 300 -push-batch 20 \
  -sessions 5000 -skew 1.1 -consumer-groups 2 -consumers-per-group 400 \
  -pop-batch 100 -pop-partitions 10 -pop-wait -idle-conns 8192 \
  -process-ms 20 -process-jitter 0.5 -slow-msg-pct 1 -slow-msg-ms 200 \
  -fail-pct 1 -poison-pct 0.1 -retry-limit 5 \
  -tx-pct 5 -stage2-consumers 80 \
  -payload-min 256 -payload-max 2048 -lease-time 30 \
  -completed-retention 120 -pending-retention 3600 \
  -report 30 -duration 3600
Companion benchmark

24-hour durability soak

The balanced-load counterpart: ~119k msg/s push and pop for a day, 10.4B messages, zero loss, flat memory.

0.16 · Matrix

Throughput & scaling

Batch sizing, partition scaling to 10k, and fan-out vs the 0.14 baseline.

Sizing

How much Postgres do I need?

Turn a target msg/s and your fan-out into a PG vCPU budget.