What happens when you push faster than you can consume.
Every real system eventually runs behind: a traffic spike, a slow downstream, a deploy, or simply fan-out amplifying every message N times. The question that matters then isn't "what's the peak rate?" but "what does the queue do when producers outrun consumers?" So we built a realistic application workload: fan-out to two consumer groups, explicit per-message ack, 20 ms handlers, transient failures plus a dead-letter path, and a transactional ack-and-forward pipeline. Then we deliberately drove push above the sustainable consume rate for a full hour on a two-machine setup. This is the over-provisioned-push story.
The scenario: a realistic application, not a pipe
Unlike a raw throughput test, this models how an app actually uses a queue. Every knob below is on at once, on a dedicated broker machine with the load generator on a separate machine over a private network, so the broker+Postgres host has all 32 cores to itself.
- Fan-out. Two independent consumer groups, each receiving a full copy of every message, so consume demand is 2× the push rate.
- Explicit acknowledgement. Consumers pop, do real work (~20 ms per message, ±50% jitter, plus a 1% slow tail at 200 ms), then ack in a separate round-trip from the pop, which exercises the lease lifecycle.
- Per-entity ordering. 5,000 session partitions, each written by a single producer with a monotonic sequence, selected with a realistic Zipf skew (a few hot sessions, a long cold tail).
- Failures & DLQ. 1% of deliveries fail transiently and are redelivered; 0.1% are poison messages that always fail and land in the dead-letter queue after 5 attempts.
- Transactional pipeline. 5% of the primary group's completions are an atomic ack-and-push into a second queue (exactly-once handoff), itself drained by a third consumer group.
- The stress. Push is driven at a closed-loop 12,000 msg/s, chosen to sit just above the sustainable consume ceiling, for a full hour.
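The failure path in the list above (transient failures that are redelivered, poison messages that exhaust their retries and are dead-lettered) can be illustrated with a small simulation. This is a minimal sketch, not Queen's implementation or goload's code: the `run` function, the in-memory deque standing in for the queue, and the way failures are sampled are all assumptions made for illustration. Only the percentages and the retry limit come from the workload described here. It also ignores per-partition ordering, since a failed message is simply re-appended to the tail.

```python
import random
from collections import deque

RETRY_LIMIT = 5            # attempts before dead-lettering (from the workload)
TRANSIENT_FAIL_PCT = 1.0   # % of deliveries that fail and are redelivered
POISON_PCT = 0.1           # % of messages that fail on every attempt


def run(n_messages, seed=0):
    """Simulate one consumer group; returns (acked, dead_lettered, redeliveries)."""
    rng = random.Random(seed)
    # Each queue entry is (message_id, is_poison, attempts_so_far).
    queue = deque(
        (i, rng.random() * 100 < POISON_PCT, 0) for i in range(n_messages)
    )
    acked, dlq, redeliveries = [], [], 0
    while queue:
        msg_id, poison, attempts = queue.popleft()
        attempts += 1
        failed = poison or rng.random() * 100 < TRANSIENT_FAIL_PCT
        if not failed:
            acked.append(msg_id)                      # handler succeeded: ack
        elif attempts >= RETRY_LIMIT:
            dlq.append(msg_id)                        # retries exhausted: dead-letter
        else:
            redeliveries += 1
            queue.append((msg_id, poison, attempts))  # hand it out again later
    return acked, dlq, redeliveries


acked, dlq, redeliveries = run(100_000, seed=1)
print(len(acked), len(dlq), redeliveries)
```

Every message ends up either acked or dead-lettered, never both, and each dead-lettered message accounts for four redeliveries before its fifth and final attempt.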
What Queen does when it's behind
Sustained over-rate is where many queues misbehave: they push back on producers, drop messages, reorder under contention, or spiral as their storage outgrows RAM. Across the full hour at 2× fan-out demand, Queen 0.16 did none of that:
- Producers were never throttled. Push held exactly 12,000 msg/s for the entire hour with 0 push errors. The lag never propagated back to the producer. The broker accepted all 43,200,340 messages.
- Nothing was lost. Every one of the ~5,000 partitions kept being delivered to both groups; the growing gap was simply pending work, fully accounted for, not dropped.
- Order held. Per-partition FIFO was preserved within each consumer group even under heavy competing-consumer concurrency (independently verified with an offline order-checker on companion runs: every partition's sequence delivered in order, zero gaps).
- The backlog stayed bounded, and self-drained. The pending queue grew linearly to ~8.7M and the live table to ~17 GB, but stayed inside Postgres' 24 GB cache the whole time (no cold-disk spiral), and retention reclaimed it back to ~4 GB once producers stopped. Backpressure, not breakage.
- No errors, no crashes, no restarts across 43.2M pushes and 78.3M deliveries. The system degraded gracefully and predictably (linear, no cliff).
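The ordering claim above is the kind of property an offline checker can verify from a delivery log. The following is a hedged sketch of such a checker, not the tool used for the companion runs: the `(group, partition, seq)` record shape and the `check_order` name are hypothetical, and it assumes the log contains only the final, acked delivery of each message (redelivered attempts would otherwise show up as duplicates).

```python
def check_order(deliveries):
    """Check per-partition FIFO within each consumer group.

    deliveries: iterable of (group, partition, seq) tuples in delivery order.
    Returns a list of violation strings; an empty list means every
    (group, partition) stream was delivered in order with no gaps.
    """
    last_seq = {}      # (group, partition) -> highest sequence seen so far
    violations = []
    for group, partition, seq in deliveries:
        key = (group, partition)
        prev = last_seq.get(key)
        if prev is None:
            last_seq[key] = seq            # first message seen for this stream
            continue
        if seq <= prev:
            violations.append(f"{key}: seq {seq} arrived after {prev}")
        elif seq != prev + 1:
            violations.append(f"{key}: gap between {prev} and {seq}")
        last_seq[key] = max(prev, seq)
    return violations


log = [("g0", "s1", 1), ("g1", "s1", 1), ("g0", "s1", 2), ("g1", "s1", 2)]
print(check_order(log))  # prints []
```

Keying on the (group, partition) pair matters: with fan-out, each group sees its own copy of the stream, so order is only meaningful within a group.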
Throughput & backpressure
Push is closed-loop at the target; consume runs at whatever Postgres can sustain. The difference is the backlog, and it behaves exactly as a durable buffer should.
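The relationship between push rate, fan-out, and deficit is simple arithmetic. A rough sketch using only figures quoted on this page follows; treating the ±50% jitter as symmetric around 20 ms and applying Little's law to estimate handler concurrency are assumptions of this sketch, and the estimate ignores pop and ack round-trip time.

```python
PUSH_RATE = 12_000       # msg/s, the closed-loop push target
FAN_OUT = 2              # consumer groups, each receiving a full copy
CONSUME_RATE = 22_500    # deliveries/s, approximate sustained rate in this run
DURATION_S = 3_600

demand = PUSH_RATE * FAN_OUT       # deliveries/s needed to keep up
deficit = demand - CONSUME_RATE    # deliveries/s falling behind
pushed = PUSH_RATE * DURATION_S    # messages accepted at exactly the target rate

# Little's law (L = lambda * W): handlers busy on average at full demand.
# Mean handler time: 99% of messages at a 20 ms mean (symmetric jitter leaves
# the mean unchanged) plus a 1% slow tail at 200 ms.
mean_handler_s = 0.99 * 0.020 + 0.01 * 0.200
busy_consumers = demand * mean_handler_s

print(demand)                 # 24000
print(deficit)                # 1500
print(pushed)                 # 43200000
print(round(busy_consumers))  # 523
```

Roughly 523 busy handlers against the 800 configured consumers suggests handler time was not the limiting factor, which is consistent with consume running at whatever Postgres can sustain.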
| Push (in) | 12.0k/s · Held for 60 min · Total 43,200,340 · Errors 0 |
| Consume (out, 2 groups) | ~22.5k/s · Demand (2×) 24.0k/s · Delivered 78,287,947 · Deficit ~1.5k/s |
| Backlog (the lag) | 8.74M · Growth linear, bounded · After load drained · Lost 0 |
Correctness: messy workload, clean books
| Volume | 43.2M · Pushed 43,200,340 · Delivered 78,287,947 · Groups 2 (fan-out) |
| Errors | 0 · Push 0 · Pop 0 · Ack 0 |
| Reliability paths | DLQ ✓ · Redeliveries 302,643 · Poison → DLQ as designed · Restarts 0 |
Storage & self-recovery: backpressure, not a spiral
The classic failure mode for a queue-on-a-database under sustained lag is the live table
outgrowing RAM, after which inserts start faulting cold index pages from disk and
throughput collapses. Queen 0.16's parallel retention (and a generous
max_wal_size) keep that from happening.
| Live table | 17 → 4 GB · Peak (1h) ~17 GB · vs cache < 24 GB · After drain ~4 GB |
| Host memory | ~34 GB of 64 GB · OOM none · Cold reads none |
| Where the work goes | PG ~18/32 · Postgres ~17-20 cores · Broker ~1.3 cores · PG conns 306, steady |
… shared_buffers, so reads never went cold) while completed messages were
retention-pruned continuously. When producers stopped, the pending-retention window
reclaimed the backlog and the table fell back to ~4 GB on its own.
Transactions kept up, even while the queue lagged
Exactly-once handoff is the deliberately heavy path: an atomic ack-and-push has to serialize its partition (advisory lock + commit-ordered stamp) so the ack and the downstream push commit together. That cost is the guarantee.
| Atomic ack + push | 1.75M · Committed 1,747,085 · Errors 7 (4e-6) · Stage-2 pending 0 |
The point
Even while the primary queue was running a sustained backlog, the transactional pipeline stayed fully drained (stage-2 pending held at zero) and committed 1.75M exactly-once handoffs with a 0.0004% error rate. The serializing, commit-bound path was the reliable part of the system under stress, not a bottleneck that fell over.
Configuration: exactly what ran
Broker: Queen 0.16 (pushser)
| Host | 32 vCPU · 64 GiB · dedicated |
| Image | smartnessai/queen-mq:0.16 |
| NUM_WORKERS | 10 |
| DB_POOL_SIZE | 50 (+250 sidecar) |
| RETENTION_PARALLELISM | 8 |
| Network | private VPC (loader off-host) |
PostgreSQL 18
| shared_buffers | 24 GB |
| effective_cache_size | 48 GB |
| max_wal_size | 96 GB |
| max_connections | 400 |
| work_mem / maint | 32 MB / 1 GB |
| synchronous_commit | on |
Loader: goload app-mode
| Host | 16 vCPU · 32 GiB · separate VM |
| Target push | 12,000 msg/s (closed-loop) |
| Producers | 300 · push-batch 20 · single-writer/session |
| Partitions | 5,000 sessions · Zipf skew 1.1 |
| Consumer groups | 2 (fan-out) |
| Consumers | 400 / group (800 total) |
| Pop | batch 100 · 10 partitions/pop · long-poll |
| Handler | 20 ms ±50% · 1% slow @ 200 ms |
| Failures | 1% transient · 0.1% poison → DLQ (retry 5) |
| Transactions | 5% of group-0 → stage-2 (atomic ack+push) |
| Payload | 256-2,048 B (variable JSON) |
| Lease / retention | 30 s · completed 120 s · pending 3600 s |
| Duration | 3,600 s |
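Two of the loader settings above are easier to reason about with numbers: how concentrated the traffic is under a Zipf skew of 1.1, and how much each producer has to push. The sketch below assumes the classic rank-based Zipf form, where the share of rank k is proportional to k^-s. goload's sampler may be parameterized differently, so treat the shares as indicative only. The `zipf_shares` helper is hypothetical.

```python
def zipf_shares(n_sessions=5_000, skew=1.1):
    """Traffic share per session rank, with P(rank k) proportional to k**-skew."""
    weights = [k ** -skew for k in range(1, n_sessions + 1)]
    total = sum(weights)
    return [w / total for w in weights]


shares = zipf_shares()
hottest = shares[0]            # share of pushes landing on the single hottest session
top_1pct = sum(shares[:50])    # share landing on the 50 hottest sessions (1% of 5,000)

# Average producer load at the 12,000 msg/s target with 300 producers.
per_producer_rate = 12_000 / 300                  # msg/s per producer
batches_per_producer = per_producer_rate / 20     # push calls/s at push-batch 20

print(f"hottest session: {hottest:.1%}, top 1% of sessions: {top_1pct:.1%}")
print(per_producer_rate, batches_per_producer)    # 40.0 2.0
```

Because each session is written by a single producer, the per-producer figures are averages; producers that own hot sessions carry more than their even share.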
Reproduce it
The load generator is goload in app-mode (built on the official Go client),
in benchmark-queen/.
The exact run:
```sh
goload -mode app -url http://<broker>:6632 \
  -queue appq -stage2-queue appq-stage2 \
  -target-rate 12000 -producers 300 -push-batch 20 \
  -sessions 5000 -skew 1.1 -consumer-groups 2 -consumers-per-group 400 \
  -pop-batch 100 -pop-partitions 10 -pop-wait -idle-conns 8192 \
  -process-ms 20 -process-jitter 0.5 -slow-msg-pct 1 -slow-msg-ms 200 \
  -fail-pct 1 -poison-pct 0.1 -retry-limit 5 \
  -tx-pct 5 -stage2-consumers 80 \
  -payload-min 256 -payload-max 2048 -lease-time 30 \
  -completed-retention 120 -pending-retention 3600 \
  -report 30 -duration 3600
```
24-hour durability soak
The balanced-load counterpart: ~119k msg/s push and pop for a day, 10.4B messages, zero loss, flat memory.
Throughput & scaling
Batch sizing, partition scaling to 10k, and fan-out vs the 0.14 baseline.
How much Postgres do I need?
Turn a target msg/s and your fan-out into a PG vCPU budget.
