# Performance Tuning
This guide covers how to tune the transport for high-throughput workloads. The most impactful settings are backpressure controls and concurrency limits.
## Backpressure & flow control
Pull consumers provide implicit backpressure — the client controls how fast it pulls messages from the server. The transport never receives more messages than it can handle.
The message flow through the system:
```
NATS Server → pull consumer (max_ack_pending) → consume() buffer → RxJS mergeMap (concurrency) → handler → ack → frees slot
```
The primary flow-control knob is `max_ack_pending` on the consumer. It limits how many messages can be in flight (delivered but not yet acknowledged) at any time. When the limit is reached, the server stops delivering new messages until existing ones are acknowledged, creating natural backpressure.
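As a rough capacity check, the in-flight limit bounds throughput via Little's law. This is generic queueing arithmetic, not part of the library:

```typescript
// Back-of-envelope capacity check using Little's law:
// throughput ≈ effective parallelism / average handler latency.
// The effective in-flight limit is the smaller of max_ack_pending
// and the handler concurrency limit.
function estimateThroughput(
  maxAckPending: number,
  concurrency: number,
  avgHandlerSeconds: number,
): number {
  const inFlight = Math.min(maxAckPending, concurrency);
  return inFlight / avgHandlerSeconds; // messages per second
}

// e.g. 100 in-flight slots, 50 parallel handlers, 20 ms per message:
// min(100, 50) / 0.02 = 2500 msg/s ceiling.
console.log(estimateThroughput(100, 50, 0.02)); // 2500
```

If your target rate exceeds this ceiling, raise `max_ack_pending` or `concurrency` (whichever is the minimum) or reduce handler latency.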
## Configuration options
| Option | Default | Effect |
|---|---|---|
| `consumer.max_ack_pending` | 100 | Max in-flight unacked messages |
| `consumer.ack_wait` | 10s (events) / 5min (RPC) | Ack deadline before redelivery |
| `concurrency` | unlimited | Limits parallel handler execution |
| `consume.max_messages` | `@nats-io/jetstream` default | Internal prefetch buffer size |
| `consume.threshold_messages` | 75% of `max_messages` | Auto-refill trigger |
### How these interact

- `max_ack_pending` is the ceiling: the server will never deliver more than this many unacked messages.
- `concurrency` controls how many messages are processed in parallel by your handlers. If `concurrency` is lower than `max_ack_pending`, the excess messages wait in the internal buffer.
- `consume.max_messages` and `consume.threshold_messages` control the prefetch buffer: how many messages the `@nats-io/jetstream` client requests from the server in a single batch, and when it requests more.

For most workloads, tuning `max_ack_pending` and `concurrency` is sufficient.
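A minimal tuning pass might therefore touch only those two knobs (the values below are illustrative, not recommendations):

```typescript
events: {
  consumer: { max_ack_pending: 300 }, // server-side in-flight ceiling
  concurrency: 100,                   // client-side parallel handlers
}
```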
## Consume options

The `consume` field controls the internal prefetch buffer used by the `@nats-io/jetstream` `consume()` call. These options determine how aggressively the client pulls messages from the server.
```typescript
events: {
  consume: {
    max_messages: 200,        // prefetch up to 200 messages
    threshold_messages: 150,  // request more when the buffer drops below 150
    idle_heartbeat: 10_000,   // 10s heartbeat to detect stalled consumers
  },
}
```
The `idle_heartbeat` option tells the server to send periodic heartbeats when there are no messages to deliver, letting the client detect broken connections and re-establish the subscription.
## Concurrency control

The `concurrency` option limits the number of messages processed in parallel by the `mergeMap` operator in the message pipeline. Without it, all pulled messages are dispatched to handlers immediately.
```typescript
events: {
  concurrency: 200, // max 200 handlers running in parallel
}
```
**Guideline:** Set `concurrency` at or below `max_ack_pending`. If `concurrency` exceeds `max_ack_pending`, the server-side limit becomes the effective bottleneck anyway.
When `concurrency` is set, messages that exceed the limit are buffered in the RxJS pipeline until a handler slot frees up. This prevents overloading downstream resources (databases, external APIs) while still keeping the prefetch buffer full.
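The buffering behavior can be illustrated with a plain-TypeScript semaphore. This is a conceptual sketch of what the RxJS `mergeMap` concurrency limit does, not the transport's actual pipeline:

```typescript
// Conceptual sketch: run at most `concurrency` handlers at once;
// excess messages wait in a queue, like mergeMap's internal buffer.
async function processWithLimit<T>(
  messages: T[],
  concurrency: number,
  handler: (msg: T) => Promise<void>,
): Promise<void> {
  let active = 0;
  const queue = [...messages];
  return new Promise((resolve, reject) => {
    const next = () => {
      if (queue.length === 0 && active === 0) return resolve();
      // Fill free handler slots from the buffered queue.
      while (active < concurrency && queue.length > 0) {
        const msg = queue.shift()!;
        active++;
        handler(msg)
          .then(() => {
            active--; // a slot frees up; pull the next buffered message
            next();
          })
          .catch(reject);
      }
    };
    next();
  });
}
```

With 500 buffered messages and a limit of 200, at most 200 handler promises are pending at any moment; the rest sit in the queue until a slot frees.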
## Ack extension

Long-running handlers risk exceeding the `ack_wait` deadline, causing redelivery of messages that are still being processed. The `ackExtension` option prevents this by periodically extending the ack deadline while the handler runs.
```typescript
events: {
  consumer: { ack_wait: toNanos(30, 'seconds') },
  ackExtension: true,
}
```
When enabled, the transport calls `msg.working()` at regular intervals (half the `ack_wait` period) to signal that the message is still being processed, which resets the ack deadline on the server.
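The mechanism amounts to a timer running alongside the handler. A simplified sketch, assuming a message object exposing `working()` as `@nats-io/jetstream` messages do (the real transport manages this internally):

```typescript
// Simplified sketch of ack extension: call msg.working() at half the
// ack_wait interval until the handler settles, then stop.
async function withAckExtension<T>(
  msg: { working(): void },
  ackWaitMs: number,
  handler: () => Promise<T>,
): Promise<T> {
  const timer = setInterval(() => msg.working(), ackWaitMs / 2);
  try {
    return await handler();
  } finally {
    clearInterval(timer); // stop extending once the handler settles
  }
}
```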
**When to use:**

- Handlers that call slow external APIs or run complex computations.
- RPC handlers with long timeouts: `ackExtension` keeps the command message alive while the handler runs.
**Interaction with RPC timeout:** For JetStream RPC, the RPC timeout determines how long the caller waits, while `ackExtension` keeps the server-side message alive independently. Configure both to accommodate the expected handler duration.
## S2 compression
All streams default to Snappy S2 compression, reducing disk I/O and storage with negligible CPU overhead (~1-3%). Requires NATS Server >= 2.10.
See Default Configs — S2 Compression for details and how to override per stream kind.
## Connection defaults

The transport configures unlimited reconnection attempts (`maxReconnectAttempts: -1`) and a 1-second reconnection interval by default, so the application automatically recovers from transient network failures.
See Default Configs — Connection Defaults for the full list and how to override.
## Complete example
A production-ready configuration tuned for high throughput:
```typescript
import { toNanos } from '@horizon-republic/nestjs-jetstream';

JetstreamModule.forRoot({
  name: 'orders',
  servers: ['nats://localhost:4222'],
  events: {
    consumer: { max_ack_pending: 500, ack_wait: toNanos(30, 'seconds') },
    consume: { idle_heartbeat: 10_000 },
    concurrency: 200,
    ackExtension: true,
  },
  broadcast: {
    concurrency: 50,
  },
  rpc: {
    mode: 'jetstream',
    timeout: 60_000,
    concurrency: 50,
    ackExtension: true,
  },
})
```
This configuration:
- Allows up to 500 unacked event messages, with 200 processed in parallel.
- Extends ack deadlines automatically for long-running event and RPC handlers.
- Limits broadcast processing to 50 concurrent handlers.
- Sets a 60-second RPC timeout with ack extension to prevent redelivery during processing.
## Performance expectations
Actual throughput depends heavily on handler complexity, network latency, and NATS deployment. Here are rough ballpark numbers for a single Node.js instance on modern hardware (16-core, 32 GB RAM, local NATS):
| Scenario | Throughput | Notes |
|---|---|---|
| Event handlers (fast, no I/O) | 10k–50k msg/s | CPU-bound; concurrency helps up to core count |
| Event handlers (DB write per msg) | 1k–5k msg/s | I/O-bound; increase concurrency and max_ack_pending |
| RPC Core mode (fast handler) | 5k–20k req/s | Lowest latency; no stream overhead |
| RPC JetStream mode | 1k–10k req/s | Adds stream write + inbox reply |
| Broadcast (N instances) | Same per-instance as events | Each instance processes independently |
Always benchmark your specific workload. The bottleneck is usually the handler, not the transport. Profile your handlers first before tuning transport settings.
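To act on that advice, a trivial sequential harness can time a handler in isolation. This is plain TypeScript, independent of the transport; it measures the handler alone and sequentially, not concurrent end-to-end throughput:

```typescript
// Trivial sequential harness: measures calls per second for a single
// handler in isolation (no concurrency, no transport involved).
async function measureThroughput(
  handler: () => Promise<void>,
  iterations: number,
): Promise<number> {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    await handler();
  }
  // Guard against a zero elapsed time for near-instant handlers.
  const seconds = Math.max((Date.now() - start) / 1000, 1e-6);
  return iterations / seconds;
}
```

Multiply the result by your intended `concurrency` for a crude upper bound on parallel throughput, then compare against the `max_ack_pending` ceiling.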
## Measuring your throughput
- **NATS monitoring:** Enable the NATS monitoring endpoint (`-m 8222`) and check `msgs_in`/`msgs_out` rates.
- **Consumer lag:** Run `nats consumer info <stream> <consumer>` and watch "Num Pending" to detect a consumer falling behind.
- **Application metrics:** Use Lifecycle Hooks to emit Prometheus counters for `MessageRouted`, `Error`, and `RpcTimeout` events.
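As an illustration of the last point, counting lifecycle events is just an increment per event name. The wiring below is hypothetical (`onLifecycleEvent` is not a library API); in production you would back it with a Prometheus client counter instead of a `Map`:

```typescript
// Minimal in-process counter keyed by lifecycle event name.
// MessageRouted / Error / RpcTimeout are the event names mentioned above.
const counters = new Map<string, number>();

function onLifecycleEvent(eventName: string): void {
  counters.set(eventName, (counters.get(eventName) ?? 0) + 1);
}

// Hypothetical wiring: call onLifecycleEvent from your lifecycle hook.
onLifecycleEvent('MessageRouted');
onLifecycleEvent('MessageRouted');
onLifecycleEvent('RpcTimeout');

console.log(counters.get('MessageRouted')); // 2
```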
## See also
- Default Configs — all stream, consumer, and connection defaults
- Module Configuration — where to set these options
- Troubleshooting — diagnosing consumer lag