Performance Tuning

This guide covers how to tune the transport for high-throughput workloads. The most impactful settings are backpressure controls and concurrency limits.

Backpressure & flow control

Pull consumers provide implicit backpressure — the client controls how fast it pulls messages from the server. The transport never receives more messages than it can handle.

The message flow through the system:

NATS Server → pull consumer (max_ack_pending) → consume() buffer → RxJS mergeMap (concurrency) → handler → ack → frees slot

The primary flow control knob is max_ack_pending on the consumer. It limits how many messages can be in-flight (delivered but not yet acknowledged) at any time. When the limit is reached, the server stops delivering new messages until existing ones are acknowledged — creating natural backpressure.
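For example, raising the ceiling for a high-throughput consumer is a one-line change (the connection name and server URL here are placeholders):

```typescript
JetstreamModule.forRoot({
  name: 'orders',                     // placeholder connection name
  servers: ['nats://localhost:4222'],
  events: {
    // Allow up to 1000 delivered-but-unacked messages before the
    // server pauses delivery to this consumer.
    consumer: { max_ack_pending: 1000 },
  },
})
```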

Configuration options

| Option | Default | Effect |
| --- | --- | --- |
| `consumer.max_ack_pending` | 100 | Max in-flight unacked messages |
| `consumer.ack_wait` | 10s (events) / 5min (RPC) | Ack deadline before redelivery |
| `concurrency` | unlimited | Limits parallel handler execution |
| `consume.max_messages` | `@nats-io/jetstream` default | Internal prefetch buffer size |
| `consume.threshold_messages` | 75% of `max_messages` | Auto-refill trigger |

How these interact

  • max_ack_pending is the ceiling. The server will never deliver more than this many unacked messages.
  • concurrency controls how many messages are processed in parallel by your handlers. If concurrency is lower than max_ack_pending, excess messages wait in the internal buffer.
  • consume.max_messages and consume.threshold_messages control the prefetch buffer — how many messages the @nats-io/jetstream client requests from the server in a single batch and when it requests more.

For most workloads, tuning max_ack_pending and concurrency is sufficient.
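As a concrete sketch of the interplay (figures are illustrative):

```typescript
events: {
  // The server may deliver up to 500 unacked messages at once...
  consumer: { max_ack_pending: 500 },
  // ...but only 200 handlers run in parallel; the remaining ~300
  // messages wait in the client-side buffer until a slot frees up.
  concurrency: 200,
}
```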

Consume options

The consume field controls the internal prefetch buffer used by the @nats-io/jetstream consume() call. These options determine how aggressively the client pulls messages from the server.

```typescript
events: {
  consume: {
    max_messages: 200,        // prefetch up to 200 messages
    threshold_messages: 150,  // request more when buffer drops below 150
    idle_heartbeat: 10_000,   // 10s heartbeat to detect stalled consumers
  },
}
```

The idle_heartbeat option enables the server to send periodic heartbeats when there are no messages to deliver. This allows the client to detect broken connections and re-establish the subscription.

Concurrency control

The concurrency option limits the number of messages processed in parallel by the mergeMap operator in the message pipeline. Without it, all pulled messages are dispatched to handlers immediately.

```typescript
events: {
  concurrency: 200, // max 200 handlers running in parallel
}
```

Guideline: Set concurrency at or below max_ack_pending. If concurrency exceeds max_ack_pending, the server-side limit will be the effective bottleneck anyway.

When concurrency is set, messages that exceed the limit are buffered in the RxJS pipeline until a handler slot frees up. This prevents overloading downstream resources (databases, external APIs) while still maintaining a full prefetch buffer.

Ack extension

Long-running handlers risk exceeding the ack_wait deadline, causing redelivery of messages that are still being processed. The ackExtension option prevents this by periodically extending the ack deadline while the handler is running.

```typescript
events: {
  consumer: { ack_wait: toNanos(30, 'seconds') },
  ackExtension: true,
}
```

When enabled, the transport calls msg.working() at regular intervals — every half ack_wait period — to signal that the message is still being processed. This resets the ack deadline on the server.
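Conceptually, the extension loop looks like the following sketch. It is illustrative only — startAckExtension is a hypothetical name, not part of the transport's API — but msg.working() is the real @nats-io/jetstream call that resets the ack deadline:

```typescript
// Sketch of an ack-extension loop (illustrative, not the transport's actual code).
function startAckExtension(
  msg: { working(): void },
  ackWaitMs: number,
): () => void {
  // Fire at half the ack_wait period so an extension always lands
  // before the server-side deadline expires.
  const timer = setInterval(() => msg.working(), ackWaitMs / 2);
  // The caller stops the loop once the handler settles and the message is acked.
  return () => clearInterval(timer);
}
```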

When to use:

  • Handlers that call slow external APIs or run complex computations.
  • RPC handlers with long timeouts — ackExtension keeps the command message alive while the handler runs.

Interaction with RPC timeout: For JetStream RPC, the RPC timeout determines how long the caller waits. The ackExtension keeps the server-side message alive independently. Both should be configured to accommodate the expected handler duration.
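A sketch of sizing both knobs together, assuming the rpc block accepts a consumer override like the events block does (names and figures are illustrative):

```typescript
import { toNanos } from '@horizon-republic/nestjs-jetstream';

JetstreamModule.forRoot({
  name: 'billing',                  // placeholder connection name
  servers: ['nats://localhost:4222'],
  rpc: {
    mode: 'jetstream',
    timeout: 60_000,                                 // caller waits up to 60s for a reply
    consumer: { ack_wait: toNanos(30, 'seconds') },  // server redelivers after 30s of silence
    ackExtension: true,                              // working() every ~15s resets that deadline
  },
})
```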

S2 compression

All streams default to S2 compression (a Snappy-compatible format), reducing disk I/O and storage with negligible CPU overhead (~1-3%). Requires NATS Server >= 2.10.

See Default Configs — S2 Compression for details and how to override per stream kind.

Connection defaults

The transport configures unlimited reconnection attempts (maxReconnectAttempts: -1) and a 1-second reconnection interval by default. This ensures the application automatically recovers from transient network failures.

See Default Configs — Connection Defaults for the full list and how to override.
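If less aggressive retries are wanted, a sketch of an override — assuming the standard nats.js connection options (maxReconnectAttempts, reconnectTimeWait) pass through forRoot:

```typescript
JetstreamModule.forRoot({
  name: 'orders',               // placeholder connection name
  servers: ['nats://localhost:4222'],
  maxReconnectAttempts: 10,     // stop after 10 attempts instead of retrying forever
  reconnectTimeWait: 2_000,     // wait 2s between attempts instead of the 1s default
})
```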

Complete example

A production-ready configuration tuned for high throughput:

```typescript
import { toNanos } from '@horizon-republic/nestjs-jetstream';

JetstreamModule.forRoot({
  name: 'orders',
  servers: ['nats://localhost:4222'],
  events: {
    consumer: { max_ack_pending: 500, ack_wait: toNanos(30, 'seconds') },
    consume: { idle_heartbeat: 10_000 },
    concurrency: 200,
    ackExtension: true,
  },
  broadcast: {
    concurrency: 50,
  },
  rpc: {
    mode: 'jetstream',
    timeout: 60_000,
    concurrency: 50,
    ackExtension: true,
  },
})
```

This configuration:

  • Allows up to 500 unacked event messages, with 200 processed in parallel.
  • Extends ack deadlines automatically for long-running event and RPC handlers.
  • Limits broadcast processing to 50 concurrent handlers.
  • Sets a 60-second RPC timeout with ack extension to prevent redelivery during processing.

Performance expectations

Actual throughput depends heavily on handler complexity, network latency, and NATS deployment. Here are rough ballpark numbers for a single Node.js instance on modern hardware (16-core, 32 GB RAM, local NATS):

| Scenario | Throughput | Notes |
| --- | --- | --- |
| Event handlers (fast, no I/O) | 10k–50k msg/s | CPU-bound; concurrency helps up to core count |
| Event handlers (DB write per msg) | 1k–5k msg/s | I/O-bound; increase concurrency and max_ack_pending |
| RPC Core mode (fast handler) | 5k–20k req/s | Lowest latency; no stream overhead |
| RPC JetStream mode | 1k–10k req/s | Adds stream write + inbox reply |
| Broadcast (N instances) | Same per-instance as events | Each instance processes independently |

These are guidelines, not guarantees

Always benchmark your specific workload. The bottleneck is usually the handler, not the transport. Profile your handlers first before tuning transport settings.

Measuring your throughput

  1. NATS monitoring: Enable the NATS monitoring endpoint (-m 8222) and check msgs_in/msgs_out rates.
  2. Consumer lag: nats consumer info <stream> <consumer> — watch "Num Pending" to detect falling behind.
  3. Application metrics: Use Lifecycle Hooks to emit Prometheus counters for MessageRouted, Error, and RpcTimeout events.

See also