Nginx Performance Tuning: HTTP Processing and Configuration

12 min read - May 26, 2026

Table of contents

Nginx HTTP Processing Flow: Configuration Tuning
How Nginx Processes HTTP Requests
Worker Processes and Connections
Timeouts and Keepalive
Buffer Sizing
Load Balancing and Upstream Keepalive
Testing and Monitoring

Tune Nginx worker processes, buffers, keepalive, and load balancing to handle 50,000+ requests per second on a single server.

Nginx HTTP Processing Flow: Configuration Tuning

Nginx's default configuration is designed for compatibility, not performance. With proper tuning, a single server can handle 50,000 to 80,000 requests per second, depending on hardware and workload. This guide covers the settings that matter most: worker processes, connections, keepalive, buffering, load balancing, and how to verify your changes with benchmarks.

How Nginx Processes HTTP Requests

Nginx handles requests in distinct phases, each consuming system resources. Understanding the flow helps you target the right settings.

It starts at the kernel level. Incoming connections land in SYN and ACCEPT queues, and Nginx worker processes pick them up. Once accepted, the worker parses the HTTP request from the kernel buffer. TLS traffic makes this step more CPU-intensive.

Next, Nginx matches the request to a virtual server using the Host header and IP/port combination, then resolves the URI to a location block via prefix or regex matching.

For dynamic content, Nginx forwards the request to a backend (FastCGI, proxy). This upstream communication phase benefits heavily from persistent connections. Without upstream keepalive, Nginx opens a new TCP connection per request, adding latency and CPU overhead.

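If proxy_buffering is enabled, Nginx reads the upstream response as fast as the backend can send it, holding it in memory buffers (and spilling to a temporary file if they fill up) while it relays the data to the client at the client's own pace. This frees the backend to handle new requests immediately instead of waiting on slow clients. Finally, during output delivery of static files, enabling sendfile lets the kernel copy data from the file straight to the socket without passing through user space. This zero-copy transfer can raise throughput considerably, by some accounts from around 6 Gbps to 30 Gbps, although results depend on hardware and file sizes.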

Every phase consumes memory buffers, file descriptors, and CPU cycles. The sections below target each bottleneck.

Worker Processes and Connections

Start with worker_processes auto; in your main context. This matches the worker count to your CPU cores. On a VPS with limited cores, set the number manually (e.g. worker_processes 2;). If your workload is memory-intensive, consider reducing workers to avoid overcommitting RAM.

Enable worker_cpu_affinity auto; to pin each worker to a specific core. This reduces cache misses and context switching. The auto value is available since Nginx 1.9.10, and the directive works only on Linux and FreeBSD.

Connection limits

The worker_connections directive sets how many simultaneous connections each worker can handle, counting connections to upstream servers as well as clients. Total capacity is worker_processes × worker_connections. The built-in default is 512 (some distribution packages ship 768 or 1,024), which is too low for production. Set it to 2,048 or 4,096 per worker for high-traffic sites.

Each connection needs at least one file descriptor. In a reverse proxy setup, each connection uses two (one for the client, one for the upstream). Set worker_rlimit_nofile to at least double your worker_connections value with headroom. For worker_connections 4096;, use worker_rlimit_nofile 10000; or higher.

At the system level, increase fs.file-max in /etc/sysctl.conf to at least 500,000 and set systemd's LimitNOFILE=65535.

Finally, add multi_accept on; in the events block so workers accept all pending connections at once rather than one at a time.
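
Put together, the settings in this section might look like the sketch below. The values are the illustrative ones used above, not measured recommendations, and should be adjusted to your core count and traffic.

```nginx
# Main context (top of nginx.conf)
worker_processes auto;          # one worker per CPU core
worker_cpu_affinity auto;       # pin workers to cores (auto needs Nginx 1.9.10+)
worker_rlimit_nofile 10000;     # more than 2 x worker_connections, with headroom

events {
    worker_connections 4096;    # per worker; upstream connections count too
    multi_accept on;            # accept all pending connections at once
}
```

With 4 workers, this allows up to 4 × 4,096 = 16,384 connections in total, or roughly half that many clients when every request is proxied to a backend.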

Timeouts and Keepalive

Client keepalive

The keepalive_timeout directive controls how long idle client connections stay open. For high-traffic servers, 30 to 65 seconds works well. Use the two-parameter form:

keepalive_timeout 65s 60s;

The first value is the server-side timeout. The second sends a Keep-Alive: timeout=60 header to the client. Setting the client value slightly lower prevents race conditions where browsers try to reuse connections that Nginx has already closed.

The keepalive_requests directive caps how many requests a single connection handles before being retired. The default is 1,000 (raised from 100 in version 1.19.10). For stable backends, increase this to 10,000 to reduce connection churn.
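
As a minimal sketch, the two client keepalive directives can sit together in the http context. The numbers are the ones suggested above and are starting points rather than tested values.

```nginx
http {
    # Close idle client connections after 65s; advertise 60s to the client
    keepalive_timeout 65s 60s;

    # Retire a client connection after it has served this many requests
    keepalive_requests 10000;
}
```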

Proxy timeouts

proxy_connect_timeout sets how long Nginx waits to establish a connection with the backend. The default is 60 seconds. Reduce it to 5 to 10 seconds for fast failover.

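proxy_read_timeout defines how long Nginx waits between successive reads from the upstream, not for the whole response. Align it with your backend's execution timeout. If the backend allows requests to run for 120 seconds, set proxy_read_timeout to at least 120 seconds to avoid premature 504 errors. For PHP-FPM reached through fastcgi_pass, the equivalent directive is fastcgi_read_timeout, which should be aligned with request_terminate_timeout in the same way.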

proxy_send_timeout controls the interval between successive writes to the upstream. The default of 60 seconds is usually fine unless you're sending large request bodies.

As a rule, proxy_connect_timeout should always be the shortest of the three.
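
The three proxy timeouts could be combined as in the sketch below. The upstream name backend and the 120-second read timeout are assumptions for illustration; match the read timeout to whatever your own backend allows.

```nginx
location / {
    proxy_pass http://backend;     # "backend" is an assumed upstream name

    proxy_connect_timeout 5s;      # shortest of the three: fail over quickly
    proxy_send_timeout 60s;        # gap allowed between writes to the upstream
    proxy_read_timeout 120s;       # gap allowed between reads; match the backend
}
```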

Buffer Sizing

client_body_buffer_size controls how much of an incoming request body Nginx holds in memory. The default of 8k or 16k handles simple form submissions, but file uploads will spill to disk. Increase it to 128k for small-to-medium uploads. Raise client_max_body_size from its default of 1m if users upload larger files.

proxy_buffer_size handles response headers separately from the body. The default of 4k or 8k usually works, but applications with large Set-Cookie headers (common in e-commerce) can exceed it, causing 502 errors. Measure your actual header size:

curl -s -w '%{size_header}' -o /dev/null http://your-upstream-url

Round up to the nearest 4k increment.

proxy_buffers sets the number and size of buffers for the response body. The default (8 buffers of 4k or 8k, totalling 32k to 64k) won't hold a large JSON response. Measure your largest typical response with curl and configure enough buffers to keep it entirely in RAM.

Proxy buffering behaviour

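With proxy_buffering on (the default), Nginx reads the upstream response as quickly as the backend can send it, storing it in the buffers set by proxy_buffer_size and proxy_buffers and writing any overflow to a temporary file, while it forwards the data to the client at the client's pace. This lets backend servers move on to new requests immediately.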

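proxy_busy_buffers_size limits how much of the buffer pool can be busy sending to the client while the response is still being read. Nginx requires it to be at least as large as the bigger of proxy_buffer_size and a single proxy_buffers buffer, and smaller than the total buffer pool minus one buffer. For proxy_buffers 8 16k (128k total), keep proxy_busy_buffers_size under 112k.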
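
A buffer configuration along these lines is one way to apply the sizing advice in this section. It is a sketch only: the 20m upload limit and the 16k header buffer are assumed values, and the right numbers come from measuring your own requests and responses.

```nginx
server {
    # Request bodies
    client_body_buffer_size 128k;   # bodies up to this size stay in memory
    client_max_body_size 20m;       # assumed upload limit; default is 1m

    # Upstream responses
    proxy_buffer_size 16k;          # response headers (first part of the response)
    proxy_buffers 8 16k;            # body buffers: 8 x 16k = 128k per connection
    proxy_busy_buffers_size 32k;    # between 16k and 112k for the pool above
}
```

These buffers are allocated per connection, so multiply the totals by your expected concurrency before raising them further.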

For real-time endpoints like Server-Sent Events or long-polling, disable buffering with proxy_buffering off in a location-specific block. Apply this selectively to /stream or /events paths, not globally.
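
A location-specific override might look like the following sketch. The /events path and the backend upstream name are placeholders, and the one-hour read timeout is an assumption for long-lived streams.

```nginx
location /events {
    proxy_pass http://backend;        # "backend" is an assumed upstream name
    proxy_http_version 1.1;
    proxy_set_header Connection "";

    proxy_buffering off;              # pass each chunk to the client as it arrives
    proxy_read_timeout 1h;            # assumption: the stream may idle between events
}
```

Every other location keeps the default proxy_buffering on.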

Load Balancing and Upstream Keepalive

By default, Nginx opens a new TCP connection for every proxied request. Each handshake costs a network round trip: well under a millisecond between hosts on the same network, but 10 to 100ms when the backend is further away, with TLS adding another 10 to 50ms on top. Upstream keepalive maintains a pool of persistent connections to eliminate this overhead.

Three settings are required:

upstream backend {
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    keepalive 128;
}
 
server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

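The keepalive value sets the maximum number of idle connections to the upstream group that each worker process keeps open; it does not cap the total number of upstream connections. As a rough starting point, size the pool as: (workers × target concurrency) / upstream nodes, then adjust based on what you observe under load.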

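Set the upstream keepalive_timeout (available since Nginx 1.15.3, default 60 seconds) to between 60 and 120 seconds, and keep it no longer than your backend's own idle timeout so that Nginx does not try to reuse a connection the backend has already closed.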

Balancing strategies

Nginx supports several load balancing methods. Round-robin (the default) distributes requests sequentially. least_conn routes to the server with the fewest active connections, which suits workloads with variable request durations. ip_hash provides session persistence by routing the same client IP to the same backend.

Use the weight parameter when servers have different capacities:

upstream backend {
    least_conn;
    server 10.0.1.10:8080 weight=3;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080 backup;
    keepalive 128;
}

The weighted server receives three times the traffic. The backup server only activates if all primary servers are down. Configure max_fails and fail_timeout for passive health checks, and use max_conns to cap simultaneous connections per backend.
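
The passive health check and connection cap parameters could be added to the same upstream block as in this sketch. The thresholds shown are assumed values for illustration, not recommendations.

```nginx
upstream backend {
    least_conn;

    # Mark a server unavailable for 10s after 3 failed attempts within 10s,
    # and cap simultaneous connections to each server.
    server 10.0.1.10:8080 weight=3 max_fails=3 fail_timeout=10s max_conns=300;
    server 10.0.1.11:8080          max_fails=3 fail_timeout=10s max_conns=100;
    server 10.0.1.12:8080 backup;

    keepalive 128;
}
```

Without a zone directive in the upstream block, max_conns is enforced separately by each worker process, so the effective cap is the configured value multiplied by the number of workers.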

Testing and Monitoring

Always measure baseline performance before changing anything. After each change, benchmark again. Make one change at a time.

Validate your config syntax with nginx -t before reloading. Run benchmarks from a separate machine to avoid resource contention.

wrk is a widely used tool for HTTP load testing:

wrk -t4 -c200 -d30s http://your-server.com/

Track requests per second, average latency, max latency, and transfer rate. Apache Bench works for simpler tests:

ab -n 50000 -c 40 http://your-server.com/

Ongoing monitoring

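Enable the stub_status module to monitor active connections in real time. It requires Nginx built with ngx_http_stub_status_module and a location that exposes it, for example /nginx_status, which you can then query with curl http://localhost/nginx_status.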
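
A minimal status endpoint, restricted to local access, might be configured as follows. The /nginx_status path matches the curl command above; the access rules are an assumption about where monitoring runs.

```nginx
server {
    listen 80;
    server_name localhost;

    location = /nginx_status {
        stub_status;          # needs ngx_http_stub_status_module
        access_log off;       # keep monitoring polls out of the access log
        allow 127.0.0.1;      # assumption: monitoring runs on the same host
        allow ::1;
        deny all;
    }
}
```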

Add timing variables to your log format to identify where delays occur:

$request_time for total request duration
$upstream_connect_time for backend connection time
$upstream_response_time for total backend processing time
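
One possible log format using these variables is sketched below. The format name timing and the short field labels (rt, uct, urt) are arbitrary choices made for this example.

```nginx
http {
    # "timing" is a hypothetical format name
    log_format timing '$remote_addr [$time_local] "$request" $status '
                      'rt=$request_time uct=$upstream_connect_time '
                      'urt=$upstream_response_time';

    access_log /var/log/nginx/access.log timing;
}
```

A large gap between rt and urt suggests time spent on the client side, while a high uct points at connection setup to the backend.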

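Check the error log for buffer issues. Nginx reports "an upstream response is buffered to a temporary file" at the warn level, so error_log must be set to warn or a more verbose level to see it. If Nginx logs to the journal, use journalctl -u nginx --no-pager | grep "temporary file"; otherwise grep the error log file (commonly /var/log/nginx/error.log). If responses are hitting disk, your proxy_buffers are too small. Look for "too many open files" errors, which indicate worker_rlimit_nofile needs raising.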

Reduce log I/O with buffered logging:

access_log /var/log/nginx/access.log combined buffer=64k flush=5s;

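Use ss -tn state established dst [backend_ip] during load tests to verify connections are being reused: with keepalive working, the set of established connections stays stable. Run ss -tn state time-wait dst [backend_ip] as well to confirm closed connections are not piling up in TIME_WAIT.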
