docs(http2): add performance tuning section

- Add three tuning profiles: Conservative, Balanced, High Concurrency - Document pros/cons and trade-offs for each HTTP/2 setting - Add Linux kernel tuning parameters for high concurrency - Add Docker-specific tuning recommendations - Include benchmark results from h2load testing
2026-07-01 18:21:30 +08:00 · 2026-01-27 16:11:05 +01:00 · 2026-01-27 16:11:05 +01:00 · 49193223d5
commit 49193223d5
parent 34dc8b8fad
1 changed files with 186 additions and 0 deletions
--- a/docs/content/guides/http2.md
+++ b/docs/content/guides/http2.md
@ -454,6 +454,192 @@ If clients report stream limit errors:
 http2_max_concurrent_streams = 200  # Increase from default 100
 ```

+## Performance Tuning
+
+HTTP/2 performance depends on proper tuning of both Gunicorn and system settings.
+This section covers different tuning profiles and their trade-offs.
+
+### Tuning Profiles
+
+#### Conservative (Default)
+
+Best for: Low to moderate traffic, memory-constrained environments.
+
+```python
+# gunicorn.conf.py - Conservative profile
+workers = 2
+worker_class = "gthread"
+threads = 4
+
+http2_max_concurrent_streams = 100
+http2_initial_window_size = 65535      # 64KB
+http2_max_frame_size = 16384           # 16KB
+```
+
+| Pros | Cons |
+|------|------|
+| Low memory footprint | Limited throughput at high concurrency |
+| Safe defaults per RFC | More round-trips for large transfers |
+| Works on constrained systems | May bottleneck at ~10K req/s |
+
+#### Balanced
+
+Best for: Moderate traffic, general production use.
+
+```python
+# gunicorn.conf.py - Balanced profile
+workers = 4
+worker_class = "gthread"
+threads = 4
+backlog = 2048
+
+http2_max_concurrent_streams = 128
+http2_initial_window_size = 262144     # 256KB
+http2_max_frame_size = 16384           # 16KB
+```
+
+| Pros | Cons |
+|------|------|
+| Good throughput (15K+ req/s) | More memory per connection |
+| Handles traffic spikes | Requires more CPU |
+| Good balance of resources | |
+
+#### High Concurrency
+
+Best for: High traffic APIs, microservices, load testing.
+
+```python
+# gunicorn.conf.py - High concurrency profile
+workers = 4
+worker_class = "gthread"
+threads = 8
+backlog = 2048
+worker_connections = 10000
+
+http2_max_concurrent_streams = 256
+http2_initial_window_size = 1048576    # 1MB
+http2_max_frame_size = 32768           # 32KB
+```
+
+| Pros | Cons |
+|------|------|
+| High throughput (20K+ req/s) | Higher memory usage (~4x conservative) |
+| Handles 1000s of clients | Requires system tuning |
+| Better large transfer performance | May overwhelm downstream services |
+
+### Setting Trade-offs
+
+#### `http2_max_concurrent_streams`
+
+Controls how many simultaneous streams a client can open per connection.
+
+| Value | Memory | Throughput | Use Case |
+|-------|--------|------------|----------|
+| 50-100 | Low | Moderate | APIs with small payloads |
+| 128-256 | Medium | High | General web applications |
+| 500+ | High | Very High | Streaming, real-time apps |
+
+!!! warning
+    Very high values (500+) can lead to resource exhaustion under attack.
+    Use with rate limiting.
+
+#### `http2_initial_window_size`
+
+Flow control window size determines how much data can be sent before waiting for acknowledgment.
+
+| Value | Memory | Latency | Use Case |
+|-------|--------|---------|----------|
+| 65535 (64KB) | Low | Higher for large transfers | Default, memory-constrained |
+| 262144 (256KB) | Medium | Balanced | General use |
+| 1048576 (1MB) | High | Lower for large transfers | Large file transfers, streaming |
+
+!!! note
+    Larger windows improve throughput for large responses but increase memory
+    usage per stream. Calculate: `max_streams × window_size × connections`.
+
+#### `http2_max_frame_size`
+
+Maximum size of individual HTTP/2 frames.
+
+| Value | Memory | Efficiency | Use Case |
+|-------|--------|------------|----------|
+| 16384 (16KB) | Low | More frames for large data | Default, RFC minimum |
+| 32768 (32KB) | Medium | Balanced | General use |
+| 65536 (64KB) | Higher | Fewer frames | Large payloads |
+
+### System Tuning (Linux)
+
+For high concurrency (1000+ clients), tune these kernel parameters:
+
+```bash
+# /etc/sysctl.conf or /etc/sysctl.d/99-gunicorn.conf
+
+# Increase socket backlog for burst connections
+net.core.somaxconn = 65535
+net.ipv4.tcp_max_syn_backlog = 65535
+
+# Increase network queue size
+net.core.netdev_max_backlog = 65535
+
+# Expand ephemeral port range
+net.ipv4.ip_local_port_range = 1024 65535
+
+# Allow reuse of TIME_WAIT sockets
+net.ipv4.tcp_tw_reuse = 1
+
+# Increase max open files system-wide
+fs.file-max = 2097152
+```
+
+Apply with: `sudo sysctl -p`
+
+Also increase file descriptor limits:
+
+```bash
+# /etc/security/limits.conf
+* soft nofile 65535
+* hard nofile 65535
+```
+
+### Docker Tuning
+
+For Docker deployments, add these to your container or compose file:
+
+```yaml
+# docker-compose.yml
+services:
+  gunicorn:
+    ulimits:
+      nofile:
+        soft: 65535
+        hard: 65535
+    sysctls:
+      net.core.somaxconn: 65535
+```
+
+Or in Dockerfile:
+
+```dockerfile
+# Increase file descriptor limit
+RUN ulimit -n 65535
+```
+
+### Benchmark Results
+
+Reference benchmarks using h2load with 4 Gunicorn workers in Docker (Apple M4 Pro):
+
+| Profile | Clients | Streams | Requests/sec | Latency (mean) |
+|---------|---------|---------|--------------|----------------|
+| Conservative | 100 | 10 | 11,700 | 69ms |
+| Conservative | 1000 | 10 | 12,750 | 441ms |
+| High Concurrency | 100 | 10 | 15,000+ | 50ms |
+| High Concurrency | 1000 | 10 | 21,700 | 253ms |
+| High Concurrency | 2000 | 10 | 12,300 | 243ms |
+
+!!! note
+    Actual performance varies based on hardware, network, and application complexity.
+    Always benchmark your specific workload.
+
 ## Testing HTTP/2

 ### Using curl