1873 Commits

Author SHA1 Message Date
Benoit Chesneau
634290fc75 fix(dirty): resolve pylint warnings in dirty module 2026-01-25 10:29:52 +01:00
Benoit Chesneau
5e3c07d11d test(dirty): add Docker-based parent death integration tests
Add comprehensive Docker integration tests verifying dirty arbiter
lifecycle under realistic conditions:
- Parent death detection via ppid monitoring
- Orphan cleanup on restart
- Dirty arbiter respawning after crash
- Graceful shutdown with SIGTERM

Also fix race condition in manage_workers() by checking self.alive
before spawning new workers during shutdown.
2026-01-25 10:23:25 +01:00
Benoit Chesneau
79f85af55e fix(dirty): detect parent death and self-terminate
Add ppid monitoring to dirty arbiter's worker monitor loop. If the
main arbiter dies unexpectedly (SIGKILL, crash, OOM), the dirty
arbiter detects the parent change and shuts itself down gracefully.

This complements the existing orphan cleanup on startup.
2026-01-25 10:23:25 +01:00
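The ppid check described in 79f85af55e can be illustrated with a minimal sketch. The function and variable names here are hypothetical and are not taken from the dirty arbiter code; the only assumption is that the parent PID is recorded once at startup and compared on each pass of a monitor loop.

```python
import os


def parent_has_changed(original_ppid: int) -> bool:
    """Return True if this process has been re-parented since startup."""
    # When the original parent dies, the kernel re-parents the child
    # (typically to init or a subreaper), so getppid() starts returning
    # a different value.
    return os.getppid() != original_ppid


# Recorded once at startup; a monitor loop would call
# parent_has_changed(startup_ppid) periodically and begin shutdown on True.
startup_ppid = os.getppid()
```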
Benoit Chesneau
e21d23bfa6 fix(dirty): add orphan cleanup via well-known PID file
When the main arbiter crashes and restarts, orphaned dirty arbiters
may continue running. This adds detection and cleanup:

- Add well-known PID file location based on proc_name
- Dirty arbiter writes PID on startup, removes on exit
- Main arbiter checks for orphans on fresh start (not USR2)
- Uses self.proc_name for USR2 compatibility (myapp vs myapp.2)

During USR2 upgrade, old and new dirty arbiters coexist with
separate PID files, preventing the old from removing the new's file.
2026-01-25 10:23:25 +01:00
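A rough sketch of the PID-file idea from e21d23bfa6 follows. The file naming scheme, the directory, and both helper names are assumptions made for illustration; the commit message does not state the actual location. The liveness probe uses signal 0, which checks for existence without delivering a signal.

```python
import os
import tempfile


def pid_file_path(proc_name, directory=None):
    """Build a per-proc_name PID file path (hypothetical naming scheme)."""
    directory = directory or tempfile.gettempdir()
    return os.path.join(directory, f"{proc_name}-dirty.pid")


def read_live_pid(path):
    """Return the PID stored in path if that process still exists, else None."""
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return None  # no file, unreadable file, or garbage content
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return None  # stale PID file
    except PermissionError:
        return pid  # process exists but belongs to another user
    return pid
```

Because the path is derived from the process name, "myapp" and "myapp.2" map to different files, which is the property the commit relies on during a USR2 upgrade.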
Benoit Chesneau
f6418d4eb0 feat(dirty): add streaming support and async client benchmarks
Add support for streaming responses when dirty app actions return
generators (sync or async). This enables real-time delivery of
incremental results for use cases like LLM token generation.

Features:
- Streaming protocol with chunk/end/error message types
- Worker support for sync and async generators
- Arbiter forwarding of streaming messages
- Deadline-based timeout handling
- Async client streaming API

Protocol:
- Chunk messages (type: "chunk") contain partial data
- End messages (type: "end") signal stream completion
- Error messages can occur mid-stream

New files:
- benchmarks/dirty_streaming.py: Streaming benchmark suite
- tests/dirty/test_*_streaming*.py: Streaming test coverage
- docs/content/dirty.md: Streaming documentation with examples
2026-01-25 10:23:25 +01:00
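The chunk/end/error framing listed in f6418d4eb0 could look roughly like the sketch below for a synchronous generator. Only the "type" values "chunk" and "end" come from the commit message; the "error" type string, the "data" and "error" field names, and the function name are assumptions.

```python
from typing import Any, Dict, Iterable, Iterator


def stream_messages(results: Iterable[Any]) -> Iterator[Dict[str, Any]]:
    """Wrap each generated item in a chunk message, then signal completion."""
    try:
        for item in results:
            yield {"type": "chunk", "data": item}
    except Exception as exc:
        # An error can arrive mid-stream, after some chunks were already sent.
        yield {"type": "error", "error": str(exc)}
        return
    yield {"type": "end"}
```

A consumer reads messages until it sees either an end or an error message; in this sketch an error message is terminal and is never followed by an end message.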
Benoit Chesneau
ce2e06ceba refactor(dirty): replace per-worker locks with queues
Replace lock-based request serialization with queue-based approach:
- Each worker now has a dedicated asyncio.Queue and consumer task
- route_request() submits (request, future) to queue and awaits future
- Consumer task processes requests sequentially per worker
- No lock contention - pure async queue operations

Benefits:
- Clearer separation of concerns
- Better visibility into request backlog (queue.qsize())
- Eliminates lock contention under high concurrency

Changes:
- worker_locks dict replaced with worker_queues and worker_consumers
- Added _start_worker_consumer() to create queue and consumer per worker
- Added _execute_on_worker() for actual worker communication
- Updated _cleanup_worker() to cancel consumer tasks
- Updated stop() to cancel all consumers before shutdown

Benchmark results (4 workers, isolated):
- throughput_10ms: 333 req/s, 0 failures
- overload_10ms (200 clients): 334 req/s, 0 failures
- All tests pass with perfect round-robin distribution
2026-01-25 10:23:25 +01:00
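The queue-per-worker pattern from ce2e06ceba can be sketched as follows. This is not the arbiter's code: the class name, the handler callable standing in for worker communication, and the method names are hypothetical. It only shows the (request, future) hand-off and the single consumer task that serializes requests.

```python
import asyncio


class WorkerQueue:
    """One queue and one consumer task for a single (hypothetical) worker.

    Must be created while an event loop is running.
    """

    def __init__(self, handler):
        self.handler = handler  # async callable standing in for worker I/O
        self.queue = asyncio.Queue()
        self.consumer = asyncio.create_task(self._consume())

    async def _consume(self):
        # Requests are processed strictly one at a time, in arrival order.
        while True:
            request, future = await self.queue.get()
            try:
                result = await self.handler(request)
            except Exception as exc:
                if not future.done():
                    future.set_exception(exc)
            else:
                if not future.done():
                    future.set_result(result)

    async def submit(self, request):
        """Enqueue a request and wait for the consumer to resolve its future."""
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((request, future))
        return await future

    async def stop(self):
        """Cancel the consumer task and wait for it to finish."""
        self.consumer.cancel()
        try:
            await self.consumer
        except asyncio.CancelledError:
            pass
```

The backlog for a worker is visible as `queue.qsize()`, which is the visibility benefit the commit message mentions.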
Benoit Chesneau
56cc094b68 feat(dirty): add benchmark suite and fix arbiter concurrency
Add comprehensive benchmark suite for stress testing the dirty pool:
- dirty_bench_app.py: Configurable benchmark app with sleep/cpu/mixed/payload tasks
- dirty_benchmark.py: Main runner with isolated and integrated test modes
- dirty_bench_wsgi.py: WSGI app for HTTP integration testing
- dirty_bench_gunicorn.py: Gunicorn config for integration benchmarks

Fix arbiter concurrency issues:
- Add per-worker locks to serialize requests and prevent read conflicts
- Implement round-robin worker selection for linear throughput scaling

The benchmark suite supports:
- Quick smoke tests (--quick)
- Full isolated benchmarks (--isolated)
- Configuration sweeps (--config-sweep)
- Payload size tests (--payload-tests)
- Integration tests with wrk (--integrated)
2026-01-25 10:23:25 +01:00
Benoit Chesneau
3d9382b07c refactor: address PR review comments
1. Split respawning logic from reap_dirty_arbiter() into manage_dirty_arbiter()
   to avoid respawning during shutdown/re-exec (follows reap_workers pattern)

2. Reduce public API surface in __all__:
   - Keep errors, DirtyApp, client functions as public
   - Internal protocol helpers remain importable from submodules
   - DirtyArbiter and set_dirty_socket_path kept for gunicorn core
2026-01-25 10:21:18 +01:00
Benoit Chesneau
06aba09251 feat(dirty): add thread pool with execution timeout control
- Use dirty_threads config for thread pool size (default: 1)
- Enforce dirty_timeout at worker level via asyncio.wait_for
- Heartbeat runs independently, not blocked by task execution
- Document thread safety and state persistence in docstrings
2026-01-25 10:21:18 +01:00
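A minimal sketch of combining a thread pool with asyncio.wait_for, as 06aba09251 describes. The helper name is hypothetical. One caveat worth stating: the timeout only stops the caller from waiting; the thread itself keeps running until the blocking function returns.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


async def run_with_timeout(executor, func, timeout, *args):
    """Run a blocking function in the pool, giving up after timeout seconds."""
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(executor, func, *args)
    # Raises asyncio.TimeoutError if the function has not returned in time.
    # The event loop stays free meanwhile, so heartbeats can keep running.
    return await asyncio.wait_for(future, timeout=timeout)
```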
Benoit Chesneau
21f769ce16 fix: resolve lint issues in dirty arbiter modules 2026-01-25 10:21:18 +01:00
Benoit Chesneau
77222b8017 feat: add dirty arbiters for long-running blocking operations
Introduce Dirty Arbiters - a separate process pool for executing
long-running, blocking operations (AI model loading, heavy computation)
without blocking HTTP workers. Inspired by Erlang's dirty schedulers.

Key features:
- Completely separate from HTTP workers - can be killed/restarted independently
- Stateful - loaded resources persist in dirty worker memory
- Message-passing IPC via Unix sockets with JSON serialization
- Explicit execute() API from HTTP workers
- Asyncio-based for clean concurrent handling

Architecture:
- DirtyArbiter: manages the dirty worker pool, routes requests
- DirtyWorker: executes functions, maintains state, handles requests
- DirtyClient: sync/async API for HTTP workers to call dirty apps
- DirtyProtocol: length-prefixed JSON messages over Unix sockets
- DirtyApp: base class for dirty applications

Configuration options:
- dirty_apps: list of import paths for dirty applications
- dirty_workers: number of dirty workers (default: 0)
- dirty_timeout: task timeout in seconds (default: 300)
- dirty_graceful_timeout: shutdown timeout (default: 30)

Lifecycle hooks:
- on_dirty_starting(arbiter)
- dirty_post_fork(arbiter, worker)
- dirty_worker_init(worker)
- dirty_worker_exit(arbiter, worker)

Includes comprehensive test suite with 164 tests covering:
- Protocol encoding/decoding
- Worker and arbiter lifecycle
- Client sync/async APIs
- Signal handling
- Error handling and timeouts
- Integration tests
2026-01-25 10:21:18 +01:00
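Length-prefixed JSON framing, as named under DirtyProtocol in 77222b8017, can be sketched like this. The 4-byte big-endian prefix and the function names are assumptions; the commit message does not specify the header layout.

```python
import json
import struct

HEADER = struct.Struct(">I")  # assumed layout: 4-byte big-endian payload length


def encode_message(obj) -> bytes:
    """Serialize obj as JSON and prepend its byte length."""
    payload = json.dumps(obj).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer: bytes):
    """Return (messages, remaining_bytes) for a possibly partial buffer."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        payload = buffer[offset + HEADER.size:end]
        messages.append(json.loads(payload.decode("utf-8")))
        offset = end
    return messages, buffer[offset:]
```

The length prefix lets a reader on a stream socket know where one JSON document ends and the next begins without scanning for delimiters.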
Benoit Chesneau
be6f3b97ab Disable setproctitle on macOS to prevent segfaults
setproctitle causes segfaults on macOS due to fork() safety issues
introduced in newer macOS versions. The mere import of setproctitle
can trigger crashes in forked worker processes.

Fixes #3021
2026-01-25 09:57:20 +01:00
Paul J. Dorn
481dbf2e9b Publish full exception when the application fails to load (#3462)
* Python3: refactor returned traceback

Exceptions have provided a __traceback__ reference since Python 3.0
(and creating cyclic references has not been a big deal since Python 2.2)

* --reload: publish entire exception, not just traceback

This is dangerous insofar as the exception text is more
likely to contain secrets than the quoted lines from traceback are.

However, the difference between the two is minor compared to the
primary danger of enabling this on a production machine, so focus
on that instead!
2026-01-25 09:41:39 +01:00
Paul J. Dorn
6d61afab3e forgotten import 2026-01-25 04:11:04 +01:00
Paul J. Dorn
ba336daabe duplicate 100-continue patch for asgi 2026-01-25 03:47:49 +01:00
Paul J. Dorn
88d503ba1c HTTP/1.0 - ignore Expect: 100-continue
* ignore on HTTP/1.0 (would possibly confuse a client or proxy)
* refuse requests with unknown expectations

https://datatracker.ietf.org/doc/html/rfc9110#section-10.1.1
2026-01-24 21:59:02 +01:00
Benoit Chesneau
375e79e95b release: bump version to 24.1.1 2026-01-24 02:13:42 +01:00
Benoit Chesneau
e9a3f30a0f fix: keep forwarded_allow_ips as strings for backward compatibility (#3459)
The CIDR network support added in 24.1.0 changed forwarded_allow_ips
and proxy_allow_ips from string lists to ipaddress.ip_network objects.
This broke external tools like uvicorn that expect strings.

This fix validates IP/CIDR format during config parsing but keeps the
string representation. Network objects are cached in Config methods
(forwarded_allow_networks() and proxy_allow_networks()) for efficient
IP checking without repeated conversions.

Also uses strict mode for ip_network validation to detect mistakes like
192.168.1.1/24 where host bits are set (should be 192.168.1.0/24).

Fixes #3458
2026-01-23 23:51:25 +01:00
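The "validate but keep strings" approach from e9a3f30a0f can be sketched as below. The function name is hypothetical, and the real validator may accept additional values that this sketch does not handle. It relies only on ipaddress.ip_network with strict=True, which rejects networks that have host bits set.

```python
import ipaddress


def validate_allow_list(entries):
    """Check each entry is a valid IP or CIDR, but return the strings unchanged."""
    for entry in entries:
        # strict=True raises ValueError for input such as 192.168.1.1/24,
        # where host bits are set (192.168.1.0/24 was probably intended).
        ipaddress.ip_network(entry, strict=True)
    return list(entries)
```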
Benoit Chesneau
3179789f46 fix: handle SIGCLD alias for SIGCHLD on Linux
On Linux, SIGCLD and SIGCHLD are aliases for the same signal number (17).
The SIG_NAMES dict iteration order can map to either name, causing
"Unhandled signal: cld" errors when workers fail during boot.

Fixes #3453
2026-01-23 21:25:07 +01:00
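One way to avoid the alias problem described in 3179789f46 is to skip the SIGCLD name while building the number-to-name table. This is a sketch of that idea with a hypothetical function name, not necessarily how the fix was implemented.

```python
import signal


def build_signal_names():
    """Map signal numbers to a lowercase name, never using the 'cld' alias."""
    names = {}
    for attr in dir(signal):
        # Keep SIGTERM, SIGCHLD, ...; skip SIG_DFL, SIG_IGN and similar.
        if not attr.startswith("SIG") or attr.startswith("SIG_"):
            continue
        value = getattr(signal, attr)
        if not isinstance(value, signal.Signals):
            continue
        name = attr[3:].lower()
        if name == "cld":
            continue  # SIGCLD shares its number with SIGCHLD on Linux
        names[int(value)] = name
    return names
```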
Benoit Chesneau
7894d1c170 release: prepare 24.1.0
- Bump version to 24.1.0
- Add PROXY protocol v2 documentation to deploy guide
- Add 24.1.0 changelog with new features and bug fixes
- Update all docs.gunicorn.org URLs to gunicorn.org
2026-01-23 18:47:17 +01:00
Benoit Chesneau
f3190f84cc feat: add PROXY protocol v2 support with version selection (#3451)
Extend --proxy-protocol to accept version values (off, v1, v2, auto) instead
of being boolean-only. This allows explicit control over which PROXY protocol
versions are accepted.

Changes:
- Add InvalidProxyHeader exception for v2 binary header errors
- Add validate_proxy_protocol() validator with backwards compatibility
- Update ProxyProtocol setting with nargs="?" and const="auto"
- Add PROXY v2 constants (PP_V2_SIGNATURE, PPCommand, PPFamily, PPProtocol)
- Add _parse_proxy_protocol_v1() and _parse_proxy_protocol_v2() methods
- Update both sync (message.py) and async (asgi/message.py) parsers
- Add hex escape handling in treq.py for v2 binary test data
- Add test cases for v2 TCPv4 and TCPv6

Backwards compatible: --proxy-protocol alone (or True) maps to "auto".

Closes #2912
2026-01-23 18:40:44 +01:00
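For orientation, the PROXY protocol v2 binary header for a TCP over IPv4 connection can be parsed roughly as follows. This is an illustrative sketch based on the published PROXY protocol layout, not the `_parse_proxy_protocol_v2()` method from the commit; the function name is hypothetical and only the PROXY command with the TCPv4 family is handled.

```python
import socket
import struct

# 12-byte signature that opens every v2 header.
PP_V2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_proxy_v2_tcp4(data: bytes):
    """Return ((src_ip, src_port), (dst_ip, dst_port), remaining_bytes)."""
    if len(data) < 16 or data[:12] != PP_V2_SIGNATURE:
        raise ValueError("not a PROXY v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported PROXY protocol version")
    if ver_cmd & 0x0F != 0x1 or fam_proto != 0x11:
        raise ValueError("sketch handles only the PROXY command over TCP/IPv4")
    if length < 12 or len(data) < 16 + length:
        raise ValueError("truncated address block")
    # TCPv4 address block: source address, destination address, then ports.
    src, dst, src_port, dst_port = struct.unpack("!4s4sHH", data[16:28])
    rest = data[16 + length:]  # skips any trailing TLVs in the header
    return (socket.inet_ntoa(src), src_port), (socket.inet_ntoa(dst), dst_port), rest
```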
Benoit Chesneau
f95ac41b8f fix: use smaller buffer in finish_body for faster timeout
Reduce buffer size from 8192 to 1024 bytes when discarding unread
body data, allowing timeouts to trigger more quickly on slow or
stalled connections.
2026-01-23 14:46:40 +01:00
Benoit Chesneau
66963367f3 fix: set socket to blocking mode on keepalive connections
On keepalive connections, finish_request() sets the socket to non-blocking
for selector registration. When the connection is reused, handle() calls
conn.init() which returns early (already initialized) without restoring
blocking mode. This caused SSLWantReadError when WSGI apps read the
request body on SSL connections.

Fix by explicitly setting blocking mode at the start of handle().

Fixes #3448
2026-01-23 14:40:40 +01:00
Benoit Chesneau
f22cd6558e feat: add socket backlog metric (Linux only)
Add --enable-backlog-metric option to emit a gunicorn.backlog histogram
metric showing connections waiting in the socket backlog. This helps
identify worker saturation and concurrency issues.

Also distinguishes between timer (|ms) and histogram (|h) statsd metric
types per the statsd spec.

Note: Only works on Linux using TCP_INFO from getsockopt.

Closes #2407
Partially fixes #2057
2026-01-23 11:39:05 +01:00
Benoit Chesneau
e52ac46e29 feat: support CIDR networks in forwarded_allow_ips and proxy_allow_ips
Use Python's ipaddress module to support IP networks in allow lists.
Individual IP addresses are converted to /32 (IPv4) or /128 (IPv6)
networks. CIDR notation (e.g., 192.168.0.0/16) is now supported.

Fixes #1485
Closes #2390
2026-01-23 11:39:05 +01:00
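The allow-list check that e52ac46e29 enables can be illustrated with the standard ipaddress module. The function name is hypothetical; the sketch shows that a bare address becomes a single-host network, so individual IPs and CIDR ranges go through the same membership test.

```python
import ipaddress


def is_allowed(client_ip, allow_list):
    """Return True if client_ip falls inside any listed IP or CIDR network."""
    address = ipaddress.ip_address(client_ip)
    # A bare address such as "10.0.0.1" becomes the network 10.0.0.1/32
    # (or /128 for IPv6), so one membership test covers both forms.
    networks = [ipaddress.ip_network(entry) for entry in allow_list]
    return any(address in network for network in networks)
```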
Benoit Chesneau
b0d38928c8 feat: InotifyReloader now watches newly loaded modules
Refactor reloader to share code via ReloaderBase class. InotifyReloader
now calls refresh_dirs() on each event loop timeout (~1 sec) to watch
directories for dynamically loaded modules (e.g., Django dynamic imports).

Fixes #1790
Closes #1791
2026-01-23 11:39:05 +01:00
Benoit Chesneau
bbc9bba95e fix: log SIGTERM as info level, not warning
SIGTERM is expected during graceful shutdown and reload operations.
Logging it as warning level causes unnecessary noise in error logs.
SIGKILL remains at error level (suggests OOM), other signals at warning.

Closes #3094
2026-01-23 11:39:05 +01:00
Benoit Chesneau
19d07bd4af fix: print exception to stderr on worker boot failure
When a worker fails to boot, the exception is now printed to stderr
(in addition to being logged), consistent with AppImportError handling.
This makes boot failures more visible to users.

Closes #2933
2026-01-23 11:39:05 +01:00
Benoit Chesneau
56abeaf105 fix: unreader.unread() now prepends data to buffer
The unread method was incorrectly appending data to the end of the
buffer instead of prepending it to the beginning. This caused issues
when reading partial data and then unreading it.

Closes #2915
Closes #2346
2026-01-23 11:39:05 +01:00
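The difference between appending and prepending in unread() is easy to see in a toy buffer. This class is a simplified stand-in written for illustration, not gunicorn's Unreader.

```python
class ToyUnreader:
    """Minimal buffer with an unread() that puts data back at the front."""

    def __init__(self, initial=b""):
        self.buf = bytearray(initial)

    def unread(self, data):
        # Prepend: data that is pushed back must be the next data read.
        self.buf[:0] = data

    def read(self):
        data = bytes(self.buf)
        self.buf.clear()
        return data
```

With b"world" still buffered, pushing back b"hello " must yield b"hello world" on the next read; appending would give b"worldhello " and corrupt the stream order.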
Benoit Chesneau
8e75b3aba3 fix: prevent RecursionError when pickling Config
On Python 3.8+ with macOS, the multiprocessing module uses 'spawn' by
default which pickles objects. When pickle.load tries to read
__setstate__ before __dict__ is restored, it hits __getattr__ causing
infinite recursion. Adding a special case for 'settings' prevents this.

Closes #2401
2026-01-23 11:39:05 +01:00
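The recursion described in 8e75b3aba3 can be reproduced with a small class whose __getattr__ consults self.settings. The class below is a simplified stand-in, not gunicorn's Config. Without the guard, looking up any missing attribute on an instance whose __dict__ is still empty (as happens while unpickling) calls __getattr__("settings") from inside __getattr__ and never terminates.

```python
class ToyConfig:
    """Simplified stand-in for a config object backed by a settings dict."""

    def __init__(self):
        self.settings = {"workers": 2}

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails. During unpickling
        # the instance exists before __dict__ is restored, so "settings"
        # itself is missing and must not be looked up through this method.
        if name == "settings":
            raise AttributeError(name)
        if name not in self.settings:
            raise AttributeError(name)
        return self.settings[name]
```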
Benoit Chesneau
a182066bea fix: use proper exception chaining with 'raise from' in glogging.py
Use 'raise X from e' syntax instead of just 'raise X' when wrapping
exceptions. This provides more accurate exception chaining messages
("The above exception was the direct cause of" vs "During handling of").

Closes #2360
2026-01-23 11:39:05 +01:00
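A short illustration of the chaining syntax a182066bea refers to. The exception class and function are hypothetical examples, unrelated to glogging.py.

```python
class ConfigError(Exception):
    """Hypothetical wrapper exception used for this example."""


def parse_port(value):
    try:
        return int(value)
    except ValueError as exc:
        # "from exc" sets __cause__, so the traceback reads "The above
        # exception was the direct cause of the following exception".
        raise ConfigError(f"invalid port: {value!r}") from exc
```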
Benoit Chesneau
33e5337395 docs: fix post_request hook signature description
The description incorrectly stated the callable accepts two parameters
(Worker and Request), but the signature shows four parameters including
environ and resp.

Closes #2592
2026-01-23 11:39:05 +01:00
Benoit Chesneau
7c22955837 Merge pull request #3450 from benoitc/fix/ssl-want-read-error-3448
fix: handle SSLWantReadError in finish_body() (#3448)
2026-01-23 11:17:46 +01:00
Benoit Chesneau
4ef635446b docs: add dogstatsd_tags example to description
Clarify the expected format with a concrete example.

Closes #3288
2026-01-23 10:37:30 +01:00
Benoit Chesneau
46e7726838 fix: make syslog_addr default platform-neutral in docs
The syslog_addr setting has different defaults depending on the
platform (macOS, FreeBSD, OpenBSD, Linux). Added default_doc to
show all platform-specific defaults in the documentation, ensuring
consistent output regardless of which platform generates the docs.

Also kept the diagnostic git diff in CI for future debugging.
2026-01-23 10:08:01 +01:00
Benoit Chesneau
47b9a18619 fix: handle SSLWantReadError in finish_body() (#3448)
The finish_body() function can raise ssl.SSLWantReadError when
discarding unread request body data on SSL connections. This causes
TLS requests to fail intermittently with "Invalid request" errors.

Handle SSLWantReadError by treating it as "no more data to read".
This is safe because finish_body() only discards leftover data before
keepalive - if SSL says "need to wait for more data", there's nothing
left to discard.

Fixes #3448
2026-01-23 09:38:41 +01:00
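The handling described in 47b9a18619 amounts to treating SSLWantReadError as end of data while discarding a body. The sketch below uses a hypothetical function that takes any read callable; it is not the actual finish_body() implementation, and the 1024-byte default mirrors the buffer size mentioned in f95ac41b8f.

```python
import ssl


def discard_body(read, chunk_size=1024):
    """Read and drop leftover body data; return the number of bytes dropped."""
    discarded = 0
    while True:
        try:
            data = read(chunk_size)
        except ssl.SSLWantReadError:
            # No decrypted data is available right now, so there is
            # nothing left to discard before reusing the connection.
            break
        if not data:
            break
        discarded += len(data)
    return discarded
```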
Benoit Chesneau
58d803977d bump version to 24.0.0, remove sphinx docs 2026-01-23 01:12:46 +01:00
Benoit Chesneau
f9df39f600 gevent: Require gevent 24.10.1+ to address CVE-2024-3219 2026-01-23 00:59:51 +01:00
Benoit Chesneau
9aaa75c0bf fix: Add noqa comments for E402 in geventlet worker 2026-01-23 00:36:05 +01:00
Benoit Chesneau
4062a82ba7 eventlet: Require eventlet 0.40.3+ for security fixes
Upgrade minimum eventlet version to 0.40.3 to address security
vulnerabilities:

- CVE-2021-21419 (Moderate 6.9): Websocket memory exhaustion via
  large/compressed frames (fixed in 0.31.0)
- CVE-2025-58068 (Moderate 6.3): HTTP Request Smuggling via improper
  trailer handling (fixed in 0.40.3)

Also restructure module to call monkey_patch() at import time for
better patching coverage, while keeping hubs.use_hub() in the worker's
patch() method since it creates OS resources that don't survive fork.

Add comprehensive tests for the eventlet worker.
2026-01-23 00:25:50 +01:00
Benoit Chesneau
543854c123 gevent: Require gevent 23.9.0+ for security fixes
Address CVE-2023-41419 (Critical - remote privilege escalation via
WSGIServer) by requiring gevent 23.9.0 or higher.

Changes:
- Update minimum gevent version from 1.4.0 to 23.9.0
- Remove legacy server.kill() code path (gevent < 1.0)
- Update documentation to reflect new version requirement
- Add comprehensive tests for gevent worker
2026-01-23 00:14:11 +01:00
Benoit Chesneau
4b9d787c93 tornado: Require Tornado 6.5.0+ for security fixes
Update minimum Tornado version to 6.5.0 to address:
- CVE-2024-52804 (Medium): HTTP Cookie Parsing DoS
- CVE-2025-47287 (High 7.5): Multipart/Form-Data Parser DoS

This simplifies the tornado worker by removing legacy code paths
for Tornado < 5.0 and < 6.0, reducing the codebase by ~30%.

Changes:
- pyproject.toml: Update tornado requirement to >=6.5.0
- gtornado.py: Remove TORNADO5 constant and legacy code paths
- tornadoapp.py: Update example to use async/await syntax
- test_gtornado.py: Add comprehensive test suite
2026-01-23 00:02:01 +01:00
Benoit Chesneau
1521266e2f asgi/uwsgi: Address PR review feedback
- asgi: Check HTTP method is GET for WebSocket upgrade per RFC 6455
  Section 4.1. Previously HEAD and other methods with upgrade headers
  could trigger WebSocket handling.

- uwsgi: Add detailed docstring explaining header mapping from CGI-style
  environment variables to HTTP headers, including the lossy nature of
  underscore-to-hyphen conversion.
2026-01-22 19:28:11 +01:00
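The method check added in 1521266e2f can be pictured with a small predicate. The function name and the lowercase-keyed headers dict are assumptions for illustration; a full RFC 6455 handshake also validates the Sec-WebSocket-Key and version headers, which this sketch leaves out.

```python
def is_websocket_upgrade(method, headers):
    """Return True only for GET requests that ask to upgrade to WebSocket."""
    if method != "GET":
        return False  # HEAD, POST, ... must not trigger WebSocket handling
    connection = headers.get("connection", "")
    tokens = [token.strip().lower() for token in connection.split(",")]
    return "upgrade" in tokens and headers.get("upgrade", "").lower() == "websocket"
```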
Benoit Chesneau
ac7296ec49 uwsgi: Add native uWSGI binary protocol support
Add support for the uWSGI binary protocol, enabling gunicorn to work
with nginx's uwsgi_pass directive.

New module gunicorn/uwsgi/ with:
- UWSGIRequest: Parses 4-byte binary header and key-value vars block
- UWSGIParser: Protocol parser following existing Parser pattern
- Error classes: InvalidUWSGIHeader, UnsupportedModifier, ForbiddenUWSGIRequest

New configuration options:
- --protocol: Select 'http' (default) or 'uwsgi' protocol
- --uwsgi-allow-from: IP allowlist for uWSGI requests (default: localhost)

Worker integration via get_parser() factory in gunicorn/http/__init__.py,
updates to sync, gthread, and base_async workers.

Example nginx config:
    upstream gunicorn {
        server 127.0.0.1:8000;
    }
    location / {
        uwsgi_pass gunicorn;
        include uwsgi_params;
    }
2026-01-22 18:32:17 +01:00
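To make the "4-byte binary header and key-value vars block" from ac7296ec49 concrete, here is a sketch of a parser for that packet layout: one byte modifier1, a 16-bit little-endian block size, one byte modifier2, then pairs of size-prefixed keys and values. The function name is hypothetical and this is not the UWSGIRequest implementation.

```python
import struct


def parse_uwsgi_packet(data: bytes):
    """Return (modifier1, modifier2, variables) from a uwsgi-protocol packet."""
    if len(data) < 4:
        raise ValueError("incomplete header")
    modifier1, datasize, modifier2 = struct.unpack_from("<BHB", data)
    block = data[4:4 + datasize]
    if len(block) != datasize:
        raise ValueError("incomplete vars block")

    def read_item(pos):
        # Each key and each value is a 16-bit little-endian size plus bytes.
        if pos + 2 > len(block):
            raise ValueError("truncated item size")
        (size,) = struct.unpack_from("<H", block, pos)
        start, end = pos + 2, pos + 2 + size
        if end > len(block):
            raise ValueError("truncated item")
        return block[start:end], end

    variables = {}
    pos = 0
    while pos < len(block):
        key, pos = read_item(pos)
        value, pos = read_item(pos)
        variables[key.decode("latin-1")] = value.decode("latin-1")
    return modifier1, modifier2, variables
```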
Benoit Chesneau
11c6a97c47 asgi: Fix pylint and pycodestyle warnings
- Remove unused imports (ssl, os, base64, hashlib, traceback)
- Remove unused variables (body_parts, has_content_length, etc.)
- Fix no-else-break patterns in protocol.py and websocket.py
- Replace __anext__() with anext() builtin
- Remove unnecessary pass statements
- Add proper access logging to ASGI protocol handler
- Add ASGIResponseInfo class and _build_environ method for logging
- Disable too-many-return-statements for _read_frame method
- Fix raising-bad-type error (use 'is not None' check)
- Fix whitespace before colon in message.py
2026-01-22 18:03:14 +01:00
Benoit Chesneau
ae1eea8108 asgi: Add native ASGI worker with HTTP and WebSocket support
Add a new ASGI worker type that provides native async support using
gunicorn's own HTTP parsing infrastructure adapted for asyncio.

Features:
- HTTP/1.1 with keepalive support
- WebSocket connections (RFC 6455)
- ASGI lifespan protocol for startup/shutdown hooks
- Optional uvloop support for improved performance
- Full proxy protocol support (inherited from gunicorn)

New configuration options:
- --asgi-loop: Event loop selection (auto/asyncio/uvloop)
- --asgi-lifespan: Lifespan protocol control (auto/on/off)
- --root-path: ASGI root path for reverse proxy setups

Usage: gunicorn -k asgi myapp:app
2026-01-22 17:05:29 +01:00
Benoit Chesneau
b650332c70 Arbiter signal handling improvements (#3441)
* tests: Add tests for current signal handling behavior

Add tests for arbiter signal handling:
- TestSignalHandlerRegistration (4 tests): Verify signal handler
  registration, pipe creation, SIGCHLD separate handler, and
  expected signals list
- TestSignalQueue (4 tests): Test signal queueing, max queue size,
  wakeup writes to pipe, and sleep returns on pipe data
- TestReapWorkers (6 tests): Test worker reaping for normal exit,
  error exit codes, WORKER_BOOT_ERROR, APP_LOAD_ERROR, signal
  termination, and SIGKILL OOM hint

These tests establish baseline coverage before refactoring the
signal handling code for safety and reliability improvements.

* tests: Add tests for SIGHUP reload and worker lifecycle

Add tests for reload and worker management:
- TestSighupReload (3 tests): Verify reload spawns configured number
  of workers, calls manage_workers, and logs hang up message
- TestWorkerLifecycle (4 tests): Test spawn_worker adds to WORKERS
  dict, kill_worker sends correct signal, murder_workers sends
  SIGABRT first then SIGKILL on subsequent timeout

* arbiter: Fix waitpid status parsing using POSIX macros

Use os.WIFEXITED/WEXITSTATUS and os.WIFSIGNALED/WTERMSIG instead
of manual bit shifting for waitpid status interpretation. This
correctly distinguishes between normal exits and signal termination.

The previous code used 'status >> 8' which only worked for normal
exits, and used raw status values for signal detection which was
incorrect.

Fixes part of #3435 and #3056 (signal name display issues)
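The status decoding described above can be sketched with the os-level equivalents of the POSIX macros. The function name and the returned tuple shape are illustrative choices, not the arbiter's code.

```python
import os


def describe_exit(status):
    """Classify a waitpid() status as a normal exit or a signal termination."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        # For a signal death, "status >> 8" would misleadingly yield 0.
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)
```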

* arbiter: Change SIGTERM log level to warning

Log signal termination at warning level for expected signals
(SIGTERM, SIGQUIT) since these typically occur during normal
graceful shutdown. SIGKILL remains at error level with the
OOM hint since it indicates abnormal termination.

Fixes #3311, #3050 (SIGTERM logged as error)

* arbiter: Remove logging from SIGCHLD signal handler

Move reap_workers() call from signal handler context to main loop.
The signal handler (now signal_chld) only queues the signal and
wakes up the main loop. The actual reap_workers() is called from
handle_chld() in the main loop where logging is safe.

This fixes potential deadlocks caused by logging from signal
handler context when holding the logging lock.

Fixes #3198, #3004 (logging in signal handlers unsafe, deadlock)

* arbiter: Replace PIPE+select with queue.SimpleQueue

Use queue.SimpleQueue for signal handling instead of PIPE+select.
SimpleQueue is reentrant-safe and can be used from signal handlers.

Changes:
- Remove PIPE-based wakeup mechanism
- Add SIG_QUEUE as SimpleQueue instance
- Add WAKEUP_REQUEST sentinel for non-signal wakeups
- Replace sleep() with wait_for_signals() using queue.get()
- Simplify signal handler to just put_nowait()
- Update main loop to iterate over wait_for_signals()
- Add reap_workers() call in stop() to properly clean up workers
  since SIGCHLD is no longer processed during shutdown

This simplifies the code and removes the dependency on select().

Also adds integration tests for signal handling that verify:
- Basic request/response
- Graceful shutdown with SIGTERM/SIGINT
- SIGHUP reload
- Multiple concurrent requests
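A minimal sketch of the SimpleQueue approach described above. SIG_QUEUE and wait_for_signals() are names taken from the commit message, but their bodies here are assumptions, and the WAKEUP_REQUEST sentinel is left out.

```python
import queue
import signal

SIG_QUEUE = queue.SimpleQueue()


def signal_handler(signum, frame):
    # Only enqueue: no logging and no other work in signal context.
    SIG_QUEUE.put_nowait(signum)


def wait_for_signals(timeout=1.0):
    """Yield queued signal numbers, blocking up to timeout for the first one."""
    try:
        yield SIG_QUEUE.get(timeout=timeout)
    except queue.Empty:
        return
    while True:
        try:
            yield SIG_QUEUE.get_nowait()
        except queue.Empty:
            return
```

A main loop would register the handler with signal.signal() and then iterate over wait_for_signals(), doing the real work (reaping, reloading, logging) outside the handler.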

* arbiter: Wait for old workers on SIGHUP reload

After spawning new workers during reload, wait for old workers to
terminate before returning from reload(). This prevents the issue
where old workers could receive double SIGTERM - once from
manage_workers() and again from the arbiter loop.

The reload now tracks worker_age before spawning, then waits up to
graceful_timeout for workers older than that age to exit.

Fixes #3312, #3274 (SIGHUP can send double SIGTERM)

* arbiter: Log SIGCHLD at debug level

SIGCHLD is received frequently (whenever a worker exits) and doesn't
need to be logged at info level. Log it at debug level to reduce
noise in the logs while still making it available for debugging.

* tests: Fix lint warnings in test_arbiter.py
2026-01-22 11:56:23 +01:00
Benoit Chesneau
0186211400 gthread: Lock-free PollableMethodQueue refactoring
Replace RLock-based synchronization with a pipe-based method queue
for lock-free coordination between worker threads and main thread.

Key changes:
- Add PollableMethodQueue class using os.pipe() for wake-up signaling
- Non-blocking pipe (both ends) for BSD compatibility (FreeBSD, OpenBSD)
- Unified event loop using single poller.select() - no more futures.wait()
- Better graceful shutdown with connection draining within grace period
- Rename _keep to keepalived_conns, remove _lock entirely
- Add handle_exit() for SIGTERM, improve handle_quit() for SIGQUIT
- Add set_accept_enabled() for dynamic connection acceptance control
- Add wait_for_and_dispatch_events() with EINTR handling

Performance improvement: ~8% at high concurrency due to reduced
lock contention and non-blocking pipe operations.

Tests: 40 tests covering PollableMethodQueue, graceful shutdown,
keepalive management, error handling, and BSD compatibility.

Fixes #3146
Closes #3157
2026-01-22 09:32:48 +01:00
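The pipe-based wake-up idea behind PollableMethodQueue in 0186211400 can be sketched as follows. This simplified class is an assumption about the general technique, not the gthread implementation: worker threads append callbacks and write one byte, and the main thread sees the read end become readable in its selector and runs the callbacks.

```python
import os
from collections import deque


class ToyPollableQueue:
    """Callbacks queued from any thread, run by whoever polls read_fd."""

    def __init__(self):
        self.read_fd, self.write_fd = os.pipe()
        # Both ends non-blocking, so neither side can stall on the pipe.
        os.set_blocking(self.read_fd, False)
        os.set_blocking(self.write_fd, False)
        self.callbacks = deque()  # append/popleft are atomic in CPython

    def defer(self, callback, *args):
        self.callbacks.append((callback, args))
        try:
            os.write(self.write_fd, b"\0")  # wake the poller
        except BlockingIOError:
            pass  # pipe is full, so a wake-up is already pending

    def run_callbacks(self):
        try:
            os.read(self.read_fd, 4096)  # drain pending wake-up bytes
        except BlockingIOError:
            pass
        while self.callbacks:
            callback, args = self.callbacks.popleft()
            callback(*args)

    def close(self):
        os.close(self.read_fd)
        os.close(self.write_fd)
```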
Benoit Chesneau
b43dc6d398 gthread: Improve reliability and fix edge cases
This commit addresses three issues with the gthread worker:

1. Request body handling on keepalive
   - Add finish_body() method to Parser to discard unread body bytes
   - Call it before returning connections to the poller
   - Prevents socket appearing readable due to leftover body
   Fixes #3301

2. Timeout reliability with monotonic clock
   - Replace time.time() with time.monotonic() in set_timeout()
   - Replace time.time() with time.monotonic() in murder_keepalived()
   - Prevents timeout issues caused by NTP adjustments

3. SSL error handling
   - Move conn.init() from enqueue_req() to handle()
   - SSL handshake now runs in worker thread, not main thread
   - ENOTCONN errors during ssl_wrap_socket are caught per-connection
   - Prevents entire worker crashes on SSL handshake failures

Also adds comprehensive unit tests for the gthread worker.

Closes #3303
Closes #3308
2026-01-22 09:14:19 +01:00
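Point 2 of b43dc6d398 (monotonic timeouts) comes down to computing deadlines from a clock that never jumps. The class below is a hypothetical illustration, not the worker's set_timeout() code.

```python
import time


class Deadline:
    """Timeout tracking based on the monotonic clock."""

    def __init__(self, timeout):
        # time.monotonic() is unaffected by NTP or manual clock changes,
        # unlike time.time(), so the deadline cannot move unexpectedly.
        self.expires = time.monotonic() + timeout

    def expired(self):
        return time.monotonic() >= self.expires
```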
Benoit Chesneau
eb2f81dcf8 Merge pull request #3390 from adk-swisstopo/timeout-units
Specify the units for `graceful_timeout`.
2025-10-05 18:21:01 +02:00