Benoit Chesneau b650332c70
Arbiter signal handling improvements (#3441)
* tests: Add tests for current signal handling behavior

Add tests for arbiter signal handling:
- TestSignalHandlerRegistration (4 tests): Verify signal handler
  registration, pipe creation, SIGCHLD separate handler, and
  expected signals list
- TestSignalQueue (4 tests): Test signal queueing, max queue size,
  wakeup writes to pipe, and sleep returns on pipe data
- TestReapWorkers (6 tests): Test worker reaping for normal exit,
  error exit codes, WORKER_BOOT_ERROR, APP_LOAD_ERROR, signal
  termination, and SIGKILL OOM hint

These tests establish baseline coverage before refactoring the
signal handling code for safety and reliability improvements.

* tests: Add tests for SIGHUP reload and worker lifecycle

Add tests for reload and worker management:
- TestSighupReload (3 tests): Verify reload spawns configured number
  of workers, calls manage_workers, and logs hang up message
- TestWorkerLifecycle (4 tests): Test spawn_worker adds to WORKERS
  dict, kill_worker sends correct signal, murder_workers sends
  SIGABRT first then SIGKILL on subsequent timeout

* arbiter: Fix waitpid status parsing using POSIX macros

Use os.WIFEXITED/WEXITSTATUS and os.WIFSIGNALED/WTERMSIG instead
of manual bit shifting for waitpid status interpretation. This
correctly distinguishes between normal exits and signal termination.

The previous code used 'status >> 8' which only worked for normal
exits, and used raw status values for signal detection which was
incorrect.

Fixes part of #3435 and #3056 (signal name display issues)

* arbiter: Change SIGTERM log level to warning

Log signal termination at warning level for expected signals
(SIGTERM, SIGQUIT) since these typically occur during normal
graceful shutdown. SIGKILL remains at error level with the
OOM hint since it indicates abnormal termination.

Fixes #3311, #3050 (SIGTERM logged as error)

* arbiter: Remove logging from SIGCHLD signal handler

Move reap_workers() call from signal handler context to main loop.
The signal handler (now signal_chld) only queues the signal and
wakes up the main loop. The actual reap_workers() is called from
handle_chld() in the main loop where logging is safe.

This fixes potential deadlocks caused by logging from signal
handler context when holding the logging lock.

Fixes #3198, #3004 (logging in signal handlers unsafe, deadlock)

* arbiter: Replace PIPE+select with queue.SimpleQueue

Use queue.SimpleQueue for signal handling instead of PIPE+select.
SimpleQueue is reentrant-safe and can be used from signal handlers.

Changes:
- Remove PIPE-based wakeup mechanism
- Add SIG_QUEUE as SimpleQueue instance
- Add WAKEUP_REQUEST sentinel for non-signal wakeups
- Replace sleep() with wait_for_signals() using queue.get()
- Simplify signal handler to just put_nowait()
- Update main loop to iterate over wait_for_signals()
- Add reap_workers() call in stop() to properly clean up workers
  since SIGCHLD is no longer processed during shutdown

This simplifies the code and removes the dependency on select().

Also adds integration tests for signal handling that verify:
- Basic request/response
- Graceful shutdown with SIGTERM/SIGINT
- SIGHUP reload
- Multiple concurrent requests

* arbiter: Wait for old workers on SIGHUP reload

After spawning new workers during reload, wait for old workers to
terminate before returning from reload(). This prevents the issue
where old workers could receive double SIGTERM - once from
manage_workers() and again from the arbiter loop.

The reload now tracks worker_age before spawning, then waits up to
graceful_timeout for workers older than that age to exit.

Fixes #3312, #3274 (SIGHUP can send double SIGTERM)

* arbiter: Log SIGCHLD at debug level

SIGCHLD is received frequently (whenever a worker exits) and doesn't
need to be logged at info level. Log it at debug level to reduce
noise in the logs while still making it available for debugging.

* tests: Fix lint warnings in test_arbiter.py
2026-01-22 11:56:23 +01:00
..
2020-05-01 01:11:21 +02:00
2024-04-22 03:33:14 +02:00
2024-04-22 03:33:30 +02:00
2020-02-02 22:57:14 +01:00
2024-04-22 03:33:14 +02:00
2024-04-22 03:33:14 +02:00
2024-04-22 03:33:14 +02:00
2024-04-22 03:33:30 +02:00