mirror of
https://github.com/frappe/gunicorn.git
synced 2026-07-01 10:11:30 +08:00
refactor: Simpler implementation
Just do multiple queues, nothing else.
This commit is contained in:
parent
ec6af68013
commit
7dc4d184be
@ -57,23 +57,23 @@ the actual failure mode — prediction is effective after the first sample.
|
||||
│ fast │ slow
|
||||
▼ ▼
|
||||
┌───────────┐ ┌───────────┐
|
||||
│ fast_pool │ │ slow_pool │ (bounded, 503 on full)
|
||||
│ fast_pool │ │ slow_pool │
|
||||
│ F threads │ │ S threads │
|
||||
└─────┬─────┘ └─────┬─────┘
|
||||
└───────┬───────┘
|
||||
on completion: predictor.update(route, duration)
|
||||
```
|
||||
|
||||
- **Fast lane**: a `ThreadPoolExecutor` of `F = cfg.threads` threads. Only ever
|
||||
runs fast-classified work.
|
||||
- **Slow lane**: a separate `ThreadPoolExecutor` of `S = cfg.slow_threads`
|
||||
threads (default 1). Only ever runs slow-classified work.
|
||||
- Total OS threads per worker = `F + S`.
|
||||
- The slow lane is bounded: a counter (`nr_slow`) tracks slow requests
|
||||
submitted-but-not-finished; once it reaches `S + cfg.slow_queue_maxsize` (i.e.
|
||||
running plus queued), further slow requests are rejected with `503` instead of
|
||||
growing the executor's unbounded internal queue. The fast lane is governed by
|
||||
the existing `worker_connections` admission like today.
|
||||
- **Fast lane**: a `ThreadPoolExecutor` of `F = ceil(cfg.threads / 2)` threads.
|
||||
Only ever runs fast-classified work.
|
||||
- **Slow lane**: a separate `ThreadPoolExecutor` of `S = cfg.threads // 2`
|
||||
threads. Only ever runs slow-classified work.
|
||||
- Total OS threads per worker stays at `cfg.threads` — adaptive-queueing mode splits
|
||||
the existing budget, it does not expand it. `cfg.threads` must be at least 2
|
||||
for the split to be meaningful; otherwise the worker logs a warning and runs
|
||||
with a single pool.
|
||||
- Both lanes share the existing `worker_connections` admission like today; the
|
||||
slow lane is not separately bounded.
|
||||
|
||||
### Why two plain pools (and not a custom dual-queue scheduler)
|
||||
|
||||
@ -157,13 +157,13 @@ the handshake completes. For SSL connections in this first cut:
|
||||
|
||||
New settings, mirroring `WorkerThreads` (`config.py:697`):
|
||||
|
||||
- `enable_adaptive_queueing` — boolean; when true, the `gthread` worker splits its
|
||||
`cfg.threads` budget between a fast and a slow lane and routes by prediction.
|
||||
Default `False` (single pool, today's behavior). Requires `cfg.threads >= 2`;
|
||||
otherwise the worker logs a warning and falls back to the single pool.
|
||||
- `slow_request_threshold` — float seconds; a route whose learned timing meets/
|
||||
exceeds this is "slow". Default e.g. `1.0`. **`0` disables the whole feature**
|
||||
and restores today's single-pool behavior exactly.
|
||||
- `slow_threads` — `S`, slow-lane worker count. Default `1`.
|
||||
- `slow_queue_maxsize` — bound on `slow_q`; overflow ⇒ `503`. Default e.g. `100`
|
||||
(`0` = unbounded).
|
||||
- `slow_lane_retry_after` — seconds for the `Retry-After` header on 503.
|
||||
exceeds this is "slow". Default `1.0`. Only consulted when `enable_adaptive_queueing` is
|
||||
enabled.
|
||||
|
||||
A `slow_route_key` hook to customize the route key (e.g. collapse
|
||||
`/users/<id>`) is a possible future addition; the default key is method + path
|
||||
@ -171,17 +171,17 @@ with the query string stripped.
|
||||
|
||||
### 5.2 Two thread pools
|
||||
|
||||
`init_process` builds two plain `ThreadPoolExecutor`s when routing is enabled —
|
||||
`fast_pool` (`F = cfg.threads`) and `slow_pool` (`S = cfg.slow_threads`) — and
|
||||
falls back to the single `get_thread_pool()` executor when it is disabled.
|
||||
`enqueue_req(conn, slow)` submits to the matching pool; both produce ordinary
|
||||
`init_process` builds two plain `ThreadPoolExecutor`s when `enable_adaptive_queueing` is
|
||||
enabled — `fast_pool` with `F = ceil(cfg.threads / 2)` workers and `slow_pool`
|
||||
with `S = cfg.threads // 2` workers — and falls back to the single
|
||||
`get_thread_pool()` executor when it is disabled. `enqueue_req(conn, slow)`
|
||||
submits to the matching pool; both produce ordinary
|
||||
`concurrent.futures.Future`s, so `_wrap_future`, `add_done_callback`,
|
||||
`self.futures` tracking, and `futures.wait` all keep working unchanged.
|
||||
|
||||
- Bounding the slow lane: `nr_slow` counts slow requests submitted-but-not-yet-
|
||||
finished. `enqueue_req` rejects (503) when `nr_slow >= S + slow_queue_maxsize`;
|
||||
`finish_request` decrements it. This caps the slow executor's otherwise
|
||||
unbounded internal queue.
|
||||
- The slow lane is not separately bounded; its executor's internal queue is
|
||||
capped indirectly by the worker's existing `worker_connections` admission,
|
||||
the same as the single-pool path today.
|
||||
- Shutdown drains both pools via a `_shutdown_pools` helper, replacing the
|
||||
single `tpool.shutdown` calls; the `graceful_timeout` `futures.wait` is
|
||||
unchanged.
|
||||
@ -215,16 +215,16 @@ A small, self-contained, thread-safe object:
|
||||
|
||||
## 6. Behavior under load (the cases that matter)
|
||||
|
||||
- **Flood of a previously-seen slow route**: every such request
|
||||
is routed to the slow pool. The `F` fast threads are never given this work and
|
||||
keep serving fast traffic at full capacity. When the slow lane reaches
|
||||
`S + slow_queue_maxsize`, further slow requests get a fast `503` — backpressure
|
||||
is contained to the slow lane.
|
||||
- **Flood of a previously-seen slow route**: every such request is routed to
|
||||
the slow pool. The `F` fast threads are never given this work and keep
|
||||
serving fast traffic at full capacity. Excess slow requests sit in the slow
|
||||
pool's queue, gated overall by `worker_connections`.
|
||||
- **Flood of a never-seen slow route**: the first occurrence(s) run in the fast
|
||||
lane; mid-flight learning (§5.4.2) flips the route to slow after one threshold
|
||||
interval, so the flood is contained quickly.
|
||||
- **Mixed fast traffic, idle slow lane**: the `S` slow threads stay parked (no
|
||||
work stealing in this design — see §3), so fast throughput is `F`, not `F + S`.
|
||||
work stealing in this design — see §3), so fast throughput is `F`, not
|
||||
`F + S`. This is the cost of splitting a fixed `cfg.threads` budget.
|
||||
- **Misprediction (route marked slow but now fast)**: handled gracefully — it
|
||||
runs in the slow lane, and EWMA decay restores it to the fast lane over time.
|
||||
|
||||
@ -232,23 +232,23 @@ A small, self-contained, thread-safe object:
|
||||
|
||||
Implemented:
|
||||
|
||||
- `config.py` — `slow_request_threshold`, `slow_threads`, `slow_queue_maxsize`,
|
||||
`slow_lane_retry_after`, plus `validate_pos_float`.
|
||||
- `config.py` — `enable_adaptive_queueing`, `slow_request_threshold`, plus
|
||||
`validate_pos_float`.
|
||||
- `gthread.py` `init_process`/`get_thread_pool` — build `fast_pool` and
|
||||
`slow_pool` (or the single legacy pool when disabled); `_shutdown_pools`.
|
||||
- `gthread.py` `enqueue_req` — route to the matching pool; `nr_slow` bound +
|
||||
`reject_overloaded` (503).
|
||||
`slow_pool` (split from `cfg.threads`) when `enable_adaptive_queueing` is on, or the
|
||||
single legacy pool when off; `_shutdown_pools`.
|
||||
- `gthread.py` `enqueue_req` — route to the matching pool.
|
||||
- `gthread.py` `accept`/`park_for_request`/`classify_and_dispatch`/
|
||||
`_peek_request_line`/`_route_key` — poller-driven request-line peek + routing.
|
||||
- `gthread.py` `finish_request` — `predictor.update`, `nr_slow` decrement,
|
||||
routing-aware keepalive re-park.
|
||||
- `gthread.py` `finish_request` — `predictor.update`, routing-aware keepalive
|
||||
re-park.
|
||||
- `gthread.py` run-loop sweep — mid-flight learning.
|
||||
- `gthread_routing.py` — `SlowRoutePredictor`.
|
||||
|
||||
## 8. Backward compatibility
|
||||
|
||||
- `slow_request_threshold = 0` ⇒ feature off: single pool, no classification, no
|
||||
rejection — byte-for-byte current behavior.
|
||||
- `enable_adaptive_queueing = False` (the default) ⇒ feature off: single pool, no
|
||||
classification — byte-for-byte current behavior.
|
||||
- Hard per-request timeout (`gthread.py:243-250`) preserved unchanged; this adds
|
||||
a softer, non-fatal classification on top.
|
||||
- Worker `handle`/`handle_request`, keepalive semantics, and the
|
||||
@ -265,8 +265,8 @@ Implemented:
|
||||
dispatches to the expected lane.
|
||||
- **Integration — flood isolation**: app with a known-slow route flooded
|
||||
concurrently; assert fast-route latency stays low and slow requests never
|
||||
occupy fast workers; assert 503 once the slow lane is full.
|
||||
occupy fast workers.
|
||||
- **Integration — cold start**: never-seen slow route burst ⇒ confirm the lane
|
||||
flips to slow within ~one threshold interval via mid-flight learning.
|
||||
- **Regression**: `slow_request_threshold = 0` ⇒ current behavior; keepalive,
|
||||
SSL, and graceful shutdown paths still pass existing tests.
|
||||
- **Regression**: `enable_adaptive_queueing = False` ⇒ current behavior; keepalive, SSL, and
|
||||
graceful shutdown paths still pass existing tests.
|
||||
|
||||
@ -744,32 +744,26 @@ class WorkerConnections(Setting):
|
||||
"""
|
||||
|
||||
|
||||
class SlowRequestThreshold(Setting):
|
||||
name = "slow_request_threshold"
|
||||
class EnableAdaptiveQueueing(Setting):
|
||||
name = "enable_adaptive_queueing"
|
||||
section = "Worker Processes"
|
||||
cli = ["--slow-request-threshold"]
|
||||
meta = "FLOAT"
|
||||
validator = validate_pos_float
|
||||
type = float
|
||||
default = 0.0
|
||||
cli = ["--enable-adaptive-queueing"]
|
||||
validator = validate_bool
|
||||
action = "store_true"
|
||||
type = bool
|
||||
default = False
|
||||
desc = """\
|
||||
Processing time (in seconds) above which a request route is treated as
|
||||
"slow" by the ``gthread`` worker.
|
||||
Enable adaptive multi-queue routing in the ``gthread`` worker.
|
||||
|
||||
When set to a positive value, the ``gthread`` worker runs two separate
|
||||
lanes of threads: a *fast* lane (sized by :ref:`threads`) and a *slow*
|
||||
lane (sized by :ref:`slow-threads`). Requests are routed by predicting,
|
||||
from previously observed timings of the same route (method + path),
|
||||
whether they will take longer than this threshold. Slow-predicted
|
||||
requests go to the slow lane so they can never starve the fast lane,
|
||||
even under a flood.
|
||||
When enabled, the worker splits its :ref:`threads` roughly evenly into
|
||||
two lanes — a *fast* lane and a *slow* lane — and routes each request
|
||||
to one of them by predicting, from previously observed timings of the
|
||||
same route (method + path), whether it will exceed
|
||||
:ref:`slow-request-threshold`. Slow-predicted requests go to the slow
|
||||
lane so they can never starve the fast lane, even under a flood of
|
||||
slow requests.
|
||||
|
||||
A route is learned as slow once it has been observed exceeding this
|
||||
threshold (either on completion or while still running); its timing
|
||||
decays back below the threshold if it becomes fast again.
|
||||
|
||||
The default of ``0`` disables the feature and restores the single
|
||||
thread-pool behaviour.
|
||||
Requires :ref:`threads` to be at least 2.
|
||||
|
||||
This setting only affects the ``gthread`` worker type.
|
||||
|
||||
@ -777,66 +771,25 @@ class SlowRequestThreshold(Setting):
|
||||
"""
|
||||
|
||||
|
||||
class SlowThreads(Setting):
|
||||
name = "slow_threads"
|
||||
class SlowRequestThreshold(Setting):
|
||||
name = "slow_request_threshold"
|
||||
section = "Worker Processes"
|
||||
cli = ["--slow-threads"]
|
||||
meta = "INT"
|
||||
validator = validate_pos_int
|
||||
type = int
|
||||
default = 1
|
||||
cli = ["--slow-request-threshold"]
|
||||
meta = "FLOAT"
|
||||
validator = validate_pos_float
|
||||
type = float
|
||||
default = 1.0
|
||||
desc = """\
|
||||
The number of worker threads dedicated to the slow lane.
|
||||
Processing time (in seconds) above which a request route is treated as
|
||||
"slow" by the ``gthread`` worker when :ref:`enable-adaptive-queueing`
|
||||
is enabled.
|
||||
|
||||
These threads only ever run requests predicted to be slow. When the slow
|
||||
lane has no pending work, they help drain the fast lane, so they are
|
||||
never idle while fast work is waiting. Total threads per worker is
|
||||
``threads + slow_threads``.
|
||||
A route is learned as slow once it has been observed exceeding this
|
||||
threshold (either on completion or while still running); its timing
|
||||
decays back below the threshold if it becomes fast again.
|
||||
|
||||
Only used by the ``gthread`` worker when
|
||||
:ref:`slow-request-threshold` is set.
|
||||
|
||||
.. versionadded:: 23.1.0
|
||||
"""
|
||||
|
||||
|
||||
class SlowQueueMaxsize(Setting):
|
||||
name = "slow_queue_maxsize"
|
||||
section = "Worker Processes"
|
||||
cli = ["--slow-queue-maxsize"]
|
||||
meta = "INT"
|
||||
validator = validate_pos_int
|
||||
type = int
|
||||
default = 100
|
||||
desc = """\
|
||||
Maximum number of requests allowed to wait in the slow lane queue.
|
||||
|
||||
When the slow lane queue is full, additional slow-predicted requests are
|
||||
rejected immediately with a ``503 Service Unavailable`` response instead
|
||||
of being queued, keeping the slow lane's backpressure contained. A value
|
||||
of ``0`` means the slow queue is unbounded.
|
||||
|
||||
Only used by the ``gthread`` worker when
|
||||
:ref:`slow-request-threshold` is set.
|
||||
|
||||
.. versionadded:: 23.1.0
|
||||
"""
|
||||
|
||||
|
||||
class SlowLaneRetryAfter(Setting):
|
||||
name = "slow_lane_retry_after"
|
||||
section = "Worker Processes"
|
||||
cli = ["--slow-lane-retry-after"]
|
||||
meta = "INT"
|
||||
validator = validate_pos_int
|
||||
type = int
|
||||
default = 1
|
||||
desc = """\
|
||||
Value (in seconds) of the ``Retry-After`` header sent with the ``503``
|
||||
response when the slow lane queue is full.
|
||||
|
||||
Only used by the ``gthread`` worker when
|
||||
:ref:`slow-request-threshold` is set.
|
||||
Only used by the ``gthread`` worker when :ref:`enable-adaptive-queueing`
|
||||
is enabled.
|
||||
|
||||
.. versionadded:: 23.1.0
|
||||
"""
|
||||
|
||||
@ -76,24 +76,22 @@ class ThreadWorker(base.Worker):
|
||||
super().__init__(*args, **kwargs)
|
||||
self.worker_connections = self.cfg.worker_connections
|
||||
self.max_keepalived = self.cfg.worker_connections - self.cfg.threads
|
||||
# request routing: when adaptive queueing is enabled, the configured
|
||||
# threads are split into a fast lane and a slow lane so slow requests
|
||||
# cannot starve fast ones
|
||||
self.routing_enabled = (
|
||||
self.cfg.enable_adaptive_queueing and self.cfg.threads >= 2)
|
||||
self.slow_threshold = self.cfg.slow_request_threshold
|
||||
# initialise the pool(s): a single pool when routing is disabled, or a
|
||||
# separate fast (``self.tpool``) and slow pool when it is enabled
|
||||
self.tpool = None
|
||||
self.slow_pool = None
|
||||
# number of slow requests submitted but not yet finished, used to bound
|
||||
# the slow lane (running + queued) and shed load with 503
|
||||
self.nr_slow = 0
|
||||
self.poller = None
|
||||
self.shutdown_event = os.eventfd(0)
|
||||
self._lock = None
|
||||
self.futures = deque()
|
||||
self._keep = deque()
|
||||
self.nr_conns = 0
|
||||
|
||||
# request routing: when a slow-request threshold is configured, slow
|
||||
# requests are routed to a separate lane so they cannot starve fast ones
|
||||
self.slow_threshold = self.cfg.slow_request_threshold
|
||||
self.routing_enabled = self.slow_threshold > 0
|
||||
self.predictor = None
|
||||
|
||||
@classmethod
|
||||
@ -104,14 +102,21 @@ class ThreadWorker(base.Worker):
|
||||
log.warning("No keepalived connections can be handled. " +
|
||||
"Check the number of worker connections and threads.")
|
||||
|
||||
if cfg.enable_adaptive_queueing and cfg.threads < 2:
|
||||
log.warning("enable_adaptive_queueing requires at least 2 threads; "
|
||||
"running with a single pool.")
|
||||
|
||||
def init_process(self):
|
||||
self.tpool = self.get_thread_pool()
|
||||
if self.routing_enabled:
|
||||
# split the configured threads roughly evenly between the two
|
||||
# lanes; the fast lane gets the extra thread when threads is odd
|
||||
slow = self.cfg.threads // 2
|
||||
fast = self.cfg.threads - slow
|
||||
self.tpool = futures.ThreadPoolExecutor(max_workers=fast)
|
||||
self.slow_pool = futures.ThreadPoolExecutor(max_workers=slow)
|
||||
self.predictor = SlowRoutePredictor(self.slow_threshold)
|
||||
# a dedicated pool for the slow lane: slow requests can never
|
||||
# occupy the fast pool's (``self.tpool``) threads
|
||||
self.slow_pool = futures.ThreadPoolExecutor(
|
||||
max_workers=self.cfg.slow_threads)
|
||||
else:
|
||||
self.tpool = self.get_thread_pool()
|
||||
self.poller = selectors.DefaultSelector()
|
||||
self._lock = RLock()
|
||||
super().init_process()
|
||||
@ -148,15 +153,7 @@ class ThreadWorker(base.Worker):
|
||||
|
||||
def enqueue_req(self, conn, slow=False):
|
||||
conn.init()
|
||||
# submit the connection to the appropriate pool
|
||||
if self.routing_enabled and slow:
|
||||
cap = self.cfg.slow_queue_maxsize
|
||||
if cap and self.nr_slow >= self.cfg.slow_threads + cap:
|
||||
# slow lane (running + queued) is full; shed load with 503
|
||||
# instead of letting the slow queue grow unbounded
|
||||
self.reject_overloaded(conn)
|
||||
return
|
||||
self.nr_slow += 1
|
||||
fs = self.slow_pool.submit(self.handle, conn)
|
||||
else:
|
||||
fs = self.tpool.submit(self.handle, conn)
|
||||
@ -262,22 +259,6 @@ class ThreadWorker(base.Worker):
|
||||
return None
|
||||
return method + " " + path
|
||||
|
||||
def reject_overloaded(self, conn):
|
||||
"""Reject a connection with 503 because the slow lane is saturated."""
|
||||
self.nr_conns -= 1
|
||||
try:
|
||||
conn.sock.setblocking(True)
|
||||
conn.sock.sendall(
|
||||
b"HTTP/1.1 503 Service Unavailable\r\n"
|
||||
b"Connection: close\r\n"
|
||||
b"Content-Length: 0\r\n"
|
||||
b"Retry-After: %d\r\n\r\n"
|
||||
% int(self.cfg.slow_lane_retry_after))
|
||||
except OSError:
|
||||
pass
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
def reuse_connection(self, conn, client):
|
||||
with self._lock:
|
||||
# unregister the client from the poller
|
||||
@ -410,10 +391,6 @@ class ThreadWorker(base.Worker):
|
||||
futures.wait(self.futures, timeout=self.cfg.graceful_timeout)
|
||||
|
||||
def finish_request(self, fs):
|
||||
# the slow request is done (whatever the outcome): free its slow slot
|
||||
if self.routing_enabled and fs.slow:
|
||||
self.nr_slow -= 1
|
||||
|
||||
if fs.cancelled():
|
||||
self.nr_conns -= 1
|
||||
fs.conn.close()
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user