Strip whitespace also *after* header field value.
Simply refuse obsolete header folding (a default-off
option to revert is temporarily provided).
While we are at it, explicitly handle recently
introduced http error classes with intended status code.
changes:
- Just follow the new TE specification (https://datatracker.ietf.org/doc/html/rfc9112#name-transfer-encoding)
here and accept to introduce a breaking change.
- gandle multiple TE on one line
** breaking changes ** : invalid headers and position will now return
an error.
New parser rule: refuse HTTP requests where a header field value
contains characters that
a) should never appear there in the first place,
b) might have lead to incorrect treatment in a proxy in front, and
c) might lead to unintended behaviour in applications.
From RFC 9110 section 5.5:
"Field values containing CR, LF, or NUL characters are invalid and
dangerous, due to the varying ways that implementations might parse
and interpret those characters; a recipient of CR, LF, or NUL within
a field value MUST either reject the message or replace each of those
characters with SP before further processing or forwarding of that
message."
chunk extensions are silently ignored before and after this change;
its just the whitespace handling for the case without extensions that matters
applying same strip(WS)->rstrip(BWS) replacement as already done in related cases
half-way fix: could probably reject all BWS cases, rejecting only misplaced ones
Note: This is unrelated to a reverse proxy potentially talking HTTP/3 to clients.
This is about the HTTP protocol version spoken to Gunicorn, which is HTTP/1.0 or HTTP/1.1.
Little legitimate need for processing HTTP 1 requests with ambiguous version numbers.
Broadly refuse.
Co-authored-by: Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu>
Do the validation on the original, not the result from unicode case folding.
Background:
latin-1 0xDF is traditionally uppercased 0x53+0x53 which puts it back in ASCII
If we promise wsgi.input_terminated, we better get it right - or not at all.
* chunked encoding on HTTP <= 1.1
* chunked not last transfer coding
* multiple chinked codings
* any unknown codings (yes, this too! because we do not detect unusual syntax that is still chunked)
* empty coding (plausibly harmless, but not see in real life anyway - refused, for the moment)
Ambiguous mappings open a bottomless pit of "what is user input and what is proxy input" confusion.
Default to what everyone else has been doing for years now, silently drop.
see also https://nginx.org/r/underscores_in_headers
- Unify HEADER_RE and METH_RE
- Replace CRLF with SP during obs-fold processing (See RFC 9112 Section 5.2, last paragraph)
- Stop stripping header names.
- Remove HTAB in OWS in header values that use obs-fold (See RFC 9112 Section 5.2, last paragraph)
- Use fullmatch instead of search, which has problems with empty strings. (See GHSA-68xg-gqqm-vgj8)
- Split proxy protocol line on space only. (See proxy protocol Section 2.1, bullet 3)
- Use fullmatch for method and version (Thank you to Paul Dorn for noticing this.)
- Replace calls to str.strip() with str.strip(' \t')
- Split request line on SP only.
Co-authored-by: Paul Dorn <pajod@users.noreply.github.com>
Numbers must be separated by dot. This makes life
a little bit harder for attackers who would like to inject specially crafted packets after GET / (e.g. in nginx there are sometimes regular expressions like (?P<action>[^.]).html
patch from Djoume Salvetti . address the following issues in gunicorn:
* Gunicorn does not limit the size of a request header (the
* limit_request_field_size configuration parameter is not used)
* When the configured request limit is lower than its maximum value, the
* maximum value is used instead. For instance if limit_request_line is
* set to 1024, gunicorn will only limit the request line to 4096 chars
* (this issue also affects limit_request_fields)
* Request limits are not limited to their maximum authorized values. For
* instance it is possible to set limit_request_line to 64K (this issue
* also affects limit_request_fields)
* Setting limit_request_fields and limit_request_field_size to 0 does
* not make them unlimited. The following patch allows limit_request_line
* and limit_request_field_size to be unlimited. limit_request_fields can
* no longer be unlimited (I can't imagine 32K fields to not be enough
* but I have a use case where 8K for the request line is not enough).
* Parsing errors (premature client disconnection) are not reported
* When request line limit is exceeded the configured value is reported
* instead of the effective value.
Add --limit-request-fields (limit_request_fields) and
--limit-request-field-size (limit-request-field-size) options.
- limit_request_fields:
Value is a number from 0 (unlimited) to 32768. This parameter is
used to limit the number of headers in a request to prevent DDOS
attack. Used with the `limit_request_field_size` it allows more
safety.
- limit_request_field_size:
Value is a number from 0 (unlimited) to 8190. to set the limit
on the allowed size of an HTTP request header field.
You can now pass the parameter --limit-request-line or set the
limit_request_line in your configuration file to set the max size of the
request line in bytes.
This parameter is used to limit the allowed size of a client's HTTP
request-line. Since the request-line consists of the HTTP method, URI,
and protocol version, this directive places a restriction on the length
of a request-URI allowed for a request on the server. A server needs
this value to be large enough to hold any of its resource names,
including any information that might be passed in the query part of a
GET request. By default this value is 4094 and can't be larger than
8190.
This parameter can be used to prevent any DDOS attack.