-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken header parsing in logging #1607
Conversation
…e, just take .items() from that.
Can you explain how this fixes the problem? It's not at all obvious from the change. |
Sorry, could have made that more clear -- pywsgi provides an object that is dict-like and wraps the headers -- the headers property on that object: @property
def headers(self):
for key, value in self._headers:
yield '%s: %s\r\n' % (key, value) adds whitespace for some reason, it's much cleaner to just call .items(): import sys
if sys.version_info.major == 2:
from StringIO import StringIO
else:
from io import BytesIO as StringIO
import gevent.pywsgi
raw_headers = b'Foo: bar\r\nBar: baz\r\nBaz: boo\r\n\r\n'
fp = StringIO(raw_headers)
headers = gevent.pywsgi.headers_factory(fp)
print([h.split(":", 1) for h in headers.headers])
# [['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
print(headers.items())
# [('baz', 'boo'), ('foo', 'bar'), ('bar', 'baz')] (tve)mattb@matt:~ $ python3 header_fix.py
[['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
[('Foo', 'bar'), ('Bar', 'baz'), ('Baz', 'boo')]
(tve)mattb@matt:~ $ python2 header_fix.py
[['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
[('baz', 'boo'), ('foo', 'bar'), ('bar', 'baz')] I'm not sure what the actual interface for the headers attribute is supposed to be between the various servers and frameworks -- not sure how much test coverage you have around this; I think this code path isn't even exercised when using the gevent worker, I only have this problem when also using flask-sockets and geventwebsocket. |
I checked this out and it seems fine. We don't have any coverage right now for anything around this, but in gevent 0.13.x up through the most recent versions this object is @berkerpeksag should we do anything more here or just merge this? |
@@ -213,7 +213,7 @@ def log_request(self): | |||
resp_headers = getattr(self, 'response_headers', {}) | |||
resp = GeventResponse(self.status, resp_headers, self.response_length) | |||
if hasattr(self, 'headers'): | |||
req_headers = [h.split(":", 1) for h in self.headers.headers] | |||
req_headers = self.headers.items() | |||
else: | |||
req_headers = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be changed to {}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. It's been a list of tuples and that isn't being changed here.
I don't know much about this part of Gunicorn, but this looks good to me. Thank you! I think current behavior comes from the mimetools module (which is an ancient Python 2 module) so I don't know why mimetools behaved like that. In gevent/gevent@198f239, they make the behavior same under Python 3. There is also a check for I'll just ping @jamadden in case we missed something. |
I'm not sure what you're asking, but calling |
We have a http.client.HTTPMessage in Python 3 (mimetools.Message in Python 2), just take .items() from that.
I found this by mixing flask-websocket in a flask app -- the parsing logic here leaves whitespace in the header values (including \r\n) so it's not suitable for extracting headers into logs.