Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken header parsing in logging #1607

Merged
merged 1 commit into from
Oct 23, 2017
Merged

Fix broken header parsing in logging #1607

merged 1 commit into from
Oct 23, 2017

Conversation

mattbillenstein
Copy link
Contributor

I found this by mixing flask-websocket in a flask app -- the parsing logic here leaves whitespace in the header values (including \r\n) so it's not suitable for extracting headers into logs.

@tilgovi
Copy link
Collaborator

tilgovi commented Sep 27, 2017

Can you explain how this fixes the problem? It's not at all obvious from the change.

@mattbillenstein
Copy link
Contributor Author

Sorry, could have made that more clear -- pywsgi provides an object that is dict-like and wraps the headers -- the headers property on that object:

        @property
        def headers(self):
            for key, value in self._headers:
                yield '%s: %s\r\n' % (key, value)

adds whitespace for some reason, it's much cleaner to just call .items():

import sys
if sys.version_info.major == 2:
    from StringIO import StringIO
else:
    from io import BytesIO as StringIO

import gevent.pywsgi

raw_headers = b'Foo: bar\r\nBar: baz\r\nBaz: boo\r\n\r\n'
fp = StringIO(raw_headers)
headers = gevent.pywsgi.headers_factory(fp)

print([h.split(":", 1) for h in headers.headers])
# [['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]

print(headers.items())
# [('baz', 'boo'), ('foo', 'bar'), ('bar', 'baz')]
(tve)mattb@matt:~ $ python3 header_fix.py
[['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
[('Foo', 'bar'), ('Bar', 'baz'), ('Baz', 'boo')]

(tve)mattb@matt:~ $ python2 header_fix.py
[['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
[('baz', 'boo'), ('foo', 'bar'), ('bar', 'baz')]

I'm not sure what the actual interface for the headers attribute is supposed to be between the various servers and frameworks -- not sure how much test coverage you have around this; I think this code path isn't even exercised when using the gevent worker, I only have this problem when also using flask-sockets and geventwebsocket.

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 23, 2017

I checked this out and it seems fine.

We don't have any coverage right now for anything around this, but in gevent 0.13.x up through the most recent versions this object is mimetools.Message or the wrapper on Python 3 that has the code you copy/pasted. In all cases, .items() works fine.

@berkerpeksag should we do anything more here or just merge this?

@@ -213,7 +213,7 @@ def log_request(self):
resp_headers = getattr(self, 'response_headers', {})
resp = GeventResponse(self.status, resp_headers, self.response_length)
if hasattr(self, 'headers'):
req_headers = [h.split(":", 1) for h in self.headers.headers]
req_headers = self.headers.items()
else:
req_headers = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be changed to {}?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. It's been a list of tuples and that isn't being changed here.

@berkerpeksag
Copy link
Collaborator

I don't know much about this part of Gunicorn, but this looks good to me. Thank you!

I think current behavior comes from the mimetools module (which is an ancient Python 2 module) so I don't know why mimetools behaved like that. In gevent/gevent@198f239, they make the behavior same under Python 3. There is also a check for ' ' and '\t' at https://github.com/gevent/gevent/blob/98c2c15fbbee0c6ab6fab0ba747fd0cdceca2ba8/src/gevent/pywsgi.py#L1036 but this detail is probably not important in our case.

I'll just ping @jamadden in case we missed something.

@jamadden
Copy link
Collaborator

I'm not sure what you're asking, but calling items should be fine. I don't see that changing.

@berkerpeksag berkerpeksag merged commit 5d4f885 into benoitc:master Oct 23, 2017
mjjbell pushed a commit to mjjbell/gunicorn that referenced this pull request Mar 16, 2018
We have a http.client.HTTPMessage in Python 3
(mimetools.Message in Python 2), just take .items()
from that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants