Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(5925): suppress ValueError due to invalid datetime header #6012

Merged

Conversation

hi-ogawa
Copy link
Contributor

@hi-ogawa hi-ogawa commented Sep 18, 2021

EDIT: I will update CHANGES and CONTRIBUTORS right away.

What do these changes do?

Suppress ValueError from datetime.datetime function in BaseRequest._http_date, which is used for parsing a timestamp in request header (If-Modified-Since, If-Unmodified-Since, and If-Range).
According to the documentation, the corresponding request properties (if_modified_since, if_unmodified_since, and if_range) are supposed to return None for invalid date, so this PR will make the behavior aligned to the documentation.

Are there changes in behavior for the user?

Well... if there are users who want to handle such error by themselves, then they won't be able to see that error anymore, but that seems unlikely.

Related issue number

Fixes #5925

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@hi-ogawa hi-ogawa requested a review from asvetlov as a code owner September 18, 2021 12:25
@hi-ogawa hi-ogawa force-pushed the 5925-suppress-invalid-datetime-header branch from 966e14b to a43b53b Compare September 18, 2021 12:50
@hi-ogawa hi-ogawa requested a review from webknjaz as a code owner September 18, 2021 12:50
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Sep 18, 2021
@webknjaz
Copy link
Member

I wonder if it's the docs that should be updated. Have you checked if the code changed at some point w/o updating the docs? Merging such a change in a minor release may be dangerous for people relying on the de-facto behavior. Plus raising an exception is more pythonic.

@hi-ogawa
Copy link
Contributor Author

hi-ogawa commented Sep 19, 2021

@webknjaz Thanks for the feedback!

Have you checked if the code changed at some point w/o updating the docs?

I found a few relevant PRs, so I will list them here.

So, in summary, I don't think there was any change in behavior w/o updating the docs.


Plus raising an exception is more pythonic.

For this one, I don't think I should weight in since I'm just a beginner user. So I'll leave it to core maintainers for the final decision.
However, I'll just give pointers to what other libraries do, which might help you to decide.

  • For Flask's werkzeug library, they have similar properties on Request class (e.g. if_modified_since) and they use a parse_date utility function, which is similar to what aiohttp does, but it suppresses exceptions.

  • For Django (personally I'm not familiar with this though), it doesn't seem there are properties like if_modified_since, but there is a utility function parse_http_date_safe which can parse date header without exception.


While grepping around the code base, I actually found StreamResponse.last_modified, where same date parsing logic is copy/pasted. So, I will update this as well, possibly with small refactoring.

@hi-ogawa
Copy link
Contributor Author

hi-ogawa commented Sep 19, 2021

Another thing I noticed is that current aiohttp's if_modified_since ignores timezone in the original header (e.g. "GMT", "+0900" etc..) and overwrites with datetime(... tzinfo=datetime.timezone.utc), which might not be the behavior users would expect.
I would suggest to use email.utils.parsedate_to_datetime to parse directly from str to datetime including timezone as werkzeug's parse_date does (this function is available since Python 3.3 so it meets aiohttp's requirement Python >= 3.7).
However, note that currently email.utils.parsedate_to_datetime has a bit tricky behavior regarding parse errors (see https://bugs.python.org/issue30681), but we can just use suppress(TypeError, ValueError) and it should be fine.

EDIT:
Oh sorry, probably I was wrong. According to MDN (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since) and http spec linked from there, the timezone string in "HTTP date" is supposed to be always exactly "GMT".

@hi-ogawa hi-ogawa force-pushed the 5925-suppress-invalid-datetime-header branch from 12d16ec to 1619d69 Compare September 19, 2021 02:45
…esponse.last_modified` for invalid http date header
@hi-ogawa hi-ogawa force-pushed the 5925-suppress-invalid-datetime-header branch from 1619d69 to 42f1e97 Compare September 19, 2021 02:46
@hi-ogawa
Copy link
Contributor Author

hi-ogawa commented Sep 19, 2021

With this commit 42f1e97, StreamResponse.last_modified will be handled similary.

@Dreamsorcerer
Copy link
Member

* [Make StaticRoute support Last-Modified and If-Modified-Since headers #386](https://github.com/aio-libs/aiohttp/pull/386)
  This PR introduced `if_modified_since` with `email.utils.parsedate` and `datetime.datetime` parsing logic. Documentation for `if_modified_since` is also added in this PR and it hasn't changed since then.

Looks to me like the original code expected the parsedate() function to only return results if they would also be valid for datetime(), not realising that it could parse, for example, a year number which is larger than what datetime() will accept. So, this looks like a subtle bug in the original implementation.

Plus raising an exception is more pythonic.

In general, but these little helper attributes seem like they should not raise exceptions. I don't expect to get a ValueError when accessing an attribute. If I did want a more precise control over the parsing with exceptions and such, I'd probably look at the raw header and parse it manually, rather than using this attribute.

@Dreamsorcerer
Copy link
Member

@webknjaz Any other concerns or can we merge this?

Copy link
Member

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@asvetlov asvetlov enabled auto-merge (squash) October 24, 2021 08:25
@codecov
Copy link

codecov bot commented Oct 24, 2021

Codecov Report

Merging #6012 (11f66af) into master (c6eb530) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #6012   +/-   ##
=======================================
  Coverage   93.33%   93.34%           
=======================================
  Files         102      102           
  Lines       30080    30088    +8     
  Branches     2690     2689    -1     
=======================================
+ Hits        28076    28085    +9     
  Misses       1829     1829           
+ Partials      175      174    -1     
Flag Coverage Δ
unit 93.26% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiohttp/helpers.py 96.92% <100.00%> (+0.05%) ⬆️
aiohttp/web_request.py 95.91% <100.00%> (-0.08%) ⬇️
aiohttp/web_response.py 98.41% <100.00%> (+0.20%) ⬆️
tests/test_helpers.py 98.85% <100.00%> (+0.01%) ⬆️
tests/test_web_request.py 99.07% <100.00%> (+0.01%) ⬆️
tests/test_web_response.py 99.47% <100.00%> (+<0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c6eb530...11f66af. Read the comment docs.

@asvetlov asvetlov merged commit 4744923 into aio-libs:master Oct 24, 2021
@patchback
Copy link
Contributor

patchback bot commented Oct 24, 2021

Backport to 3.8: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 4744923 on top of patchback/backports/3.8/4744923f8070baa364d5bfd3edd8510a4394303c/pr-6012

Backporting merged PR #6012 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.8/4744923f8070baa364d5bfd3edd8510a4394303c/pr-6012 upstream/3.8
  4. Now, cherry-pick PR fix(5925): suppress ValueError due to invalid datetime header #6012 contents into that branch:
    $ git cherry-pick -x 4744923f8070baa364d5bfd3edd8510a4394303c
    If it'll yell at you with something like fatal: Commit 4744923f8070baa364d5bfd3edd8510a4394303c is a merge but no -m option was given., add -m 1 as follows intead:
    $ git cherry-pick -m1 -x 4744923f8070baa364d5bfd3edd8510a4394303c
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR fix(5925): suppress ValueError due to invalid datetime header #6012 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.8/4744923f8070baa364d5bfd3edd8510a4394303c/pr-6012
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

@aio-libs-github-bot
Copy link
Contributor

💔 Backport was not successful

The PR was attempted backported to the following branches:

  • ❌ 3.8: Commit could not be cherrypicked due to conflicts

asvetlov added a commit that referenced this pull request Oct 24, 2021
@hi-ogawa hi-ogawa deleted the 5925-suppress-invalid-datetime-header branch October 25, 2021 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided There is a change note present in this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ValueError: year is out of range
4 participants