While developing PR #28 I encountered an issue: crochet does not terminate the Twisted deferred correctly when it raises the TimeoutError exception. I don't think the fault is on our side, but we should decide how to handle it.
Let's take the following example code as a reference.
Code for reference
```python
import time

import crochet
import fido
import pytest


def test_eventual_result_timeout(server_url):
    # server incredibly slow
    eventual_result = fido.fetch(server_url + '100sec')
    # make sure no timeout error is thrown here but only on result retrieval
    assert eventual_result.original_failure() is None
    with pytest.raises(crochet.TimeoutError):
        eventual_result.wait(timeout=1.0)
    assert eventual_result.original_failure() is None
    print("Sleeping a bit to make sure a potential exception in the reactor "
          "thread due to connectionLost is trapped in eventual_result")
    time.sleep(5)
    assert eventual_result.original_failure() is None
    print("Finished sleeping. In a moment we will lose references to "
          "eventual_result and hence the deferred will be closed incorrectly, "
          "causing the CRITICAL message to be logged right after this message....")
```
```python
import crochet
from twisted.internet import reactor
from twisted.internet.defer import Deferred
from twisted.web.client import Agent

# _build_body_producer, listify_headers, HTTPBodyFetcher and the DEFAULT_*
# constants are fido internals defined in the surrounding module.


@crochet.run_in_reactor
def fetch_inner(url, method, headers, body, timeout, connect_timeout):
    bodyProducer, headers = _build_body_producer(body, headers)
    deferred = Agent(reactor, connectTimeout=connect_timeout).request(
        method=method,
        uri=url,
        headers=listify_headers(headers),
        bodyProducer=bodyProducer)

    def response_callback(response):
        finished = Deferred()
        response.deliverBody(HTTPBodyFetcher(response, finished))
        return finished

    deferred.addCallback(response_callback)
    return deferred


def fetch(
    url, method='GET', headers=None, body='', timeout=DEFAULT_TIMEOUT,
    connect_timeout=DEFAULT_CONNECT_TIMEOUT,
    content_type=DEFAULT_CONTENT_TYPE, user_agent=DEFAULT_USER_AGENT,
):
    crochet.setup()
    return fetch_inner(url, method, headers, body, timeout, connect_timeout)
```
Traceback
Everything goes fine (TimeoutError is correctly raised), but the Twisted reactor thread fails because the connection is closed incorrectly. Traceback:
```
Sleeping a bit to make sure a potential exception in the reactor thread due to
connectionLost is trapped in eventual_result
Finished sleeping. In a moment we will lose references to eventual_result and hence
the deferred will be closed incorrectly, causing the CRITICAL message to
be logged right after this message....
CRITICAL:twisted:Unhandled error in EventualResult
Traceback (most recent call last):
Failure: twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
```
All assertions are true, showing that no exception is trapped in the EventualResult. The exception happens exactly when eventual_result is garbage-collected.
The problem
From the Twisted documentation:

> If a Deferred is garbage-collected with an unhandled error (i.e. it would call the next errback if there was one), then Twisted will write the error’s traceback to the log file. This means that you can typically get away with not adding errbacks and still get errors logged. Be careful though; if you keep a reference to the Deferred around, preventing it from being garbage-collected, then you may never see the error (and your callbacks will mysteriously seem to have never been called).
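A quick way to see this behavior in isolation (a minimal sketch of my own, no reactor or crochet involved):

```python
import gc
import sys

from twisted.internet.defer import Deferred
from twisted.python import log

log.startLogging(sys.stdout)

d = Deferred()
d.errback(RuntimeError("boom"))  # failure recorded, no errback attached
# nothing is logged yet; the failure just sits inside the deferred
del d
gc.collect()  # "Unhandled error in Deferred" is logged only at this point
```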
In this case the ConnectionLost exception is surfaced by the garbage collector disposing of the deferred once we lose all references to the EventualResult (and hence to the deferred). In fact, nothing is logged for as long as eventual_result is alive.
I think that raising TimeoutError without disposing of the deferred causes the Agent connection to be closed incorrectly, and the exception then shows up in the reactor thread after the TimeoutError has been raised.
Why doesn't crochet dispose of the deferred?
Crochet's choice is to allow multiple retries to collect results, so I guess the maintainers would not be OK with destroying the object after the first try; see the crochet docstrings.
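For illustration, this is the retry pattern crochet's EventualResult is designed to support (a sketch, reusing eventual_result from the test above):

```python
try:
    result = eventual_result.wait(timeout=1.0)
except crochet.TimeoutError:
    # the deferred is deliberately kept alive so that the caller can retry
    result = eventual_result.wait(timeout=10.0)
```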
Why not handle twisted.internet.error.ConnectionLost with a Twisted errback?
The problem is a systematic, erroneous disposal of the deferred object, which always closes the connection incorrectly in the same way. Say I were to handle this on our side like this:
```python
import twisted.internet.error
import twisted.web.client


def handle_timeout_errors(error):
    if error.check(twisted.web.client.ResponseNeverReceived):
        if error.value.reasons[0].check(
            twisted.internet.error.ConnectionLost
        ):
            # we get here if crochet aborts the connection incorrectly
            # when the EventualResult timeout is due. Nothing to raise
            # because crochet will take care of raising a TimeoutError.
            return
    return error

deferred.addErrback(handle_timeout_errors)
```
I would mistakenly catch all potential ConnectionLost failures, even ones due to other causes (for example the server aborting the connection).
Possible approaches

1. We document the error, we let it happen, and we let the logger write "CRITICAL" all over our logs. We know why it happens, so it would all be expected.
2. We find a way to correctly destroy the deferred when losing references to it, before garbage collection. I've given this approach a try by changing crochet code and it seems to work (tested with our internal acceptance test suite).
I don't think crochet would want to change its entire default retry behavior by destroying the object after the first try. Maybe an option could be something like this suggestion on Stack Overflow (destroying the deferred ourselves); see the sketch below.
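For reference, "destroying the deferred ourselves" would boil down to Twisted's standard cancellation mechanism (a sketch of my own, not fido code): Deferred.cancel() fires a CancelledError down the errback chain and gives the deferred's creator, e.g. Agent, the chance to abort the connection cleanly.

```python
from twisted.internet.defer import CancelledError, Deferred

d = Deferred()
# swallow the cancellation so it does not end up as an unhandled error
d.addErrback(lambda failure: failure.trap(CancelledError))
d.cancel()  # errbacks with CancelledError instead of lingering until GC
```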
Option 1 looks to me like the fast track. I'd go for option 2.
After [this discussion](https://github.com/itamarst/crochet/issues/102) with itamarst, he suggested a very simple and easy solution to the problem which I had missed: we can cancel the deferred after catching the crochet TimeoutError in the caller code (hence in bravado). Closing this issue; I will spin up a bravado branch to solve this.
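In caller code that would look roughly like this (a sketch; the actual change will land in the bravado branch):

```python
try:
    response = eventual_result.wait(timeout=1.0)
except crochet.TimeoutError:
    # cancelling the EventualResult cancels the wrapped deferred in the
    # reactor thread, so the Agent connection is torn down cleanly instead
    # of erroring when the deferred is garbage-collected
    eventual_result.cancel()
    raise
```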