Closed
Description
This is related, but I think slightly different than #868 and #420 (which deals with the fact that these flags do not stop execution after a fail).
Observe this test file:
import sys
from time import sleep
from pytest import fixture
# hack for live prints (https://stackoverflow.com/questions/27006884)
sys.stdout = sys.stderr
@fixture(scope='session', autouse=True)
def session_fixture():
print('### executing fixture! ###')
def test_1():
assert False
def test_2():
sleep(1)
def test_3():
sleep(1)
def test_4():
sleep(1)
Running this with:
pytest test_parallel.py -n 2 -s --maxfail=1
leads to the session_fixture
being executed 3 times! Removing --maxfail=1
, you can see session_fixture
is executed twice (as expected).
Activity
bluetech commentedon Feb 22, 2024
Can confirm, this happens to me too. Will try to bisect this later.
bluetech commentedon Feb 22, 2024
Had a couple of minutes so did the bisect. It's a regression from pytest 8.0.0: pytest-dev/pytest@12b9bd5 (PR pytest-dev/pytest#11721). Will look into it.
bluetech commentedon Feb 22, 2024
Particularly this new code in pytest 8.0.0 (the
shouldfail
case):https://github.com/pytest-dev/pytest/blob/c5c729e27aa1e866af37f2ee97b1c68e9463070c/src/_pytest/runner.py#L135-L138
bluetech commentedon Feb 23, 2024
Since I am not sure how to fix pytest-xdist
--maxfail
handling to work with the pytest change, I'm going to propose a revert of the pytest change for pytest 8.0.2.--maxfail
fix caused a regression in pytest-xdist pytest-dev/pytest#12021Revert "Fix teardown error reporting when `--maxfail=1` (pytest-dev#1…
--maxfail=1
(#11721)" pytest-dev/pytest#12022bluetech commentedon Feb 23, 2024
Will be fixed in pytest 8.0.2, but let's keep this open for adding a regression test if possible.
bbrown1867 commentedon Feb 23, 2024
A couple of observations:
-n 1
) - I see the fixture run twice. With a single worker I'd expect it to run once (correct me if I'm wrong here).This makes me think it is not really multi-process related, but just some logic when how
pytest-xdist
is handlingnextitem
that pytest-dev/pytest@12b9bd5 messed up. Will investigate a bit more.bbrown1867 commentedon Feb 23, 2024
I did some more investigation into this - I don't think the issue is in
pytest
, but in the waypytest-xdist
is (mis)handling the--maxfail
flag. For example, if you run the example code above usingpytest
version 7.4.4, multiple tests will run on a single worker before the worker quits. I think this is incorrect...after a single failure on a worker, it should exit.A simpler way to see this is using
-n 1
. As I understand it,-n 1
should behave the same as simply omitting-n
.With
-n 2
, you can still see thatgw0
runs a second test after the first one fails. I thinkpytest
version 8 just exposes this issue, since the hooks inpytest-xdist
keep running when they should have exited.I have a fix locally, I'll try to get into a pull request.
shouldfail
orshouldstop
is set #1026bluetech commentedon Feb 24, 2024
I'm not really familiar with the pytest-xdist code but I assume there is some inherit raciness here:
--maxfail
is a global limit, the handling of it cannot happen within each worker, it must be handled in the manager.So the number of failures can end up higher than maxfail. I don't think there is a way to avoid this, unless you start limiting the parallelism to
maxfail - fails
, which with--maxfail=1
/-x
means no parallelism at all...Well
-n1
still works using xdist, which I think is reasonable. Probably what's going on is:SetupState
in pytest).I think the
--maxfail=1
case is a bit misleading. When themaxfail
is greater than 1, there are two possible situations:For the original issue in pytest (pytest-dev/pytest#11706), the more difficult case is 1. But the regression we're discussing only manifests for 2. If I understand your PR correctly, it tackles 2 right?
bbrown1867 commentedon Feb 25, 2024
Yes, the PR I posted stops the worker if the "local" failures reach
maxfail
- which I believe fixes the regression and allows us to keep the bugfix inpytest
8.x.The "global" vs. "local" failure count is a useful distinction and clarifies the challenge here: Each worker has a unique pytest
session
instance that maintains it's ownshouldfail
/shouldstop
properties. Thenpytest-xdist
has the "global"shouldstop
property insrc/xdist/dsession.py
:pytest-xdist/src/xdist/dsession.py
Lines 354 to 362 in f57c658
I think you raise a good point about the "race condition" here - the
DSession
might be still handling/processing the failures while the worker keeps running the next test.An alternative solution could be have some configuration option to simply disable the
shouldfail
/shouldstop
properties withinpytest
, since it seems likepytest-xdist
want's its own, custom behavior here? But I think the PR posted is a decent workaround as well, since if local failures exceedsmaxfail
, so would global.12 remaining items