Session-scoped fixture executed too many times with `-x` / `--maxfails` #1024

New issue

Closed

#1026

Closed

Session-scoped fixture executed too many times with -x / --maxfails#1024

#1026

Labels

bugregression

Garrett-R

This is related, but I think slightly different than #868 and #420 (which deals with the fact that these flags do not stop execution after a fail).

Observe this test file:

import sys
from time import sleep

from pytest import fixture

# hack for live prints (https://stackoverflow.com/questions/27006884)
sys.stdout = sys.stderr


@fixture(scope='session', autouse=True)
def session_fixture():
    print('### executing fixture! ###')


def test_1():
    assert False


def test_2():
    sleep(1)


def test_3():
    sleep(1)


def test_4():
    sleep(1)

Running this with:

pytest test_parallel.py  -n 2  -s --maxfail=1

leads to the session_fixture being executed 3 times! Removing --maxfail=1, you can see session_fixture is executed twice (as expected).

bluetech

Member

Can confirm, this happens to me too. Will try to bisect this later.

added

Member

Had a couple of minutes so did the bisect. It's a regression from pytest 8.0.0: pytest-dev/pytest@12b9bd5 (PR pytest-dev/pytest#11721). Will look into it.

bluetech

Member

Particularly this new code in pytest 8.0.0 (the shouldfail case):

https://github.com/pytest-dev/pytest/blob/c5c729e27aa1e866af37f2ee97b1c68e9463070c/src/_pytest/runner.py#L135-L138

bluetech

Member

Since I am not sure how to fix pytest-xdist --maxfail handling to work with the pytest change, I'm going to propose a revert of the pytest change for pytest 8.0.2.

bluetech

mentioned this

on Feb 23, 2024

Pytest 8.0.0 --maxfail fix caused a regression in pytest-xdist pytest-dev/pytest#12021

bluetech

added a commit that references this issue

on Feb 23, 2024

Revert "Fix teardown error reporting when `--maxfail=1` (pytest-dev#1…

00d9640

bluetech

mentioned this

on Feb 23, 2024

Revert "Fix teardown error reporting when --maxfail=1 (#11721)" pytest-dev/pytest#12022

bluetech

Member

Will be fixed in pytest 8.0.2, but let's keep this open for adding a regression test if possible.

bbrown1867

Contributor

A couple of observations:

I believe the issue still exists if a single worker is used (-n 1) - I see the fixture run twice. With a single worker I'd expect it to run once (correct me if I'm wrong here).
Adding some timestamps shows the fixture runs again after the failing test case runs.

This makes me think it is not really multi-process related, but just some logic when how pytest-xdist is handling nextitem that pytest-dev/pytest@12b9bd5 messed up. Will investigate a bit more.

bbrown1867

Contributor

I did some more investigation into this - I don't think the issue is in pytest, but in the way pytest-xdist is (mis)handling the --maxfail flag. For example, if you run the example code above using pytest version 7.4.4, multiple tests will run on a single worker before the worker quits. I think this is incorrect...after a single failure on a worker, it should exit.

A simpler way to see this is using -n 1. As I understand it, -n 1 should behave the same as simply omitting -n.

% pytest test_parallel.py  -n 1 -s --maxfail=1
====================================================== test session starts =======================================================
platform darwin -- Python 3.8.13, pytest-7.4.4, pluggy-1.4.0
rootdir: /Users/bbrown/Downloads
plugins: xdist-3.5.0
1 worker [4 items]
### executing fixture! ###
F.
============================================================ FAILURES ============================================================
_____________________________________________________________ test_1 _____________________________________________________________
[gw0] darwin -- Python 3.8.13 /Users/bbrown/Downloads/venv_old_pytest/bin/python

    def test_1():
>       assert False
E       assert False

test_parallel.py:16: AssertionError
==================================================== short test summary info =====================================================
FAILED test_parallel.py::test_1 - assert False
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! xdist.dsession.Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================== 1 failed, 1 passed in 1.41s ===================================================

With -n 2, you can still see that gw0 runs a second test after the first one fails. I think pytest version 8 just exposes this issue, since the hooks in pytest-xdist keep running when they should have exited.

I have a fix locally, I'll try to get into a pull request.

bbrown1867

mentioned this

on Feb 23, 2024

Terminate the test loop if shouldfail or shouldstop is set #1026

bluetech

Member

For example, if you run the example code above using pytest version 7.4.4, multiple tests will run on a single worker before the worker quits. I think this is incorrect...after a single failure on a worker, it should exit.

I'm not really familiar with the pytest-xdist code but I assume there is some inherit raciness here:

pytest-xdist works by distributing the items to multiple works
Since --maxfail is a global limit, the handling of it cannot happen within each worker, it must be handled in the manager.
So I assume each worker tells the manager that an item failed, the manager increments the failed count and when it reaches maxfail it (asynchronously) tells all the workers to stop.
But by the time the "stop" signal reaches the workers they might already be working on an item, which can fail.

So the number of failures can end up higher than maxfail. I don't think there is a way to avoid this, unless you start limiting the parallelism to maxfail - fails, which with --maxfail=1/-x means no parallelism at all...

A simpler way to see this is using -n 1. As I understand it, -n 1 should behave the same as simply omitting -n.

Well -n1 still works using xdist, which I think is reasonable. Probably what's going on is:

The pytest runtest protocol requires both the item it executes and the next item. Needs this to determine which fixtures to teardown (see SetupState in pytest).
This means a worker always needs to hold two items from the manager, current & next.

I think the --maxfail=1 case is a bit misleading. When the maxfail is greater than 1, there are two possible situations:

The maxfail is reached globally, but not in the worker (e.g. max fail is 10, 5 workers, each worker had 2 failures).
The maxfail is reached in the worker itself.

For the original issue in pytest (pytest-dev/pytest#11706), the more difficult case is 1. But the regression we're discussing only manifests for 2. If I understand your PR correctly, it tackles 2 right?

bbrown1867

Contributor

I think the --maxfail=1 case is a bit misleading. When the maxfail is greater than 1, there are two possible situations:

The maxfail is reached globally, but not in the worker (e.g. max fail is 10, 5 workers, each worker had 2 failures).

The maxfail is reached in the worker itself.

For the original issue in pytest (pytest-dev/pytest#11706), the more difficult case is 1. But the regression we're discussing only manifests for 2. If I understand your PR correctly, it tackles 2 right?

Yes, the PR I posted stops the worker if the "local" failures reach maxfail - which I believe fixes the regression and allows us to keep the bugfix in pytest 8.x.

The "global" vs. "local" failure count is a useful distinction and clarifies the challenge here: Each worker has a unique pytest session instance that maintains it's own shouldfail / shouldstop properties. Then pytest-xdist has the "global" shouldstop property in src/xdist/dsession.py:

pytest-xdist/src/xdist/dsession.py

Lines 354 to 362 in f57c658

    
           def _handlefailures(self, rep): 
        
               if rep.failed: 
        
                   self.countfailures += 1 
        
                   if ( 
        
                       self.maxfail 
        
                       and self.countfailures >= self.maxfail 
        
                       and not self.shouldstop 
        
                   ): 
        
                       self.shouldstop = f"stopping after {self.countfailures} failures"

I think you raise a good point about the "race condition" here - the DSession might be still handling/processing the failures while the worker keeps running the next test.

An alternative solution could be have some configuration option to simply disable the shouldfail / shouldstop properties within pytest, since it seems like pytest-xdist want's its own, custom behavior here? But I think the PR posted is a decent workaround as well, since if local failures exceeds maxfail, so would global.

12 remaining items

to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

Labels

bugregression

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

Terminate the test loop if `shouldfail` or `shouldstop` is setpytest-dev/pytest-xdist

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Session-scoped fixture executed too many times with `-x` / `--maxfails` #1024

12 remaining items

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

	def _handlefailures(self, rep):
	if rep.failed:
	self.countfailures += 1
	if (
	self.maxfail
	and self.countfailures >= self.maxfail
	and not self.shouldstop
	):
	self.shouldstop = f"stopping after {self.countfailures} failures"

Session-scoped fixture executed too many times with -x / --maxfails #1024

Description

Activity

bluetech commented on Feb 22, 2024

bluetech commented on Feb 22, 2024

bluetech commented on Feb 22, 2024

bluetech commented on Feb 23, 2024

bluetech commented on Feb 23, 2024

bbrown1867 commented on Feb 23, 2024

bbrown1867 commented on Feb 23, 2024

bluetech commented on Feb 24, 2024

bbrown1867 commented on Feb 25, 2024

12 remaining items

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Issue actions

Session-scoped fixture executed too many times with `-x` / `--maxfails` #1024