Skip to content

Session-scoped fixture executed too many times with -x / --maxfails #1024

Closed
@Garrett-R

Description

@Garrett-R

This is related, but I think slightly different than #868 and #420 (which deals with the fact that these flags do not stop execution after a fail).

Observe this test file:

import sys
from time import sleep

from pytest import fixture

# hack for live prints (https://stackoverflow.com/questions/27006884)
sys.stdout = sys.stderr


@fixture(scope='session', autouse=True)
def session_fixture():
    print('### executing fixture! ###')


def test_1():
    assert False


def test_2():
    sleep(1)


def test_3():
    sleep(1)


def test_4():
    sleep(1)

Running this with:

pytest test_parallel.py  -n 2  -s --maxfail=1

leads to the session_fixture being executed 3 times! Removing --maxfail=1, you can see session_fixture is executed twice (as expected).

Activity

bluetech

bluetech commented on Feb 22, 2024

@bluetech
Member

Can confirm, this happens to me too. Will try to bisect this later.

bluetech

bluetech commented on Feb 22, 2024

@bluetech
Member

Had a couple of minutes so did the bisect. It's a regression from pytest 8.0.0: pytest-dev/pytest@12b9bd5 (PR pytest-dev/pytest#11721). Will look into it.

bluetech

bluetech commented on Feb 22, 2024

@bluetech
Member
bluetech

bluetech commented on Feb 23, 2024

@bluetech
Member

Since I am not sure how to fix pytest-xdist --maxfail handling to work with the pytest change, I'm going to propose a revert of the pytest change for pytest 8.0.2.

bluetech

bluetech commented on Feb 23, 2024

@bluetech
Member

Will be fixed in pytest 8.0.2, but let's keep this open for adding a regression test if possible.

bbrown1867

bbrown1867 commented on Feb 23, 2024

@bbrown1867
Contributor

A couple of observations:

  • I believe the issue still exists if a single worker is used (-n 1) - I see the fixture run twice. With a single worker I'd expect it to run once (correct me if I'm wrong here).
  • Adding some timestamps shows the fixture runs again after the failing test case runs.

This makes me think it is not really multi-process related, but just some logic when how pytest-xdist is handling nextitem that pytest-dev/pytest@12b9bd5 messed up. Will investigate a bit more.

bbrown1867

bbrown1867 commented on Feb 23, 2024

@bbrown1867
Contributor

I did some more investigation into this - I don't think the issue is in pytest, but in the way pytest-xdist is (mis)handling the --maxfail flag. For example, if you run the example code above using pytest version 7.4.4, multiple tests will run on a single worker before the worker quits. I think this is incorrect...after a single failure on a worker, it should exit.

A simpler way to see this is using -n 1. As I understand it, -n 1 should behave the same as simply omitting -n.

% pytest test_parallel.py  -n 1 -s --maxfail=1
====================================================== test session starts =======================================================
platform darwin -- Python 3.8.13, pytest-7.4.4, pluggy-1.4.0
rootdir: /Users/bbrown/Downloads
plugins: xdist-3.5.0
1 worker [4 items]
### executing fixture! ###
F.
============================================================ FAILURES ============================================================
_____________________________________________________________ test_1 _____________________________________________________________
[gw0] darwin -- Python 3.8.13 /Users/bbrown/Downloads/venv_old_pytest/bin/python

    def test_1():
>       assert False
E       assert False

test_parallel.py:16: AssertionError
==================================================== short test summary info =====================================================
FAILED test_parallel.py::test_1 - assert False
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! xdist.dsession.Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================== 1 failed, 1 passed in 1.41s ===================================================

With -n 2, you can still see that gw0 runs a second test after the first one fails. I think pytest version 8 just exposes this issue, since the hooks in pytest-xdist keep running when they should have exited.

I have a fix locally, I'll try to get into a pull request.

bluetech

bluetech commented on Feb 24, 2024

@bluetech
Member

For example, if you run the example code above using pytest version 7.4.4, multiple tests will run on a single worker before the worker quits. I think this is incorrect...after a single failure on a worker, it should exit.

I'm not really familiar with the pytest-xdist code but I assume there is some inherit raciness here:

  • pytest-xdist works by distributing the items to multiple works
  • Since --maxfail is a global limit, the handling of it cannot happen within each worker, it must be handled in the manager.
  • So I assume each worker tells the manager that an item failed, the manager increments the failed count and when it reaches maxfail it (asynchronously) tells all the workers to stop.
  • But by the time the "stop" signal reaches the workers they might already be working on an item, which can fail.

So the number of failures can end up higher than maxfail. I don't think there is a way to avoid this, unless you start limiting the parallelism to maxfail - fails, which with --maxfail=1/-x means no parallelism at all...

A simpler way to see this is using -n 1. As I understand it, -n 1 should behave the same as simply omitting -n.

Well -n1 still works using xdist, which I think is reasonable. Probably what's going on is:

  • The pytest runtest protocol requires both the item it executes and the next item. Needs this to determine which fixtures to teardown (see SetupState in pytest).
  • This means a worker always needs to hold two items from the manager, current & next.

I think the --maxfail=1 case is a bit misleading. When the maxfail is greater than 1, there are two possible situations:

  1. The maxfail is reached globally, but not in the worker (e.g. max fail is 10, 5 workers, each worker had 2 failures).
  2. The maxfail is reached in the worker itself.

For the original issue in pytest (pytest-dev/pytest#11706), the more difficult case is 1. But the regression we're discussing only manifests for 2. If I understand your PR correctly, it tackles 2 right?

bbrown1867

bbrown1867 commented on Feb 25, 2024

@bbrown1867
Contributor

I think the --maxfail=1 case is a bit misleading. When the maxfail is greater than 1, there are two possible situations:

  1. The maxfail is reached globally, but not in the worker (e.g. max fail is 10, 5 workers, each worker had 2 failures).
  2. The maxfail is reached in the worker itself.

For the original issue in pytest (pytest-dev/pytest#11706), the more difficult case is 1. But the regression we're discussing only manifests for 2. If I understand your PR correctly, it tackles 2 right?

Yes, the PR I posted stops the worker if the "local" failures reach maxfail - which I believe fixes the regression and allows us to keep the bugfix in pytest 8.x.

The "global" vs. "local" failure count is a useful distinction and clarifies the challenge here: Each worker has a unique pytest session instance that maintains it's own shouldfail / shouldstop properties. Then pytest-xdist has the "global" shouldstop property in src/xdist/dsession.py:

def _handlefailures(self, rep):
if rep.failed:
self.countfailures += 1
if (
self.maxfail
and self.countfailures >= self.maxfail
and not self.shouldstop
):
self.shouldstop = f"stopping after {self.countfailures} failures"

I think you raise a good point about the "race condition" here - the DSession might be still handling/processing the failures while the worker keeps running the next test.

An alternative solution could be have some configuration option to simply disable the shouldfail / shouldstop properties within pytest, since it seems like pytest-xdist want's its own, custom behavior here? But I think the PR posted is a decent workaround as well, since if local failures exceeds maxfail, so would global.

12 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @bluetech@bbrown1867@Garrett-R

      Issue actions

        Session-scoped fixture executed too many times with `-x` / `--maxfails` · Issue #1024 · pytest-dev/pytest-xdist