Non-blocking _PyMutex_LockTimed spins and may fail unnecessarily in no-GIL builds #135871

Open
@jtibbertsma

Bug report

Bug description:

Non-blocking _PyMutex_LockTimed spins and may fail unnecessarily (no-GIL build)

Branch: main (commit 2793b68f758c10fb63b264787f10d46a71fc8086)
Build: configured with --disable-gil (MAX_SPIN_COUNT > 0)
OS: all


Summary

_PyMutex_LockTimed() is supposed to return immediately when called with
timeout == 0 (non-blocking). In a no-GIL build, if the initial
compare-and-swap loses a race, it:

  1. Spins for up to MAX_SPIN_COUNT iterations, calling _Py_yield() each
     time, before returning;
  2. Can still return PY_LOCK_FAILURE even though the lock was released
     during that spin.

Timed and blocking calls (timeout != 0) also waste CPU because the spin loop
never reloads the lock word, so they can’t notice an unlock until the
spin count is exhausted.

The bug is invisible in GIL builds because MAX_SPIN_COUNT is 0.
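To see why a GIL build hides the problem, it helps to look at the spin guard in isolation. The sketch below is a standalone model, not the CPython source: MODEL_GIL_DISABLED and MODEL_MAX_SPIN_COUNT are hypothetical stand-ins for the build flag and the constant, the bit values are assumed, and the positive value 40 is only illustrative.

```c
#include <stdint.h>

/* Assumed bit layout for the model; the real values live in CPython's headers. */
enum { LOCKED = 1, HAS_PARKED = 2 };

/* Hypothetical macro standing in for a --disable-gil build. */
#ifdef MODEL_GIL_DISABLED
enum { MODEL_MAX_SPIN_COUNT = 40 };   /* illustrative positive value */
#else
enum { MODEL_MAX_SPIN_COUNT = 0 };    /* spinning disabled */
#endif

/* Same shape as the spin guard quoted under "Root cause". */
static int would_spin(uint8_t v, int spin_count)
{
    return !(v & HAS_PARKED) && spin_count < MODEL_MAX_SPIN_COUNT;
}
```

With the limit at 0 the guard is false on the very first pass, so a non-blocking caller never enters the spin branch and the stale lock word is never consulted.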


Root cause

/* Python/lock.c — excerpts */

if ((v & _Py_LOCKED) == 0) {
    if (_Py_atomic_compare_exchange_uint8(&m->_bits, &v, v | _Py_LOCKED))
        return PY_LOCK_ACQUIRED;
}
else if (timeout == 0) {            // executes only if FIRST load saw LOCKED
    return PY_LOCK_FAILURE;         // non-blocking, OK
}

/* … later … */
if (!(v & _Py_HAS_PARKED) && spin_count < MAX_SPIN_COUNT) {
    _Py_yield();
    spin_count++;
    continue;                       // BUG: never refreshes v
}

  • If the fast-path CAS loses a race, the else if guard is skipped
    (the first load saw the lock as free), so a non-blocking call drops
    into the spin loop.
  • Inside that loop v is never reloaded, so the thread can’t see that the
    lock became free. After MAX_SPIN_COUNT iterations it falls through to
    a later timeout == 0 check in the loop and fails spuriously.
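The failure mode described above can be reproduced deterministically in a single-threaded model. This is a sketch using C11 atomics, not the CPython code: model_mutex, owner_unlocks, buggy_trylock_after_lost_cas, SPIN_LIMIT, and the bit values are all hypothetical names and assumptions, and the call that stands where _Py_yield() would be simulates another thread releasing the lock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CPython names; values are illustrative. */
enum { LOCKED = 1, HAS_PARKED = 2, SPIN_LIMIT = 40 };
enum { LOCK_FAILURE = 0, LOCK_ACQUIRED = 1 };

typedef struct { _Atomic uint8_t bits; } model_mutex;

/* Simulates the lock owner releasing the lock while we are spinning. */
static void owner_unlocks(model_mutex *m)
{
    atomic_fetch_and(&m->bits, (uint8_t)~LOCKED);
}

/* Non-blocking attempt (timeout == 0) after the fast-path CAS has failed.
 * `v` is the snapshot the failed CAS left behind. Stores the number of
 * spins performed in *spins. */
static int buggy_trylock_after_lost_cas(model_mutex *m, uint8_t v, int *spins)
{
    int spin_count = 0;
    for (;;) {
        if ((v & LOCKED) == 0) {
            if (atomic_compare_exchange_strong(&m->bits, &v,
                                               (uint8_t)(v | LOCKED))) {
                *spins = spin_count;
                return LOCK_ACQUIRED;
            }
            continue;
        }
        if (!(v & HAS_PARKED) && spin_count < SPIN_LIMIT) {
            owner_unlocks(m);      /* the lock becomes free here */
            spin_count++;
            continue;              /* v is not reloaded, so it stays stale */
        }
        *spins = spin_count;
        return LOCK_FAILURE;       /* timeout == 0 check, reached after spinning */
    }
}
```

Starting from a locked mutex and a snapshot of LOCKED, the model spins SPIN_LIMIT times and reports failure even though the lock word has been 0 since the first spin.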

Proposed fix

/* 1 — reload v each spin */
if (!(v & _Py_HAS_PARKED) && spin_count < MAX_SPIN_COUNT) {
    _Py_yield();
    spin_count++;
    v = _Py_atomic_load_uint8_relaxed(&m->_bits);   // ← added
    continue;
}

/* 2 — early-out for timeout == 0 */
if ((v & _Py_LOCKED) == 0) {
    if (_Py_atomic_compare_exchange_uint8(&m->_bits, &v, v | _Py_LOCKED))
        return PY_LOCK_ACQUIRED;
}
if (timeout == 0) {                               // ← moved outside else
    return PY_LOCK_FAILURE;
}
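Both changes can be combined in a small standalone model to check the intended behaviour. As before, this is only a sketch under assumptions: fixed_lock_model, model_mutex, SPIN_LIMIT, the bit values, and the unlock_on_spin parameter (which simulates the owner releasing the lock during a given spin) are hypothetical, and the model simply gives up where the real function would go on to park the thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CPython names; values are illustrative. */
enum { LOCKED = 1, HAS_PARKED = 2, SPIN_LIMIT = 40 };
enum { LOCK_FAILURE = 0, LOCK_ACQUIRED = 1 };

typedef struct { _Atomic uint8_t bits; } model_mutex;

/* unlock_on_spin: 1-based spin during which a simulated owner unlocks
 * (0 means the owner never unlocks). *spins receives the spins performed. */
static int fixed_lock_model(model_mutex *m, bool nonblocking,
                            int unlock_on_spin, int *spins)
{
    *spins = 0;
    uint8_t v = atomic_load_explicit(&m->bits, memory_order_relaxed);
    if ((v & LOCKED) == 0) {
        if (atomic_compare_exchange_strong(&m->bits, &v,
                                           (uint8_t)(v | LOCKED))) {
            return LOCK_ACQUIRED;
        }
    }
    if (nonblocking) {             /* change 2: no longer inside an else */
        return LOCK_FAILURE;
    }

    int spin_count = 0;
    for (;;) {
        if ((v & LOCKED) == 0) {
            if (atomic_compare_exchange_strong(&m->bits, &v,
                                               (uint8_t)(v | LOCKED))) {
                *spins = spin_count;
                return LOCK_ACQUIRED;
            }
            continue;
        }
        if (!(v & HAS_PARKED) && spin_count < SPIN_LIMIT) {
            if (spin_count + 1 == unlock_on_spin) {
                /* simulated owner releases the lock during this spin */
                atomic_fetch_and(&m->bits, (uint8_t)~LOCKED);
            }
            spin_count++;
            /* change 1: refresh the snapshot on every spin */
            v = atomic_load_explicit(&m->bits, memory_order_relaxed);
            continue;
        }
        *spins = spin_count;
        return LOCK_FAILURE;       /* the real code would park here */
    }
}
```

In this model a non-blocking call on a held lock fails with zero spins, while a blocking call whose owner unlocks during the third spin acquires after exactly three spins instead of exhausting the limit.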

Result

  • Non-blocking calls now return immediately: success if the CAS wins,
    failure if it loses – no spinning, no parking.
  • Timed and blocking calls still spin briefly before parking, but they now
    reload the lock word on each iteration, so they acquire promptly once
    the lock is free.

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

Labels: 3.13, 3.14, 3.15, interpreter-core, topic-free-threading, type-bug
