Bug report
Bug description:
Non-blocking _PyMutex_LockTimed spins and may fail unnecessarily (no-GIL build)
Branch: main (commit 2793b68f758c10fb63b264787f10d46a71fc8086)
Build: configured with --disable-gil (MAX_SPIN_COUNT > 0)
OS: all
Summary
_PyMutex_LockTimed() is supposed to return immediately when called with timeout == 0 (non-blocking). In a no-GIL build it:
- Spins for up to MAX_SPIN_COUNT yields before returning;
- Can still return PY_LOCK_FAILURE even though the lock was released during that spin.
Timed/blocking calls (timeout > 0) also waste CPU because the spin loop never reloads the lock word, so they can’t notice an unlock until after the maximum spin count.
The bug is invisible in GIL builds because MAX_SPIN_COUNT is 0.
Root cause
/* Python/lock.c: excerpts */
if ((v & _Py_LOCKED) == 0) {
    if (_Py_atomic_compare_exchange_uint8(&m->_bits, &v, v | _Py_LOCKED))
        return PY_LOCK_ACQUIRED;
}
else if (timeout == 0) {        // executes only if FIRST load saw LOCKED
    return PY_LOCK_FAILURE;     // non-blocking, OK
}
/* … later … */
if (!(v & _Py_HAS_PARKED) && spin_count < MAX_SPIN_COUNT) {
    _Py_yield();
    spin_count++;
    continue;                   // BUG: never refreshes v
}
- If our fast CAS loses a race, the else if guard is skipped, so a non-blocking call drops into the spin loop.
- Inside that loop v is never reloaded, so the thread can’t see that the lock became free. After MAX_SPIN_COUNT iterations it falls through to the same timeout == 0 guard and fails spuriously.
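The stale-v behaviour described above can be reproduced deterministically in a single thread. The sketch below is only a toy model, not the CPython code: every toy_* name is hypothetical, the bit values and the spin limit of 40 are assumptions standing in for _Py_LOCKED, _Py_HAS_PARKED and MAX_SPIN_COUNT, and the "other thread" is simulated by a yield hook that clears the lock bit at a chosen spin iteration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CPython names; values are assumptions. */
#define TOY_LOCKED      1u
#define TOY_HAS_PARKED  2u
#define TOY_MAX_SPIN    40

typedef enum { TOY_FAILURE = 0, TOY_ACQUIRED = 1 } toy_status;

typedef struct {
    _Atomic uint8_t bits;
    int release_at;  /* spin iteration at which the simulated owner unlocks; -1 = never */
} toy_mutex;

/* Stands in for _Py_yield(): pretends another thread releases the lock
 * while we are spinning. */
static void toy_yield(toy_mutex *m, int spin)
{
    if (spin == m->release_at) {
        atomic_fetch_and_explicit(&m->bits, (uint8_t)~TOY_LOCKED,
                                  memory_order_release);
    }
}

/* Models the slow path of a timeout == 0 call.
 * reload == false mirrors the reported behaviour (v is never refreshed);
 * reload == true mirrors the proposed one-line reload. */
static toy_status toy_spin_then_fail(toy_mutex *m, bool reload)
{
    assert(m != NULL);
    uint8_t v = atomic_load_explicit(&m->bits, memory_order_relaxed);
    int spin = 0;
    for (;;) {
        if ((v & TOY_LOCKED) == 0) {
            /* Lock looks free: try to take it. On failure the CAS
             * stores the current value back into v. */
            if (atomic_compare_exchange_strong(&m->bits, &v,
                                               (uint8_t)(v | TOY_LOCKED))) {
                return TOY_ACQUIRED;
            }
            continue;
        }
        if (!(v & TOY_HAS_PARKED) && spin < TOY_MAX_SPIN) {
            toy_yield(m, spin);
            spin++;
            if (reload) {
                v = atomic_load_explicit(&m->bits, memory_order_relaxed);
            }
            continue;
        }
        return TOY_FAILURE;  /* the timeout == 0 guard */
    }
}
```

With the lock initially held and released at spin 3, the reload == false variant burns all 40 spins and reports failure while the lock word is already clear; the reload == true variant acquires on the next iteration.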
Proposed fix
/* 1: reload v each spin */
if (!(v & _Py_HAS_PARKED) && spin_count < MAX_SPIN_COUNT) {
    _Py_yield();
    spin_count++;
    v = _Py_atomic_load_uint8_relaxed(&m->_bits);   // ← added
    continue;
}
/* 2: early-out for timeout == 0 */
if ((v & _Py_LOCKED) == 0) {
    if (_Py_atomic_compare_exchange_uint8(&m->_bits, &v, v | _Py_LOCKED))
        return PY_LOCK_ACQUIRED;
}
if (timeout == 0) {                                 // ← moved outside else
    return PY_LOCK_FAILURE;
}
Result
- Non-blocking calls now return immediately: success if the CAS wins, failure if it loses; no spinning, no parking.
- Timed/blocking calls still spin for fairness, but they now reload the lock word each iteration, so they acquire promptly once the lock is free.
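For reference, the intended non-blocking contract (one load, at most one CAS, then return) can be written as a standalone C11 sketch. This is an illustration under assumptions rather than the patch itself: the nb_* names are hypothetical, and the bit values only stand in for _Py_LOCKED and _Py_HAS_PARKED.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for _Py_LOCKED / _Py_HAS_PARKED and PyLockStatus. */
#define NB_LOCKED      1u
#define NB_HAS_PARKED  2u

typedef enum { NB_FAILURE = 0, NB_ACQUIRED = 1 } nb_status;

/* timeout == 0 behaviour: never spins, never parks. */
static nb_status nb_trylock(_Atomic uint8_t *bits)
{
    assert(bits != NULL);
    uint8_t v = atomic_load_explicit(bits, memory_order_relaxed);
    if ((v & NB_LOCKED) == 0) {
        /* Set only the LOCKED bit; any other bit (e.g. HAS_PARKED) is preserved. */
        if (atomic_compare_exchange_strong(bits, &v,
                                           (uint8_t)(v | NB_LOCKED))) {
            return NB_ACQUIRED;
        }
    }
    return NB_FAILURE;  /* lock was held, or the single CAS lost a race */
}
```

A regression test for the real function could assert the same two properties: a call on a free lock succeeds, and a call on a held lock fails without ever reaching the yield path.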
CPython versions tested on:
CPython main branch
Operating systems tested on:
macOS