Closed
Description
Bug report
Bug description:
Here's a race reported by thread sanitizer that I haven't been able to find a small reproducer for, but it does look racy reading the code.
WARNING: ThreadSanitizer: data race (pid=1575489)
Read of size 4 at 0x7fb14614ee00 by thread T130:
#0 unicode_eq /usr/local/google/home/phawkins/p/cpython/Objects/stringlib/eq.h:13:9 (python3.13+0x25da7a) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#1 compare_unicode_unicode /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1139:50 (python3.13+0x25da7a)
#2 do_lookup /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1063:23 (python3.13+0x25da7a)
#3 unicodekeys_lookup_unicode /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1148:12 (python3.13+0x25da7a)
#4 _Py_dict_lookup /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:1259:22 (python3.13+0x25ea02) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#5 dict_setdefault_ref_lock_held /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:4282:21 (python3.13+0x26a5da) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#6 PyDict_SetDefaultRef /usr/local/google/home/phawkins/p/cpython/Objects/dictobject.c:4332:11 (python3.13+0x26a221) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#7 intern_common /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15225:19 (python3.13+0x34a02c) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#8 _PyUnicode_InternMortal /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15286:10 (python3.13+0x34a40c) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#9 PyUnicode_InternFromString /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15322:5 (python3.13+0x34a40c)
#10 nanobind::detail::nb_type_new(nanobind::detail::type_init_data const*) nb_type.cpp (xla_extension.so+0xdcd28aa) (BuildId: e484e79ecc5a6e10)
...
Previous write of size 4 at 0x7fb14614ee00 by thread T137:
#0 immortalize_interned /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15141:34 (python3.13+0x34a20e) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#1 intern_common /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15270:9 (python3.13+0x34a20e)
#2 _PyUnicode_InternMortal /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15286:10 (python3.13+0x34a40c) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
#3 PyUnicode_InternFromString /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:15322:5 (python3.13+0x34a40c)
#4 nanobind::detail::nb_type_new(nanobind::detail::type_init_data const*) nb_type.cpp (xla_extension.so+0xdcd28aa) (BuildId: e484e79ecc5a6e10)
...
I think the scenario here is:
- thread A and B are simultaneously interning strings
- thread A succeeds at inserting that string into the intern dictionary, and is at the end of
intern_common
immortalizing the string, which sets the.interned
field on the string - thread B is now trying to intern a string and is performing an equality test during the intern dictionary lookup, which reads the
.kind
field.
The .kind
and .interned
fields are bitfields in the same word, so this is a race, and I can't see any synchronization or atomicity that would prevent it.
Perhaps we need to hold the critical section on the intern dictionary longer, until immortalization is complete?
CPython versions tested on:
3.13
Operating systems tested on:
Linux