Skip to content
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit fffe254

Browse files
committedApr 25, 2017
std: Avoid locks during TLS destruction on Windows
Gecko recently had a bug reported [1] with a deadlock in the Rust TLS implementation for Windows. TLS destructors are implemented in a sort of ad-hoc fashion on Windows as it doesn't natively support destructors for TLS keys. To work around this the runtime manages a list of TLS destructors and registers a hook to get run whenever a thread exits. When a thread exits it takes a look at the list and runs all destructors. Unfortunately it turns out that there's a lock which is held when our "at thread exit" callback is run. The callback then attempts to acquire a lock protecting the list of TLS destructors. Elsewhere in the codebase while we hold a lock over the TLS destructors we try to acquire the same lock held first before our special callback is run. And as a result, deadlock! This commit sidesteps the issue with a few small refactorings: * Removed support for destroying a TLS key on Windows. We don't actually ever exercise this as a public-facing API, and it's only used during `lazy_init` during racy situations. To handle that we just synchronize `lazy_init` globally on Windows so we never have to call `destroy`. * With no need to support removal the global synchronized `Vec` was tranformed to a lock-free linked list. With the removal of locks this means that iteration no long requires a lock and as such we won't run into the deadlock problem mentioned above. Note that it's still a general problem that you have to be extra super careful in TLS destructors. For example no code which runs a TLS destructor on Windows can call back into the Windows API to do a dynamic library lookup. Unfortunately I don't know of a great way around that, but this at least fixes the immediate problem that Gecko was seeing which is that with "well behaved" destructors the system would still deadlock! [1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1358151
1 parent c7e724a commit fffe254

File tree

4 files changed

+92
-114
lines changed

4 files changed

+92
-114
lines changed
 

‎src/libstd/sys/unix/thread_local.rs

+5
Original file line numberDiff line numberDiff line change
@@ -38,3 +38,8 @@ pub unsafe fn destroy(key: Key) {
3838
let r = libc::pthread_key_delete(key);
3939
debug_assert_eq!(r, 0);
4040
}
41+
42+
#[inline]
43+
pub fn requires_synchronized_create() -> bool {
44+
false
45+
}

‎src/libstd/sys/windows/c.rs

-1
Original file line numberDiff line numberDiff line change
@@ -935,7 +935,6 @@ extern "system" {
935935
args: *const c_void)
936936
-> DWORD;
937937
pub fn TlsAlloc() -> DWORD;
938-
pub fn TlsFree(dwTlsIndex: DWORD) -> BOOL;
939938
pub fn TlsGetValue(dwTlsIndex: DWORD) -> LPVOID;
940939
pub fn TlsSetValue(dwTlsIndex: DWORD, lpTlsvalue: LPVOID) -> BOOL;
941940
pub fn GetLastError() -> DWORD;

‎src/libstd/sys/windows/thread_local.rs

+66-99
Original file line numberDiff line numberDiff line change
@@ -8,10 +8,11 @@
88
// option. This file may not be copied, modified, or distributed
99
// except according to those terms.
1010

11+
use mem;
1112
use ptr;
13+
use sync::atomic::AtomicPtr;
14+
use sync::atomic::Ordering::SeqCst;
1215
use sys::c;
13-
use sys_common::mutex::Mutex;
14-
use sys_common;
1516

1617
pub type Key = c::DWORD;
1718
pub type Dtor = unsafe extern fn(*mut u8);
@@ -34,8 +35,6 @@ pub type Dtor = unsafe extern fn(*mut u8);
3435
// * All TLS destructors are tracked by *us*, not the windows runtime. This
3536
// means that we have a global list of destructors for each TLS key that
3637
// we know about.
37-
// * When a TLS key is destroyed, we're sure to remove it from the dtor list
38-
// if it's in there.
3938
// * When a thread exits, we run over the entire list and run dtors for all
4039
// non-null keys. This attempts to match Unix semantics in this regard.
4140
//
@@ -50,13 +49,6 @@ pub type Dtor = unsafe extern fn(*mut u8);
5049
// [2]: https://github.com/ChromiumWebApps/chromium/blob/master/base
5150
// /threading/thread_local_storage_win.cc#L42
5251

53-
// NB these are specifically not types from `std::sync` as they currently rely
54-
// on poisoning and this module needs to operate at a lower level than requiring
55-
// the thread infrastructure to be in place (useful on the borders of
56-
// initialization/destruction).
57-
static DTOR_LOCK: Mutex = Mutex::new();
58-
static mut DTORS: *mut Vec<(Key, Dtor)> = ptr::null_mut();
59-
6052
// -------------------------------------------------------------------------
6153
// Native bindings
6254
//
@@ -85,81 +77,64 @@ pub unsafe fn get(key: Key) -> *mut u8 {
8577
}
8678

8779
#[inline]
88-
pub unsafe fn destroy(key: Key) {
89-
if unregister_dtor(key) {
90-
// FIXME: Currently if a key has a destructor associated with it we
91-
// can't actually ever unregister it. If we were to
92-
// unregister it, then any key destruction would have to be
93-
// serialized with respect to actually running destructors.
94-
//
95-
// We want to avoid a race where right before run_dtors runs
96-
// some destructors TlsFree is called. Allowing the call to
97-
// TlsFree would imply that the caller understands that *all
98-
// known threads* are not exiting, which is quite a difficult
99-
// thing to know!
100-
//
101-
// For now we just leak all keys with dtors to "fix" this.
102-
// Note that source [2] above shows precedent for this sort
103-
// of strategy.
104-
} else {
105-
let r = c::TlsFree(key);
106-
debug_assert!(r != 0);
107-
}
80+
pub unsafe fn destroy(_key: Key) {
81+
rtabort!("can't destroy tls keys on windows")
82+
}
83+
84+
#[inline]
85+
pub fn requires_synchronized_create() -> bool {
86+
true
10887
}
10988

11089
// -------------------------------------------------------------------------
11190
// Dtor registration
11291
//
113-
// These functions are associated with registering and unregistering
114-
// destructors. They're pretty simple, they just push onto a vector and scan
115-
// a vector currently.
92+
// Windows has no native support for running destructors so we manage our own
93+
// list of destructors to keep track of how to destroy keys. We then install a
94+
// callback later to get invoked whenever a thread exits, running all
95+
// appropriate destructors.
11696
//
117-
// FIXME: This could probably be at least a little faster with a BTree.
118-
119-
unsafe fn init_dtors() {
120-
if !DTORS.is_null() { return }
97+
// Currently unregistration from this list is not supported. A destructor can be
98+
// registered but cannot be unregistered. There's various simplifying reasons
99+
// for doing this, the big ones being:
100+
//
101+
// 1. Currently we don't even support deallocating TLS keys, so normal operation
102+
// doesn't need to deallocate a destructor.
103+
// 2. There is no point in time where we know we can unregister a destructor
104+
// because it could always be getting run by some remote thread.
105+
//
106+
// Typically processes have a statically known set of TLS keys which is pretty
107+
// small, and we'd want to keep this memory alive for the whole process anyway
108+
// really.
109+
//
110+
// Perhaps one day we can fold the `Box` here into a static allocation,
111+
// expanding the `StaticKey` structure to contain not only a slot for the TLS
112+
// key but also a slot for the destructor queue on windows. An optimization for
113+
// another day!
121114

122-
let dtors = box Vec::<(Key, Dtor)>::new();
115+
static DTORS: AtomicPtr<Node> = AtomicPtr::new(ptr::null_mut());
123116

124-
let res = sys_common::at_exit(move|| {
125-
DTOR_LOCK.lock();
126-
let dtors = DTORS;
127-
DTORS = 1 as *mut _;
128-
Box::from_raw(dtors);
129-
assert!(DTORS as usize == 1); // can't re-init after destructing
130-
DTOR_LOCK.unlock();
131-
});
132-
if res.is_ok() {
133-
DTORS = Box::into_raw(dtors);
134-
} else {
135-
DTORS = 1 as *mut _;
136-
}
117+
struct Node {
118+
dtor: Dtor,
119+
key: Key,
120+
next: *mut Node,
137121
}
138122

139123
unsafe fn register_dtor(key: Key, dtor: Dtor) {
140-
DTOR_LOCK.lock();
141-
init_dtors();
142-
assert!(DTORS as usize != 0);
143-
assert!(DTORS as usize != 1,
144-
"cannot create new TLS keys after the main thread has exited");
145-
(*DTORS).push((key, dtor));
146-
DTOR_LOCK.unlock();
147-
}
124+
let mut node = Box::new(Node {
125+
key: key,
126+
dtor: dtor,
127+
next: ptr::null_mut(),
128+
});
148129

149-
unsafe fn unregister_dtor(key: Key) -> bool {
150-
DTOR_LOCK.lock();
151-
init_dtors();
152-
assert!(DTORS as usize != 0);
153-
assert!(DTORS as usize != 1,
154-
"cannot unregister destructors after the main thread has exited");
155-
let ret = {
156-
let dtors = &mut *DTORS;
157-
let before = dtors.len();
158-
dtors.retain(|&(k, _)| k != key);
159-
dtors.len() != before
160-
};
161-
DTOR_LOCK.unlock();
162-
ret
130+
let mut head = DTORS.load(SeqCst);
131+
loop {
132+
node.next = head;
133+
match DTORS.compare_exchange(head, &mut *node, SeqCst, SeqCst) {
134+
Ok(_) => return mem::forget(node),
135+
Err(cur) => head = cur,
136+
}
137+
}
163138
}
164139

165140
// -------------------------------------------------------------------------
@@ -196,16 +171,12 @@ unsafe fn unregister_dtor(key: Key) -> bool {
196171
// # Ok, what's up with running all these destructors?
197172
//
198173
// This will likely need to be improved over time, but this function
199-
// attempts a "poor man's" destructor callback system. To do this we clone a
200-
// local copy of the dtor list to start out with. This is our fudgy attempt
201-
// to not hold the lock while destructors run and not worry about the list
202-
// changing while we're looking at it.
203-
//
204-
// Once we've got a list of what to run, we iterate over all keys, check
205-
// their values, and then run destructors if the values turn out to be non
206-
// null (setting them to null just beforehand). We do this a few times in a
207-
// loop to basically match Unix semantics. If we don't reach a fixed point
208-
// after a short while then we just inevitably leak something most likely.
174+
// attempts a "poor man's" destructor callback system. Once we've got a list
175+
// of what to run, we iterate over all keys, check their values, and then run
176+
// destructors if the values turn out to be non null (setting them to null just
177+
// beforehand). We do this a few times in a loop to basically match Unix
178+
// semantics. If we don't reach a fixed point after a short while then we just
179+
// inevitably leak something most likely.
209180
//
210181
// # The article mentions weird stuff about "/INCLUDE"?
211182
//
@@ -259,25 +230,21 @@ unsafe extern "system" fn on_tls_callback(h: c::LPVOID,
259230
unsafe fn run_dtors() {
260231
let mut any_run = true;
261232
for _ in 0..5 {
262-
if !any_run { break }
233+
if !any_run {
234+
break
235+
}
263236
any_run = false;
264-
let dtors = {
265-
DTOR_LOCK.lock();
266-
let ret = if DTORS as usize <= 1 {
267-
Vec::new()
268-
} else {
269-
(*DTORS).iter().map(|s| *s).collect()
270-
};
271-
DTOR_LOCK.unlock();
272-
ret
273-
};
274-
for &(key, dtor) in &dtors {
275-
let ptr = c::TlsGetValue(key);
237+
let mut cur = DTORS.load(SeqCst);
238+
while !cur.is_null() {
239+
let ptr = c::TlsGetValue((*cur).key);
240+
276241
if !ptr.is_null() {
277-
c::TlsSetValue(key, ptr::null_mut());
278-
dtor(ptr as *mut _);
242+
c::TlsSetValue((*cur).key, ptr::null_mut());
243+
((*cur).dtor)(ptr as *mut _);
279244
any_run = true;
280245
}
246+
247+
cur = (*cur).next;
281248
}
282249
}
283250
}

‎src/libstd/sys_common/thread_local.rs

+21-14
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,7 @@
6161
use sync::atomic::{self, AtomicUsize, Ordering};
6262

6363
use sys::thread_local as imp;
64+
use sys_common::mutex::Mutex;
6465

6566
/// A type for TLS keys that are statically allocated.
6667
///
@@ -145,20 +146,6 @@ impl StaticKey {
145146
#[inline]
146147
pub unsafe fn set(&self, val: *mut u8) { imp::set(self.key(), val) }
147148

148-
/// Deallocates this OS TLS key.
149-
///
150-
/// This function is unsafe as there is no guarantee that the key is not
151-
/// currently in use by other threads or will not ever be used again.
152-
///
153-
/// Note that this does *not* run the user-provided destructor if one was
154-
/// specified at definition time. Doing so must be done manually.
155-
pub unsafe fn destroy(&self) {
156-
match self.key.swap(0, Ordering::SeqCst) {
157-
0 => {}
158-
n => { imp::destroy(n as imp::Key) }
159-
}
160-
}
161-
162149
#[inline]
163150
unsafe fn key(&self) -> imp::Key {
164151
match self.key.load(Ordering::Relaxed) {
@@ -168,6 +155,26 @@ impl StaticKey {
168155
}
169156

170157
unsafe fn lazy_init(&self) -> usize {
158+
// Currently the Windows implementation of TLS is pretty hairy, and
159+
// it greatly simplifies creation if we just synchronize everything.
160+
//
161+
// Additionally a 0-index of a tls key hasn't been seen on windows, so
162+
// we just simplify the whole branch.
163+
if imp::requires_synchronized_create() {
164+
static INIT_LOCK: Mutex = Mutex::new();
165+
INIT_LOCK.lock();
166+
let mut key = self.key.load(Ordering::SeqCst);
167+
if key != 0 {
168+
key = imp::create(self.dtor) as usize;
169+
if key != 0 {
170+
self.key.store(key, Ordering::SeqCst);
171+
}
172+
}
173+
INIT_LOCK.unlock();
174+
assert!(key != 0);
175+
return key
176+
}
177+
171178
// POSIX allows the key created here to be 0, but the compare_and_swap
172179
// below relies on using 0 as a sentinel value to check who won the
173180
// race to set the shared TLS key. As far as I know, there is no

0 commit comments

Comments
 (0)
Please sign in to comment.