__lsan_do_recoverable_leak_check with concurrent work #757
I managed to write a repro. Compile with:

clang++ -fsanitize=address -o test test.cc --std=c++11

Within 15 seconds or so on my machine, this produces the following leak report:
Reproed with clang version 4.0.0 (trunk 288545).
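The test.cc source and the leak report themselves aren't preserved above. A minimal sketch of what such a repro might look like, assuming (per the discussion below) a heap allocation stored in pthread thread-specific data with free() registered as the key destructor, and the recoverable leak check running concurrently with short-lived threads:

```cpp
// Hedged reconstruction -- this is not the original test.cc from the report.
// Build as described above: clang++ -fsanitize=address -o test test.cc --std=c++11
#include <pthread.h>
#include <sanitizer/lsan_interface.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_key_t key;

static void* Worker(void*) {
  // The allocation is stored in thread-specific data (TSD); free() was
  // registered as the key destructor, so it is released at thread exit
  // and should not be a real leak.
  pthread_setspecific(key, malloc(1337));
  return nullptr;
}

int main() {
  pthread_key_create(&key, free);
  // Repeatedly create short-lived threads while running the recoverable
  // leak check concurrently; the original report showed up within ~15s.
  for (int i = 0; i < 1000000; ++i) {
    pthread_t t;
    pthread_create(&t, nullptr, Worker, nullptr);
    if (__lsan_do_recoverable_leak_check() != 0)
      fprintf(stderr, "iteration %d: leak reported\n", i);
    pthread_join(t, nullptr);
  }
  return 0;
}
```

If the sketch is faithful, the intermittent report falls somewhere in a narrow window around thread creation or teardown, which matches the "race of some kind" discussed further down.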
reproduced on my machine.... Hm... isn't that an actual leak? I am not aware of any such problems with recoverable_leak_check. Note that this test stores the pointer in TSD, not TLS (not that it should matter though). (I may not be able to investigate deeper before Jan)
Right... where is the free()?
This looks like a false negative.
Oh, sorry, I completely missed the destructor.
Right, the destructor function is set to 'free' by pthread_key_create(). And in fact, when the recoverable leak check is run again in the same process, it doesn't detect any leaks. So it's definitely a matter of a race of some kind, but I haven't dug into the pthreads implementation yet to see why.
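As an aside (not code from this thread), that observation suggests a cheap way to tell a transient report from a persistent one: simply re-run the recoverable check, since __lsan_do_recoverable_leak_check() returns nonzero when it reported leaks. A minimal sketch:

```cpp
#include <sanitizer/lsan_interface.h>

// Sketch: treat a report as a real leak only if it survives a second pass.
// In the race described above, the first check reports the TSD allocation,
// but an immediate re-run in the same process comes back clean.
bool LeakPersistsAcrossTwoChecks() {
  if (__lsan_do_recoverable_leak_check() == 0)
    return false;                                   // nothing reported at all
  return __lsan_do_recoverable_leak_check() != 0;   // still reported on re-run?
}
```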
Interestingly, I sometimes get another leak report:
... and even if I make the thread body empty.
Was this ever resolved?
I don't think so, we're still seeing it.
LeakSanitizer will report a leak when allocating a string in SuperviseThread. It's unclear why this is the case, but upon inspecting the code, it seems like a false positive. The stack trace is as follows:

=================================================================
==93677==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 58 byte(s) in 1 object(s) allocated from:
    #0 0x5318c8 in operator new(unsigned long) /data/8/awong/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/asan/asan_new_delete.cc:92
    #1 0x3ae3a9c3c8 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (/usr/lib64/libstdc++.so.6+0x3ae3a9c3c8)
    #2 0x3ae3a9d19a in std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) (/usr/lib64/libstdc++.so.6+0x3ae3a9d19a)
    #3 0x3ae3a9d5eb in std::string::reserve(unsigned long) (/usr/lib64/libstdc++.so.6+0x3ae3a9d5eb)
    #4 0x3ae3a9d770 in std::string::append(unsigned long, char) (/usr/lib64/libstdc++.so.6+0x3ae3a9d770)
    #5 0x7f518f799c12 in strings::SubstituteAndAppend(std::string*, StringPiece, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&) ../src/kudu/gutil/strings/substitute.cc:110:3
    #6 0x536e76 in strings::Substitute(StringPiece, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&, strings::internal::SubstituteArg const&) ../src/kudu/gutil/strings/substitute.h:188:3
    #7 0x7f5190590860 in kudu::Thread::SuperviseThread(void*) ../src/kudu/util/thread.cc:607:17
    #8 0x3ae0e079d0 in start_thread (/lib64/libpthread.so.0+0x3ae0e079d0)
    #9 0x3ae0ae88fc in clone (/lib64/libc.so.6+0x3ae0ae88fc)

This appears to be affecting a number of tests, but generally only lines #0 and #1 are present in the logs, making them difficult to debug (a developer would have to rerun the test with specific ASAN_OPTIONS to unwind the stack trace further). Namely, exactly_once_writes-itest (KUDU-2517), kudu-admin-test (KUDU-2583), and rebalancer-tool-test (untracked via Jira) all show the top of the above stack trace, and based on the full stack trace, it seems these are all false positives. The presence of issues like google/sanitizers#757 confirms that LeakSanitizer can report false positives in workloads with high concurrency.

Generally, the test binary will return an error in the face of real leaks, but in tests like the ones mentioned, the test may log messages reporting leaks without actually returning an error because the "leak" was transient (e.g. see GenericServiceImpl::CheckLeaks). We currently inject errors into the JUnit XML report if any leaks are reported, even for false positives, since the leak messages still find their way into the logs. This patch updates this to only inject these errors if the test also returned an error. For clarity, I also threw in a log statement to GenericServiceImpl::CheckLeaks denoting false positives.
Change-Id: I1f199795c48bd9b6106110aae132ec165eb0f647
Reviewed-on: http://gerrit.cloudera.org:8080/11886
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <[email protected]>
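The patch itself isn't shown in this thread. As an illustrative sketch (the function and variable names are assumptions, not the actual Kudu tooling), the reporting policy it describes, where a leak message alone no longer fails the test unless the binary also exited with an error, reduces to something like:

```cpp
// Illustrative sketch of the policy described above; not the real Kudu code.
// A LeakSanitizer message in the test log is only turned into a JUnit error
// if the test process also returned an error, so transient (presumed false
// positive) reports no longer fail otherwise-passing tests.
bool ShouldInjectJUnitError(bool leak_message_in_log, int test_exit_status) {
  return leak_message_in_log && test_exit_status != 0;
}
```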
I'm seeing what looks like a false positive in my application when running the recoverable leak check concurrently with a bunch of other threads doing work.
In particular, it seems to be detecting a leak of an allocation that is getting stored in a thread-local variable. But it's not detecting it reliably (only a couple percent of the time in one particular stress test, and this is a code path that's used everywhere).
I'm having issues creating a standalone repro here, but wanted to check: is it expected that the recoverable leak check should be safe to run concurrently at all times? Are there any known LLVM optimizations that will interfere with it? For example, is it possible that LLVM would ever store an address "past the end of the stack" before decrementing the stack pointer accordingly, and thus have a moment in time when a new allocation may be neither in a register nor on the 'live' part of the stack? We do run these particular tests with -O1.