Description
There seem to be some initialisation/finalisation issues when using multiple plugins.
If I run the same simple operation on both the OpenCL and CUDA plugins, in either order, the run on the second device crashes. With the OpenCL CPU device first, I get
$ ./simple-sycl-app cpu cuda
Available SYCL platforms:
  - Intel(R) OpenCL, driver version OpenCL 2.1 LINUX
    - Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz, driver version 2020.9.1.0.18
  - Intel(R) OpenCL HD Graphics, driver version OpenCL 2.1
    - Intel(R) Gen9 HD Graphics NEO, driver version 20.08.15750
  - NVIDIA CUDA, driver version CUDA 10.20
    - Tesla K40c, driver version CUDA 10.20
  - SYCL host platform, driver version 1.2
    - SYCL host device, driver version 1.2
Running on SYCL device Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz, driver version 2020.9.1.0.18
The results are correct!
Running on SYCL device Tesla K40c, driver version CUDA 10.20
Segmentation fault (core dumped)
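The stack trace shows the crash inside the OpenCL plugin (clCreateUserEvent), even though the operation was running on the CUDA device: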
#0 0x00007ffff69f1248 in clCreateUserEvent (context=0x2004170, errcode_ret=0x7fffffffccfc) at /data/user/fwyzard/sycl/build-devel-4c3719d637b/tools/sycl/OpenCL/icd/loader/icd_dispatch.c:2216
#1 0x00007ffff63c1e8a in OclpiEventCreate (context=<optimized out>, ret_event=0x2327e10) at /data/user/fwyzard/sycl/llvm/sycl/plugins/opencl/pi_opencl.cpp:561
#2 0x00007ffff7206a93 in cl::sycl::detail::plugin::call_nocheck<(cl::sycl::detail::PiApiKind)44, _pi_context*, _pi_event**> (this=0x63e9d0) at /data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/detail/pi.def:69
#3 cl::sycl::detail::plugin::call<(cl::sycl::detail::PiApiKind)44, _pi_context*, _pi_event**> (this=0x63e9d0) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/plugin.hpp:63
#4 cl::sycl::detail::Command::prepareEvents (this=0x200bea0, Context=std::shared_ptr<class cl::sycl::detail::context_impl> (use count 9, weak count 0) = {...})
at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/commands.cpp:125
#5 0x00007ffff720808e in cl::sycl::detail::MemCpyCommand::enqueueImp (this=0x200bea0) at /usr/include/c++/8/bits/shared_ptr.h:129
#6 0x00007ffff7205cfd in cl::sycl::detail::Command::enqueue (this=0x200bea0, EnqueueResult=..., Blocking=<optimized out>) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/commands.cpp:196
#7 0x00007ffff7210bc6 in cl::sycl::detail::Scheduler::GraphProcessor::enqueueCommand (Blocking=cl::sycl::detail::NON_BLOCKING, EnqueueResult=..., Cmd=0xa24a00)
at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/graph_processor.cpp:65
#8 cl::sycl::detail::Scheduler::GraphProcessor::enqueueCommand (Cmd=Cmd@entry=0xa24a00, EnqueueResult=..., Blocking=Blocking@entry=cl::sycl::detail::NON_BLOCKING)
at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/graph_processor.cpp:55
#9 0x00007ffff720c056 in cl::sycl::detail::Scheduler::addCG (this=0x7ffff74a4ea0 <cl::sycl::detail::Scheduler::instance>, CommandGroup=std::unique_ptr<class cl::sycl::detail::CG> = {...},
Queue=std::shared_ptr<class cl::sycl::detail::queue_impl> (empty) = {...}) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/scheduler.cpp:72
#10 0x00007ffff7237499 in cl::sycl::handler::finalize (this=this@entry=0x7fffffffd250) at /usr/include/c++/8/bits/move.h:74
#11 0x00007ffff723dedf in cl::sycl::detail::queue_impl::submit_impl(std::function<void (cl::sycl::handler&)> const&, std::shared_ptr<cl::sycl::detail::queue_impl>) (this=0x20095c0, CGF=...,
Self=std::shared_ptr<class cl::sycl::detail::queue_impl> (empty) = {...}) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/queue_impl.hpp:355
#12 0x00007ffff723e92d in cl::sycl::detail::queue_impl::submit(std::function<void (cl::sycl::handler&)> const&, std::shared_ptr<cl::sycl::detail::queue_impl>) (Self=..., CGF=..., this=<optimized out>)
at /usr/include/c++/8/bits/shared_ptr_base.h:754
#13 cl::sycl::queue::submit_impl(std::function<void (cl::sycl::handler&)>) (this=<optimized out>, CGH=...) at /data/user/fwyzard/sycl/llvm/sycl/source/queue.cpp:110
#14 0x0000000000403966 in cl::sycl::queue::submit<main::$_0> (this=0x7fffffffd590, CGF=...) at /data/user/fwyzard/sycl/build-devel-4c3719d637b/lib64/clang/11.0.0/include/CL/sycl/queue.hpp:171
#15 main (argc=<optimized out>, argv=<optimized out>) at simple-sycl-app.cpp:118
And in the opposite order, with the CUDA device first, I get
$ ./simple-sycl-app cuda cpu
Available SYCL platforms:
  - Intel(R) OpenCL, driver version OpenCL 2.1 LINUX
    - Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz, driver version 2020.9.1.0.18
  - Intel(R) OpenCL HD Graphics, driver version OpenCL 2.1
    - Intel(R) Gen9 HD Graphics NEO, driver version 20.08.15750
  - NVIDIA CUDA, driver version CUDA 10.20
    - Tesla K40c, driver version CUDA 10.20
  - SYCL host platform, driver version 1.2
    - SYCL host device, driver version 1.2
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!
Running on SYCL device Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz, driver version 2020.9.1.0.18
pi_die: cuda_piEventSetCallback not implemented
terminate called without an active exception
Aborted (core dumped)
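The stack trace shows the abort coming from the CUDA plugin (cuda_piEventSetCallback), even though the operation was running on the OpenCL CPU device: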
#0 0x00007ffff6c2b8df in raise () from /lib64/libc.so.6
#1 0x00007ffff6c15cf5 in abort () from /lib64/libc.so.6
#2 0x00007ffff7acf06b in __gnu_cxx::__verbose_terminate_handler() [clone .cold.1] () from /lib64/libstdc++.so.6
#3 0x00007ffff7ad54cc in __cxxabiv1::__terminate(void (*)()) () from /lib64/libstdc++.so.6
#4 0x00007ffff7ad5527 in std::terminate() () from /lib64/libstdc++.so.6
#5 0x00007ffff718e94c in cl::sycl::detail::pi::die (Message=0x7ffff61b7810 "cuda_piEventSetCallback not implemented") at /data/user/fwyzard/sycl/llvm/sycl/source/detail/pi.cpp:220
#6 0x00007ffff61b2990 in cuda_piEventSetCallback (event=<optimized out>, command_exec_callback_type=<optimized out>, pfn_notify=<optimized out>, user_data=<optimized out>)
at /data/user/fwyzard/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:2381
#7 0x00007ffff7206b25 in cl::sycl::detail::plugin::call_nocheck<(cl::sycl::detail::PiApiKind)48, _pi_event*, int, void (*)(_pi_event*, int, void*), std::shared_ptr<cl::sycl::detail::event_impl>*> (this=0x790bb0)
at /data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/detail/pi.def:73
#8 cl::sycl::detail::plugin::call<(cl::sycl::detail::PiApiKind)48, _pi_event*, int, void (*)(_pi_event*, int, void*), std::shared_ptr<cl::sycl::detail::event_impl>*> (this=0x790bb0)
at /data/user/fwyzard/sycl/llvm/sycl/source/detail/plugin.hpp:63
#9 cl::sycl::detail::Command::prepareEvents (this=0x146e860, Context=std::shared_ptr<cl::sycl::detail::context_impl> (use count 9, weak count 0) = {...}) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/commands.cpp:129
#10 0x00007ffff720808e in cl::sycl::detail::MemCpyCommand::enqueueImp (this=0x146e860) at /usr/include/c++/8/bits/shared_ptr.h:129
#11 0x00007ffff7205cfd in cl::sycl::detail::Command::enqueue (this=0x146e860, EnqueueResult=..., Blocking=<optimized out>) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/commands.cpp:196
#12 0x00007ffff7210bc6 in cl::sycl::detail::Scheduler::GraphProcessor::enqueueCommand (Blocking=cl::sycl::detail::NON_BLOCKING, EnqueueResult=..., Cmd=0x1494f70)
at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/graph_processor.cpp:65
#13 cl::sycl::detail::Scheduler::GraphProcessor::enqueueCommand (Cmd=Cmd@entry=0x1494f70, EnqueueResult=..., Blocking=Blocking@entry=cl::sycl::detail::NON_BLOCKING)
at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/graph_processor.cpp:55
#14 0x00007ffff720c056 in cl::sycl::detail::Scheduler::addCG (this=0x7ffff74a4ea0 <cl::sycl::detail::Scheduler::instance>, CommandGroup=std::unique_ptr<cl::sycl::detail::CG> = {...},
Queue=std::shared_ptr<cl::sycl::detail::queue_impl> (empty) = {...}) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/scheduler/scheduler.cpp:72
#15 0x00007ffff7237499 in cl::sycl::handler::finalize (this=this@entry=0x7fffffffd250) at /usr/include/c++/8/bits/move.h:74
#16 0x00007ffff723dedf in cl::sycl::detail::queue_impl::submit_impl(std::function<void (cl::sycl::handler&)> const&, std::shared_ptr<cl::sycl::detail::queue_impl>) (this=0x9fde00, CGF=...,
Self=std::shared_ptr<cl::sycl::detail::queue_impl> (empty) = {...}) at /data/user/fwyzard/sycl/llvm/sycl/source/detail/queue_impl.hpp:355
#17 0x00007ffff723e92d in cl::sycl::detail::queue_impl::submit(std::function<void (cl::sycl::handler&)> const&, std::shared_ptr<cl::sycl::detail::queue_impl>) (Self=..., CGF=..., this=<optimized out>)
at /usr/include/c++/8/bits/shared_ptr_base.h:754
#18 cl::sycl::queue::submit_impl(std::function<void (cl::sycl::handler&)>) (this=<optimized out>, CGH=...) at /data/user/fwyzard/sycl/llvm/sycl/source/queue.cpp:110
#19 0x0000000000403966 in cl::sycl::queue::submit<main::$_0> (this=0x7fffffffd590, CGF=...) at /data/user/fwyzard/sycl/build-devel-4c3719d637b/lib64/clang/11.0.0/include/CL/sycl/queue.hpp:171
#20 main (argc=<optimized out>, argv=<optimized out>) at simple-sycl-app.cpp:118