Unexpected `sleep` function behavior in relation to main thread activity in multithreading context #50643
`sleep` has no upper bound on timeout (unless you are running a realtime kernel on custom hardware)
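The point about bounds can be observed with a few repeated measurements. The following is a minimal sketch, not taken from the thread; the names `requested` and `elapsed` are illustrative assumptions. It only shows the spread of `@elapsed sleep(x)` on an otherwise idle process, which is the ordinary variability and not the effect reported in this issue.

```julia
# Requested sleep duration in seconds (illustrative value).
requested = 0.05

# Time several consecutive sleeps; each entry is wall-clock seconds.
elapsed = [@elapsed(sleep(requested)) for _ in 1:5]

println("requested: ", requested, " s")
println("shortest:  ", round(minimum(elapsed); digits = 4), " s")
println("longest:   ", round(maximum(elapsed); digits = 4), " s")
```

On an idle machine the values usually sit slightly above the requested duration; nothing in the API promises how far above they can go.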
We can agree on that, but the mere absence of an upper bound on the timeout doesn't seem to explain the behavior pointed out in the MWE I provided. Please note that I am not questioning the equality
From what I am observing, the issue I raised might have nothing to do with the actual duration of
I also acknowledge that your comment might have some technical implications I am unaware of.
P.S. I updated the initial issue body to decrease the chance of confusing this with some naive why doesn't
Julia uses a libuv-based event loop under the hood. Processing certain things like Timers/IO depends on the event loop being run regularly. Looking at
I am unsure why we still have this mechanism instead of allowing any thread to run the libuv event loop.
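One way to probe the dependency described above is to measure how late `sleep` returns on a spawned task while the calling thread is blocked in a call that never yields. This is a hedged sketch and not the MWE from the issue: `sleep_overshoot` and `overshoot_while_blocking` are hypothetical helper names, and the durations are arbitrary. It assumes Julia was started with more than one thread (for example `julia -t 6,3`) and is run as a script, so that the calling thread is the main thread.

```julia
using Base.Threads: @spawn

# Hypothetical helper: how many seconds later than requested did `sleep` return?
function sleep_overshoot(requested::Real)
    t0 = time_ns()
    sleep(requested)
    return (time_ns() - t0) / 1e9 - requested
end

# Spawn the sleeper, then optionally block the calling thread with
# `Libc.systemsleep`, which does not yield to the Julia scheduler.
function overshoot_while_blocking(requested::Real, block_for::Real)
    t = @spawn sleep_overshoot(requested)
    block_for > 0 && Libc.systemsleep(block_for)
    return fetch(t)
end

baseline = overshoot_while_blocking(0.1, 0.0)  # caller stays idle
blocked  = overshoot_while_blocking(0.1, 0.5)  # caller blocked for 0.5 s

println("overshoot, idle caller:    ", round(baseline; digits = 3), " s")
println("overshoot, blocked caller: ", round(blocked; digits = 3), " s")
```

If timers are only serviced when the main thread runs the event loop, the second overshoot would be expected to grow with the blocking time; the actual numbers depend on the Julia version and thread configuration.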
I really hope we can run the
Especially for the
But even if no work is done - spawning lots of tasks (like
But more generally, the interactive tasks are heavily impacted by this if there is some Timer/IO work involved (and it is awkward to always be careful not to run anything on the main thread if responsiveness of the
Let me know if more MWEs are needed in the above context.
On master, with the suggested snippet from #53422 (comment):

julia> function make_io_thread()
           tid = UInt[0]
           # C-callable entry point that the new thread will run
           threadwork = @cfunction function(arg::Ptr{Cvoid})
               # make this thread the one that runs the IO (libuv) event loop
               @ccall jl_set_io_loop_tid((Threads.threadid() - 1)::Int16)::Cvoid
               wait() # spin uv_run as long as needed
               nothing
           end Cvoid (Ptr{Cvoid},)
           err = @ccall uv_thread_create(tid::Ptr{UInt}, threadwork::Ptr{Cvoid}, C_NULL::Ptr{Cvoid})::Cint
           err == 0 || Base.uv_error("uv_thread_create", err)
           # capture the return code so the check below tests the detach call
           err = @ccall uv_thread_detach(tid::Ptr{UInt})::Cint
           err == 0 || Base.uv_error("uv_thread_detach", err)
           # n.b. this does not wait for the thread to start or to take ownership of the event loop
           nothing
       end
make_io_thread()

the MWE from above produces:

[ Info: phase 1: control run - no main thread work
[ Info: worker 1 executing on thread 5
[ Info: sleeper 1 executing on thread 16
sleeper 1: 10.033890 seconds (109.12 M allocations: 6.504 GiB, 2.50% gc time)
worker 1: 10.213286 seconds (110.98 M allocations: 6.615 GiB, 2.53% gc time, 0.13% compilation time)
****************************************
[ Info: phase 2: main thread work without yield
[ Info: worker 2 executing on thread 5
[ Info: sleeper 2 executing on thread 15
[ Info: mainspin executing on thread 1
worker 2: 10.012252 seconds (119.87 M allocations: 7.145 GiB, 3.50% gc time, 0.13% compilation time)
sleeper 2: 10.024603 seconds (119.98 M allocations: 7.151 GiB, 3.51% gc time)
****************************************
[ Info: phase 3: main thread work with yield
[ Info: worker 3 executing on thread 2
[ Info: sleeper 3 executing on thread 12
[ Info: mainspin executing on thread 1
sleeper 3: 10.015027 seconds (113.49 M allocations: 6.764 GiB, 3.67% gc time)
worker 3: 10.025508 seconds (113.62 M allocations: 6.772 GiB, 3.67% gc time, 0.17% compilation time)
Just passing by, but is this a duplicate of #43952?
Yes, marking as duplicate of #43952
It seems that the `sleep` function behavior when used in tasks running on different threads is heavily impacted by work taking place on the main thread.

Given the documentation of the `sleep` function, you would expect that `@time sleep(1)` will result in approximately `1-second` being reported by the `@time` evaluator. In the same way, you would expect that the following will take about 10 seconds in total:

However, the above expectation is violated (in a manner not explainable by the expected/usual variability in the `sleep` and/or `@time` behavior) when `sleep` is called from tasks running on different threads and either of the following two conditions is met:

- `Libc.systemsleep(n)` is called on the main thread (showcased in my MWE)
- `sleep` is called concurrently (increasing the number of available threads makes the results worse).

To give a clear example:
Output like this would look normal:

However, the function `sleeper(1, 10, 1)`, running in tasks spawned on threads other than main, can output values like the one below (when the main thread is busy):

Or if you make the main thread really busy for a longer period of time:
And this is not about some weird print to `stdout` competition between tasks (we also have a control function, `realworker`, not using `sleep`, to confirm my statement).

All this might make more sense if you run the MWE provided below.
Important note: the current issue is to be interpreted in the context of Julia being started with multiple threads. For consistency, let's assume `julia -t 6,3 script.jl`.

Main MWE follows:

Output with `julia -t 6,3 script.jl`:

It would be speculative on my part to derive hard conclusions about the root cause of this issue. There is a discourse topic that I opened, and some users are proposing hypotheses there - feel free to check it out.
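As background for the `julia -t 6,3` invocation assumed in this issue: the two numbers set the sizes of the `:default` and `:interactive` threadpools respectively. The following is a minimal sketch for checking the configuration a script actually runs with, assuming Julia 1.9 or later (where threadpools are available); the variable names are illustrative and not part of the MWE.

```julia
using Base.Threads: @spawn, nthreads, threadpool

# Sizes of the two pools, as configured by `-t N,M`.
println("default threads:     ", nthreads(:default))
println("interactive threads: ", nthreads(:interactive))

# Ask a plain `@spawn`ed task which pool its thread belongs to.
default_pool = fetch(@spawn threadpool())
println("@spawn ran in pool:              ", default_pool)

# Ask a task spawned with the `:interactive` hint the same question.
# Only attempted when interactive threads were actually requested.
if nthreads(:interactive) > 0
    interactive_pool = fetch(@spawn :interactive threadpool())
    println("@spawn :interactive ran in pool: ", interactive_pool)
end
```

Printing the pool from inside each task of an MWE is a cheap way to confirm that sleepers and workers really run off the main thread.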
The underlying cause of the above issue seems to extend to various packages (for example, the responsiveness of non-blocking listen from `HTTP.jl` is highly dependent on the main thread not doing work - and that is not caused by direct `sleep` usage). It is essential to mention that `HTTP.serve!/HTTP.listen!` is spawned to the `:interactive` threadpool.

Libc.systemsleep MWE version

The `Libc.systemsleep` version of the above MWE

Output: