[Core][experimental] Failure detection doesn't happen properly when the worker fails with RuntimeError #42441
Labels
bug: Something that is supposed to be working, but isn't
compiled-graphs
core: Issues that should be addressed in Ray Core
core-worker
P1: Issue that should be fixed within a few weeks
size-small
stability
What happened + What you expected to happen
From OSS vllm, remove
The execute_method call fails on the worker with a RuntimeError, but the driver hangs instead of reporting the failure.
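
For illustration only, here is a minimal standard-library sketch of the behavior I would expect: an exception raised on the worker side is re-raised in the driver instead of leaving it blocked. This is not Ray or vLLM code. `execute_method` below is a hypothetical stand-in that just raises, and `run_on_worker` and `timeout_s` are names made up for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_method(should_fail: bool) -> str:
    """Hypothetical stand-in for the worker-side method (not vLLM's code)."""
    if should_fail:
        raise RuntimeError("simulated worker failure")
    return "ok"


def run_on_worker(should_fail: bool, timeout_s: float = 5.0) -> str:
    """Driver side: wait for the worker and surface any failure."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(execute_method, should_fail)
        try:
            # result() re-raises the worker's exception in the driver;
            # the timeout bounds the wait if no result ever arrives.
            return future.result(timeout=timeout_s)
        except RuntimeError as exc:
            return f"worker failed: {exc}"


if __name__ == "__main__":
    print(run_on_worker(False))
    print(run_on_worker(True))
```

In this sketch the driver sees the RuntimeError as soon as the worker raises it; the bug reported here is that the equivalent failure never reaches the driver.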
Versions / Dependencies
master
Reproduction script
vllm-project/vllm#2462
Then comment out the code referenced above.
Issue Severity
None