Windows 2022 Runner - Cuda 12.5.x installer hangs #382

Open
heathhenley opened this issue Feb 27, 2025 · 7 comments

@heathhenley

Recently, CUDA installs started to hang on the GitHub Windows 2022 runner (or they take so long that they hit our 1.5-hour timeout; I haven't tried without a timeout yet).

Everything was working until as recently as yesterday, so I suspect it's an update to the runner and not cuda-toolkit (we've had the version pinned).

I'm hoping you might be able to point me in the right direction to start figuring out what's going on. Maybe there's a way to enable logging, to see whether there's an error or other output on the runner that would give us a clue?

There's a run in progress here: https://github.com/heathhenley/test_windows_action_cuda_install/actions/runs/13573647553. As far as I can tell, it's just hanging on the installer call. Here's a minimal action to reproduce:

name: Install CUDA Windows
on:
  push:
    branches:
      - 'main'
      - 'master'
jobs:
  install-cuda:
    runs-on: windows-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
      - name: Install cuda-toolkit
        uses: Jimver/cuda-toolkit@<pinned-version>
        with:
          cuda: '12.5.0'

Maybe the answer is to open an issue on the runner image repo, but given that it's hanging (or taking significantly longer) with this action, I'm hoping you might have some tips.

Any help is appreciated, thanks for putting this action out there!
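
One low-risk mitigation while the cause is unknown is a step-level timeout, so that a hung installer fails in minutes instead of holding the job until the 1.5-hour limit mentioned above. The fragment below is a minimal sketch that reuses the install step from the reproduction workflow. `timeout-minutes` is a standard GitHub Actions step key; the 30-minute value and the `<pinned-version>` placeholder are assumptions to adjust.

```yaml
steps:
  - name: Install cuda-toolkit
    # Placeholder tag: substitute the release you have pinned.
    uses: Jimver/cuda-toolkit@<pinned-version>
    # Assumed budget: fail this step if the installer has not finished in 30 minutes.
    timeout-minutes: 30
    with:
      cuda: '12.5.0'
```

This does not fix the hang; it only makes the failure cheaper and quicker to notice.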

@Silverlan

Silverlan commented Feb 28, 2025

Same problem here.
It's just a guess, but I believe it has to do with this issue:
https://forums.developer.nvidia.com/t/stuck-installing-nsight-on-cuda-12-8-0-571-96-windows/323732

When I try to install CUDA on my PC, the installer also hangs during the Nsight Visual Studio installation.

I suspect that a GitHub runner update moved Visual Studio to a newer version (the problem doesn't seem to occur with older versions).
If that's the case, a possible fix/workaround could be to disable the installation of Nsight Visual Studio in the action, but I'm not sure how to do that.

@heathhenley

heathhenley commented Feb 28, 2025

Good find, that definitely seems like the culprit.

This script posted on another issue seems to install CUDA without problems; maybe it will help someone: #253 (comment)

@N-Storm

N-Storm commented Mar 3, 2025

Issue that addresses this bug: microsoft/STL#5282

Quote from it:

(Update: This is a VS bug, DevCom-10841757, expected to be fixed in VS 2022 17.14 Preview 2.)

It's not only 12.5.0 that is affected; versions up to 12.8.0 are affected as well.

@andrey-khropov

> Same problem here. It's just a guess, but I believe it has to do with this issue: https://forums.developer.nvidia.com/t/stuck-installing-nsight-on-cuda-12-8-0-571-96-windows/323732
>
> When trying to install CUDA on my PC, the installer also hangs during the NSight Visual Studio installation.
>
> I suspect that there was a GitHub runner update that has updated Visual Studio to a newer version (since the problem doesn't seem to occur with older versions).

Yes. Visual Studio 2022 has been updated to version 17.13.x.

> If that's the case, a possible fix/workaround could be to disable the installation of NSight Visual Studio in the action,

NVIDIA answered that you should disable two components during installation:

  • "Nsight VSE"
  • "Visual Studio Integration"

https://forums.developer.nvidia.com/t/nvidia-nsight-visual-studio-edition-2025-1-0-25002-setup-stuck-on-configuring-visual-studio-2022-settings-for-nsight-visual-studio-edition/323713/5

> but I'm not sure how to do that.

This action has a sub-packages parameter that lets you install only selected sub-packages.

I can confirm that the installation works with sub-packages set to '["nvcc","cudart","thrust"]'

(catboost/catboost@147855b)
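
As a rough sketch of this workaround, the reproduction workflow from the top of the thread could pass `sub-packages` so that the Nsight VSE and Visual Studio Integration components are never installed. The `<pinned-version>` tag is a placeholder rather than a real release tag, and the sub-package list is simply the one reported to work above; other projects may need more components.

```yaml
name: Install CUDA Windows (selected sub-packages)
on:
  push:
    branches:
      - 'main'
jobs:
  install-cuda:
    runs-on: windows-latest
    steps:
      - name: Install only the CUDA components the build needs
        # Placeholder tag: substitute the release you have pinned.
        uses: Jimver/cuda-toolkit@<pinned-version>
        with:
          cuda: '12.5.0'
          # The value is a JSON array inside a single-quoted YAML string.
          # Only the listed components are installed, so the Visual Studio
          # related components that hang are skipped.
          sub-packages: '["nvcc", "cudart", "thrust"]'
```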

@N-Storm

N-Storm commented Mar 3, 2025

> I can confirm that the installation works with subpackages '["nvcc","cudart","thrust"]'

Note that this way the installer doesn't add $CUDA_PATH/bin to the $PATH environment variable. So if your build system doesn't discover the path to nvcc on its own, simply calling nvcc won't work. Either add the directory to PATH or call nvcc by its full path.

@andrey-khropov

> I can confirm that the installation works with subpackages '["nvcc","cudart","thrust"]'

> Note that this way it doesn't add $CUDA_PATH/bin to the $PATH env. So if your build system doesn't discover the path to nvcc, simply calling nvcc won't work. Either add it to PATH or call it by full path.

But it sets

    CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
    CUDA_PATH_V11_8: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
    CUDA_PATH_VX_Y: CUDA_PATH_V11_8

so you can add it to $PATH yourself if necessary.

@N-Storm

N-Storm commented Mar 3, 2025

@andrey-khropov, yes, that's exactly what I meant. Just a reminder that you have to add it on your own, because previously the visual_studio_integration package (installed by default) added $CUDA_PATH/bin on install.
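
A minimal sketch of adding the directory from the workflow, assuming the install step has already exported `CUDA_PATH` as shown in the values quoted above. The step names are illustrative; `GITHUB_PATH` is the standard GitHub Actions file for extending `PATH` in later steps.

```yaml
steps:
  # Place these after the cuda-toolkit install step.
  - name: Add the CUDA bin directory to PATH
    shell: pwsh
    run: |
      # Each line appended to the file named by GITHUB_PATH is prepended
      # to PATH for all later steps in this job (not for this step).
      Join-Path $env:CUDA_PATH 'bin' | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
  - name: Check that nvcc is now discoverable
    shell: pwsh
    run: nvcc --version
```

The check is deliberately a separate step, because the updated `PATH` only takes effect in steps that start after the write.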

N-Storm added a commit to N-Storm/mfaktc that referenced this issue Mar 3, 2025
This commit provides a workaround for these issues:
microsoft/STL#5282
Jimver/cuda-toolkit#382

And fixes builds for older CUDA versions 8.x - 10.x as well.
N-Storm added a commit to N-Storm/mfaktc that referenced this issue Mar 3, 2025
This commit marks version 0.23.2 (see primesearch#14).

* Fix incorrect `cuda_version` reference and order of CUDA version / CC
  columns in the table generated in the release notes.
* Fix `mfaktc.ini` copy on Windows CI/CD builds:
  GitHub Actions runs an unusual environment for Windows builds,
  where GNU Make has `SHELL` set to PowerShell (otherwise, GNU Make
  from MSYS2 would try to launch the Bash shell, which fails to build
  without heavy modifications to the Makefile). For some reason, Make
  fails to spawn the `copy` command, probably because it's just a
  shortcut for the `Copy-Item` command.
  Anyway, simply copying this file to the upper directory before
  launching Make fixes this issue, as Make will skip this step once
  it has already been completed. This way, we don't have to patch
  `Makefile.win` or come up with other workarounds.
* Changes suggested by @tdulcet on primesearch#14 added. Thanks!
  These include:
  * Set action `fail-fast` to `false`. This allows other jobs to
    continue if one fails.
  * Commented out CUDA versions to leave only the highest `.patch`
    version per `major.minor` version.
  * "Code quality" improvements to the workflow & helper script.
* Fix broken CUDA installation with newer MSVC versions.
  This provides a workaround for these issues:
  microsoft/STL#5282
  Jimver/cuda-toolkit#382
  A recent update to GitHub Actions' Windows runner images updated
  the MSVC version, causing the CUDA Toolkit installer to hang during
  the VS integration component installation. A workaround was added
  to install only the `nvcc` and `cudart` components.
  Additionally, the CUDA binaries directory must be added to `PATH`
  from the workflow since it was originally set by the broken
  component installation.
* Added CUDA 8.x / 9.x / 10.x Windows builds.
  There is an option to run the MSVC 14.0 (2015) build environment,
  which is installed as a component alongside MSVC 2019 on the
  GitHub Actions runner.
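
The `fail-fast` change and the sub-package workaround described in this commit message can be sketched as a job matrix. This is an illustration only, not the mfaktc workflow itself: the job name, the version list, and the `<pinned-version>` tag are hypothetical.

```yaml
jobs:
  build-windows:
    runs-on: windows-latest
    strategy:
      # Let the remaining matrix jobs finish even if one CUDA version fails.
      fail-fast: false
      matrix:
        # Hypothetical list: one entry per major.minor release line.
        cuda: ['12.5.0', '12.8.0']
    steps:
      - uses: actions/checkout@v4
      - name: Install CUDA ${{ matrix.cuda }}
        # Placeholder tag: substitute the release you have pinned.
        uses: Jimver/cuda-toolkit@<pinned-version>
        with:
          cuda: ${{ matrix.cuda }}
          # Install only nvcc and cudart, skipping the VS integration component.
          sub-packages: '["nvcc", "cudart"]'
```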