Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI/CD with Github Actions #14

Merged
merged 5 commits into from
Mar 4, 2025
Merged

Conversation

N-Storm
Copy link
Contributor

@N-Storm N-Storm commented Feb 24, 2025

This PR introduces automated builds using GitHub Actions (GA) runners, with support for uploading resulting binaries to the "Releases" section, along with release notes generation. The latter happens when a tag is created, but this can be adjusted to match a specific template or be triggered manually.

Supported configurations:

  • Windows x64: CUDA 11.0 and higher
  • Linux x86_64: CUDA 8.0 and higher

Building 32-bit binaries and supporting older CUDA versions within GA runners doesn’t seem reasonable to me. The time required to implement and test these would likely outweigh any demand. If possible at all—since, for example, Windows GA runners are only available in two flavors (MSVC 2019 & 2022). The currently supported versions should cover any hardware still useful for factoring.

Currently, I’ve enabled one target version per MAJOR.MINOR, selecting the latest .PATCHLEVEL available. In other words, for every X.Y.z, I pick one X.Y with the highest available .z. However, this can be easily configured in the .github/workflows/build.yml by uncommenting other versions. Or adding new if required. I’ve left comments there with the links to available versions.

It’s also possible to add a release flag so that generic builds run for every version listed, but only those marked with this flag get uploaded to the releases section. This is useful if you prefer not to upload too many releases but still want continuous builds to gather feedback on the build process as further changes are made. Builds that aren’t uploaded can still be downloaded by repository members with write permissions under Actions → [Build] → Artifacts, where they remain available for up to 90 days (configurable in the repository settings on the GitHub web interface).

I had to slightly modify Makefile.win to fix path and quoting issues for CUDA. Instead of using an intermediate CUDA_DIR, I switched to using CUDA_PATH directly. This should be safe for manual builds as well, but I haven’t tested it. If this change breaks manual builds or is otherwise unacceptable, it can be moved to a build helper script—modifying a local copy of Makefile.win during the build while keeping the repository version unchanged.

Example releases built using this workflow:

Known Issue:

  • Sometimes, Windows builds may fail early during the CUDA Toolkit installation step with an error message like this:
Warning: Failed to save: Cache service responded with 429 during upload chunk.
D:\a\_actions\Jimver\cuda-toolkit\v0.2.21\dist\index.js:6116
                       throw new Error(`Cache upload failed because file read failed with ${error.message}`);
                       ^

Error: Cache upload failed because file read failed with EBADF: bad file descriptor, read

This appears to happen because the GitHub cache service returns a 429 Too Many Requests status code error due to high load. In such cases, simply manually re-running all failed tasks from the web interface usually resolves the issue. At least that always worked for me on 2nd attempt.

Reducing the number of parallel jobs might help avoid this entirely if the rate limiting is based on GitHub account or repository. And assuming it's not a global denial happening for all users right now. Alternatively, the cache could be bypassed by downloading CUDA directly, or additional automation could be added to retry jobs that fail at this step. For now, we'll see how things develop; it may just be a temporary issue that gets fixed soon.

Feel free to PM me on the forums if you have any questions.

Adding Github Actions workflow file, which does builds on commits.
And with the version tag added, it will upload binary builds the
Github Releases as well.
This changes build workflow in a way, that now it doesn't uses
Github actions provided container support, but launches them from
the steps instead, and runs commands on them via `docker exec`.
This fixes broken Github Actionsd with older Ubuntu versions
used by the NVidia containers. Which are used fome older CUDA
versions.
Action to upload releases are now separate job, which uploads all
files at one step, in a sorted way. And posts a table with all
build info on the release notes.
…steps

Those can be downloaded from build artifacts section of Github. Might come in handy in some cases".
This commit brings automated builds for Windows, but the build script
has signifiant changes in other parts as well. Information gathering
(compiler version, NVCC version, etc) are now located in external
script (.github/workflows/scripts/build_helper.sh) and shares common
functions for both Windows & Linux builds.
So far CUDA version / OS combinations added in the workflow
was tested to build succesfully. Older versions of CUDA currently
fail to build.
Getting Windows builds with older CUDA versions working with Github
Actions wouldn't be any easy, if possible at all. Github Actions
runners currently has images with MS Visual Studio 2019 and 2022.
While older versions of CUDA hardcoded MSVC version requirement,
no flag to ignore it. But so far it builds for CUDA 11.0 with Compute
Capability support from 2.0, and 1.x won't be supported by mfaktc
with next version.
Makefile.win was also sligtly modified to fix paths, and remove
intermediate variable CUDA_DIR (uses CUDA_PATH directly). This should
be a generic fix for better portability, but it was tested only
with the Github Runners builders for now.
@brubsby
Copy link
Collaborator

brubsby commented Feb 24, 2025

First let me start by saying THANK YOU. Your work here is highly appreciated.

Building 32-bit binaries and supporting older CUDA versions within GA runners doesn’t seem reasonable to me. The time required to implement and test these would likely outweigh any demand. If possible at all—since, for example, Windows GA runners are only available in two flavors (MSVC 2019 & 2022). The currently supported versions should cover any hardware still useful for factoring.

Completely agree. We might have someone coming out of the woodwork to grumble about 32 bit versions, but they can implement it if they really want. The 64 bit versions are more than sufficient.

I'm also thinking we might want this PR to be the patch version bump to 0.23.2, if you wouldn't mind changing

#define MFAKTC_VERSION "0.23.1" /* DO NOT CHANGE! */

at params.h:89 to 0.23.2, then the release can be tagged and be the first automated one (which will also include the log corruption fix)

@brubsby brubsby self-requested a review February 24, 2025 16:06
brubsby
brubsby previously approved these changes Feb 24, 2025
@N-Storm
Copy link
Contributor Author

N-Storm commented Feb 24, 2025

I'm also thinking we might want this PR to be the patch version bump to 0.23.2, if you wouldn't mind changing at params.h:89 to 0.23.2, then the release can be tagged and be the first automated one (which will also include the log corruption fix)

Thanks! I'll adjust it right now and update this PR soon.

@brubsby
Copy link
Collaborator

brubsby commented Feb 24, 2025

Also, fwiw, the changes to Makefile.win had no adverse effect on my personal build environment, so we're presumably good on that front as well.

@N-Storm
Copy link
Contributor Author

N-Storm commented Feb 24, 2025

Hmm... how do you think, maybe I should add my patch which replaces deprecated cudaThreadSetCacheConfig() and cudaThreadSynchronize() calls to the supported cudaDeviceSetCacheConfig() / cudaDeviceSynchronize() respectively to include it in the release?
I've posted about this on same forum thread recently. Changes made can be previewed in these commits: e737557, 7b536c6 (2nd are just a fix for a small error I've did, which only affects builds with DEBUG_GPU_MATH build option enabled)
It looks like 0.24.0 incorporates similar changes, nearly the same. I've tested those running a Windows for a few days without any issues now. Didn't noticed any difference with my compute 8.9 device, neither it helped to fix issues on RTX 5090. But doesn't hurts at least and might be of use with some other GPUs as well.

@N-Storm
Copy link
Contributor Author

N-Storm commented Feb 24, 2025

Added patch to replace deprecated API calls as well. Let me know if want it out of this PR / v0.23.2, I'll exclude it and force-push the branch without it.

PS: @brubsby, Your review was automatically dismissed when I've updated PR, so it has to be confirmed again if you accept it.

@N-Storm N-Storm requested a review from brubsby February 24, 2025 16:50
@KarlLudwig3485
Copy link

Seems to work fine on my setups.
I hope that any builds for CUDA <=11.x you make can run on Windows 7, as it seems a few people still use it on their systems. Otherwise I would be more than happy to continue providing builds for legacy/novelty systems (which I will continue to do for Windows XP and similar).

And I'm glad to report that your changes still work with CUDA 5.0 :)

@N-Storm
Copy link
Contributor Author

N-Storm commented Feb 25, 2025

@KarlLudwig3485 by CUDA <= 11.x & CUDA 5.0 do you mean Compute Capability devices actually? It it's like that, then yes, build with the CUDA version 11.0.1 targets devices with Compute Capability 3.5 - 8.0, so even 3.5 should be supported.
If you are talking about actual CUDA version, then I'm confused, because this PR explicitly notes that min CUDA version that can be used to build on Windows is CUDA 11.0.1 actually. Linux targets have much more options on supporting older CUDA versions, and currently it builds with CUDA as low as 8.0, which targets Compute Capability 2.0 - 6.2.
But this PR doesn't changes anything in the codebase. It only adds configuration and scripts for running automated builds via Github Actions feature.

@KarlLudwig3485
Copy link

My apologies, I didn't see that the Windows builds only went as far back as CUDA 11.0. And, given that Visual Studio 2022 still supports targeting Windows 7 by default (as far as I'm aware), it should work just fine.

As for CUDA 5.0, I was referring to the changes regarding cudaThreadSetCacheConfig() and cudaThreadSynchronize() being replaced with cudaDeviceSetCacheConfig() / cudaDeviceSynchronize(). Interestingly the former were already deprecated as early as CUDA 5.0 (I haven't checked any earlier versions yet).

@N-Storm
Copy link
Contributor Author

N-Storm commented Feb 25, 2025

@KarlLudwig3485 got it. Thanks for the feedback! Didn't thought at first that your comment was about that last addition, which I've included so recently. And it's so unrelated to the PR so I wasn't even thinking about it while reading your comment. :)
Yes, those call was declared deprecated pretty long time ago. But as NVidia explains on the deprecation notice: "Note that this function is deprecated because its name does not reflect its behavior. Its functionality is similar to the non-deprecated function cudaDeviceSynchronize(), which should be used instead.". So I guess those functions could be just a renamed or aliased. With the exception of cudaStreamSynchronize(), which work with the CUDA streams.

Copy link
Member

@tdulcet tdulcet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some comments in case they are helpful. Feel free to ignore.

@brubsby
Copy link
Collaborator

brubsby commented Feb 26, 2025

Added patch to replace deprecated API calls as well. Let me know if want it out of this PR / v0.23.2, I'll exclude it and force-push the branch without it.

Sorry for the slow response, since we're making the same change in 0.24, I would lean towards not including it in 0.23.2, just on the off chance that it causes issues with anybody on Linux (as seems to happen sometimes), and doesn't seem to fix anything if I'm understanding correctly. Especially as we're aiming to stabilize 0.23 and move on to developing 0.24.0.

However it does seem low risk, and already in the PR, so I'll defer to your judgement. Worst case we can do a 0.23.3 that removes it if issues crop up, and with the CI/CD now, that's not as annoying as it once was.

brubsby
brubsby previously approved these changes Feb 26, 2025
Copy link
Collaborator

@brubsby brubsby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to make any changes suggested by Teal, and I'll re-review, but I'm okay for this to go in as-is. Might also be easier to cherry pick onto 0.24.0 if this PR is squashed, but I'm okay with non-squashed if you want to preserve the commit history.

N-Storm added a commit to N-Storm/mfaktc that referenced this pull request Mar 3, 2025
N-Storm added a commit to N-Storm/mfaktc that referenced this pull request Mar 3, 2025
N-Storm added a commit to N-Storm/mfaktc that referenced this pull request Mar 3, 2025
This commit marks version 0.23.2 (see primesearch#14).

* Fix incorrect cuda_version reference and order of CUDA version / CC
  columns in table generated on the release notes
* Fix mfaktc.ini copy on Windows CI/CI builds:
 Github Actions running an unusual environment for Windows builds,
 where GNU Make has SHELL set to PowerShell (in other case GNU Make
 from MSYS2 installed will try to launch bash shell which fails to
 build without heavy modifications to the Makefile). And for some
 reason make fails to spawn "copy" command, probably because it's
 just a shortcut from Copy-Item command name. Anyways, simply copying
 this file to upper dir before launching make fixes this issue,
 because Make will skip this step as completed previously. This way
 we don't have to patch Makefile.win or invent other tricks.
* Changes suggested by @tdulcet on primesearch#14 added. Thanks!
  These include:
  * Set action fail-fast to false. This allows other job continue if
    one fails.
  * Commented out CUDA versions to leave only one highest .patch per
    major.minor.
  * "Code quality" improvements to workflow & helper script.
N-Storm added a commit to N-Storm/mfaktc that referenced this pull request Mar 3, 2025
This commit marks version 0.23.2 (see primesearch#14).

* Fix incorrect cuda_version reference and order of CUDA version / CC
  columns in table generated on the release notes
* Fix mfaktc.ini copy on Windows CI/CI builds:
  Github Actions running an unusual environment for Windows builds,
  where GNU Make has SHELL set to PowerShell (in other case GNU Make
  from MSYS2 installed will try to launch bash shell which fails to
  build without heavy modifications to the Makefile). And for some
  reason make fails to spawn "copy" command, probably because it's
  just a shortcut from Copy-Item command name. Anyways, simply copying
  this file to upper dir before launching make fixes this issue,
  because Make will skip this step as completed previously. This way
  we don't have to patch Makefile.win or invent other tricks.
* Changes suggested by @tdulcet on primesearch#14 added. Thanks!
  These include:
  * Set action fail-fast to false. This allows other job continue if
    one fails.
  * Commented out CUDA versions to leave only one highest .patch per
    major.minor.
  * "Code quality" improvements to workflow & helper script.
@N-Storm
Copy link
Contributor Author

N-Storm commented Mar 3, 2025

I was quite busy recently. Finally got time to return to this.
I've updated PR with a new commit. Removed deprecated CUDA calls patch and added most of the changes suggested by @tdulcet. I've squashed some of the latest commits into one, but keeping previous few commits with a history.

@brubsby, you can cherry-pick a range of commits in one run. See this link: https://stackoverflow.com/questions/9229301/git-cherry-pick-says-38c74d-is-a-merge-but-no-m-option-was-given

I'm waiting now for a build to complete on my fork to check for possible issues. Once it's all good and ready, I'll re-request review again and comment it here.

@N-Storm
Copy link
Contributor Author

N-Storm commented Mar 3, 2025

Currently Windows 2022 builds are broken due to Jimver/cuda-toolkit#382 / actions/runner-images#11691
Hopefully it will be resolved soon. If not, workaround is possible by manually unpacking & installing CUDA instead, but getting installer fixed would be preferred solution.

EDIT: Builds with Windows 2019 & MSVC 2019 still works. New Windows 2025 image has the same MSVC version, as 2022. So it's affected by the same issue as well. I have some progress installing CUDA manually, but I think we need to wait for a few days if normal installer will be fixed.
Alternatively I can submit version which builds with Win/MSVC 2019.

@N-Storm
Copy link
Contributor Author

N-Storm commented Mar 3, 2025

Confirmed to be a VS bug here: microsoft/STL#5282

@N-Storm
Copy link
Contributor Author

N-Storm commented Mar 3, 2025

It looks like a very easy workaround exists to install only some of the CUDA components. I've tested and looks like it's working, just need to add CUDA bin dir to PATH additionally (installing without broken "Visual Studio Integration" component which was used to set it). I'll test it a bit more and update PR if everything will be fine.

N-Storm added a commit to N-Storm/mfaktc that referenced this pull request Mar 3, 2025
This commit marks version 0.23.2 (see primesearch#14).

* Fix incorrect `cuda_version` reference and order of CUDA version / CC
  columns in the table generated in the release notes.
* Fix `mfaktc.ini` copy on Windows CI/CD builds:
  GitHub Actions runs an unusual environment for Windows builds,
  where GNU Make has `SHELL` set to PowerShell (otherwise, GNU Make
  from MSYS2 would try to launch the Bash shell, which fails to build
  without heavy modifications to the Makefile). For some reason, Make
  fails to spawn the `copy` command, probably because it's just a
  shortcut for the `Copy-Item` command.
  Anyway, simply copying this file to the upper directory before
  launching Make fixes this issue, as Make will skip this step once
  it has already been completed. This way, we don't have to patch
  `Makefile.win` or come up with other workarounds.
* Changes suggested by @tdulcet on primesearch#14 added. Thanks!
  These include:
  * Set action `fail-fast` to `false`. This allows other jobs to
    continue if one fails.
  * Commented out CUDA versions to leave only the highest `.patch`
    version per `major.minor` version.
  * "Code quality" improvements to the workflow & helper script.
* Fix broken CUDA installation with newer MSVC versions.
  This provides a workaround for these issues:
  microsoft/STL#5282
  Jimver/cuda-toolkit#382
  A recent update to GitHub Actions' Windows runner images updated
  the MSVC version, causing the CUDA Toolkit installer to hang during
  the VS integration component installation. A workaround was added
  to install only the `nvcc` and `cudart` components.
  Additionally, the CUDA binaries directory must be added to `PATH`
  from the workflow since it was originally set by the broken
  component installation.
* Added CUDA 8.x / 9.x / 10.x Windows builds.
  There is an option to run the MSVC 14.0 (2015) build environment,
  which is installed as a component alongside MSVC 2019 on the
  GitHub Actions runner.
This commit marks version 0.23.2 (see primesearch#14).

* Fix incorrect `cuda_version` reference and order of CUDA version / CC
  columns in the table generated in the release notes.
* Fix `mfaktc.ini` copy on Windows CI/CD builds:
  GitHub Actions runs an unusual environment for Windows builds,
  where GNU Make has `SHELL` set to PowerShell (otherwise, GNU Make
  from MSYS2 would try to launch the Bash shell, which fails to build
  without heavy modifications to the Makefile). For some reason, Make
  fails to spawn the `copy` command, probably because it's just a
  shortcut for the `Copy-Item` command.
  Anyway, simply copying this file to the upper directory before
  launching Make fixes this issue, as Make will skip this step once
  it has already been completed. This way, we don't have to patch
  `Makefile.win` or come up with other workarounds.
* Changes suggested by @tdulcet on primesearch#14 added. Thanks!
  These include:
  * Set action `fail-fast` to `false`. This allows other jobs to
    continue if one fails.
  * Commented out CUDA versions to leave only the highest `.patch`
    version per `major.minor` version.
  * "Code quality" improvements to the workflow & helper script.
* Fix broken CUDA installation with newer MSVC versions.
  This provides a workaround for these issues:
  microsoft/STL#5282
  Jimver/cuda-toolkit#382
  A recent update to GitHub Actions' Windows runner images updated
  the MSVC version, causing the CUDA Toolkit installer to hang during
  the VS integration component installation. A workaround was added
  to install only the `nvcc` and `cudart` components.
  Additionally, the CUDA binaries directory must be added to `PATH`
  from the workflow since it was originally set by the broken
  component installation.
* Added CUDA 8.x / 9.x / 10.x Windows builds.
  There is an option to run the MSVC 14.0 (2015) build environment,
  which is installed as a component alongside MSVC 2019 on the
  GitHub Actions runner.
@N-Storm
Copy link
Contributor Author

N-Storm commented Mar 3, 2025

I guess it's all good now, build passes: https://github.com/N-Storm/mfaktc/actions/runs/13642472654

And I've added CUDA 8.x, 9x and 10.x Windows builds as well. I hope now it should be ready to merge.

@N-Storm N-Storm requested a review from brubsby March 3, 2025 23:52
Copy link
Collaborator

@brubsby brubsby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks again for putting all this together.

@brubsby brubsby merged commit ece2613 into primesearch:main Mar 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants