Nix build using flake #162

league · 2022-03-09T22:28:26Z

This is a start on specifying a “flake” for building bifrost with Nix. A flake is a format that pins all dependencies to support reproducible builds. This initial version includes a GitHub workflow that builds against different releases of the nixpkgs tree (containing compilers, Python dependencies, etc.) and different versions of Python, then runs the tests on all those configurations. (This builds on the autoconf branch.) More background: nixos.org, Flakes on Nix wiki, Flakes tutorial from tweag.io

It doesn't yet support GPU builds, though I've gotten it mostly to work. It's a little tricky to integrate because libcuda.so.1 must come from the host platform, so that it agrees with the kernel version and GPU hardware. If the host is itself NixOS it's manageable, but when Nix is used on top of another platform (e.g. Ubuntu) we can only build against stubs, and then sub in the real libcuda later. For similar reasons, auto-detecting the right GPU architecture during the build seems to be problematic. As long as the GPU architecture is an input to the build, it gets hashed into the package signature, but we can't ask what the GPU architecture is “from the inside.” It's the same story as builtins.currentSystem for the overall architecture/OS tag. See also: Some hints about libcuda on Nix

All this should be solvable (and hopefully useful). I'd like to continue to work on it and tweak it here... so marking this as a draft.

codecov-commenter · 2022-03-09T22:31:02Z

Codecov Report

Merging #162 (ced5117) into master (657e705) will not change coverage.
The diff coverage is n/a.

❗ Current head ced5117 differs from pull request most recent head 38bfbec. Consider uploading reports for the commit 38bfbec to get more accurate results

@@           Coverage Diff           @@
##           master     #162   +/-   ##
=======================================
  Coverage   58.54%   58.54%           
=======================================
  Files          67       67           
  Lines        5727     5727           
=======================================
  Hits         3353     3353           
  Misses       2374     2374

Powered by Codecov. Last update 657e705...38bfbec. Read the comment docs.

jaycedowell · 2022-03-10T02:09:30Z

With regard to auto-detecting the GPU arch, would something like building for all available archs be a workable solution? It would take forever to compile, but it should be generic.
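
For illustration only, here is a rough sketch of why the compile time scales with the arch list. It expands a list such as ["70", "75"] into nvcc's `-gencode` options, one per architecture. The helper name `gencode_flags` is hypothetical, and this is not necessarily how bifrost's configure script assembles its flags.

```python
def gencode_flags(archs):
    """Expand GPU arch numbers into nvcc -gencode options (illustrative helper)."""
    flags = []
    for arch in archs:
        # One option per architecture: nvcc generates device code for each,
        # so every extra entry adds compile work.
        flags.append(f"-gencode arch=compute_{arch},code=sm_{arch}")
    return flags


if __name__ == "__main__":
    print(" ".join(gencode_flags(["70", "75"])))
```

Building "all available archs" would simply mean passing a longer list, at the cost of a proportionally longer device-code build.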

league · 2022-03-11T01:06:23Z

I'd like to find a way for telemetry to quietly disable itself if it turns out that $HOME doesn't exist or is not writable. As it stands, the error below prevents me from ever doing import bifrost in a Nix build environment.

bifrost-doc>   File "/nix/store/mdvjrgc0m3lhcz75hvcpz9lgnis69wbv-python3-3.9.6-env/lib/python3.9/site-packages/bifrost/telemetry.py", line 52, in <module>
bifrost-doc>     os.mkdir(os.path.join(os.path.expanduser('~'), '.bifrost'))
bifrost-doc> FileNotFoundError: [Errno 2] No such file or directory: '/homeless-shelter/.bifrost'

I was toying with generating the documentation from nix (and maybe restoring the auto-update of gh-pages as a workflow action). To generate the python API docs, it needs a python from which it can import bifrost... which can be arranged but the telemetry bombs in that environment.

I guess that whole preamble could be wrapped in a try/except, and on FileNotFoundError set TELEMETRY_ACTIVE = False and _INSTALL_KEY to a fresh uuid. It would still fail if somebody called enable or disable because you can't unlink or write to _ACTIVE_KEY, but that's probably okay… at least you should be able to import telemetry.
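
A minimal sketch of that idea, under some assumptions: the helper name `load_install_key` and the file name `install.key` are made up for illustration and are not taken from bifrost's actual telemetry.py. It catches OSError rather than only FileNotFoundError so that a read-only $HOME (PermissionError) is covered too.

```python
import os
import uuid


def load_install_key(cache_dir):
    """Return (install_key, telemetry_active) without raising on a bad $HOME.

    Hypothetical helper: the name and the "install.key" file are assumptions.
    """
    key_path = os.path.join(cache_dir, "install.key")
    try:
        # os.mkdir rather than os.makedirs: never try to create $HOME itself.
        if not os.path.isdir(cache_dir):
            os.mkdir(cache_dir)
        try:
            with open(key_path) as fh:
                key = fh.read().strip()
        except FileNotFoundError:
            # First run: generate a key and persist it.
            key = str(uuid.uuid4())
            with open(key_path, "w") as fh:
                fh.write(key)
        return key, True
    except OSError:
        # $HOME is missing (e.g. /homeless-shelter) or not writable:
        # use a throwaway key and leave telemetry switched off.
        return str(uuid.uuid4()), False


# The module preamble could then read (assumed layout):
#   _INSTALL_KEY, TELEMETRY_ACTIVE = load_install_key(
#       os.path.join(os.path.expanduser("~"), ".bifrost"))
```

With something like this, importing the module would succeed in the sandbox, while enable and disable would still fail there, as noted above.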

jaycedowell · 2022-03-22T15:29:16Z

Now that #157 has been merged we should probably close this PR and open a new one that targets master.

league · 2022-03-23T18:40:53Z

Yep. I may just rebase/squash onto new master... the timeline is pretty confusing with occasional merge commits from autoconf... but ultimately it should just add 3 files.

jaycedowell · 2022-03-28T13:42:30Z

I think I will try out the nix thing once the new server gets set up.

league · 2022-03-28T16:40:24Z

I may merge the nix stuff soon, but I feel like it deserves a small section of the README or manual... maybe before next release. I'll add a CHANGELOG entry as a placeholder. Next time you're on qblocks, you might try a quick nix setup like this:

# Install nix
  wget https://nixos.org/nix/install
  sh install
# Log out and back in, or source the script given at the end of the install to set up current shell
# Optional: install our binary cache, maybe another useful tool or so
  nix-env -i cachix ripgrep
  cachix use bifrost
# Some basic nix config: (unfree for nvidia)
  mkdir -p ~/.config/{nix,nixpkgs}
  echo "experimental-features = nix-command flakes" > ~/.config/nix/nix.conf
  echo "{ allowUnfree = true; }" > ~/.config/nixpkgs/config.nix

Even before a git clone, you should be able to do stuff like the following. (If it needs to configure and build it will, but if you hit exactly the configuration that's in cachix, it will just download. Cachix is usually populated by the CI.)

Run ctypesgen tool (after merge, won't need /nix-flake branch specifier)

$ nix run github:ledatelescope/bifrost/nix-flake#ctypesgen-py3 --
ERROR: No header files specified

Load a python with basic (CPU-only) bifrost

$ nix run github:ledatelescope/bifrost/nix-flake#python3-bifrost  
Python 3.9.6 (default, Jun 28 2021, 08:57:49) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bifrost
>>> bifrost.version.__version__
'0.10.0'

Same, but python 3.8:

$ nix run github:ledatelescope/bifrost/nix-flake#python38-bifrost
Python 3.8.12 (default, Aug 30 2021, 16:42:10) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bifrost

Check on its config (note the double dash needed to separate the nix run options from the python options).

$ nix run github:ledatelescope/bifrost/nix-flake#python3-bifrost -- -m bifrost.version --config
bifrost 0.10.0
Copyright (c) 2016-2020, The Bifrost Authors. All rights reserved.
Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
License: BSD 3-Clause

Configuration:
 Memory alignment: 4096 B
 OpenMP support: yes
 NUMA support no
 Hardware locality support: no
 Mellanox messaging accelerator (VMA) support: no
 Logging directory: /dev/shm/bifrost
 Debugging: no
 CUDA support: no

The debug-enabled version is pre-configured in the flake, just named python3-bifrost-debug:

$ nix run github:ledatelescope/bifrost/nix-flake#python3-bifrost-debug -- -m bifrost.version --config
bifrost 0.10.0
[etc...]
 Logging directory: /dev/shm/bifrost
 Debugging: yes
 CUDA support: no

CUDA versions are available too; it's just that they have to download the pinned CUDA toolkit from NVIDIA (they won't use the one from the underlying system), and that can take a surprising amount of time. These probably won't be pre-built in the cache because CUDA versions aren't built by CI.
```
# Make sure we can find libcuda.so.1:
$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:
$ nix run github:ledatelescope/bifrost/nix-flake#python3-bifrost-cuda11 -- -m bifrost.version --config
```
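
If the CUDA build cannot find the driver library, a quick check like the following may help. It is only a sketch: the helper name `can_load` is hypothetical, and it simply asks the dynamic loader (via Python's standard ctypes module) whether libcuda.so.1 can be opened in the current environment.

```python
import ctypes


def can_load(library_name="libcuda.so.1"):
    """Return True if the dynamic loader can open library_name."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is not found on the loader's search path.
        return False
    return True


if __name__ == "__main__":
    print("libcuda.so.1 loadable:", can_load())
```

LD_LIBRARY_PATH is read when the process starts, so the check needs to run in an interpreter launched from the shell where the variable was exported.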

The flake provides an overlay for nixpkgs that adds ctypesgen and a configurable bifrost package that can be overridden with various versions of python3 and cudatoolkit (or without them), and can enable debugging or not. The GitHub workflow uses nix to build a few variations, run non-GPU tests, and update documentation in the gh-pages branch. Builds are cached by cachix; upon merge that part may need some keys loaded onto the ledatelescope repo. Maybe needs a blurb in the README before next release.

That technique pulled in cudatoolkit too early, even when not building with it (and so it failed on darwin).

league mentioned this pull request Mar 14, 2022

Support python -m bifrost.version command #163

Merged

league force-pushed the nix-flake branch from 5fb66f1 to b3cfd36 Compare March 23, 2022 21:59

league changed the base branch from autoconf to master March 23, 2022 22:01

league marked this pull request as ready for review March 26, 2022 00:57

league added 9 commits March 28, 2022 12:45

nix: Simplify default GPU architecture specs

89f28fa

Just going with ["70" "75"] in default CUDA builds for now. Always possible to override the `gpuArchs` argument.

nix: Anticipate some naming deprecations in latest nix versions

1ed3858

overlay → overlays.default; devShell.ARCH → devShells.ARCH.default

nix: Remove a few unnecessary let bindings

479aabd

nix: Drop back to ctypes-1.0.2 instead of master

f93fdbc

Better for consistency with non-nix build.

nix: Oops, linuxism in library name

b38fc6c

nix: Remove seds for CLEAR_LINE and \r

f938259

Following merge of PR lwa-project#169

nix: Add comment about gh-pages deploy in nix.yml

eae62fa

league force-pushed the nix-flake branch from 66dd6a0 to eae62fa Compare March 28, 2022 16:53

league added 2 commits March 28, 2022 12:59

Changelog entry mentioning nix build

ced5117

Redeploy gh-pages on updates to master branch.

38bfbec

league merged commit 7790fdb into lwa-project:master Mar 29, 2022

league deleted the nix-flake branch March 29, 2022 18:02
