Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add capability to autoinstall grackle #415

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

mabruzzo
Copy link
Contributor

@mabruzzo mabruzzo commented Jun 16, 2024

Pull request summary

This adds the ability to auto-install Grackle

Detailed Description

This depends on #414 and on grackle-project/grackle#182. But I think it's probably worth discussing now.

The main goal is to make installation of Enzo-E more seamless for end-users (or new developers). I want feedback in 3 main areas: (i) how to enable/disable, (ii) whether to use git-submodules or cmake's FetchContent feature, and (iii) how to handle CI.

1. How to enable this feature:

Right now, the proposed out-of-box behavior is to always automatically download and build Grackle during a build.

  • If you want to use an external Grackle build, you need to set -DUSE_EXTERNAL_GRACKLE=ON.
  • If you want to disable Grackle, you need to set -DUSE_GRACKLE=OFF

I think this would provide a nice experience for first-time users. But there are some potential caveats:

  • Is this too much of an inconvenience to developers?
  • I'm not thrilled with needing 2 Grackle options here. The alternative is to support passing keywords to the existing USE_GRACKLE option (maybe "EXTERNAL" to prefer a pre-existing installation and "EMBED" for automatic downloading?). But if we eventually decide to make Grackle a mandatory requirement,1 we would probably just drop the USE_GRACKLE option altogether.

An alternative idea for the default behavior is to first search for an existing Grackle installation and then fall back to downloading Grackle if we can't find it -- we would still provide the option to force one of the choices.

  • With that said, I worry a little that our existing FindGrackle.cmake module might not be robust enough for this (as a reminder, this is a special module that provides heuristic instructions in the general search for Grackle)
  • At the same time, CMake should get better at finding grackle after Cmake Build Part 2: have the install-process create metadata files grackle-project/grackle#204 is merged and we feel comfortable requiring people to use a Grackle-build built with CMake. For context, that PR directly installs a file called GrackleConfig.cmake alongside the grackle libraries. (That file directly provides CMake with information about the installation and is considered to be more robust that the aforementioned find-module).

Do people have thoughts on this? I suspect that my "alternative idea for the default behavior" is probably worth trying.

2. Should we use git-submodules or FetchContent?

I implemented this in terms cmake's FetchContent feature. With that said, it's unclear whether it would be better for us to use git-submodules. I've highlighted the pro's and cons down below:

Advantages of FetchContent:

  • the end-user doesn't need to explicitly initialize the git submodule when checking out the repository. (downloading is handled automatically)
  • relatedly, if the end-user switches between branches that require 2 different (potentially incompatible) grackle-versions, there won't be any issues (you will always use the correct grackle-version). If using git-submodules, I think you would need to manually invoke some git-submodule commands to ensure that you use the proper version. In practice, I doubt this is much of an issue for Grackle.
  • this is the "preferred solution" by the CMake developers. This translates to 2 nice features:
    • In a few years, when we can require a minimum CMake Version of 3.24.2, we will be able to rely upon the nice built-in integration between FetchContent and find_package (reminder: the latter is used to find existing installations). Until then, we need to manually provide arguments to switch between the 2 choices (just as we would do for git-submodules)
    • It also theoretically handles the case where Enzo-E and it's dependencies share dependencies more elegantly.
    • In practice, neither of these are particularly relevant right now.
  • it supports another usage scenario to support co-development of grackle and Enzo-E. If you have a copy of the grackle source files on your machine, you can setup -DFETCHCONTENT_SOURCE_DIR_GRACKLE=<path/to/grackle>.
    • This will cause the Enzo-E build to directly compile the source files in that directory as part of compiling Enzo-E (any expectation about version details are ignored in that case).
    • In practice, you could probably develop grackle within Enzo-E's submodule if you're so inclined (so this probably isn't a significant advantage)

Disadvantages of FetchContent:

  • conceptually, it's a little more complicated
  • in it's default configuration, it will download a fresh copy of the grackle repository every time you try to configure a fresh Enzo-E build directory.
    • in other words, you need to have an internet connection in this default configuration
    • aside: if you use the FETCHCONTENT_SOURCE_DIR_GRACKLE variable, cmake won't try to download anything.

I'm really not sure what to do here. From the perspective of a user, who treats the build-system as a black box, the FetchContent approach might be a little simpler. From a developer perspective, I also like functionality provided by setting the FETCHCONTENT_SOURCE_DIR_GRACKLE variable. Furthermore, if we start doing automatic dependency management of a few additional dependencies (like fmt), FetchContent may be better (I think FetchContent may even be able to handle libtorch). With that said, most of the other advantages aren't really relevant right now (so it may make more sense with git-submodules).

I'm curious to hear people's thoughts.

3. How to handle CI

At the moment, Continuous Integration is using the new auto-download feature. This is mostly for testing purposes. With that said, it may makes sense to switch back to using the older-style approach (especially for gold-standard testing where we need to support rebuilds)...

What do you think?

This is a major change or addition (that needs 2 reviewers): unsure

PR Checklist

  • New features are documented with narrative docs.

Footnotes

  1. I think this is a worthwhile discussion, especially if auto-installation of grackle works well. But, I think we should hold off on that until after we're satisfied with the robustness of this system.

@mabruzzo mabruzzo force-pushed the autoinstall-grackle branch from 5925ec2 to 2c85225 Compare June 17, 2024 13:29
@cindytsai
Copy link

Setting FETCHCONTENT_BASE_DIR could probably help. It downloads to a certain directory instead of the build directory.

Next time FetchContent gets called, it avoids downloading again because it finds the dependency in the directory.

@mabruzzo
Copy link
Contributor Author

Setting FETCHCONTENT_BASE_DIR could probably help. It downloads to a certain directory instead of the build directory.

Next time FetchContent gets called, it avoids downloading again because it finds the dependency in the directory.

As a brief followup, it appears that doing this is discouraged by the cmake developers. I think there are potentially some workarounds. For Grackle in particular, I'm not too worried (we provide plenty of instructions for installing it through other methods).

@jwise77 jwise77 self-requested a review December 12, 2024 17:19
@mabruzzo
Copy link
Contributor Author

This is almost done! I just need to manually confirm that builds will work if you try to pre-install Grackle (with the CMake build-system). It should "just work"

Following a previous discussion, we no longer support grackle installations that used Grackle's "classic buildsystem" (i.e. the one inherited from Enzo-Classic)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants