Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pre and post build hooks #10799

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

erikd
Copy link
Member

@erikd erikd commented Feb 21, 2025

Run a program (named "preBuildHook") before doing a package build and another program (named "postBuildHook") after the package is built. The exit code from the pre-build hook is passed to the post-build hook.

The commit includes documentation for the hooks and the security safeguards implemented to avoid the running of malicious hook files.

A previous version of this functionality but without the hook security features was submitted in #9899 .

Please read Github PR Conventions and then fill in one of these two templates.

Include the following checklist in your PR:

@erikd erikd self-assigned this Feb 21, 2025
@erikd erikd marked this pull request as draft February 21, 2025 05:17
@erikd erikd force-pushed the erikd/pre-post-build-hooks branch 4 times, most recently from d48220c to 815c21e Compare February 21, 2025 05:57
Comment on lines +868 to +879
concat
[ "The following file does not appear in the hooks-security file.\n"
, " hook file : "
, fpath
, "\n"
, " file hash : "
, hash
, "\n"
, "After checking the contents of that file, it should be added to the\n"
, "hooks-security file with either AcceptAlways or better yet an AcceptHash.\n"
, "The hooks-security file is (probably) located at: "
, hsPath
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this line look? Can you print a command that likely inserts the right value?

> echo "..." >> hsPath

?

@erikd erikd force-pushed the erikd/pre-post-build-hooks branch 2 times, most recently from 3853088 to 5f7b47f Compare February 21, 2025 06:06
@mpickering
Copy link
Collaborator

mpickering commented Feb 21, 2025

t seems to me this approach is very overfitted to a particular usecase. Why are there pre/post hooks on just the build phase, why not the haddock phase or the test phase for example? Why are the particular options chosen to be passed to the script? Therefore I don't think this should be merged into cabal in the current state. If this approach is to be taken then I think a broader community consensus needs to be arrived at.

In order to solve the problems mentioned in the ticket I would supply a wrapped GHC which could answer queries to --make a package by downloading relevant files from a cache and placing them into the output directory.

If this is operating on a company specific cache, the GHC provided by the nix-shell to developers can already be wrapped to support fetching from the cache. It will also work uniformly with all the other commands which may build things rather than overfitted to the build phase. It also means that ALL the relevant options are passed to the wrapper rather than only a few.

If the approach of committing a wrapper to the repo for all to use is preferred then the wrapper should accept the configured ghc to be used as the first argument, so invocations look like

./wrapper.sh ghc --make A B C D E

@mpickering Sorry, by mistake I edited your comment rather than responding to it (did not even know I could do that). Hopefully I have restored it.

@erikd
Copy link
Member Author

erikd commented Feb 23, 2025

Why are there pre/post hooks on just the build phase, why not the haddock phase or the test phase for example?

More can be added if needed. Its trivial.

In order to solve the problems mentioned in the ticket I would supply a wrapped GHC which could answer queries to --make a package by downloading relevant files from a cache and placing them into the output directory.

We did have a "wrapper" solution that did a cabal build all --dry-run and then inspected the plan.json file. That solution was considerably slower (about 50%) than this solution.

@angerman
Copy link
Collaborator

@mpickering, thanks for the feedback—I appreciate it.

You're suggesting wrapping GHC for file-level caching across all build commands, which makes sense. However, I believe Cabal is the right place for package-level hooks, as it manages package orchestration (e.g., fetching from Hackage). These hooks are generic, supporting use cases beyond caching, like build-time profiling.

That said, the current implementation is limited (build phase only, specific options). I'm open to discussing:

  • Adding hooks for other phases (e.g., haddock, test).
  • Adjusting hook arguments.
  • Ensuring alignment with community needs.

I agree broader consensus is needed before merging. Happy to collaborate further—looking forward to your thoughts!

@erikd
Copy link
Member Author

erikd commented Feb 24, 2025

@mpickering Adding hooks for the "haddock" and "test" phases seems to provide very little utility above your "wrapper" suggestion for the "build" phase. Neither the "haddock" nor the "test" phases have anywhere near the complexity of the "build" phase. What we get by hooking the "build" phase is direct access to the (name, version, package id hash) tuple as well as information about when in the build sequence the package is required.

Yes , this (name, version, package id hash) information could be extracted from the plan.json file but reconstructing the build sequence would still be required.

@erikd erikd force-pushed the erikd/pre-post-build-hooks branch 2 times, most recently from d5a3b9c to bfecc42 Compare February 24, 2025 05:49
@dcoutts
Copy link
Contributor

dcoutts commented Feb 24, 2025

Why are these yet more magic named files rather than being specified in the cabal.project (or any file that it might choose to include)?

@erikd
Copy link
Member Author

erikd commented Feb 24, 2025

Why are these yet more magic named files rather than being specified in the cabal.project (or any file that it might choose to include)?

Three reasons:

  • This functionality was inspired by git which has fixed names for its hook files.
  • The file names document what they do.
  • Accepting arbitrary names for these files adds complexity for zero benefit.

@erikd erikd force-pushed the erikd/pre-post-build-hooks branch from bfecc42 to 755e002 Compare February 26, 2025 00:21
Run a program (named "preBuildHook") before doing a package build and
another program (named "postBuildHook") after the package is built.
The exit code from the pre-build hook is passed to the post-build
hook.

The commit includes documentation for the hooks and the security
safeguards implemented to avoid the running of malicious hook
files.
@erikd erikd force-pushed the erikd/pre-post-build-hooks branch from 755e002 to 39ec78c Compare February 26, 2025 02:40
@erikd
Copy link
Member Author

erikd commented Feb 26, 2025

Work to do:

@mpickering
Copy link
Collaborator

These are currently the extension mechanisms to the cabal ecosystem (which I know about).

  1. External commands, provide extensibility to the command line
  2. Custom setup scripts, provide extensibility to the ./Cabal interface
  3. Hooks, the new means to provide extensibility to the ./Cabal interface.

This PR, proposes a 4th means of extension, "build hooks" to run in "cabal-install" just before PBBuildPhase. What are the motivations for "build hooks" over the existing mechanisms?

  1. They operate on the ./Cabal interface level (ie abstracting over ./Setup build) but are executed by cabal-install. The same hooks are available for an entire project.
  2. Unlike 2/3 using a build hook does not require changing the package description (ie changing the build-type of a package), they are intended to not be required for distribution. (I think this is why means 2 or 3 are not suitable). The hooks are also called on external package builds, for which you have no control over the source.

Therefore, could we instead consider a design which augments 2 or 3 in a way which allows local modification.

Another alternative design

Allow a project to contain a component containing LocalHooks.hs file which has the same interface as SetupHooks.hs

  • This extends the existing modification mechanisms, does not introduce another interface. SetupHooks are composable, you can run a LocalHooks.hs prebuild followed by a SetupHooks.hs prebuild. This would also provide means to more fundamentally change the build process.
  • All relevant options are available to a hook author since the hooks are implemented using the Cabal API.
  • The cabal.project file can specify the name of the component which contains the implementation of the package level local hooks or cabal.project can contain a hooks stanza.

Summary

Otherwise, It seems hard to justify how another extension point at exactly the same place as custom setup and hooks. If you observe the current implementation, the hooks run immediately before ./Setup build is called.

I think steps to moving this project forward are

  1. Provide a principled justification about the options which are passed to hooks and which extension points are provided.
  2. Write a proposal which explains the design, compares the approach to existing extension mechanisms, post it on discourse and solicit more use cases.
  3. Once consensus is reached, add the new extension mechanism.

Personal opinion

My personal opinion, is that

  1. The current design is too overfitted to the one use-case. (Only implement a build hook, only for some very precise options)
  2. It's very important to design a principled extension point and understand how it relates to the other ones as this will be part of the interface that has to be maintained for a long time into the future.

@dcoutts
Copy link
Contributor

dcoutts commented Feb 27, 2025

Here's another way to think about it:

Remember that Cabal distinguishes the role of package author vs package builder. So lets look at the existing hooks in that light

@mpickering lists:

These are currently the extension mechanisms to the cabal ecosystem (which I know about).

  1. External commands, provide extensibility to the command line
  2. Custom setup scripts, provide extensibility to the ./Cabal interface
  3. Hooks, the new means to provide extensibility to the ./Cabal interface.

So 2 & 3 are for the package author. 1 is for the package builder. The proposal here is also for the package builder.

I think it's not crazy to have hooks designed for the package builder. And perhaps 1 is not sufficient.

However, I think it should be coherent with the rest of the package builder design. We have a pkgname.cabal files for the package author, and the cabal.project file for the package builder. So it would be more coherent for hooks to be specified here. This is about discoverability of configuration and reproducibility of builds.

If we have a single root to start from (and then specify or include other files) then we have a much better handle on what is going on. Spreading things out over random magic file names makes it very hard for users who are not intimately familiar with the latest cabal to figure out what is going on, or where they should edit things, or where to find out where they should edit things. An explicit link from the cabal.project to other hook files solves that aspect of it. And it lets people turn it off wholesale with one flag --ignore-project-file, otherwise how do we know what will and will not be used?

@dcoutts
Copy link
Contributor

dcoutts commented Feb 27, 2025

Idea, in the cabal.project file:

hooks
  hook pre-build
    exec: scripts/my-prebuildhook.sh

  hook post-build
    exec: scripts/my-postbuildhook.sh

This gives us extension points for the future. We can add more hooks. We can add more attributes to control how/when/whatever the hooks are executed. We can arrange for it to support conditional syntax if that ever becomes necessary. etc.

And it's all explicit and discoverable. At least, one can see that a project is using it.

And other cabal.project mechanisms for including can be used to control profiles for using/not-using, splitting things across multiple files etc.

@Mikolaj
Copy link
Member

Mikolaj commented Feb 28, 2025

Thank you for shaking up the design of this new feature. Shall I copy some of this discussion to the actual design ticket for others to see, with a back-link, e.g., maybe this summary #10799 (comment)?

@angerman
Copy link
Collaborator

angerman commented Mar 1, 2025

@Mikolaj I think the summary misses some crucial observations from @dcoutts. Let me try to come up with a summary with @erikd that included @mpickering's and @dcoutts's comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants