docs: update provenance tutorial #1110

2 changes: 1 addition & 1 deletion docs/source/pages/tutorials/commit_finder.rst
Expand Up @@ -164,4 +164,4 @@ Future Work

Mapping artifacts to commits within repositories is a challenging endeavour. Macaron's Commit Finder feature relies on repositories having and using version tags in a sensible way (a tag is considered sensible if it closely matches the version it represents). An alternative, or complementary, approach would be to make use of the information found within provenance files, where information such as the commit hash used to create the artifact can potentially be found. Additionally, it should be noted that the Commit Finder feature was modelled on the intentions of developers (in terms of tag usage) within a large quantity of Java projects. As tag formatting is "generally" language agnostic in the same way that versioning schemes are, this feature should work well for other languages. However, there may be some improvements to be made by further testing on a large number of non-Java projects.

.. note:: Macaron now supports extracting repository URLs and commit hashes from provenance files. This is demonstrated in a new tutorial: :doc:`npm_provenance </pages/tutorials/npm_provenance>`.
.. note:: Macaron now supports extracting repository URLs and commit hashes from provenance files. This is demonstrated in a new tutorial: :doc:`provenance </pages/tutorials/provenance>`.
2 changes: 1 addition & 1 deletion docs/source/pages/tutorials/index.rst
Expand Up @@ -20,7 +20,7 @@ For the full list of supported technologies, such as CI services, registries, an
commit_finder
detect_malicious_package
detect_vulnerable_github_actions
npm_provenance
provenance
detect_malicious_java_dep
generate_verification_summary_attestation
use_verification_summary_attestation
Expand Up @@ -5,9 +5,11 @@
Provenance discovery, extraction, and verification
--------------------------------------------------

This tutorial demonstrates how Macaron can automatically retrieve provenance for npm artifacts, validate the contents, and verify the authenticity. Any artifact that can be analyzed and checked for these properties can then be trusted to a greater degree than would be otherwise possible, as provenance files provide verifiable information, such as the commit and build service pipeline that has triggered the release.
This tutorial demonstrates how Macaron can automatically retrieve provenance for artifacts, validate the contents, and verify the authenticity. Any artifact that can be analyzed and checked for these properties can then be trusted to a greater degree than would be otherwise possible, as provenance files provide verifiable information, such as the commit and build service pipeline that has triggered the release.

For npm artifacts, Macaron makes use of available features provided by `npm <https://npmjs.com/>`_. Most importantly, npm allows developers to generate provenance files when publishing their artifacts. The `semver <https://www.npmjs.com/package/semver>`_ package is chosen as an example for this tutorial.
Currently, Macaron supports discovery of attestations for: npm artifacts, using features provided by `npm <https://npmjs.com/>`_; PyPI artifacts, using features provided by `Open Source Insights <https://deps.dev/>`_; and artifacts that have had attestations published to `GitHub <https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-attestations>`_. This tutorial uses two example packages to demonstrate these three discovery methods: the `semver <https://www.npmjs.com/package/semver>`_ npm package, and the `toga <https://pypi.org/pypi/toga>`_ PyPI package.

.. contents:: :local:

******************************
Installation and Prerequisites
Expand All @@ -34,46 +36,111 @@ Skip this section if you already know how to install Macaron.

Now you should be good to run Macaron. For more details, see the documentation :ref:`here <prepare-github-token>`.

********
Analysis
********
The analyses in this tutorial involve downloading the contents of a target repository to the configured, or default, ``output`` folder. Results from the analyses, including checks, are stored in the database found at ``output/macaron.db`` (see :ref:`Output Files Guide <output_files_guide>`). Once an analysis is complete, Macaron will also produce a report in the form of an HTML file.

.. note:: If you are unfamiliar with PackageURLs (purl), see this link: `PURLs <https://github.com/package-url/purl-spec>`_.
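
The PURLs used in this tutorial all follow the simple form ``pkg:<type>/<name>@<version>``. As a rough illustration only, the sketch below splits such a PURL into its parts using POSIX shell parameter expansion. The ``parse_purl`` function is a hypothetical helper written for this example and is not part of Macaron; it assumes there is no namespace, qualifier, or subpath, all of which the full PURL specification allows.

.. code-block:: shell

    # Hypothetical helper (not part of Macaron): split a simple PURL of the
    # form pkg:<type>/<name>@<version> into its three parts.
    # Assumption: no namespace, qualifiers, or subpath are present.
    parse_purl() {
        purl_rest="${1#pkg:}"                  # e.g. npm/semver@7.7.2
        purl_type="${purl_rest%%/*}"           # text before the first "/"
        purl_name_version="${purl_rest#*/}"    # text after the first "/"
        purl_name="${purl_name_version%@*}"    # text before the last "@"
        purl_version="${purl_name_version##*@}" # text after the last "@"
        printf 'type=%s name=%s version=%s\n' \
            "$purl_type" "$purl_name" "$purl_version"
    }

    parse_purl "pkg:npm/semver@7.7.2"

A real implementation should rely on a dedicated PURL library, since scoped npm packages and qualifiers need percent-decoding and extra parsing that this sketch ignores.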

**************************************
Attestation Discovery for semver (npm)
**************************************

To perform an analysis on the latest version of semver (when this tutorial was written), Macaron can be run with the following command:

.. code-block:: shell

./run_macaron.sh analyze -purl pkg:npm/semver@7.6.2 --verify-provenance

The analysis involves Macaron downloading the contents of the target repository to the configured, or default, ``output`` folder. Results from the analysis, including checks, are stored in the database found at ``output/macaron.db`` (See :ref:`Output Files Guide <output_files_guide>`). Once the analysis is complete, Macaron will also produce a report in the form of a HTML file.

.. note:: If you are unfamiliar with PackageURLs (purl), see this link: `PURLs <https://github.com/package-url/purl-spec>`_.
./run_macaron.sh analyze -purl pkg:npm/semver@7.7.2 --verify-provenance

During this analysis, Macaron will retrieve two provenance files from the npm registry. One is a :term:`SLSA` v1.0 provenance, while the other is an npm-specific publication provenance. The SLSA provenance provides details of the artifact it relates to, the repository it was built from, and the build action used to build it. The npm-specific publication provenance exists if the SLSA provenance has been verified before publication.

.. note:: Most of the details from the two provenance files can be found through the links provided on the artifacts page on the npm website. In particular: `Sigstore Rekor <https://search.sigstore.dev/?logIndex=92391688>`_. The provenance file itself can be found at: `npm registry <https://registry.npmjs.org/-/npm/v1/attestations/semver@7.6.2>`_.
.. note:: Most of the details from the two provenance files can be found through the links provided on the artifacts page on the npm website. In particular: `Sigstore Rekor <https://search.sigstore.dev/?logIndex=211457167>`_. The provenance file itself can be found at: `npm registry <https://registry.npmjs.org/-/npm/v1/attestations/semver@7.7.2>`_.

Of course, proof is needed before we can reliably say that the above does what is claimed here. For this we can rely on the check results produced by the analysis run. In particular, we want to know the results of three checks: ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. The first two ensure that the repository and the commit being analyzed match those found in the provenance file, and the last ensures that the provenance file has been verified. For the third check to succeed, you need to enable provenance verification in Macaron by using the ``--verify-provenance`` command-line argument, as demonstrated above. This verification is disabled by default because it can be slow in some cases due to I/O-bound operations.

.. _fig_semver_7.6.2_report:
.. _fig_semver_7.7.2_report:

.. figure:: ../../_static/images/tutorial_semver_7.6.2_report.png
:alt: HTML report for ``semver 7.6.2``, summary
.. figure:: ../../_static/images/tutorial_semver_7.7.2_report.png
:alt: HTML report for ``semver 7.7.2``, summary
:align: center


This image shows that the report produced by the previous analysis has passing results for the three checks of interest. This can also be viewed directly by opening the report file:

.. code-block:: shell

open output/reports/npm/semver/semver.html

*****************************
Run ``verify-policy`` command
*****************************
The check results of this example (and others) can be automatically verified. A demonstration of verification for this case is provided later in this tutorial.

*************************************
Attestation Discovery for toga (PyPI)
*************************************

To perform an analysis on the latest version of toga (when this tutorial was written), Macaron can be run with the following command:

.. code-block:: shell

./run_macaron.sh analyze -purl pkg:pypi/toga@0.5.1

During this analysis, Macaron will retrieve information from two sources to attempt to discover a PyPI attestation file. Firstly, Open Source Insights will be queried for an attestation URL that can be used to access the desired information. If found, this URL can be followed to its source on the PyPI package registry, which is where the actual attestation file is hosted.

As an example of these internal steps, the attestation information can be seen via the `Open Source Insights API <https://api.deps.dev/v3alpha/purl/pkg:pypi%2Ftoga@0.5.1>`_. From this information the PyPI attestation URL is extracted, revealing its location: `https://pypi.org/integrity/toga/0.5.1/toga-0.5.1-py3-none-any.whl/provenance <https://pypi.org/integrity/toga/0.5.1/toga-0.5.1-py3-none-any.whl/provenance>`_.

.. _fig_toga_osi_api:

.. figure:: ../../_static/images/tutorial_osi_toga.png
:alt: Open Source Insights API result for the toga package
:align: center

This image shows the attestation URL found in the Open Source Insights API result.
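
The two lookups described above can be approximated by hand. The following is a minimal sketch, not Macaron's implementation: it builds the Open Source Insights query URL by percent-encoding the ``/`` in the PURL, and builds the PyPI provenance URL following the pattern of the link shown earlier. The function names ``deps_dev_url`` and ``pypi_provenance_url`` are hypothetical, and the URL layouts are assumed from the two links in this section rather than taken from either service's documentation.

.. code-block:: shell

    # Hypothetical helpers (not part of Macaron). URL layouts are assumed
    # from the example links in this tutorial.

    # Build the Open Source Insights query URL for a PURL.
    # Only "/" is percent-encoded here, which is enough for simple PURLs.
    deps_dev_url() {
        encoded_purl=$(printf '%s' "$1" | sed 's|/|%2F|g')
        printf 'https://api.deps.dev/v3alpha/purl/%s\n' "$encoded_purl"
    }

    # Build the PyPI provenance URL from <name> <version> <artifact filename>.
    pypi_provenance_url() {
        printf 'https://pypi.org/integrity/%s/%s/%s/provenance\n' "$1" "$2" "$3"
    }

    deps_dev_url "pkg:pypi/toga@0.5.1"
    pypi_provenance_url "toga" "0.5.1" "toga-0.5.1-py3-none-any.whl"

    # With network access, either URL can then be fetched, for example:
    #   curl -s "$(deps_dev_url 'pkg:pypi/toga@0.5.1')"

Macaron itself reads the attestation URL from the Open Source Insights response rather than constructing it, so the second helper is only a shortcut for inspecting a known artifact.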

By using the Open Source Insights API, Macaron can check that the discovered provenance is verified, and that it is a valid match for the user-provided PURL. For this we can rely on the check results produced by the analysis run. In particular, we want to know the results of three checks: ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. The first two ensure that the repository and the commit being analyzed match those found in the provenance file, and the last ensures that the provenance file has been verified.

.. _fig_toga_pypi_checks:

.. figure:: ../../_static/images/tutorial_toga_pypi.png
:alt: HTML report for ``toga 0.5.1``, summary
:align: center

All three checks have passed, meaning Macaron has discovered the correct provenance for the user-provided PURL and determined that it is verified. To access the full report, use the following:

.. code-block:: shell

open output/reports/pypi/toga/toga.html

***************************************
Attestation Discovery for toga (GitHub)
***************************************

The toga library is interesting in that it has either a GitHub attestation or a PyPI attestation, depending on which version is analyzed. To discover a GitHub attestation, we can analyze version 0.4.8:

.. code-block:: shell

./run_macaron.sh analyze -purl pkg:pypi/toga@0.4.8

During this analysis, Macaron will attempt to discover a GitHub attestation by computing the hash of the relevant artifact. GitHub's API requires this hash in order to look up an artifact attestation (see the `GitHub Attestation API <https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-attestations>`_). The hash is computed by downloading the artifact and hashing it with the SHA256 algorithm. With the hash, the GitHub API can be called to find the related attestation.

In this particular case, the SHA256 hash of the toga 0.4.8 artifact is 0814a72abb0a9a5f22c32cc9479c55041ec30cdf4b12d73a0017aee58f9a1f00. A GitHub attestation can be found for this artifact `here <https://api.github.com/repos/beeware/toga/attestations/sha256:0814a72abb0a9a5f22c32cc9479c55041ec30cdf4b12d73a0017aee58f9a1f00>`_.
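
To reproduce this lookup manually, the digest of a downloaded artifact can be computed and placed into the API URL. The sketch below is an illustration under stated assumptions rather than Macaron's own code: it assumes the GNU ``sha256sum`` utility is available (on macOS, ``shasum -a 256`` is the usual equivalent), the function names are hypothetical, and the URL layout is copied from the link above.

.. code-block:: shell

    # Hypothetical helpers (not part of Macaron).

    # Print the hex SHA256 digest of a local file.
    # Assumption: GNU coreutils sha256sum is installed.
    artifact_sha256() {
        sha256sum "$1" | cut -d ' ' -f 1
    }

    # Build the GitHub attestation API URL from <owner/repo> <artifact file>.
    github_attestation_url() {
        printf 'https://api.github.com/repos/%s/attestations/sha256:%s\n' \
            "$1" "$(artifact_sha256 "$2")"
    }

    # Example usage, where path/to/artifact is a placeholder for a
    # previously downloaded release file:
    #   github_attestation_url "beeware/toga" path/to/artifact

If the printed URL returns a result, an attestation has been published to GitHub for that exact file.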

At this time, Macaron cannot establish whether an attestation discovered through GitHub is verified. However, we can still be sure that the repository URL and commit digest associated with the user-provided PURL match what is found within the attestation. This is reported by Macaron in two checks: ``mcn_provenance_derived_repo_1`` and ``mcn_provenance_derived_commit_1``.

.. _fig_toga_github_checks:

.. figure:: ../../_static/images/tutorial_toga_github.png
:alt: HTML report for ``toga 0.4.8``, summary
:align: center

This image shows that both checks have passed, confirming that the repository URL and commit digest from the provenance match those associated with the user-provided PURL. To access the full report, use the following:

.. code-block:: shell

open output/reports/pypi/toga/toga.html

**************************************
Run ``verify-policy`` command (semver)
**************************************

Another feature of Macaron is policy verification. This allows Macaron to report on whether an artifact meets the security requirements specified by the user. Policies are written using `Soufflé Datalog <https://souffle-lang.github.io/index.html>`_, a language similar to SQL. Results collected by the ``analyze`` command can be queried declaratively in a policy, which Macaron then evaluates automatically.

For this tutorial, we can create a policy that checks whether the three checks (as above) have passed. In this way we can be sure that the requirement is satisfied without having to dive into the reports directly.
For this tutorial, we can create a policy that checks whether the three checks relating to the semver npm example above have passed, namely ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. In this way we can be sure that the requirement is satisfied without having to dive into the reports directly.

.. code-block:: prolog

Expand All @@ -85,9 +152,9 @@ For this tutorial, we can create a policy that checks whether the three checks (
check_passed(component_id, "mcn_provenance_verified_1").

apply_policy_to("has-verified-provenance", component_id) :-
is_component(component_id, "pkg:npm/semver@7.6.2").
is_component(component_id, "pkg:npm/semver@7.7.2").

After including some helper rules, the above policy is defined as requiring all three of the checks to pass through the ``check_passed(<target>, <check_name>)`` mechanism. The target is then defined by the criteria applied to the policy. In this case, the artifact with a PURL that matches the version of ``semver`` used in this tutorial: ``pkg:npm/semver@7.6.2``. With this check saved to a file, say ``verified.dl``, we can run it against Macaron's local database to confirm that the analysis we performed earlier in this tutorial did indeed pass all three checks.
After including some helper rules, the above policy is defined as requiring all three of the checks to pass through the ``check_passed(<target>, <check_name>)`` mechanism. The target is then defined by the criteria applied to the policy. In this case, the artifact with a PURL that matches the version of ``semver`` used in this tutorial: ``pkg:npm/semver@7.7.2``. With this check saved to a file, say ``verified.dl``, we can run it against Macaron's local database to confirm that the analysis we performed earlier in this tutorial did indeed pass all three checks.

.. code-block:: shell

Expand All @@ -98,7 +165,7 @@ The result of this command should show that the policy we have written succeeds
.. code-block:: javascript

component_satisfies_policy
['1', 'pkg:npm/semver@7.6.2', 'has-verified-provenance']
['1', 'pkg:npm/semver@7.7.2', 'has-verified-provenance']
component_violates_policy
failed_policies
passed_policies
Expand All @@ -117,14 +184,14 @@ With this modification, all versions of ``semver`` previously analysed will show
.. code-block:: javascript

component_satisfies_policy
['1', 'pkg:npm/semver@7.6.2', 'has-verified-provenance']
['1', 'pkg:npm/semver@7.7.2', 'has-verified-provenance']
['2', 'pkg:npm/semver@7.6.0', 'has-verified-provenance']
component_violates_policy
['3', 'pkg:npm/semver@1.0.0', 'has-verified-provenance']
failed_policies
['has-verified-provenance']

Here we can see that the newer versions, 7.6.2 and 7.6.0, passed the checks, meaning they have verified provenance. The much older version, 1.0.0, did not pass the checks, which is not surprising given that it was published 13 years before this tutorial was made.
Here we can see that the newer versions, 7.7.2 and 7.6.0, passed the checks, meaning they have verified provenance. The much older version, 1.0.0, did not pass the checks, which is not surprising given that it was published 13 years before this tutorial was made.

However, if we wanted to acknowledge that earlier versions of the artifact do not have provenance, and accept that as part of the policy, we can do that too. For this to succeed we need to extend the policy with more complicated modifications.

Expand Down Expand Up @@ -157,7 +224,7 @@ When run, this updated policy produces the following:
.. code-block:: javascript

component_satisfies_policy
['1', 'pkg:npm/semver@7.6.2', 'has-verified-provenance-or-is-excluded']
['1', 'pkg:npm/semver@7.7.2', 'has-verified-provenance-or-is-excluded']
['2', 'pkg:npm/semver@7.6.0', 'has-verified-provenance-or-is-excluded']
['3', 'pkg:npm/semver@1.0.0', 'has-verified-provenance-or-is-excluded']
component_violates_policy