-
Notifications
You must be signed in to change notification settings - Fork 28
feat: improve VSA generation with digest for each subject #685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
39b25e3
to
1877043
Compare
…ilable Signed-off-by: Nathan Nguyen <[email protected]>
…nances Signed-off-by: Nathan Nguyen <[email protected]>
…provenance subjects Signed-off-by: Nathan Nguyen <[email protected]>
Signed-off-by: Nathan Nguyen <[email protected]>
…itness provenances Signed-off-by: Nathan Nguyen <[email protected]>
Signed-off-by: Nathan Nguyen <[email protected]>
…igest if it exists Signed-off-by: Nathan Nguyen <[email protected]>
1877043
to
1fbe7ec
Compare
…larifying support for qualifiers Signed-off-by: Nathan Nguyen <[email protected]>
c0e8284
to
ebf8667
Compare
src/macaron/artifact/maven.py
Outdated
@classmethod | ||
def from_artifact_name( | ||
cls, | ||
artifact_name: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
artifact_name
seems a bit ambiguous. Is it the filename?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is indeed the filename. Thanks for pointing out.
I have renamed artifact_name
to artifact_filename
in 263414b.
Signed-off-by: Nathan Nguyen <[email protected]>
…y engine result Signed-off-by: Nathan Nguyen <[email protected]>
…ls` function Signed-off-by: Nathan Nguyen <[email protected]>
Signed-off-by: Nathan Nguyen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the changes.
dict[str, str] | ||
list[InTotoV01Subject] | ||
A dictionary in which each key is a subject name and each value is the corresponding SHA256 digest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to update comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also need to remove non-existent extensions
parameter from comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed in 9e6f98f.
src/macaron/artifact/maven.py
Outdated
JAR = _MavenArtifactType( | ||
filename_pattern="{artifact_id}-{version}.jar", | ||
purl_qualifiers={"type": "jar"}, | ||
) | ||
POM = _MavenArtifactType( | ||
filename_pattern="{artifact_id}-{version}.pom", | ||
purl_qualifiers={"type": "pom"}, | ||
) | ||
JAVADOC = _MavenArtifactType( | ||
filename_pattern="{artifact_id}-{version}-javadoc.jar", | ||
purl_qualifiers={"type": "javadoc"}, | ||
) | ||
JAVA_SOURCE = _MavenArtifactType( | ||
filename_pattern="{artifact_id}-{version}-sources.jar", | ||
purl_qualifiers={"type": "java-source"}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patterns in all these types seem to be the same, i.e., {artifact_id}-{version}.extension
. Do we really need all these to be created as separate patterns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-javadoc.jar
and -sources.jar
are not file extensions, but I get the question.
This pattern is currently being used in the MavenArtifact.from_artifact_filename
class method to construct a regex to extract the artifact id from an artifact filename.
An alternative I see is to change filename_pattern
to something like filename_suffix
, which will involve also changing the implementation of the MavenArtifact.from_artifact_filename
class method. I am not sure if doing this is exactly simplifying by any means while I chose to do it this was initially to explicitly specify the naming convention of Maven artifacts, but I can change the current implementation to this if you think it makes sense. If you have any other suggestions please let me know.
Signed-off-by: Nathan Nguyen <[email protected]>
Signed-off-by: Nathan Nguyen <[email protected]>
This pull request improves upon the VSA generation feature in Macaron, by populating each subject with a corresponding SHA256 digest if certain conditions are met. This change helps VSA generation in Macaron become more in line with in-toto attestation specifications. More importantly, this also improves the VSA's quality in uniquely identifying artifacts verified by Macaron.
Specifically, built on top of Macaron's recent feature of taking a provenance file as input, this pull request identifies if the PackageURL provided through the command-line argument
--package-url/-purl
ofmacaron analyze
identifies an artifact being one of the subjects in the provenance. If such a subject is found, the corresponding SHA256 digest is stored in the Macaron database, which can then be retrieved and populated in Macaron-generated VSAs.For example, given the following analyze command
Macaron will check if there is a jar file for the
micronaut-router
package among the subjects in the provenance file./multiple.intoto.jsonl
.In this PR, only PackageURL of type
maven
is supported. Other PackageURL types can be added in future pull requests.Some other related changes include
mcn_provenance_available_1
check in case a user-provided provenance is available.mcn_provenance_expectation_1
check also validate the user-provided provenance if it is available.Below is a comprehensive list detailing the changes made in this PR, spanning across a few of commits:
b3eb53a
fix: disable provenance discovery in case a provenance is already available
8c49c69
fix: enable provenance expectation validation for user-provided provenances
1b82d5e
chore: add a table to the database schema to store sha256 digests of provenance subjects
A new table is created to store the SHA256 hash. One option being considered was to add another column to the
_component
table. However, since this will introduce breaking changes to the current policy library, I decided against it at this time. Another option considered was theReleaseArtifact
andHashDigest
table. However, they are now currently storing a different kind of data to what this pull request wants to achieve. Future revision of the database schema may clarify whether we want to unify these tables and how we want an eventual schema to look like.f2ed7f7
chore: introduce maven artifact types and utilities
12066e3
refactor: improve extraction of subjects being build artifacts from witness provenances
There was an existing utility function for extracting subjects from witness provenance. It was originally created for the Witness provenance check. It has then been modified, taking into account the product attestor to only recognize subjects that are actual artifacts. This utility function is also now used for matching a subject to a given PURL, introduced in the next commit.
7fa98d6
chore: match a witness provenance subject against a maven artifact purl
1fbe7ec
chore: modify vsa generation to populate each subject with a sha256 digest if it exists