@wlynch wlynch commented Aug 1, 2025

Proposes a storage format for attestations within a Git repo.

@wlynch wlynch requested a review from a team as a code owner August 1, 2025 17:38
Proposes an attestation storage format for how to store attestations within a Git repo.

Signed-off-by: Billy Lynch <[email protected]>
Author

wlynch commented Aug 1, 2025

cc @adityasaky @puerco @TomHennen @patzielinski

Attestations are stored in [git references](https://git-scm.com/book/en/v2/Git-Internals-Git-References),
and can be pushed or pulled like any other Git ref.

The latest commit in each attestation ref represents the current state of attestations for that Git object type.

Do we make any statement that the storage of attestations must strictly grow (i.e. each new commit has all previous attestations inside plus any new ones)?

Author

No - I kinda view this like OCI attestations, in that strictly growing probably makes the most sense, but there's no harm in removing attestations beyond the fact that you may no longer be able to verify (which may sometimes be a feature).

Member

@SantiagoTorres SantiagoTorres left a comment

I think this is cool! I'd like to move forward a bit. I think with some clarification we can start gathering comments.

| Git Repository Attestation Storage Specification

| Sponsor
| link:https://github.com/wlynch[Billy Lynch]

Member

I think the sponsor here should be an ITSC editor, rather than the author

Member

Or both?

2. **Check the appropriate namespace**: Look in the corresponding `refs/attestations/<type>` namespace
3. **Navigate to the object directory**: Find the directory named with the target object's hash
4. **Enumerate attestation files**: List all `.intoto.jsonl` files in that directory
5. **Parse and verify**: Read the attestation bundles and verify signatures as needed

Member

I am trying to picture this in my head. Is this to say that an attestation collection lives inside a worktree? Is it separate from the standard worktree? Or is storage tangential and we don't care?

Author

I'm not sure I fully understand, but the ref structure is more about discoverability of attestations - e.g. the same way for OCI you can use referrers / `<digest>.att`, you should be able to use `refs/attestations/...` to locate attestations in git repos. From there it's a matter of iterating through the candidates to find what you need, e.g. an attestation for a certain type, subject, or signer.
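
For illustration, the lookup flow discussed in this thread could be sketched roughly as below. This is only a sketch under stated assumptions, not the proposal's reference implementation: the function names and the `ref_files` mapping are hypothetical, and the mapping stands in for the file listing of the latest commit on an attestation ref (which a real tool might obtain with something like `git ls-tree -r --name-only <ref>` and `git show <ref>:<path>`).

```python
import json

# Only the tree and blob refs appear in the excerpt under review; other object
# types are assumed to follow the same refs/attestations/<type> pattern.
ATTESTATION_REFS = {
    "tree": "refs/attestations/trees",
    "blob": "refs/attestations/blobs",
}

BUNDLE_SUFFIX = ".intoto.jsonl"


def find_bundles(ref_files, object_hash):
    """Return bundle paths stored under the directory named after object_hash.

    ref_files (hypothetical) maps paths in the attestation ref's latest commit
    to their file contents.
    """
    prefix = object_hash + "/"
    return sorted(
        path
        for path in ref_files
        if path.startswith(prefix) and path.endswith(BUNDLE_SUFFIX)
    )


def read_bundle(text):
    """Parse a JSON Lines bundle: one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def candidates(ref_files, object_hash):
    """Yield (path, entry) pairs for every attestation found for the object."""
    for path in find_bundles(ref_files, object_hash):
        for entry in read_bundle(ref_files[path]):
            yield path, entry
```

A caller would pick the ref from `ATTESTATION_REFS` by object type, load that ref's files, and then filter the yielded candidates by type, subject, or signer. Signature verification is deliberately left out of this sketch.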

If your ITE includes code changes, provide some links to the prototype
implementation of these changes.

TODO (we will use one or all of the existing tools as a prototype implementation)

Member

I wonder if instead of TODO we could just point to all of these frameworks that are, admittedly, quite mature already.

Author

Part of the motivation for this was to settle on a common way that these tools can interop with each other by storing things in similar formats. Some use refs/attestations (but different format) today, some use git notes, etc.

- **Tree attestations**: `refs/attestations/trees`
- **Blob attestations**: `refs/attestations/blobs`

Additional types may be created under the `refs/attestations` namespace to store attestations about other types, even non-Git objects.

Member

Might be worth thinking about the scope here: if the attestations are about the Git objects in the same repo, it makes sense for them to live in `refs/attestations/<>`, i.e., out of the way of "regular" repo contents in main etc. If we want to store attestations for non-Git objects, do they need to go in the special ref namespace in the first place?

Author

Up to the user? This is intentionally written to not be prescriptive, the same way you can extend `refs/...` in general to store information about whatever you want.

├── a1b2c3d4e5f6789012345678901234567890abcd/
│   └── attestations.intoto.jsonl
└── b2c3d4e5f6789012345678901234567890abcdef1/
    ├── pull_request.intoto.jsonl

Member

Should we allow multiple blobs? Can we consistently have attestations.intoto.jsonl that is a union of all the jsonl files in this example? That might simplify client logic. We can determine later if multiple files are necessary, perhaps.

We could even argue for commitID.intoto.jsonl instead of having a directory for the commit ID with a single item in it, but that might be problematic to extend.

Author

I think we should allow multiple files. While it might make verification more complex, it's much easier for an implementor to stake a claim to a particular file namespace without regard for other integrations than to do read-modify-write workflows.

e.g. the gittuf app might always write observations as github_<app-id>.json.
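
To illustrate the trade-off in this thread: if each integration writes its own file, a client that wants the single-bundle view suggested above can compute the union at read time. The sketch below is hypothetical and not part of the proposal; `merge_bundles` and the file names in the usage example are made up for illustration, and exact-line deduplication is an assumed simplification.

```python
import json


def merge_bundles(files):
    """Combine several JSON Lines bundles into one list of entries.

    files maps a file name inside one object directory to its text content.
    Files are read in sorted name order; byte-identical lines are kept once.
    """
    seen = set()
    merged = []
    for name in sorted(files):
        for line in files[name].splitlines():
            line = line.strip()
            if not line or line in seen:
                continue
            seen.add(line)
            merged.append(json.loads(line))
    return merged


if __name__ == "__main__":
    # Hypothetical directory contents for a single Git object.
    directory = {
        "pull_request.intoto.jsonl": '{"id": "pr"}\n',
        "build.intoto.jsonl": '{"id": "build"}\n{"id": "pr"}\n',
    }
    for entry in merge_bundles(directory):
        print(entry)
```

Writers stay independent (no read-modify-write on a shared file), and the extra complexity lands on readers, which matches the trade-off described above.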


All source attestations MUST use the [Attestation bundle](./bundle.md) format.

Attestations in the bundle MAY have different subjects, but they SHOULD

Member

I suggest bumping this up to MUST to keep things simpler. Attestations in the bundle may have 1+ subjects, but at least one of those subjects must be the object whose directory the bundle is in. That avoids scenarios where an attestation is wrongly assumed to apply to the Git object whose directory it's in.

Author

From a structural standpoint, it doesn't really matter if these don't match?
Verifiers should validate, but the biggest issue if you don't follow this is that discoverability is hindered.

We want to avoid cruft here, but the security properties still hold if this is mucked with, so SHOULD feels appropriate?
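
As a rough sketch of the verifier-side check mentioned here: regardless of whether the spec says SHOULD or MUST, a verifier can confirm that an attestation actually names the directory's object as a subject before relying on it. The code assumes each bundle entry is a DSSE-style envelope whose base64 `payload` decodes to an in-toto Statement with a `subject` list of `{name, digest}` entries. The function names are hypothetical, and signature verification is out of scope for this sketch.

```python
import base64
import json


def subject_digests(envelope):
    """Collect every digest value listed in the statement's subjects.

    Assumes envelope["payload"] is a base64-encoded in-toto Statement.
    """
    statement = json.loads(base64.b64decode(envelope["payload"]))
    digests = set()
    for subject in statement.get("subject", []):
        digests.update(subject.get("digest", {}).values())
    return digests


def applies_to(envelope, object_hash):
    """True if the attestation lists object_hash among its subject digests."""
    return object_hash in subject_digests(envelope)
```

An attestation filed under a directory it does not name as a subject would fail `applies_to` and could be skipped, which is why a mismatch mainly hurts discoverability rather than the security properties.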

Signed-off-by: Billy Lynch <[email protected]>

@patzielinski patzielinski left a comment

No further suggestions from me from the gittuf perspective 🙂.

Co-authored-by: patzielinski <[email protected]>