Add timespec structured mtime metadata #236

Merged: 3 commits, Jan 7, 2020
Changes from 2 commits
27 changes: 25 additions & 2 deletions UNIXFS.md
@@ -61,12 +61,18 @@ message Data {
optional uint64 hashType = 5;
optional uint64 fanout = 6;
optional uint32 mode = 7;
- optional int64 mtime = 8;
+ optional TimeSpec mtime = 8;
}

message Metadata {
optional string MimeType = 1;
}

message TimeSpec {
required int64 EpochSeconds = 1;

optional fixed32 EpochNanoseconds = 2;
}
```

This `Data` object is used for all non-leaf nodes in Unixfs.
@@ -90,7 +96,9 @@ UnixFS currently supports two optional metadata fields:
- The remaining 20 bits are reserved for future use, and are subject to change. Spec implementations **MUST** handle bits they do not expect as follows:
- For future-proofing the (de)serialization layer must preserve the entire uint32 value during clone/copy operations, modifying only bit values that have a well defined meaning: `clonedValue = ( modifiedBits & 07777 ) | ( originalValue & 0xFFFFF000 )`
- Implementations of this spec must proactively mask off bits without a defined meaning in the implemented version of the spec: `interpretedValue = originalValue & 07777`
- * `mtime` -- The modification time in seconds since the epoch. This defaults to the unix epoch if unspecified
+ * `mtime` -- A two-element structure ( `EpochSeconds`, `EpochNanoseconds` ) representing the modification time in seconds relative to the unix epoch `1970-01-01T00:00:00Z`. In contexts where an mtime is mandatory ( e.g. FUSE interfaces ) implementations must treat an unspecified mtime as `0`.
Member

I think we should just default this to 0 as it is currently. The go-ipfs and js-ipfs implementations cannot control which context they will be used from so should default to the most conservative approach so that might as well be in the spec.

Contributor Author

My hesitation about placing this in the spec is that it implicitly means an unknown mtime is not a thing. E.g. a web gateway would be obligated to send out a 0 Last-Modified header, and so on...

What do you think?

Member

I think that's fine. At least, I can't think of a reason that makes it not fine, given that we need a default value and we're happy with seeing 1970 everywhere when we run `ipfs files ls -l`.

Member

To make this more concrete, I think a web gateway sending a 0 Last-Modified header is ok.

Contributor Author

:(

I will relent if this is a wider consensus, but then it makes me sad as this part of the spec becomes unusable to me as well. In my use case having a gateway render "this is the mtime" vs "mtime was not supplied" is really crucial.

- `EpochSeconds` represents the amount of seconds after **or before** the epoch. Implementations must be able to gracefully handle negative mtime, even if such a value is not applicable within their domain ( e.g. a POSIX filesystem )
- `EpochNanoseconds` represents the fractional part of the mtime as the amount of nanoseconds. The valid range for this value is the integer range `[1, 999999999]`. If a fractional part outside of this range is encountered, implementations should consider the entire metadata block invalid and abort processing it. Note that **a fractional value of `0` is NOT valid** - omit the nanosecond value altogether to represent whole seconds.
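
The rules in the two bullets above can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from any IPFS implementation: the `TimeSpec` tuple and the `validate_timespec` and `mtime_for_mandatory_context` helpers are hypothetical names, and `None` is used to model a field that was omitted on the wire.

```python
from typing import NamedTuple, Optional


class TimeSpec(NamedTuple):
    """Hypothetical in-memory mirror of the TimeSpec protobuf message."""
    epoch_seconds: int                       # may be negative (before 1970)
    epoch_nanoseconds: Optional[int] = None  # None models "field omitted"


def validate_timespec(ts: Optional[TimeSpec]) -> Optional[TimeSpec]:
    """Return ts unchanged if it follows the rules above, else raise ValueError.

    None models an unspecified mtime and is passed through untouched.
    """
    if ts is None:
        return None
    # EpochSeconds is an int64 on the wire; negative values are legal.
    if not -(2**63) <= ts.epoch_seconds < 2**63:
        raise ValueError("EpochSeconds does not fit in int64")
    # A fractional part that is present must lie in [1, 999999999]; 0 is NOT valid.
    if ts.epoch_nanoseconds is not None and not 1 <= ts.epoch_nanoseconds <= 999_999_999:
        raise ValueError("EpochNanoseconds outside [1, 999999999]")
    return ts


def mtime_for_mandatory_context(ts: Optional[TimeSpec]) -> TimeSpec:
    """Contexts that require an mtime (e.g. FUSE) treat an unspecified one as 0."""
    checked = validate_timespec(ts)
    return checked if checked is not None else TimeSpec(0)
```

Raising an exception is one possible way to "consider the entire metadata block invalid"; a real implementation might instead discard the metadata and continue.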

### Deduplication and inlining

@@ -192,6 +200,21 @@ This scheme would see metadata stored in an external database.

The downsides to this are that metadata would not be transferred from one node to another when syncing as [Bitswap] is not aware of the database, and in-tree metadata

### TimeSpec protobuf datatype rationale

#### EpochSeconds

The integer portion of the epoch is represented on the wire using a varint encoding. While this is inefficient for
negative values, it avoids introducing zig-zag encoding. Negative epoch values will be exceedingly rare, and there
could very well be value in having such cases stand out, while at the same time keeping the "usual" positive values easy
to eyeball. The varint representing the time of writing this text is 5 bytes long. It will remain so until
October 26, 3058 ( 34,359,738,367 ).
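
The byte counts above can be checked with a small Python sketch. It assumes the standard protobuf varint layout (7 payload bits per byte, negative `int64` values sent as 64-bit two's complement); `varint_len` and `LAST_5_BYTE_SECOND` are hypothetical names introduced only for this illustration.

```python
from datetime import datetime, timedelta, timezone


def varint_len(value: int) -> int:
    """Bytes needed for a protobuf varint holding an int64 value."""
    if value < 0:
        # Without zig-zag, negatives are sent as 64-bit two's complement.
        value += 1 << 64
    length = 1
    while value >= 0x80:  # each byte carries 7 payload bits
        value >>= 7
        length += 1
    return length


# Largest value that still fits in 5 varint bytes (5 * 7 = 35 bits).
LAST_5_BYTE_SECOND = (1 << 35) - 1  # 34,359,738,367

# Add a timedelta to the epoch rather than calling datetime.fromtimestamp,
# which can overflow for far-future values on some platforms.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
last_5_byte_moment = epoch + timedelta(seconds=LAST_5_BYTE_SECOND)

print(varint_len(LAST_5_BYTE_SECOND), last_5_byte_moment.date())
print(varint_len(-1))  # negative epochs take the full 10 bytes
```

The second `print` shows the inefficiency mentioned above: any negative `EpochSeconds` occupies 10 bytes on the wire, which also makes such values easy to spot.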

#### EpochNanoseconds

Since fractional values will very often be > 2^28 nanoseconds, that part is represented as a 4-byte `fixed32`,
[as per Google's recommendation](https://developers.google.com/protocol-buffers/docs/proto#scalar).
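
To illustrate the size argument, the following Python sketch compares the two encodings for nanosecond values. It assumes the standard protobuf wire layouts (little-endian 4 bytes for `fixed32`, 7 payload bits per byte for a varint); `encode_fixed32`, `encode_varint` and `share_needing_5_bytes` are hypothetical names used only here, and field tags are left out of the comparison.

```python
import struct


def encode_fixed32(value: int) -> bytes:
    """fixed32 payload: always 4 bytes, little-endian."""
    return struct.pack("<I", value)


def encode_varint(value: int) -> bytes:
    """Unsigned varint payload: 7 bits per byte, high bit set on all but the last."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


# A varint needs a 5th byte as soon as the value reaches 2^28.
for nanos in (1, (1 << 28) - 1, 1 << 28, 999_999_999):
    print(nanos, len(encode_varint(nanos)), len(encode_fixed32(nanos)))

# Share of the valid range [1, 999999999] that would cost 5 varint bytes.
share_needing_5_bytes = (999_999_999 - (1 << 28) + 1) / 999_999_999
```

Under these assumptions roughly 73% of valid nanosecond values are at least 2^28, so a varint would usually cost 5 bytes where `fixed32` always costs 4.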


[multihash]: https://tools.ietf.org/html/draft-multiformats-multihash-00
[CID]: https://docs.ipfs.io/guides/concepts/cid/
[Bitswap]: https://github.com/ipfs/specs/blob/master/BITSWAP.md