@chrysn chrysn commented Aug 24, 2021

Warning: This is based on a proposed clarification of the specification that is still under discussion. You might not want to merge this until the discussion there is concluded; this serves to illustrate, and as a reference implementation of the proposed canonicalization.

There is currently some over-escaping happening before the MD5 hashing; in particular, a colon (':') is escaped unnecessarily.

While discussion is ongoing at FDO and there may be follow-ups on the IETF side, I think that what is proposed here is what should be the canonical encoding.

Concrete changes:

  • The to-be-escaped set is changed from the illegal characters of the userinfo component (I don't understand why that was picked here) to the characters outside the set allowed in a path segment (named pchar in RFC 3986, which RFC 8089 imports via the path-absolute rule).
  • A ./ prefix is introduced iff there is a colon in the relative reference. (Otherwise the part before the colon would be read as a URI scheme, and parsing the reference back would give completely different results.)
  • Tests to capture the new behavior. (The existing tests didn't even notice the changes.)
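
The first two changes can be sketched in a few lines of Python. This is only an illustration of the proposed canonicalization, not the code in this pull request: the function names are hypothetical, the input is assumed to be a UTF-8 file name or relative path, and '/' is assumed to act as a segment separator that stays unescaped.

```python
import hashlib
from urllib.parse import quote

# Characters of RFC 3986 "pchar" that quote() would otherwise escape:
# the sub-delims plus ':' and '@'. Letters, digits and "-._~" (the
# unreserved set) are never escaped by quote().
PCHAR_EXTRA = "!$&'()*+,;=:@"


def canonical_relative_reference(path: str) -> str:
    """Percent-encode everything outside pchar (hypothetical helper).

    Assumption: '/' separates path segments and is left as-is.
    """
    encoded = quote(path, safe=PCHAR_EXTRA + "/")
    # A colon in a relative reference could be mistaken for the end of
    # a URI scheme, so prefix "./" whenever one is present.
    if ":" in encoded:
        encoded = "./" + encoded
    return encoded


def md5_of_reference(path: str) -> str:
    """MD5 hex digest of the canonical form (hypothetical helper)."""
    canonical = canonical_relative_reference(path)
    return hashlib.md5(canonical.encode("ascii")).hexdigest()
```

With this sketch, `canonical_relative_reference("a:b.png")` yields `./a:b.png` (the colon stays literal and the prefix is added), while `canonical_relative_reference("hello world.png")` yields `hello%20world.png` without a prefix.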

@chrysn chrysn marked this pull request as draft August 24, 2021 10:09