Fix character class range matching #570
Hrm, I thought the macOS CI would be new enough by now |
00fdbb5
to
4a58679
Compare
So it turns out the macOS CI does have a new enough toolchain, it's just that when testing through |
Splitting off the |
We talked about this, and we want to do the following errors. The workaround for 1-2 is to use a scalar escape, which is clearer and more explicit anyway and doesn't have the bug potential (especially since copy-paste might normalize them to NFC, etc.)
4a58679
to
5c6adcd
Compare
5c6adcd
to
087377b
Compare
Replace a couple of `#if os(Linux)` checks with a check to see if we have a newer stdlib available. This lets us emit an expected failure in the case where we're testing on an older stdlib.
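As a rough illustration of the commit above, a runtime availability check can stand in for an `#if os(Linux)` guard. This is only a sketch: the OS versions in the `#available` clause and the names `hasNewerStdlib`, `Outcome`, and `expectedOutcome` are assumptions made for the example, not taken from this PR.

```swift
// Hypothetical sketch: decide at run time whether a newer stdlib is present,
// instead of branching on the platform at compile time.
// ASSUMPTION: the OS versions below are placeholders for "new enough".
func hasNewerStdlib() -> Bool {
  if #available(macOS 13, iOS 16, tvOS 16, watchOS 9, *) {
    return true
  }
  return false
}

// Hypothetical outcome type used only by this example.
enum Outcome {
  case pass
  case expectedFailure
}

// With an older stdlib the test is recorded as an expected failure
// rather than being compiled out entirely.
func expectedOutcome() -> Outcome {
  return hasNewerStdlib() ? .pass : .expectedFailure
}
```

On platforms not named in the `#available` clause (Linux, for instance), the `*` wildcard makes the condition true, so the check falls through to the normal expectation there.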
Previously we performed a lexicographic comparison with the bounds of a character class range. However, this produced surprising results, and our implementation didn't properly handle case sensitivity. Update the logic to instead only allow bounds that are a single scalar under NFC. In grapheme semantic mode, the input is converted to NFC and then checked against the range. In scalar semantic mode, the input scalar is checked on its own. Additionally, fix the case sensitivity handling such that we check both the lowercase and uppercase versions of the input against the range.
087377b
to
cd5cc37
Compare
@swift-ci please test
This looks great, thank you!
@swift-ci please test
Previously we performed a lexicographic comparison with the bounds of a character class range. However, this produced surprising results, and our implementation didn't properly handle case sensitivity.
Update the logic to instead only allow bounds that are a single scalar under NFC. In grapheme semantic mode, the input is converted to NFC and then checked against the range. In scalar semantic mode, the input scalar is checked on its own. Additionally, fix the case sensitivity handling such that we check both the lowercase and uppercase versions of the input against the range.
Resolves #401
Resolves #395
rdar://96898279
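The matching rule in the description can be sketched roughly as follows. This is not the PR's implementation: `ScalarRange`, `singleNFCScalar`, and both `matches` methods are hypothetical names, and the sketch assumes Foundation's `precomposedStringWithCanonicalMapping` is an acceptable stand-in for the NFC conversion the engine performs.

```swift
import Foundation

// Hypothetical helper: returns the scalar if `c` is a single scalar under NFC.
func singleNFCScalar(_ c: Character) -> Unicode.Scalar? {
  // ASSUMPTION: Foundation's canonical precomposition is used here for NFC.
  let nfc = String(c).precomposedStringWithCanonicalMapping
  let scalars = nfc.unicodeScalars
  guard scalars.count == 1 else { return nil }
  return scalars.first
}

// Hypothetical model of a character class range such as `[a-z]`.
struct ScalarRange {
  let lower: Unicode.Scalar
  let upper: Unicode.Scalar

  // Rejects bounds that are not a single scalar under NFC.
  init?(_ lower: Character, _ upper: Character) {
    guard let lo = singleNFCScalar(lower),
          let hi = singleNFCScalar(upper),
          lo <= hi
    else { return nil }
    self.lower = lo
    self.upper = hi
  }

  func contains(_ scalar: Unicode.Scalar) -> Bool {
    return lower <= scalar && scalar <= upper
  }

  // Scalar semantic mode: the input scalar is checked on its own.
  func matches(scalar: Unicode.Scalar, caseInsensitive: Bool) -> Bool {
    if contains(scalar) { return true }
    guard caseInsensitive else { return false }
    // Check both the lowercase and the uppercase version of the input.
    let mappings = [
      scalar.properties.lowercaseMapping,
      scalar.properties.uppercaseMapping,
    ]
    for mapped in mappings {
      let scalars = mapped.unicodeScalars
      if scalars.count == 1, let only = scalars.first, contains(only) {
        return true
      }
    }
    return false
  }

  // Grapheme semantic mode: convert the input to NFC, then check the range.
  func matches(character: Character, caseInsensitive: Bool) -> Bool {
    guard let scalar = singleNFCScalar(character) else { return false }
    return matches(scalar: scalar, caseInsensitive: caseInsensitive)
  }
}
```

Under these assumptions, a decomposed input such as `"e\u{301}"` is composed to U+00E9 before the comparison, so it is tested against the range as one scalar, while a bound made of several scalars that do not compose (a flag emoji, for example) is rejected when the range is constructed.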