[5.9] Optimize search for start-anchored regexes #684
When a regex is anchored to the start of a subject, there's no need to search throughout a string for the pattern when searching for the first match: a prefix match is sufficient. This adds a regex compilation-time check about whether a match can only be found at the start of a subject, and then uses that to choose whether to defer to `prefixMatch` from within `firstMatch`.
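The equivalence this description relies on can be observed from the public API. The following is a minimal sketch, assuming a Swift 5.7+ toolchain where `Regex`, `firstMatch(of:)`, and `prefixMatch(of:)` are available; the helper `searchesAgree` is a hypothetical name used only for illustration and is not part of this PR.

```swift
// Illustrative check, not code from the PR: for a start-anchored regex,
// searching the whole subject and matching only at its start should
// succeed or fail together.
func searchesAgree(_ subject: String, _ regex: Regex<AnyRegexOutput>) -> Bool {
    let first = subject.firstMatch(of: regex)    // may try every start position
    let prefix = subject.prefixMatch(of: regex)  // tries only the start
    return (first != nil) == (prefix != nil)
}

// With default options, `^` matches only at the start of the subject.
let anchored = try! Regex("^foo")
let unanchored = try! Regex("foo")

for subject in ["foobar", "barfoo", ""] {
    print(subject.debugDescription, searchesAgree(subject, anchored))
}
```

For the unanchored pattern the two searches can disagree (for example on `"barfoo"`), which is why the shortcut would only be safe when the compile-time check succeeds.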
@swift-ci Please test
@swift-ci Please test macOS platform
e114521 to b58c24c
@swift-ci Please test macOS platform
9d0d9f5 to 12a8c9b
@swift-ci Please test macOS platform
The order of this property in MEProgram seems to determine whether or not it persists from the time it's stored to when it's accessed in RegexDSLTests. Clearly something else is going on here, but this works around the issue for now.
@swift-ci Please test
/// - `nil`: This node is inconclusive about where it can match.
///
/// In particular, non-required groups and option-setting groups are
/// inconclusive about where they can match.
How is `nil` actually different from `false`? What can the caller do with a `nil` result that they cannot do with a `false` result?
Returning `nil` means that the node doesn't determine the result either way. For example, in `/(?i)^foo/`, the `(?i)` node is encountered as the first child of a concatenation, and doesn't affect whether or not the pattern can only match at the start. Returning `nil` means that the search for a definitive answer can continue to the next sibling.
Note that callers don't have access to this method; the one they call just returns a `Bool`.
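The tri-state result described in this reply can be sketched with a toy AST. Everything below is hypothetical: the `Node` type, its cases, and both method names are made up for illustration and are not the types or functions touched by this PR. Treating an overall inconclusive result as `false` in the wrapper is also an assumption, chosen as the conservative option.

```swift
// Hypothetical miniature AST for illustration; these are not the real
// compiler types from the PR.
indirect enum Node {
    case startOfSubject          // like `^` with default options
    case literal(String)         // like `foo`
    case optionSetting           // like `(?i)`
    case optionalGroup(Node)     // like `(...)?`, a non-required group
    case concatenation([Node])
}

extension Node {
    /// `true`: can only match at the start; `false`: can match elsewhere;
    /// `nil`: this node is inconclusive, so the caller keeps looking.
    func canOnlyMatchAtStartImpl() -> Bool? {
        switch self {
        case .startOfSubject:
            return true
        case .literal:
            return false
        case .optionSetting, .optionalGroup:
            return nil
        case .concatenation(let children):
            // The first child with a definitive answer decides the result.
            for child in children {
                if let answer = child.canOnlyMatchAtStartImpl() {
                    return answer
                }
            }
            return nil
        }
    }

    /// Caller-facing wrapper (assumption): inconclusive is treated as `false`.
    func canOnlyMatchAtStart() -> Bool {
        canOnlyMatchAtStartImpl() ?? false
    }
}

// Models /(?i)^foo/: the option-setting node is skipped, then `^` decides.
let pattern = Node.concatenation([.optionSetting, .startOfSubject, .literal("foo")])
print(pattern.canOnlyMatchAtStart())  // prints "true"
```

If `(?i)` returned `false` instead of `nil`, the loop would stop at the first child and the anchor that follows it would never be considered.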
(This is a cherry-pick of #682.)