[5.9] Optimize search for start-anchored regexes #684

Merged

natecook1000
Member

When a regex is anchored to the start of a subject, there's no need to search throughout a string for the pattern when searching for the first match: a prefix match is sufficient.

This adds a check at regex compilation time for whether a match can only be found at the start of a subject, and uses the result to decide whether `firstMatch` should defer to `prefixMatch`.

(This is a cherry-pick of #682.)
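
The equivalence the description relies on can be observed from the public API. The following is a minimal sketch, not code from this PR: the helper name `firstAndPrefixAgree(on:)` and the pattern `\Afoo` are chosen for illustration, and it assumes a toolchain and OS version that ship Swift's built-in `Regex` type.

```swift
// Illustrative sketch; `firstAndPrefixAgree(on:)` is a hypothetical helper,
// not part of swift-experimental-string-processing.

/// Returns true when `firstMatch(of:)` and `prefixMatch(of:)` produce the
/// same result for a start-anchored pattern on `subject`.
func firstAndPrefixAgree(on subject: String) -> Bool {
  // `\A` anchors the pattern to the start of the subject.
  let anchored = try! Regex(#"\Afoo"#)

  // A general search tries each position in turn; a prefix match tries
  // only the first one. For an anchored pattern, every later position
  // fails at the anchor, so the two results coincide.
  let first = subject.firstMatch(of: anchored)
  let prefix = subject.prefixMatch(of: anchored)
  return first?.range == prefix?.range
}

let agreeOnMatch = firstAndPrefixAgree(on: "foobar")    // both match "foo" at the start
let agreeOnNoMatch = firstAndPrefixAgree(on: "barfoo")  // both return nil
```

The sketch only shows that the results agree; the saving described above comes from skipping the attempts at every later position when no match exists at the start.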

@natecook1000
Member Author

@swift-ci Please test

@natecook1000
Member Author

@swift-ci Please test macOS platform

@natecook1000
Member Author

@swift-ci Please test macOS platform

@natecook1000 force-pushed the `anchor_prefix_match_5_9` branch from e114521 to b58c24c on July 20, 2023 16:19
@natecook1000
Member Author

@swift-ci Please test macOS platform

@natecook1000
Member Author

@swift-ci Please test macOS platform

@natecook1000
Member Author

@swift-ci Please test macOS platform

@natecook1000 force-pushed the `anchor_prefix_match_5_9` branch from 9d0d9f5 to 12a8c9b on July 20, 2023 21:49
@natecook1000
Member Author

@swift-ci Please test macOS platform

@natecook1000
Member Author

@swift-ci Please test macOS platform

The order of this property in MEProgram seems to determine whether
or not it persists from the time it's stored to when it's accessed
in RegexDSLTests. Clearly something else is going on here, but this
works around the issue for now.
@natecook1000
Member Author

@swift-ci Please test

/// - `nil`: This node is inconclusive about where it can match.
///
/// In particular, non-required groups and option-setting groups are
/// inconclusive about where they can match.
Contributor

How is `nil` actually different from `false`? What can the caller do with a `nil` result that they cannot do with a `false` result?

Member Author

Returning `nil` means that the node doesn't determine the result either way. For example, in `/(?i)^foo/`, the `(?i)` node is encountered as the first child of a concatenation, and doesn't affect whether or not the pattern can only match at the start. Returning `nil` means that the search for a definitive answer can continue to the next sibling.

Note that callers don't have access to this method; the one they call just returns a `Bool`.
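
To make the three-state result concrete, here is a rough sketch of the idea on a toy tree. Everything in it is hypothetical: `Node`, `startAnchorState()`, and `canOnlyMatchAtStart` are illustrative names, not the types or methods in this PR, and the real analysis covers many more node kinds.

```swift
/// A toy regex tree, for illustration only (not the library's AST).
indirect enum Node {
  case startAnchor              // e.g. `^` or `\A`
  case literal(String)          // assumed non-empty; consumes input
  case options                  // e.g. `(?i)`; matches nothing itself
  case optional(Node)           // a non-required group; may be skipped
  case concatenation([Node])
  case alternation([Node])
}

extension Node {
  /// - `true`: a match through this node can only begin at the start.
  /// - `false`: a match through this node may begin elsewhere.
  /// - `nil`: this node is inconclusive; keep looking at the next sibling.
  func startAnchorState() -> Bool? {
    switch self {
    case .startAnchor:
      return true
    case .literal:
      return false
    case .options, .optional:
      return nil
    case .concatenation(let children):
      // The first child with a definitive answer decides the result.
      for child in children {
        if let answer = child.startAnchorState() { return answer }
      }
      return nil
    case .alternation(let children):
      // Every branch must be definitively anchored.
      return !children.isEmpty
        && children.allSatisfy { $0.startAnchorState() == true }
    }
  }

  /// The caller-facing entry point collapses "inconclusive" to `false`.
  var canOnlyMatchAtStart: Bool { startAnchorState() ?? false }
}

// Models `/(?i)^foo/`: `(?i)` is inconclusive, so the anchor decides.
let example = Node.concatenation([.options, .startAnchor, .literal("foo")])
let anchoredAtStart = example.canOnlyMatchAtStart
```

If `.options` returned `false` instead of `nil`, the loop over the concatenation would stop at `(?i)` and the anchor that follows it would never be consulted.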

@stephentyrone merged commit 4d40535 into swiftlang:swift/release/5.9 on Jul 25, 2023
@natecook1000 deleted the `anchor_prefix_match_5_9` branch on July 25, 2023 16:50