automata: make behavior more consistent when WhichCaptures::None is used #1303


Merged
BurntSushi merged 1 commit into master from ag/consistent-none-capture-behavior
Oct 9, 2025

@BurntSushi BurntSushi commented Oct 9, 2025

Specifically, when used with meta::Regex. Before this PR, if callers
built a meta::Regex with WhichCaptures::None, then it was possible
for find to sometimes return Some and sometimes return None,
just based on the sequence of previous search calls.

In particular, when WhichCaptures::None is used, some regex engines
(like the PikeVM) cannot report match offsets while some (like the
lazy DFA) can. This meant that if the meta regex engine happened to
select the lazy DFA for a Regex::find call, then it would return
Some. But if it happened to select the PikeVM, then it would return
None. Since engine selection can be influenced by the haystack itself,
this leads to the behavior of find being tied to the contents of the
haystack.
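
To make the failure mode concrete, here is a toy model of it using only the standard library. Everything in it is hypothetical: the function names, the substring "pattern", and the length-based selection rule are stand-ins chosen for illustration and are not how regex-automata is implemented. It only shows how two engines with different abilities, plus haystack-dependent selection, make the result of `find` depend on the haystack.

```rust
// Toy model only: these functions are hypothetical and do not mirror
// regex-automata's internals.

/// Stand-in for an engine that needs capture slots to report offsets
/// (the role the PikeVM plays in the description above).
fn slot_engine_find(hay: &str, needle: &str, has_slots: bool) -> Option<(usize, usize)> {
    let start = hay.find(needle)?;
    // Without slots there is nowhere to record the match offsets.
    if has_slots {
        Some((start, start + needle.len()))
    } else {
        None
    }
}

/// Stand-in for an engine that tracks offsets without capture slots
/// (the role the lazy DFA plays).
fn dfa_engine_find(hay: &str, needle: &str) -> Option<(usize, usize)> {
    hay.find(needle).map(|start| (start, start + needle.len()))
}

/// Hypothetical selection rule: short haystacks go to the slot-based
/// engine. The real heuristics are different; this is an assumption.
fn meta_find(hay: &str, needle: &str, has_slots: bool) -> Option<(usize, usize)> {
    if hay.len() < 8 {
        slot_engine_find(hay, needle, has_slots)
    } else {
        dfa_engine_find(hay, needle)
    }
}

fn main() {
    // Same pattern, both haystacks contain a match, capture slots disabled.
    println!("{:?}", meta_find("xxabc", "abc", false));
    println!("{:?}", meta_find("xxxxxxxxabc", "abc", false));
}
```

In this model the two calls disagree about whether offsets are available even though a match exists in both haystacks, which is the inconsistency described above.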

Instead, what we should do is make it so anything that returns match
offsets on a meta::Regex will always return None when
WhichCaptures::None is used, even if Regex::is_match returns
true.

(Yes, this is a weird option and it's crazy that Regex::is_match can
return true while Regex::find can return None. This was already
true before this PR and is a result of a very low level option that
optimizes for memory usage in specific circumstances. This sort of
wacky behavior can't be observed in the regex crate API, only in
regex-automata.)

@BurntSushi BurntSushi merged commit 8f5d947 into master Oct 9, 2025
18 checks passed
@BurntSushi BurntSushi deleted the ag/consistent-none-capture-behavior branch October 9, 2025 01:11