Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

automata: fix ID rollover bug in lazy DFA #1300

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
BurntSushi merged 1 commit into master from ag/fix-lazy-dfa-id-cache-rollover
Oct 6, 2025

Conversation

@BurntSushi
Copy link
Member

@BurntSushi BurntSushi commented Oct 6, 2025

The lazy DFA has a cache of transitions that it may clear from time to
time if it gets too full. One cleared, transitions are re-generated.

There are two ways the cache gets full. First is if it uses too much
memory. Second is if there are so many states that it exceeds
LazyStateID::MAX. You might expect this to be 2^32, but it's smaller
than that because of some bits reserved for tagging purposes.

When the cache is clearer, we have to be rather careful with our state.
For example, we are careful to "save" the current state so that we know
where to go next after the cache is cleared. And we need to re-map state
identifiers when this happens.

The abstraction for handling cache clearing is basically non-existent.
The current code basically tried to look before it leaps, and if the
cache might be cleared, then it will save the current state. (Saving
the current state is costly, so we don't always want to do it.) But if
the cache gets cleared and we think it definitely won't, then we don't
save the current state and things get FUBAR.

That's what happens in #1083 (I believe) and definitively what happens
in BurntSushi/ripgrep#3135. Specifically, the
"look before we leap" logic wasn't accounting for the number of states
exceeding the maximum. It was only accounting for memory usage.

Ideally we could have a better abstraction that makes this harder to get
wrong via a single point of truth on whether a cache gets cleared or
not, but this is tricky for perf reasons.

Fixes #1083
Fixes BurntSushi/ripgrep#3135

Guillermogsjc and kenorb reacted with heart emoji
The lazy DFA has a cache of transitions that it may clear from time to
time if it gets too full. One cleared, transitions are re-generated.
There are two ways the cache gets full. First is if it uses too much
memory. Second is if there are so many states that it exceeds
`LazyStateID::MAX`. You might expect this to be `2^32`, but it's smaller
than that because of some bits reserved for tagging purposes.
When the cache is clearer, we have to be rather careful with our state.
For example, we are careful to "save" the current state so that we know
where to go next after the cache is cleared. And we need to re-map state
identifiers when this happens.
The abstraction for handling cache clearing is basically non-existent.
The current code basically tried to look before it leaps, and if the
cache *might* be cleared, then it will save the current state. (Saving
the current state is costly, so we don't always want to do it.) But if
the cache gets cleared and we think it definitely won't, then we don't
save the current state and things get FUBAR.
That's what happens in #1083 (I believe) and definitively what happens
in BurntSushi/ripgrep#3135. Specifically, the
"look before we leap" logic wasn't accounting for the number of states
exceeding the maximum. It was only accounting for memory usage.
Ideally we could have a better abstraction that makes this harder to get
wrong via a single point of truth on whether a cache gets cleared or
not, but this is tricky for perf reasons.
Fixes #1083
Fixes BurntSushi/ripgrep#3135 
@BurntSushi BurntSushi merged commit d6b3546 into master Oct 6, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

No reviews

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

invalid 'from' id: LazyStateID(134217696) unexpected bug on paralellization of patterns

2 participants

AltStyle によって変換されたページ (->オリジナル) /