automata: make PikeVM cache initialization lazy #1302

Merged
BurntSushi merged 1 commit into master from ag/fix-nfa-memory-usage
Oct 7, 2025

@BurntSushi BurntSushi commented Oct 7, 2025

Prior to the advent of regex-automata, the PikeVM would decide how much
space it needed at the beginning of every search. In regex-automata, we
did away with that check at search time and moved it to the time at
which the cache is constructed. (The inputs to the sizing are currently
invariant in regex-automata, as they were in the old regex crate.)

The downside of this is that we create the caches for each regex engine
eagerly. So even if we never call the PikeVM (which is actually quite
common, since the lazy DFA handles nearly everything), we end up paying
for the memory of its cache. In many cases, this memory is likely
negligible, but it can be substantial if there are a lot of capture
groups, even if they aren't used, as in #1116.
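To give a feel for why capture groups dominate here, the following is a rough back-of-the-envelope model, not the crate's actual sizing formula. It assumes a Pike-style simulation that keeps one set of capture slots (a start and an end offset per group) for every NFA state, in each of two thread lists. The function name and the choice of `Option<usize>` as the slot type are hypothetical.

```rust
/// Rough, hypothetical estimate of the capture-slot storage a Pike-style VM
/// might reserve up front. This is an illustrative model only, not the
/// formula used by regex-automata.
fn estimated_slot_table_bytes(nfa_states: usize, capture_groups: usize) -> usize {
    // Assumption: each capture group needs a start and an end offset.
    let slots_per_state = 2 * capture_groups;
    // Assumption: each slot is stored as an optional offset.
    let slot_size = std::mem::size_of::<Option<usize>>();
    // Assumption: two thread lists (current and next), each holding one row
    // of slots per NFA state.
    2 * nfa_states * slots_per_state * slot_size
}
```

Under this model the reserved space grows with the product of NFA states and capture groups, so a pattern with many groups pays for all of them whether or not a search ever asks for capture offsets.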

We fix this by rearranging the meta regex engine wrappers so that they
no longer create caches eagerly. Instead, each cache is only initialized
when it is actually needed.
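The general shape of lazy cache initialization can be sketched as follows. This is a minimal illustration and not the code from this PR: `EngineCache`, `LazyEngineCache`, and their methods are hypothetical names, and the real wrappers in the meta regex engine may be structured differently.

```rust
/// Hypothetical stand-in for an engine's scratch space. Allocating it is the
/// expensive step we want to defer.
struct EngineCache {
    slots: Vec<Option<usize>>,
}

impl EngineCache {
    fn new(slot_count: usize) -> EngineCache {
        EngineCache { slots: vec![None; slot_count] }
    }

    /// Heap bytes reserved by this cache.
    fn memory_usage(&self) -> usize {
        self.slots.capacity() * std::mem::size_of::<Option<usize>>()
    }
}

/// Hypothetical wrapper that records how big the cache should be, but only
/// allocates it the first time the engine is actually used.
struct LazyEngineCache {
    slot_count: usize,
    inner: Option<EngineCache>,
}

impl LazyEngineCache {
    /// Cheap: stores the sizing input and allocates nothing.
    fn new(slot_count: usize) -> LazyEngineCache {
        LazyEngineCache { slot_count, inner: None }
    }

    /// Returns the cache, creating it on first use.
    fn get(&mut self) -> &mut EngineCache {
        let n = self.slot_count;
        self.inner.get_or_insert_with(|| EngineCache::new(n))
    }

    /// Reports zero until the cache has been created.
    fn memory_usage(&self) -> usize {
        self.inner.as_ref().map_or(0, |cache| cache.memory_usage())
    }
}
```

With a wrapper like this, a search that is fully handled by another engine never calls `get`, so the deferred cache costs only the size of the wrapper itself.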

This ends up making memory usage a bit lower than it was in `regex 1.7.3`.

Fixes #1116

@BurntSushi BurntSushi merged commit 977feeb into master Oct 7, 2025
18 checks passed

Development

Successfully merging this pull request may close these issues.

Increased memory usage when updating to regex 1.10
