-
Couldn't load subscription status.
- Fork 483
hir: lift alternations' common suffixes too #1248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This should probably produce better regexes internally; additionally, I know it will produce Hirs that are more amenable to being walked recursively (trading away performance for content-intelligence) to produce human-readable regex syntax. (My own usecase involves automating the operation of a ghastly pre-existing machine that takes PCREs.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is worrying me. I can't see why this is supposed to prevent stack overflow. Can you say more about why you have this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a mutual recursion here: Hir::alternation calls lift_common_prefix which calls lift_common_suffix which calls Hir::alternation (last line before the happy-path return). If I remember correctly, inlining lift_common_suffix reduced the rate of stack growth enough that one of these 3 functions would hit a base-case before the stack ran out of space. Thinking back, it would probably be a good idea to also inline lift_common_prefix.
(To be clear, I don't know whether inline(always) is supposed to affect intra-crate callsites this way - all I know is that in my full-blown use-case I got stack overflows from this mutual recursion until I added inline(always) here.)
Also, can you please write some tests for this?
This should probably produce better regexes internally; additionally, I know it will produce Hirs that are more amenable to being walked recursively (trading away performance for content-intelligence) to produce human-readable regex syntax.
(My own usecase involves automating the operation of a ghastly pre-existing machine that takes PCREs.)