Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

unix-ffi: re: convert to PCRE2 #737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
dpgeorge merged 1 commit into micropython:master from Ansuel:pcre2-conversion
Oct 31, 2023
Merged

Conversation

@Ansuel
Copy link
Contributor

@Ansuel Ansuel commented Sep 28, 2023

PCRE is marked as EOL and won't receive any new security update.

Convert the re module to PCRE2 API to enforce security. Additional dependency is now needed with uctypes due to changes in how PCRE2 return the match_data in a pointer and require special handling.

The converted module is tested with the test_re.py with no regression.

Neustradamus reacted with heart emoji
Copy link
Contributor Author

Ansuel commented Sep 28, 2023

(this is needed as we are deprecating pcre in openwrt and we are converting any user to pcre2)

Copy link
Member

dpgeorge commented Oct 4, 2023

Thanks for the contribution.

What's the difference between libpcre2-posix, libpcre2-8, libpcre2-16 and libpcre2-32?

It would be possible to support both the old PCRE and PCRE2, but I don't think it's worth the effort.

Copy link
Contributor Author

Ansuel commented Oct 4, 2023

The POSIX variant just provide wrapper to have POSIX interface for pcre2 API.
Pcre had the same variant and wasn't used here.

The -16 and -32 variant are used to support 16bit and 32bit string (UTF-16 and UTF-32)
UTF-8 should be enough and what is commonly used when migrating from pcre to pcre2

Supporting both versions would bloat the code with lots of ifdef and IMHO it's not worth it and pcre won't receive any update and would pose security risks.

BKPepe reacted with thumbs up emoji

Copy link
Member

dpgeorge commented Oct 4, 2023

The -16 and -32 variant are used to support 16bit and 32bit string (UTF-16 and UTF-32)
UTF-8 should be enough and what is commonly used when migrating from pcre to pcre2

Aah, I see, thanks for the info. Indeed we should use the -8 variant.

I see that the test_re.py provided in this re directory doesn't pass anymore. It needs to pass.

Copy link
Contributor Author

Ansuel commented Oct 4, 2023

@dpgeorge i remember it did pass correctly. Maybe i'm running it wrong? my test was compiling the lib and run the test using the compiled library. Something wrong in this?

Copy link
Member

dpgeorge commented Oct 5, 2023

I'm simply running the following in the unix-ffi/re directory with this PR checked out:

$ micropython
MicroPython v1.20.0-557-g5aec051f9 on 2023年10月05日; linux [GCC 13.2.1] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import re
>>> re.search(r"a+", "caaab").group(0)
''

That's not correct output.

Copy link
Contributor Author

Ansuel commented Oct 5, 2023

@dpgeorge mhh i'm confused

root@OpenWrt:~# micropython
MicroPython v1.20.0 on 2023年10月04日; linux [GCC 12.3.0] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import re
>>> re.search(r"a+", "caaab").group(0)
'aaa'

Copy link
Contributor Author

Ansuel commented Oct 5, 2023

Also by putting the uncompiled re.py in /usr/lib/micropython/unix produce the same output

Copy link
Contributor Author

Ansuel commented Oct 9, 2023

@dpgeorge I finally manage to repro the problem... It seems to be a problem with PCRE2_SIZE that is size_t and this change between 32bit and 64bit. Using ULONG handled both systems. (my test were done on a 32bit system)

@Ansuel Ansuel force-pushed the pcre2-conversion branch 2 times, most recently from bbfc3f8 to c35e738 Compare October 9, 2023 11:22
# pcre2_get_ovector_pointer return PCRE2_SIZE that is of type
# size_t. Use ULONG as type to support both 32bit and 64bit.
ov_buf = uctypes.bytearray_at(
ov_ptr, uctypes.sizeof({"field": 0 | uctypes.ULONG}) * (cap_count + 1) * 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this call to sizeof to get the size be done at global module level, so it only has to be done once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better now?

Copy link
Member

Thanks, I can confirm that it now works on my (64-bit) machine.

Ansuel reacted with rocket emoji

@Ansuel Ansuel force-pushed the pcre2-conversion branch 3 times, most recently from 54e92a9 to 1cbe8c4 Compare October 12, 2023 12:10
Copy link

BKPepe commented Oct 16, 2023

So, @dpgeorge it can be merged, right? :)

Copy link
Contributor

@projectgus projectgus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, thanks @Ansuel!

PCRE is marked as EOL and won't receive any new security update.
Convert the re module to PCRE2 API to enforce security. Additional
dependency is now needed with uctypes due to changes in how PCRE2 return
the match_data in a pointer and require special handling.
The converted module is tested with the test_re.py with no regression.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
@dpgeorge dpgeorge merged commit d8e163b into micropython:master Oct 31, 2023
Copy link
Member

Thanks for updating, this looks good now.

Merged.

Ansuel reacted with hooray emoji Neustradamus reacted with heart emoji

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

@dpgeorge dpgeorge dpgeorge left review comments

@projectgus projectgus projectgus approved these changes

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

AltStyle によって変換されたページ (->オリジナル) /