
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: re doesn't work with big charsets
Type: behavior Stage: resolved
Components: Library (Lib), Regular Expressions Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: ezio.melotti, mrabarnett, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2013-10-21 11:24 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name            Uploaded
re_bigcharset.patch  serhiy.storchaka, 2013-10-21 11:24
Messages (4)
msg200747 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 11:24
>>> import re
>>> re.compile('[%s]' % ''.join(map(chr, range(256, 2**16, 255))))
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 211, in _optimize_charset
    charmap[fixup(av)] = 1
IndexError: list assignment index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/re.py", line 213, in compile
    return _compile(pattern, flags)
  File "/home/serhiy/py/cpython/Lib/re.py", line 280, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 489, in compile
    code = _code(p, flags)
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 471, in _code
    _compile_info(code, p, flags)
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 459, in _compile_info
    _compile_charset(charset, flags, code)
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 177, in _compile_charset
    for op, av in _optimize_charset(charset, fixup):
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 220, in _optimize_charset
    return _optimize_unicode(charset, fixup)
  File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 342, in _optimize_unicode
    mapping = array.array('b', mapping).tobytes()
OverflowError: signed char is greater than maximum
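The two exceptions in the traceback can be reproduced in isolation. The following is only a sketch, assuming that `charmap` is a 256-entry list and that `mapping` holds small integer block indices; `block_indices` is a hypothetical stand-in and none of this is the actual sre_compile code.

```python
import array

# Failure 1: a 256-entry list cannot be indexed by a code point >= 256.
# (Assumption: charmap is a list of 256 flags, one per 8-bit character.)
charmap = [0] * 256
first_error = None
try:
    charmap[ord('\u0100')] = 1  # U+0100 is code point 256
except IndexError as exc:
    first_error = exc

# Failure 2: typecode 'b' is a signed char, so it holds -128..127 only.
# (Hypothetical data: 129 block indices, the last one being 128.)
block_indices = list(range(129))
second_error = None
try:
    array.array('b', block_indices)
except OverflowError as exc:
    second_error = exc

# For contrast only: the unsigned typecode 'B' holds 0..255.
# This is not a claim about what the attached patch does.
packed = array.array('B', block_indices).tobytes()
```

The reported pattern contains one character from nearly every 256-character block between U+0100 and U+FFFF, which is presumably how the number of distinct blocks grows past what a signed char can index.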
msg200748 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-21 11:30
@Serhiy: Could you please take a look at issue #13100?
msg200751 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 11:40
I encountered this bug while writing tests for a fragment of my large patch, which cleans up and optimizes the re module (the patch is too large to be committed all at once).
msg201163 - Author: Roundup Robot (python-dev) (Python triager) Date: 2013-10-24 19:05
New changeset d2bb0da45c93 by Serhiy Storchaka in branch '2.7':
Issue #19327: Fixed the working of regular expressions with too big charset.
http://hg.python.org/cpython/rev/d2bb0da45c93
New changeset 4431fa917f22 by Serhiy Storchaka in branch '3.3':
Issue #19327: Fixed the working of regular expressions with too big charset.
http://hg.python.org/cpython/rev/4431fa917f22
New changeset 10081a0ca4bd by Serhiy Storchaka in branch 'default':
Issue #19327: Fixed the working of regular expressions with too big charset.
http://hg.python.org/cpython/rev/10081a0ca4bd 
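On an interpreter that includes these changesets, the pattern from the original report should compile. A minimal check along those lines, assuming Python 3 (the name `big_charset` is hypothetical and this is not the test added by the changesets):

```python
import re

# Same character class as in the report: every 255th code point
# from U+0100 up to (but not including) U+10000.
members = [chr(cp) for cp in range(256, 2**16, 255)]
big_charset = re.compile('[%s]' % ''.join(members))

# Every listed character should match; a neighbouring one should not.
all_members_match = all(big_charset.match(ch) for ch in members)
outsider_matches = big_charset.match(chr(257)) is not None
```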
History
Date                 User              Action  Args
2022-04-11 14:57:52  admin             set     github: 63526
2013-10-24 19:25:42  serhiy.storchaka  set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2013-10-24 19:05:08  python-dev        set     nosy: + python-dev; messages: + msg201163
2013-10-21 12:01:44  serhiy.storchaka  link    issue19329 dependencies
2013-10-21 11:40:58  serhiy.storchaka  set     messages: + msg200751
2013-10-21 11:30:01  vstinner          set     messages: + msg200748
2013-10-21 11:27:30  vstinner          set     nosy: + vstinner
2013-10-21 11:24:02  serhiy.storchaka  create
