This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2015年11月04日 09:02 by The Compiler, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Messages (7) | |||
|---|---|---|---|
| msg254040 - (view) | Author: Florian Bruhin (The Compiler) * | Date: 2015年11月04日 09:02 | |
I just found this thanks to Hypothesis[1]:
>>> import re
>>> re.compile('(' * 500)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.5/re.py", line 224, in compile
return _compile(pattern, flags)
File "/usr/lib/python3.5/re.py", line 293, in _compile
p = sre_compile.compile(pattern, flags)
File "/usr/lib/python3.5/sre_compile.py", line 536, in compile
p = sre_parse.parse(p, flags)
File "/usr/lib/python3.5/sre_parse.py", line 829, in parse
p = _parse_sub(source, pattern, 0)
File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
itemsappend(_parse(source, state))
File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse
p = _parse_sub(source, state)
File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
itemsappend(_parse(source, state))
File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse
p = _parse_sub(source, state)
File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
itemsappend(_parse(source, state))
[...]
File "/usr/lib/python3.5/sre_parse.py", line 493, in _parse
subpattern = SubPattern(state)
RecursionError: maximum recursion depth exceeded
It seems a maximum recursion error has been treated as a bug before (like in issue401612), so I'm guessing this isn't intended here either.
[1] https://hypothesis.readthedocs.org/
|
|||
| msg254042 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年11月04日 09:27 | |
This isn't a bug, this is a limitation of the implementation. Regular expression parser is recursive and Python has a limit for recursion depth. You can increase this limit if needed.
>>> import sys, re
>>> sys.setrecursionlimit(2000)
>>> re.compile('(' * 500)
Traceback (most recent call last):
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 437, in _parse_sub
itemsappend(_parse(source, state))
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 778, in _parse
p = _parse_sub(source, state)
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 437, in _parse_sub
itemsappend(_parse(source, state))
...
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 778, in _parse
p = _parse_sub(source, state)
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 437, in _parse_sub
itemsappend(_parse(source, state))
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 781, in _parse
source.tell() - start)
sre_constants.error: missing ), unterminated subpattern at position 499
|
|||
| msg254046 - (view) | Author: Florian Bruhin (The Compiler) * | Date: 2015年11月04日 10:26 | |
I see how it's not possible to compile that pattern as it's using recursion - I don't mind that.
However, I think this should be handled and re-raised as a re.error ("Exception raised [...] when some other error occurs during compilation or matching.").
I think no matter what the pattern is, it's quite unexpected to get anything other than a re.error (or a TypeError) from re.compile.
|
|||
| msg254060 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年11月04日 15:54 | |
re.compile can also raise ValueError and OverflowError. Agree that may be these exceptions can be converted to re.error.
But it can also raise MemoryError, and KeyboardInterrupt. And these exceptions can't be converted to re.error.
RecursionError is closer to the latter case. It depends not only on the pattern itself, but on the place where it is called. A pattern that is compiled successfully at top level can produce a RecursionError if it is compiled deeply in the recursion.
>>> import re
>>> def r(n):
... if not n:
... return re.compile('x')
... return r(n-1)
...
>>> r(990)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in r
File "<stdin>", line 4, in r
...
File "<stdin>", line 4, in r
File "<stdin>", line 3, in r
File "/home/serhiy/py/cpython/Lib/re.py", line 224, in compile
return _compile(pattern, flags)
File "/home/serhiy/py/cpython/Lib/re.py", line 293, in _compile
p = sre_compile.compile(pattern, flags)
File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 555, in compile
p = sre_parse.parse(p, flags)
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 822, in parse
source = Tokenizer(str)
File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 225, in __init__
self.__next()
RecursionError: maximum recursion depth exceeded
>>> re.compile('x')
re.compile('x')
|
|||
| msg254236 - (view) | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2015年11月06日 22:22 | |
Is the recursion limit hit enough in real-world problems that we should consider replacing recursive calls and the implicit C call stack with loops and an explicit stack? |
|||
| msg254261 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年11月07日 07:52 | |
I think it is possible to make the parser non-recursive, but without reports about such real-world problems it looks premature and just will complicate the code. For years before 3.5 there was a limitation on only 100 capturing groups. |
|||
| msg278752 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016年10月16日 08:25 | |
Closed as not a bug. Unlike to issue401612 this is not an infinite recursion. Raising RecursionError depends on the context (it is more likely in deeply nested call) and there is a common workaround -- just increase maximum recursion depth using sys.setrecursionlimit(). There is nothing specific to regular expressions. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:23 | admin | set | github: 69736 |
| 2016年10月16日 08:25:41 | serhiy.storchaka | set | status: open -> closed resolution: not a bug messages: + msg278752 stage: resolved |
| 2015年11月07日 07:52:41 | serhiy.storchaka | set | messages: + msg254261 |
| 2015年11月06日 22:22:05 | terry.reedy | set | nosy:
+ terry.reedy messages: + msg254236 |
| 2015年11月04日 15:54:55 | serhiy.storchaka | set | messages: + msg254060 |
| 2015年11月04日 10:26:25 | The Compiler | set | messages: + msg254046 |
| 2015年11月04日 09:27:11 | serhiy.storchaka | set | messages: + msg254042 |
| 2015年11月04日 09:07:10 | SilentGhost | set | nosy:
+ mrabarnett components: + Regular Expressions versions: + Python 3.6 |
| 2015年11月04日 09:02:56 | The Compiler | create | |