Created on 2008-04-02 17:36 by jorendorff, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files

| File name | Uploaded | Description |
|---|---|---|
| issue-2537.patch | meador.inge, 2010-02-11 03:17 | patch against 2.7 trunk |
| Messages (9) | |||
|---|---|---|---|
| msg64865 | Author: Jason Orendorff (jorendorff) | Date: 2008-04-02 17:36 | |
Below, the second regexp seems just as guilty as the first to me.

    Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
    [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> re.compile(r'((x|y)*)*')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/re.py", line 180, in compile
        return _compile(pattern, flags)
      File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/re.py", line 233, in _compile
        raise error, v # invalid expression
    sre_constants.error: nothing to repeat
    >>> re.compile(r'((x|y+)*)*')
    <_sre.SRE_Pattern object at 0x18548>

I don't know if that error is to protect the sre engine from bad patterns or just a courtesy to users. If the former, it could be a serious bug.
|||
| msg64934 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2008-04-04 17:41 | |
I'm almost tempted to call the first of these a bug: isn't '((x|y)*)*' a perfectly valid (albeit somewhat redundant) regular expression? What am I missing here?

Even if there are issues with capturing, shouldn't the version without capturing subexpressions still work? I get:

    >>> re.compile(r'(?:(?:x|y)*)*')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/re.py", line 180, in compile
        return _compile(pattern, flags)
      File "/usr/lib/python2.5/re.py", line 233, in _compile
        raise error, v # invalid expression
    sre_constants.error: nothing to repeat
|||
| msg64950 | Author: Jason Orendorff (jorendorff) | Date: 2008-04-04 20:37 | |
Huh. Maybe you're right. JavaScript, Ruby, and Perl all accept both regexes, although no two agree on what should be captured:

    js> "xyyzy".replace(/((x|y)*)*/, "($1, $2)")
    (xyy, y)zy
    js> "xyyzy".replace(/((x|y+)*)*/, "($1, $2)")
    (xyy, yy)zy

    >> "xyyzy".sub(/((x|y)*)*/, "(\\1, \\2)")
    => "(, y)zy"
    >> "xyyzy".sub(/((x|y+)*)*/, "(\\1, \\2)")
    => "(, yy)zy"

      DB<1> $_ = 'xyyzy'; s/((x|y)*)*/(\1 \2)/; print
    ( )zy
      DB<2> $_ = 'xyyzy'; s/((x|y+)*)*/(\1 \2)/; print
    ( yy)zy

Ruby's behavior seems best to me.
|||
| msg99194 | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2010-02-11 03:17 | |
> Ruby's behavior seems best to me.

We can obtain the Ruby behavior easily. There is one check in the '_simple' function in sre_compile.py that needs to be removed (see attached patch). Whether or not the Ruby behavior is the "correct" behavior, I am still not sure. In any case, I think throwing an exception is too aggressive for this case.
|||
| msg99237 | Author: Matthew Barnett (mrabarnett) * (Python triager) | Date: 2010-02-11 20:49 | |
The re module is addressed in issue #2636.

BTW, my regex module behaves like Ruby:

    >>> regex.sub(r"((x|y)*)*", "(\\1, \\2)", "xyyzy", count=1)
    '(, y)zy'
    >>> regex.sub(r"((x|y+)*)*", "(\\1, \\2)", "xyyzy", count=1)
    '(, yy)zy'
|||
| msg99248 | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2010-02-12 02:12 | |
> The re module is addressed in issue #2636.

Wow, that issue thread is massive... What about the 're' module is addressed? Is 'regex' replacing 're'? Is 'regex' being rolled into 're'? Are they both going to exist?
|||
| msg99251 | Author: Matthew Barnett (mrabarnett) * (Python triager) | Date: 2010-02-12 02:27 | |
That issue started out as being about updating the re module and adding features that other languages already possess in their regex implementations (the last time any significant work was done on it was in 2003). The hope is that the new regex implementation will eventually replace the existing one, and putting it initially in a module called 'regex' allows it to be tested more easily. You can do:

    import regex as re

and existing code should still work.
|||
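The `import regex as re` suggestion in msg99251 can be made tolerant of the third-party module being absent. The following is only a minimal sketch, assuming `regex` may or may not be installed; the `BACKEND` name is purely illustrative and not part of either module.

```python
# Prefer the third-party 'regex' module when available, otherwise fall
# back to the standard library 're'. BACKEND is a hypothetical helper
# name used here only to record which module was picked.
try:
    import regex as re
    BACKEND = "regex"
except ImportError:
    import re
    BACKEND = "re"

# One of the patterns discussed in this issue; the whole match is the
# leading run of 'x' and 'y' characters.
match = re.match(r'((x|y+)*)*', 'xyyzy')
print(BACKEND, match.group(0))
```

Only the overall match is printed, since the thread shows that engines disagree on what the individual groups capture.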
| msg195662 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2013-08-19 20:30 | |
New changeset 7ab07f15d78c by Serhiy Storchaka in branch '3.3':
Issue #2537: Remove breaked check which prevented valid regular expressions.
http://hg.python.org/cpython/rev/7ab07f15d78c

New changeset f4271cc2dfb5 by Serhiy Storchaka in branch 'default':
Issue #2537: Remove breaked check which prevented valid regular expressions.
http://hg.python.org/cpython/rev/f4271cc2dfb5

New changeset 7b867a46a8b4 by Serhiy Storchaka in branch '2.7':
Issue #2537: Remove breaked check which prevented valid regular expressions.
http://hg.python.org/cpython/rev/7b867a46a8b4
|||
| msg195664 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-08-19 20:43 | |
This issue is a duplicate of issue1633953. See also issue18647. After some fixes in other parts of the re module this check has become even more invalid.
|||
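A quick way to check whether a given interpreter still rejects the patterns from the original report is to try compiling both. This is a minimal sketch, assuming an interpreter that includes the changesets listed in msg195662; `PATTERNS`, `compiles`, and `results` are illustrative names, not anything from the attached patch.

```python
import re

# The two patterns from the original report (msg64865).
PATTERNS = [r'((x|y)*)*', r'((x|y+)*)*']

def compiles(pattern):
    """Return True if `pattern` compiles, False if re.error is raised."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

# Map each pattern to whether this interpreter accepts it.
results = {p: compiles(p) for p in PATTERNS}
for pattern, ok in results.items():
    print(pattern, "compiles" if ok else "is rejected")
```

On an interpreter that predates the fix, the first pattern would be expected to show up as rejected, matching the traceback in the original report.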
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:32 | admin | set | github: 46789 |
| 2013-08-19 20:43:29 | serhiy.storchaka | set | status: open -> closed; versions: + Python 3.3; title: re.compile(r'((x|y+)*)*') should fail -> re.compile(r'((x|y+)*)*') should not fail; messages: + msg195664; resolution: fixed; stage: resolved |
| 2013-08-19 20:33:07 | serhiy.storchaka | link | issue1633953 superseder |
| 2013-08-19 20:30:24 | python-dev | set | nosy: + python-dev; messages: + msg195662 |
| 2012-11-27 22:47:00 | ezio.melotti | set | nosy: + serhiy.storchaka |
| 2012-11-25 15:47:41 | ezio.melotti | set | components: + Regular Expressions |
| 2012-11-25 15:12:53 | mark.dickinson | set | nosy: - mark.dickinson |
| 2012-11-25 14:23:22 | Ramchandra Apte | set | components: + Library (Lib), - Regular Expressions; versions: + Python 3.4 |
| 2010-02-12 02:27:16 | mrabarnett | set | messages: + msg99251 |
| 2010-02-12 02:12:23 | meador.inge | set | nosy: mark.dickinson, rsc, timehorse, jorendorff, ezio.melotti, mrabarnett, meador.inge; type: behavior; messages: + msg99248; components: + Regular Expressions |
| 2010-02-11 20:49:44 | mrabarnett | set | nosy: + mrabarnett; messages: + msg99237 |
| 2010-02-11 03:17:57 | meador.inge | set | files: + issue-2537.patch; nosy: + meador.inge; messages: + msg99194; keywords: + patch |
| 2009-05-12 14:25:39 | ezio.melotti | set | nosy: + ezio.melotti |
| 2008-09-28 19:23:32 | timehorse | set | nosy: + timehorse; versions: + Python 2.7, - Python 2.6 |
| 2008-04-24 21:02:18 | rsc | set | nosy: + rsc |
| 2008-04-04 20:37:57 | jorendorff | set | messages: + msg64950 |
| 2008-04-04 17:41:12 | mark.dickinson | set | nosy: + mark.dickinson; messages: + msg64934 |
| 2008-04-02 17:36:05 | jorendorff | create | |