Created on 2010-11-05 14:33 by Alexander.Schmolck, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (4) | |||
|---|---|---|---|
| msg120499 - (view) | Author: Alexander Schmolck (Alexander.Schmolck) | Date: 2010-11-05 14:33 | |
In certain cases a zero-width \Z match that should be replaced isn't.
An example might help:
re.compile('(?m)(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)').subn(lambda m:next('<'+k+'>' for k,v in m.groupdict().items() if v is not None), 'foobar ')
this gives
('foobar<trailing_ws>', 1)
I would have expected
('foobar<trailing_ws><no_final_newline>', 2)
Contrast this with the following behavior:
[m.span() for m in re.compile('(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)', re.M).finditer('foobar ')]
gives
[(6, 7), (7, 7)]
The matches are clearly not overlapping and the re module docs for sub say "Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.", so I would have expected two replacements.
This seems to be what perl is doing:
echo -n 'foobar ' | perl -pe 's/(?m)(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)/<$&>/g'
gives
foobar< ><>%
|
|||
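The report above can be reproduced with a short script. This is a minimal sketch, not part of the original message: it uses raw strings so that `\Z` reaches the regex engine without an invalid-escape warning, passes `re.M` as a flag instead of the inline `(?m)`, and introduces the helper name `label` purely for illustration. According to the messages in this issue, the `subn` result depends on the interpreter version, so the script prints it rather than assuming one.

```python
import re
import sys

# Same pattern as in the report, written as a raw string with re.M as a flag.
PATTERN = re.compile(
    r'(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)',
    re.M,
)


def label(match):
    """Return '<name>' for the named group that took part in the match."""
    return next('<' + k + '>' for k, v in match.groupdict().items() if v is not None)


text = 'foobar '

# finditer reports both matches: the trailing space and the empty match at the end.
spans = [m.span() for m in PATTERN.finditer(text)]

# subn is where the report and the expectation diverge on older interpreters.
result = PATTERN.subn(label, text)

print(sys.version_info[:3])
print(spans)
print(result)
```

If the two printed results disagree in count (two spans but one substitution), the interpreter shows the behavior described in this report.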
| msg120535 - (view) | Author: Matthew Barnett (mrabarnett) * (Python triager) | Date: 2010-11-05 21:09 | |
It's a bug caused by trying to avoid getting stuck when a zero-width match is found. Basically the fix is to advance one character after a zero-width match, but that doesn't always give the correct result. There are a number of related issues like issue #1647489 ("zero-length match confuses re.finditer()"). |
|||
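The mechanism described in the message above can be modelled in pure Python. The sketch below is an approximation written for illustration, not the actual C implementation in the `re` module: the function name `legacy_subn` is hypothetical, it only accepts a callable replacement, and the rule it encodes (skip an empty match that starts exactly where the previous replacement ended, then advance one character) is an assumption about how the pre-3.7 substitution loop behaved.

```python
import re


def legacy_subn(pattern, repl, string):
    """Approximate the older substitution loop (assumed, simplified model)."""
    pieces = []
    copied_to = 0   # end of the text already copied or replaced
    count = 0       # number of replacements made so far
    pos = 0         # where the next search starts
    while pos <= len(string):
        m = pattern.search(string, pos)
        if m is None:
            break
        start, end = m.span()
        if copied_to < start:
            pieces.append(string[copied_to:start])
        elif copied_to == start == end and count > 0:
            # Empty match directly after the previous match: ignore it and
            # step one character forward to avoid looping forever.
            pos = end + 1
            continue
        pieces.append(repl(m))
        copied_to = end
        count += 1
        # After an empty match, advance one character; otherwise resume at end.
        pos = end + 1 if start == end else end
    pieces.append(string[copied_to:])
    return ''.join(pieces), count


PATTERN = re.compile(
    r'(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)',
    re.M,
)


def label(match):
    """Return '<name>' for the named group that took part in the match."""
    return next('<' + k + '>' for k, v in match.groupdict().items() if v is not None)


legacy_result = legacy_subn(PATTERN, label, 'foobar ')
print(legacy_result)
```

Under this model the empty `\Z` match at position 7 is dropped because it begins where the `trailing_ws` match ended, which is consistent with the single replacement reported at the top of this issue.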
| msg228010 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2014-09-30 21:47 | |
@Serhiy can you take a look at this as I recall you've been doing some regex work? |
|||
| msg340960 - (view) | Author: Ma Lin (malin) * | Date: 2019-04-27 02:37 | |
This bug was fixed in Python 3.7, see issue32308.
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 23 2018, 23:31:17) [MSC v.1916 32 bit (Intel)] on win32
>>> re.compile('(?m)(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)').subn(lambda m:next('<'+k+'>' for k,v in m.groupdict().items() if v is not None), 'foobar ')
('foobar<trailing_ws>', 1)
Python 3.7.3rc1 (tags/v3.7.3rc1:69785b2127, Mar 12 2019, 22:37:55) [MSC v.1916 64 bit (AMD64)] on win32
>>> re.compile('(?m)(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)').subn(lambda m:next('<'+k+'>' for k,v in m.groupdict().items() if v is not None), 'foobar ')
('foobar<trailing_ws><no_final_newline>', 2) |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:08 | admin | set | github: 54537 |
| 2019-04-27 04:41:18 | serhiy.storchaka | set | status: open -> closed; resolution: out of date; stage: resolved |
| 2019-04-27 02:37:04 | malin | set | nosy: + malin; messages: + msg340960 |
| 2019-04-26 20:42:37 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2014-09-30 21:47:20 | BreamoreBoy | set | nosy: + BreamoreBoy, serhiy.storchaka; messages: + msg228010; versions: + Python 3.4, Python 3.5, - Python 3.1 |
| 2010-11-05 23:00:52 | terry.reedy | set | versions: + Python 2.7, - Python 2.6 |
| 2010-11-05 21:09:58 | mrabarnett | set | messages: + msg120535 |
| 2010-11-05 15:38:19 | r.david.murray | set | nosy: + mrabarnett |
| 2010-11-05 14:33:47 | Alexander.Schmolck | create | |