This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: MULTILINE confuses re.split
Type: behavior
Stage: resolved
Components: Regular Expressions
Versions: Python 2.7
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: re.sub confusion between count and flags args
View: 11957
Assigned To:
Nosy List: dabrahams, ezio.melotti, mrabarnett, serhiy.storchaka
Priority: normal
Keywords:

Created on 2012-08-02 14:58 by dabrahams, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg167228 - (view) Author: Dave Abrahams (dabrahams) Date: 2012-08-02 14:58
Compare the output of
$ python -c "open('/tmp/tst','w').write(100*'x\n');import re;print len(re.split('\n(?=x)', open('/tmp/tst').read()))"
100
with
$ python -c "open('/tmp/tst','w').write(100*'x\n');import re;print len(re.split('\n(?=x)', open('/tmp/tst').read(), re.MULTILINE))"
9
msg167240 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-08-02 17:05
re.split = split(pattern, string, maxsplit=0, flags=0)
 Split the source string by the occurrences of the pattern,
 returning a list containing the resulting substrings. If
 capturing parentheses are used in pattern, then the text of all
 groups in the pattern are also returned as part of the resulting
 list. If maxsplit is nonzero, at most maxsplit splits occur,
 and the remainder of the string is returned as the final element
 of the list.
In your first example maxsplit=0; in your second example maxsplit=8, because re.MULTILINE is 8 and is being passed as the third positional argument. This is not a bug; it is a misunderstanding of the signature.
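A minimal sketch of the point above, written for Python 3 (an assumption; the report itself uses Python 2.7) and using an in-memory string instead of /tmp/tst. The variable names are illustrative only. Passing `maxsplit=re.MULTILINE` explicitly is what the positional call `re.split(pattern, text, re.MULTILINE)` amounts to.

```python
import re

# Same data as in the report: 100 lines, each containing a single "x".
text = 100 * "x\n"
pattern = r"\n(?=x)"

# No extra argument: maxsplit defaults to 0, meaning "no limit".
no_limit = re.split(pattern, text)

# What the second command in the report effectively does: the flag value
# lands in the maxsplit slot, so at most 8 splits are performed.
flag_as_maxsplit = re.split(pattern, text, maxsplit=re.MULTILINE)

# The intended call: pass the flag by keyword.
flag_as_flags = re.split(pattern, text, flags=re.MULTILINE)

print(int(re.MULTILINE))      # numeric value of the flag
print(len(no_limit))          # all pieces
print(len(flag_as_maxsplit))  # 8 splits give 9 pieces
print(len(flag_as_flags))     # all pieces again
```

Passing `flags` by keyword avoids the mix-up entirely, since the third positional slot always belongs to `maxsplit`.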
msg167243 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2012-08-02 17:28
There are actually 2 issues here:
1. The third argument is 'maxsplit', the fourth is 'flags'.
2. It never splits on a zero-width match. See issue 3262.
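A small sketch related to the second point, assuming Python 3.7 or later, where `re.split` does split on empty matches (earlier versions, including the 2.7 discussed here, did not). It also checks how wide a match of the pattern from the report is: the lookahead itself is zero-width, but the `\n` in front of it consumes one character. The short sample string is an assumption for illustration.

```python
import re

text = "x\nx\nx\n"

# The pattern from the report consumes the newline; only the lookahead
# part is zero-width, so each match is one character wide.
m = re.search(r"\n(?=x)", text)
match_width = m.end() - m.start()

# A pattern that is purely a lookahead matches the empty string.
# Whether re.split splits on it depends on the Python version.
zero_width_parts = re.split(r"(?=x)", text)

print(match_width)
print(zero_width_parts)
```

Whatever the version, joining the pieces back together reproduces the original string, because a zero-width match removes no characters.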
msg167282 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-08-03 02:41
See also #11957.
msg167283 - (view) Author: Dave Abrahams (dabrahams) Date: 2012-08-03 02:47
Dang! Thanks, and sorry for wasting everyone's time on this.
History
Date User Action Args
2022-04-11 14:57:33 admin set github: 59742
2014-10-29 16:13:16 vstinner set superseder: re.sub confusion between count and flags args; resolution: not a bug -> duplicate
2012-08-04 21:07:14 r.david.murray link issue15536 superseder
2012-08-03 02:47:47 dabrahams set messages: + msg167283
2012-08-03 02:41:48 ezio.melotti set status: open -> closed; type: behavior; messages: + msg167282; resolution: not a bug; stage: resolved
2012-08-02 17:29:00 mrabarnett set messages: + msg167243
2012-08-02 17:05:06 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg167240
2012-08-02 14:58:56 dabrahams create