Author serhiy.storchaka
Recipients ezio.melotti, mrabarnett, rexdwyer, serhiy.storchaka
Date 2014-11-08.09:11:18
Message-id <1415437879.71.0.116422408869.issue22817@psf.upfronthosting.co.za>
In-reply-to
Content
It is possible to change this behavior (see example patch). With this patch:
>>> re.split(r'(?<=CA)(?=GCTG)', 'ACGTCAGCTGAAACCCCAGCTGACGTACGT')
['ACGTCA', 'GCTGAAACCCCA', 'GCTGACGTACGT']
>>> re.split(r'\b', "the quick, brown fox")
['', 'the', ' ', 'quick', ', ', 'brown', ' ', 'fox', '']
Unfortunately, this is a backward-incompatible change that will likely break existing code (and it does break tests). Consider the following example: re.split('(:*)', 'ab'). Currently the result is ['ab'], but with the patch it is ['', '', 'a', '', 'b', '', ''].
The third-party regex module [1] has a V1 flag that switches on this incompatible behavior change.
>>> regex.split('(:*)', 'ab')
['ab']
>>> regex.split('(?V1)(:*)', 'ab')
['', '', 'a', '', 'b', '', '']
>>> regex.split(r'(?<=CA)(?=GCTG)', 'ACGTCAGCTGAAACCCCAGCTGACGTACGT')
['ACGTCAGCTGAAACCCCAGCTGACGTACGT']
>>> regex.split(r'(?V1)(?<=CA)(?=GCTG)', 'ACGTCAGCTGAAACCCCAGCTGACGTACGT')
['ACGTCA', 'GCTGAAACCCCA', 'GCTGACGTACGT']
>>> regex.split(r'\b', "the quick, brown fox")
['the quick, brown fox']
>>> regex.split(r'(?V1)\b', "the quick, brown fox")
['', 'the', ' ', 'quick', ', ', 'brown', ' ', 'fox', '']
I don't know how to solve this issue without introducing such a flag (or adding a special boolean argument to re.split()).
As a workaround, I suggest using the regex module.
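If installing regex is not an option, the boolean-argument idea can be roughly emulated with the standard library alone. The sketch below is not the attached patch and not how re.split() is implemented; the helper name split_with_flag and its empty_matches parameter are made up for illustration. It walks re.finditer() and either skips or honors zero-width matches:

```python
import re

def split_with_flag(pattern, string, empty_matches=False):
    """Hypothetical helper: split `string` on matches of `pattern`.

    With empty_matches=False, zero-width matches are ignored (the behavior
    described as current in this message). With empty_matches=True, the
    string is also split at zero-width matches.
    """
    pieces = []
    last = 0
    for m in re.finditer(pattern, string):
        if m.start() == m.end() and not empty_matches:
            continue  # ignore zero-width matches unless explicitly requested
        pieces.append(string[last:m.start()])
        pieces.extend(m.groups())  # include captured groups, as re.split() does
        last = m.end()
    pieces.append(string[last:])
    return pieces

dna = 'ACGTCAGCTGAAACCCCAGCTGACGTACGT'
print(split_with_flag(r'(?<=CA)(?=GCTG)', dna, empty_matches=True))
print(split_with_flag('(:*)', 'ab'))
print(split_with_flag('(:*)', 'ab', empty_matches=True))
```

Because the flag defaults to False, existing callers would keep the old results, and only code that opts in sees splits at lookaround or \b positions. The maxsplit argument of re.split() is not handled here.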
[1] https://pypi.python.org/pypi/regex 
History
Date                 User              Action  Args
2014-11-08 09:11:19  serhiy.storchaka  set     recipients: + serhiy.storchaka, ezio.melotti, mrabarnett, rexdwyer
2014-11-08 09:11:19  serhiy.storchaka  set     messageid: <1415437879.71.0.116422408869.issue22817@psf.upfronthosting.co.za>
2014-11-08 09:11:19  serhiy.storchaka  link    issue22817 messages
2014-11-08 09:11:19  serhiy.storchaka  create