This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Lookbehind assertions go behind the start position for the match
Type: behavior
Stage: resolved
Components: Documentation, Regular Expressions
Versions: Python 3.2, Python 3.3, Python 2.7

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Devin Jeanpierre, docs@python, ezio.melotti, mrabarnett
Priority: normal
Keywords:

Created on 2012-02-12 08:54 by Devin Jeanpierre, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (4)
msg153188 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2012-02-12 08:54
A compiled regex object's match method offers an optional "pos" parameter, described as roughly equivalent to slicing the string except for how it treats the "^" anchor. See http://docs.python.org/library/re.html#re.RegexObject.search
However, the behavior of lookbehind assertions also differs from slicing:
>>> import re
>>> re.compile("(?<=a)b").match("ab", 1)
<_sre.SRE_Match object at 0x...>
>>> re.compile("(?<=a)b").match("ab"[1:])
>>>
This alone might be a documentation bug, but the behavior is also inconsistent with the behavior of lookahead assertions, which do *not* look past the endpos:
>>> re.compile("a(?=b)").match("ab", 0, 1)
>>> re.compile("a(?=b)").match("ab")
<_sre.SRE_Match object at 0x...>
>>>
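For comparison, the one difference the documentation does spell out, the handling of "^", can be checked the same way. This is a minimal sketch using only the standard re module; the pattern and the sample string are chosen here for illustration and are not from the original report.

```python
import re

caret = re.compile(r"^b")

# With pos=1 the search starts at index 1, but "^" still refers to the
# real beginning of the string, so nothing matches.
with_pos = caret.search("ab", 1)

# Slicing produces a new string whose real beginning is "b".
with_slice = caret.search("ab"[1:])

print(with_pos)                # None
print(with_slice is not None)  # True
```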
msg153284 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2012-02-13 17:39
The documentation says of the 'pos' parameter "This is not completely equivalent to slicing the string" and of the 'endpos' parameter "it will be as if the string is endpos characters long".
In other words, it starts searching at 'pos' but truncates at 'endpos'.
Yes, it's inconsistent, but it's documented.
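The asymmetry can be seen by running each pattern twice, once with pos/endpos and once against the equivalent slice. The sketch below assumes nothing beyond the standard re module; compare_with_slicing is a hypothetical helper written for this illustration, not part of re.

```python
import re

def compare_with_slicing(pattern, string, pos, endpos):
    """Return (matched using pos/endpos, matched using string[pos:endpos])."""
    regex = re.compile(pattern)
    via_args = regex.match(string, pos, endpos) is not None
    via_slice = regex.match(string[pos:endpos]) is not None
    return via_args, via_slice

# Lookbehind can see the "a" before pos, so the two approaches disagree.
print(compare_with_slicing(r"(?<=a)b", "ab", 1, 2))  # (True, False)

# Lookahead cannot see the "b" past endpos, so both fail alike.
print(compare_with_slicing(r"a(?=b)", "ab", 0, 1))   # (False, False)

# "$" treats endpos as the end of the string, so both succeed alike.
print(compare_with_slicing(r"a$", "ab", 0, 1))       # (True, True)
```

Only the start side shows a difference: text before pos stays visible to the pattern, while text from endpos onward is treated as absent.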
msg153285 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2012-02-13 17:54
If it's intended behaviour, then I'd request that the documentation specifically mention lookbehind assertions the way it does with "^".
Saying "it's slightly different" doesn't make clear the ways in which it is different, and that's important for people writing or using regular expressions.
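Where true slicing semantics are wanted, one possible workaround is to slice explicitly and shift the resulting offsets back. This is only a sketch: match_like_slice is a hypothetical helper, not an re API, and it assumes 0 <= pos <= endpos <= len(string).

```python
import re

def match_like_slice(regex, string, pos=0, endpos=None):
    """Match against string[pos:endpos] and return (start, end) offsets
    relative to the full string, or None if there is no match.

    Hypothetical helper; assumes 0 <= pos <= endpos <= len(string).
    """
    if endpos is None:
        endpos = len(string)
    m = regex.match(string[pos:endpos])
    if m is None:
        return None
    # Offsets from the match are relative to the slice; shift them back.
    return m.start() + pos, m.end() + pos

# The lookbehind can no longer see the "a" that precedes the slice.
print(match_like_slice(re.compile(r"(?<=a)b"), "ab", 1))  # None

# A plain pattern still matches, with offsets into the original string.
print(match_like_slice(re.compile(r"b"), "ab", 1))        # (1, 2)
```

The cost is a copy of the sliced portion of the string, which the pos and endpos parameters avoid.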
msg154626 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-02-29 12:34
IMHO the documentation is fine as is. Using pos in combination with lookarounds that match at the beginning/end of the "slice" seems a rather uncommon corner case, and I don't think it's worth documenting. Even if it were documented, as a user, I would just try it from the interpreter anyway, rather than checking the docs for some prose to decipher.
History
Date User Action Args
2022-04-11 14:57:26  admin  set  github: 58206
2012-02-29 12:34:34  ezio.melotti  set  status: open -> closed; assignee: docs@python; components: + Documentation; versions: + Python 3.3, - Python 3.1; nosy: + docs@python; messages: + msg154626; resolution: wont fix; stage: resolved
2012-02-13 17:54:24  Devin Jeanpierre  set  messages: + msg153285
2012-02-13 17:39:22  mrabarnett  set  nosy: + mrabarnett; messages: + msg153284
2012-02-12 08:54:43  Devin Jeanpierre  create
