[Python-Dev] Finding overlapping matches with re assertions: bug or feature?

Fri, 15 Nov 2013 01:36:58 -0800

I was surprised to find that "this works": if you want to find all
_overlapping_ matches for a regexp R, wrap it in
 (?=(R))
and feed it to (say) finditer. Here's a very simple example, finding
all overlapping occurrences of "xx":
    import re
    pat = re.compile("(?=(xx))")
    for it in pat.finditer("xxxx"):
        print(it.span(1))
That displays:
 (0, 2)
 (1, 3)
 (2, 4)
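For anyone who wants to reuse the trick, here is a minimal sketch of it wrapped in a helper. The name overlapping_matches is hypothetical (nothing like it exists in the re module), and the sketch assumes the pattern is passed as a plain string without inline global flags. It appears to work because the lookahead itself matches the empty string, so finditer moves on one character after each hit, while group 1 still records what the inner pattern matched. Note that this yields at most one match per starting position (whichever one the engine prefers there), not every possible substring.

```python
import re

def overlapping_matches(pattern, text):
    """Yield (start, end, matched_text), at most one match per start position.

    Hypothetical helper for illustration; not part of the re module.
    Assumes `pattern` is a plain string with no inline global flags.
    """
    # Put the pattern in a capturing group inside a zero-width lookahead.
    wrapped = re.compile("(?=(" + pattern + "))")
    for m in wrapped.finditer(text):
        # Group 0 is empty; group 1 holds what the inner pattern matched.
        yield m.start(1), m.end(1), m.group(1)

# Plain findall consumes each match, so the results cannot overlap.
plain = re.findall("xx", "xxxx")

# The lookahead version reports a match at every position where one starts.
spans = [(s, e) for s, e, _ in overlapping_matches("xx", "xxxx")]

print(plain)
print(spans)
```

If the inner pattern has groups of its own, they are renumbered starting from 2, since the wrapper takes group 1.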
Is that a feature? Or an accident? It's very surprising to find a
non-empty match inside an empty match (the outermost lookahead
assertion). If it's intended behavior, it's just in time for the
holiday season; e.g., to generate ASCII art for half an upside-down
Christmas tree:
    pat = re.compile("(?=(x+))")
    for it in pat.finditer("xxxxxxxxxx"):
        print(it.group(1))
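To make the shape of that output explicit, here is a small sketch that collects the rows instead of printing them one at a time. The variable names are mine. Since x+ is greedy, each row should run from its starting position to the end of the string, so the rows shrink by one character each time.

```python
import re

pat = re.compile("(?=(x+))")

# One row per starting position; the greedy x+ grabs everything to the end.
rows = [m.group(1) for m in pat.finditer("xxxxxxxxxx")]
row_lengths = [len(r) for r in rows]

print("\n".join(rows))
print(row_lengths)
```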