Created on 2013-11-09 15:15 by brandon-rhodes, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (6) | |||
|---|---|---|---|
| msg202480 - (view) | Author: Brandon Rhodes (brandon-rhodes) * | Date: 2013-11-09 15:15 | |
Regular expression re.MatchObject objects are sequences. They contain at least one "group" string, possibly more, which are integer-indexed starting at zero. Today, groups can be accessed in one of two ways. (1) You can call the method match.group(N). (2) You can call glist = match.groups() and then access each group as glist[N-1]. Note the obvious off-by-one error: .groups() does not include "group zero", which contains the entire match, and therefore its indexes are off-by-one from the values you would pass to .group(). I propose that MatchObject gain a __getitem__(N) method whose return value for every N is the same as .group(N) as I think that match[N] is a quite obvious syntax for asking for one particular group of an RE match. The only objection I can see to this proposal is the obvious asymmetry between Group Zero and all subsequent groups of a regular expression pattern: zero means "the whole thing" whereas each of the others holds the content of a particular explicit set of parens. Looping over the elements match[0], match[1], ... of a pattern like this: r'(\d\d\d\d)/(\d\d)/(\d\d)' will give you *first* the *entire* match, and only then turn its attention to the three parenthesized substrings. My retort is that concentric groups can happen anyway: that Group Zero, holding the entire match, is not really as special as the newcomer might suspect, because you can always wind up with groups inside of other groups; it is simply part of the semantics of regular expressions that groups might overlap or might contain one another, as in: r'((\d\d)/(\d\d)) Description: (.*)' Here, we see that concentricity is not a special property of Group Zero, but in fact something that can happen quite naturally with other groups. 
The caller simply needs to imagine every regular expression being surrounded by an "automatic set of parentheses" to understand where Group Zero comes from, and how it will be ordered in the resulting sequence of groups relative to the subordinate groups within the string. If one or two people voice agreement here in this issue, I will be very happy to offer a patch. |
|||
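The off-by-one relationship between .group(N) and .groups(), and the proposed match[N] behaviour, can be sketched as follows. This is a minimal illustration, not the patch discussed in the thread: `IndexableMatch` is a hypothetical wrapper class introduced here only to show what "match[N] returns the same value as match.group(N)" would mean, and the sample input string is an assumption.

```python
import re

pattern = re.compile(r'(\d\d\d\d)/(\d\d)/(\d\d)')
m = pattern.match('2013/11/09')  # sample input, chosen for illustration

# .group(N) is indexed from zero; group 0 is the entire match.
whole = m.group(0)
year = m.group(1)

# .groups() omits group 0, so group N lives at index N - 1.
glist = m.groups()
assert glist[1 - 1] == m.group(1)


class IndexableMatch:
    """Hypothetical wrapper: wrapper[N] delegates to match.group(N)."""

    def __init__(self, match):
        self._match = match

    def __getitem__(self, n):
        return self._match.group(n)


im = IndexableMatch(m)

# Walking indexes 0..pattern.groups yields the whole match first,
# then each parenthesized group in order.
indexed = [im[i] for i in range(pattern.groups + 1)]
```

With this input, `indexed` starts with the full string `'2013/11/09'` and only then lists the three parenthesized substrings, which is the asymmetry the message describes.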
| msg202481 - (view) | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2013-11-09 15:29 | |
This is something that the regex module already has, and since it is/was supposed to replace the re module in the stdlib, I've been holding off on adding it to re for a long time. We also discussed this recently on #python-dev, and I think it's OK to add it, as long as it behaves the same way as it does in the regex module. If others agree, it would be great to do it before the 3.4 feature freeze (there aren't many days left). |
|||
| msg202488 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-11-09 17:02 | |
We discussed this recently on #python-dev, and I don't think it's worth adding indexing to the match object. It would be confusing that len(match) != len(match.groups()). I don't know of any use case for indexing; it doesn't add anything new except yet another way to access a group. This feature not only increases maintenance complexity, but also increases the number of things a Python programmer has to learn and remember. |
|||
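The length mismatch raised in the message above can be made concrete with a small sketch. The thread does not define what len(match) would return; the value below assumes, purely for illustration, that a sequence-like match would count group 0 along with the explicit groups. The sample input string is also an assumption.

```python
import re

m = re.match(r'(\d\d\d\d)/(\d\d)/(\d\d)', '2013/11/09')

# Number of explicit (parenthesized) groups, as returned by .groups().
explicit = len(m.groups())

# Hypothetical length if indexing started at group 0 and the match
# behaved like a sequence. Assumption for illustration only.
with_group_zero = m.re.groups + 1

# The two counts differ by exactly one: the implicit whole-match group.
difference = with_group_zero - explicit
```

Under that assumption the two lengths always differ by one, which is the source of the confusion the message anticipates.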
| msg202588 - (view) | Author: Greg Ward (gward) (Python committer) | Date: 2013-11-11 00:00 | |
>>> import this [...] There should be one-- and preferably only one --obvious way to do it. |
|||
| msg202693 - (view) | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2013-11-12 14:50 | |
I think the idea is to eventually deprecate the .group() API. |
|||
| msg269353 - (view) | Author: Berker Peksag (berker.peksag) * (Python committer) | Date: 2016-06-27 06:41 | |
Thanks for the detailed report! Issue 24454 is actually a duplicate of this but it has a patch and the idea was discussed by several core developers there. I'm going to close this one. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:53 | admin | set | github: 63735 |
| 2016-06-27 06:41:43 | berker.peksag | set | status: open -> closed superseder: Improve the usability of the match object named group API nosy: + berker.peksag messages: + msg269353 resolution: duplicate stage: needs patch -> resolved |
| 2013-11-12 14:50:47 | ezio.melotti | set | messages: + msg202693 |
| 2013-11-11 00:00:07 | gward | set | nosy: + gward messages: + msg202588 |
| 2013-11-09 17:02:41 | serhiy.storchaka | set | messages: + msg202488 |
| 2013-11-09 15:29:23 | ezio.melotti | set | nosy: + christian.heimes, serhiy.storchaka messages: + msg202481 stage: needs patch |
| 2013-11-09 15:20:29 | brandon-rhodes | set | versions: + Python 3.4, - Python 3.5 |
| 2013-11-09 15:15:27 | brandon-rhodes | create | |