Message 340301 - Python tracker


Author	scoder
Recipients	eli.bendersky, py.user, scoder, serhiy.storchaka
Date	2019-04-15 18:51:42
Message-id	<1555354302.43.0.512122129958.issue28238@roundup.psfhosted.org>

Content

lxml has a couple of nice features here:
- all tags in a namespace: "{namespace}*"
- a local name 'tag' in any (or no) namespace: "{*}tag"
- a tag without namespace: "{}tag"
- all tags without namespace: "{}*"
"{*}*" is also accepted but is the same as "*". Note that "*" is actually allowed as an XML tag name by the spec, but rare enough to hijack it for this purpose. I've actually never seen it used anywhere in the wild.
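For illustration, the patterns listed above can be tried out as follows. This is only a sketch and does not use lxml itself: it assumes Python 3.8 or later, where xml.etree.ElementTree accepts the same wildcard patterns in findall(). The sample document and the tags() helper are made up for the example.

```python
import xml.etree.ElementTree as ET  # assumes Python 3.8+ for wildcard support

# Hypothetical sample document: prefix x is bound to namespace "X", y to "Y".
root = ET.XML(
    '<a xmlns:x="X" xmlns:y="Y">'
    '<x:b><c/></x:b><b/><y:b/><x:d/>'
    '</a>'
)


def tags(pattern):
    """Return the tags of all descendants of root that match the pattern."""
    return [el.tag for el in root.findall(".//" + pattern)]


in_namespace_x = tags("{X}*")    # all tags in namespace "X"
any_namespace_b = tags("{*}b")   # local name 'b' in any (or no) namespace
no_namespace_b = tags("{}b")     # a 'b' tag without namespace
no_namespace_all = tags("{}*")   # all tags without namespace
everything = tags("{*}*")        # same result as tags("*")
```

Note that findall() only looks at direct children unless the path starts with ".//", which is why the helper prepends it.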
lxml's implementation isn't applicable to ElementTree (searching has been subject to excessive optimisation), but it shouldn't be hard to extend the one in ET's ElementPath.py module, as well as Element.iter() in ElementTree.py, to support this kind of tag comparison.
PR welcome.
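The kind of tag comparison meant here could be sketched roughly like this in pure Python. This is not the code from lxml or from ElementPath.py; split_tag(), tag_matches() and iter_matching() are hypothetical names, and the sketch assumes tags in the usual "{namespace}local" notation.

```python
import xml.etree.ElementTree as ET


def split_tag(name):
    """Split '{ns}local' into (ns, local); a bare name has namespace ''."""
    if name.startswith("{"):
        namespace, _, local = name[1:].partition("}")
        return namespace, local
    return "", name


def tag_matches(pattern, tag):
    """Hypothetical helper: does an element tag match a wildcard pattern?"""
    if pattern == "*":
        return True  # plain "*" keeps its existing meaning: match any tag
    pattern_ns, pattern_local = split_tag(pattern)
    tag_ns, tag_local = split_tag(tag)
    namespace_ok = pattern_ns == "*" or pattern_ns == tag_ns
    local_ok = pattern_local == "*" or pattern_local == tag_local
    return namespace_ok and local_ok


def iter_matching(root, pattern):
    """Yield root and its descendants whose tag matches the pattern."""
    for element in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(element.tag, str) and tag_matches(pattern, element.tag):
            yield element


# Hypothetical sample document for illustration.
sample = ET.XML('<a xmlns:x="X"><x:b/><b/><x:c/></a>')
matched_any_ns_b = [el.tag for el in iter_matching(sample, "{*}b")]
matched_no_ns = [el.tag for el in iter_matching(sample, "{}*")]
```

A real patch would presumably precompute the comparison once per search instead of splitting the pattern for every element, but the matching rules would stay the same.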
lxml's tests are here (and in the following test methods):
https://github.com/lxml/lxml/blob/359f693b972c2e6b0d83d26a329d2d20b7581c48/src/lxml/tests/test_etree.py#L2911
Note that they actually test the deprecated .getiterator() method for historical reasons. They should probably call .iter() instead these days.
lxml's ElementPath implementation is under src/lxml/_elementpath.py, but the tag comparison itself is done elsewhere, in Cython code (here, in case it matters):
https://github.com/lxml/lxml/blob/359f693b972c2e6b0d83d26a329d2d20b7581c48/src/lxml/apihelpers.pxi#L921-L1048

History
Date	User	Action	Args
2019-04-15 18:51:42	scoder	set	recipients: + scoder, eli.bendersky, py.user, serhiy.storchaka
2019-04-15 18:51:42	scoder	set	messageid: <1555354302.43.0.512122129958.issue28238@roundup.psfhosted.org>
2019-04-15 18:51:42	scoder	link	issue28238 messages
2019-04-15 18:51:42	scoder	create
