Created on 2010-04-01 01:37 by poke, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Messages (5) | |||
|---|---|---|---|
| msg102051 - (view) | Author: Patrick Westerhoff (poke) | Date: 2010-04-01 01:37 | |
When using xml.etree.ElementTree to parse external XML files, all XML comments within that file are being stripped out. I guess that happens because there is no comment handler in the expat parser.
Example:
test.xml
--------
<example>
    <nodeA />
    <!-- some comment -->
    <nodeB />
</example>
test.py
-------
from xml.etree import ElementTree
with open( 'test.xml', 'r' ) as f:
    xml = ElementTree.parse( f )
ElementTree.dump( xml )
Result
------
<example>
    <nodeA />
    <nodeB />
</example>
|
|||
| msg102078 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2010-04-01 09:01 | |
ElementTree does parse comments, it just omits them from the tree. A quick search led me to this page: http://effbot.org/zone/element-pi.htm
which can be further simplified:
from xml.etree import ElementTree
class MyTreeBuilder(ElementTree.TreeBuilder):
    def comment(self, data):
        self.start(ElementTree.Comment, {})
        self.data(data)
        self.end(ElementTree.Comment)
with open('c:/temp/t.xml', 'r') as f:
    xml = ElementTree.parse(f, parser=ElementTree.XMLParser(target=MyTreeBuilder()))
ElementTree.dump(xml)
Now, should ElementTree do this by default? It's not certain; note how effbot's sample needs to wrap the entire file in another 'document' element.
|
|||
| msg102110 - (view) | Author: Patrick Westerhoff (poke) | Date: 2010-04-01 17:24 | |
Thanks for your reply, Amaury. That page may well mean that ElementTree was never intended to keep such things by default. Still, it would be nice to have an easy way to simply enable it, instead of having to hack it in and depend on details of internal code (which might change in the future).
By the way, your code didn't work for me, but based on it and on that effbot page, I came up with the following solution, which works fine.
test.py
-------
from xml.etree import ElementTree
class CommentedTreeBuilder ( ElementTree.XMLTreeBuilder ):
    def __init__ ( self, html = 0, target = None ):
        ElementTree.XMLTreeBuilder.__init__( self, html, target )
        self._parser.CommentHandler = self.handle_comment
    def handle_comment ( self, data ):
        self._target.start( ElementTree.Comment, {} )
        self._target.data( data )
        self._target.end( ElementTree.Comment )
with open( 'test.xml', 'r' ) as f:
    xml = ElementTree.parse( f, parser = CommentedTreeBuilder() )
ElementTree.dump( xml )
|
|||
| msg102112 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2010-04-01 17:29 | |
Yes, my code uses the newer version of ElementTree, which will be included with 2.7 and 3.2. |
|||
| msg113322 - (view) | Author: Florent Xicluna (flox) * (Python committer) | Date: 2010-08-08 21:06 | |
IIUC it works like that by design. ElementTree 1.3 (which is part of Python 2.7 and 3.2) allows you to define your own parser that keeps comments (see previous comments). Close as "won't fix"? |
|||
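For readers arriving later: the following is a minimal sketch, not part of the original thread, of how the behaviour discussed above looks on a newer Python. It assumes Python 3.8 or later, where `TreeBuilder` accepts an `insert_comments` keyword, so no subclass is needed. The `SAMPLE` string and the helper name `parse_keeping_comments` are made up for illustration, and the XML is parsed from memory rather than from test.xml.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for the reporter's test.xml (illustrative only).
SAMPLE = "<example><nodeA /><!-- some comment --><nodeB /></example>"


def parse_keeping_comments(text):
    """Parse XML text and keep comments as nodes in the tree.

    Assumes Python 3.8+, where TreeBuilder takes insert_comments.
    """
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(text, parser=parser)


# Default behaviour: the comment is parsed but left out of the tree.
default_root = ET.fromstring(SAMPLE)
default_tags = [child.tag for child in default_root]

# Opt-in behaviour: the comment becomes a child whose tag is ET.Comment.
commented_root = parse_keeping_comments(SAMPLE)
commented_tags = [child.tag for child in commented_root]
serialized = ET.tostring(commented_root, encoding="unicode")

print(default_tags)
print(serialized)
```

On older versions without that keyword, the `TreeBuilder` subclass from msg102078 is the approach described in this thread.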
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:59 | admin | set | github: 52524 |
| 2011-10-29 02:35:58 | flox | set | status: open -> closed |
| 2010-08-08 21:06:58 | flox | set | type: behavior -> enhancement; versions: + Python 3.2, - Python 3.1; nosy: + scoder; messages: + msg113322; resolution: wont fix; stage: resolved |
| 2010-04-01 17:29:22 | amaury.forgeotdarc | set | messages: + msg102112 |
| 2010-04-01 17:24:05 | poke | set | messages: + msg102110 |
| 2010-04-01 13:35:26 | brian.curtin | set | nosy: + flox |
| 2010-04-01 09:01:50 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc, effbot; messages: + msg102078 |
| 2010-04-01 01:37:44 | poke | create | |