Problem inserting an element where I want it using lxml

Alan Meyer ameyer2 at yahoo.com
Wed Jan 5 00:57:37 EST 2011


I'm having some trouble inserting elements where I want them
using the lxml ElementTree (Python 2.6). I presume I'm making
some wrong assumptions about how lxml works and I'm hoping
someone can clue me in.
I want to process an xml document as follows:
For every occurrence of a particular element, no matter where it
appears in the tree, I want to add a sibling to that element with
the same name and a different value.
Here's the smallest artificial example I've found so far that
demonstrates the problem:
 <foo>
 <whatever>
 <something/>
 </whatever>
 <bingo>Add another bingo after this</bingo>
 <bar/>
 </foo>
What I'd like to produce is this:
 <foo>
 <whatever>
 <something/>
 </whatever>
 <bingo>Add another bingo after this</bingo>
 <bingo>New bingo 1</bingo>
 <bar/>
 </foo>
Here's my program:
-------- cut here -----
from lxml import etree
xml = """<?xml version="1.0" ?>
<foo>
 <whatever>
 <something/>
 </whatever>
 <bingo>Add another bingo after this</bingo>
 <bar/>
</foo>
"""
tree = etree.fromstring(xml)
# A list of all "bingo" element objects in the unmodified original xml
# There's only one in this example
elems = tree.xpath("//bingo")
# For each one, insert a sibling after it
bingoCounter = 0
for elem in elems:
    parent = elem.getparent()
    subIter = parent.iter()
    pos = 0
    for subElem in subIter:
        # Is it one we want to create a sibling for?
        if subElem == elem:
            newElem = etree.Element("bingo")
            bingoCounter += 1
            newElem.text = "New bingo %d" % bingoCounter
            newElem.tail = "\n"
            parent.insert(pos, newElem)
            break
        pos += 1
newXml = etree.tostring(tree)
print("")
print(newXml)
-------- cut here -----
The output follows:
-------- output -----
<foo>
 <whatever>
 <something/>
 </whatever>
 <bingo>Add another bingo after this</bingo>
 <bar/>
<bingo>New bingo 1</bingo>
</foo>
-------- output -----
Setting aside the whitespace issues, the bug in the program shows
up in the positioning of the insertion. I wanted and expected it
to appear immediately after the original "bingo" element,
and before the "bar" element, but it appeared after the "bar"
instead of before it.
Everything works if I take the "something" element out of the
original input document. The new "bingo" appears before the
"bar". But when I put it back in, the inserted bingo is out of
order. Why should that be? What am I misunderstanding?
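A quick way to probe this, offered as a sketch rather than a confirmed
diagnosis: it assumes that iter() on an element yields the element
itself followed by all of its descendants in document order, while
insert() takes an index among the direct children only. The fallback
to the standard library's xml.etree.ElementTree is an addition so the
snippet runs without lxml installed, and the variable names are
illustrative.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree has the same iter() semantics
    import xml.etree.ElementTree as etree

xml = """<foo>
 <whatever>
 <something/>
 </whatever>
 <bingo>Add another bingo after this</bingo>
 <bar/>
</foo>"""

parent = etree.fromstring(xml)

# iter() walks the element itself, then every descendant
iter_tags = [e.tag for e in parent.iter()]
# Iterating the element directly yields only its direct children
child_tags = [c.tag for c in parent]

print(iter_tags)   # ['foo', 'whatever', 'something', 'bingo', 'bar']
print(child_tags)  # ['whatever', 'bingo', 'bar']

iter_pos = iter_tags.index("bingo")    # what the counting loop computes
child_pos = child_tags.index("bingo")  # position among direct children
print("%d %d" % (iter_pos, child_pos))  # 3 1
```

If that assumption holds, the loop computes 3, which is past the last
of the three children, so insert() appends at the end. Removing
"something" makes the count 2, which only happens to land before "bar".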
Is there a more intelligent way to do what I'm trying to do?
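One possible approach, again only a sketch: walk each parent's direct
children, so that the index passed to insert() is a child index.
add_sibling_after is a hypothetical helper name, the fallback import is
an assumption made so the snippet runs without lxml, and copying the
original element's tail is just one way to keep the whitespace tidy.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: only calls common to lxml and the stdlib are used below
    import xml.etree.ElementTree as etree


def add_sibling_after(root, tag):
    """Insert a new <tag> element right after every existing <tag>.

    Hypothetical helper; returns the number of elements added.
    """
    counter = 0
    # Snapshot the parents and children first, so elements inserted
    # during the walk are not themselves visited
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                counter += 1
                new_elem = etree.Element(tag)
                new_elem.text = "New %s %d" % (tag, counter)
                new_elem.tail = child.tail  # reuse the whitespace
                # Index among the parent's direct children only
                idx = list(parent).index(child)
                parent.insert(idx + 1, new_elem)
    return counter


xml = """<foo>
 <whatever>
 <something/>
 </whatever>
 <bingo>Add another bingo after this</bingo>
 <bar/>
</foo>"""

root = etree.fromstring(xml)
added = add_sibling_after(root, "bingo")
result_tags = [c.tag for c in root]
print(result_tags)  # ['whatever', 'bingo', 'bingo', 'bar']
```

lxml elements also have addnext() and index() methods, which might
shorten this further; the sketch does not rely on them so that it stays
within the common ElementTree API.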
Thanks.
 Alan

