I'm trying to parse this XML using ElementTree in the latest version of Python. I'd like to count the number of APPINFO elements and then get the data out of the last APPINFO in the tree. So far I can get the number of APPINFO elements using
count = len(root.findall("./APPINFO"))
But how do I reference only the last one in the tree and extract the values?
<APPLICANT>
<APPINFO>
<FIRSTNAME>Joe</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
<OTHERNAME></OTHERNAME>
</APPINFO>
<APPINFO>
<FIRSTNAME>Peter</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
<OTHERNAME></OTHERNAME>
</APPINFO>
<APPINFO> <!-- I need the data out of this one only -->
<FIRSTNAME>John</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
<OTHERNAME></OTHERNAME>
</APPINFO>
</APPLICANT>
2 Answers
Here is a working example that counts the elements and accesses the last one. When working with lists, negative indices access elements from the end of the list.
from xml.etree import ElementTree as et
data = '''\
<APPLICANT>
<APPINFO>
<FIRSTNAME>Joe</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
<OTHERNAME></OTHERNAME>
</APPINFO>
<APPINFO>
<FIRSTNAME>Peter</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
<OTHERNAME></OTHERNAME>
</APPINFO>
<APPINFO>
<FIRSTNAME>John</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
<OTHERNAME></OTHERNAME>
</APPINFO>
</APPLICANT>'''
tree = et.fromstring(data)
appinfo = tree.findall("./APPINFO")
print(len(appinfo))
et.dump(appinfo[-1])
print(appinfo[-1].find('FIRSTNAME').text)
Output:
3
<APPINFO>
<FIRSTNAME>John</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME />
<OTHERNAME />
</APPINFO>
John
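If you want every value from the last APPINFO rather than a single field, one option is to iterate over its children. The following is a minimal sketch, not part of the original answer: the shortened sample data and the name `last_values` are assumptions made for illustration. Note that an empty element such as `<MIDDLENAME></MIDDLENAME>` has `text` equal to `None`, so the sketch normalizes that to an empty string.

```python
from xml.etree import ElementTree as et

# Shortened sample data (an assumption for this sketch).
data = '''\
<APPLICANT>
<APPINFO>
<FIRSTNAME>Joe</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
</APPINFO>
<APPINFO>
<FIRSTNAME>John</FIRSTNAME>
<LASTNAME>Smith</LASTNAME>
<MIDDLENAME></MIDDLENAME>
</APPINFO>
</APPLICANT>'''

root = et.fromstring(data)
appinfo = root.findall("./APPINFO")

# Iterating over an element yields its direct children.
# Empty elements have text None, so `or ""` turns that into "".
last_values = {child.tag: (child.text or "") for child in appinfo[-1]}
print(last_values)
```

Keep in mind that `appinfo[-1]` raises `IndexError` if there are no APPINFO elements at all, so check `len(appinfo)` first when the input may be empty.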
2 Comments
<DATA>
<EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST1</EMP_NAME1> </EMPLOYMENT>
<EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST2</EMP_NAME1> </EMPLOYMENT>
</DATA>
<DATA>
<EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST2</EMP_NAME1> </EMPLOYMENT>
<EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST4</EMP_NAME1> </EMPLOYMENT>
</DATA>
How do I pull the values out of each employment element in the latest instance of DATA?
data.findall('EMPLOYMENT'). If you need more detail, ask another question.

allAppInfo=root.findall("./APPINFO")
The above returns a list of elements.
count=len(allAppInfo)
The above returns the number of elements in the list allAppInfo.
last=allAppInfo[count-1]
The above returns the last element in the list which is the element at index count-1.
last=allAppInfo[-1]
The above also returns the last element in the list, because a negative index counts from the end of the list and -1 is the final item.
last=root.findall("./APPINFO")[-1]
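All of the list-indexing forms above raise `IndexError` when no APPINFO element exists. As an alternative sketch (not from the original answer), ElementTree's limited XPath support includes a `[last()]` position predicate, so `find()` can return the last matching child directly, or `None` when there is none. The sample XML and the names `first_name`, `empty_root`, and `missing` are assumptions for illustration.

```python
from xml.etree import ElementTree as et

# Small sample document (an assumption for this sketch).
root = et.fromstring(
    "<APPLICANT>"
    "<APPINFO><FIRSTNAME>Joe</FIRSTNAME></APPINFO>"
    "<APPINFO><FIRSTNAME>John</FIRSTNAME></APPINFO>"
    "</APPLICANT>"
)

# [last()] selects the last APPINFO child of root.
# find() returns None instead of raising when nothing matches.
last = root.find("./APPINFO[last()]")
if last is not None:
    # findtext() returns the default when the child tag is missing.
    first_name = last.findtext("FIRSTNAME", default="")
else:
    first_name = ""
print(first_name)

# With no APPINFO elements, find() simply returns None.
empty_root = et.fromstring("<APPLICANT/>")
missing = empty_root.find("./APPINFO[last()]")
print(missing)
```

This trades the list (and therefore the count) for a single lookup, so it fits best when only the last element is needed.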