Translate SOAP response into a CSV using Python

Question 1

I have this XML from a SOAP call:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
 <soapenv:Header/>
 <soapenv:Body>
 <SessionID xmlns="http://www.gggg.com/oog">5555555</SessionID>
 <QueryResult xmlns="http://www.gggg.com/oog/Query" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <Code>testsk</Code>
 <Records>
 <Record>
 <dim_id>1</dim_id>
 <resource_full_name>Administrator, Sir</resource_full_name>
 <resource_first_name>Sir</resource_first_name>
 <resource_last_name>Administrator</resource_last_name>
 <resource_email>[email protected]</resource_email>
 <resource_user_name>admin</resource_user_name>
 </Record>
 <Record>
 <dim_id>2</dim_id>
 <resource_full_name>scheduler, scheduler</resource_full_name>
 <resource_first_name>scheduler</resource_first_name>
 <resource_last_name>scheduler</resource_last_name>
 <resource_email>[email protected]</resource_email>
 <resource_user_name>scheduler</resource_user_name>
 </Record>

My goal: To parse each Record's sub-elements <dim_id> ... <resource_user_name> and save each record as a row in a CSV.

My Code:

import csv
import xml.etree.ElementTree as et
# One "vertical" list per field, filled by a separate pass over the tree
dim_id_list = []
full_name_list = []
first_name_list = []
last_name_list = []
resource_email_list = []
resource_user_name_list = []
root = et.parse('xml_stuff.xml').getroot()
for dim_id in root.iter('{http://www.gggg.com/oog/Query}dim_id'):
    dim_id_list.append(dim_id.text)
for resource_full_name in root.iter('{http://www.gggg.com/oog/Query}resource_full_name'):
    full_name_list.append(resource_full_name.text)
for resource_first_name in root.iter('{http://www.gggg.com/oog/Query}resource_first_name'):
    first_name_list.append(resource_first_name.text)
for resource_last_name in root.iter('{http://www.gggg.com/oog/Query}resource_last_name'):
    last_name_list.append(resource_last_name.text)
for resource_email in root.iter('{http://www.gggg.com/oog/Query}resource_email'):
    resource_email_list.append(resource_email.text)
for resource_user_name in root.iter('{http://www.gggg.com/oog/Query}resource_user_name'):
    resource_user_name_list.append(resource_user_name.text)
# Transpose the per-field lists back into one tuple per record
rows = zip(dim_id_list, full_name_list, first_name_list, last_name_list, resource_email_list, resource_user_name_list)
with open('test.csv', "w", encoding='utf16', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)

Is there a better way to loop through the Records? This code is terribly verbose. I tried this:

for record in root.findall('.//{http://www.gggg.com/oog/Query}Record'):
    dim_id = record.find('dim_id').text
    # Extract each attribute, save to list. etc.

But I am getting attribute errors trying to access each record's text property.

Answer

It makes little sense to slice the data into "vertical" lists, then transpose them back into rows using zip(). Not only is it cumbersome to do it that way, it's also fragile. If, for example, one record is missing its resource_email child element, then all subsequent rows will be off!
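To make the fragility concrete, here is a minimal sketch. The inline XML and the example.com addresses are made up for illustration; the second record deliberately has no resource_email element.

```python
import xml.etree.ElementTree as ET

NS = '{http://www.gggg.com/oog/Query}'
# Hypothetical sample data: the second <Record> lacks <resource_email>.
XML = """\
<Records xmlns="http://www.gggg.com/oog/Query">
  <Record><dim_id>1</dim_id><resource_email>a@example.com</resource_email></Record>
  <Record><dim_id>2</dim_id></Record>
  <Record><dim_id>3</dim_id><resource_email>c@example.com</resource_email></Record>
</Records>
"""
root = ET.fromstring(XML)
# Same approach as in the question: one list per field, then zip().
ids = [e.text for e in root.iter(NS + 'dim_id')]
emails = [e.text for e in root.iter(NS + 'resource_email')]
rows = list(zip(ids, emails))
print(rows)
# [('1', 'a@example.com'), ('2', 'c@example.com')]
```

Record 2 silently picks up record 3's email, and record 3 disappears because zip() stops at the shortest list.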

You can use writer.writerows(rows) instead of the for row in rows: writer.writerow(row) loop. Furthermore, you can pass a generator expression so that the CSV writer extracts records on the fly as needed.
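As a rough illustration of that point, the sketch below feeds writerows() a generator expression. The records list and the in-memory buffer are stand-ins chosen for this example, not part of the original code.

```python
import csv
import io

# Hypothetical in-memory records standing in for the parsed XML.
records = [
    {'dim_id': '1', 'resource_user_name': 'admin'},
    {'dim_id': '2', 'resource_user_name': 'scheduler'},
]
buf = io.StringIO(newline='')  # stand-in for the opened CSV file
writer = csv.writer(buf)
# writerows() consumes the generator one row at a time;
# no intermediate list of rows is built.
writer.writerows(
    (rec['dim_id'], rec['resource_user_name']) for rec in records
)
output = buf.getvalue()
print(repr(output))
# '1,admin\r\n2,scheduler\r\n'
```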

It's customary to import xml.etree.ElementTree as ET rather than as et.

Suggested solution

import csv
from xml.etree import ElementTree as ET
fieldnames = [
    'dim_id',
    'resource_full_name',
    'resource_first_name',
    'resource_last_name',
    'resource_email',
    'resource_user_name',
]
# The empty prefix maps unprefixed names in the path to this namespace (Python 3.8+)
ns = {'': 'http://www.gggg.com/oog/Query'}
xml_records = ET.parse('xml_stuff.xml').find('.//Records', ns)
with open('test2.csv', 'w', encoding='utf16', newline='') as f:
    # One dict per <Record>, keyed by the child tag with its '{namespace}' part stripped
    csv.DictWriter(f, fieldnames).writerows(
        {
            prop.tag.split('}', 1)[1]: prop.text
            for prop in xr
        }
        for xr in xml_records
    )
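The namespace handling is also the likely cause of the attribute errors in the question's second attempt: record.find('dim_id') searches for an element in no namespace, matches nothing, and returns None, so .text fails. A minimal sketch, assuming an inline XML string in place of xml_stuff.xml and an arbitrarily chosen prefix q:

```python
from xml.etree import ElementTree as ET

# Hypothetical inline sample standing in for xml_stuff.xml.
XML = """\
<QueryResult xmlns="http://www.gggg.com/oog/Query">
  <Records>
    <Record><dim_id>1</dim_id></Record>
  </Records>
</QueryResult>
"""
ns = {'q': 'http://www.gggg.com/oog/Query'}  # the prefix 'q' is arbitrary
root = ET.fromstring(XML)
record = root.find('.//q:Record', ns)
unqualified = record.find('dim_id')      # no namespace given: no match
qualified = record.find('q:dim_id', ns)  # namespace-qualified: matches
print(unqualified)     # None, so .text would raise AttributeError
print(qualified.text)  # 1
```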

If you are certain that each <Record> always has its child elements in the right order, you can simplify it further by not explicitly stating the element/field names:

import csv
from xml.etree import ElementTree as ET
ns = {
    '': 'http://www.gggg.com/oog/Query',
    'soapenv': 'http://schemas.xmlsoap.org/soap/envelope/',
}
records = ET.parse('xml_stuff.xml').find('soapenv:Body/QueryResult/Records', ns)
with open('test2.csv', 'w', encoding='utf16', newline='') as f:
    csv.writer(f).writerows(
        [prop.text for prop in r] for r in records
    )
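To see what that caveat means in practice, the sketch below runs both variants on made-up data in which one record has its children swapped and another lacks a field. The inline XML and the two-column fieldnames list are assumptions for illustration only.

```python
import csv
import io
from xml.etree import ElementTree as ET

# Hypothetical sample: record 2 has its children swapped, record 3 lacks a field.
XML = """\
<Records xmlns="http://www.gggg.com/oog/Query">
  <Record><dim_id>1</dim_id><resource_user_name>admin</resource_user_name></Record>
  <Record><resource_user_name>scheduler</resource_user_name><dim_id>2</dim_id></Record>
  <Record><dim_id>3</dim_id></Record>
</Records>
"""
records = ET.fromstring(XML)
fieldnames = ['dim_id', 'resource_user_name']

# Variant 1: DictWriter places each value in its column by name.
by_name = io.StringIO(newline='')
csv.DictWriter(by_name, fieldnames).writerows(
    {prop.tag.split('}', 1)[1]: prop.text for prop in r} for r in records
)

# Variant 2: the plain writer emits values in document order.
by_position = io.StringIO(newline='')
csv.writer(by_position).writerows([prop.text for prop in r] for r in records)

print(repr(by_name.getvalue()))      # '1,admin\r\n2,scheduler\r\n3,\r\n'
print(repr(by_position.getvalue()))  # '1,admin\r\nscheduler,2\r\n3\r\n'
```

With DictWriter, a missing field becomes an empty cell and the column order stays fixed; the positional version reproduces whatever order and count of children each record happens to have.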

200_success · Accepted Answer · 2022-07-28 17:47:56Z
