Generating .xml files based on .csv files

Question 1

This code reads data from a .csv file and generates a corresponding .xml file. Can you spot any places where refactoring would make it more efficient?

import csv
from xml.etree.ElementTree import Element, SubElement, Comment, tostring
from xml.etree.ElementTree import ElementTree
import xml.etree.ElementTree as etree

def main():
    reader = read_csv()
    generate_xml(reader)

def read_csv():
    with open ('1250_12.csv', 'r') as data:
        return list(csv.reader(data))

def generate_xml(reader):
    root = Element('Solution')
    root.set('version','1.0')
    tree = ElementTree(root)
    head = SubElement(root, 'DrillHoles')
    head.set('total_holes', '238')
    description = SubElement(head,'description')
    current_group = None
    i = 0
    for row in reader:
        if i > 0:
            x1,y1,z1,x2,y2,z2,cost = row
            if current_group is None or i != current_group.text:
                current_group = SubElement(description, 'hole',{'hole_id':"%s"%i})
                collar = SubElement (current_group, 'collar',{'':', '.join((x1,y1,z1))}),
                toe = SubElement (current_group, 'toe',{'':', '.join((x2,y2,z2))})
                cost = SubElement(current_group, 'cost',{'':cost})
        i+=1
    def indent(elem, level=0):
        i = "\n" + level*" "
        if len(elem):
            if not elem.text or not elem.text.strip():
                elem.text = i + " "
            if not elem.tail or not elem.tail.strip():
                elem.tail = i
            for elem in elem:
                indent(elem, level+1)
            if not elem.tail or not elem.tail.strip():
                elem.tail = i
        else:
            if level and (not elem.tail or not elem.tail.strip()):
                elem.tail = i
    indent(root)
    tree.write(open('holes1.xml','w'))

Question 2

One note: your loop for elem in elem rebinds elem to each child in turn, so after the loop elem refers to the last child rather than the original element, and you then access elem again. I'm not sure whether that is what you intend, but either way it is confusing.
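
To make the rebinding concrete, here is a minimal sketch. The tree and its tag names (parent, first, last) are made up for illustration and do not come from the question's data.

```python
from xml.etree.ElementTree import Element, SubElement

# Hypothetical two-child tree, used only to show the rebinding.
parent = Element('parent')
SubElement(parent, 'first')
SubElement(parent, 'last')

elem = parent
for elem in elem:  # the iterable is evaluated once, then elem is rebound to each child
    pass

# After the loop, elem refers to the last child, not to the parent.
rebound_tag = elem.tag
print(rebound_tag)
```

Any code after the loop that still expects elem to be the parent is therefore working on a child instead.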

l0b0 · Answer 1 · 2013-06-05 14:59:18Z

Minor stuff:

Remove the last import; etree is not used anywhere.
Merge the first two imports.

Possibly speed-improving stuff:

Avoid converting the csv.reader output to a list before returning it unless absolutely necessary; iterate over the rows directly instead (the file must then stay open while they are read).
Skip indent unless the output must be readable by a human using an editor that cannot format XML.
If you do need to indent the output, existing solutions are probably very efficient.
Use reader.next() (next(reader) in Python 3) to skip the header line in generate_xml; then you no longer need to check the value of i on every row.
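
Taken together, these suggestions might look roughly like the sketch below, written for Python 3. The function names build_tree and convert are my own, and two changes go beyond the suggestions and are assumptions on my part: the values are stored as element text instead of under an empty attribute name, and total_holes is derived from the row count instead of being hard-coded.

```python
import csv
from xml.etree.ElementTree import Element, ElementTree, SubElement


def build_tree(rows):
    """Build the XML tree from an iterable of CSV rows (header row first)."""
    rows = iter(rows)
    next(rows, None)  # skip the header line once; reader.next() in Python 2

    root = Element('Solution', {'version': '1.0'})
    head = SubElement(root, 'DrillHoles')
    description = SubElement(head, 'description')

    count = 0
    for count, row in enumerate(rows, start=1):
        x1, y1, z1, x2, y2, z2, cost = row
        hole = SubElement(description, 'hole', {'hole_id': str(count)})
        # Assumption: values become element text, not an attribute named ''.
        SubElement(hole, 'collar').text = ', '.join((x1, y1, z1))
        SubElement(hole, 'toe').text = ', '.join((x2, y2, z2))
        SubElement(hole, 'cost').text = cost

    # Assumption: derive the total instead of hard-coding '238'.
    head.set('total_holes', str(count))
    return ElementTree(root)


def convert(csv_path, xml_path):
    # The file stays open while the rows are consumed, so no list() is needed.
    with open(csv_path, newline='') as data:
        tree = build_tree(csv.reader(data))
    tree.write(xml_path, encoding='utf-8', xml_declaration=True)
```

Calling convert('1250_12.csv', 'holes1.xml') would reproduce the original file names. If indented output is required, xml.etree.ElementTree.indent (available from Python 3.9) is one existing solution that could be applied to the tree before writing.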

Anthon · Answer 2 · 2013-06-12 04:30:52Z

Don't write something like for elem in elem. Sooner or later, in a larger loop, you will overlook that elem refers to a different object before the loop than during and after it. Use a separate name for the loop variable:

for subelem in elem:
    indent(subelem, level+1)
if not subelem.tail or not subelem.tail.strip():
    subelem.tail = i

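Since indent(subelem, ...) already sets the tail, the second assignment may look redundant. Note, though, that it resets the last child's tail to the parent's indentation level so that the parent's closing tag lines up; removing it changes the output.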
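
For reference, here is a self-contained sketch of the helper with a distinct loop variable. The names child, pad and step are my own choices, and the two-space step is an assumption about the intended indent width. The check after the loop is kept in this sketch because it only touches the last child, whose tail determines where the parent's closing tag is placed.

```python
from xml.etree.ElementTree import Element, SubElement, tostring


def indent(elem, level=0, step="  "):
    """Indent elem in place; step is a hypothetical indent-width parameter."""
    pad = "\n" + level * step
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = pad + step
        if not elem.tail or not elem.tail.strip():
            elem.tail = pad
        child = None
        for child in elem:
            indent(child, level + 1, step)
        # Only the last child is adjusted here: its tail is reset to the
        # parent's padding so the parent's closing tag lines up.
        if not child.tail or not child.tail.strip():
            child.tail = pad
    elif level and (not elem.tail or not elem.tail.strip()):
        elem.tail = pad


# Small made-up tree to exercise the helper.
root = Element('Solution')
hole = SubElement(root, 'hole')
SubElement(hole, 'cost')
indent(root)
pretty = tostring(root, encoding='unicode')
print(pretty)
```

Because elem is never rebound, every reference to elem inside the function still means the element that was passed in.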
