Write to a new, modified Excel file from an original one

Question 1

I noticed that I have a HUGE bottleneck in the following function of mine, and I can't see how to make it faster.

These are the profiling test results (keep in mind that I'm using a PyQt GUI, so the times may be stretched):

def write_workbook_to_file(self, dstfilename):
    self.populaterownumstodelete()
    # actual column in the new file
    col_write = 0
    # actual row in the new file
    row_write = 0
    for row in (rows for rows in range(self.sheet.nrows) if rows not in self.row_nums_to_delete):
        for col in (cols for cols in range(self.sheet.ncols) if cols not in self.col_indexes_to_delete):
            self.wb_sheet.write(row_write, col_write, self.parseandgetcellvalue(row, col))
            col_write += 1
        row_write += 1
        col_write = 0

I ran a cProfile profiling test, and write_workbook_to_file() turned out to be the slowest function in my whole application. parseandgetcellvalue() isn't a problem at all.
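For readers who want to reproduce this kind of measurement, here is a minimal sketch of profiling a single call with cProfile. The function slow_write() is a hypothetical stand-in for write_workbook_to_file(), and profile_call() is an illustrative helper, not part of the original application. The sketch targets Python 3.

```python
import cProfile
import io
import pstats


def slow_write(n_rows, n_cols):
    """Hypothetical stand-in for write_workbook_to_file()."""
    total = 0
    for row in range(n_rows):
        for col in range(n_cols):
            total += row * col
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the top five entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_write, 200, 50)
print(report)
```

Profiling only the call of interest, rather than the whole GUI session, should keep event-loop time out of the numbers.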

Question 2

I don't know Python, but it looks like you're iterating through every cell (row and column) in the sheet. That will definitely be slow, just from the sheer number of iterations. What are you trying to accomplish exactly?

Question 3

I have some columns and rows in the original Excel file that should not be shown in the new file. This information is stored in col_indexes_to_delete and in row_nums_to_delete.
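The thread does not say what type these two containers are. If they happen to be lists, each "not in" test scans the list, whereas a set answers the same test by hashing. The sketch below uses hypothetical data and a hypothetical helper, kept_rows(), to show that both give the same result while letting you time the difference yourself.

```python
import timeit

# Hypothetical data: 2,000 rows, of which every other one is to be dropped.
n_rows = 2000
rows_to_delete_list = list(range(0, n_rows, 2))
rows_to_delete_set = set(rows_to_delete_list)


def kept_rows(n, to_delete):
    """Return the row indexes that survive the filter, in order."""
    return [row for row in range(n) if row not in to_delete]


# Both containers select exactly the same rows.
assert kept_rows(n_rows, rows_to_delete_list) == kept_rows(n_rows, rows_to_delete_set)

# Time the two variants; a list is scanned, a set is looked up by hash.
list_time = timeit.timeit(lambda: kept_rows(n_rows, rows_to_delete_list), number=5)
set_time = timeit.timeit(lambda: kept_rows(n_rows, rows_to_delete_set), number=5)
print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

Whether this matters in practice depends on how many indexes are being deleted, so it is worth measuring before changing the data structure.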

Question 4

Why not copy everything (all at once via a Range) and then delete what needs to be deleted, instead of only copying the final dataset?

Question 5

I would need to "move" the rows / columns then... I don't think that's a good idea. I've updated my code now (I'm so tired right now, haha).

Question 6

I've replaced the newest code block with the original one. There's no need for multiple of them.

Janne Karila · Accepted Answer · 2014-12-13 10:29:28Z

To make it faster, you could move this generator expression

    (cols for cols in range(self.sheet.ncols) if cols not in self.col_indexes_to_delete)

out of the loop, since it is currently rebuilt for every row, and use enumerate instead of explicitly incrementing variables.

Revised code:

def write_workbook_to_file(self, dstfilename):
    self.populaterownumstodelete()
    # Build the kept row and column indexes once, up front.
    # (xrange is Python 2; use range on Python 3.)
    rows = [row for row in xrange(self.sheet.nrows) if row not in self.row_nums_to_delete]
    cols = [col for col in xrange(self.sheet.ncols) if col not in self.col_indexes_to_delete]
    # enumerate() supplies the destination index in the new file.
    for row_write, row in enumerate(rows):
        for col_write, col in enumerate(cols):
            self.wb_sheet.write(row_write, col_write, self.parseandgetcellvalue(row, col))

In fact, you could even move the inner enumerate out of the loop, but I like how clean the code looks as it is.
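One way to read "move enumerate out of the loop" is to pair each kept source column with its destination index once, before iterating over the rows. The sketch below is a self-contained illustration of that reading, not the answerer's own code. FakeSheet and copy_filtered() are hypothetical stand-ins for the output sheet and the method, and get_value plays the role of parseandgetcellvalue().

```python
class FakeSheet(object):
    """Hypothetical stand-in for the output sheet; records every write."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


def copy_filtered(nrows, ncols, rows_to_delete, cols_to_delete, get_value, out_sheet):
    """Copy the kept cells into out_sheet, packed towards the top-left."""
    rows = [r for r in range(nrows) if r not in rows_to_delete]
    # Pair each kept source column with its destination index once,
    # instead of calling enumerate() again for every row.
    cols = list(enumerate(c for c in range(ncols) if c not in cols_to_delete))
    for row_write, row in enumerate(rows):
        for col_write, col in cols:
            out_sheet.write(row_write, col_write, get_value(row, col))


sheet = FakeSheet()
# 3x3 source grid; drop source row 1 and source column 0.
copy_filtered(3, 3, {1}, {0}, lambda r, c: "r%dc%d" % (r, c), sheet)
print(sorted(sheet.cells.items()))
```

The saving from this last step is likely small compared with hoisting the column filter itself, which is presumably why the answer leaves it out.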

Fantastic solution! Thanks for showing me enumerate()! It's just PERFECT used here!
