4
\$\begingroup\$

This is a follow up to the simpler question here.

I have hundreds of files looking something like this:

enter image description here

These files have names such as:

2301 Item 1.xlsx
2014 Item 2.xlsx
5321 Item 3.xlsx
3212 Item 200.xlsx

I also have a template file (Template.xlsx) looking like this:

enter image description here

I want to:

  • Copy the template file, and for each of the other files
    • Give it the filename in cell C6.
    • Copy the values of B2,B3,B3 over to the appropriate spot in the template (B4,B5,B6).
    • Copy the filename to B8

I've written the following script in Python 3:

import os
import shutil
from xlrd import open_workbook
from xlutils.copy import copy
def duplicate_and_transfer(path):
 for filename in os.listdir(path):
 if filename == 'Template.xlsx':
 continue
 else:
 filepath = os.path.join(path, filename)
 excel_file = open_workbook(filepath)
 worksheet = excel_file.sheet_by_index(0)
 new_filename = worksheet.cell_value(5,2)
 new_filepath = os.path.join(path, new_filename + '.xls')
 r_value = [0,0,0]
 for row in range(3):
 r_value[row] = worksheet.cell_value(row+1,1) 
 shutil.copy2(os.path.join(path, 'Template.xlsx'),new_filepath)
 rb = open_workbook(new_filepath)
 wb = copy(rb)
 s = wb.get_sheet(0)
 s.write(7,2,new_filename)
 for r in range(3):
 s.write(r+3,1,r_value[r])
 wb.save(new_filepath)
if __name__ == '__main__':
 duplicate_and_transfer(r'C:\Users\Stewie\Documents\excel_folder')

A few things:

  • I have to make the new files xls instead of xlsx. They get corrupted if not (someone else have had this problem). Can I avoid this?
  • Should I split this up into more functions?

This is fast enough, so my main issue is not with performance.

asked Aug 27, 2017 at 21:07
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$
  • Since you use continue in the if clause, you can get rid of the else part and save a level of indentation.
  • You can use a list-comprehension to build the list of r_value: [worksheet.cell_value(row + 1, 1) for row in range(3)].
  • You can iterate directly over the content of r_value to set it back:

    for row, value in enumerate(r_value, 3):
     s.write(row, 1, value)
    
  • I don't think you need to copy the template first and modify it after. You may be able to open the template and then save it using the new filename.

  • You can use a second parameter with a default value to hold the name of the template.

Proposed improvements:

import os
from xlrd import open_workbook
from xlutils.copy import copy
def duplicate_and_transfer(path, template_name='Template.xlsx'):
 template = os.path.join(path, template_name)
 for filename in os.listdir(path):
 if filename == template_name:
 continue
 filepath = os.path.join(path, filename)
 excel_file = open_workbook(filepath)
 worksheet = excel_file.sheet_by_index(0)
 new_filename = worksheet.cell_value(5,2)
 new_filepath = os.path.join(path, new_filename + '.xls')
 values = [worksheet.cell_value(row + 1, 1) for row in range(3)]
 rb = open_workbook(template)
 workbook = copy(rb)
 worksheet = workbook.get_sheet(0)
 worksheet.write(7, 2, new_filename)
 for row, value in enumerate(values, 3):
 worksheet.write(row, 1, value)
 workbook.save(new_filepath)
if __name__ == '__main__':
 duplicate_and_transfer(r'C:\Users\Stewie\Documents\excel_folder')

You may as well be able to open the template once and save it several times.

Lastly, if you have troubles working with XLSX files, you may want to try openpyxl instead:

import os
from openpyxl import load_workbook
def duplicate_and_transfer(path, template_name='Template.xlsx'):
 template = load_workbook(os.path.join(path, template_name))
 for filename in os.listdir(path):
 if filename == template_name:
 continue
 workbook = load_workbook(os.path.join(path, filename))
 worksheet = workbook.worksheets[0]
 new_filename = worksheet['C6'].value
 template['B8'] = new_filename
 new_filepath = os.path.join(path, new_filename + '.xlsx')
 # Using double unpacking as the ranges will return a 3-tuple of 1-tuple
 for (src,), (dest,) in zip(worksheet['B2:B4'], template['B4:B6']):
 dest.value = src.value
 template.save(new_filepath)
if __name__ == '__main__':
 duplicate_and_transfer(r'C:\Users\Stewie\Documents\excel_folder')
answered Aug 28, 2017 at 8:17
\$\endgroup\$
1
  • \$\begingroup\$ Thanks Mathias! I tried using openpyxl and got the following error (my SO post): stackoverflow.com/q/46150893/2338750. Do you know what might have caused this? \$\endgroup\$ Commented Sep 11, 2017 at 8:15

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.