Using Python to rename multiple csv files in Windows

Question 1

I need to rename a large list of csv files that are generated from 3 different servers. The files are produced with a date stamp as the extension. I need to move the date stamp within each file name so that it is retained and the name ends in .csv.

The file name format before editing is:

billing.pps-svr01.csv.2015-09-01

billing.pps-svr02.csv.2015-09-01

billing.pps-svr03.csv.2015-09-01

The file name format after editing should be:

billing.pps-svr01.2015-09-01.csv

billing.pps-svr02.2015-09-01.csv

billing.pps-svr03.2015-09-01.csv

My question is in regard to code efficiency and best practice. The following code seems to work in testing; however, I'm very new to Python and programming in general, and I'd like to know whether there are other, more efficient ways to solve this problem. For example, could this be accomplished in a one-liner or something else? I should also note that I intend to incorporate this into a much larger script that parses these files after they've been renamed. Any feedback and/or suggestions would be great.

import os

os.chdir(r'C:\Users\Extract')
for filename in os.listdir('.'):
    if filename.startswith('billing.pps-svr'):
        name = filename.split('.')
        name = name+[name.pop(0-2)]
        new_name = '.'.join(name)
        os.rename(filename, new_name)

Answer 1 (accepted)

Here are my thoughts:

- You don't do any error handling if any of the file operations fail. That is not recommended.
- Your code flips the last two parts every time. It does not check whether the filename has already been fixed!
- The expression name + [name.pop(0-2)] troubles me. You are concatenating the name list with the popped value, but for this to work the pop has to happen before the list is used in the concatenation. Scary stuff...
- Here are some Pythonic ways to work with lists:
  - name[:-2]: get everything but the last two elements
  - name[-2:]: get only the last two elements
  - name[::-1]: reverse the list
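As a small illustration of the slices listed above, here is a sketch that assumes a file name with a YYYY-MM-DD date stamp. It also shows why the pop-based expression happens to work: the pop mutates the list before the concatenation uses it. The variable names are made up for the example.

```python
# The file name split on dots, as in the question's code.
name = 'billing.pps-svr01.csv.2015-09-01'.split('.')

head = name[:-2]       # everything but the last two elements
tail = name[-2:]       # only the last two elements
swapped = tail[::-1]   # the last two elements in reverse order
new_name = '.'.join(head + swapped)

# Slicing builds new lists, so the original list is left untouched.
untouched = name == ['billing', 'pps-svr01', 'csv', '2015-09-01']

# The pop-based version works only because pop() mutates the list
# before the + operator reads it.
risky = list(name)
risky_result = risky + [risky.pop(-2)]

print(new_name)
print(untouched)
print(risky_result)
```

The slicing version never modifies its input, which makes it easier to reason about than the pop-based one.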

Here is some code demonstrating the flaw in your original rename code, and two options for how to handle it correctly.

for filename in (
        'billing.pps-svr01.2014-09-01.csv',
        'billing.pps-svr02.2014-09-01.csv',
        'billing.pps-svr01.csv.2015-09-01',
        'billing.pps-svr02.csv.2015-09-01',
):
    print('\nTesting {}:'.format(filename))

    # Original approach: always swaps the last two parts
    name = filename.split('.')
    name = name + [name.pop(0-2)]
    new_name = '.'.join(name)
    print(' old rename: {} to {}'.format(filename, new_name))

    # Option 1: slice into named parts, swap only if the name does not end in csv
    filename_parts = filename.split('.')
    first_part = filename_parts[:-2]
    last_part = filename_parts[-2:]
    if last_part[-1] != 'csv':
        new_name = '.'.join(first_part + last_part[::-1])
        print(' new rename: {} to {}'.format(filename, new_name))
    else:
        print(' no rename needed')

    # Option 2: same idea without the intermediate variables
    if filename_parts[-2] == 'csv':
        new_name = '.'.join(filename_parts[:-2] + filename_parts[-2:][::-1])
        print(' alt rename: {} to {}'.format(filename, new_name))
    else:
        print(' no alternate rename needed')

The output from this is as follows:

Testing billing.pps-svr01.2014-09-01.csv:
 old rename: billing.pps-svr01.2014-09-01.csv to billing.pps-svr01.csv.2014-09-01
 no rename needed
 no alternate rename needed
Testing billing.pps-svr02.2014-09-01.csv:
 old rename: billing.pps-svr02.2014-09-01.csv to billing.pps-svr02.csv.2014-09-01
 no rename needed
 no alternate rename needed
Testing billing.pps-svr01.csv.2015-09-01:
 old rename: billing.pps-svr01.csv.2015-09-01 to billing.pps-svr01.2015-09-01.csv
 new rename: billing.pps-svr01.csv.2015-09-01 to billing.pps-svr01.2015-09-01.csv
 alt rename: billing.pps-svr01.csv.2015-09-01 to billing.pps-svr01.2015-09-01.csv
Testing billing.pps-svr02.csv.2015-09-01:
 old rename: billing.pps-svr02.csv.2015-09-01 to billing.pps-svr02.2015-09-01.csv
 new rename: billing.pps-svr02.csv.2015-09-01 to billing.pps-svr02.2015-09-01.csv
 alt rename: billing.pps-svr02.csv.2015-09-01 to billing.pps-svr02.2015-09-01.csv

Notice how the first two files would have been wrongly renamed by your original code.
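One way to make that check reusable is to put the name computation in a small function that has no side effects. This is only a sketch: swapped_name is a hypothetical helper, and it assumes that a file still needs renaming exactly when its second-to-last dot-separated part is csv.

```python
def swapped_name(filename):
    """Return the name with 'csv' and the date stamp swapped, or None.

    Hypothetical helper: only names of the form '<base>.csv.<stamp>' are
    changed, so names that already end in '.csv' are left alone.
    """
    parts = filename.rsplit('.', 2)
    if len(parts) != 3 or parts[1] != 'csv':
        return None
    base, ext, date = parts
    return '.'.join((base, date, ext))


for candidate in ('billing.pps-svr01.csv.2015-09-01',
                  'billing.pps-svr01.2015-09-01.csv'):
    print(candidate, '->', swapped_name(candidate))
```

Because the function only computes a string, it can be tested without touching the file system, and running it twice on the same name cannot flip the name back.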

Code refactor (added)

To accommodate your question about building this into a larger script, and to give an example of error handling, I've refactored your code into the following (using the tip from Janne Karila on using rsplit):

import os


def rename_csv_files(directory, required_start):
    """Rename files in <directory> starting with <required_start> to csv files

    Go to <directory> and read through all files, and for those
    starting with <required_start> and ending with something like
    *.csv.YYYY-MM-DD, rename these to *.YYYY-MM-DD.csv.
    """
    try:
        os.chdir(directory)
    except OSError as exception:
        print('OSError when changing directory - {}'.format(exception))
        return

    try:
        for filename in os.listdir('.'):
            if filename.startswith(required_start):
                # Split off only the last two dot-separated parts
                base, ext, date = filename.rsplit('.', 2)
                new_filename = '.'.join((base, date, ext))

                if ext == 'csv' and not os.path.exists(new_filename):
                    try:
                        os.rename(filename, new_filename)
                        print('Renamed: {}'.format(new_filename))
                    except OSError as exception:
                        print('Failed renaming file - dir: {}, original file: {}, new file: {} - {}'.format(
                            directory, filename, new_filename, exception))
                elif ext != 'csv':
                    print('Skipped: {}'.format(filename))
                else:
                    print('Skipped: {} - Renamed version already exists'.format(filename))
    except OSError as exception:
        print('Failed traversing directory - dir: {} - {}'.format(directory, exception))


def main():
    rename_csv_files('./test_data', 'billing.pps-svr')


if __name__ == '__main__':
    main()

Running this script against the following test-data:

$ ls -1d test_data/* | sort -n
test_data/billing.pps-svr01.2014-09-01.csv
test_data/billing.pps-svr01.csv.2015-09-01
test_data/billing.pps-svr02.2014-09-01.csv
test_data/billing.pps-svr02.2015-09-01.csv
test_data/billing.pps-svr02.csv.2015-09-01
test_data/original_files.tar

Gives the following output:

Skipped: billing.pps-svr01.2014-09-01.csv
Renamed: billing.pps-svr01.2015-09-01.csv
Skipped: billing.pps-svr02.2014-09-01.csv
Skipped: billing.pps-svr02.2015-09-01.csv
Skipped: billing.pps-svr02.csv.2015-09-01 - Renamed version already exists

This code now handles at least the following error cases:

- The directory not existing, or read or execute permission faults
- Errors when traversing the directory or renaming files
- The logical error of renaming a file onto an already existing file
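For comparison, here is a hedged Python 3 sketch of the same error cases. It works on full paths instead of changing the working directory, and it also skips names with too few dot-separated parts, which would otherwise make the three-way unpacking fail. The function name, the return value and the demo file names are assumptions made for this example.

```python
import os
import tempfile


def rename_in_directory(directory, required_start):
    """Rename '<start>*.csv.<date>' files to '<start>*.<date>.csv'.

    Returns (renamed, skipped) lists of file names. Full paths are used,
    so the current working directory is never changed.
    """
    renamed, skipped = [], []
    try:
        filenames = sorted(os.listdir(directory))
    except OSError as exception:
        print('Cannot list {}: {}'.format(directory, exception))
        return renamed, skipped

    for filename in filenames:
        if not filename.startswith(required_start):
            continue
        parts = filename.rsplit('.', 2)
        if len(parts) != 3 or parts[1] != 'csv':
            skipped.append(filename)   # too few parts, or already renamed
            continue
        new_filename = '.'.join((parts[0], parts[2], parts[1]))
        target = os.path.join(directory, new_filename)
        if os.path.exists(target):
            skipped.append(filename)   # never overwrite an existing file
            continue
        try:
            os.rename(os.path.join(directory, filename), target)
            renamed.append(new_filename)
        except OSError as exception:
            print('Failed renaming {}: {}'.format(filename, exception))
            skipped.append(filename)
    return renamed, skipped


# Try it on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for demo_name in ('billing.pps-svr01.csv.2015-09-01',
                      'billing.pps-svr02.2015-09-01.csv',
                      'billing.pps-svr02.csv.2015-09-01'):
        open(os.path.join(demo_dir, demo_name), 'w').close()
    demo_renamed, demo_skipped = rename_in_directory(demo_dir, 'billing.pps-svr')
    print(demo_renamed)
    print(demo_skipped)
```

Returning the two lists rather than only printing lets a calling script decide what to do with files that were skipped.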

Comment 1

This is great feedback, thank you! One minor note: I hit a NameError using your refactored code because main() was not defined. I had to move the if statement after the function definitions to clear it. Otherwise it works great!

Comment 2

@erns, that is correct. The if needs to come after the function definitions. I have corrected the code.

Answer 2

The string manipulation becomes more readable using sequence unpacking:

base, ext, date = filename.rsplit('.', 2)
new_name = '.'.join((base, date, ext))
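To see what the maxsplit argument buys here, the following sketch compares split with rsplit on one of the question's file names (written with a YYYY-MM-DD stamp). It also shows the one caveat of unpacking: a name with fewer than two dots does not yield three parts. The name notes.csv is made up for the example.

```python
filename = 'billing.pps-svr01.csv.2015-09-01'

# split() cuts at every dot; rsplit('.', 2) cuts only at the last two,
# so dots inside the base name are preserved.
all_parts = filename.split('.')
base, ext, date = filename.rsplit('.', 2)
new_name = '.'.join((base, date, ext))
print(all_parts)
print(base, ext, date)
print(new_name)

# Unpacking needs exactly three parts, so a name with fewer than two dots
# raises ValueError; a caller may want to catch that or check the length.
try:
    short_base, short_ext, short_date = 'notes.csv'.rsplit('.', 2)
    too_short = False
except ValueError:
    too_short = True
print(too_short)
```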

Answer 3

"I should also note that I intend to incorporate this into a much larger script that parses these files after they've been renamed."

If so, you want modularity and customization possibilities, both of which can be achieved by wrapping your code in a function:

import os


def sensible_name_for_this_renaming_process(directory, required_start):
    os.chdir(directory)
    for filename in os.listdir('.'):
        if filename.startswith(required_start):
            name = filename.split('.')
            name = name+[name.pop(0-2)]
            new_name = '.'.join(name)
            os.rename(filename, new_name)
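If the larger script needs to know which files to parse next, the function could return the new paths. The sketch below is one possible shape, not the only one: rename_billing_files and count_rows are hypothetical names, count_rows merely stands in for the real parsing step, the csv contents are invented, and error handling is left out to keep it short.

```python
import csv
import os
import tempfile


def rename_billing_files(directory, required_start):
    """Rename matching files and return the full paths of the renamed files."""
    renamed_paths = []
    for filename in sorted(os.listdir(directory)):
        parts = filename.rsplit('.', 2)
        if (filename.startswith(required_start)
                and len(parts) == 3 and parts[1] == 'csv'):
            new_name = '.'.join((parts[0], parts[2], 'csv'))
            new_path = os.path.join(directory, new_name)
            os.rename(os.path.join(directory, filename), new_path)
            renamed_paths.append(new_path)
    return renamed_paths


def count_rows(path):
    """Stand-in for the later parsing step: count the rows of one csv file."""
    with open(path, newline='') as handle:
        return sum(1 for _ in csv.reader(handle))


# Demo on a throwaway directory with one invented two-row file.
with tempfile.TemporaryDirectory() as demo_dir:
    source = os.path.join(demo_dir, 'billing.pps-svr01.csv.2015-09-01')
    with open(source, 'w', newline='') as handle:
        handle.write('account,amount\r\n42,9.99\r\n')
    row_counts = {os.path.basename(path): count_rows(path)
                  for path in rename_billing_files(demo_dir, 'billing.pps-svr')}
    print(row_counts)
```

Passing the directory and prefix as arguments and returning the paths keeps the renaming step independent of whatever the parser does afterwards.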

Answer 4

It's slightly neater to avoid nesting by inverting the if condition at the start:

for filename in os.listdir('.'):
    if not filename.startswith('billing.pps-svr'):
        continue
    name = filename.split('.')
    name = name+[name.pop(0-2)]
    new_name = '.'.join(name)
    os.rename(filename, new_name)
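Another way to avoid the nesting is to filter the names before the loop. The sketch below uses fnmatch.filter from the standard library. The pattern is an assumption about what a not-yet-renamed file looks like, and the list of names stands in for the result of os.listdir.

```python
import fnmatch

# Stand-in for os.listdir('.'); the names are examples only.
names = [
    'billing.pps-svr01.csv.2015-09-01',
    'billing.pps-svr01.2015-09-01.csv',
    'original_files.tar',
]

# Only names that still carry something after '.csv.' match the pattern,
# so already-renamed files and unrelated files drop out before the loop.
pending = fnmatch.filter(names, 'billing.pps-svr*.csv.*')

for filename in pending:
    base, ext, date = filename.rsplit('.', 2)
    print(filename, '->', '.'.join((base, date, ext)))
```

With the filtering done up front, the loop body contains only the renaming logic.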

Accepted answer (the answer above that begins "Here are my thoughts:"), posted by holroy on 2015-10-03 00:47:17Z
