I've prepared a script to manipulate csv files in Windows that simply adds quotes as requested.
The script creates a backup of the original files and overwrite the wrong originals in its same folder.
It works as expected but I'd be really happy to learn from gurus how to improve my code, my thinking and my Python.
#!/usr/bin/python
# coding: utf-8
#
# ______________________________________
# / CSV Fix \
# \ LtPitt - 25/02/2015 /
# --------------------------------------
# \ ^__^
# \ (oo)\_______
# (__)\ )\/\
# ||----w |
# || ||
#
# This script will fix the manually-edited csv files before import into application
import os
import csv
import shutil
import sys
import time
script_path = os.path.dirname(sys.argv[0])
backup_path = script_path + "\\backup\\"
tmp_path = script_path + "\\tmp\\"
os.chdir(script_path)
paths = [backup_path, tmp_path]
for path in paths:
if not os.path.exists(path):
os.makedirs(path)
def fixCsv(csv_file):
# This function fixes the broken csv adding ; and " where needed and saves the fixed file into the script path
rows = []
with open(tmp_path + csv_file, 'rb') as csvfile:
spamreader = csv.reader(csvfile, delimiter=';', quotechar='"')
for row in spamreader:
if not ''.join(row).strip():
pass
else:
rows.append(row)
with open(script_path + "\\" + csv_file, 'wb') as csvfile:
spamwriter = csv.writer(csvfile, delimiter=';', quotechar='"', quoting=csv.QUOTE_ALL)
for row in rows:
spamwriter.writerow(row)
def prepareFiles():
# This function saves original files in case of errors / reviews and moves originals into tmp folder for elaboration
current_datetime = str(time.strftime('%Y%m%d%H%M%S') )
os.makedirs(backup_path + current_datetime)
for file in os.listdir(script_path):
if file.endswith(".csv"):
shutil.copyfile(script_path + "\\" + file, backup_path + current_datetime + "\\" + file)
shutil.move(script_path + "\\" + file, tmp_path + "\\" + file)
# Let's rock
prepareFiles()
for file in os.listdir(tmp_path):
if file.endswith(".csv"):
fixCsv(file)
shutil.rmtree(tmp_path)
1 Answer 1
Long story short: read the style guide.
- Imports should be in alphabetical order;
- Function names should be
lowercase_with_underscores
, notmixedCase
; - Functions should start with
"""docstrings"""
, not# comments
; and - Indents should be four spaces, not two.
time.strftime(...)
always returns a string (that's what it's for!) so wrapping it in a str
call is redundant.
There should be much less code at the top level of your script - move it into an entry point function (e.g. call it main
) then guard its call:
if __name__ == '__main__':
main()
This makes it easier to import
the functionality elsewhere later.
if not ''.join(row).strip():
pass
else:
rows.append(row)
could be simplified to:
if ''.join(row):
rows.append(row)
csv.reader
already strips newlines, and there's no point including the pass
block.
You can (and should!) use os.path.join
instead of e.g. script_path + "\\" + csv_file
.