I have been teaching myself for about seven months. This past week someone in accounting at work said it would be nice if she could get reports split up, so I made that my first program, since I could never come up with anything useful to build. I finished it last night and it does what I expected so far, but I'm sure some things could be done better.
I wanted to finish it without help first, but now I'd like someone to take a look and tell me what I could have done better, or whether there is a better way to get the same results.
Here is what it does: it opens a CSV file (the file I've been practicing with has 27K lines of data) and loops through it, creating a separate file for each billing number, using the billing number as the filename and writing the header as the first line. Each iteration overwrites the file if it already exists; this is an area I'm sure I could have done better. After that, it loops through the data again, appending each line of data to the correct file.

import os
import csv

#currentdirpath = os.getcwd()
#filename = 'argos.csv'
#file_path = os.path.join(os.getcwd(), filename) #filepath to open

def get_file_path(filename):
    ''' - This gets the full path...file and terminal need to be in
    same directory - '''
    file_path = os.path.join(os.getcwd(), filename)
    return file_path

pathOfFile = get_file_path('argos.csv')

''' - Below opens and reads the csv file,
then going to try to loop and write the rows out in files sorted
by Billing Number - '''
with open(pathOfFile, 'rU') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)
    for row in reader:
        new_file_name = row[5][:5] + '.csv'
        ''' Create file named by billing number, and print the header to
        each file '''
        fb = open(new_file_name, 'w+')
        fb.write(str(header) + '\n')
        #fb.close()

with open(pathOfFile, 'rU') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        new_file_name = row[5][:5] + '.csv'
        ab = open(new_file_name, 'a')
        ab.write(str(row) + '\n')

I've left in a few things I had at one point, but commented out. I thought they might give you a better idea of what I was thinking. Any advice is appreciated!
1 Answer
- Don't use string literals as comments. PEP 8 explains how to use comments.
- Docstrings should use `"""` rather than `'''`, as described in PEP 257. Also, your docstring doesn't need the "-", and should probably be rephrased slightly to fit on one line.
- Close files. The commented-out `#fb.close()` shows you went out of your way to make bad code. Without `fb.close` or wrapping `open` in a `with`, the file is not guaranteed to be closed. I personally prefer `with` to `fb.close`, as described here.
- Personally, rather than overwriting your files n times, I'd use `collections.defaultdict` to group all your rows by the file they belong in.
- You may want to change `get_file_path` to be based off `__file__`. Or leave your path relative, since a relative path already resolves against the current working directory, which is what `os.getcwd()` gives you.
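
The point about closing files can be illustrated with a minimal sketch. The temporary directory and the file name `example.csv` are assumptions made up for this illustration; they are not part of the original program.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'example.csv')

# The with block closes the file as soon as the block exits,
# even if an exception is raised inside it.
with open(path, 'w') as f:
    f.write('header\n')

print(f.closed)  # True
```

With a bare `open` and no `close`, the data may sit in a buffer until the file object happens to be garbage collected, so `with` is the safer habit.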

import os
import csv
from collections import defaultdict

FILE_DIR = os.path.dirname(os.path.abspath(__file__))


def get_file_path(filename):
    return os.path.join(FILE_DIR, filename)


file_path = get_file_path('argos.csv')

with open(file_path, 'rU') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)
    data = defaultdict(lambda: [header])
    _ = data[header[5][:5]]
    for row in reader:
        data[row[5][:5]].append(row)

for file_name, rows in data.items():
    with open(file_name, 'w+') as f:
        for row in rows:
            f.write(str(row) + '\n')

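
One thing neither version addresses: `str(row)` writes the Python list representation (for example `['a', 'b']`) rather than a CSV line, and the `'rU'` mode was removed in Python 3.11. The following is a rough sketch of the same grouping idea using `csv.writer`, assuming Python 3. The function name `split_by_billing_number` and its parameters are hypothetical, chosen for this illustration, and it keeps the question's convention of taking the first five characters of column 5 as the billing number.

```python
import csv
import os
from collections import defaultdict


def split_by_billing_number(source_path, out_dir, column=5, prefix_len=5):
    """Write one CSV per billing number and return the paths written."""
    groups = defaultdict(list)

    # newline='' is what the csv module documentation recommends
    # for files handed to csv.reader and csv.writer.
    with open(source_path, newline='') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)
        for row in reader:
            groups[row[column][:prefix_len]].append(row)

    written = []
    for billing_number, rows in groups.items():
        out_path = os.path.join(out_dir, billing_number + '.csv')
        # Each output file is opened once and closed by the with block.
        with open(out_path, 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)
        written.append(out_path)
    return written
```

A call such as `split_by_billing_number('argos.csv', '.')` would read the input once and write each output file once, header first. Unlike the code above, this sketch skips the header row when grouping, so no extra file is created for it.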
`''' Create file named by billing number, and print the header to each file '''` looks illegal.
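
For what it's worth, a bare string literal used as a statement is legal Python as long as it is indented like the statements around it. It just has no effect, which is one reason a `#` comment is the better tool. A minimal sketch, where the function `count_rows` is hypothetical and made up for this illustration:

```python
def count_rows(rows):
    total = 0
    for row in rows:
        '''This string is a legal statement, but it does nothing.'''
        # A real comment starts with a hash.
        total += 1
    return total


print(count_rows([['a'], ['b']]))  # 2
```

Because the string is not the first statement in the function body, it is not treated as a docstring either, so `count_rows.__doc__` is `None`.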