My script reads two .csv files and generates one dictionary per .csv file.
import csv

# First dictionary
first_dict = {}
with open(first_file, 'r') as f:
    csvReader = csv.reader(f)
    next(csvReader, None)  # skip the header
    for row in csvReader:
        key = row[0]
        first_dict[key] = row[1]

# Second dictionary
second_dict = {}
with open(second_file, 'r') as f:
    csvReader = csv.reader(f)
    next(csvReader, None)  # skip the header
    for row in csvReader:
        # join the first three columns, collapsing double spaces
        key = " ".join(row[:3]).replace("  ", " ")
        second_dict[key] = row[4]
The two reading procedures differ only in how the key is generated. For first_dict the key is just the first column, while for second_dict it is the first three columns joined together. Is there a way to combine both procedures in one function that is controlled only by its arguments?
1 Answer
Higher order functions
You should define a more general function and pass functions to it as arguments (which makes it a higher-order function) to customize its behaviour:
def csv_dict(filename, key_func, value_func):
    final_dict = {}
    with open(filename, 'r') as f:
        csvReader = csv.reader(f)
        next(csvReader, None)  # skip the header
        for row in csvReader:
            key = key_func(row)
            final_dict[key] = value_func(row)
    return final_dict

def dict_one(filename):
    return csv_dict(filename,
                    lambda row: row[0],
                    lambda row: row[1])

def dict_two(filename):
    return csv_dict(filename,
                    lambda row: " ".join(row[:3]).replace("  ", " "),
                    lambda row: row[4])
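To see the idea end to end, here is a minimal self-contained sketch. It assumes Python 3 (hence `newline=''`), uses a dict comprehension instead of the explicit loop, and relies on a hypothetical helper `write_sample` plus made-up sample data that are not part of the original question.

```python
import csv
import os
import tempfile


def csv_dict(filename, key_func, value_func):
    """Map key_func(row) to value_func(row) for every data row of a CSV file."""
    with open(filename, 'r', newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header
        return {key_func(row): value_func(row) for row in reader}


def write_sample(text):
    """Hypothetical demo helper: write text to a temporary .csv file, return its path."""
    with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as f:
        f.write(text)
    return f.name


# Made-up sample data, only for illustration.
path = write_sample("a,b,c,d,e\nx,y,z,ignored,value\n")
single_column = csv_dict(path, lambda row: row[0], lambda row: row[1])
three_columns = csv_dict(path, lambda row: " ".join(row[:3]), lambda row: row[4])
os.remove(path)

print(single_column)  # {'x': 'y'}
print(three_columns)  # {'x y z': 'value'}
```

Only the two lambdas change between the calls, which is the point of passing the key and value logic in as arguments.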
Naming is more important than you think
Names like first_dict and second_dict are ugly and should be avoided even in throw-away code.
Enumerating
You can use enumerate to reduce verbosity:
def csv_dict(filename, key_func, value_func):
    final_dict = {}
    with open(filename, 'r') as f:
        for row_number, row in enumerate(csv.reader(f)):
            if row_number == 0:  # Skip header
                continue
            key = key_func(row)
            final_dict[key] = value_func(row)
    return final_dict
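Both ways of skipping the header give the same rows, so the choice is mostly a matter of taste. The sketch below compares them on made-up in-memory data; the function names and the sample text are assumptions for illustration, and `io.StringIO` stands in for a real file.

```python
import csv
import io

# Made-up sample data, only for illustration.
SAMPLE = "name,score\nalice,1\nbob,2\n"


def rows_skipping_with_next(text):
    """Skip the header by consuming one row from the reader up front."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # consume the header row
    return list(reader)


def rows_skipping_with_enumerate(text):
    """Skip the header by checking the row number inside the loop."""
    rows = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text))):
        if row_number == 0:  # skip the header row
            continue
        rows.append(row)
    return rows


print(rows_skipping_with_next(SAMPLE) == rows_skipping_with_enumerate(SAMPLE))  # True
```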
Only write sensible comments
A comment like # Second dictionary gives no information and is just noise.
Thank you for the effort and the lesson. Three questions: Why don't you use rb in open()? You don't initialize first_dict in the first definition - intended? Why exactly is next(csvReader, None) more verbose than your alternative?
MERose, commented Apr 29, 2015 at 12:35
@MERose 1. rb should be for binary, right? 2. The missing definition was an error. 3. You had to name another variable to call next, so I think it was more verbose.
Caridorc, commented Apr 29, 2015 at 18:14
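On the rb question: as far as I know, the csv module in Python 2 expected files opened in binary mode ('rb'), while in Python 3 the documented recommendation is text mode with newline=''. The following is a minimal sketch assuming Python 3, with a made-up row that contains a line break inside a quoted field; it round-trips unchanged when both the write and the read use newline=''.

```python
import csv
import os
import tempfile

# Made-up rows; the second one has a line break inside a field.
rows = [["id", "note"], ["1", "line one\nline two"]]

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

# newline='' lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open(path, "r", newline="") as f:
    read_back = list(csv.reader(f))

os.remove(path)
print(read_back == rows)  # True
```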