\$\begingroup\$

My script reads two .csv files and generates one dictionary per .csv file.

import csv

# First dictionary
first_dict = {}
with open(first_file, 'r') as f:
    csvReader = csv.reader(f)
    next(csvReader, None)  # skip the header
    for row in csvReader:
        key = row[0]
        first_dict[key] = row[1]

# Second dictionary
second_dict = {}
with open(second_file, 'r') as f:
    csvReader = csv.reader(f)
    next(csvReader, None)  # skip the header
    for row in csvReader:
        # join the first three columns, collapsing double spaces left by empty fields
        key = " ".join(row[:3]).replace("  ", " ")
        second_dict[key] = row[4]

The two reading procedures differ only in how the key is generated. For first_dict the key is just the first column, while for second_dict it is the first three columns joined together. Is there a way to combine both procedures into one function and control the difference through appropriate arguments?

asked Apr 23, 2015 at 12:56
\$\endgroup\$

1 Answer

\$\begingroup\$

Higher order functions

You should define a more general function and then use higher-order functions to customize its behaviour:

import csv


def csv_dict(filename, key_func, value_func):
    final_dict = {}
    with open(filename, 'r') as f:
        csvReader = csv.reader(f)
        next(csvReader, None)  # skip the header
        for row in csvReader:
            key = key_func(row)
            final_dict[key] = value_func(row)
    return final_dict


def dict_one(filename):
    # key: first column, value: second column
    return csv_dict(filename,
                    lambda row: row[0],
                    lambda row: row[1])


def dict_two(filename):
    # key: first three columns joined, value: fifth column
    return csv_dict(filename,
                    lambda row: " ".join(row[:3]).replace("  ", " "),
                    lambda row: row[4])
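
To see the generalized function in action, here is a minimal, self-contained sketch. The sample rows and the write_temp_csv helper are made up for illustration and are not from the question. The files are opened with newline='' here, which is what the Python 3 csv documentation recommends.

```python
import csv
import os
import tempfile


def csv_dict(filename, key_func, value_func):
    """Build a dict from a CSV file, skipping the header row."""
    result = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header
        for row in reader:
            result[key_func(row)] = value_func(row)
    return result


def write_temp_csv(rows):
    """Hypothetical helper: write rows to a temporary CSV file, return its path."""
    handle, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(handle, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


# Made-up sample data standing in for the two files from the question.
first_path = write_temp_csv([["id", "name"], ["a1", "Alice"], ["b2", "Bob"]])
second_path = write_temp_csv([
    ["first", "middle", "last", "age", "city"],
    ["Ada", "King", "Lovelace", "36", "London"],
    ["Alan", "", "Turing", "41", "Wilmslow"],  # empty middle column
])

name_by_id = csv_dict(first_path,
                      lambda row: row[0],
                      lambda row: row[1])
city_by_full_name = csv_dict(second_path,
                             lambda row: " ".join(row[:3]).replace("  ", " "),
                             lambda row: row[4])

os.remove(first_path)
os.remove(second_path)

print(name_by_id)
print(city_by_full_name)
```

The row with the empty middle column shows why the double-space replacement is there: without it the key would contain two consecutive spaces.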

Naming is more important than you think

Names like first_dict and second_dict say nothing about what the dictionaries contain and should be avoided even in throw-away code.

Enumerating

You can use enumerate to reduce verbosity:

def csv_dict(filename, key_func, value_func):
    final_dict = {}
    with open(filename, 'r') as f:
        for row_number, row in enumerate(csv.reader(f)):
            if row_number == 0:  # skip the header
                continue
            key = key_func(row)
            final_dict[key] = value_func(row)
    return final_dict
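
Whether enumerate is really shorter than next(csvReader, None) is a matter of taste. As a rough comparison, the sketch below skips the header in three ways: with next, with enumerate, and with itertools.islice (a third option not mentioned above). The sample text and function names are made up for illustration, and the data is read from an in-memory string instead of a file.

```python
import csv
import io
from itertools import islice

# Made-up sample data standing in for a real CSV file.
sample = "id,name\na1,Alice\nb2,Bob\n"


def rows_via_next(text):
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # consume the header; None avoids StopIteration on empty input
    return list(reader)


def rows_via_enumerate(text):
    reader = csv.reader(io.StringIO(text))
    return [row for row_number, row in enumerate(reader) if row_number != 0]


def rows_via_islice(text):
    reader = csv.reader(io.StringIO(text))
    return list(islice(reader, 1, None))  # start at the second row


print(rows_via_next(sample))
print(rows_via_enumerate(sample))
print(rows_via_islice(sample))
```

All three return the same data rows; the enumerate version checks the row number on every iteration, while next and islice skip the header once up front.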

Only write sensible comments

# Second dictionary gives no information and is just noise.

MERose
answered Apr 23, 2015 at 17:37
\$\endgroup\$
  • \$\begingroup\$ Thank you for the effort and the lesson. Three questions: Why don't you use rb in open()? You don't initialize first_dict in the first definition; is that intended? Why exactly is next(csvReader, None) more verbose than your alternative? \$\endgroup\$ Commented Apr 29, 2015 at 12:35
  • \$\begingroup\$ @MERose 1. rb should be for binary, right? 2. The missing definition was an error. 3. You had to name another variable to call next, so I think it was more verbose. \$\endgroup\$ Commented Apr 29, 2015 at 18:14
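
On the rb question raised in the comments: as far as I know, the answer depends on the Python version. The Python 2 csv documentation asked for files to be opened in binary mode, while the Python 3 documentation says to open them in text mode with newline=''. The following is a minimal Python 3 sketch with made-up data, showing that a quoted field containing a line break survives a round trip when newline='' is used.

```python
import csv
import os
import tempfile

# Made-up rows; the second one has a line break inside a field.
rows = [["key", "note"], ["a1", "line one\nline two"]]

handle, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(handle, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Text mode with newline='' lets the csv module handle line endings itself.
with open(path, newline="") as f:
    read_back = list(csv.reader(f))

os.remove(path)
print(read_back == rows)
```

In Python 3, passing a file opened with 'rb' to csv.reader fails because the reader expects strings, not bytes.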
