Reading two csv files as dictionaries with differing key definitions

Question 1

My script reads two .csv files and generates one dictionary per .csv file.

import csv

# First dictionary
first_dict = {}
with open(first_file, 'r') as f:
    csvReader = csv.reader(f)
    next(csvReader, None)  # skip the header
    for row in csvReader:
        key = row[0]
        first_dict[key] = row[1]

# Second dictionary
second_dict = {}
with open(second_file, 'r') as f:
    csvReader = csv.reader(f)
    next(csvReader, None)  # skip the header
    for row in csvReader:
        key = " ".join(row[:3]).replace("  ", " ")
        second_dict[key] = row[4]

The two reading procedures differ only in how the key is generated. For first_dict the key is just the first column, while for second_dict it is the first three columns joined together. Is there a way to combine both procedures into one function and select the behaviour through its arguments?

Question 2

Higher order functions

You should define a more general, higher-order function and pass the key and value logic in as arguments to customize its behaviour:

def csv_dict(filename, key_func, value_func):
    final_dict = {}
    with open(filename, 'r') as f:
        csvReader = csv.reader(f)
        next(csvReader, None)  # skip the header
        for row in csvReader:
            key = key_func(row)
            final_dict[key] = value_func(row)
    return final_dict

def dict_one(filename):
    return csv_dict(filename,
                    lambda row: row[0],
                    lambda row: row[1])

def dict_two(filename):
    return csv_dict(filename,
                    lambda row: " ".join(row[:3]).replace("  ", " "),
                    lambda row: row[4])
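
As a rough usage sketch, the generic reader can be exercised against a small temporary file. The sample rows, the column layout, and the names `rows`, `path` and `key_from_first_three` are made up for illustration and do not come from the question. The sketch also assumes Python 3, hence `newline=''`.

```python
import csv
import os
import tempfile


def csv_dict(filename, key_func, value_func):
    """Map key_func(row) to value_func(row) for every row after the header."""
    result = {}
    with open(filename, 'r', newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header
        for row in reader:
            result[key_func(row)] = value_func(row)
    return result


def key_from_first_three(row):
    # Join the first three columns, then collapse the double space
    # that an empty middle field leaves behind.
    return " ".join(row[:3]).replace("  ", " ")


# Hypothetical sample data: a header plus two rows, one with an empty field.
rows = [
    ["a", "b", "c", "unused", "value"],
    ["x", "y", "z", "-", "1"],
    ["p", "", "r", "-", "2"],
]

with tempfile.NamedTemporaryFile('w', newline='', suffix='.csv',
                                 delete=False) as tmp:
    csv.writer(tmp).writerows(rows)
    path = tmp.name

try:
    by_first_column = csv_dict(path, lambda row: row[0], lambda row: row[1])
    by_three_columns = csv_dict(path, key_from_first_three, lambda row: row[4])
finally:
    os.remove(path)  # clean up the temporary file
```

Passing a named function instead of a lambda, as with `key_from_first_three` here, is a matter of taste. It mainly helps once the key logic needs a comment or a test of its own.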

Naming is more important than you think

Names like first_dict and second_dict are ugly: they say nothing about what the dictionaries contain. Avoid them even in throw-away code.

Enumerating

You can use enumerate to reduce verbosity:

def csv_dict(filename, key_func, value_func):
    final_dict = {}
    with open(filename, 'r') as f:
        for row_number, row in enumerate(csv.reader(f)):
            if row_number == 0:  # Skip header
                continue
            key = key_func(row)
            final_dict[key] = value_func(row)
    return final_dict
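
To compare the two ways of skipping the header side by side, here is a small sketch that runs both over in-memory data. The sample text and the helper names `rows_skipping_with_next` and `rows_skipping_with_enumerate` are hypothetical and only serve the comparison.

```python
import csv
import io

SAMPLE = "name,score\nann,1\nbob,2\n"  # hypothetical data with one header line


def rows_skipping_with_next(text):
    reader = csv.reader(io.StringIO(text))
    # Consume the header; the None default avoids StopIteration on empty input.
    next(reader, None)
    return list(reader)


def rows_skipping_with_enumerate(text):
    reader = csv.reader(io.StringIO(text))
    # Keep every row except the one at index 0 (the header).
    return [row for row_number, row in enumerate(reader) if row_number != 0]


rows_next = rows_skipping_with_next(SAMPLE)
rows_enum = rows_skipping_with_enumerate(SAMPLE)
```

Both helpers return the same data rows, so the choice comes down to readability: `next` needs a named reader object, while `enumerate` adds a check on every row.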

Only write sensible comments

A comment such as "# Second dictionary" gives no information and is just noise.

Question 3

Thank you for the effort and the lesson. Three questions: 1. Why don't you use rb in open()? 2. You don't initialize first_dict in the first definition; is that intended? 3. Why exactly is next(csvReader, None) more verbose than your alternative?

Question 4

@MERose 1. rb should be for binary, right? 2. The missing definition was an error. 3. You had to name another variable to call next, so I think it was more verbose.
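
On the first point, the right mode depends on the Python version, which the thread does not state. As far as I know, the Python 2 csv documentation asks for files opened with the 'b' flag (so 'rb'), while the Python 3 documentation asks for text mode with newline=''. A minimal Python 3 sketch with made-up file contents:

```python
import csv
import os
import tempfile

# Hypothetical data; the second row has a line break inside a field.
rows = [["key", "note"], ["a", "line one\nline two"]]

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    # newline='' lets the csv module handle line endings itself,
    # both when writing and when reading.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    with open(path, "r", newline="") as f:
        read_back = list(csv.reader(f))
finally:
    os.remove(path)  # clean up the temporary file
```

Here the quoted field with the embedded line break comes back unchanged.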

Caridorc · Accepted Answer · 2015-04-23 17:37:09Z

