I have a function that takes a column title, and a response.body from a urllib GET (I already know the body contains text/csv), and iterates through the data to build a list of values to be returned. My question to the gurus here: have I written this in the cleanest, most efficient way possible? Can you suggest any improvements?
def _get_values_from_csv(self, column_title, response_body):
"""retrieves specified values found in the csv body returned from GET
@requires: csv
@param column_title: the name of the column for which we'll build a list of return values.
@param response_body: the raw GET output, which should contain the csv data
@return: list of elements from the column specified.
@note: the return values have duplicates removed. This could pose a problem, if you are looking for duplicates.
I'm not sure how to deal with that issue."""
dicts = [row for row in csv.DictReader(response_body.split("\r\n"))]
results = {}
for dic in dicts:
for k, v in dic.iteritems():
try:
results[k] = results[k] + [v] #adds elements as list+list
except: #first time through the iteritems loop.
results[k] = [v]
#one potential problem with this technique: handling duplicate rows
#not sure what to do about it.
return_list = list(set(results[column_title]))
return_list.sort()
return return_list
2 Answers 2
Here's a shorter function that does the same thing. It doesn't create lists for the columns you're not interested in.
def _get_values_from_csv(self, column_title, response_body):
dicts = csv.DictReader(response_body.split("\r\n"))
return sorted(set(d[column_title] for d in dicts))
-
\$\begingroup\$ Thanks! This is brilliant! I'm still trying to wrap my head around how to use dictionaries in a comprehension context like this. So this is exactly what I was looking for. :) \$\endgroup\$Greg Gauthier– Greg Gauthier2011年02月16日 19:39:06 +00:00Commented Feb 16, 2011 at 19:39
-
\$\begingroup\$ sorted() will return a list, so the list() bit isn't needed. \$\endgroup\$Lennart Regebro– Lennart Regebro2011年02月16日 20:54:42 +00:00Commented Feb 16, 2011 at 20:54
-
\$\begingroup\$ Good point, Lennart -- Edited out. \$\endgroup\$Martin Stone– Martin Stone2011年02月17日 08:12:37 +00:00Commented Feb 17, 2011 at 8:12
My suggestions:
def _get_values_from_csv(self, column_title, response_body):
# collect results in a set to eliminate duplicates
results = set()
# iterate the DictReader directly
for dic in csv.DictReader(response_body.split("\r\n")):
# only add the single column we are interested in
results.add(dic[column_title])
# turn the set into a list and sort it
return_list = list(results)
return_list.sort()
return return_list
except
, you will catch ALL exceptions rather than the one you want. \$\endgroup\$