Getting lists of values from a CSV

Question 1

I have a function that takes a column title and a response.body from a urllib GET (I already know the body contains text/csv), then iterates through the data to build a list of values to be returned. My question to the gurus here: have I written this in the cleanest, most efficient way possible? Can you suggest any improvements?

def _get_values_from_csv(self, column_title, response_body):
    """retrieves specified values found in the csv body returned from GET
    @requires: csv
    @param column_title: the name of the column for which we'll build a list of return values.
    @param response_body: the raw GET output, which should contain the csv data
    @return: list of elements from the column specified.
    @note: the return values have duplicates removed. This could pose a problem, if you are looking for duplicates.
    I'm not sure how to deal with that issue."""
    dicts = [row for row in csv.DictReader(response_body.split("\r\n"))]
    results = {}
    for dic in dicts:
        for k, v in dic.iteritems():
            try:
                results[k] = results[k] + [v] #adds elements as list+list
            except: #first time through the iteritems loop.
                results[k] = [v]
    #one potential problem with this technique: handling duplicate rows
    #not sure what to do about it.
    return_list = list(set(results[column_title]))
    return_list.sort()
    return return_list

Question 2

Tip: Don't use a blanket except; it will catch ALL exceptions rather than only the one you want (here, the KeyError raised the first time a column title is seen).
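
To illustrate the tip, here is a minimal sketch of the grouping loop with the handler narrowed to KeyError, the exception a missing dictionary key raises. It is written for Python 3, so items() stands in for iteritems(). The helper name group_columns and the sample data are made up for illustration and are not part of the original post.

```python
import csv


def group_columns(rows):
    """Collect every column's values into a list, keyed by column title."""
    results = {}
    for row in rows:
        for key, value in row.items():
            try:
                results[key].append(value)
            except KeyError:  # first time this column title is seen
                results[key] = [value]
    return results


# Hypothetical sample body, using the same "\r\n" split as the question.
sample = "name,color\r\napple,red\r\nlime,green\r\n"
columns = group_columns(csv.DictReader(sample.split("\r\n")))
print(columns["color"])
```

Any other failure, such as a typo inside the try block, now surfaces as an error instead of silently resetting the list. Writing results.setdefault(key, []).append(value) would avoid the try/except altogether.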

Question 3

Here's a shorter function that does the same thing. It doesn't create lists for the columns you're not interested in.

def _get_values_from_csv(self, column_title, response_body):
    dicts = csv.DictReader(response_body.split("\r\n"))
    return sorted(set(d[column_title] for d in dicts))
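
A quick way to try this outside its class, as a sketch only: the version below drops self and uses str.splitlines() instead of split("\r\n") so that plain "\n" line endings also work (that swap is my assumption, not part of the answer). The function name get_values_from_csv and the sample body are hypothetical.

```python
import csv


def get_values_from_csv(column_title, response_body):
    """Return the sorted, de-duplicated values of one CSV column."""
    rows = csv.DictReader(response_body.splitlines())
    return sorted(set(row[column_title] for row in rows))


# Made-up CSV body standing in for the GET response.
body = "id,status\r\n1,open\r\n2,closed\r\n3,open\r\n"
print(get_values_from_csv("status", body))  # ['closed', 'open']
```

The values come back as strings, so the sort is lexicographic ("10" sorts before "9"), and a column title that is not in the header row raises KeyError.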

Question 4

Thanks! This is brilliant! I'm still trying to wrap my head around how to use dictionaries in a comprehension context like this. So this is exactly what I was looking for. :)

Question 5

sorted() will return a list, so the list() bit isn't needed.
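
A small sketch of the point, with made-up values: sorted() accepts any iterable, including a set, and always returns a new list, whereas list.sort() sorts in place and returns None.

```python
values = {"pear", "apple", "fig"}

# sorted() builds a new list from the set; no list() wrapper is needed.
ordered = sorted(values)
print(type(ordered).__name__, ordered)  # list ['apple', 'fig', 'pear']

# list.sort() works in place and returns None, so it cannot be chained.
in_place_result = list(values).sort()
print(in_place_result)  # None
```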

Question 6

Good point, Lennart -- Edited out.

Question 7

My suggestions:

def _get_values_from_csv(self, column_title, response_body):
    # collect results in a set to eliminate duplicates
    results = set()
    # iterate the DictReader directly
    for dic in csv.DictReader(response_body.split("\r\n")):
        # only add the single column we are interested in
        results.add(dic[column_title])
    # turn the set into a list and sort it
    return_list = list(results)
    return_list.sort()
    return return_list
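
The question's docstring notes that removing duplicates could be a problem when duplicates are what you are looking for. One possible way around that, sketched here as an assumption rather than something proposed in the thread, is an optional flag that skips the set. The name get_column_values, the unique parameter and the sample body are all hypothetical.

```python
import csv


def get_column_values(column_title, response_body, unique=True):
    """Return one column's values, sorted; keep duplicates if unique is False."""
    rows = csv.DictReader(response_body.split("\r\n"))
    values = [row[column_title] for row in rows]
    if unique:
        # collapse repeated values, as the functions in this thread do
        values = set(values)
    return sorted(values)


# Made-up CSV body standing in for the GET response.
body = "id,status\r\n1,open\r\n2,closed\r\n3,open\r\n"
print(get_column_values("status", body))                # ['closed', 'open']
print(get_column_values("status", body, unique=False))  # ['closed', 'open', 'open']
```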

Accepted answer (the shorter sorted(set(...)) function above): Martin Stone, answered 2011-02-16 19:24:29Z, edited Feb 17, 2011 at 8:11.

Comments on the accepted answer:

Thanks! This is brilliant! I'm still trying to wrap my head around how to use dictionaries in a comprehension context like this. So this is exactly what I was looking for. :) (Greg Gauthier, Feb 16, 2011 at 19:39)

sorted() will return a list, so the list() bit isn't needed. (Lennart Regebro, Feb 16, 2011 at 20:54)

Good point, Lennart -- Edited out. (Martin Stone, Feb 17, 2011 at 8:12)
