Accessing csv header white space and case insensitive

Question 1

I'm overriding the csv.DictReader.fieldnames property as follows, so that all headers from CSV files are read without surrounding white space and in lower case.

import csv

class MyDictReader(csv.DictReader):
    @property
    def fieldnames(self):
        return [field.strip().lower() for field in super(MyDictReader, self).fieldnames]

Now my question is: how can I look up a field so that strip() and lower() are applied to the query key automatically?

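This is how I do it manually:

# newline='' is the mode the csv module documents for files opened in Python 3
with open('csv-file.csv', newline='') as csvFile:
    csvDict = MyDictReader(csvFile)
    for lineDict in csvDict:
        query = ' Column_A'.strip().lower()
        print(lineDict[query])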

Any ideas?

Answer 1

Based on Pedro Romano's suggestion I coded the following example.

import csv

class DictReaderInsensitive(csv.DictReader):
    # This class overrides the csv.DictReader.fieldnames property.
    # All fieldnames are stripped of surrounding white space and lower-cased.
    @property
    def fieldnames(self):
        return [field.strip().lower() for field in super(DictReaderInsensitive, self).fieldnames]

    def __next__(self):
        # Get the result from the original __next__, but store it in a DictInsensitive
        dInsensitive = DictInsensitive()
        dOriginal = super(DictReaderInsensitive, self).__next__()
        # Store all pairs from the old dict in the new, custom one
        for key, value in dOriginal.items():
            dInsensitive[key] = value
        return dInsensitive

class DictInsensitive(dict):
    # This class overrides the __getitem__ method to automatically strip() and lower() the input key
    def __getitem__(self, key):
        return dict.__getitem__(self, key.strip().lower())

For a file containing headers like

"column_A"
" column_A"
"Column_A"
" Column_A"
...

you can access the columns like this:

with open('csv-file.csv', newline='') as csvFile:
    csvDict = DictReaderInsensitive(csvFile)
    for lineDict in csvDict:
        print(lineDict[' Column_A'])  # or
        print(lineDict['Column_A'])   # or
        print(lineDict[' column_a'])  # all return the same value
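
One caveat with this approach: only __getitem__ is overridden, so other dict operations such as the in operator and .get() still compare keys literally. The following is a minimal sketch of that behaviour and of one possible way to cover those two operations as well. DictInsensitiveFull is a hypothetical name introduced here for illustration and is not part of the answer above.

```python
class DictInsensitive(dict):
    """Same idea as above: normalise the key on item lookup only."""

    def __getitem__(self, key):
        return dict.__getitem__(self, key.strip().lower())


class DictInsensitiveFull(DictInsensitive):
    """Hypothetical extension that also normalises `in` and .get()."""

    def __contains__(self, key):
        return dict.__contains__(self, key.strip().lower())

    def get(self, key, default=None):
        return dict.get(self, key.strip().lower(), default)


plain = DictInsensitive({"column_a": "1"})
full = DictInsensitiveFull({"column_a": "1"})

print(plain[" Column_A"])        # item lookup is normalised
print(" Column_A" in plain)      # membership test is not
print(plain.get(" Column_A"))    # neither is .get()
print(" Column_A" in full, full.get(" Column_A"))
```

Whether the extra overrides are worth it depends on how the rows are consumed; if the code only ever uses row[key], the original class is enough.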

Comment 1

When I try to run this, I get the following error on the line return [field.strip().lower() for field in super(MyDictReader, self).fieldnames]: TypeError: must be type, not classobj

Comment 2

@PrestonDocks: I learned from this question that super behaves differently in Python 2 and Python 3. The code should work with Python 3. Did you try that?

Comment 3

The problem is that DictReader is an "old-style" class in Python 2 and a "new-style" class in Python 3. You can fix this by also inheriting from object: class DictReaderInsensitive(csv.DictReader, object):

Answer 2

You'll have to do it in two steps:

1. Create your dict specialisation with a __getitem__ method that applies .strip().lower() to its key parameter.
2. Override __next__ on your specialised MyDictReader class to return one of your special dictionaries, initialised with the dictionary returned by the csv.DictReader superclass's __next__ method.
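
The two steps can be sketched roughly as follows for Python 3. This variant is only one possible reading of the suggestion: the names normalise and NormalisedDict are hypothetical, the sample data is made up, and the dict here normalises its keys on construction, so the fieldnames property does not need to be overridden.

```python
import csv
import io


def normalise(key):
    # Hypothetical helper: only strings are normalised, because DictReader
    # may use a non-string restkey (None by default) for surplus columns.
    return key.strip().lower() if isinstance(key, str) else key


class NormalisedDict(dict):
    """Step 1: a dict that normalises keys on construction and on lookup."""

    def __init__(self, mapping):
        super().__init__((normalise(k), v) for k, v in mapping.items())

    def __getitem__(self, key):
        return super().__getitem__(normalise(key))


class MyDictReader(csv.DictReader):
    """Step 2: wrap each row from the parent's __next__ in NormalisedDict."""

    def __next__(self):
        return NormalisedDict(super().__next__())


# In-memory sample standing in for a real CSV file (assumed data).
sample = io.StringIO(" Column_A,column_B \n1,2\n3,4\n")
rows = list(MyDictReader(sample))
print(rows[0][" column_a"], rows[1]["COLUMN_B"])
```

Note that if two headers normalise to the same name, the later column silently replaces the earlier one in each row.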

user1251007 · Accepted Answer · 2012-10-19 08:42:03Z
