[Python-ideas] csv.DictReader could handle headers more intelligently.

Fri Jan 25 19:03:03 CET 2013

On 01/25/2013 09:53 AM, Mark Hackett wrote:
> On Friday 25 Jan 2013, Ethan Furman wrote:
>> On 01/25/2013 03:00 AM, Mark Hackett wrote:
>> > On Thursday 24 Jan 2013, Steven D'Aprano wrote:
>> >> - it is less obvious: how does the caller decide that there are too
>> >> many field names?
>> >
>> > Additionally, the user of the library now has to read much more about
>> > the library (either code or documentation, which has to track the code
>> > too), to decide what it is going to do.
>> >
>> > If you have to read the code, then it's not really OO, is it. It's
>> > light grey, not black box.
>>
>> If you have to read the code, the documentation needs improvement.
>
> And if you put your feet too close to the fire, your feet will burn.
>
> Neither have anything to do with the subject at hand, however.
>
> Which is if a dictionary acts a certain way and calling a routine that
> creates a dictionary AND WORKS DIFFERENTLY, then why did you use a
> routine that creates a dictionary?
>
> You see, the option here is to leave it operating as a dictionary
> operates. And in that case, you do not need to document anything. The
> documentation of how it works is already covered by the python basics:
> "How does a dictionary work in Python?".

The csv DictReader *uses* a dictionary for its output. That 
it does so imposes no requirements on how it should parse or 
otherwise handle the input that eventually goes into that 
dict.

I can understand the appeal of keeping things simple and
simply cramming whatever comes out of a simple parse of
the header into the dict keys. Simplicity is good, and
that is a valid opinion. However, it is not a priori the
obviously best one, no matter how much hand-waving and
foot-stomping comes with it.

I would prefer to see a suppressible exception when header
keys are duplicated on the grounds that such a csv file 
is not in general an appropriate input for the DictReader.
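
To make that concrete, here is a rough sketch of what a suppressible
check might look like. The names StrictDictReader, DuplicateHeaderError
and allow_duplicates are hypothetical, invented for this illustration;
nothing like them exists in the csv module, and this is only one
possible shape for the idea, not a proposed patch.

```python
import csv
import io


class DuplicateHeaderError(ValueError):
    """Hypothetical exception; not part of the csv module."""


class StrictDictReader(csv.DictReader):
    """Hypothetical DictReader variant that rejects duplicated header names.

    Passing allow_duplicates=True suppresses the check and falls back to
    the stock DictReader behaviour.
    """

    def __init__(self, f, *args, allow_duplicates=False, **kwds):
        super().__init__(f, *args, **kwds)
        self.allow_duplicates = allow_duplicates
        self._header_checked = False

    def __next__(self):
        if not self._header_checked:
            # Accessing .fieldnames reads the header row if it has not
            # been read yet (it is None for an empty input).
            names = self.fieldnames
            if not self.allow_duplicates and names is not None:
                seen = set()
                dupes = []
                for name in names:
                    if name in seen and name not in dupes:
                        dupes.append(name)
                    seen.add(name)
                if dupes:
                    raise DuplicateHeaderError(
                        "duplicated header names: %r" % (dupes,)
                    )
            # Only mark the header as checked once it has passed, so a
            # rejected header keeps raising on later calls.
            self._header_checked = True
        return super().__next__()


SAMPLE = "id,name,id\n1,spam,2\n"

# Stock behaviour: the later "id" column silently overwrites the earlier one.
print(list(csv.DictReader(io.StringIO(SAMPLE))))

# Sketch behaviour: the duplicated header is reported unless suppressed.
try:
    list(StrictDictReader(io.StringIO(SAMPLE)))
except DuplicateHeaderError as exc:
    print("rejected:", exc)
print(list(StrictDictReader(io.StringIO(SAMPLE), allow_duplicates=True)))
```

With the stock reader the first "id" value is lost without any warning,
which is the behaviour under discussion; the sketch makes the caller
opt in to that explicitly.
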
> So don't change it, and you don't have to improve the documentation.

If it's not changed, then the documentation definitely should
be fixed. The very fact that, when the behaviour was pointed
out here, the result was a long discussion rather than one
or two responses saying "of course it behaves that way"
is the strongest evidence that the current description
is inadequate.