This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: 'enumerate' 'start' parameter documentation is confusing
Type: behavior
Stage:
Components: Documentation
Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: eric.araujo, phammer, python-dev, r.david.murray, rhettinger, terry.reedy
Priority: low
Keywords:

Created on 2011-04-20 16:08 by phammer, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (8)
msg134162 - Author: Peter Hammer (phammer) Date: 2011-04-20 16:08
"""
A point of confusion using the builtin function 'enumerate' and
enlightenment for those who, like me, have been confused.
Note, this confusion was discussed at length at
 http://bugs.python.org/issue2831
prior to the 'start' parameter being added to 'enumerate'. The
confusion discussed herein was foreseen in that discussion, and
ultimately discounted. There remains, IMO, an issue with the
clarity of the documentation that needs to be addressed. That
is, the closed issue at
 http://bugs.python.org/issue8635
concerning the 'enumerate' docstring does not address the confusion
that prompted this posting.
Consider:
x=['a','b','c','d','e']
y=['f','g','h','i','j']
print 0,y[0]
for i,c in enumerate(y,1):
    print i,c
    if c=='g':
        print x[i], 'y[%i]=g' % (i)
        continue
    print x[i]
This code produces the following unexpected output, using python 2.7,
which is apparently the correct behavior (see commentary below). This
example is an abstract simplification of a program defect encountered
in practice:
>>> 
0 f
1 f
b
2 g
c y[2]=g
3 h
d
4 i
e
5 j
Traceback (most recent call last):
  File "Untitled", line 9
    print x[i]
IndexError: list index out of range
Help on 'enumerate' yields:
>>> help(enumerate)
Help on class enumerate in module __builtin__:
class enumerate(object)
 | enumerate(iterable[, start]) -> iterator for index, value of iterable
 | 
 | Return an enumerate object. iterable must be another object that supports
 | iteration. The enumerate object yields pairs containing a count (from
 | start, which defaults to zero) and a value yielded by the iterable argument.
 | enumerate is useful for obtaining an indexed list:
 | (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
 | 
 | Methods defined here:
 | 
 | __getattribute__(...)
 | x.__getattribute__('name') <==> x.name
 | 
 | __iter__(...)
 | x.__iter__() <==> iter(x)
 | 
 | next(...)
 | x.next() -> the next value, or raise StopIteration
 | 
 | ----------------------------------------------------------------------
 | Data and other attributes defined here:
 | 
 | __new__ = <built-in method __new__ of type object>
 | T.__new__(S, ...) -> a new object with type S, a subtype of T
>>> 
Commentary:
The expected output was:
>>>
0 f
1 g
b y[1]=g
2 h
c
3 i
d
4 j
e
>>>
That is, it was expected that the iterator would yield a value
corresponding to the index, whether the index started at zero or not.
Using the notation of the doc string, with start=1, the expected
behavior was:
 | (1, seq[1]), (2, seq[2]), (3, seq[3]), ...
while the actual behavior is:
 | (1, seq[0]), (2, seq[1]), (3, seq[2]), ...
The practical problem in the real world code was to do something
special with the zero index value of x and y, then run through the
remaining values, doing one of two things with x and y, correlated,
depending on the value of y.
I can see now that the doc string does in fact correctly specify the
actual behavior: nowhere does it say the iterator will begin at any
other place than the beginning, so this is not a python bug. I do
however question the general usefulness of such behavior. Normally,
indices and values are expected to be correlated.
The correct behavior can be simply implemented without using
'enumerate':
x=['a','b','c','d','e']
y=['f','g','h','i','j']
print 0,y[0]
for i in xrange(1,len(y)):
    c=y[i]
    print i,c
    if c=='g':
        print x[i], 'y[%i]=g' % (i)
        continue
    print x[i]
This produces the expected results.
If one insists on using enumerate to produce the correct behavior in
this example, it can be done as follows:
"""
x=['a','b','c','d','e']
y=['f','g','h','i','j']
seq=enumerate(y)
print '%s %s' % seq.next()
for i,c in seq:
    print i,c
    if c=='g':
        print x[i], 'y[%i]=g' % (i)
        continue
    print x[i]
"""
This version produces the expected results, while achieving clarity
comparable to that which was sought in the original incorrect code.
Looking a little deeper, the python documentation on enumerate states:
enumerate(sequence[, start=0])
Return an enumerate object. sequence must be a sequence, an iterator,
or some other object which supports iteration. The next() method of the
iterator returned by enumerate() returns a tuple containing a count
(from start which defaults to 0) and the corresponding value obtained
from iterating over iterable. enumerate() is useful for obtaining an
indexed series:
 (0, seq[0]), (1, seq[1]), (2, seq[2]),
This makes a pretty clear implication the value corresponds to the
index, so perhaps there really is an issue here. Have at it. I'm
going back to work, using 'enumerate' as it actually is, now that I
clearly understand it.
One thing is certain: the documentation has to be clarified, for the
confusion foreseen prior to adding the start parameter is very real.
"""
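For readers arriving at this report today, one way to get the index-correlated pairs the report expected is to slice off the leading items and pass the same offset as `start`. The following is a minimal Python 3 sketch, not part of the original report; the helper name `correlated_from` is hypothetical, and it assumes the input is a sequence that supports slicing.

```python
def correlated_from(seq, start):
    """Yield (index, seq[index]) pairs beginning at position `start`.

    Hypothetical helper for illustration; assumes `seq` supports slicing.
    """
    # Slicing skips the first `start` items; passing the same number as
    # enumerate's `start` keeps each count equal to the item's real index.
    return enumerate(seq[start:], start)


x = ['a', 'b', 'c', 'd', 'e']
y = ['f', 'g', 'h', 'i', 'j']

print(0, y[0])
for i, c in correlated_from(y, 1):
    if c == 'g':
        print(i, c, x[i], 'y[%i]=g' % i)
    else:
        print(i, c, x[i])
```

Because the count and the slice share one offset, `x[i]` never runs past the end of `x`, which is the failure shown in the traceback above.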
msg134169 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-04-20 17:52
If you know what an iterator is, the documentation, it seems to me, is clear. That is, an iterator cannot be indexed, so the behavior you expected could not be implemented by enumerate.
That doesn't mean the docs shouldn't be improved. An example with a non-zero start would make things clear.
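The point about iterators can be illustrated with a short Python 3 sketch. The generator `letters` is a hypothetical stand-in for any iterable that has no positions to look up; the snippet only shows that `enumerate` counts from `start` while consuming items in order.

```python
def letters():
    """A generator: it can be iterated, but it has no seq[i] to look up."""
    for ch in 'fghij':
        yield ch


# enumerate simply counts upward from `start` while consuming the iterator.
pairs = list(enumerate(letters(), 1))
print(pairs)

# Subscripting the generator itself fails, so enumerate has no general way
# to jump to "index start" of an arbitrary iterable.
try:
    letters()[1]
except TypeError as exc:
    print('TypeError:', exc)
```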
msg134290 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-04-23 01:31
Note: 3.x correctly gives the signature as enumerate(iterable, start) rather than enumerate(sequence, start).
I agree that the current entry is a bit awkward. Perhaps the doc would be clearer with a reference to zipping. Removing the unneeded definition of *iterable* (which should be linked to the definition in the glossary, along with *iterator*), my suggestion is:
'''
enumerate(iterable, start=0)
Return an enumerate object, an *iterator* of tuples, that zips together a sequence of counts and *iterable*. Each tuple contains a count and an item from *iterable*, in that order. The counts begin with *start*, which defaults to 0. enumerate() is useful for obtaining an indexed series: enumerate(seq) produces (0, seq[0]), (1, seq[1]), (2, seq[2]), .... For another example, which uses *start*:
>>> for i, season in enumerate(['Spring','Summer','Fall','Winter'], 1):
...     print(i, season)
1 Spring
2 Summer
3 Fall
4 Winter
'''
Note that I changed the example to use a start of 1 instead of 0, to produce a list in traditional form, which is one reason to have the parameter!
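The "zipping" description can be made concrete. The sketch below is a pure-Python approximation in Python 3, offered only as an illustration; `enumerate_via_zip` is a hypothetical name and this is not a claim about how the built-in is implemented.

```python
from itertools import count


def enumerate_via_zip(iterable, start=0):
    """Hypothetical pure-Python stand-in for enumerate, for illustration only."""
    # count(start) yields start, start+1, ...; zip stops when `iterable` ends.
    return zip(count(start), iterable)


seasons = ['Spring', 'Summer', 'Fall', 'Winter']

# The sketch agrees with the built-in for a non-zero start.
assert list(enumerate_via_zip(seasons, 1)) == list(enumerate(seasons, 1))

for i, season in enumerate_via_zip(seasons, 1):
    print(i, season)
```

Seen this way, `start` only chooses where the counter begins; it never selects where in `iterable` the items begin.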
msg134302 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-04-23 14:42
+1 to what David says.
Terry’s patch is a good starting point; I think Raymond will commit something along its lines.
msg134311 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-23 16:54
I've got it from here. Thanks.
msg136824 - Author: Peter Hammer (phammer) Date: 2011-05-25 03:22
"""
Changing the 'enumerate' doc string text from:
| (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
to:
| (start, seq[0]), (start+1, seq[1]), (start+2, seq[2]), ...
would completely disambiguate the doc string at the modest cost of
sixteen additional characters, a small price for pellucid clarity.
The proposed changes to the formal documentation also seem to me to
be prudent, and I hope at this late writing, they have already been
committed.
I conclude with a code fragment for the edification of R. David Murray.
"""
class numerate(object):
    """
    A demonstration of a plausible incorrect interpretation of
    the 'enumerate' function's doc string and documentation.
    """
    def __init__(self,seq,start=0):
        self.seq=seq; self.index=start-1
        try:
            if seq.next: pass #test for iterable
            for i in xrange(start): self.seq.next()
        except:
            if type(seq)==dict: self.seq=seq.keys()
            self.seq=iter(self.seq[start:])
    def next(self):
        self.index+=1
        return self.index,self.seq.next()

    def __iter__(self): return self
if __name__ == "__main__":
    #s=['spring','summer','autumn','winter']
    s={'spring':'a','summer':'b','autumn':'c','winter':'d'}
    #s=enumerate(s)#,2)
    s=numerate(s,2)
    for t in s: print t
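The same alternative reading can be sketched more compactly in Python 3 with `itertools.islice`. This is an illustrative rewrite under the assumption that "start" means "skip the first `start` items"; it reuses the name `numerate` from the fragment above for comparison and does not describe the behavior of the built-in `enumerate`.

```python
from itertools import islice


def numerate(iterable, start=0):
    """Yield (index, item) pairs, skipping the first `start` items.

    Illustrative only: this is the interpretation the report describes,
    not how the built-in enumerate behaves.
    """
    # enumerate numbers every item from 0; islice then discards the first
    # `start` pairs, so each surviving count is the item's true position.
    return islice(enumerate(iterable), start, None)


s = ['spring', 'summer', 'autumn', 'winter']
print(list(numerate(s, 2)))   # [(2, 'autumn'), (3, 'winter')]
print(list(enumerate(s, 2)))  # [(2, 'spring'), (3, 'summer'), (4, 'autumn'), (5, 'winter')]
```

Since `islice` consumes items one at a time, this version also accepts plain iterators, at the cost of stepping through the skipped items rather than jumping past them.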
msg139051 - Author: Roundup Robot (python-dev) (Python triager) Date: 2011-06-25 12:57
New changeset 0ca8ffffd90b by Raymond Hettinger in branch '2.7':
Issue 11889: Clarify docs for enumerate.
http://hg.python.org/cpython/rev/0ca8ffffd90b 
msg139054 - Author: Roundup Robot (python-dev) (Python triager) Date: 2011-06-25 13:01
New changeset d0df12b32522 by Raymond Hettinger in branch '3.2':
Issue 11889: Clarify docs for enumerate.
http://hg.python.org/cpython/rev/d0df12b32522
New changeset 9b827e3998f6 by Raymond Hettinger in branch 'default':
Issue 11889: Clarify docs for enumerate.
http://hg.python.org/cpython/rev/9b827e3998f6 
History
Date | User | Action | Args
2022-04-11 14:57:16 | admin | set | github: 56098
2011-06-25 13:02:18 | rhettinger | set | status: open -> closed; resolution: fixed
2011-06-25 13:01:22 | python-dev | set | messages: + msg139054
2011-06-25 12:57:13 | python-dev | set | nosy: + python-dev; messages: + msg139051
2011-05-25 04:46:57 | rhettinger | set | priority: normal -> low
2011-05-25 03:22:38 | phammer | set | messages: + msg136824
2011-04-23 16:54:45 | rhettinger | set | messages: + msg134311
2011-04-23 14:42:00 | eric.araujo | set | nosy: + eric.araujo; messages: + msg134302
2011-04-23 01:31:46 | terry.reedy | set | nosy: + terry.reedy; messages: + msg134290; versions: + Python 3.2, Python 3.3
2011-04-20 19:27:27 | rhettinger | set | assignee: rhettinger; components: + Documentation, - None; nosy: + rhettinger
2011-04-20 17:52:26 | r.david.murray | set | nosy: + r.david.murray; messages: + msg134169
2011-04-20 16:08:55 | phammer | create |