
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Clarify that readlines() is not needed to iterate over a file
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: ezio.melotti
Nosy List: ashwch, cvrebert, dan.riti, docs@python, eric.araujo, ezio.melotti, kushal.das, meador.inge, mikehoy, peter.otten, pitrou, python-dev, terry.reedy
Priority: normal
Keywords: easy, patch

Created on 2011-11-30 17:42 by peter.otten, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name, Uploaded, Description
demote-readlines.patch, dan.riti, 2013-04-13 20:13, Reorganize reading lines from a file based on comments.
demote-readlines-v2.patch, dan.riti, 2013-04-15 01:43, Incorporate Ezio's comment.
demote-readlines-v3.patch, dan.riti, 2013-04-15 15:40, Update patch to include changes to io.rst.
Messages (17)
msg148679 - Author: Peter Otten (peter.otten) Date: 2011-11-30 17:42
I've been looking at code on the tutor mailing list for some time, and
for line in file.readlines(): ...
is a common idiom there. I suppose this is because the readlines() method is easily discoverable while the proper way (iterate over the file object directly) is not.
A note added to the readlines() documentation might help:
"""
You don't need the readlines() method to loop over the lines of a file.
for line in file: process(line)
consumes less memory and is often faster.
"""
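A minimal sketch of the two idioms compared above. The helper names `count_with_readlines` and `count_by_iterating` and the temporary demo file are assumptions made up for illustration; they do not come from the tutor list code.

```python
import os
import tempfile


def count_with_readlines(path):
    # readlines() builds the complete list of lines in memory before the loop starts.
    with open(path) as f:
        return sum(1 for line in f.readlines())


def count_by_iterating(path):
    # Iterating over the file object yields one line at a time.
    with open(path) as f:
        return sum(1 for line in f)


# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    demo_path = tmp.name

try:
    n_readlines = count_with_readlines(demo_path)
    n_iter = count_by_iterating(demo_path)
finally:
    os.remove(demo_path)
```

Both functions see the same lines; the difference is only in how much of the file is held in memory at once.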
msg148693 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2011-11-30 22:54
The current line is
"Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint."
I would like to see added
"Since a file is a line iterator, file.readlines() == list(file). To save time and space, iterate over lines of a file with for line in file: process(line)."
(with code markup for the two snippets).
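The equivalence in the proposed wording can be checked with a short sketch. It uses `io.StringIO` as a stand-in for a real file (an assumption made so the example needs no file on disk) and rewinds between reads, because each pass consumes the stream.

```python
import io

f = io.StringIO("one\ntwo\nthree\n")

via_readlines = f.readlines()   # reads every remaining line into a list
f.seek(0)                       # rewind; the stream was consumed above
via_list = list(f)              # same result through the iterator protocol
f.seek(0)

# The loop form processes lines one at a time without building that list first.
stripped = [line.rstrip("\n") for line in f]
```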
msg148703 - Author: Meador Inge (meador.inge) (Python committer) Date: 2011-12-01 05:12
I am skeptical that such a note will help. The iterator behavior is clearly pointed out in the Python Tutorial [1] and in the IOBase documentation [2]. I suspect this bad code pattern is just being copied and pasted from other sources without folks ever even looking at the Python documentation.
[1] http://docs.python.org/dev/tutorial/inputoutput.html?highlight=readlines#methods-of-file-objects
[2] http://docs.python.org/dev/library/io.html#io.IOBase 
msg148704 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2011-12-01 05:23
FWIW I've seen several persons using "for line in file.readlines(): ..." or even "while 1: line = file.readline()". IMHO it's a good idea to document that "without sizehint, it's equivalent to list(file)" and that "for line in file: ..." can be used directly. Even if some people don't read the doc, the ones who do will benefit from this. The same note might also be added to the docstring (I think it's somewhat common to learn about readlines() through dir(file) + help(file.readlines)).
msg173730 - Author: Chris Rebert (cvrebert) Date: 2012-10-25 05:17
file.readlines() (and perhaps dare I say even file.readline()) should not even be mentioned in the tutorial, IMO. It is difficult to imagine a use case where just iterating over the file object isn't superior. I cannot remember the last time that I have used either of these methods. They ought to be relegated to the library docs. Presenting `for line in a_file:` as merely "An alternative approach" in the official tutorial is practically archaic.
msg173761 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2012-10-25 14:36
readlines might be discouraged; readline has its use cases (two days ago I used it to get one line to pass to csv.Sniffer.sniff), but next(file) works too, so it could be de-emphasized. There may be a difference with respect to the trailing newline however; I don’t remember if __iter__ or readline keep it or not.
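On the trailing-newline question, a small sketch suggests that `readline()` and `next(file)` both keep the newline; where they differ is at end of file. `io.StringIO` stands in for a real file here, and the sample text is made up.

```python
import io

text = "header,value\nrow,1\n"

# Both ways of fetching a single line keep the trailing newline.
first_readline = io.StringIO(text).readline()
first_next = next(io.StringIO(text))

# At end of file the two differ: readline() returns an empty string,
# while next() raises StopIteration unless a default is supplied.
exhausted = io.StringIO("")
eof_readline = exhausted.readline()
eof_next = next(exhausted, None)
```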
msg185488 - Author: Kushal Das (kushal.das) (Python committer) Date: 2013-03-29 02:52
Working on a patch for this.
msg186818 - Author: Dan Riti (dan.riti) Date: 2013-04-13 20:13
After reading the comments, I generated a patch that does the following:
- Reorganize to present `for line in f:` as the first approach for reading lines. I refrained from saying it's the *preferred* approach, however I can add that if desired.
- Reorganize to present `f.readlines()` as the alternative approach.
Any feedback is more than welcome! Thanks.
msg186827 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2013-04-13 20:25
In 3.x I think we'll want to drop the following sentence: "Since the two approaches manage line buffering differently, they should not be mixed". But it can wait for another issue.
msg186936 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2013-04-14 17:51
I would actually remove the whole section about readlines() or possibly just mention it briefly (something like "If you want to read all the lines of a file in a list you can also use f.readlines().")
The sizehint arg is rarely used, so I don't see the point of going in such details about it in the tutorial. In Lib/, there are only a couple of places where it's actually used:
Lib/fileinput.py:358: self._buffer = self._file.readlines(self._bufsize)
Lib/idlelib/GrepDialog.py:90: block = f.readlines(100000)
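For context, the size-hint form is typically used to read a file in blocks of lines, roughly as in the sketch below. The helper `iter_blocks`, the hint value, and the `io.StringIO` input are assumptions for illustration and are not taken from the two Lib/ call sites.

```python
import io


def iter_blocks(f, hint):
    # Repeatedly ask for roughly `hint` characters' worth of whole lines.
    # readlines() returns an empty list once the stream is exhausted.
    while True:
        block = f.readlines(hint)
        if not block:
            break
        yield block


stream = io.StringIO("".join("line %d\n" % i for i in range(10)))
blocks = list(iter_blocks(stream, 20))

# Flattening the blocks gives back every line, in order.
all_lines = [line for block in blocks for line in block]
```

Each block holds only complete lines, so the hint bounds memory use approximately rather than exactly.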
msg186937 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2013-04-14 17:52
> I would actually remove the whole section about readlines() or
> possibly just mention it briefly (something like "If you want to read
> all the lines of a file in a list you can also use f.readlines().")
> The sizehint arg is rarely used, so I don't see the point of going in
> such details about it in the tutorial.
You are right, Ezio.
msg186940 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2013-04-14 17:59
I agree with removing .readlines section. If .readlines did not exist, I do not think we would add it now.
msg186965 - Author: Dan Riti (dan.riti) Date: 2013-04-15 01:43
Added a new version of the patch to incorporate Ezio's comment!
msg186966 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2013-04-15 03:20
Patch LGTM. I think it would also be better to say something like "Note that it's already possible to iterate on file objects using ``for line in file: ...`` without calling file.readlines()." in Doc/library/io.rst:readlines, as suggested in msg148703.
msg187002 - Author: Dan Riti (dan.riti) Date: 2013-04-15 15:40
Agreed Ezio, I've updated the patch to include the change to Doc/library/io.rst:readlines.
msg187005 - Author: Roundup Robot (python-dev) (Python triager) Date: 2013-04-15 16:09
New changeset 1e8be05a4039 by Ezio Melotti in branch '3.3':
#13510: clarify that f.readlines() is note necessary to iterate over a file. Patch by Dan Riti.
http://hg.python.org/cpython/rev/1e8be05a4039
New changeset 7f4325dc4256 by Ezio Melotti in branch 'default':
#13510: merge with 3.3.
http://hg.python.org/cpython/rev/7f4325dc4256
New changeset 6a4746b0afaf by Ezio Melotti in branch '2.7':
#13510: clarify that f.readlines() is note necessary to iterate over a file. Patch by Dan Riti.
http://hg.python.org/cpython/rev/6a4746b0afaf 
msg187006 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2013-04-15 16:13
Fixed, thanks for the patch!
I also realized I missed Terry's suggestion about file.readlines() == list(file), so I added that too.
History
Date, User, Action, Args
2022-04-11 14:57:24, admin, set, github: 57719
2013-04-15 16:13:13, ezio.melotti, set, status: open -> closed; messages: + msg187006; assignee: docs@python -> ezio.melotti; resolution: fixed; stage: needs patch -> resolved
2013-04-15 16:09:59, python-dev, set, nosy: + python-dev; messages: + msg187005
2013-04-15 15:40:40, dan.riti, set, files: + demote-readlines-v3.patch; messages: + msg187002
2013-04-15 03:20:23, ezio.melotti, set, messages: + msg186966
2013-04-15 01:43:29, dan.riti, set, files: + demote-readlines-v2.patch; messages: + msg186965
2013-04-14 17:59:47, terry.reedy, set, messages: + msg186940
2013-04-14 17:52:46, pitrou, set, messages: + msg186937
2013-04-14 17:51:28, ezio.melotti, set, messages: + msg186936
2013-04-13 20:25:35, pitrou, set, nosy: + pitrou; messages: + msg186827
2013-04-13 20:13:36, dan.riti, set, files: + demote-readlines.patch; nosy: + dan.riti; messages: + msg186818; keywords: + patch
2013-03-29 02:52:23, kushal.das, set, nosy: + kushal.das; messages: + msg185488
2013-03-29 00:59:22, ezio.melotti, set, versions: - Python 3.2
2013-03-27 19:50:09, ashwch, set, nosy: + ashwch
2012-10-27 02:16:27, mikehoy, set, nosy: + mikehoy
2012-10-25 14:40:01, ezio.melotti, set, keywords: + easy; versions: + Python 3.4
2012-10-25 14:36:53, eric.araujo, set, messages: + msg173761
2012-10-25 05:17:18, cvrebert, set, nosy: + cvrebert; messages: + msg173730
2011-12-01 05:23:25, ezio.melotti, set, messages: + msg148704
2011-12-01 05:12:28, meador.inge, set, nosy: + meador.inge; messages: + msg148703
2011-11-30 22:54:26, terry.reedy, set, nosy: + terry.reedy; messages: + msg148693
2011-11-30 18:09:19, ezio.melotti, set, nosy: + ezio.melotti, eric.araujo; stage: needs patch; versions: + Python 2.7, Python 3.2, Python 3.3
2011-11-30 17:42:01, peter.otten, create
