
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: codecs.StreamReader.read behaves differently from regular files
Type: behavior
Stage: resolved
Components:
Versions: Python 2.7, Python 2.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: When I use codecs.open(...) and f.readline() follow up by f.read() return bad result
View: 8260
Assigned To:
Nosy List: A.S, serhiy.storchaka, tdb, vstinner
Priority: normal
Keywords:

Created on 2012-04-02 13:13 by tdb, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (4)
msg157355 - (view) Author: Mikko Rasa (tdb) Date: 2012-04-02 13:13
For regular files, a read() call without arguments reads until EOF. codecs.StreamReader does its own buffering, and if there are characters in its buffer, a read() call is satisfied from that buffer without any attempt to read the rest of the file. This discrepancy causes some code that works with regular open() to fail when codecs.open() is substituted.
The easiest way to reproduce this is to first call readline() and then read(). Since readline() can't know how many characters are on the line, it will almost always leave some characters in the buffer, triggering the problem with read().
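A minimal sketch of that reproduction, assuming an in-memory byte stream in place of a real file. The sample data and the helper name rest_after_readline are made up for illustration. On the affected versions listed above, the codecs result may be shorter than the io result; on interpreters where the superseding issue has been fixed, both should match.

```python
import codecs
import io

# Hypothetical sample data: a short first line followed by enough text that
# readline() is likely to buffer more than the first line.
data = (u"first line\n" + u"x" * 200 + u"\n" + u"last line\n").encode("utf-8")

def rest_after_readline(stream):
    """Read one line, then return whatever a bare read() yields afterwards."""
    stream.readline()
    return stream.read()

# Everything after the first line is what read() is expected to return.
expected = data.decode("utf-8").split(u"\n", 1)[1]

# Regular text I/O: read() continues to EOF.
text_rest = rest_after_readline(io.TextIOWrapper(io.BytesIO(data), encoding="utf-8"))

# codecs.StreamReader: on affected versions read() may stop at the end of
# the reader's internal character buffer.
codecs_rest = rest_after_readline(codecs.getreader("utf-8")(io.BytesIO(data)))

print(len(expected), len(text_rest), len(codecs_rest))
```

If the last number printed is smaller than the first two, the interpreter shows the behavior described in this report.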
msg157380 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-04-02 20:03
Oh, yet another bug in codecs.StreamReader. I should add it to the PEP :-)
http://www.python.org/dev/peps/pep-0400/
Use io.TextIOWrapper (open) instead of codecs.StreamReader (codecs.open); it's bug-free :-)
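A rough sketch of that substitution, assuming a temporary file created only to keep the example self-contained. The file contents and variable names are hypothetical. io.open() exists on Python 2.6+ and is the same function as the built-in open() on Python 3.

```python
import io
import os
import tempfile

# Hypothetical sample file, created only so the sketch can run on its own.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(u"header\n" + u"body line\n" * 50)

    # Previously: f = codecs.open(path, "r", encoding="utf-8")
    with io.open(path, "r", encoding="utf-8") as f:
        header = f.readline()
        body = f.read()  # continues to EOF, like a regular file
finally:
    os.remove(path)

print(repr(header), len(body))
```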
msg160941 - (view) Author: Andrew (A.S) Date: 2012-05-16 23:37
I just hit this behavior with readlines(), which is unsurprising since it internally uses read() as described in the original bug report.
The break on line 468 of codecs.py seems to be the problem; removing that conditional locally fixes it.
http://hg.python.org/cpython/file/f6a207d86154/Lib/codecs.py#l466
I may be overlooking something, but I would assume this should check whether the character buffer extends to the EOF of the underlying stream at this point?
As stated before, this can be reproduced by:
f = codecs.open(...)
f.read()
f.readlines()
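For code that has to keep using codecs.StreamReader, one possible workaround (not proposed in this thread, only a sketch) is to call read() repeatedly until it returns an empty string. The helper name read_to_eof and the sample data are assumptions made for illustration.

```python
import codecs
import io

def read_to_eof(reader):
    """Call read() until it returns an empty string and join the pieces."""
    pieces = []
    while True:
        piece = reader.read()
        if not piece:
            break
        pieces.append(piece)
    return u"".join(pieces)

# Hypothetical sample data, long enough that readline() is likely to leave
# characters behind in the reader's buffer.
data = (u"first line\n" + u"payload " * 40 + u"\nend\n").encode("utf-8")

reader = codecs.getreader("utf-8")(io.BytesIO(data))
first = reader.readline()
remainder = read_to_eof(reader)

print(len(first) + len(remainder), len(data.decode("utf-8")))
```

Where read() already runs to EOF, the loop simply ends after one extra call, so the helper should be harmless there.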
msg177122 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-07 20:03
This is obviously a duplicate of issue8260 and issue12446.
History
Date User Action Args
2022-04-11 14:57:28 admin set github: 58680
2012-12-07 20:03:38 serhiy.storchaka set status: open -> closed
superseder: When I use codecs.open(...) and f.readline() follow up by f.read() return bad result
nosy: + serhiy.storchaka
messages: + msg177122
resolution: duplicate
stage: resolved
2012-05-16 23:37:34 A.S set nosy: + A.S
messages: + msg160941
2012-04-02 20:03:58 vstinner set nosy: + vstinner
messages: + msg157380
2012-04-02 13:13:33 tdb create
