homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Delayed exception using non-text encodings with TextIOWrapper
Type: behavior Stage: resolved
Components: IO Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: georg.brandl, larry, ncoghlan, pitrou, python-dev, serhiy.storchaka, vstinner
Priority: release blocker Keywords: patch

Created on 2014年01月27日 05:01 by ncoghlan, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue20404_check_valid_textio_codec.diff ncoghlan, 2014年01月27日 12:29 Check codec validity in TextIOWrapper.__init__ review
issue20404_check_valid_textio_codec_v2.diff ncoghlan, 2014年02月02日 05:08 Rationalise codec lookups during TextIOWrapper creation review
issue20404_check_valid_textio_codec_v3.diff ncoghlan, 2014年02月04日 10:53 Updated for Serhiy's comments review
issue20404_check_valid_textio_codec-3.3.patch serhiy.storchaka, 2014年02月25日 18:38 review
Messages (12)
msg209396 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014年01月27日 05:01
TextIOWrapper doesn't check the codec type early the way the convenience methods now do, so the open() method currently still suffers from the problems Victor described in issue 19619 for str.encode() and bytes.decode():
>>> with open("hex.txt", 'w') as f:
... f.write("aabbccddeeff")
... 
12
>>> print(open('hex.txt').read())
aabbccddeeff
>>> print(open('hex.txt', encoding='hex').read())
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
TypeError: decoder should return a string result, not 'bytes'
These codecs are never going to work correctly with TextIOWrapper, so we should add the appropriate compatibility check in the constructor.
msg209399 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014年01月27日 05:25
I also created issue 20405 as an RFE for 3.5, since it seems there is a possible gap in capability relative to Python 2 here.
msg209438 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014年01月27日 12:29
Attached patch checks the requested encoding is a valid text encoding in TextIOWrapper.__init__.
Two additional test changes were needed:
- the one that checks handling of non-text-encodings at runtime now tweaks quopri to lie about being a text encoding when creating the stream
- the one that checks the interpreter doesn't crash at shutdown needed to be adjusted to handle the fact that _pyio now also fails in that situation, but with a different error (it can't find the ascii codec because the codec machinery is mostly gone)
Currently, this adds a third lookup of the encoding name to the process of creating a TextIOWrapper instance. This could be reduced to just one by changing the retrieval of the encoder and decoder to look in the retrieved codec info tuple, rather than doing the lookup by name again.
msg209948 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014年02月02日 05:08
Revised patch that avoids doing multiple lookups of the same codec name while creating the stream.
Absent any comments, I'll commit this version with appropriate NEWS and What's New updates tomorrow.
msg209949 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014年02月02日 05:26
Ah, just noticed the test case is still using the overly specific check for the exception wording. I'll fix that, too.
msg209970 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年02月02日 11:17
The _PyCodec_GetIncrementalDecoder name looks too similar to PyCodec_IncrementalDecoder. It would be better to use more different name.
And please note my minor comments on Rietveld.
msg210197 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014年02月04日 10:53
v3:
- prefix for internal helper APIs is now _PyCodecInfo_ to better distinguish them from the ones that take an encoding name
- error check is now just for "is not a text encoding"
- tweaked the name and comment in the error test to be clear that it is codecs that are not marked as text encodings that are rejected, not just binary transforms
- fixed indentation of multi-line expression in _pyio.py
msg210198 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年02月04日 11:00
LGTM.
msg210217 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014年02月04日 12:14
New changeset f3ec00d2b75e by Nick Coghlan in branch 'default':
Close #20404: blacklist non-text encodings in io.TextIOWrapper
http://hg.python.org/cpython/rev/f3ec00d2b75e 
msg212195 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年02月25日 18:38
Here is backported to 3.3 patch.
msg212539 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014年03月02日 08:22
New changeset 140a69d950eb by Georg Brandl in branch '3.3':
Issue #20404: reject non-text encodings early in TextIOWrapper.
http://hg.python.org/cpython/rev/140a69d950eb 
msg244549 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年05月31日 17:21
New changeset cf6e782a7f94 by Serhiy Storchaka in branch '2.7':
Issue #19543: Emit deprecation warning for known non-text encodings.
https://hg.python.org/cpython/rev/cf6e782a7f94 
History
Date User Action Args
2022年04月11日 14:57:57adminsetgithub: 64603
2015年05月31日 17:21:42python-devsetmessages: + msg244549
2014年03月02日 08:22:51python-devsetmessages: + msg212539
2014年02月25日 18:38:23serhiy.storchakasetfiles: + issue20404_check_valid_textio_codec-3.3.patch
nosy: + georg.brandl
messages: + msg212195

2014年02月04日 12:14:10python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg210217

resolution: fixed
stage: commit review -> resolved
2014年02月04日 11:00:20serhiy.storchakasetmessages: + msg210198
stage: test needed -> commit review
2014年02月04日 10:53:39ncoghlansetfiles: + issue20404_check_valid_textio_codec_v3.diff

messages: + msg210197
2014年02月02日 11:17:18serhiy.storchakasetmessages: + msg209970
2014年02月02日 05:26:57ncoghlansetmessages: + msg209949
2014年02月02日 05:08:59ncoghlansetfiles: + issue20404_check_valid_textio_codec_v2.diff

messages: + msg209948
2014年01月27日 12:29:17ncoghlansetfiles: + issue20404_check_valid_textio_codec.diff

nosy: + pitrou
messages: + msg209438

components: + IO
keywords: + patch
2014年01月27日 05:25:43ncoghlansetmessages: + msg209399
2014年01月27日 05:01:16ncoghlancreate

AltStyle によって変換されたページ (->オリジナル) /