Message 139969 - Python tracker

In-reply-to
Author	vstinner
Recipients	vstinner
Date	2011-07-07.11:14:34
Message-id	<1310037275.84.0.0119141520986.issue12512@psf.upfronthosting.co.za>

Content

The following code fails with an AssertionError('###\ufeffdef'):
import codecs

# Use codecs.open to reproduce the bug; switch to the built-in open()
# (a TextIOWrapper) to compare.
_open = codecs.open
#_open = open
filename = "test"

with _open(filename, 'w', encoding='utf_16') as f:
    f.write('abc')
    pos = f.tell()

with _open(filename, 'w', encoding='utf_16') as f:
    f.seek(pos)
    f.write('def')    # a BOM is written here, in the middle of the file
    f.seek(0)
    f.write('###')

with _open(filename, 'r', encoding='utf_16') as f:
    content = f.read()
    assert content == '###def', ascii(content)
This is a bug in StreamWriter.seek(): after seeking to a non-zero position, it should update the encoder state so that no new BOM is written. The fix has to go into the StreamWriter class of each stateful codec, or else a stateful StreamWriter class should be implemented in the codecs module.
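A rough sketch of what such a fix could look like for utf_16 alone, assuming the writer keeps its encoder function in the `encoder` attribute, as `encodings.utf_16.StreamWriter` does in CPython (an implementation detail, not a documented API). The class name `SeekAwareUTF16Writer` is hypothetical, and this is not the patch proposed for the issue:

```python
import codecs
import io
import sys
from encodings import utf_16


class SeekAwareUTF16Writer(utf_16.StreamWriter):
    """Hypothetical writer that skips the BOM after seeking away from offset 0."""

    def seek(self, offset, whence=0):
        self.stream.seek(offset, whence)
        if whence == 0 and offset == 0:
            # Back at the start of the stream: the next write needs a BOM.
            self.reset()
        elif sys.byteorder == 'little':
            # Elsewhere: select the native-order encoder, which writes no BOM.
            self.encoder = codecs.utf_16_le_encode
        else:
            self.encoder = codecs.utf_16_be_encode


# Same write/seek sequence as in the report, on an in-memory stream.
buf = io.BytesIO()
writer = SeekAwareUTF16Writer(buf)
writer.write('abc')
pos = buf.tell()
writer.seek(pos)
writer.write('def')
writer.seek(0)
writer.write('###')
content = buf.getvalue().decode('utf_16')
```

Every other stateful codec would need its own equivalent of the `elif`/`else` branch, which is why a generic stateful StreamWriter in the codecs module is the alternative.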
Python supports the following stateful codecs:
 * cp932
 * cp949
 * cp950
 * euc_jis_2004
 * euc_jisx0213
 * euc_jp
 * euc_kr
 * gb18030
 * gbk
 * hz
 * iso2022_jp
 * iso2022_jp_1
 * iso2022_jp_2
 * iso2022_jp_2004
 * iso2022_jp_3
 * iso2022_jp_ext
 * iso2022_kr
 * shift_jis
 * shift_jis_2004
 * shift_jisx0213
 * utf_8_sig
 * utf_16
 * utf_32
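For the three BOM-writing codecs at the end of this list, the encoder state can be observed with the incremental encoders from the codecs module. The snippet below is only an illustration: it encodes the same character twice, resets, and encodes again, recording the byte lengths. The `results` dictionary is a name chosen for this example.

```python
import codecs

results = {}
for name in ('utf_8_sig', 'utf_16', 'utf_32'):
    encoder = codecs.getincrementalencoder(name)()
    first = encoder.encode('a')         # fresh encoder: BOM + character
    second = encoder.encode('a')        # state changed: character only
    encoder.reset()                     # back to the initial state
    after_reset = encoder.encode('a')   # BOM is written again
    results[name] = (len(first), len(second), len(after_reset))
```

The first and third chunks are longer than the second by exactly the size of the BOM. StreamWriter.seek() has to manage this same "BOM already written" state.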
This bug has already been fixed in TextIOWrapper: issue #5006.
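For comparison, here is the same sequence with the built-in open(), which returns a TextIOWrapper. This is a sketch that writes to a temporary directory (an assumption of this example, not part of the original report) so that it leaves no file behind:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    filename = os.path.join(tmpdir, "test")

    with open(filename, 'w', encoding='utf_16') as f:
        f.write('abc')
        pos = f.tell()

    with open(filename, 'w', encoding='utf_16') as f:
        f.seek(pos)       # non-zero position: the encoder is told to skip the BOM
        f.write('def')
        f.seek(0)         # start of file: the encoder is reset, BOM wanted again
        f.write('###')

    with open(filename, 'r', encoding='utf_16') as f:
        content = f.read()
```

With the fix from issue #5006 in place, `content` should contain no embedded '\ufeff'.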

History
Date	User	Action	Args
2011-07-07 11:14:35	vstinner	set	recipients: + vstinner
2011-07-07 11:14:35	vstinner	set	messageid: <1310037275.84.0.0119141520986.issue12512@psf.upfronthosting.co.za>
2011-07-07 11:14:35	vstinner	link	issue12512 messages
2011-07-07 11:14:34	vstinner	create
