Message139969
| Author | vstinner |
| Recipients | vstinner |
| Date | 2011-07-07 11:14:34 |
| Message-id | <1310037275.84.0.0119141520986.issue12512@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
The following code fails with an AssertionError('###\ufeffdef'):
import codecs
_open = codecs.open
#_open = open
filename = "test"
with _open(filename, 'w', encoding='utf_16') as f:
    f.write('abc')
    pos = f.tell()
with _open(filename, 'w', encoding='utf_16') as f:
    f.seek(pos)
    f.write('def')
    f.seek(0)
    f.write('###')
with _open(filename, 'r', encoding='utf_16') as f:
    content = f.read()
    assert content == '###def', ascii(content)
It is a bug in StreamWriter.seek(): it should update the encoder state to not write a new BOM. It has to be fixed in the StreamWriter class of each stateful codec, or a stateful StreamWriter class should be implemented in the codecs module.
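One possible shape for such a stateful writer, as a rough sketch only: the class name StatefulStreamWriter and its extra encoding argument are hypothetical and not part of the codecs module. It delegates encoding to an incremental encoder and adjusts that encoder's state in seek(). The use of setstate(0) to mean "BOM already written" is only checked here for utf_16; other stateful codecs may need different handling.

```python
import codecs
import io


class StatefulStreamWriter(codecs.StreamWriter):
    """Hypothetical writer that keeps an incremental encoder in sync with seek()."""

    def __init__(self, stream, encoding, errors='strict'):
        super().__init__(stream, errors)
        # The incremental encoder carries the state (for utf_16: BOM written or not).
        self._encoder = codecs.getincrementalencoder(encoding)(errors)

    def encode(self, input, errors='strict'):
        # StreamWriter.write() expects a (bytes, consumed_length) pair.
        return self._encoder.encode(input), len(input)

    def reset(self):
        super().reset()
        self._encoder.reset()

    def seek(self, offset, whence=0):
        self.stream.seek(offset, whence)
        if whence == 0 and offset == 0:
            # Start of stream: the next write must emit a BOM again.
            self._encoder.reset()
        else:
            # Assumption: state 0 means "no BOM pending" (true for utf_16).
            self._encoder.setstate(0)


# Same sequence as in the report, on in-memory buffers instead of a file.
first = io.BytesIO()
writer = StatefulStreamWriter(first, 'utf_16')
writer.write('abc')
pos = first.tell()

second = io.BytesIO()
writer = StatefulStreamWriter(second, 'utf_16')
writer.seek(pos)
writer.write('def')
writer.seek(0)
writer.write('###')

result = second.getvalue().decode('utf_16')
```

With this sketch, only the write at offset 0 emits a BOM, so decoding the buffer gives the expected text.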
Python supports the following stateful codecs:
* cp932
* cp949
* cp950
* euc_jis_2004
* euc_jisx0213
* euc_jp
* euc_kr
* gb18030
* gbk
* hz
* iso2022_jp
* iso2022_jp_1
* iso2022_jp_2
* iso2022_jp_2004
* iso2022_jp_3
* iso2022_jp_ext
* iso2022_kr
* shift_jis
* shift_jis_2004
* shift_jisx0213
* utf_8_sig
* utf_16
* utf_32
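To illustrate what "stateful" means for utf_16, here is a small sketch using the incremental encoder from the codecs module. The variable names are only for illustration. It shows that the BOM is emitted on the first chunk only, that reset() brings it back, and that setstate(0) suppresses it.

```python
import codecs

boms = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# The encoder remembers whether it has already written the BOM.
encoder = codecs.getincrementalencoder('utf_16')()
first = encoder.encode('abc')
second = encoder.encode('def')
first_has_bom = first[:2] in boms
second_has_bom = second[:2] in boms

# reset() returns the encoder to its initial state: the BOM is written again.
encoder.reset()
after_reset_has_bom = encoder.encode('ghi')[:2] in boms

# setstate(0) marks the BOM as already written, even on a fresh encoder.
encoder.reset()
encoder.setstate(0)
after_setstate_has_bom = encoder.encode('jkl')[:2] in boms
```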
This bug has already been fixed in TextIOWrapper: issue #5006. |
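For comparison, the same sequence can be run through the built-in open(), which returns a TextIOWrapper. This is a minimal sketch: the helper name run_sequence and the temporary directory are only for illustration.

```python
import os
import tempfile


def run_sequence(opener, path):
    """Replay the write/seek/write sequence from the report with the given opener."""
    with opener(path, 'w', encoding='utf_16') as f:
        f.write('abc')
        pos = f.tell()
    with opener(path, 'w', encoding='utf_16') as f:
        f.seek(pos)
        f.write('def')
        f.seek(0)
        f.write('###')
    with opener(path, 'r', encoding='utf_16') as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    content = run_sequence(open, os.path.join(tmp, 'test'))
```

Here content compares equal to '###def': TextIOWrapper updates its encoder state on seek(), so no second BOM ends up in the middle of the file.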