This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: BOM incorrectly inserted before writing, after seeking in text file
Type: behavior
Stage: resolved
Components: IO
Versions: Python 3.4, Python 3.5

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: MarkIngramUK, amaury.forgeotdarc, pitrou, python-dev
Priority: normal
Keywords: patch

Created on 2014-12-02 16:41 by MarkIngramUK, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name | Uploaded | Description
append-test.py | MarkIngramUK, 2014-12-02 16:41 | Test case
bom_seek_append.patch | pitrou, 2014-12-07 01:11 |
Messages (7)
msg232015 - Author: Mark Ingram (MarkIngramUK) Date: 2014-12-02 16:41
If you open a text file for appending and then perform any kind of seek before writing to it, a BOM is written before your text. See the attached file for an example.
If you run the test and look at the output file, you'll notice that the UTF-16 BOM is written out before each number.
I'm running a 2014 iMac with Yosemite.
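
The attached append-test.py is not reproduced on this page. A minimal sketch of the kind of test described might look like the following. It assumes a UTF-16 file that is reopened in append mode, with a seek to the end before each write; the file name, helper names and loop count are hypothetical.

```python
import codecs
import io
import os
import tempfile


def append_numbers(path, count=3):
    """Append the numbers 0..count-1, reopening the file and seeking each time."""
    for i in range(count):
        with open(path, "a", encoding="utf-16") as f:
            # Seek before writing; the report says this triggers the extra BOM.
            f.seek(0, io.SEEK_END)
            f.write(str(i))


def count_boms(path):
    """Count 2-byte code units in the file that equal the native UTF-16 BOM."""
    with open(path, "rb") as f:
        data = f.read()
    return sum(
        data[i:i + 2] == codecs.BOM_UTF16 for i in range(0, len(data), 2)
    )


def run_demo():
    """Run the append loop on a fresh temporary file and return the BOM count."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.txt")
        append_numbers(path)
        return count_boms(path)


if __name__ == "__main__":
    print("BOMs found:", run_demo())
```

On an interpreter that includes the fix, the count should be 1: the single BOM written when the file is first created. On affected versions, the report describes one BOM per number.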
msg232025 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2014-12-02 17:09
issue5006 was supposed to take care of this, but it has a flaw IMO:
This statement https://hg.python.org/cpython/file/0744ceb5c0ed/Lib/_pyio.py#l2003 is missing an "and whence!=2".
msg232091 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-12-03 20:52
This is a limitation more than a bug. When you seek to the start of the file, the encoder is reset because Python thinks you are going to write there. If you remove the call to `file.seek(0, io.SEEK_SET)`, things work fine.
@Amaury, whence can only be zero there:
https://hg.python.org/cpython/file/0744ceb5c0ed/Lib/_pyio.py#l1960 
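
The effect of resetting the encoder can be seen in isolation with the standard library's incremental UTF-16 encoder. The sketch below only illustrates that mechanism; it is not the code in `_pyio.py`, and the function name is hypothetical.

```python
import codecs


def encoder_bom_behaviour():
    """Report whether a BOM is emitted on first use, after setstate(0), and after reset()."""
    encoder = codecs.getincrementalencoder("utf-16")()
    bom = codecs.BOM_UTF16

    first = encoder.encode("a")           # fresh encoder: output starts with the BOM
    encoder.setstate(0)                   # state 0: treat the stream as already started
    after_setstate = encoder.encode("b")
    encoder.reset()                       # back to the initial state
    after_reset = encoder.encode("c")

    return (
        first.startswith(bom),
        after_setstate.startswith(bom),
        after_reset.startswith(bom),
    )


if __name__ == "__main__":
    print(encoder_bom_behaviour())
```

A reset encoder emits the BOM again on the next write, which is why a reset at any position other than the start of the file puts a BOM in the middle of the data.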
msg232092 - Author: Mark Ingram (MarkIngramUK) Date: 2014-12-03 20:57
It's more than a limitation, because if I call `file.seek(0, io.SEEK_END)` then the encoder is still reset, and will still write the BOM, even at the end of the file.
This also means that it's impossible to seek in a text file that you want to append to. I've had to work around this by opening the file as binary, manually writing the BOM, and writing the strings as encoded bytes.
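
The workaround described above could be sketched roughly as follows. This is an assumption about its shape, not the reporter's actual code: the helper names are hypothetical, and it uses the native byte order to match what the `utf-16` codec writes.

```python
import codecs
import os
import sys
import tempfile

# Explicit-endian codec matching the native byte order; it never emits a BOM itself.
NATIVE_UTF16 = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"


def append_utf16(path, text):
    """Append text as UTF-16 in binary mode, writing the BOM only into an empty file."""
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            f.write(codecs.BOM_UTF16)  # BOM in native byte order
        f.write(text.encode(NATIVE_UTF16))


def demo():
    """Append three numbers to a temporary file and read it back as UTF-16 text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.txt")
        for i in range(3):
            append_utf16(path, str(i))
        with open(path, encoding="utf-16") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

Because the encoding is done by hand, no text-layer encoder state is involved, so seeking in the binary file cannot trigger an extra BOM.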
msg232263 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-12-07 01:11
Here is a patch.
msg240688 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-04-13 18:04
New changeset 946740824eaf by Antoine Pitrou in branch '3.4':
Issue #22982: Improve BOM handling when seeking to multiple positions of a writable text file.
https://hg.python.org/cpython/rev/946740824eaf
New changeset 3583e5191b96 by Antoine Pitrou in branch 'default':
Issue #22982: Improve BOM handling when seeking to multiple positions of a writable text file.
https://hg.python.org/cpython/rev/3583e5191b96 
msg240689 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-04-13 18:05
Fix is pushed. Thanks for the report!
History
Date | User | Action | Args
2022-04-11 14:58:10 | admin | set | github: 67171
2015-04-13 18:05:22 | pitrou | set | status: open -> closed; resolution: fixed; messages: + msg240689; stage: patch review -> resolved
2015-04-13 18:04:54 | python-dev | set | nosy: + python-dev; messages: + msg240688
2014-12-07 01:11:49 | pitrou | set | files: + bom_seek_append.patch; versions: + Python 3.5; messages: + msg232263; keywords: + patch; stage: patch review
2014-12-03 20:57:15 | MarkIngramUK | set | messages: + msg232092
2014-12-03 20:52:28 | pitrou | set | nosy: + pitrou; messages: + msg232091
2014-12-02 17:09:08 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: + msg232025
2014-12-02 16:41:42 | MarkIngramUK | create |