This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: zipfile.ZipFile.write() does not accept bytes arcname
Type: behavior
Stage: needs patch
Components: Documentation, Library (Lib)
Versions: Python 3.4, Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Patrik Dufresne, docs@python, iritkatriel, july, matrixise, r.david.murray, serhiy.storchaka
Priority: normal
Keywords:

Created on 2015-05-01 21:30 by july, last changed 2022-04-11 14:58 by admin.

Messages (9)
msg242355 - (view) Author: July Tikhonov (july) * Date: 2015-05-01 21:30
In the documentation of zipfile.ZipFile.write() there is the following notice:
"There is no official file name encoding for ZIP files. If you have unicode file names, you must convert them to byte strings in your desired encoding before passing them to write()."
I understand this to mean that the 'arcname' argument to write() should be of type bytes rather than str.
But it is str that works, and bytes that does not:
$ ./python
Python 3.5.0a4+ (default:6f6e78931875, May 1 2015, 23:18:40) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import zipfile
>>> zf = zipfile.ZipFile('foo.zip', 'w')
>>> zf.write('python', 'a')
>>> zf.write('python', b'b')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/july/source/python/Lib/zipfile.py", line 1442, in write
    zinfo = ZipInfo(arcname, date_time)
  File "/home/july/source/python/Lib/zipfile.py", line 322, in __init__
    null_byte = filename.find(chr(0))
TypeError: a bytes-like object is required, not 'str'
(ZipInfo ostensibly attempts to find a zero byte in the filename, but instead searches for the unicode character chr(0). There are several other places in the ZipInfo class that assume the filename is str rather than bytes.)
I consider this a documentation issue: the notice is misleading. Although maybe someone wants to fix the behavior of ZipInfo to allow bytes filenames.
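The behavior reported above can be probed without creating any files. The following is a minimal sketch, assuming (as the traceback suggests) that write() builds a ZipInfo from arcname, so that constructing ZipInfo directly hits the same check. The helper name arcname_type_accepted is hypothetical, and the result for bytes may differ between Python versions, so no output is asserted for it.

```python
import zipfile


def arcname_type_accepted(arcname):
    """Return True if ZipInfo can be built from arcname, False on TypeError.

    Hypothetical helper for illustration; it only exercises the ZipInfo
    constructor that appears in the traceback above.
    """
    try:
        zipfile.ZipInfo(arcname)
    except TypeError:
        return False
    return True


# A str name is the supported case.
print("str arcname accepted:  ", arcname_type_accepted("a"))
# A bytes name is the case reported in this issue; the outcome depends on
# the Python version in use.
print("bytes arcname accepted:", arcname_type_accepted(b"b"))
```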
msg242356 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2015-05-01 21:41
This documentation is correct for Python 2 but maybe not for Python 3.
To be checked.
msg242373 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-02 02:36
We should either make it work with byte filenames, or allow control of the filename encoding. See also issue 20329. Unfortunately that part is probably a new feature. In the meantime the docs should be fixed: I believe we automatically encode the filename using the default zip filename codec (but someone should check).
msg242374 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-02 04:30
Indeed, the note is outdated and incorrect. First, general unicode filenames are allowed. They are encoded with UTF-8 internally. Second, currently there is no way to create an entry without encoding the filename to UTF-8 (if it is not ASCII-only). So you can't create a ZIP file with an arbitrary encoding (e.g. cp866) for old DOS/Windows unzippers.
Adding support for bytes filenames is a different issue (issue10757).
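The encoding behavior described in this message can be observed by writing an archive in memory and reading it back. This is a minimal sketch, assuming the ZIP general-purpose flag bit 11 (0x800) marks a UTF-8 filename. The helper name roundtrip_names and the sample names are illustrative only.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: file name is stored as UTF-8


def roundtrip_names(names):
    """Write empty-ish entries under the given str names, then reopen the
    archive and report, per stored name, whether the UTF-8 flag is set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"data")
    # Reopen for reading so flag_bits come from the archive itself.
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


result = roundtrip_names(["plain.txt", "файл.txt"])
for name, is_utf8 in result.items():
    print(f"{name!r}: UTF-8 flag set = {is_utf8}")
```

An ASCII-only name is stored without the flag, while a non-ASCII name is stored as UTF-8 with the flag set, which matches the point that there is no way to pick another encoding such as cp866 when writing.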
msg242392 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-02 12:31
Ah, I *thought* there was an issue for that, but I didn't find it when I searched. So this is just a doc issue to fix the docs to reflect current reality.
msg257358 - (view) Author: Patrik Dufresne (Patrik Dufresne) Date: 2016-01-02 20:12
I'm converting my project to Python 3 and I'm encountering an issue with zipfile encoding. It looks like it only supports unicode paths. This is a huge issue since paths are, by definition, bytes. You may store a file name with an invalid character on the filesystem without issue.
As such, arcname should support bytes.
Like tar, the zip file format doesn't define a specific encoding. You may store a filename as bytes.
msg257359 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-01-02 20:20
As noted, adding that support is the subject of issue 10757.
msg259273 - (view) Author: Patrik Dufresne (Patrik Dufresne) Date: 2016-01-31 02:07
I managed to work around this issue by using surrogateescape for arcname and filename. For me it's no longer an issue.
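The message does not show the workaround itself, so the following is only a sketch of one way surrogateescape can map bytes paths to str names and back. The helper names bytes_to_arcname and arcname_to_bytes are hypothetical, and UTF-8 is an assumed encoding. The archive round trip uses a name that is valid UTF-8; whether zipfile accepts names containing lone surrogates is not established in this thread, so the undecodable case is only round-tripped through the codecs.

```python
import io
import zipfile


def bytes_to_arcname(raw, encoding="utf-8"):
    """Decode a bytes path to str; undecodable bytes become lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


def arcname_to_bytes(name, encoding="utf-8"):
    """Reverse of bytes_to_arcname: restores the original bytes exactly."""
    return name.encode(encoding, "surrogateescape")


# Case 1: a bytes path that is valid UTF-8, stored in an in-memory archive.
raw = "café.txt".encode("utf-8")
arcname = bytes_to_arcname(raw)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(arcname, b"data")
with zipfile.ZipFile(buf) as zf:
    stored = zf.namelist()[0]

print(arcname_to_bytes(stored) == raw)

# Case 2: bytes that are not valid UTF-8 still survive the str round trip,
# shown here without involving zipfile.
undecodable = b"\xff.txt"
print(arcname_to_bytes(bytes_to_arcname(undecodable)) == undecodable)
```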
msg382761 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-12-08 19:35
That part of the documentation was updated here by Serhiy: https://github.com/python/cpython/pull/10592 
History
Date                 User              Action  Args
2022-04-11 14:58:16  admin             set     github: 68298
2020-12-08 19:35:55  iritkatriel       set     nosy: + iritkatriel; messages: + msg382761
2016-01-31 02:07:41  Patrik Dufresne   set     messages: + msg259273
2016-01-02 20:20:18  r.david.murray    set     messages: + msg257359
2016-01-02 20:12:28  Patrik Dufresne   set     nosy: + Patrik Dufresne; messages: + msg257358
2015-05-02 12:31:29  r.david.murray    set     messages: + msg242392
2015-05-02 04:30:28  serhiy.storchaka  set     versions: - Python 3.6; nosy: + serhiy.storchaka; messages: + msg242374; stage: needs patch
2015-05-02 02:36:20  r.david.murray    set     nosy: + r.david.murray; messages: + msg242373
2015-05-01 21:41:52  matrixise         set     nosy: + matrixise; messages: + msg242356
2015-05-01 21:31:23  july              set     components: + Library (Lib)
2015-05-01 21:30:03  july              create
