Created on 2010-12-22 12:44 by connexion2000, last changed 2022-04-11 14:57 by admin.
| Messages (7) | |||
|---|---|---|---|
| msg124499 - (view) | Author: Jacek Jabłoński (connexion2000) | Date: 2010-12-22 12:44 | |
file = 'somefile.dat'
filename = "ółśąśółąś.dat"
zip = zipfile.ZipFile('archive.zip', 'w', zipfile.ZIP_DEFLATED)
zip.write(file, filename)
The above produces a very nasty filename in the zip archive.
*************************************************************
file = 'somefile.dat'
filename = "ółśąśółąś.dat"
zip = zipfile.ZipFile('archive.zip', 'w', zipfile.ZIP_DEFLATED)
zip.write(file, filename.encode('cp852'))
This produces TypeError: expected an object with the buffer interface
The documentation says:
There is no official file name encoding for ZIP files. If you have unicode file names, you must convert them to byte strings in your desired encoding before passing them to write().
I convert them to a byte string, but it ends with an error.
If this is a documentation bug, what is the proper way to have filenames like "ółśąśółąś" in a zip archive?
|
|||
| msg124518 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2010-12-22 20:07 | |
This is not a bug. Your code that produces "very nasty filename" is the right one - the file name is actually the one you asked for. The second code is also behaving correctly: filename already *is* a bytestring, calling .encode for it is meaningless. |
|||
| msg124519 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2010-12-22 20:12 | |
Oops, I take this back - I didn't notice you were using Python 3.1. In any case, your first code is correct. What you get is the best you can ask for. That the second case fails is indeed a bug. |
|||
| msg124641 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2010-12-25 16:37 | |
See also msg79724 of issue 4871. From looking at the code it appears that the filename must be a string, and if it contains only ASCII characters it is entered as ascii, while if it contains non-ascii it is encoded to utf-8 and the appropriate flag bits set in the archive to indicate this (I know nothing about the archive format, by the way, I'm just looking at the code). So, in reverse of issue 4871, it appears that in this case the API should reject bytes input with an appropriate error message. |
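The behaviour described in this message can be checked with a minimal sketch, assuming a current Python 3 `zipfile` rather than the 3.1 of the original report. It writes one ASCII name and one non-ASCII name into an in-memory archive, then reads back general-purpose flag bit 11 (0x800), which the ZIP format uses to mark a UTF-8 filename. The names `UTF8_FLAG` and `utf8_flagged` are illustrative only.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: filename is stored as UTF-8

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("plain.dat", b"data")          # ASCII-only name
    archive.writestr("ółśąśółąś.dat", b"data")      # non-ASCII name

# Re-open the archive and record, per entry, whether the UTF-8 flag is set.
with zipfile.ZipFile(buf) as archive:
    utf8_flagged = {
        info.filename: bool(info.flag_bits & UTF8_FLAG)
        for info in archive.infolist()
    }

print(utf8_flagged)
```

Under that assumption, only the non-ASCII entry should carry the flag, and both names should read back unchanged.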
|||
| msg124686 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2010-12-26 23:54 | |
> So, in reverse of issue 4871, it appears that in this case the API should reject bytes input with an appropriate error message.
-1. It is quite common to produce ill-formed zipfiles, and other ziptools are interpreting them in violation of the format spec. Python needs to support creation of such broken zipfiles, even though it may not be able to read them back. |
|||
| msg124690 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2010-12-27 00:45 | |
Well, this is the same treat-strings-and-byte-strings-equivalently-in-the-same-API problem that we've had elsewhere. It'll require a bit of refactoring to make it work. On read, zipfile decodes filenames using cp437 if the utf-8 flag isn't set. Logically, then, a binary string should be encoded using cp437. Since cp437 has a character corresponding to each of the 256 bytes, it seems to me it should be enough to decode a binary filename using cp437 and set a flag that _encodeFilenameFlags would respect and re-encode to cp437 instead of utf-8. That might produce unexpected results if someone passes in a binary filename encoded in some other character set, but it would be consistent with how zipfiles work and so should be at least as interoperable as zipfiles normally are. |
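The cp437 argument above can be illustrated at the codec level. This is only a sketch of the reasoning, not of the proposed zipfile change: it uses Python's built-in `cp437` and `cp852` codecs and touches no zipfile internals. The variable names are illustrative.

```python
# Every one of the 256 byte values decodes under cp437 and re-encodes to itself,
# so a cp437 decode/encode pair can carry arbitrary filename bytes losslessly.
all_bytes = bytes(range(256))
round_tripped = all_bytes.decode("cp437").encode("cp437")

# The reporter's name, encoded in cp852 as in the second snippet of the report.
name = "ółśąśółąś.dat"
raw = name.encode("cp852")

# Interpreting those bytes as cp437 shows different characters ("unexpected
# results"), yet the underlying bytes survive re-encoding unchanged.
shown = raw.decode("cp437")
restored = shown.encode("cp437")

print(repr(shown))
```

A tool that knows the archive was written with cp852 names could still recover the original text from the preserved bytes.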
|||
| msg257385 - (view) | Author: Patrik Dufresne (Patrik Dufresne) | Date: 2016-01-02 23:23 | |
This bug is very old; has there been any development on the subject? This issue is hitting me as I try to port my project (rdiffweb) to Python 3. It receives a lot of broken filenames with invalid encodings, and I need to create a meaningful zip archive from them. Currently, it just fails because zipfile doesn't accept arcname as bytes. |
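One possible workaround for this situation, which nobody in this thread proposed and which does not preserve the raw bytes in the archive, is to turn each byte-string name into text before handing it to zipfile. The sketch below assumes a hypothetical helper, `arcname_from_bytes`, that tries UTF-8 first and falls back to cp437, which can decode any byte sequence.

```python
import io
import zipfile


def arcname_from_bytes(raw: bytes) -> str:
    """Hypothetical helper: best-effort text name for a byte-string filename."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # cp437 maps all 256 byte values, so this fallback cannot fail.
        return raw.decode("cp437")


raw_names = [
    b"report.txt",               # plain ASCII
    "ółś.dat".encode("utf-8"),   # valid UTF-8
    b"caf\xe9.txt",              # Latin-1 bytes, not valid UTF-8
]

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    for raw in raw_names:
        archive.writestr(arcname_from_bytes(raw), b"payload")

with zipfile.ZipFile(buf) as archive:
    stored = archive.namelist()

print(stored)
```

Names that are not valid UTF-8 come out as cp437 mojibake, so this trades fidelity for an archive that can always be written and read back.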
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:10 | admin | set | github: 54966 |
| 2016-01-02 23:23:30 | Patrik Dufresne | set | nosy: + Patrik Dufresne; messages: + msg257385 |
| 2015-07-21 07:19:00 | ethan.furman | set | nosy: - ethan.furman |
| 2015-04-13 21:25:50 | ozialien | set | nosy: + ozialien |
| 2013-10-14 22:39:46 | ethan.furman | set | nosy: + ethan.furman |
| 2010-12-27 00:45:06 | r.david.murray | set | nosy: loewis, aimacintyre, r.david.murray, connexion2000; messages: + msg124690; title: zipfile.write, arcname should be bytestring -> zipfile.write, arcname should be allowed to be a byte string |
| 2010-12-26 23:54:25 | loewis | set | nosy: loewis, aimacintyre, r.david.murray, connexion2000; messages: + msg124686 |
| 2010-12-25 16:37:05 | r.david.murray | set | nosy: + r.david.murray; messages: + msg124641 |
| 2010-12-24 21:54:48 | terry.reedy | set | nosy: + aimacintyre; stage: test needed; type: compile error -> behavior; versions: + Python 3.2 |
| 2010-12-22 20:12:05 | loewis | set | status: closed -> open; messages: + msg124519; resolution: not a bug -> |
| 2010-12-22 20:07:48 | loewis | set | status: open -> closed; nosy: + loewis; messages: + msg124518; resolution: not a bug |
| 2010-12-22 12:44:03 | connexion2000 | create | |