This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.
Created on 2011-05-10 07:59 by yaoyu, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| test.zip | yaoyu, 2011-05-10 07:59 | |
| Messages (6) | |||
|---|---|---|---|
| msg135687 | Author: yaoyu (yaoyu) | Date: 2011-05-10 07:59 | |
Python 3 ZipFile bug with Chinese file names:
1. Python 3.1.3 cannot extract "复件 test.txt" from test.zip:
╕┤╝■ test.txt
Traceback (most recent call last):
  File "C:\Temp\PythonZipTest\pythonzip.py", line 14, in <module>
    main()
  File "C:\Temp\PythonZipTest\pythonzip.py", line 11, in main
    z.extract(z.namelist()[0])
  File "c:\python31\lib\zipfile.py", line 980, in extract
    return self._extract_member(member, path, pwd)
  File "c:\python31\lib\zipfile.py", line 1023, in _extract_member
    source = self.open(member, pwd=pwd)
  File "c:\python31\lib\zipfile.py", line 928, in open
    % (zinfo.orig_filename, fname))
zipfile.BadZipfile: File name in directory '╕┤╝■ test.txt' and header b'\xb8\xb4\xbc\xfe test.txt' differ.
2. Python 3.2 extracts "复件 test.txt" from test.zip incorrectly: the file is extracted as "╕┤╝■ test.txt".
3. Python 2.7.1 works correctly.
2011-05-10
Source code:
######################################################################
#coding=gbk
import zipfile
import os

def main():
    szTestDir = os.path.dirname(__file__)
    szFile = os.path.join(szTestDir, 'test.zip')
    z = zipfile.ZipFile(szFile)
    print(z.namelist()[0])
    z.extract(z.namelist()[0])

if __name__ == '__main__':
    main()
|
|||
| msg135837 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-12 14:33 | |
This is a duplicate of #10801, which was fixed in Python 3.2 and later by 33543b4e0e5d. Should we backport the fix to Python 3.1, or can you upgrade to Python 3.2? Output with Python 3.2: "╕┤╝■ test.txt". |
|||
| msg135840 | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2011-05-12 14:48 | |
But according to the initial report, 3.2 does not give the expected behavior. This zip file actually stores the filename encoded with cp932, which is incorrect according to the specifications of the ZIP format (only cp437 and utf8 are valid). See issue10614 for a possible solution: allow users to specify an alternate encoding to handle such invalid files. |
|||
| msg135842 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-12 15:07 | |
Oh, right.
Note: the encoding looks to be GBK, not CP932:
>>> '\u590d\u4ef6'.encode('gbk')
b'\xb8\xb4\xbc\xfe'
>>> '\u590d\u4ef6'.encode('gbk').decode('cp437')
'╕┤╝■'
>>> '\u590d\u4ef6'.encode('cp932')
...
UnicodeEncodeError: 'cp932' codec can't encode character '\u590d' ...
|
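The round trip shown in the session above can also be run in reverse to recover the intended name from what `zipfile` reports. This is only a sketch of a workaround, assuming the archive really stores GBK bytes without the UTF-8 flag; `recover_name` is a hypothetical helper, not part of `zipfile`.

```python
def recover_name(name, encoding="gbk"):
    """Undo zipfile's CP437 decoding, then decode the raw bytes with `encoding`.

    Assumption: `name` came from an entry without the UTF-8 flag, so zipfile
    decoded the stored bytes as CP437. CP437 maps every byte value, so
    re-encoding gives back the original bytes exactly.
    """
    return name.encode("cp437").decode(encoding)


# The name reported for the member of test.zip in this issue.
mojibake = "\u2555\u2524\u255d\u25a0 test.txt"
fixed = recover_name(mojibake)  # the intended GBK name
```

The recovered string is only suitable for display or for choosing an output path; the member still has to be looked up in the archive under the name that `namelist()` returned.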
|||
| msg136226 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-18 11:30 | |
See also #4621. |
|||
| msg136232 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-18 12:01 | |
This issue is just another example of issue #10614, so I'm closing it as a duplicate. |
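For readers arriving here later: Python 3.11 and newer accept a `metadata_encoding` argument when opening a `ZipFile` for reading, which covers the "alternate encoding" use case discussed in this thread. The sketch below assumes GBK names, as in test.zip; `make_legacy_zip` and `list_names` are hypothetical helpers written for illustration, and the fallback branch for older interpreters is a manual workaround, not an official API.

```python
import io
import sys
import zipfile


def make_legacy_zip(raw_name, payload=b"hello"):
    """Build an in-memory archive whose member name is stored as raw bytes
    without the UTF-8 flag (done here by patching an ASCII placeholder)."""
    placeholder = "X" * len(raw_name)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(placeholder, payload)
    # The name appears in the local header and the central directory; the CRC
    # covers only the payload, so swapping the name bytes keeps the file valid.
    return buf.getvalue().replace(placeholder.encode("ascii"), raw_name)


def list_names(archive_bytes, encoding="gbk"):
    """Return member names, decoding unflagged names with `encoding`."""
    if sys.version_info >= (3, 11):
        # metadata_encoding is only accepted in read mode.
        with zipfile.ZipFile(io.BytesIO(archive_bytes),
                             metadata_encoding=encoding) as zf:
            return zf.namelist()
    # Older interpreters: undo the CP437 decoding by hand, leaving entries
    # that carry the UTF-8 flag (bit 0x800) untouched.
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return [
            info.filename if info.flag_bits & 0x800
            else info.filename.encode("cp437").decode(encoding)
            for info in zf.infolist()
        ]


archive = make_legacy_zip("\u590d\u4ef6 test.txt".encode("gbk"))
names = list_names(archive)
```

The caller still has to know or guess the encoding; nothing in the archive records that the names are GBK.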
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:17 | admin | set | github: 56257 |
| 2011-05-18 12:01:36 | vstinner | set | status: open -> closed; resolution: duplicate; messages: + msg136232 |
| 2011-05-18 11:30:57 | vstinner | set | messages: + msg136226 |
| 2011-05-12 15:07:37 | vstinner | set | messages: + msg135842 |
| 2011-05-12 14:48:15 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: + msg135840 |
| 2011-05-12 14:35:17 | vstinner | set | nosy: + georg.brandl |
| 2011-05-12 14:34:44 | vstinner | set | components: + Library (Lib), Unicode |
| 2011-05-12 14:33:52 | vstinner | set | messages: + msg135837; versions: - Python 3.2, Python 3.3 |
| 2011-05-12 14:22:06 | pitrou | set | nosy: + vstinner; versions: + Python 3.3 |
| 2011-05-10 07:59:56 | yaoyu | create | |