
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Python 3, ZipFile Bug In Chinese
Type: behavior Stage:
Components: Library (Lib), Unicode Versions: Python 3.1
process
Status: closed Resolution: duplicate
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, georg.brandl, vstinner, yaoyu
Priority: normal Keywords:

Created on 2011-05-10 07:59 by yaoyu, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
test.zip yaoyu, 2011-05-10 07:59
Messages (6)
msg135687 - (view) Author: yaoyu (yaoyu) Date: 2011-05-10 07:59
Python 3, ZipFile Bug In Chinese:
1. Python 3.1.3 can't extract "复件 test.txt" from test.zip:
╕┤╝■ test.txt
Traceback (most recent call last):
 File "C:\Temp\PythonZipTest\pythonzip.py", line 14, in <module>
 main()
 File "C:\Temp\PythonZipTest\pythonzip.py", line 11, in main
 z.extract(z.namelist()[0])
 File "c:\python31\lib\zipfile.py", line 980, in extract
 return self._extract_member(member, path, pwd)
 File "c:\python31\lib\zipfile.py", line 1023, in _extract_member
 source = self.open(member, pwd=pwd)
 File "c:\python31\lib\zipfile.py", line 928, in open
 % (zinfo.orig_filename, fname))
zipfile.BadZipfile: File name in directory '╕┤╝■ test.txt' and header b'\xb8\xb4\xbc\xfe test.txt' differ.
2. Python 3.2 extracts "复件 test.txt" from test.zip incorrectly:
 It extracts the file as "╕┤╝■ test.txt"
3. In Python 2.7.1, it works correctly.
     2011-05-10
Source Code
######################################################################
# coding=gbk
import zipfile
import os


def main():
    # Look for test.zip in the same directory as this script.
    szTestDir = os.path.dirname(__file__)
    szFile = os.path.join(szTestDir, 'test.zip')
    z = zipfile.ZipFile(szFile)
    # Print, then extract, the first member of the archive.
    print(z.namelist()[0])
    z.extract(z.namelist()[0])


if __name__ == '__main__':
    main()
msg135837 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-12 14:33
This is a duplicate of #10801, which was fixed in Python 3.2 and later by 33543b4e0e5d. Should we backport the fix to Python 3.1, or can you upgrade to Python 3.2?
Output with Python 3.2: "╕┤╝■ test.txt".
msg135840 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-05-12 14:48
But according to the initial report, 3.2 does not give the expected behavior. This zip file actually stores the filename encoded with cp932, which is incorrect according to the specifications of the ZIP format (only cp437 and utf8 are valid).
See issue10614 for a possible solution: allow users to specify an alternate encoding to handle such invalid files.
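A reading-side workaround for such archives can be sketched as follows. It assumes that zipfile in Python 3.2 and later decodes a member name as cp437 whenever the entry's UTF-8 flag (general purpose bit 11, 0x800) is not set, so re-encoding the name as cp437 gives back the bytes stored in the archive. The helper `real_name` and its default of `gbk` are hypothetical and not part of zipfile; the caller still has to know or guess the code page actually used.

```python
import zipfile

# General purpose bit 11: when set, the member name is stored as UTF-8.
UTF8_FLAG = 0x800


def real_name(info, encoding="gbk"):
    """Return a member name re-decoded from a legacy code page.

    `info` is a zipfile.ZipInfo. `encoding` is the code page the archive
    creator is assumed to have used (a guess supplied by the caller).
    """
    if info.flag_bits & UTF8_FLAG:
        # The name was stored as UTF-8 and is already decoded correctly.
        return info.filename
    try:
        # Undo the cp437 decoding, then decode with the assumed code page.
        return info.filename.encode("cp437").decode(encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The guess does not fit; fall back to the name as zipfile read it.
        return info.filename
```

Calling `real_name(info)` for each entry of `ZipFile.infolist()` yields the names to use when writing the members out by hand, for example with `ZipFile.open()` and `shutil.copyfileobj()`.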
msg135842 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-12 15:07
Oh, right.
Note: the encoding looks to be GBK, not CP932:
>>> '\u590d\u4ef6'.encode('gbk')
b'\xb8\xb4\xbc\xfe'
>>> '\u590d\u4ef6'.encode('gbk').decode('cp437')
'╕┤╝■'
>>> '\u590d\u4ef6'.encode('cp932')
...
UnicodeEncodeError: 'cp932' codec can't encode character '\u590d' ...
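The same check can be run mechanically over a list of candidate code pages. This is a minimal sketch: `matching_codecs` is a hypothetical helper, and the candidate list is an arbitrary choice for illustration. It keeps only the codecs that encode the expected name to exactly the bytes found in the zip header.

```python
def matching_codecs(name, raw, candidates):
    """Return the codecs from `candidates` that encode `name` to `raw`."""
    matches = []
    for codec in candidates:
        try:
            if name.encode(codec) == raw:
                matches.append(codec)
        except UnicodeEncodeError:
            # This codec cannot represent the name at all (as with cp932 here).
            pass
    return matches


# Bytes from the traceback, and the name the reporter expected.
header_bytes = b'\xb8\xb4\xbc\xfe'
expected = '\u590d\u4ef6'
result = matching_codecs(expected, header_bytes, ['cp932', 'gbk', 'utf-8', 'cp437'])
print(result)  # ['gbk']
```

Several related code pages can match the same bytes, so a single hit narrows the guess without proving which encoding the archiver used.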
msg136226 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-18 11:30
See also #4621.
msg136232 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-18 12:01
This issue is just another example of the issue #10614: I'm closing it as a duplicate.
History
Date User Action Args
2022-04-11 14:57:17 admin set github: 56257
2011-05-18 12:01:36 vstinner set status: open -> closed; resolution: duplicate; messages: + msg136232
2011-05-18 11:30:57 vstinner set messages: + msg136226
2011-05-12 15:07:37 vstinner set messages: + msg135842
2011-05-12 14:48:15 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: + msg135840
2011-05-12 14:35:17 vstinner set nosy: + georg.brandl
2011-05-12 14:34:44 vstinner set components: + Library (Lib), Unicode
2011-05-12 14:33:52 vstinner set messages: + msg135837; versions: - Python 3.2, Python 3.3
2011-05-12 14:22:06 pitrou set nosy: + vstinner; versions: + Python 3.3
2011-05-10 07:59:56 yaoyu create
