homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: zipfile can't decrypt
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: amaury.forgeotdarc, ebfe, ggenellina, gladed, r.david.murray, vstinner
Priority: normal Keywords: patch

Created on 2009年01月07日 20:26 by gladed, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
zipfile-pwd.patch amaury.forgeotdarc, 2009年01月13日 08:40 review
zipfile_no_unicode_pasword.patch vstinner, 2009年01月13日 19:29 review
zipfile_no_unicode_pasword-2.patch vstinner, 2009年01月13日 23:22 review
zipfile_no_unicode_pasword-2-py3k.patch vstinner, 2009年01月13日 23:23 review
Messages (15)
msg79368 - (view) Author: Glade Diviney (gladed) Date: 2009年01月07日 20:26
If a password is supplied using zpifile.read(objName, password), a 
TypeError occurs:
 File "C:\Program Files\Python30\lib\zipfile.py", line 420, in __init__
 self._UpdateKeys(p)
 File "C:\Program Files\Python30\lib\zipfile.py", line 424, in 
_UpdateKeys
 self.key0 = self._crc32(c, self.key0)
 File "C:\Program Files\Python30\lib\zipfile.py", line 413, in _crc32
 return ((crc >> 8) & 0xffffff) ^ self.crctable[(crc ^ ch) & 0xff]
TypeError: unsupported operand type(s) for ^: 'int' and 'str'
This is resolved in zipfile.py by replacing this line in __init__:
 self._UpdateKeys(ord(p))
msg79388 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009年01月08日 01:03
The key is a bytes string or an unicode string? Using bytes, each 
element is an int instead of a str.
msg79421 - (view) Author: Glade Diviney (gladed) Date: 2009年01月08日 16:19
In this case under test, the password string had been clipped from a 
filename, so I suppose it would have been Unicode.
msg79423 - (view) Author: Lukas Lueg (ebfe) Date: 2009年01月08日 16:41
This is basically the same problem as with other bytes-orientated modules.
The choice is either to reject unicode and force the caller to use
.encode() on all his strings (so 'password' is an instance of bytes and
'ch' an instance of int). I'd prefer that.
Or to check if 'password' is a unicode-object and encode to default
within zipfile.
msg79425 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009年01月08日 16:50
> check if 'password' is a unicode-object 
> and encode to default within zipfile. 
Hmmm... what is the default charset? :-) I think that the charset 
depends on the OS and the user locale...
msg79426 - (view) Author: Lukas Lueg (ebfe) Date: 2009年01月08日 16:59
The default encoding is UTF8. Even without that there should be no
problems while staying in the same environment as the
character->translation is the same.
However it violates the rule of least surprise to see zipfile throwing
CRC errors because the password is something like 'u?è ́ ́n' and encoded
by some locale-changing-nitwit in korea...
msg79719 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009年01月13日 05:34
Lukas Lueg> The default encoding is UTF8
What do you mean? Not inside a zip file. The default encoding is CP437 
(the original IBM PC US character set).
A zipfile password is a sequence of bytes, not characters, as defined 
in the specification. Probably this issue has to wait until it is 
decided what to do in the general case; anyway, encoding the password 
to bytes at the application level seems to be safer.
msg79724 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009年01月13日 08:40
in Lib/zipfile.py, the filename is already allowed to be unicode if the
"flag_bits" for the file entry contains 0x800. This indicates that the
encoding is utf-8.
We could do the same for the password. Attached is a tentative patch
along this idea (for py3k replace 'unicode' with 'str' of course)
msg79771 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009年01月13日 19:25
I created small archive with a password using zip 2.32 (Ubuntu Gutsy). 
The unicode flag is not set and so ASCII charset is used to encode the 
password. But my password was an unicode password using non-ASCII 
characters and so I get an UnicodeEncodeError ('ascii' codec can't 
encode character u'\xe9' ...). The user (me) doesn't expect such error 
here :-/
If I encode the unicode password to bytes using 
password.encode('utf-8'), it works as expected. So I would prefer to 
just reject unicode for the password. Attached patch implements that.
msg79784 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009年01月13日 22:45
Yes, the unicode flag is irrelevant to the password. To successfuly 
decrypt a file, one must know the password *and* the encoding in use 
when it was written, so the only sensible thing to do is to accept 
bytes only.
Your patch looks good to me with a few comments:
- I'd only check the password in setpassword() and open(); the other 
methods eventually call open().
- The issue version says "3.0"; but it should be applied to 2.6 2.7 
and 3.1 too.
- For 3.0 the check is wrong, should say "bytes" instead of "str". 
Even in 2.6 should use "bytes" to make clear the intent.
- The usual wording for such TypeError is more like "pwd: 
expected 'bytes', got '%s'"
msg79791 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009年01月13日 23:22
gagenellina: Oops, yes, I wrote my patch for Python trunk. New patch 
only uses two checks for the password type (since extract() calls 
open()). I also changed the error message.
msg79793 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009年01月13日 23:23
Patch for py3k: str=>bytes, remove "u" string prefix and 
rename "bytes" variable to "data" to avoid conflict with the bytes 
type.
msg124322 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010年12月18日 21:43
What about bytearray? Apparently that works pre-patch for at least read, though setpassword rejects it via an assertion.
Also, the error message should be "expected bytes" rather than "bytes expected". Don't ask me why, that's just the way it is normally done, so the other order sounds weird to this English speaker's ear. Otherwise the py3k patch looks good and tests correctly for me.
msg124460 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010年12月21日 21:36
Thinking about this some more, it seems like the chance that someone is using bytearray to pass a password to zipfile is vanishingly small, especially since in non-optimized mode setpassword would have rejected it. So I think that this should go in.
msg124463 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010年12月21日 22:03
Committed in r87430 (with message word order change), backported to 3.1 in r87431. Making the parallel change to 2.7 would be likely to break working code, IMO.
History
Date User Action Args
2022年04月11日 14:56:43adminsetgithub: 49121
2010年12月21日 22:03:13r.david.murraysetstatus: open -> closed
versions: - Python 2.7
nosy: amaury.forgeotdarc, ggenellina, vstinner, r.david.murray, ebfe, gladed
messages: + msg124463

resolution: fixed
stage: commit review -> resolved
2010年12月21日 21:36:15r.david.murraysetassignee: r.david.murray
messages: + msg124460
nosy: amaury.forgeotdarc, ggenellina, vstinner, r.david.murray, ebfe, gladed
stage: commit review
2010年12月18日 21:43:24r.david.murraysetnosy: + r.david.murray

messages: + msg124322
versions: + Python 3.2, - Python 2.6, Python 3.0
2009年01月13日 23:23:26vstinnersetfiles: + zipfile_no_unicode_pasword-2-py3k.patch
messages: + msg79793
2009年01月13日 23:22:20vstinnersetfiles: + zipfile_no_unicode_pasword-2.patch
messages: + msg79791
versions: + Python 2.6, Python 3.1, Python 2.7
2009年01月13日 22:46:00ggenellinasetmessages: + msg79784
2009年01月13日 19:29:34vstinnersetfiles: + zipfile_no_unicode_pasword.patch
2009年01月13日 19:29:20vstinnersetfiles: - zipfile_no_unicode_pasword.patch
2009年01月13日 19:25:04vstinnersetfiles: + zipfile_no_unicode_pasword.patch
messages: + msg79771
2009年01月13日 08:40:49amaury.forgeotdarcsetfiles: + zipfile-pwd.patch
keywords: + patch
messages: + msg79724
nosy: + amaury.forgeotdarc
2009年01月13日 05:34:43ggenellinasetnosy: + ggenellina
messages: + msg79719
2009年01月08日 16:59:35ebfesetmessages: + msg79426
2009年01月08日 16:50:48vstinnersetmessages: + msg79425
2009年01月08日 16:41:01ebfesetnosy: + ebfe
messages: + msg79423
2009年01月08日 16:19:25gladedsetmessages: + msg79421
2009年01月08日 01:03:41vstinnersetnosy: + vstinner
messages: + msg79388
2009年01月07日 20:26:25gladedcreate

AltStyle によって変換されたページ (->オリジナル) /