This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: zipfile can't extract file
Type: behavior
Stage: patch review
Components: Library (Lib), Windows
Versions: Python 3.7, Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Jim.Jewett, MorganRamsay, NewerCookie, Sean Goodwin, amaury.forgeotdarc, apolkosnik, berker.peksag, chuck, francismb, georg.brandl, gregory.p.smith, ncoghlan, ronaldoussoren, serhiy.storchaka, terry.reedy
Priority: normal
Keywords: patch

Created on 2009-09-04 19:56 by NewerCookie, last changed 2022-04-11 14:56 by admin.

Files
File name | Uploaded | Description
test.zip | NewerCookie, 2009-09-04 19:56 | mildly corrupt zipfile to test error handling
zlib_forward_slash.patch | chuck, 2009-09-19 17:01 |
zipfile_276_filename_mismatch_v2.patch | apolkosnik, 2014-04-30 16:10 | patch with warnings against 2.7.6
zipfile_340_filename_mismatch_v3.patch | apolkosnik, 2014-04-30 18:28 | patch with warnings against 3.4.0
zipfile_276_filename_mismatch_v3.patch | apolkosnik, 2014-04-30 19:11 | patch with print against 2.7.6
Pull Requests
URL | Status | Linked
PR 14212 | open | python-dev, 2019-06-18 21:36
Messages (61)
msg92265 - Author: Kim Kyung Don (NewerCookie) Date: 2009-09-04 19:57
The following exception occurred when I tried to extract on Windows.
"zipfile.BadZipfile: File name in directory "test\test2.txt" and header
"test/test2.txt" differ."
It seems to be a problem with the slashes.
I tested using zipfile revision 72893.
msg92297 - Author: Kim Kyung Don (NewerCookie) Date: 2009-09-06 04:02
P.S
I tested extraction by using 7-zip.
It works fine.
msg92309 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2009-09-06 12:40
The zipfile is technically incorrect, the zipfile specification prescribes 
that all filenames use '/' as the directory separator.
Even without that caveat the file is corrupt because the zipfile directory 
header and the per-file header don't agree on the name of the file.
That said: IMHO the current code in zipfile.ZipFile.open is too strict, it 
shouldn't raise an error when the two names aren't exactly the same 
because there are valid reasons for them to be different (such as renaming 
a file in the zipfile without rewriting the entire zipfile).
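Editorial illustration of the two names discussed above. A ZIP archive stores each member's name twice: once in the central directory at the end of the file, and once in a local (per-file) header in front of the member's data. The following is a minimal sketch, not code from the zipfile module, that lists members whose two names differ. The helper names `local_header_name` and `find_name_mismatches` are made up for this example, and it assumes a plain archive (no masked names, no data prepended in a way zipfile cannot account for).

```python
import io
import struct
import zipfile

# Fixed part of a local file header (30 bytes, little-endian):
# signature, version, flags, method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")
UTF8_FLAG = 0x800  # general purpose bit 11: the name is UTF-8 encoded


def local_header_name(fp, info):
    """Return the member name stored in the local header for `info`."""
    fp.seek(info.header_offset)
    fields = LOCAL_HEADER.unpack(fp.read(LOCAL_HEADER.size))
    if fields[0] != b"PK\x03\x04":
        raise ValueError("no local file header at offset %d" % info.header_offset)
    flags, name_length = fields[2], fields[9]
    raw = fp.read(name_length)
    return raw.decode("utf-8" if flags & UTF8_FLAG else "cp437")


def find_name_mismatches(data):
    """Return (directory_name, header_name) pairs that differ."""
    # Opening the archive only parses the central directory.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
    fp = io.BytesIO(data)
    mismatches = []
    for info in infos:
        header_name = local_header_name(fp, info)
        # orig_filename is the name as stored, before any normalization.
        if header_name != info.orig_filename:
            mismatches.append((info.orig_filename, header_name))
    return mismatches
```

Note that listing the archive never touches the local headers; only this explicit scan does, which is the per-member cost that matters for very large archives.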
msg92326 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2009-09-06 18:58
FileRoller doesn't complain about the mismatched slashes either. Where
did the ZIP come from, by the way? I seem to recall that there have
been other instances in which ZIP applications were more "forgiving"
than the zipfile module. How far should zipfile go in bending the
interpretation of the ZIP standard? 
As far as the renaming goes, it seems the standard says the header name
should be used if the two names are different. If nobody else has time
to make a patch and tests I can take a stab at it in the next few days.
msg92330 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2009-09-06 20:41
alan: I don't quite understand which filename you want to use when the 
name in the per-file header and the central directory don't match. 
Where in the standard is this prescribed? I couldn't find anything in 
the PKWare zipfile appnote [1]
My preference would be to use the central directory as the canonical 
value because scanning the entire zipfile to read the per-file header 
would give a significant overhead. This might not be very noticeable with 
small zipfiles, but I regularly use zipfiles with over 100K files in 
them; for those files a scan of the zipfile is prohibitively expensive.
Furthermore, when the two are different the most reasonable explanation 
is that an in-place edit of the zipfile changed the directory without 
rewriting the entire zipfile (just like you can "delete" files from a 
zipfile by dumping them from the directory rather than completely 
rewriting the entire archive)
[1] 
APPNOTE.TXT - .ZIP File Format Specification Version: 6.3.2 
Revised: September 28, 2007 
Copyright (c) 1989 - 2007 PKWARE Inc., All Rights Reserved.
msg92335 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2009-09-06 21:26
Sorry about the confusion--I think I confused myself by looking at the
bit about CRC checksums in the "Info-ZIP Unicode Path Extra Field"
section before I posted. I meant to say that the central directory name
looks preferred over the per-file header.
In section J, under "file name (Variable)" there's a bit that says:
"If input came from standard input, there is no file name field. If
encrypting the central directory and general purpose bit flag 13 is set
indicating masking, the file name stored in the Local Header will not be
the actual file name. A masking value consisting of a unique
hexadecimal value will be stored."
So in these cases the central directory name has to be used. And, as
you pointed out, some operations like "deleting" a member from the
archive are implemented by editing the central directory, so it would
seem that the central directory should be used if there's a conflict.
msg92516 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2009-09-11 18:57
In the case at issue, the file name is the same (contrary to the error
message). The two representations of the *path* are different, but
equivalent. There is no ambiguity: the file should be put in directory
'test' and named 'test2.txt'. So I think zipfile should do what 7zip
does and do just that.
An actual filename difference might be argued differently.
msg92874 - Author: Jan (chuck) * Date: 2009-09-19 17:01
I added a patch to replace backslashes with forward slashes in three 
places, only one of them actually relevant to the errors in the attached 
.zip file.
I kept the exception for mismatching filenames, but if you think it is 
appropriate to remove it I could do that as well.
msg116384 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-09-14 11:26
I agree with the change, but the code should be factored out into a function (normalize_filename for example)
msg116385 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-09-14 11:35
I'd prefer if the code no longer checked if the filename in the directory matches the name in the per-file header.
The reason is that the two don't have to match: it is relatively cheap to rename a file in the zipfile by rewriting the directory, while rewriting the entire zipfile can be pretty expensive when zipfiles get large.
It's probably worthwhile to test what other zipfile tools do in this respect (e.g., create a zipfile where the filename in the header doesn't match the name in the directory and extract that zip using a number of popular tools).
(I have a slightly odd perspective on this because I regularly deal with zipfiles containing over 100K files and over 10GByte of data).
msg200165 - Author: Adam Polkosnik (apolkosnik) * Date: 2013-10-17 20:42
I've been bitten by a different variation of this bug.
In my case the issue can be summarized by:
zipfile.BadZipfile: File name in directory "Windows\TEMP\test.tmp" and header "C:\Windows\TEMP\test.tmp" differ.
Attached is a patch for Python27/lib/zipfile.py. I understand that it might not be the best approach, but at least we just compare the filenames without caring much about those pesky paths preceding them.
msg201842 - Author: Adam Polkosnik (apolkosnik) * Date: 2013-10-31 18:56
Just tested my patch on mac, and it appears that it didn't work on OSX (and likely on other unix platforms too).
Conclusion... os.path.basename() will not do anything to windows paths when running on unix.
I'm thinking that instead of bailing at 'File name in directory "%s" and header "%s" differ.', the library should just print a warning, and continue.
msg208970 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-01-23 17:59
I'm in a similar situation, my test file raises this:
File name in directory "windows\TEMP\\test123.txt" and header "C:\windows\TEMP\\test123.txt" differ.
It turns out that I can't find any cross-platform procedures for processing the paths between the different platforms. There are other complications, like doing it in a portable way: neither os.path.split() nor os.path.basename() will touch Windows paths on un*x, etc.
So, I'd like to propose an easy way: just allow the process to extract the files (and print a warning message) rather than raising an exception (raise BadZipfile,...) and stopping the extraction altogether.
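Editorial illustration of the portability point above. On a POSIX system `os.path` is `posixpath`, which does not treat a backslash as a separator, but the standard library also ships `ntpath` and (from Python 3.4) `pathlib.PureWindowsPath`, both of which apply Windows path rules on any platform. The sketch below is only an illustration; `strip_windows_anchor` is a hypothetical helper, not something the zipfile module provides.

```python
import ntpath
import posixpath
from pathlib import PureWindowsPath

name = r"C:\Windows\TEMP\test.tmp"

# posixpath sees no separator at all, so the "basename" is the whole string.
posix_result = posixpath.basename(name)
# ntpath applies Windows rules regardless of the host platform.
windows_result = ntpath.basename(name)


def strip_windows_anchor(member_name):
    """Drop a drive letter and/or leading slash; return a '/'-separated path."""
    path = PureWindowsPath(member_name)
    if path.anchor:
        path = path.relative_to(path.anchor)
    return path.as_posix()
```

With this helper, the two names from msg200165 reduce to the same relative path, although whether an absolute and a relative name should be treated as equal is exactly what the thread goes on to debate.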
msg208973 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-01-23 18:20
This one has the parentheses for print, so that it works in python 3.x. Also, the default fallback behavior in this case is to use the filename from the zip's directory (the first path in the warning).
msg208975 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2014-01-23 18:34
As I wrote in msg116385 I'd prefer to drop the consistency check completely because updating data like the filename in the central directory is a cheap way to rename files without completely rewriting the zip file.
msg208982 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-01-23 20:04
Can we get this simple "fix" implemented in time for the next 2.7.x release?!
Thank you!
msg208983 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-01-23 20:09
print() is not a good way to emit the warning; please use the warnings module.
msg208985 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-23 20:12
> As I wrote in msg116385 I'd prefer to drop the consistency check completely
> because updating data like the filename in the central directory is a cheap
> way to rename files without completely rewriting the zip file.
It should at least be left as a debugging print.
It can't be a warning, because it depends not on user's actions, but on 
external data. But user still should be able to investigate uncommon zipfiles 
by setting the debug attribute.
msg208987 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-01-23 20:22
Excellent, please see my third attempt.
msg209023 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-01-23 23:52
Adam, this is not a security issue (2.6, 3.1, 3.2), nor a future issue that must wait for 3.5.
msg211562 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-02-18 21:54
It might not be a regular "security" issue, but it is not extracting some files that it should. There's a possible scenario, where it can be a security issue.
msg217533 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-29 17:58
Gentlemen,
Is there any way this fix can be included in any version?
Currently, the fact that the exception is thrown makes extracting some zip files impossible with this library, and rolling your own is a bit painful. (either using a wrapper around 7zip to handle those or just provide cloned/patched versions for every major python version).
This ridiculous behavior is really not consistent with other ZIP implementations (7zip just ignores the mismatch).
Thank you for your time and effort.
msg217546 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-04-29 20:41
Adam P, please don't screw around with the version headers. If you want to claim that this is a security issue of the type we care about (threats to the public internet) for patching old releases, and severe enough that we should do anything about it, send a detailed explanation with links to evidence to the security response team. Simply writing 'a possible scenario' is insufficient.
msg217551 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-29 21:08
For the version headers, I've added the versions featuring the broken behavior. That's all.
I'm not saying that this is 
I'm extracting malware from the Central Quarantine files, and the vendor's implementation is broken and is causing this issue for me on every single file inside the archive.
Let's say, I've got a wrapper script that feeds the contents of a zip file to be scanned with this, because of this behavior, the wrapper will error out... Customers will say your product sucks, etc.
Does this really take an act of god to fix this?
msg217554 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-29 21:15
Also, this behavior is present on all platforms and all versions of Python (zipfile Library), so maybe the headers should be adjusted there too.
I'm not saying that this is necessarily a big freaking hole, but by using this, one can prevent files from being extracted using this simple trick.
msg217556 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-29 21:21
If I got a file scanner in my mail gateway implemented with this, one can easily avoid getting the contents of zip-files scanned. Is that enough of a security impact?
msg217558 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-29 21:42
I've also tested with WinZip, and Windows Explorer, on windows. Both extract the contents of test.zip without a warning (just like 7zip on Windows did). This behavior counts as Denial Of Service if the zipfile Library is used to extract files, besides lots of formats use ZIP as an envelope; DOCX, APK, JAR, EPUB come to mind.
msg217561 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-04-29 23:12
Adam P. I politely asked you to leave the headers alone. Since you ignored that, LEAVE THE HEADERS ALONE! If you continue, you will eventually get banned. 
An issue gets dealt with when a volunteer core developer makes it his top priority. In the past 24 hours, patches were pushed for 16 different issues, but not this one. Sorry, but that is how it is.
msg217569 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 04:16
Terry, I apologize about the second change of headers, somehow I must have used the submission form to post the comment from a tab that had the old content, and the headers didn't refresh there. I assure you that it was not my intention to change them again.
msg217570 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 04:23
In any event, I think that zipfile_stupid3.patch would be the best trivial fix to this issue.
msg217571 - Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014-04-30 04:34
The check can be simplified further to "if self.debug and fname != zinfo.orig_filename:", but the conversion to a debugging print seems reasonable to me.
msg217572 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 05:16
Patch against 2.7.6 attached.
msg217573 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-04-30 05:19
Nick, do you agree that this should be treated as a bug (apply to all 3 versions)?
Should debug messages be 'print'ed, sent to stderr, or go through the warnings module?
msg217574 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 05:23
Patch against zipfile 3.4.0 attached.
msg217575 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 05:27
update
msg217576 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 05:30
Once again patch against 2.7.6
msg217578 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2014-04-30 07:23
Don't use print (to stdout) or sys.stderr directly. There are already many other uses of warnings.warn within the zipfile module. Be consistent with those.
Existing zipfile warnings seem to favor lazily importing warnings when it's needed rather than a top level 'import warnings'. While I find that annoying, there are sometimes reasons to do it and the minimally invasive change that is consistent with the rest of the existing code is to do the same thing here.
something similar to:
+        if self.debug and fname != zinfo.orig_filename:
+            import warnings
+            warnings.warn(
+                'Warning: Filename in directory "%s" and header "%s" differ.' % (
+                    zinfo.orig_filename, fname))
msg217616 - Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2014-04-30 12:54
As Greg suggested, the important thing is to follow the precedent set by
other debug messages in the module.
msg217624 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 15:51
Attached is a patch with warnings against 2.7.6
msg217625 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 15:52
Attached is a patch with warnings against 3.4.0
msg217627 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 16:10
Attached is a patch with warnings against 2.7.6 (this one should be good to go)
msg217634 - Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-04-30 18:01
--- a/zipfile.py	Wed Apr 30 11:27:16 2014
+++ b/zipfile.py	Wed Apr 30 11:27:01 2014
@@ -1174,8 +1174,9 @@
             else:
                 fname_str = fname.decode("cp437")
 
-            if fname_str != zinfo.orig_filename:
-                raise BadZipFile(
+            if self.debug and fname_str != zinfo.orig_filename:
+                import warnings
+                warnings.warn(
                     'File name in directory %r and header %r differ.'
                     % (zinfo.orig_filename, fname))
Also, you need to add ``stacklevel=2`` to warnings.warn().
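Editorial illustration of the warning-based approach and the `stacklevel=2` remark. This is a standalone sketch under stated assumptions, not the patch itself: `check_names` and `collect_name_warnings` are hypothetical helpers, and `debug` is a plain parameter standing in for the `ZipFile.debug` attribute. It shows how a caller can capture the warning, or escalate it to an exception, instead of losing it.

```python
import warnings


def check_names(directory_name, header_name, debug=1):
    """Warn, rather than raise, when the two stored names differ."""
    if debug and header_name != directory_name:
        warnings.warn(
            "File name in directory %r and header %r differ."
            % (directory_name, header_name),
            stacklevel=2,  # attribute the warning to the caller's line
        )


def collect_name_warnings(directory_name, header_name):
    """Capture any warning messages emitted by check_names()."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        check_names(directory_name, header_name)
    return [str(w.message) for w in caught]


def names_are_strictly_equal(directory_name, header_name):
    """Turn the warning back into a hard failure for strict callers."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            check_names(directory_name, header_name)
        except UserWarning:
            return False
    return True
```

A scanner that must not skip suspicious archives could use the strict variant, which speaks to the concern that warnings are often ignored.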
msg217635 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 18:28
3.4.0 patch with stacklevel=2
msg217636 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2014-04-30 18:31
I'm leaving it as "needs patch" because it isn't clear exactly what a committer should do. 
I think the current intent is to make the changes listed in zipfile_???_filename_mismatch_v2.patch (which are not listed as reviewable -- but the changes are indeed sufficiently straightforward that the files -- if need be -- could be edited by hand as if they were made originally by the committer.)
This change is small enough (warning instead of raise) that a test case is probably not strictly required, but it would be helpful.
test.zip would presumably be useful data for a test case.
There is dispute over whether this would be an enhancement (more generous with what to accept), a bug fix, or a security *regression* because it still allows old vulnerable files to stick around unreplaced (or to hide from a malware scanner), but no longer raises an Exception to get attention. (warnings are often ignored)
zlib_forward_slash.patch would also be good (and might even be a security fix, by allowing the new versions to be installed), but is not ready to be committed, as 
(A) it repeats the logic inline instead of using the newly defined helper method
(B) it doesn't have a test case (test1.zip should help when creating one)
(C) it has neither a doc change nor an explicit (and dubious) statement that this is just a bug fix and wouldn't need to be listed in the versionchanged. 
There is also a question of how general the filename correction should be, particularly with respect to windows drives and capitalization. The one in this patch seems to be the minimal change, and is explicitly supported by the zip spec.
msg217638 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2014-04-30 18:33
Presumably the stacklevel applies to all versions; verifying that it warns about the right code location is important enough to require a test case.
msg217641 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 19:05
I just looked through the 2.7.6 version of zipfile, and the error handling there is either through raise or print(). So, in line with the guidance provided for 2.7.6, perhaps we should stick with print() instead of warnings.warn(). I'll post that a bit later.
test.zip up there is the test case for this change. Is there any other test case needed?
msg217642 - Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2014-04-30 19:08
Adam, please stop deleting the files. It makes for a lot of noise to those on the nosy list, and is unnecessary.
Just make sure you increment the version number on the files you upload and it will be fine.
Thanks.
msg217643 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 19:11
Jim, 
I've got some test cases where the zlib_forward_slash.patch doesn't cut it. That was the reason for trying a broader approach with filename_mismatch patches.
msg217647 - Author: Francis MB (francismb) * Date: 2014-04-30 20:13
A small question related to: "zipfile_276_filename_mismatch_v3.patch"
--- a/zipfile.py	Wed Apr 30 11:44:38 2014
+++ b/zipfile.py	Wed Apr 30 15:10:38 2014
@@ -970,10 +970,10 @@
         if fheader[_FH_EXTRA_FIELD_LENGTH]:
             zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])
 
-        if fname != zinfo.orig_filename:
-            raise BadZipfile, \
+        if self.debug and fname != zinfo.orig_filename:
+            print( \
                     'File name in directory "%s" and header "%s" differ.' % (
-                        zinfo.orig_filename, fname)
+                        zinfo.orig_filename, fname))
Shouldn't a change from raising an exception to a print be documented somewhere?
Thanks
msg217648 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2014-04-30 20:42
The bug was that BadZipFile was being raised when it shouldn't be so I wouldn't worry about documenting the behavior change other than in the Misc/NEWS entry that the ultimate committer writes up.
msg217659 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-04-30 21:51
Is there anything else that you need me to provide?
msg217660 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2014-04-30 22:00
On Wed, Apr 30, 2014 at 3:05 PM, Adam Polkosnik wrote:
> test.zip up there is the test case for this change. Is there any other test case needed?
ah; I see the confusion. test.zip is test *data*. When I asked for a
test *case*, I meant something that ensures the data will be used to
actually run the test automatically.
Typically, that would involve adding something to
Lib/test/test_zipfile.py. I'm guessing it would be easiest to add a
new class inheriting from unittest.TestCase and opening test.zip in
the setUp, then using a bunch of assert* methods to verify that the
file was read and interpreted correctly.
-jJ
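Editorial illustration of the difference between test data and a test case. The sketch below is not the test that would go into Lib/test/test_zipfile.py; it is a self-contained example under two assumptions: the archive is built in memory rather than loaded from the attached test.zip, and the backslash is placed in the local header (the reverse of test.zip) so that `namelist()` gives the same result on every platform. The names `make_mismatched_archive` and `FilenameMismatchTest` are invented here. The read test accepts either outcome discussed in this thread, because which one applies depends on the Python version.

```python
import io
import unittest
import zipfile


def make_mismatched_archive():
    """Build a one-member archive whose local header name uses a backslash."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("test/test2.txt", b"payload")
    # The local header comes first in the file, so replacing only the first
    # occurrence leaves the central directory name untouched.
    return buf.getvalue().replace(b"test/test2.txt", b"test\\test2.txt", 1)


class FilenameMismatchTest(unittest.TestCase):
    def setUp(self):
        self.zf = zipfile.ZipFile(io.BytesIO(make_mismatched_archive()))
        self.addCleanup(self.zf.close)

    def test_namelist_comes_from_central_directory(self):
        self.assertEqual(self.zf.namelist(), ["test/test2.txt"])

    def test_read_returns_data_or_raises_badzipfile(self):
        try:
            data = self.zf.read("test/test2.txt")
        except zipfile.BadZipFile:
            return  # strict behavior: the name check rejects the member
        # lenient behavior: the member is read despite the mismatch
        self.assertEqual(data, b"payload")
```

Once the desired behavior is settled, the second test would be tightened to assert exactly one of the two outcomes.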
msg217661 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2014-04-30 22:13
On Wed, Apr 30, 2014 at 3:11 PM, Adam Polkosnik
> I've got some test cases where the zlib_forward_slash.patch doesn't cut it.
My recommendation (and I could be convinced otherwise) would be to replace
 if fname_str != zinfo.orig_filename:
 raise ...
with something more like
 self.filename_check(fname_str, zinfo.orig_filename)
and a default implementation of filename_check that does nothing if
they're equal; calls the slash replace (since the standard supports
that correction); does nothing else if they're now equal; emits a
warning (or prints, in 2.7.6) otherwise.
In 2.7.6, you would have to keep the new methods private, but in 3.5,
users could override filename_check to handle the windows path
normalization, or whatever other problems you have documented.
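Editorial illustration of the proposed default behavior. The steps described above (equal: do nothing; equal after slash replacement: do nothing; otherwise warn) could look roughly like the sketch below. It is written as a standalone function rather than a ZipFile method, and the boolean return value is an addition made here for testability, not part of the proposal.

```python
import warnings


def filename_check(header_name, directory_name):
    """Return True if the names agree, possibly after slash normalization."""
    if header_name == directory_name:
        return True
    # The ZIP specification prescribes '/' as the separator, so treat a
    # backslash as an alias for it before comparing again.
    if header_name.replace("\\", "/") == directory_name.replace("\\", "/"):
        return True
    warnings.warn(
        "File name in directory %r and header %r differ."
        % (directory_name, header_name),
        stacklevel=2,
    )
    return False
```

A subclass with more local knowledge could replace this with a stricter or looser comparison, for example one that also ignores a drive letter or capitalization.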
msg217743 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-05-02 05:14
Jim,
The problems documented here are related to two cases (both apparently arriving from world of windows): 
1. two relative paths with inverted slash in one of them (test\test2.txt vs test/test2.txt)
2. relative path vs absolute path (windows\temp\test.txt vs c:\windows\temp\test.txt)
The extraction part seems to be doing a good job at writing the files into sane locations.
IMHO, there's no point in trying to replace slashes or otherwise "normalize", as this would fix the cases where the presence of an inverted slashes should be noted in debug output. 
By the same token stripping the drive letter from the absolute path part would just deprive us from noticing such intricacies in these special zip files.
msg217753 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2014-05-02 14:55
On Fri, May 2, 2014 at 1:14 AM, Adam Polkosnik
> The problems documented here are related to two cases (both apparently arriving from world of windows):
Good! I had thought you had even more!
> 1. two relative paths with inverted slash in one of them (test\test2.txt vs test/test2.txt)
My understanding from earlier -- and I may have been reading too much
into some of the comments -- is that the standard defined \filename as
an inferior alias for /filename and supported the fix.
Notably, if you're extracting on windows with windows conventions,
then windows will treat them identically anyhow.
If you're extracting a windows file to a unix environment, then \t
really should be translated to /t.
> 2. relative path vs absolute path (windows\temp\test.txt vs c:\windows\temp\test.txt)
These really are different, as leaving off the "C:" should mean
"current drive", which will often (but not always) be C:
This (and differing capitalization) are among the reasons to do the
filename fix in a separate method, so that subclasses with more local
knowledge can more easily do the right thing.
Note that for python 3.4 and newer, pathlib <URL:
https://docs.python.org/3/library/pathlib.html> may be helpful. It
would probably even be possible to backport the essential parts as an
implementation detail. But I'm not sure if that could be done
compatibly with maintenance releases, or how much work it would take.
> The extraction part seems to be doing a good job at writing the files into sane locations.
> IMHO, there's no point in trying to replace slashes or otherwise "normalize", as this would fix the cases where the presence of an inverted slashes should be noted in debug output.
My understanding had been that it was failing to extract entirely. So
exactly what is the problem?
msg217754 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-05-02 15:19
Extraction works fine, the issue was that raise() was creating an exception, and stopping the whole extraction process. When replaced with a warning, everything works fine.
msg217756 - Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2014-05-02 15:53
Adam Polkosnik said:
--------------------
> Extraction works fine, the issue was that raise() was creating an exception, and
> stopping the whole extraction process.
That doesn't make sense. If an exception was "stopping the whole extraction process" then extraction was not working fine.
Questions:
 - Are the names with '\' in them in the central directory, or the per-file header?
 - If in the central directory (which is the name we are going to use, yes?) how do
 we tell if the '\' should be a '/' or an escape? (such as '\t')
msg217778 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-05-02 19:30
Ethan,
I'd refer you to msg92309...
And
When testing with WinZip it looks like this: 
No errors detected in compressed data of C:\Downloads\test.zip.
Testing ...
Testing test\ OK
Testing test\test2.txt OK
Testing test1.txt OK
Then in python:
Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:25:23) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zipfile
>>> zf = zipfile.ZipFile('test.zip')
>>> namelist = zf.namelist()
>>> namelist
['test/', 'test/test2.txt', 'test1.txt']
>>> for af in namelist:
...     zf.read(af)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "c:\Python34\lib\zipfile.py", line 1117, in read
    with self.open(name, "r", pwd) as fp:
  File "c:\Python34\lib\zipfile.py", line 1180, in open
    % (zinfo.orig_filename, fname))
zipfile.BadZipFile: File name in directory 'test\\' and header b'test/' differ.
So, based on that everything is already converted to forward slashes for the extraction.
msg217932 - Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2014-05-05 16:35
Ah, so when you (Adam) said "extraction works fine", what you meant was "extraction works fine *in other programs*". Okay.
msg217933 - Author: Adam Polkosnik (apolkosnik) * Date: 2014-05-05 16:40
Both. Other programs, and in python scripts when raise() is removed in zipfile.py. Unless your results are different.
msg345999 - Author: Morgan Ramsay (MorganRamsay) * Date: 2019-06-18 16:41
The encoding test in ZipFile.open() is highly opinionated and has no purpose beyond itself. Testing for encoding issues should be done outside this library in the user's own code.
Using the 3.7.2 version of ZipFile, this is my proposal:
https://gist.github.com/MorganRamsay/696e89450e0f172c16ac8dfc016eb79f/revisions?diff=unified
Currently, I'm subclassing ZipFile with this patch and I've had no issues with extracting thousands of different ZIP files on Windows. I can't attest to this solution's applicability on other platforms.
History
Date | User | Action | Args
2022-04-11 14:56:52 | admin | set | github: 51088
2019-06-18 21:36:54 | python-dev | set | stage: needs patch -> patch review; pull_requests: + pull_request14050
2019-06-18 16:41:11 | MorganRamsay | set | nosy: + MorganRamsay; messages: + msg345999; versions: + Python 3.6, Python 3.7
2015-07-21 07:09:20 | ethan.furman | set | nosy: - ethan.furman
2015-06-18 19:59:54 | Sean Goodwin | set | nosy: + Sean Goodwin
2014-05-05 16:40:07 | apolkosnik | set | messages: + msg217933
2014-05-05 16:35:24 | ethan.furman | set | messages: + msg217932
2014-05-02 19:30:24 | apolkosnik | set | messages: + msg217778
2014-05-02 15:53:17 | ethan.furman | set | messages: + msg217756
2014-05-02 15:19:29 | apolkosnik | set | messages: + msg217754
2014-05-02 14:55:38 | Jim.Jewett | set | messages: + msg217753
2014-05-02 05:14:04 | apolkosnik | set | messages: + msg217743
2014-05-01 00:05:52 | alanmcintyre | set | nosy: - alanmcintyre
2014-04-30 22:13:45 | Jim.Jewett | set | messages: + msg217661
2014-04-30 22:00:30 | Jim.Jewett | set | messages: + msg217660
2014-04-30 21:51:58 | apolkosnik | set | messages: + msg217659
2014-04-30 20:42:23 | gregory.p.smith | set | messages: + msg217648
2014-04-30 20:13:09 | francismb | set | nosy: + francismb; messages: + msg217647
2014-04-30 19:11:50 | apolkosnik | set | files: + zipfile_276_filename_mismatch_v3.patch; messages: + msg217643
2014-04-30 19:08:59 | ethan.furman | set | messages: + msg217642
2014-04-30 19:05:26 | apolkosnik | set | messages: + msg217641
2014-04-30 18:48:28 | apolkosnik | set | files: - zipfile_340_filename_mismatch_v2.patch
2014-04-30 18:33:23 | Jim.Jewett | set | messages: + msg217638
2014-04-30 18:31:41 | Jim.Jewett | set | nosy: + Jim.Jewett; messages: + msg217636
2014-04-30 18:28:56 | apolkosnik | set | files: + zipfile_340_filename_mismatch_v3.patch; messages: + msg217635
2014-04-30 18:01:52 | berker.peksag | set | nosy: + berker.peksag; messages: + msg217634
2014-04-30 16:10:33 | apolkosnik | set | files: + zipfile_276_filename_mismatch_v2.patch; messages: + msg217627
2014-04-30 15:53:40 | apolkosnik | set | files: - zipfile_276_filename_mismatch_v2.patch
2014-04-30 15:52:37 | apolkosnik | set | files: + zipfile_340_filename_mismatch_v2.patch; messages: + msg217625
2014-04-30 15:51:57 | apolkosnik | set | files: + zipfile_276_filename_mismatch_v2.patch; messages: + msg217624
2014-04-30 15:50:50 | apolkosnik | set | files: - zipfile_276_filename_mismatch.patch
2014-04-30 15:50:42 | apolkosnik | set | files: - zipfile_stupid3.patch
2014-04-30 15:50:33 | apolkosnik | set | files: - zipfile_340_filename_mismatch.patch
2014-04-30 12:54:14 | ncoghlan | set | messages: + msg217616
2014-04-30 07:23:25 | gregory.p.smith | set | nosy: + gregory.p.smith; messages: + msg217578
2014-04-30 05:30:35 | apolkosnik | set | files: + zipfile_276_filename_mismatch.patch; messages: + msg217576
2014-04-30 05:27:23 | apolkosnik | set | files: + zipfile_340_filename_mismatch.patch; messages: + msg217575
2014-04-30 05:26:12 | apolkosnik | set | files: - zipfile_276_filename_mismatch.patch
2014-04-30 05:25:59 | apolkosnik | set | files: - zipfile_340_filename_mismatch.patch
2014-04-30 05:23:54 | apolkosnik | set | files: + zipfile_340_filename_mismatch.patch; messages: + msg217574
2014-04-30 05:19:50 | terry.reedy | set | messages: + msg217573
2014-04-30 05:16:10 | apolkosnik | set | files: + zipfile_276_filename_mismatch.patch; messages: + msg217572
2014-04-30 04:34:06 | ncoghlan | set | nosy: + ncoghlan; messages: + msg217571
2014-04-30 04:23:10 | apolkosnik | set | messages: + msg217570
2014-04-30 04:16:52 | apolkosnik | set | messages: + msg217569
2014-04-30 04:12:23 | ethan.furman | set | nosy: + ethan.furman
2014-04-29 23:12:25 | terry.reedy | set | messages: + msg217561; versions: - Python 3.1, Python 3.2, Python 3.3
2014-04-29 21:42:14 | apolkosnik | set | messages: + msg217558
2014-04-29 21:21:07 | apolkosnik | set | messages: + msg217556
2014-04-29 21:15:57 | apolkosnik | set | messages: + msg217554
2014-04-29 21:08:29 | apolkosnik | set | messages: + msg217551; versions: + Python 3.1, Python 3.2, Python 3.3
2014-04-29 20:41:34 | terry.reedy | set | messages: + msg217546; versions: - Python 3.1, Python 3.2, Python 3.3
2014-04-29 17:58:44 | apolkosnik | set | messages: + msg217533; versions: + Python 3.1, Python 3.2, Python 3.5
2014-02-18 21:54:32 | apolkosnik | set | messages: + msg211562
2014-01-23 23:52:27 | terry.reedy | set | messages: + msg209023; versions: - Python 2.6, Python 3.1, Python 3.2, Python 3.5
2014-01-23 20:23:28 | apolkosnik | set | files: - zipfile_stupid.patch
2014-01-23 20:23:23 | apolkosnik | set | files: - zipfile_stupid2.patch
2014-01-23 20:22:51 | apolkosnik | set | files: + zipfile_stupid3.patch; messages: + msg208987
2014-01-23 20:12:46 | serhiy.storchaka | set | messages: + msg208985
2014-01-23 20:09:18 | georg.brandl | set | nosy: + georg.brandl; messages: + msg208983
2014-01-23 20:04:33 | apolkosnik | set | messages: + msg208982
2014-01-23 18:34:28 | ronaldoussoren | set | messages: + msg208975
2014-01-23 18:20:51 | apolkosnik | set | files: + zipfile_stupid2.patch; messages: + msg208973
2014-01-23 17:59:11 | apolkosnik | set | files: + zipfile_stupid.patch; messages: + msg208970; versions: + Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
2014-01-23 17:37:39 | apolkosnik | set | files: - zipfile.py.patch
2013-10-31 18:56:35 | apolkosnik | set | messages: + msg201842
2013-10-24 09:56:32 | tim.golden | set | nosy: - tim.golden
2013-10-17 20:42:49 | apolkosnik | set | files: + zipfile.py.patch; versions: + Python 2.7; nosy: + apolkosnik; messages: + msg200165
2012-09-28 14:01:24 | tim.golden | set | assignee: tim.golden ->
2012-04-07 19:11:36 | serhiy.storchaka | set | nosy: + serhiy.storchaka
2010-09-14 11:35:01 | ronaldoussoren | set | messages: + msg116385
2010-09-14 11:26:24 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: + msg116384
2010-08-06 15:35:16 | tim.golden | set | assignee: tim.golden; nosy: + tim.golden
2009-09-19 17:01:06 | chuck | set | files: + zlib_forward_slash.patch; nosy: + chuck; messages: + msg92874; keywords: + patch
2009-09-11 21:01:53 | amaury.forgeotdarc | set | stage: needs patch
2009-09-11 18:57:17 | terry.reedy | set | nosy: + terry.reedy; messages: + msg92516
2009-09-06 21:26:51 | alanmcintyre | set | messages: + msg92335
2009-09-06 20:41:54 | ronaldoussoren | set | messages: + msg92330
2009-09-06 18:58:40 | alanmcintyre | set | nosy: + alanmcintyre; messages: + msg92326
2009-09-06 12:40:05 | ronaldoussoren | set | nosy: + ronaldoussoren; messages: + msg92309
2009-09-06 04:02:01 | NewerCookie | set | messages: + msg92297
2009-09-04 19:57:57 | NewerCookie | set | messages: + msg92265
2009-09-04 19:56:50 | NewerCookie | create |
