This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2010-08-31 01:02 by craigds, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| zipfile_zip64_header.patch | craigds, 2010-08-31 01:02 | |
| zipfile-huge-files.diff | alanmcintyre, 2010-09-07 04:57 | |
| zipfile_zip64_always.patch | serhiy.storchaka, 2012-09-23 09:54 | Always write Zip64 extra |
| zipfile_zip64_try.patch | serhiy.storchaka, 2012-09-23 09:55 | Try to write Zip64 extra only if needed |
| zipfile_zip64_always_2.patch | serhiy.storchaka, 2012-11-28 12:25 | Always write Zip64 extra |
| zipfile_zip64_try_2.patch | serhiy.storchaka, 2012-11-28 12:26 | Try to write Zip64 extra only if needed |
| zipfile_zip64_try_2-2.7.patch | serhiy.storchaka, 2013-01-04 13:27 | |
| zipfile_zip64_try_2-3.2.patch | serhiy.storchaka, 2013-01-04 13:27 | |
| Messages (20) | |||
|---|---|---|---|
| msg115250 | Author: Craig de Stigter (craigds) | Date: 2010-08-31 01:02 | |
Steps to reproduce:
# create a large (>4gb) file
f = open('foo.txt', 'wb')
text = 'a' * 1024**2
for i in xrange(5 * 1024):
    f.write(text)
f.close()
# now zip the file
import zipfile
z = zipfile.ZipFile('foo.zip', mode='w', allowZip64=True)
z.write('foo.txt')
z.close()
Now inspect the file headers using a hex editor. The written headers are incorrect. The filesize and compressed size should be written as 0xffffffff and the 'extra field' should contain the actual sizes.
Tested on Python 2.5 but looking at the latest code in 3.2 it still looks broken.
The problem is that the ZipInfo.FileHeader() is written before the filesize is populated, so Zip64 extensions are not written. Later, the sizes in the header are written, but Zip64 extensions are not taken into account and the filesize is just wrapped (7gb becomes 3gb, for instance).
My patch fixes the problem on Python 2.5; it might need minor porting to fix trunk. It works by assigning the uncompressed filesize to the ZipInfo header initially, then writing the header. Later on, I rewrite the header (this is okay since the header size will not have increased).
|
|||
| msg115466 | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2010-09-03 16:53 | |
A tip about versions: Development happens on the current active branch, py3k (future 3.2 version), and bug or doc fixes are backported to the stable versions 2.7 and 3.1. Security fixes go into 2.6 too. Can you reproduce your bug in 2.7, 3.1 and 3.2? Adding Alan to nosy since he’s listed in Misc/maintainers.rst. |
|||
| msg115514 | Author: Craig de Stigter (craigds) | Date: 2010-09-03 21:47 | |
Yes, the bug still exists in Python 3.1.2. However, struct.pack() no longer silently ignores overflow, so I get this error instead:
>>> z.write('foo.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.1/zipfile.py", line 1095, in write
    zinfo.file_size))
struct.error: argument out of range
|
|||
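Both failure modes reported so far come from squeezing a size into a 32-bit header field. A minimal sketch, assuming Python 3 semantics for `struct.pack()`; the masking line only imitates the silent wrap seen on Python 2.5 and is not the code zipfile itself ran.

```python
import struct

GIB = 1024 ** 3
size = 7 * GIB  # larger than an unsigned 32-bit field can hold

# What silently dropping the high bits amounts to: 7 GiB "becomes" 3 GiB.
wrapped = size & 0xFFFFFFFF
print(wrapped // GIB)

# Python 3 refuses to pack the value into a 4-byte unsigned field instead.
overflow_error = None
try:
    struct.pack('<L', size)
except struct.error as exc:
    overflow_error = exc
    print('struct.error:', exc)
```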
| msg115660 | Author: Alan McIntyre (alanmcintyre) * (Python committer) | Date: 2010-09-05 17:42 | |
Thanks for the patch, Craig; I should have some time later today or tomorrow to do a review. Did you have a patch for the test suite(s) as well? If not, I can just make sure your test case is covered in test_zipfile64. |
|||
| msg115672 | Author: Craig de Stigter (craigds) | Date: 2010-09-05 21:16 | |
Hi, sorry, no, I haven't had time to add a real test for this. |
|||
| msg115741 | Author: Alan McIntyre (alanmcintyre) * (Python committer) | Date: 2010-09-07 04:57 | |
Here's an updated patch for the py3k trunk with tests. This pretty much doubles the runtime of test_zipfile64.py. The patch also removes some unnecessary code from the existing test_zipfile64 tests. Note: It looks like writestr will also suffer from a struct.pack overflow if it's given a ZipInfo with the third general purpose flag bit set. I won't have time to address that until next weekend, probably. |
|||
| msg146923 | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2011-11-03 12:17 | |
Issue 6434 was marked as a duplicate of this issue. |
|||
| msg156442 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-03-20 17:52 | |
I am afraid the problem is more complicated. With allowZip64=True, all files would need to be written with this extension, because the size of the local file header may change, so it is not possible to just go back after compression and rewrite it. As it stands, the Zip64 option simply does not work. |
|||
| msg170645 | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2012-09-18 13:44 | |
Serhiy: If I understand you correctly it should be easy to fix. The code in close() has to check if any file is beyond the ZIP64 limit and then write all headers with extra args. Is that correct? |
|||
| msg171010 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-09-22 17:56 | |
No, on the contrary, it is not that easy to fix, and the patch is incorrect. Sorry that I did not make this clear. The size of the header with extra args depends on the size of the file. The size can change during compression, and the compressed size may be larger than the uncompressed size, crossing the 32-bit boundary. If we then rewrite the header with extra args, we can overwrite compressed data. I had put the issue off pending more careful research. Thanks for the reminder. One solution is to always write 64-bit sizes (even for the smallest files) when allowZip64 is true. |
|||
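The point in msg171010 that the compressed size can exceed the uncompressed size is easy to see on incompressible input. A small sketch using `zlib` on pseudo-random bytes; the seed and buffer size are arbitrary choices for illustration, and this uses a zlib-wrapped stream rather than the exact stream zipfile writes.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
data = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

compressed = zlib.compress(data)
# Deflate cannot shrink random data, and it still has to add framing bytes.
print(len(data), len(compressed))
```

A file sitting just under the 32-bit boundary could therefore cross it only after compression, which is why a header sized from the uncompressed length alone is not safe to rewrite in place.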
| msg171025 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-09-23 09:54 | |
I see two reasonable solutions to the issue (everything below applies only when allowZip64=True):
1) Always write the Zip64 extended information extra field. This approach always succeeds, but the zipfile size increases by 20 bytes for each file. The first patch (zipfile_zip64_always.patch) uses this approach.
2) Write the Zip64 extended information extra field only if the assumed file size exceeds a certain limit. In very rare cases this makes it impossible to compress a file that could be compressed the first way. However, in most cases it produces the same file as before the patch. The second patch (zipfile_zip64_try.patch) is based on Alan's patch and uses this approach. The probability of errors is reduced, and errors are now detected instead of leading to silent data corruption.
Both patches are for Python 3.3. If either patch is acceptable, I'll backport it to the older versions. |
|||
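The 20-byte figure in msg171025 matches the layout of the Zip64 extended information extra field in a local header: a 2-byte header ID, a 2-byte data size, and two 8-byte sizes. A minimal sketch of the two strategies follows. The function name `zip64_extra` and the `always` flag are hypothetical, and the "only if needed" branch is simplified to compare the actual sizes against `zipfile.ZIP64_LIMIT` rather than reproduce the heuristic in either patch.

```python
import struct
import zipfile

# Header ID, data size, uncompressed size, compressed size.
ZIP64_EXTRA = struct.Struct('<HHQQ')


def zip64_extra(file_size, compress_size, always=False):
    """Return the Zip64 extra field bytes, or b'' when it is not written.

    always=True models option 1; always=False models option 2, where the
    field is written only if a size exceeds the limit. (Hypothetical helper.)
    """
    needed = (file_size > zipfile.ZIP64_LIMIT
              or compress_size > zipfile.ZIP64_LIMIT)
    if always or needed:
        return ZIP64_EXTRA.pack(1, ZIP64_EXTRA.size - 4, file_size, compress_size)
    return b''


huge = 5 * 1024 ** 3
print(len(zip64_extra(10, 10, always=True)))  # option 1: overhead even for a tiny file
print(len(zip64_extra(10, 10)))               # option 2: nothing extra for a tiny file
print(len(zip64_extra(huge, huge)))           # both options agree for a huge file
```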
| msg172648 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-10-11 15:08 | |
What is the conclusion about the patches? Which variant should I backport to the older versions? |
|||
| msg172652 | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2012-10-11 15:22 | |
I'd write the extended header when the current file size is larger than the zip64 limit (that is, when 'st.st_size > ZIP64_LIMIT' in the write method). That way the minimal header size is used whenever possible. As you noted, this can cause problems when the file grows beyond the limit while it is being stored in the zipfile, but IMHO storing data while it is being modified is asking for problems anyway. BTW, I haven't actually reviewed the patch yet. |
|||
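The check proposed in msg172652 can be sketched as follows. This is an illustration only, not the code that was committed: the helper name `wants_zip64_header` is hypothetical, and `zipfile.ZIP64_LIMIT` is read from the standard library.

```python
import os
import tempfile
import zipfile


def wants_zip64_header(path):
    """Decide from the on-disk size alone (hypothetical helper)."""
    return os.stat(path).st_size > zipfile.ZIP64_LIMIT


# Create a small throwaway file to run the check against.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'small file')
    small_path = tmp.name

try:
    small_needs_zip64 = wants_zip64_header(small_path)
    print(small_needs_zip64)
finally:
    os.remove(small_path)
```

The decision is made once, before any data is written, so it cannot account for a file that grows while it is being archived, which is the trade-off the message acknowledges.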
| msg175471 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-11-12 20:38 | |
Please review the patches. |
|||
| msg176538 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-11-28 12:26 | |
Patches updated to resolve a merge conflict with issue11981. Please review and apply either of these patches. This is needed for some of my other zipfile patches. |
|||
| msg178603 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012-12-30 19:11 | |
Which variant of the patches should I commit? Or should I prepare another? |
|||
| msg179013 | Author: Nico Möller (Nico.Möller) | Date: 2013-01-04 10:21 | |
I most definitely need a patch for 2.7.3. It would be awesome if you could provide a patch for that version. |
|||
| msg179019 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-01-04 13:27 | |
Here are patches of the second variant for 2.7 and 3.2. |
|||
| msg179987 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2013-01-14 22:45 | |
New changeset ce869b05762c by Serhiy Storchaka in branch '2.7':
Issue #9720: zipfile now writes correct local headers for files larger than 4 GiB.
http://hg.python.org/cpython/rev/ce869b05762c
New changeset b93848ca7760 by Serhiy Storchaka in branch '3.2':
Issue #9720: zipfile now writes correct local headers for files larger than 4 GiB.
http://hg.python.org/cpython/rev/b93848ca7760
New changeset 656a45738e5e by Serhiy Storchaka in branch '3.3':
Issue #9720: zipfile now writes correct local headers for files larger than 4 GiB.
http://hg.python.org/cpython/rev/656a45738e5e
New changeset 628a6af64a46 by Serhiy Storchaka in branch 'default':
Issue #9720: zipfile now writes correct local headers for files larger than 4 GiB.
http://hg.python.org/cpython/rev/628a6af64a46 |
|||
| msg179989 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-01-14 22:49 | |
Fixed. Thank you for the report, Craig de Stigter. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:05 | admin | set | github: 53929 |
| 2013-01-14 22:49:08 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; messages: + msg179989; stage: patch review -> resolved |
| 2013-01-14 22:45:09 | python-dev | set | nosy: + python-dev; messages: + msg179987 |
| 2013-01-04 13:27:38 | serhiy.storchaka | set | files: + zipfile_zip64_try_2-2.7.patch, zipfile_zip64_try_2-3.2.patch; messages: + msg179019 |
| 2013-01-04 10:21:58 | Nico.Möller | set | nosy: + Nico.Möller; messages: + msg179013 |
| 2012-12-30 19:11:38 | serhiy.storchaka | set | messages: + msg178603 |
| 2012-12-29 22:08:10 | serhiy.storchaka | set | assignee: serhiy.storchaka |
| 2012-11-28 12:26:01 | serhiy.storchaka | set | files: + zipfile_zip64_always_2.patch, zipfile_zip64_try_2.patch; messages: + msg176538 |
| 2012-11-26 20:32:14 | jhenry82 | set | nosy: + jhenry82 |
| 2012-11-12 20:38:26 | serhiy.storchaka | set | messages: + msg175471 |
| 2012-10-19 08:54:37 | Ruben.Gonzalez | set | nosy: + Ruben.Gonzalez |
| 2012-10-11 15:22:27 | ronaldoussoren | set | messages: + msg172652 |
| 2012-10-11 15:08:29 | serhiy.storchaka | set | messages: + msg172648; versions: + Python 3.4 |
| 2012-09-23 09:55:46 | serhiy.storchaka | set | files: + zipfile_zip64_try.patch; stage: needs patch -> patch review |
| 2012-09-23 09:54:19 | serhiy.storchaka | set | files: + zipfile_zip64_always.patch; nosy: + loewis, gregory.p.smith, ronaldoussoren; messages: + msg171025 |
| 2012-09-22 17:56:23 | serhiy.storchaka | set | messages: + msg171010 |
| 2012-09-18 13:44:30 | christian.heimes | set | keywords: + needs review; nosy: + christian.heimes; messages: + msg170645 |
| 2012-09-18 13:25:53 | Kristof.Keppens | set | nosy: + Kristof.Keppens |
| 2012-03-20 17:52:08 | serhiy.storchaka | set | messages: + msg156442 |
| 2012-03-20 17:13:23 | serhiy.storchaka | set | nosy: + serhiy.storchaka |
| 2012-03-20 14:35:55 | dandrzejewski | set | nosy: + dandrzejewski |
| 2011-11-03 12:17:55 | nadeem.vawda | set | versions: + Python 3.3, - Python 3.1; nosy: + amaury.forgeotdarc, nadeem.vawda, lambacck, segfault42, enlavin, Paul; messages: + msg146923; stage: needs patch |
| 2011-11-03 12:17:17 | nadeem.vawda | link | issue6434 superseder |
| 2010-09-07 04:57:46 | alanmcintyre | set | files: + zipfile-huge-files.diff; messages: + msg115741 |
| 2010-09-05 21:16:38 | craigds | set | messages: + msg115672 |
| 2010-09-05 17:42:17 | alanmcintyre | set | messages: + msg115660 |
| 2010-09-03 21:47:12 | craigds | set | messages: + msg115514 |
| 2010-09-03 16:53:46 | eric.araujo | set | nosy: + eric.araujo, alanmcintyre; messages: + msg115466; versions: - Python 2.6, Python 2.5, Python 3.3 |
| 2010-08-31 01:02:17 | craigds | create | |