This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2013-01-20 18:41 by rhettinger, last changed 2022-04-11 14:57 by admin.
| Messages (11) | |||
|---|---|---|---|
| msg180307 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2013-01-20 18:41 | |
Only a little of the existing logic is tied to the zipfile format. Consider adding support for xz, tar, tar.gz, tar.bz2, etc. In particular, xz has better compression, resulting in both space savings and faster load times. |
|||
| msg180310 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-01-20 20:19 | |
tar.* is not a good choice because it doesn't allow random access. Bare tar is better than zip only when you need to store additional file attributes (Unix file access mode, times, owner, group, links). The ZIP format supports all of these too, but the zipfile module does not yet. Adding bz2 or lzma compression to ZIP files shouldn't be too hard. |
|||
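As a rough illustration of the point in msg180310 about random access: the standard `zipfile` module has offered `ZIP_BZIP2` and `ZIP_LZMA` since Python 3.3, and a single member can be read by name without unpacking the rest. The sketch below is only an assumption-laden example; the `SOURCES` dictionary and the member names are hypothetical, and it says nothing about what `zipimport` itself can read.

```python
import io
import zipfile

# Hypothetical in-memory "library"; the names and contents are made up.
SOURCES = {
    "pkg/__init__.py": b"# package marker\n",
    "pkg/mod.py": b"VALUE = 42\n" * 200,
}

# Write every member with LZMA compression into an in-memory ZIP archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_LZMA) as zf:
    for name, data in SOURCES.items():
        zf.writestr(name, data)

# Random access: look up one member through the central directory and
# decompress only that member.
with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("pkg/mod.py")
    member_method = info.compress_type
    member_data = zf.read("pkg/mod.py")
    print(info.file_size, info.compress_size)
```

A tar.xz stream, by contrast, would have to be decompressed from the start to reach a member near the end.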
| msg180311 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-01-20 20:32 | |
Here are some tests.

    time 7z a -tzip -mx=0 python-0.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z a -tzip python.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z a -tzip -mx=9 python-9.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z a -tzip -mm=bzip2 python-bzip2.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z a -tzip -mm=bzip2 -mx=9 python-bzip2-9.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z a -tzip -mm=lzma python-lzma.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z a -tzip -mm=lzma -mx=9 python-lzma-9.zip $(find Lib -type f -name '*.py') >/dev/null
    time 7z t python-0.zip >/dev/null
    time 7z t python.zip >/dev/null
    time 7z t python-9.zip >/dev/null
    time 7z t python-bzip2.zip >/dev/null
    time 7z t python-bzip2-9.zip >/dev/null
    time 7z t python-lzma.zip >/dev/null
    time 7z t python-lzma-9.zip >/dev/null
    wc -c python*.zip

Results:

                   pack*   unpack   size
                   time    time     (MB)
    store          0.5     0.2      19.42
    deflate        6       0.4      4.59
    deflate-max    40      0.4      4.52
    bzip2          6       2.1      4.45
    bzip2-max      79      2.0      4.39
    lzma           37      0.7      4.42
    lzma-max       62      0.7      4.39

*) For pack time I take user time because 7-Zip parallelizes deflate and bzip2 compression well.

As you can see, the size difference between maximal compression with the different methods is only 3%. LZMA decompresses almost twice as slowly as deflate, and bzip2 decompresses 5 times more slowly. Python files are too small to benefit from advanced compression. |
|||
| msg180313 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-01-20 20:54 | |
> Here are some tests.

I think you want to put pyc files in the zip file as well. |
|||
| msg180314 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2013-01-20 21:09 | |
xz will likely be the best win -- it is purported to compress smaller than bz2 while retaining the decompression speed of zip. As Antoine says, the usual practice is to add py, pyc, and pyo files to the compressed library; otherwise, there is an added cost when Python tries to write a missing pyc/pyo file. |
|||
| msg180323 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013-01-20 21:55 | |
Well.

    ./python -m compileall $(find Lib -type f -name '*.py')
    ./python -O -m compileall $(find Lib -type f -name '*.py')

Tests:

    FILES="$(find Lib -name '*.py' -o -name '*.py[co]')"
    time 7z a -tzip -mx=0 python-0.zip $FILES >/dev/null
    time 7z a -tzip python.zip $FILES >/dev/null
    time 7z a -tzip -mx=9 python-9.zip $FILES >/dev/null
    time 7z a -tzip -mm=bzip2 python-bzip2.zip $FILES >/dev/null
    time 7z a -tzip -mm=bzip2 -mx=9 python-bzip2-9.zip $FILES >/dev/null
    time 7z a -tzip -mm=lzma python-lzma.zip $FILES >/dev/null
    time 7z a -tzip -mm=lzma -mx=9 python-lzma-9.zip $FILES >/dev/null
    time 7z t python-0.zip >/dev/null
    time 7z t python.zip >/dev/null
    time 7z t python-9.zip >/dev/null
    time 7z t python-bzip2.zip >/dev/null
    time 7z t python-bzip2-9.zip >/dev/null
    time 7z t python-lzma.zip >/dev/null
    time 7z t python-lzma-9.zip >/dev/null
    wc -c python*.zip

Results:

                   pack    unpack   size
                   time    time     (MB)
    store          1.6     0.5      65.4
    deflate        19      0.9      17.5
    deflate-max    134     0.9      17.2
    bzip2          21      4.2      16.5
    bzip2-max      294     4.1      16.3
    lzma           120     2.3      15.9
    lzma-max       204     2.3      15.8

All numbers are about 3x larger. The lzma-max archive is 8% smaller than deflate-max but unpacks 2.5 times slower. Bzip2 is out of the game. |
|||
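The 7z measurements in msg180311 and msg180323 can be approximated without external tools. The following is a minimal sketch using only the standard `zipfile` module; the `measure` helper, the `METHODS` mapping, and the synthetic `sample` data are all hypothetical, and timings on such a small in-memory sample will not match the figures reported above.

```python
import io
import time
import zipfile

# Compression methods roughly matching the rows of the tables above.
METHODS = {
    "store": zipfile.ZIP_STORED,
    "deflate": zipfile.ZIP_DEFLATED,
    "bzip2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}

def measure(files, compression):
    """Return (pack_seconds, unpack_seconds, archive_bytes) for one method."""
    buf = io.BytesIO()
    start = time.perf_counter()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    pack_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with zipfile.ZipFile(buf) as zf:
        for name in zf.namelist():
            zf.read(name)  # decompress each member, like "7z t"
    unpack_seconds = time.perf_counter() - start
    return pack_seconds, unpack_seconds, len(buf.getvalue())

# Synthetic stand-in for Lib/*.py; real measurements should use real files.
sample = {
    f"mod{i}.py": (f"def f{i}(x):\n    return x + {i}\n" * 50).encode()
    for i in range(20)
}

results = {label: measure(sample, method) for label, method in METHODS.items()}
for label, (pack_s, unpack_s, size) in results.items():
    print(f"{label:8} pack={pack_s:.4f}s unpack={unpack_s:.4f}s size={size}")
```

To compare against the thread's numbers, `sample` would need to be replaced with the actual contents of `Lib`, including the compiled files.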
| msg180324 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-01-20 21:58 | |
Agreed it doesn't look very promising. |
|||
| msg180347 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2013-01-21 18:00 | |
So this seems like a confluence of two things: supporting compressed files for loading source code, and supporting new archive formats (e.g. xz vs. tar); zip just happens to do both implicitly. There is also the question of whether you plan to do this in C code or in pure Python, as I plan to introduce a pure Python version of zipimport into importlib for 3.4 so that it can use zipfile directly and thus support everything zipfile can do. And there doesn't have to be any performance cost from trying to write bytecode files; it's very simple to have a loader which skips that step entirely. |
|||
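One way to read the last remark in msg180347 is a source loader whose bytecode-writing hook does nothing. The sketch below assumes current `importlib` behaviour, where `SourceFileLoader` writes `.pyc` files through `set_data()`; the class name `NoBytecodeWriteLoader` and the `demo_mod` module are hypothetical, and this is not necessarily the loader Brett had in mind.

```python
import importlib.machinery
import importlib.util
import os
import tempfile

class NoBytecodeWriteLoader(importlib.machinery.SourceFileLoader):
    """Hypothetical loader: compiles source as usual, never writes a .pyc."""

    def set_data(self, path, data, **kwargs):
        # SourceFileLoader caches bytecode by calling set_data();
        # doing nothing here skips that step entirely.
        pass

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "demo_mod.py")
    with open(source_path, "w") as f:
        f.write("ANSWER = 42\n")

    loader = NoBytecodeWriteLoader("demo_mod", source_path)
    spec = importlib.util.spec_from_file_location(
        "demo_mod", source_path, loader=loader
    )
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)

    answer = module.ANSWER
    # No __pycache__ directory should have been created next to the source.
    cache_exists = os.path.exists(os.path.join(tmp, "__pycache__"))
```

Setting `sys.dont_write_bytecode = True` has a similar effect process-wide; the subclass limits the change to modules handled by this loader.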
| msg220589 - (view) | Author: Eric Snow (eric.snow) * (Python committer) | Date: 2014-06-14 22:19 | |
related: issue #17630 and issue #5950 |
|||
| msg267527 - (view) | Author: (yan12125) * | Date: 2016-06-06 12:58 | |
+1 for that. I would like XZ support so that our application size can be reduced. |
|||
| msg325729 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2018-09-19 07:53 | |
zipimport has been rewritten in pure Python (issue25711). Now it is easier to add support for other compression methods, although I don't think that reducing the size by 3-8% is worth complicating the code. If you still need this, I think the simplest way is to import the zipfile module and monkey patch the simple ZIP file implementation in the zipimport module with a zipfile-based implementation. This can be done only after importing zipfile itself, i.e. when zipping the stdlib, the zipfile module and its dependencies should be stored uncompressed or with deflate compression. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:40 | admin | set | github: 61206 |
| 2022-04-06 03:00:34 | yan12125 | set | nosy: - yan12125 |
| 2022-04-05 16:53:26 | christian.heimes | set | versions: + Python 3.11, - Python 3.8 |
| 2020-03-06 20:01:35 | brett.cannon | set | nosy: - brett.cannon |
| 2018-09-19 07:53:50 | serhiy.storchaka | set | messages: + msg325729; versions: + Python 3.8, - Python 3.6 |
| 2016-06-06 12:58:29 | yan12125 | set | nosy: + yan12125; messages: + msg267527 |
| 2015-08-05 15:58:39 | eric.snow | set | nosy: + gregory.p.smith, superluser; versions: + Python 3.6, - Python 3.4 |
| 2014-06-14 22:19:35 | eric.snow | set | nosy: + eric.snow; messages: + msg220589 |
| 2014-06-14 08:47:51 | serhiy.storchaka | link | issue21751 superseder |
| 2013-01-21 18:00:53 | brett.cannon | set | nosy: + brett.cannon; messages: + msg180347 |
| 2013-01-20 21:58:08 | pitrou | set | messages: + msg180324 |
| 2013-01-20 21:55:39 | serhiy.storchaka | set | messages: + msg180323 |
| 2013-01-20 21:09:12 | rhettinger | set | messages: + msg180314 |
| 2013-01-20 20:54:26 | pitrou | set | nosy: + pitrou; messages: + msg180313 |
| 2013-01-20 20:32:22 | serhiy.storchaka | set | messages: + msg180311 |
| 2013-01-20 20:19:58 | serhiy.storchaka | set | nosy: + serhiy.storchaka, nadeem.vawda; messages: + msg180310; stage: needs patch |
| 2013-01-20 18:45:44 | brian.curtin | set | nosy: + brian.curtin; components: + Library (Lib) |
| 2013-01-20 18:41:42 | rhettinger | create | |