This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.
Created on 2008-11-06 20:46 by jfrechet, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files

| File name | Uploaded | Description |
|---|---|---|
| gzip-mtime-py3k.patch | jfrechet, 2008-11-06 20:46 | gzip mtime patch (vs branches/py3k) |
| gzip-mtime-2.x.patch | jfrechet, 2008-11-06 20:49 | gzip mtime patch (vs 2.x trunk) |
| gzip-mtime-revised-py3k.patch | jfrechet, 2009-01-02 05:12 | same patch without test_literal_output [py3k] |
| gzip-mtime-revised-2.x.patch | jfrechet, 2009-01-02 05:13 | same patch without test_literal_output [2.x trunk] |
Messages (7)

| msg75580 | Author: Jacques Frechet (jfrechet) | Date: 2008-11-06 20:46 |
The gzip header defined in RFC 1952 includes a mandatory "MTIME" field, originally intended to contain the modification time of the original uncompressed file. It is often ignored when decompressing, though gunzip (for example) uses it to set the modification time of the output file if applicable.

The Python gzip module always sets the MTIME field to the current time, and always discards MTIME when decompressing. As a result, compressing the same string using gzip produces different output every time. For certain applications, especially those involving comparisons or cryptographic signing of binary files, these spurious changes can be quite inconvenient. Aside from the MTIME field, the gzip module already produces entirely deterministic output.

I'm attaching a patch which adds an optional "mtime" argument to the GzipFile class, giving the caller the option of providing a timestamp when compressing. Default behavior is unchanged.

I've included updated documentation and three new test cases in the patch. In order to facilitate testing, the patch also includes code to set the "mtime" member of the GzipFile instance when decompressing. The first test case uses the new member to ensure that the timestamp given to the GzipFile constructor is preserved correctly. The second test checks for specific values in the entire gzip header (not just the MTIME field) by reading the compressed file directly, examining individual fields in a (relatively) flexible way. The third compares the entire compressed stream against a predetermined sequence of bytes in a relatively inflexible way.

All tests pass on my AMD64 box, and I expect them all to pass on all supported platforms without any problems. However, if anybody is concerned that any of the tests sound like they might be too brittle, I'm certainly not overly attached to them.

If anyone has any further suggestions, I'd be delighted to submit a new patch.

Thanks!
Jacques
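As a rough illustration of the behavior described in this message, the sketch below pins MTIME through the `mtime` argument and reads it back through the `mtime` member after decompressing. It is not taken from the attached patch; the helper names `gzip_bytes` and `read_mtime` and the timestamp value are made up for the example, and it assumes a Python version in which the patch has landed.

```python
import gzip
import io


def gzip_bytes(data, mtime=None):
    """Hypothetical helper: gzip `data` in memory, optionally pinning MTIME."""
    buf = io.BytesIO()
    # mtime=None keeps the default behavior (MTIME is the current time).
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


def read_mtime(blob):
    """Hypothetical helper: return (payload, MTIME) from a gzip stream."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb") as f:
        payload = f.read()  # reading parses the header and sets f.mtime
        return payload, f.mtime


payload = b"hello, gzip"
first = gzip_bytes(payload, mtime=123456789)
second = gzip_bytes(payload, mtime=123456789)
```

With the timestamp pinned, `first` and `second` should be byte-for-byte identical on the same interpreter and zlib build, and `read_mtime(first)` should return the original payload together with the timestamp that was passed in.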
| msg75581 | Author: Jacques Frechet (jfrechet) | Date: 2008-11-06 21:21 |
This discussion of the problem and possible workarounds might also be of interest: http://stackoverflow.com/questions/264224/setting-the-gzip-timestamp-from-python
| msg75586 | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) (Python committer) | Date: 2008-11-07 00:19 |
I considered using a datetime.datetime object instead, but it makes more sense to use a time_t number, like os.stat() and time.time().

About the tests on the gzip format details: I am not an expert on the gzip format, but are we sure that the compressed data will always be the same?

Otherwise the patch is fine.
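For callers who do hold a datetime.datetime, converting to and from the numeric timestamp is short. The following is only a sketch under the assumption that the `mtime` argument and member behave as proposed in this issue; the date used and the variable names are arbitrary.

```python
import gzip
import io
from datetime import datetime, timezone

# An arbitrary, timezone-aware moment chosen for the example.
when = datetime(2008, 11, 6, 20, 46, tzinfo=timezone.utc)
stamp = int(when.timestamp())  # seconds since the epoch, like time.time()

buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb", mtime=stamp) as f:
    f.write(b"payload")

with gzip.GzipFile(fileobj=io.BytesIO(buf.getvalue()), mode="rb") as f:
    f.read()  # the header is parsed while reading
    recovered = datetime.fromtimestamp(f.mtime, tz=timezone.utc)
```

MTIME is stored as whole seconds, so any sub-second part of a datetime would be lost in the round trip.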
| msg75588 | Author: Jacques Frechet (jfrechet) | Date: 2008-11-07 01:26 |
I'm no expert either. The output certainly seems to be deterministic for a given version of zlib, and I'm not aware of any prior versions of zlib that produce different compressed output. However, my understanding is that there is more than one possible compressed representation of a given uncompressed input, so it's entirely possible that a past or future version of zlib might produce compressed output that is different while remaining interoperable. I have no idea whether the zlib people care specifically about producing identical compressed output across versions or not. It might be a big deal to them, or they might have other priorities.

I included the third test because I am guessing that the compressed output probably won't change very soon, and that if it does, it might be interesting to know that it changed. If that sounds to you like it might be more trouble than it's worth, then I think the right thing to do would be to simply get rid of the third test and keep the first two.
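One way to sidestep the question of whether the compressed bytes stay stable is to compare what the streams decompress to rather than the streams themselves. The sketch below is only an illustration of that idea, not part of the patch; `pack` and `same_payload` are hypothetical helpers, and two compression levels stand in for two encoders that might legitimately emit different bytes.

```python
import gzip
import io


def pack(data, level):
    """Hypothetical helper: gzip `data` at a given level with MTIME pinned to 0."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=level, mtime=0) as f:
        f.write(data)
    return buf.getvalue()


def same_payload(blob_a, blob_b):
    """Compare two gzip streams by their decompressed content only."""
    return gzip.decompress(blob_a) == gzip.decompress(blob_b)


data = b"abc" * 1000
fast = pack(data, 1)   # favors speed
small = pack(data, 9)  # favors size
```

A check like `same_payload(fast, small)` should keep passing even if the compressed representations differ, which is the property a literal byte comparison cannot offer.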
| msg78679 | Author: Antoine Pitrou (pitrou) (Python committer) | Date: 2009-01-01 02:12 |
test_literal_output looks really too strict to me. At most, you could check that the header and trailer are unchanged, but that would probably make it equivalent to test_metadata. Other than that, I think it's a useful addition.
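A header-and-trailer check of the kind mentioned here could look roughly like the following. This is a sketch based on the member layout in RFC 1952, not the code of test_metadata; the payload and the timestamp 42 are arbitrary, and the FLG, XFL and OS fields are unpacked but deliberately left unchecked because their values depend on the writer.

```python
import gzip
import io
import struct
import zlib

data = b"header and trailer check"
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb", mtime=42) as f:
    f.write(data)
blob = buf.getvalue()

# Fixed 10-byte header: ID1, ID2, CM, FLG, MTIME (little-endian uint32), XFL, OS.
id1, id2, cm, flg, mtime, xfl, os_code = struct.unpack("<BBBBIBB", blob[:10])

# 8-byte trailer: CRC32 of the uncompressed data, then its length modulo 2**32.
crc, isize = struct.unpack("<II", blob[-8:])

expected_crc = zlib.crc32(data) & 0xFFFFFFFF
```

Everything between the header and the trailer is the deflate stream, which is the part whose exact bytes are up to zlib.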
| msg78758 | Author: Jacques Frechet (jfrechet) | Date: 2009-01-02 05:12 |
I am uploading a new patch, identical to the previous patch except that it does not contain the ill-advised third test case (test_literal_output). The patch still applies cleanly and the tests still pass.
| msg79086 | Author: Antoine Pitrou (pitrou) (Python committer) | Date: 2009-01-04 21:39 |
The patches have been committed, thanks!
History

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:56:41 | admin | set | github: 48522 |
| 2009-01-04 21:39:42 | pitrou | set | status: open -> closed; resolution: fixed; messages: + msg79086 |
| 2009-01-02 05:13:52 | jfrechet | set | files: + gzip-mtime-revised-2.x.patch |
| 2009-01-02 05:12:50 | jfrechet | set | files: + gzip-mtime-revised-py3k.patch; messages: + msg78758 |
| 2009-01-01 02:12:56 | pitrou | set | priority: normal; nosy: + pitrou; stage: patch review; messages: + msg78679; versions: + Python 3.1, Python 2.7 |
| 2008-11-07 01:26:34 | jfrechet | set | messages: + msg75588 |
| 2008-11-07 00:19:17 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: + msg75586 |
| 2008-11-06 21:21:44 | jfrechet | set | messages: + msg75581 |
| 2008-11-06 20:49:09 | jfrechet | set | files: + gzip-mtime-2.x.patch |
| 2008-11-06 20:46:09 | jfrechet | create | |