This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012年04月27日 20:49 by Sam.Rushing, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| zlib_set_dictionary.patch | Sam.Rushing, 2012年04月27日 20:49 | adds inflateSetDictionary(), deflateSetDictionary() | review | |
| zlib_set_dictionary_2.patch | Sam.Rushing, 2012年05月03日 23:48 | set dictionary in constructors | review | |
| tz2.py | Sam.Rushing, 2012年05月03日 23:50 | verify correct stream behavior with dictionary | ||
| zlib_set_dictionary_3.patch | Sam.Rushing, 2012年05月04日 00:51 | review | ||
| zlib_set_dictionary_4.patch | Sam.Rushing, 2012年05月05日 19:33 | review | ||
| Messages (23) | |||
|---|---|---|---|
| msg159492 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年04月27日 20:49 | |
Google's SPDY protocol requires the use of a pre-defined compression dictionary. The current zlib module doesn't expose the two functions for setting the dictionary.
This patch is minimal in the sense that it only exposes the two functions, but unfortunately the sequence of zlib calls required is clumsy: a call to inflate() must fail first (with an error of Z_NEED_DICT):
import zlib
zdict = b"thequickbrownfoxjumped\x00"
c = zlib.compressobj()
c.set_dictionary (zdict)
cd = c.compress (b"the quick brown fox jumped over the candlestick")
cd += c.flush()
d = zlib.decompressobj()
try:
print (d.decompress (cd))
except zlib.error as what:
if what.args[0].startswith ('Error 2 '):
d.set_dictionary (zdict)
print (d.flush())
Obviously a better way to catch/match Z_NEED_DICT would be nice.
A much nicer solution would allow you to associate the dictionary at creation time, rather than having to wait for an error condition. I can only guess that the zlib authors designed it that way for a reason?
|
|||
| msg159870 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年05月03日 19:31 | |
A dictionary could be provided an init time. Then, the Z_NEED_DICT could be intercepted in the binding and automatically inject the dictionary provided in the init. Anyway, for a patch to be approved, we need a test too. PS: Why is this NOT targeted to 3.3?. We have time, yet. |
|||
| msg159886 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年05月03日 23:04 | |
I'm currently reworking this so that the dictionaries are provided in the constructor, and inflateSetDictionary() is called automatically. I've gone over the zlib RFC's and zlibmodule.c, and I'm fairly certain that whatever usage mode might involve multiple calls to SetDictionary() couldn't be supported by the zlib object anyway. r.e. 3.3/3.4 - I merely chose the highest version number I saw, since this diff is against HEAD (as recommended by the docs). It's been quite a few years since I've submitted a patch to CPython! |
|||
| msg159887 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年05月03日 23:38 | |
Retargetting to python 3.3. If you hurry a bit and I find your patch acceptable (remember the tests!), I will try to integrate it. |
|||
| msg159888 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年05月03日 23:48 | |
Ok, here's the patch. It has a single short test. For use with SPDY, it's necessary to test that the following stream data also correctly decompresses, I'll attach that to the next comment. |
|||
| msg159889 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年05月03日 23:50 | |
This test is rather large, since it includes the predefined SPDY draft 2 dictionary, and some real-world data. Not sure what the policy is on including so much data in a test. If there's enough time I could make a smaller test that also verifies the correct behavior on a stream... |
|||
| msg159890 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年05月04日 00:00 | |
Argh, probably need to add the 'dict' field to the copy() method. |
|||
| msg159893 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年05月04日 00:50 | |
Updated version of the patch: extends the test, including a test of the streaming behavior needed for SPDY (both compression and decompression). Also wik: copy()/uncopy() are aware of the 'dict' attribute. |
|||
| msg159897 - (view) | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2012年05月04日 03:33 | |
Added a few comments on Rietveld. |
|||
| msg160004 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012年05月05日 16:43 | |
I've posted a review on Rietveld. |
|||
| msg160028 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年05月05日 19:33 | |
renames dict->zdict, splits the test, adds BEGIN/END around inflate call. |
|||
| msg161082 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年05月18日 19:42 | |
Status of this feature?. Ready to integrate? |
|||
| msg161093 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012年05月19日 06:08 | |
The code should be changed to use the buffer API (instead of accepting only bytes objects). Other than that, I think it's ready for integration. |
|||
| msg163118 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年06月18日 20:21 | |
Sam, the window for Python 3.3 integration is almost close. Could you possibly update your patch with Nadeem's feedback?. |
|||
| msg163130 - (view) | Author: Sam Rushing (Sam.Rushing) | Date: 2012年06月19日 01:00 | |
I think other than the disagreement about whether the dictionary constructor arg should be a buffer object, it's good to go. To restate my position: the need is for an immutable string of bytes, and that's exactly what PyBytes_Type is for. I see no advantage to allowing a buffer object, which will require extra code to check that it is both a buffer object and set to be readonly. I believe the argument for aesthetics does not apply, as the constant dictionary constructor argument is a morally different kind of parameter, comparable to (say) the compression level. You folks are of course welcome to change it, though. 8^) |
|||
| msg163147 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012年06月19日 08:57 | |
> To restate my position: the need is for an immutable string of bytes, [...]
I disagree that we should require the dictionary to be immutable - if the
caller wishes to use a mutable buffer here, it is their responsibility to
ensure that it is not modified until the compressor is finished with it
("consenting adults" and all that). The documentation can inform users of
this requirement.
> I believe the argument for aesthetics does not apply, as the constant
> dictionary constructor argument is a morally different kind of
> parameter, comparable to (say) the compression level.
Even so, the surrounding code sets a precedent for how it accepts binary
data buffers, and deviating from this existing convention should not be
taken lightly.
Nitpicking about the API aside, thanks for the patch :-)
|
|||
| msg163199 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年06月19日 18:21 | |
So my question is easy: could we apply this patch as is and defer any "improvement" to 3.4?. The risk of not doing so would be to miss 3.3 completely. |
|||
| msg163228 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012年06月19日 22:28 | |
I plan to commit it (along with the buffer API changes) tomorrow. |
|||
| msg163298 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012年06月21日 00:20 | |
New changeset dd4f7d5c51c7 by Nadeem Vawda in branch 'default': Issue #14684: Add support for predefined compression dictionaries to the zlib module. http://hg.python.org/cpython/rev/dd4f7d5c51c7 |
|||
| msg163300 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012年06月21日 00:27 | |
Committed. Once again, thanks for the patch! |
|||
| msg163339 - (view) | Author: Jim Jewett (Jim.Jewett) * (Python triager) | Date: 2012年06月21日 15:24 | |
Just saw this on the checkins list; where are the other options documented? """ PyDoc_STRVAR(compressobj__doc__, -"compressobj([level]) -- Return a compressor object.\n" +"compressobj([level[, method[, wbits[, memlevel[, strategy[, zdict]]]]]])\n" +" -- Return a compressor object.\n" "\n" -"Optional arg level is the compression level, in 1-9."); +"Optional arg level is the compression level, in 1-9.\n" +"\n" +"Optional arg zdict is the predefined compression dictionary - a sequence of\n" +"bytes containing subsequences that are likely to occur in the input data."); """ I'm honestly not certain what they should be, but the following is my best guess: """ PyDoc_STRVAR(compressobj__doc__, "compressobj([level[, method[, wbits[, memlevel[, strategy[, zdict]]]]]])\n" " -- Return a compressor object.\n" "\n" -"Optional arg level is the compression level, in 1-9.\n" +"Optional arg level (1-9) is the compression level.\n" +"Larger numbers take longer, but produce smaller results.\n" "\n" +"Optional arg method is the compression method.\n" +"The only currently supported method is zlib.DEFLATED.\n" +"\n" +"Optional arg wbits determines the window buffer size.\n" +"Normal values are 8 (least memory) to 15 (best compression).\n" +"\n" +"Optional arg memlevel (1-9) controls working memory size.\n" +"Larger numbers use more memory, but produce smaller results more quickly.\n" +"\n" +"Optional arg strategy tunes the compression algorithm.\n" +"Supported options include zlib.Z_DEFAULT_STRATEGY, zlib.Z_FILTERED, and zlib.Z_HUFFMAN_ONLY.\n" +"\n" +"Optional arg zdict is the predefined compression dictionary - a sequence of\n" +"bytes containing subsequences that are likely to occur in the input data."); """ |
|||
| msg163341 - (view) | Author: Nadeem Vawda (nadeem.vawda) * (Python committer) | Date: 2012年06月21日 15:36 | |
> Just saw this on the checkins list; where are the other options documented? They aren't, AFAIK. I've been planning on adding them when I've got time (based on the zlib manual at http://zlib.net/manual.html), but with the upcoming feature freeze for 3.3, this issue was higher priority. |
|||
| msg163380 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012年06月21日 23:51 | |
New changeset 1cfa44cb5af0 by Nadeem Vawda in branch 'default': Document the rest of zlib.compressobj()'s arguments. http://hg.python.org/cpython/rev/1cfa44cb5af0 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:29 | admin | set | github: 58889 |
| 2012年06月21日 23:51:41 | python-dev | set | messages: + msg163380 |
| 2012年06月21日 15:36:20 | nadeem.vawda | set | messages: + msg163341 |
| 2012年06月21日 15:24:40 | Jim.Jewett | set | nosy:
+ Jim.Jewett messages: + msg163339 |
| 2012年06月21日 00:27:01 | nadeem.vawda | set | status: open -> closed resolution: fixed messages: + msg163300 stage: patch review -> resolved |
| 2012年06月21日 00:20:56 | python-dev | set | nosy:
+ python-dev messages: + msg163298 |
| 2012年06月19日 22:28:19 | nadeem.vawda | set | messages: + msg163228 |
| 2012年06月19日 18:21:05 | jcea | set | messages: + msg163199 |
| 2012年06月19日 08:57:15 | nadeem.vawda | set | messages: + msg163147 |
| 2012年06月19日 01:01:00 | Sam.Rushing | set | messages: + msg163130 |
| 2012年06月18日 20:21:36 | jcea | set | messages: + msg163118 |
| 2012年05月19日 06:08:09 | nadeem.vawda | set | messages: + msg161093 |
| 2012年05月18日 19:58:31 | eric.araujo | set | keywords:
+ needs review stage: patch review |
| 2012年05月18日 19:42:30 | jcea | set | messages: + msg161082 |
| 2012年05月05日 19:33:31 | Sam.Rushing | set | files:
+ zlib_set_dictionary_4.patch messages: + msg160028 |
| 2012年05月05日 16:43:06 | nadeem.vawda | set | messages: + msg160004 |
| 2012年05月05日 10:01:11 | nadeem.vawda | set | nosy:
+ nadeem.vawda |
| 2012年05月04日 03:33:02 | eric.araujo | set | nosy:
+ eric.araujo messages: + msg159897 |
| 2012年05月04日 00:51:05 | Sam.Rushing | set | files: + zlib_set_dictionary_3.patch |
| 2012年05月04日 00:50:27 | Sam.Rushing | set | messages: + msg159893 |
| 2012年05月04日 00:00:22 | Sam.Rushing | set | messages: + msg159890 |
| 2012年05月03日 23:50:04 | Sam.Rushing | set | files:
+ tz2.py messages: + msg159889 |
| 2012年05月03日 23:48:08 | Sam.Rushing | set | files:
+ zlib_set_dictionary_2.patch messages: + msg159888 |
| 2012年05月03日 23:39:10 | jcea | set | assignee: roger.serwy -> nosy: - roger.serwy |
| 2012年05月03日 23:38:24 | jcea | set | versions:
+ Python 3.3, - Python 3.4 nosy: + roger.serwy messages: + msg159887 assignee: roger.serwy |
| 2012年05月03日 23:04:16 | Sam.Rushing | set | messages: + msg159886 |
| 2012年05月03日 19:31:39 | jcea | set | nosy:
+ jcea messages: + msg159870 |
| 2012年04月27日 20:49:09 | Sam.Rushing | create | |