homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Make pickletools.optimize aware of the MEMOIZE opcode.
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: alexandre.vassalotti, pitrou, python-dev, serhiy.storchaka, tim.peters
Priority: normal Keywords: needs review, patch

Created on 2013年12月02日 01:00 by alexandre.vassalotti, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pickle_optimize_memoize.patch serhiy.storchaka, 2014年11月29日 14:04 review
pickle_optimize_memoize_2.patch serhiy.storchaka, 2014年12月16日 12:48 review
Messages (12)
msg204985 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013年12月02日 01:00
PEP 3154 introduced the MEMOIZE opcode which lowered the overhead of memoization compared to the PUT opcodes which were previously used.
We should update pickletools.optimize to remove superfluous uses of this new opcode.
msg205032 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年12月02日 17:05
I afraid this is not a feature, but a bug. pickletools.optimize() can produce invalid pickled data when *PUT and MEMOIZE opcodes are used.
>>> import pickle, pickletools
>>> p = b'\x80\x04]\x94(\x8c\x04spamq\x01\x8c\x03ham\x94h\x02e.'
>>> pickle.loads(p)
['spam', 'ham', 'ham']
>>> pickletools.dis(p)
 0: \x80 PROTO 4
 2: ] EMPTY_LIST
 3: \x94 MEMOIZE
 4: ( MARK
 5: \x8c SHORT_BINUNICODE 'spam'
 11: q BINPUT 1
 13: \x8c SHORT_BINUNICODE 'ham'
 18: \x94 MEMOIZE
 19: h BINGET 2
 21: e APPENDS (MARK at 4)
 22: . STOP
highest protocol among opcodes = 4
>>> p2 = pickletools.optimize(p)
>>> p2
b'\x80\x04\x95\x13\x00\x00\x00\x00\x00\x00\x00]\x94(\x8c\x04spam\x8c\x03ham\x94h\x02e.'
>>> pickle.loads(p2)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
KeyError: 2
>>> pickletools.dis(p2)
 0: \x80 PROTO 4
 2: \x95 FRAME 19
 11: ] EMPTY_LIST
 12: \x94 MEMOIZE
 13: ( MARK
 14: \x8c SHORT_BINUNICODE 'spam'
 20: \x8c SHORT_BINUNICODE 'ham'
 25: \x94 MEMOIZE
 26: h BINGET 2
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/home/serhiy/py/cpython/Lib/pickletools.py", line 2458, in dis
 raise ValueError(errormsg)
ValueError: memo key 2 has never been stored into
msg205051 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013年12月02日 20:19
Well, that can only happen if MEMOIZE and PUT are both used together, which won't happen with the Pickler classes we support. The easiest thing to do here is to disable pickletools.optimize on protocol 4.
msg205055 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013年12月02日 20:30
By the way, I was curious if there were many users of pickletools.optimize(), so I did a search and most matches seem to be copies of the stdlib source tree:
http://code.ohloh.net/search?s=pickletools%20optimize&p=0&pp=0&fl=Python&mp=1&ml=1&me=1&md=1&ff=1&filterChecked=true 
msg205071 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013年12月03日 03:55
Is it _documented_ that MEMOIZE and PUT can't be used together? If not, it should be documented; and pickletools dis() and optimize() should verify that this restriction is honored in their inputs.
msg205074 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013年12月03日 04:41
MEMOIZE and PUT can be used together. They just need to not step on each other toes when they write to the memo table. As specified by PEP 3154, the memo index used by MEMOIZE is the number of elements currently in the memo table. This obviously means functions like, pickletools.optimize, have to be more careful when they rewrite pickles.
msg205348 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013年12月06日 03:48
Ah, I almost forgot! I did implement the verification in pickletools.dis() for MEMOIZE:
http://hg.python.org/cpython/file/2612ea573ff7/Lib/pickletools.py#l2420 
msg231861 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年11月29日 14:04
Here is a patch which makes pickletools.optimize() to work with the MEMOIZE opcode.
As side effect it also renumbers memoized values, this allows to use short BINPUT and BINGET instead of LONG_BINPUT and LONG_BINGET.
msg232658 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年12月15日 12:34
Ping.
msg232738 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年12月16日 12:48
Here is a patch which addresses Antoine's comments. Also added dis() output for binary example in comments.
msg232748 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014年12月16日 16:06
New changeset c49b7acba06f by Serhiy Storchaka in branch '3.4':
Issue #19858: pickletools.optimize() now aware of the MEMOIZE opcode, can
https://hg.python.org/cpython/rev/c49b7acba06f
New changeset e7dd739b4b4e by Serhiy Storchaka in branch 'default':
Issue #19858: pickletools.optimize() now aware of the MEMOIZE opcode, can
https://hg.python.org/cpython/rev/e7dd739b4b4e 
msg232752 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年12月16日 18:08
Many thanks for all your reviews Antoine!
History
Date User Action Args
2022年04月11日 14:57:54adminsetgithub: 64057
2014年12月16日 18:08:13serhiy.storchakasetstatus: open -> closed
messages: + msg232752

assignee: serhiy.storchaka
resolution: fixed
stage: patch review -> resolved
2014年12月16日 16:06:07python-devsetnosy: + python-dev
messages: + msg232748
2014年12月16日 12:48:47serhiy.storchakasetfiles: + pickle_optimize_memoize_2.patch

messages: + msg232738
2014年12月15日 12:34:53serhiy.storchakasetkeywords: + needs review

messages: + msg232658
2014年11月29日 14:04:26serhiy.storchakasetfiles: + pickle_optimize_memoize.patch
versions: + Python 3.4
messages: + msg231861

keywords: + patch
stage: needs patch -> patch review
2013年12月06日 03:48:47alexandre.vassalottisetmessages: + msg205348
2013年12月03日 04:41:53alexandre.vassalottisetmessages: + msg205074
2013年12月03日 03:55:36tim.peterssetmessages: + msg205071
2013年12月02日 20:30:19pitrousetnosy: + pitrou
messages: + msg205055
2013年12月02日 20:23:41pitrousetnosy: + tim.peters
2013年12月02日 20:19:29alexandre.vassalottisetmessages: + msg205051
2013年12月02日 17:05:40serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg205032
2013年12月02日 01:00:14alexandre.vassalotticreate

AltStyle によって変換されたページ (->オリジナル) /