homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author eltoder
Recipients ajaksu2, alex, alexandre.vassalotti, bcroq, belopolsky, ddorfman, eltoder, mstefanro, pitrou, rhettinger, zzzeek
Date 2012年12月29日.00:44:16
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1356741857.38.0.685467881353.issue1062277@psf.upfronthosting.co.za>
In-reply-to
Content
To recap, the issue is that pickle doesn't handle recursion via reduce arguments (i.e. arguments to the constructor function as returned in 2nd element of the tuple from __reduce__). This leads to 2 kind of effects:
class C:
 def __init__(self, x=None):
 self.x = x if x is not None else self
 def __reduce__(self):
 return C, (self.x,)
A. Recursion error:
>>> pickle.dumps(C())
Traceback (most recent call last):
 File "<pyshell#5>", line 1, in <module>
 pickle.dumps(C())
RuntimeError: maximum recursion depth exceeded while calling a Python object
This cannot be helped with the current reduce protocol. The error may be improved, but that's about it.
B. Duplication of object when unpickling:
>>> c = C([])
>>> c.x.append(c)
>>> c.x[0] is c
True
>>> c2 = pickle.loads(pickle.dumps(c))
>>> c2.x[0] is c2
False
This happens because list (or another recursion-friendly type) inside the problematic object handles recursion, but we still get the outer object, identical to the inner one.
This can be solved the same way as for tuple:
>>> t=([],1,2)
>>> t[0].append(t)
>>> t2 = pickle.loads(pickle.dumps(t))
>>> t2[0][0] is t2
True
>>> pickletools.dis(pickle.dumps(t))
 0: \x80 PROTO 3
 2: ] EMPTY_LIST
 3: q BINPUT 0
 5: h BINGET 0
 7: K BININT1 1
 9: K BININT1 2
 11: \x87 TUPLE3
 12: q BINPUT 1
 14: a APPEND
 15: K BININT1 1
 17: K BININT1 2
 19: 0 POP
 20: 0 POP
 21: 0 POP
 22: h BINGET 1
 24: . STOP
After pickling its elements tuple checks if it got into memo. If it did, this means it was pickled by one of the elements, so it POPs all elements from the stack and fetches itself via GET. This is somewhat inefficient, but probably the best it can do.
I suggest we do 3 things:
1. Improve the documentation for __reduce__ function. It should mention that all state that a) can potentially point back to the object and b) not strictly necessary in the constructor function should be passed via the 3rd element of __reduce__ tuple (aka state) instead of the 2nd element, and applied by __setstate__. This handles recursion in robust and optimal way.
2. Check that all built-in/standard types follow this advice. I see that Stefan Mihaila already fixed sets.
3. To fix case B above add the memo check from save_tuple to save_reduce. While at it, we can consider checking after pickling every element instead of after pickling all elements, so we reduce the number of POPs and the wastage they cause.
History
Date User Action Args
2012年12月29日 00:44:17eltodersetrecipients: + eltoder, rhettinger, belopolsky, ddorfman, pitrou, ajaksu2, alexandre.vassalotti, alex, zzzeek, bcroq, mstefanro
2012年12月29日 00:44:17eltodersetmessageid: <1356741857.38.0.685467881353.issue1062277@psf.upfronthosting.co.za>
2012年12月29日 00:44:17eltoderlinkissue1062277 messages
2012年12月29日 00:44:16eltodercreate

AltStyle によって変換されたページ (->オリジナル) /