homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Integrate pickle protocol version 4 GSoC work by Stefan Mihaila
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: out of date
Dependencies: Superseder: Implement PEP 3154 (pickle protocol 4)
View: 17810
Assigned To: alexandre.vassalotti Nosy List: alexandre.vassalotti, asvetlov, jcea, mstefanro, pitrou, serhiy.storchaka
Priority: normal Keywords: gsoc, patch

Created on 2012年08月13日 20:48 by alexandre.vassalotti, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pickle4.diff alexandre.vassalotti, 2012年08月14日 02:42 review
pickle4-2.diff alexandre.vassalotti, 2012年08月17日 22:40 review
d0c3a8d4947a.diff mstefanro, 2013年05月11日 11:11 review
Repositories containing patches
https://bitbucket.org/mstefanro/pickle4
Messages (15)
msg168144 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2012年08月13日 20:48
Stefan Mihaila has been working on the implementation of PEP 3154, plus some other enhancements. His work is pretty complete and ready to be reviewed. I will do my best to finish a thorough review of his changes by the end of next week.
msg168175 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2012年08月14日 05:34
I get warnings when compiling with the patch:
/home/avassalotti/pickle4/Modules/_pickle.c: In function ‘save_global_binary’:
/home/avassalotti/pickle4/Modules/_pickle.c:2952: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c:2956: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c:2960: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c:2975: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c:2979: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c:2988: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c:3012: warning: pointer targets in passing argument 2 of ‘_Pickler_Write’ differ in signedness
/home/avassalotti/pickle4/Modules/_pickle.c:872: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
/home/avassalotti/pickle4/Modules/_pickle.c: In function ‘save_set’:
/home/avassalotti/pickle4/Modules/_pickle.c:3368: warning: suggest parentheses around assignment used as truth value
/home/avassalotti/pickle4/Modules/_pickle.c: In function ‘save_frozenset’:
/home/avassalotti/pickle4/Modules/_pickle.c:3422: warning: suggest parentheses around assignment used as truth value
These should be fixed.
msg168176 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2012年08月14日 05:46
There are reference leaks in the _pickle.c part that will need to be fixed too. 
22:36:29 [~/pickle4]$ ./python -m test.regrtest -R :: test_pickle
[1/1] test_pickle
beginning 9 repetitions
123456789
.........
test_pickle leaked [14780, 14780, 14780, 14780] references, sum=59120
1 test failed:
 test_pickle
[319952 refs]
msg168200 - (view) Author: Stefan Mihaila (mstefanro) * Date: 2012年08月14日 13:43
Maybe we could postpone the review process for a few days
until I fix some known issues
msg168482 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2012年08月17日 22:40
Oops, wrong patch. Uploading the right one.
msg168520 - (view) Author: Stefan Mihaila (mstefanro) * Date: 2012年08月18日 17:14
Maybe you can set this issue as the superseder of issue9269, because the patches there have already been applied here.
msg168581 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年08月19日 16:18
Is this patch stable? Or are there further changes coming?
msg168599 - (view) Author: Stefan Mihaila (mstefanro) * Date: 2012年08月19日 22:23
There are still some upcoming changes.
msg168809 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2012年08月21日 21:53
Some quick thoughts about the new implicit memoization scheme in Stefan's implementation. 
- The new scheme will need to be documented in PEP 3154 before we can accept the change.
- I don't really like the idea of changing the semantics of the PUT and GET opcodes. I would prefer new opcodes if possible.
- I would like to see benchmarks for this change.
msg168894 - (view) Author: Stefan Mihaila (mstefanro) * Date: 2012年08月22日 15:49
>- I don't really like the idea of changing the semantics of the PUT and GET opcodes. I would prefer new opcodes if possible.
Well, the semantics of PUT and GET haven't really changed. It's just that the PUT opcode is not generated anymore and memoization is done "in agreement" (i.e. both the pickler and the unpickler know when to memoize so their memo tables stay in sync). So, in fact, it's the semantics of the other opcodes that has slightly changed.
>- I would like to see benchmarks for this change.
I've tried the following two snippets with timeit:
 ./python3.3 -m timeit \
 -s 'from pickle import dumps' \
 -s 'd=["a"]*100'
 'dumps(d,3)' # replace 3 with 4 for comparison
 ./python3.3 -m timeit \
 -s 'from pickle import dumps' \
 -s 'd=list(map(chr,range(0,256)))' \
 'dumps(d,3)' # replace 3 with 4 for comparison
 # you can also use loads(dumps(d,3)) here to benchmark both 
 # operations at once
The first one generates 99 BINGET opcodes. It generates 1 BINPUT opcode in pickle3 and no BINPUT opcodes in pickle4.
There appears no noticeable speed difference.
The second one generates no BINGET opcodes. It generates no BINPUT opcodes in v4, respectively 256 BINPUT opcodes in v3. It appears the v4 one is slightly faster, but I have a hard time comparing correctly, given that the measurements seem to have a very large standard deviation (v4 gets times somewhere between 32.3 and 44.2, whereas v3 gets times between 37.7 and 52.2).
I'm not sure this is the best way to benchmark, so let me know what is usually used.
msg168897 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年08月22日 16:47
Le mercredi 22 août 2012 à 15:49 +0000, Stefan Mihaila a écrit :
> I'm not sure this is the best way to benchmark, so let me know what is
> usually used.
http://hg.python.org/benchmarks/ has pickling benchmarks, you may have
to modify them to test different pickling protocols.
msg168946 - (view) Author: Stefan Mihaila (mstefanro) * Date: 2012年08月23日 15:31
Are there also some known techniques on tracking down memory leaks?
I've played around with sys.gettotalrefcount to narrow down
the place where the leaks occur, but they seem to only occur in v4,
i.e. pickle.dumps(3.0+1j, 4) leaks but pickle.dumps(3.0+1j, 3) does
not.
However, there appears to be no difference in the code that gets
executed in v3 to the one executed in v4.
msg169934 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年09月06日 18:07
> Are there also some known techniques on tracking down memory leaks?
Nothing more than the usual debugging techniques. It is more of a matter of taste whether you like to add print() (or printf ;-)) calls, or set breakpoints in an actual debugger.
> i.e. pickle.dumps(3.0+1j, 4) leaks but pickle.dumps(3.0+1j, 3) does
not.
Well it looks like you've narrowed things down a bit here.
> However, there appears to be no difference in the code that gets
> executed in v3 to the one executed in v4.
Even the differences in memoization?
msg187497 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013年04月21日 06:56
I have started a new implementation of PEP 3154 since Stefan hasn't been active on his. Moving the discussion to Issue #17810.
msg187565 - (view) Author: Stefan Mihaila (mstefanro) * Date: 2013年04月22日 14:22
Hello. I apologize once again for not finalizing my work, but once I have started my final year of faculty and a job, I have been busy pretty much all the time. I would really like to finish this as I've really enjoyed working on it, and everything on PEP 3154 and some other stuff has already been implemented. The only remaining part was finalizing the code review and fixing some memory leaks that gave me some headaches at the time.
I would really appreciate if you could give me a few more days before deciding to start a new implementation from scratch. I'll get to fixing those memory leaks in the next couple of days and then the code review can be finalized.
Would this be acceptable to you?
History
Date User Action Args
2022年04月11日 14:57:34adminsetgithub: 59847
2013年05月11日 11:12:04mstefanrosetfiles: + d0c3a8d4947a.diff
2013年04月22日 14:22:15mstefanrosetmessages: + msg187565
2013年04月21日 06:56:16alexandre.vassalottisetstatus: open -> closed
superseder: Implement PEP 3154 (pickle protocol 4)
messages: + msg187497

dependencies: - Unbinding of methods
resolution: out of date
stage: patch review -> resolved
2012年11月24日 14:18:01daniel.urbanlinkissue12457 superseder
2012年11月18日 10:07:39serhiy.storchakasetnosy: + serhiy.storchaka
2012年10月04日 02:11:04jceasetnosy: + jcea
2012年09月06日 18:07:02pitrousetmessages: + msg169934
2012年08月23日 15:31:14mstefanrosetmessages: + msg168946
2012年08月22日 16:47:52pitrousetmessages: + msg168897
2012年08月22日 15:49:58mstefanrosetmessages: + msg168894
2012年08月21日 21:53:00alexandre.vassalottisetmessages: + msg168809
2012年08月19日 22:23:08mstefanrosetmessages: + msg168599
2012年08月19日 16:18:43pitrousetmessages: + msg168581
2012年08月18日 17:14:52mstefanrosetmessages: + msg168520
2012年08月17日 22:40:15alexandre.vassalottisetfiles: + pickle4-2.diff

messages: + msg168482
2012年08月17日 22:39:33alexandre.vassalottisetfiles: - pickle4-2.diff
2012年08月17日 21:40:39alexandre.vassalottisetdependencies: + Unbinding of methods
2012年08月17日 21:29:47alexandre.vassalottisetfiles: + pickle4-2.diff
2012年08月17日 16:34:46asvetlovsetnosy: + asvetlov
2012年08月14日 13:43:49mstefanrosetmessages: + msg168200
2012年08月14日 05:46:47alexandre.vassalottisetmessages: + msg168176
2012年08月14日 05:34:45alexandre.vassalottisetmessages: + msg168175
2012年08月14日 02:42:43alexandre.vassalottisetfiles: + pickle4.diff
2012年08月14日 02:42:07alexandre.vassalottisetfiles: - pickle4.diff
2012年08月14日 02:41:10alexandre.vassalottisetfiles: + pickle4.diff
2012年08月13日 20:48:39alexandre.vassalotticreate

AltStyle によって変換されたページ (->オリジナル) /