Created on 2009-04-02 18:53 by collinwinter, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| cpickle_dict.patch | collinwinter, 2009-04-02 18:53 | Patch against trunk, r71058 |
| pickle_batch_dict_exact_py3k-5.diff | alexandre.vassalotti, 2009-04-03 14:42 | |
| Messages (20) | |||
|---|---|---|---|
| msg85239 - (view) | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2009-04-02 18:53 | |
The attached patch adds another version of cPickle.c's batch_dict(), batch_dict_exact(), which is specialized for "type(x) is dict". This provides a nice performance boost when pickling objects that use dictionaries:
Pickle:
Min: 2.216 -> 1.858: 19.24% faster
Avg: 2.238 -> 1.889: 18.50% faster
Significant (t=106.874099, a=0.95)
Benchmark is at http://code.google.com/p/unladen-swallow/source/browse/tests/performance/macro_pickle.py (driver is ../perf.py; perf.py was run with "--rigorous -b pickle").
This patch passes all the tests added in issue 5665. I would recommend reviewing that patch first. I'll port to py3k once this is reviewed for trunk. |
|||
| msg85245 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-04-02 19:14 | |
Without taking a very detailed look, the patch looks good. Are there already tests for pickling of dict subclasses? Otherwise, they should be added. |
|||
| msg85248 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-04-02 19:20 | |
By the way, could the same approach be applied to lists and sets as well? |
|||
| msg85253 - (view) | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2009-04-02 19:39 | |
On Thu, Apr 2, 2009 at 12:20 PM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> By the way, could the same approach be applied to lists and sets as well?
Certainly; see http://bugs.python.org/issue5671 for the list version. It doesn't make as big an impact on the benchmark, though. |
|||
| msg85257 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-04-02 19:44 | |
> Certainly; see http://bugs.python.org/issue5671 for the list version.
> It doesn't make as big an impact on the benchmark, though.
How about splitting the benchmark in parts:
- (un)pickling lists
- (un)pickling dicts
- (un)pickling sets
(etc.) |
|||
| msg85272 - (view) | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2009-04-02 22:10 | |
Antoine: pickletester.py:test_newobj_generic() appears to test dict subclasses, though in a roundabout-ish way. I don't know of any tests for dict subclasses in the C level sense (ie, PyDict_Check() vs PyDict_CheckExact()). I can add more explicit tests for Python-level dict subclasses, if you want. |
|||
| msg85276 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2009-04-02 22:56 | |
The patch produces different output for an empty dict: a sequence "MARK SETITEMS" is written, which is useless and wastes 2 bytes. |
|||
| msg85277 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-04-02 22:58 | |
> Antoine: pickletester.py:test_newobj_generic() appears to test dict
> subclasses, though in a roundabout-ish way. I don't know of any tests
> for dict subclasses in the C level sense (ie, PyDict_Check() vs
> PyDict_CheckExact()). I can add more explicit tests for Python-level
> dict subclasses, if you want.
Well, Python-level dict subclasses are also C-level subclasses (in the PyDict_Check() sense), or am I mistaken? |
|||
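The distinction between the two C-level checks mentioned above can be illustrated from Python. This is only a rough analogy, not code from the patch: `MyDict`, `check` and `check_exact` are hypothetical names, with `isinstance()` standing in for PyDict_Check() and an identity test on `type()` standing in for PyDict_CheckExact().

```python
class MyDict(dict):
    """A Python-level dict subclass (hypothetical, for illustration only)."""


def check(obj):
    # Rough analogue of PyDict_Check(): true for dict and any subclass of it.
    return isinstance(obj, dict)


def check_exact(obj):
    # Rough analogue of PyDict_CheckExact(): true only when type(obj) is dict.
    return type(obj) is dict


plain = {"a": 1}
sub = MyDict(a=1)

# Both objects pass the subclass-aware check, but only the plain dict
# passes the exact-type check that a specialized fast path would rely on.
print(check(plain), check_exact(plain))
print(check(sub), check_exact(sub))
```

Under this analogy, a fast path guarded by the exact check would never see `MyDict` instances, so those would keep going through the generic code path.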
| msg85293 - (view) | Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) | Date: 2009-04-03 05:20 | |
I ported the patch to py3k. In addition, I added a special-case when the dict contains only one item; you probably want this special-case in the trunk version as well. |
|||
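The batching behaviour under discussion, including the one-item special case, can be modelled in pure Python. This is a minimal sketch and not the C code from either patch: `batch_setitems_ops` is a hypothetical helper that returns symbolic opcode names instead of writing pickle bytes, and the batch size of 1000 is an assumption chosen for illustration.

```python
from itertools import islice

BATCHSIZE = 1000  # assumed batch size for this sketch


def batch_setitems_ops(d, batchsize=BATCHSIZE):
    """Return the symbolic opcodes a pickler could emit for the items of d."""
    ops = []
    it = iter(d.items())
    while True:
        batch = list(islice(it, batchsize))
        n = len(batch)
        if n > 1:
            # Several items: bracket them with MARK ... SETITEMS.
            ops.append("MARK")
            for key, value in batch:
                ops.append(("ITEM", key, value))
            ops.append("SETITEMS")
        elif n == 1:
            # Exactly one item: a lone SETITEM avoids the extra MARK opcode.
            key, value = batch[0]
            ops.append(("ITEM", key, value))
            ops.append("SETITEM")
        # n == 0: nothing is emitted, so an empty dict adds no opcodes here.
        if n < batchsize:
            return ops


print(batch_setitems_ops({}))
print(batch_setitems_ops({"a": 1}))
print(batch_setitems_ops({"a": 1, "b": 2}))
```

In this model a batch of exactly one item costs one opcode (SETITEM) instead of two (MARK plus SETITEMS), which is the size saving the special case is after.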
| msg85294 - (view) | Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) | Date: 2009-04-03 05:23 | |
Oops, I forgot to add the comment on top of batch_dict_exact in the patch. Here is a better patch. |
|||
| msg85296 - (view) | Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) | Date: 2009-04-03 05:51 | |
Oops again, I just noticed that the comment for batch_dict_exact refers to batch_dict as being above, but I copied batch_dict_exact before batch_dict. Here's a good patch (hopefully) that puts batch_dict_exact at the right place. |
|||
| msg85306 - (view) | Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) | Date: 2009-04-03 14:37 | |
Silly me, I had replaced the PyDict_Size call in the outer loop with Py_SIZE and this is of course totally wrong. Here's a good patch (I am pretty sure now! ;-) I ran the whole test suite and I saw no failures. Collin, you can go ahead and commit both patches. Nice work! |
|||
| msg85307 - (view) | Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) | Date: 2009-04-03 14:42 | |
Sigh... silly me again. There is some other junk in my last patch. |
|||
| msg85333 - (view) | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2009-04-03 21:22 | |
FYI, I just added a pickle_dict microbenchmark to perf.py. Using this new microbenchmark, I see these results (perf.py -r -b pickle_dict):
pickle_dict:
Min: 2.092 -> 1.341: 56.04% faster
Avg: 2.126 -> 1.360: 56.37% faster
Significant (t=216.895643, a=0.95)
I still need to address the comment about pickling empty dicts. |
|||
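A dict-focused microbenchmark along these lines can be sketched with the standard `timeit` module. This is not the pickle_dict benchmark from perf.py; `time_pickle_dicts` and its parameters are hypothetical, it uses the Python 3 `pickle` module rather than cPickle, and the workload shape (a list of small dicts with string keys) is an assumption.

```python
import pickle
import timeit


def time_pickle_dicts(num_dicts=100, dict_size=50, repeats=5, number=20):
    """Return the best time in seconds for pickling a list of plain dicts."""
    # Build the workload once, outside the timed region.
    data = [{str(i): i for i in range(dict_size)} for _ in range(num_dicts)]
    timer = timeit.Timer(lambda: pickle.dumps(data, protocol=2))
    # Each measurement runs the statement `number` times; keep the minimum
    # of the repeated measurements to reduce noise from other processes.
    return min(timer.repeat(repeat=repeats, number=number))


print(f"best run: {time_pickle_dicts():.4f} s")
```

Taking the minimum of several repeats mirrors the "Min" figure in the results above; comparing two interpreter builds would mean running the same script under each build.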
| msg85335 - (view) | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2009-04-03 21:48 | |
Amaury, I can't reproduce the issue you're seeing with empty dicts.
Here's what I'm doing:
dhcp-172-19-19-199:trunk collinwinter$ ./python.exe
Python 2.7a0 (trunk:71100M, Apr 3 2009, 14:40:49)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cPickle, pickletools
>>> data = cPickle.dumps({}, protocol=2)
>>> pickletools.dis(data)
0: \x80 PROTO 2
2: } EMPTY_DICT
3: . STOP
highest protocol among opcodes = 2
>>> data
'\x80\x02}.'
>>>
What are you doing to produce the MARK SETITEMS sequence?
|
|||
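The same kind of check can be scripted rather than read off the disassembly. The sketch below assumes the Python 3 `pickle` module instead of cPickle, and `opcode_names` is a hypothetical helper; it lists the opcode names in a pickle so the presence or absence of MARK, SETITEM and SETITEMS can be tested directly.

```python
import pickle
import pickletools


def opcode_names(obj, protocol=2):
    """Return the names of the opcodes used to pickle obj."""
    data = pickle.dumps(obj, protocol=protocol)
    # genops() yields (opcode_info, argument, position) triples.
    return [op.name for op, arg, pos in pickletools.genops(data)]


empty = opcode_names({})
one = opcode_names({"a": 1})
two = opcode_names({"a": 1, "b": 2})

# An empty dict should need neither MARK nor any SETITEM(S) opcode.
print("MARK" in empty, "SETITEMS" in empty)
# One item versus several items should differ in which opcode stores them.
print("SETITEM" in one, "SETITEMS" in two)
```

Comparing opcode names rather than raw bytes keeps the check independent of memoization opcodes, which can differ between pickler implementations.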
| msg85433 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2009-04-04 21:56 | |
Sorry, I was wrong. I think I noticed that the case size==1 was handled differently, and incorrectly inferred the same for size==0. (btw, the patch for trunk was not updated) |
|||
| msg86188 - (view) | Author: Kelvin Liang (feisan) | Date: 2009-04-20 03:45 | |
Can this patch be used or ported to 2.5.x? |
|||
| msg86194 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-04-20 11:03 | |
Sorry, it won't even be integrated in 2.6 actually. It's a new feature, not a bug fix. |
|||
| msg88303 - (view) | Author: Collin Winter (collinwinter) * (Python committer) | Date: 2009-05-25 05:44 | |
Fixed the len(d) == 1 size regression. Final performance of the patch relative to trunk:
Using Unladen Swallow's perf.py -b pickle,pickle_dict on trunk:
pickle:
Min: 2.238 -> 1.895: 18.08% faster
Avg: 2.241 -> 1.898: 18.04% faster
Significant (t=282.066701, a=0.95)
pickle_dict:
Min: 2.163 -> 1.375: 57.36% faster
Avg: 2.168 -> 1.376: 57.50% faster
Significant (t=527.668441, a=0.95)
Performance for py3k:
pickle:
Min: 2.849 -> 2.790: 2.10% faster
Avg: 2.854 -> 2.796: 2.09% faster
Significant (t=27.624303, a=0.95)
pickle_dict:
Min: 2.121 -> 1.512: 40.27% faster
Avg: 2.128 -> 1.519: 40.13% faster
Significant (t=283.406572, a=0.95)
regrtest.py -uall test_xpickle passes all backwards-compatibility tests for trunk, and all other tests run by regrtest.py on Linux pass.
Committed as r72909 (trunk), r72910 (py3k). |
|||
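The "% faster" figures above appear consistent with the ratio of old time to new time, minus one, expressed as a percentage. That formula is an inference from the reported numbers, not something stated in the thread, and `percent_faster` is a hypothetical helper. Recomputing from the rounded timings gives values close to, but not exactly equal to, the reported ones, presumably because perf.py worked from unrounded timings.

```python
def percent_faster(old_seconds, new_seconds):
    """Speedup as a percentage, assuming the formula (old / new - 1) * 100."""
    return (old_seconds / new_seconds - 1.0) * 100.0


# (label, old time, new time, percentage reported in the message above)
reported = [
    ("trunk pickle Min", 2.238, 1.895, 18.08),
    ("trunk pickle_dict Min", 2.163, 1.375, 57.36),
    ("py3k pickle Min", 2.849, 2.790, 2.10),
    ("py3k pickle_dict Min", 2.121, 1.512, 40.27),
]

for label, old, new, stated in reported:
    recomputed = percent_faster(old, new)
    print(f"{label}: recomputed {recomputed:.2f}%, reported {stated:.2f}%")
```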
| msg88314 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-05-25 09:35 | |
Thanks!
> Committed as r72909 (trunk), r72910 (py3k).
>
> ----------
> resolution: accepted -> fixed
> status: open -> closed |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:47 | admin | set | github: 49920 |
| 2009-05-25 09:35:39 | pitrou | set | messages: + msg88314 |
| 2009-05-25 05:44:08 | collinwinter | set | status: open -> closed resolution: accepted -> fixed messages: + msg88303 |
| 2009-04-20 11:03:49 | pitrou | set | messages: + msg86194 |
| 2009-04-20 03:45:06 | feisan | set | nosy: + feisan messages: + msg86188 |
| 2009-04-04 21:56:30 | amaury.forgeotdarc | set | messages: + msg85433 |
| 2009-04-03 21:48:36 | collinwinter | set | messages: + msg85335 |
| 2009-04-03 21:22:08 | collinwinter | set | messages: + msg85333 |
| 2009-04-03 14:42:29 | alexandre.vassalotti | set | files: - pickle_batch_dict_exact_py3k-4.diff |
| 2009-04-03 14:42:24 | alexandre.vassalotti | set | files: - pickle_batch_dict_exact_py3k-3.diff |
| 2009-04-03 14:42:16 | alexandre.vassalotti | set | files: + pickle_batch_dict_exact_py3k-5.diff messages: + msg85307 |
| 2009-04-03 14:37:45 | alexandre.vassalotti | set | files: + pickle_batch_dict_exact_py3k-4.diff messages: + msg85306 assignee: collinwinter keywords: + patch resolution: accepted stage: commit review |
| 2009-04-03 05:52:05 | alexandre.vassalotti | set | files: - pickle_batch_dict_exact_py3k-2.diff |
| 2009-04-03 05:52:00 | alexandre.vassalotti | set | files: - pickle_batch_dict_exact_py3k.diff |
| 2009-04-03 05:51:51 | alexandre.vassalotti | set | keywords: - patch files: + pickle_batch_dict_exact_py3k-3.diff messages: + msg85296 versions: + Python 3.1 |
| 2009-04-03 05:23:38 | alexandre.vassalotti | set | files: + pickle_batch_dict_exact_py3k-2.diff messages: + msg85294 |
| 2009-04-03 05:21:03 | alexandre.vassalotti | set | files: + pickle_batch_dict_exact_py3k.diff nosy: + alexandre.vassalotti messages: + msg85293 |
| 2009-04-02 22:58:56 | pitrou | set | messages: + msg85277 |
| 2009-04-02 22:56:01 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc messages: + msg85276 |
| 2009-04-02 22:10:22 | collinwinter | set | messages: + msg85272 |
| 2009-04-02 19:44:44 | pitrou | set | messages: + msg85257 |
| 2009-04-02 19:39:47 | collinwinter | set | messages: + msg85253 |
| 2009-04-02 19:20:20 | pitrou | set | messages: + msg85248 |
| 2009-04-02 19:14:36 | pitrou | set | nosy: + pitrou messages: + msg85245 |
| 2009-04-02 18:53:50 | collinwinter | create | |