This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2016年05月19日 12:02 by vstinner, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| unpickle_read.patch | vstinner, 2016年05月19日 12:02 | review | ||
| unpickle_read-2.patch | vstinner, 2016年05月20日 09:26 | review | ||
| Messages (9) | |||
|---|---|---|---|
| msg265852 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2016年05月19日 12:02 | |
According to Linux perf, the unpickle_list benchmark (of the CPython benchmark suite) heavily depends on the performance of load() and _Unpickler_Read() functions. While running benchmarks with PGO+LTO compilation, I noticed a difference around 5% because one build uses constant propation on _Unpickler_Read() (fast), wheras another build doesn't (slow). Depending on the result of the PGO, depending on the OS and the compiler, you may or may not get this nice optimization. I propose to implement it manually to not depend on the compiler. Attached short patch implements manually the optimization. It looks to have a big impact on unpickle_list, but no impact (benchmark is not significant) on fastunpickle and pickle_list: --- $ taskset -c 3 python3 perf.py ../ref_default/rel/python ../default/rel/python -b unpickle_list,fastunpickle,pickle_list --rigorous -v (...) Report on Linux smithers 4.4.9-300.fc23.x86_64 #1 SMP Wed May 4 23:56:27 UTC 2016 x86_64 x86_64 Total CPU cores: 4 ### fastunpickle ### Avg: 0.527359 +/- 0.005932 -> 0.518548 +/- 0.00953: 1.02x faster Not significant ### pickle_list ### Avg: 0.269307 +/- 0.017465 -> 0.266015 +/- 0.00423: 1.01x faster Not significant ### unpickle_list ### Avg: 0.255805 +/- 0.006942 -> 0.231444 +/- 0.00394: 1.11x faster Significant (t=21.58) --- It would be interesting to also evaluate the computed goto optimization for the load() function. (And also try computed goto for the re/_sre module, but that's a different issue.) I tuned my system and patched perf.py (of the CPython benchmark suite) to get stable benchmark results. |
|||
| msg265858 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016年05月19日 13:38 | |
+1. Similar optimization is used in marshal. Added comments on Rietveld. |
|||
| msg265920 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2016年05月20日 09:26 | |
Updated patch 2 to address Serhiy's comments. |
|||
| msg265922 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016年05月20日 09:37 | |
I forgot to note that the comment before _Unpickler_Read() becomes not correct, but you already addressed this in the second patch. unpickle_read-2.patch LGTM. Few months ago I tried to optimize reading in pickle, but didn't see significant benefit. After committing your patch I'm going to revive my patch. |
|||
| msg265925 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2016年05月20日 09:44 | |
New changeset f9b85b47f9c8 by Victor Stinner in branch 'default': Optimize pickle.load() and pickle.loads() https://hg.python.org/cpython/rev/f9b85b47f9c8 |
|||
| msg265926 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2016年05月20日 09:45 | |
Serhiy Storchaka: "I forgot to note that the comment before _Unpickler_Read() becomes not correct, but you already addressed this in the second patch. unpickle_read-2.patch LGTM." Cool, pushed. "Few months ago I tried to optimize reading in pickle, but didn't see significant benefit. After committing your patch I'm going to revive my patch." Cool too, keep me in touch. |
|||
| msg265946 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016年05月20日 14:07 | |
I think that integer overflow in _Unpickler_Read() is possible. n is read from file and can be arbitrary (up to PY_SSIZE_T_MAX). This likely cause raising an exception later, but integer overflow itself causes undefined behavior, and we should avoid it. |
|||
| msg265952 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2016年05月20日 19:17 | |
New changeset 3d7b7aa89437 by Victor Stinner in branch 'default': Issue #27056: Fix _Unpickler_Read() to avoid integer overflow https://hg.python.org/cpython/rev/3d7b7aa89437 |
|||
| msg265953 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2016年05月20日 19:19 | |
Serhiy Storchaka: > I think that integer overflow in _Unpickler_Read() is possible. n is read from file and can be arbitrary (up to PY_SSIZE_T_MAX). This likely cause raising an exception later, but integer overflow itself causes undefined behavior, and we should avoid it. Hum, I understood that it's ok since numbers should be signed, but in fact I'm not confident that n is always signed. You are right, it's better to use your code to avoid the integer overflow. I pushed a fix. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:31 | admin | set | github: 71243 |
| 2016年05月20日 19:19:20 | vstinner | set | messages: + msg265953 |
| 2016年05月20日 19:17:50 | python-dev | set | messages: + msg265952 |
| 2016年05月20日 14:07:49 | serhiy.storchaka | set | messages: + msg265946 |
| 2016年05月20日 09:45:57 | vstinner | set | status: open -> closed resolution: fixed messages: + msg265926 |
| 2016年05月20日 09:44:03 | python-dev | set | nosy:
+ python-dev messages: + msg265925 |
| 2016年05月20日 09:37:27 | serhiy.storchaka | set | assignee: vstinner messages: + msg265922 stage: commit review |
| 2016年05月20日 09:26:27 | vstinner | set | files:
+ unpickle_read-2.patch messages: + msg265920 |
| 2016年05月19日 13:38:10 | serhiy.storchaka | set | messages: + msg265858 |
| 2016年05月19日 12:02:29 | vstinner | create | |