This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012年07月16日 12:52 by meador.inge, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| issue15368-v0.patch | meador.inge, 2012年07月17日 03:37 | review | ||
| issue15368-v1.patch | meador.inge, 2012年07月17日 13:51 | review | ||
| Messages (22) | |||
|---|---|---|---|
| msg165596 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月16日 12:52 | |
Consider this small example (you might have to run sample program multiple times to see a difference): $ cat dis-closure.py import dis def adder(a, b): def add(): return a + b return add print(dis.dis(adder(1, 2).__code__)) $ ./python.exe dis-closure.py 5 0 LOAD_DEREF 0 (a) 3 LOAD_DEREF 1 (b) 6 BINARY_ADD 7 RETURN_VALUE None $ ./python.exe dis-closure.py 5 0 LOAD_DEREF 1 (a) 3 LOAD_DEREF 0 (b) 6 BINARY_ADD 7 RETURN_VALUE None The order of 'co_cellvars' and 'co_freevars' can be different from compile to compile, thus the bytecode can be different from compile to compile. This is due to the fact that these variable sets are managed with hashes and the ordering may come out different when the names in the hashes are given indexes (via 'dictbytype' in 'compile.c'). I am not sure if these are the only areas that causes bytecode generation to be non-deterministic. I found this behavior surprising. |
|||
| msg165600 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2012年07月16日 13:11 | |
I might come to regret asking this, but so what? Is this actually causing you issues, or are you literally just finding "this behavior surprising" and that's it? I mean we could simply sort the tuples, but I don't know what kind of performance hit that would cause. |
|||
| msg165601 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月16日 13:26 | |
On Mon, Jul 16, 2012 at 8:11 AM, Brett Cannon <report@bugs.python.org> wrote: > I might come to regret asking this, but so what? Is this actually causing you issues, > or are you literally just finding "this behavior surprising" and that's it? I noticed it in the context of working on issue15352. There are still a few issue in the build system where 'importlib.h' isn't regenerated when it needs to be (e.g. when the marshaling code is changed). When investigating that issue I noticed that trivial changes (e.g. `touch _bootstrap.py`) can cause differences in 'importlib.h'. Those differences are mainly due to the cell and free var indexes being reordered. Since the bytecode generation isn't deterministic, there can be some noisy diffs in your working copy when mucking around with the files related to the frozen importlib. Thus if we ever wanted to just always generate importlib.h because we can't be sure which changed source files should cause importlib.h to be regenerated, then the non-determinism would be a real annoyance. That is the only concrete issue I currently am aware of. This issue may not actually be worth addressing, but I felt it was worthy of discussion. |
|||
| msg165619 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2012年07月16日 14:43 | |
At least we know the hash randomisation is working :) Spurious changes when freezing modules seems like a legitimate reason to fix it - the import benchmarks would probably give the compiler enough of a workout to highlight if the sorting is excessively time consuming. |
|||
| msg165621 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年07月16日 14:46 | |
Ditto. I think predictability of bytecode generation is useful, e.g. for make-like tools that examine content, or for unit testing. |
|||
| msg165641 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2012年07月16日 17:23 | |
OK, so it sounds like we need to do the equivalent of sorting those tuples when generating the bytecode. That would suggest that probably need to tweak Python/compile.c to make it deterministic. |
|||
| msg165676 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月17日 03:37 | |
I haven't done any benchmarking (yet), but here is a patch implementing the basic sorting approach. |
|||
| msg165722 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月17日 13:51 | |
New patch that addresses feedback from Brett's code review (thanks!). |
|||
| msg165732 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月17日 22:23 | |
OK, I ran the Python benchmark suite. I don't have much experience with the benchmark suite so let me know if I did something wrong or need to do more. I ran the following: python perf.py -b 2n3 ../cpython/python.exe ../cpython/python-with-sorting.exe) which produces: Report on Darwin quicksilver 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr 9 19:32:15 PDT 2012; root:xnu-1699268~1/RELEASE_X86_64 x86_64 i386 Total CPU cores: 4 ### call_method_slots ### Min: 1.925941 -> 1.883251: 1.02x faster Avg: 1.931082 -> 1.887380: 1.02x faster Significant (t=39.92) Stddev: 0.01233 -> 0.00526: 2.3439x smaller Timeline: http://tinyurl.com/8943trh ### call_simple ### Min: 1.447890 -> 1.367729: 1.06x faster Avg: 1.451100 -> 1.371689: 1.06x faster Significant (t=118.46) Stddev: 0.00497 -> 0.00654: 1.3168x larger Timeline: http://tinyurl.com/7j7uppg ### fastpickle ### Min: 3.313799 -> 3.240090: 1.02x faster Avg: 3.328812 -> 3.248734: 1.02x faster Significant (t=44.45) Stddev: 0.00918 -> 0.00883: 1.0404x smaller Timeline: http://tinyurl.com/6tsk347 ### iterative_count ### Min: 1.277516 -> 1.240536: 1.03x faster Avg: 1.280186 -> 1.242743: 1.03x faster Significant (t=38.46) Stddev: 0.00534 -> 0.00435: 1.2288x smaller Timeline: http://tinyurl.com/d7vzqjf ### json_load ### Min: 2.944439 -> 2.877168: 1.02x faster Avg: 2.950717 -> 2.889950: 1.02x faster Significant (t=23.96) Stddev: 0.00737 -> 0.01635: 2.2179x larger Timeline: http://tinyurl.com/6onbxzf ### regex_compile ### Min: 2.354580 -> 2.417407: 1.03x slower Avg: 2.361253 -> 2.421576: 1.03x slower Significant (t=-44.01) Stddev: 0.00780 -> 0.00575: 1.3555x smaller Timeline: http://tinyurl.com/6wryn9v ### richards ### Min: 0.905023 -> 0.883842: 1.02x faster Avg: 0.908634 -> 0.888653: 1.02x faster Significant (t=18.96) Stddev: 0.00479 -> 0.00571: 1.1911x larger Timeline: http://tinyurl.com/chgz4ge ### silent_logging ### Min: 0.432982 -> 0.462848: 1.07x slower Avg: 0.434553 -> 0.465406: 1.07x slower Significant (t=-32.15) Stddev: 0.00390 -> 0.00556: 1.4256x larger Timeline: http://tinyurl.com/d3penzy The following not significant results are hidden, use -v to show them: call_method, call_method_unknown, fastunpickle, float, formatted_logging, json_dump, nbody, normal_startup, nqueens, pidigits, regex_effbot, regex_v8, simple_logging, startup_nosite, threaded_count, unpack_sequence. |
|||
| msg165765 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2012年07月18日 12:53 | |
You ran the benchmarks correctly, but I was more worried about compilation time than execution time (changing which index is for what really shouldn't make a difference in performance, and the benchmarks show that). If you want to test using the importlib benchmarks, run ``./python.exe -m importlib.test.benchmark`` and compare the output (run with -h to see all the options). |
|||
| msg165773 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月18日 13:39 | |
Oh, cool -- I didn't know about the importlib benchmarks. Thanks Brett. Here are the results (OS X 10.7.4, Dual 1.7GHz Core i5, 4 GB RAM): $ ./python.exe -m importlib.test.benchmark Measuring imports/second over 1 second, best out of 3 Entire benchmark run should take about 33 seconds Using <function __import__ at 0x1063bdd40> as __import__ sys.modules [ 41831 41812 40914 ] best is 41,831 Built-in module [ 20327 20441 19775 ] best is 20,441 Source writing bytecode: small [ 2108 2119 2117 ] best is 2,119 Source w/o bytecode: small [ 5093 5075 5106 ] best is 5,106 Source w/ bytecode: small [ 6021 5915 5997 ] best is 6,021 Source writing bytecode: tabnanny [ 379 379 379 ] best is 379 Source w/o bytecode: tabnanny [ 448 453 456 ] best is 456 Source w/ bytecode: tabnanny [ 2270 2279 2279 ] best is 2,279 Source writing bytecode: decimal [ 28 27 28 ] best is 28 Source w/o bytecode: decimal [ 30 30 30 ] best is 30 Source w/ bytecode: decimal [ 256 254 259 ] best is 259 $ ./python-with-sorting.exe -m importlib.test.benchmark Measuring imports/second over 1 second, best out of 3 Entire benchmark run should take about 33 seconds Using <function __import__ at 0x104b01d40> as __import__ sys.modules [ 42369 42203 42348 ] best is 42,369 Built-in module [ 17060 16358 20486 ] best is 20,486 Source writing bytecode: small [ 2118 2112 2127 ] best is 2,127 Source w/o bytecode: small [ 5121 5098 5125 ] best is 5,125 Source w/ bytecode: small [ 6015 5652 6046 ] best is 6,046 Source writing bytecode: tabnanny [ 367 368 366 ] best is 368 Source w/o bytecode: tabnanny [ 439 429 439 ] best is 439 Source w/ bytecode: tabnanny [ 2267 2274 2277 ] best is 2,277 Source writing bytecode: decimal [ 27 27 27 ] best is 27 Source w/o bytecode: decimal [ 28 28 28 ] best is 28 Source w/ bytecode: decimal [ 257 259 259 ] best is 259 Pretty similar results either way. |
|||
| msg165776 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2012年07月18日 13:50 | |
Yep, so I consider the performance worry dealt with. |
|||
| msg165778 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2012年07月18日 13:54 | |
Agreed, so definite +1 from me. |
|||
| msg165781 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2012年07月18日 14:55 | |
So I think that just leaves one other person to verify the patch works as expected and then commit it. I can hopefully later this week, but no sooner than Friday, so if someone can get to it faster that would be great. |
|||
| msg165783 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月18日 15:06 | |
I can commit today unless you think we need someone else to independently verify the patch. |
|||
| msg165784 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月18日 15:22 | |
BTW, I think this should be committed on 2.7 and 3.2 as well since the same issue can happen when throwing -R. Any objections? |
|||
| msg165787 - (view) | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2012年07月18日 16:00 | |
Nah, you are a committer and thus qualify as the second review. =) As for backporting, it only affects regeneration of bytecode, so I don't see any major harm in it (might surprise some people, but there is a legitimate reason for fixing this). |
|||
| msg165791 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012年07月18日 19:29 | |
New changeset 7eb0180c1734 by Meador Inge in branch '2.7': Issue #15368: make bytecode generation deterministic. http://hg.python.org/cpython/rev/7eb0180c1734 New changeset 79d54fba49b3 by Meador Inge in branch '3.2': Issue #15368: make bytecode generation deterministic. http://hg.python.org/cpython/rev/79d54fba49b3 New changeset 578066372485 by Meador Inge in branch 'default': Issue #15368: make bytecode generation deterministic. http://hg.python.org/cpython/rev/578066372485 |
|||
| msg165792 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月18日 19:32 | |
Committed. Thanks for the reviews y'all. |
|||
| msg165800 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年07月18日 20:55 | |
Meador, you have forgotten to fix the error "PyList_GET_SIZE(src)". src is not list, should be "PyList_GET_SIZE(sorted_keys)". |
|||
| msg165803 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2012年07月18日 21:29 | |
D'oh! Thanks for catching that Serhiy! Fixing now. Sorry about that. |
|||
| msg165811 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012年07月18日 21:51 | |
New changeset 47e074f6ca26 by Meador Inge in branch '2.7': Issue #15368: fixing variable typo. http://hg.python.org/cpython/rev/47e074f6ca26 New changeset 1c46f2ede4cb by Meador Inge in branch '3.2': Issue #15368: fixing variable typo. http://hg.python.org/cpython/rev/1c46f2ede4cb New changeset 6502de91610d by Meador Inge in branch 'default': Issue #15368: fixing variable typo. http://hg.python.org/cpython/rev/6502de91610d |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:32 | admin | set | github: 59573 |
| 2012年07月26日 05:25:14 | eric.snow | set | nosy:
+ eric.snow |
| 2012年07月18日 21:51:21 | python-dev | set | messages: + msg165811 |
| 2012年07月18日 21:29:09 | meador.inge | set | messages: + msg165803 |
| 2012年07月18日 20:55:22 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages: + msg165800 |
| 2012年07月18日 19:32:10 | meador.inge | set | status: open -> closed versions: + Python 2.7, Python 3.2 messages: + msg165792 resolution: fixed stage: commit review -> resolved |
| 2012年07月18日 19:29:58 | python-dev | set | nosy:
+ python-dev messages: + msg165791 |
| 2012年07月18日 16:00:38 | brett.cannon | set | messages: + msg165787 |
| 2012年07月18日 15:22:28 | meador.inge | set | messages: + msg165784 |
| 2012年07月18日 15:06:18 | meador.inge | set | messages: + msg165783 |
| 2012年07月18日 14:55:51 | brett.cannon | set | messages: + msg165781 |
| 2012年07月18日 13:54:16 | ncoghlan | set | messages: + msg165778 |
| 2012年07月18日 13:50:33 | brett.cannon | set | messages:
+ msg165776 stage: needs patch -> commit review |
| 2012年07月18日 13:39:02 | meador.inge | set | messages: + msg165773 |
| 2012年07月18日 12:53:14 | brett.cannon | set | messages: + msg165765 |
| 2012年07月17日 22:23:42 | meador.inge | set | messages: + msg165732 |
| 2012年07月17日 13:51:18 | meador.inge | set | files:
+ issue15368-v1.patch messages: + msg165722 |
| 2012年07月17日 03:37:15 | meador.inge | set | files:
+ issue15368-v0.patch keywords: + patch messages: + msg165676 |
| 2012年07月16日 17:23:52 | brett.cannon | set | messages: + msg165641 |
| 2012年07月16日 15:01:47 | Arfrever | set | nosy:
+ Arfrever |
| 2012年07月16日 14:46:12 | pitrou | set | nosy:
+ pitrou messages: + msg165621 |
| 2012年07月16日 14:43:06 | ncoghlan | set | nosy:
+ ncoghlan messages: + msg165619 |
| 2012年07月16日 14:31:32 | flox | set | nosy:
+ flox |
| 2012年07月16日 14:18:25 | mark.dickinson | set | nosy:
+ mark.dickinson |
| 2012年07月16日 13:26:36 | meador.inge | set | messages: + msg165601 |
| 2012年07月16日 13:11:02 | brett.cannon | set | nosy:
+ brett.cannon messages: + msg165600 |
| 2012年07月16日 12:52:36 | meador.inge | create | |