homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Py_XDECREF() expands its argument multiple times
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, Mark.Shannon, amaury.forgeotdarc, asvetlov, benjamin.peterson, brett.cannon, dmalcolm, ilblackdragon, jcea, jkloth, larry, lemburg, mark.dickinson, python-dev, vstinner
Priority: normal Keywords: patch

Created on 2013年02月14日 16:07 by dmalcolm, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
17206.diff ilblackdragon, 2013年03月18日 06:36 Patch for 17206 review
perf.log ilblackdragon, 2013年03月18日 06:47 Benchmark comparison for 17206 patch
perf-linux.log ilblackdragon, 2013年03月21日 02:40 Benchmark comparison for 17206 in Ubuntu + GCC
asmdiff_17206.zip ilblackdragon, 2013年03月21日 02:51 Preprocessed, object and disassembled files of ceval module with and without patch.
17206-2.diff ilblackdragon, 2013年03月31日 05:10 Updated patch for 17206 review
17206-3.diff ilblackdragon, 2013年04月03日 05:07 review
object.patch jkloth, 2013年06月17日 19:10 Patch reusing _py_tmp review
object-2.patch jkloth, 2013年06月24日 17:20 review
project.diff jkloth, 2013年07月02日 15:28 review
Pull Requests
URL Status Linked Edit
PR 30855 Mark.Shannon, 2022年01月30日 17:51
Messages (48)
msg182107 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2013年02月14日 16:07
When running my refcount static analyzer [1] on a large corpus of real-world C extension code for Python, I ran into the following construct in various places:
 Py_XDECREF(PyObject_CallObject(callable, args));
This is bogus code: Py_XDECREF expands its argument multiple times, so after this goes through the C preprocessor the callable is actually called invoked up to 4 times, leaking refs to 3 of the results - assuming that none of the calls fail, and a non pydebug build of CPython (which would expand the argument more times).
The raw preprocessor output for an optimized Python 2.7 looks like this:
 do { if ((PyObject_CallObject(callable, args)) == ((void *)0)) ; else do { if ( --((PyObject*)(PyObject_CallObject(callable, args)))->ob_refcnt != 0) ; else ( (*(((PyObject*)((PyObject *)(PyObject_CallObject(callable, args))))->ob_type)->tp_dealloc)((PyObject *)((PyObject *)(PyObject_CallObject(callable, args))))); } while (0); } while (0);
Cleaned up (and removing some of the casts for clarity) it looks like this:
 do { 
 if ((PyObject_CallObject(callable, args)) == ((void *)0))
 ; 
 else 
 do {
 if (--(PyObject_CallObject(callable, args)->ob_refcnt) != 0)
 ;
 else
 (*(PyObject_CallObject(callable, args)->ob_type)->tp_dealloc)
 PyObject_CallObject(callable, args);
 } while (0);
 } while (0);
which is clearly not what the extension author was expecting.
Should we:
 (A) update the macro so that it expands its argument only once, or
 (B) warn people not to write code like the above?
Similar considerations apply to the other macros, I guess, but I've seen the above used "in the wild", I haven't yet seen it for the other macros).
Seen in the wild in:
 python-alsa-1.0.25-1.fc17:
 pyalsa/alsamixer.c:179
 00174 | 	for (i = 0; i < count; i++) { 
 00175 | 		t = PyTuple_New(2); 
 00176 | 		if (t) { 
 00177 | 			PyTuple_SET_ITEM(t, 0, PyInt_FromLong(pfd[i].fd)); 
 00178 | 			PyTuple_SET_ITEM(t, 1, PyInt_FromLong(pfd[i].events)); 
 00179>| 			Py_XDECREF(PyObject_CallObject(reg, t)); 
 00180 | 			Py_DECREF(t); 
 00181 | 		} 
 00182 | 	}	 
 00183 | 
 00184 | 	Py_XDECREF(reg); 
 pyalsa/alsahcontrol.c:190
 00185 | 	for (i = 0; i < count; i++) { 
 00186 | 		t = PyTuple_New(2); 
 00187 | 		if (t) { 
 00188 | 			PyTuple_SET_ITEM(t, 0, PyInt_FromLong(pfd[i].fd)); 
 00189 | 			PyTuple_SET_ITEM(t, 1, PyInt_FromLong(pfd[i].events)); 
 00190>| 			Py_XDECREF(PyObject_CallObject(reg, t)); 
 00191 | 			Py_DECREF(t); 
 00192 | 		} 
 00193 | 	}	 
 00194 | 
 00195 | 	Py_XDECREF(reg); 
 pyalsa/alsaseq.c:3277
 03272 | for (i = 0; i < count; i++) { 
 03273 | t = PyTuple_New(2); 
 03274 | if (t) { 
 03275 | PyTuple_SET_ITEM(t, 0, PyInt_FromLong(pfd[i].fd)); 
 03276 | PyTuple_SET_ITEM(t, 1, PyInt_FromLong(pfd[i].events)); 
 03277>| Py_XDECREF(PyObject_CallObject(reg, t)); 
 03278 | Py_DECREF(t); 
 03279 | } 
 03280 | } 
 03281 | 
 03282 | Py_XDECREF(reg); 
 python-4Suite-XML-1.0.2-14.fc17:
 Ft/Xml/src/domlette/refcounts.c:80
 00075 | /* refcount = this */ 
 00076 | expected = 1; 
 00077 | } 
 00078 | else { 
 00079 | sprintf(buf, "Unexpected object type '%.200s'" ,node->ob_type->tp_name); 
 00080>| Py_XDECREF(PyObject_CallMethod(tester, "error", "s", buf)); 
 00081 | return 0; 
 00082 | } 
 00083 | 
 00084 | sprintf(buf, "%.200s refcounts", node->ob_type->tp_name); 
 00085 | return do_test(tester, buf, expected, node->ob_refcnt); 
[Note to self: I actually saw this because it uncovered a traceback in cpychecker, which I fixed as:
http://git.fedorahosted.org/cgit/gcc-python-plugin.git/commit/?id=99fa820487e380b74c2eda1d97facdf2ee6d2f3a ]
[1] https://gcc-python-plugin.readthedocs.org/en/latest/cpychecker.html 
msg182111 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013年02月14日 16:32
On Thu, Feb 14, 2013 at 11:07 AM, Dave Malcolm <report@bugs.python.org>wrote:
>
> New submission from Dave Malcolm:
>
> When running my refcount static analyzer [1] on a large corpus of
> real-world C extension code for Python, I ran into the following construct
> in various places:
>
> Py_XDECREF(PyObject_CallObject(callable, args));
>
Eww.
>
> This is bogus code: Py_XDECREF expands its argument multiple times, so
> after this goes through the C preprocessor the callable is actually called
> invoked up to 4 times, leaking refs to 3 of the results - assuming that
> none of the calls fail, and a non pydebug build of CPython (which would
> expand the argument more times).
>
> The raw preprocessor output for an optimized Python 2.7 looks like this:
> do { if ((PyObject_CallObject(callable, args)) == ((void *)0)) ; else
> do { if ( --((PyObject*)(PyObject_CallObject(callable, args)))->ob_refcnt
> != 0) ; else ( (*(((PyObject*)((PyObject *)(PyObject_CallObject(callable,
> args))))->ob_type)->tp_dealloc)((PyObject *)((PyObject
> *)(PyObject_CallObject(callable, args))))); } while (0); } while (0);
>
> Cleaned up (and removing some of the casts for clarity) it looks like this:
> do {
> if ((PyObject_CallObject(callable, args)) == ((void *)0))
> ;
> else
> do {
> if (--(PyObject_CallObject(callable, args)->ob_refcnt) != 0)
> ;
> else
> (*(PyObject_CallObject(callable, args)->ob_type)->tp_dealloc)
> PyObject_CallObject(callable, args);
> } while (0);
> } while (0);
> which is clearly not what the extension author was expecting.
>
> Should we:
> (A) update the macro so that it expands its argument only once, or
>
How expensive is that going to be? Since there is a 'do' statement using a
temp variable would be easy to introduce, but is a compiler going to be
smart enough to inline it all so it doesn't have to waste an allocation,
etc.?
> (B) warn people not to write code like the above?
>
Only if (A) is too costly. I would modify a checkout for ALL refcount
macros and run the benchmark suite to see if there is a noticeable
difference. If there isn't then I say change all the macros (can't make one
safe and the rest not as that is just asking for inconsistent usage and
trouble).
-Brett
>
> Similar considerations apply to the other macros, I guess, but I've seen
> the above used "in the wild", I haven't yet seen it for the other macros).
>
> Seen in the wild in:
> python-alsa-1.0.25-1.fc17:
> pyalsa/alsamixer.c:179
> 00174 | for (i = 0; i < count; i++) {
> 00175 | t = PyTuple_New(2);
> 00176 | if (t) {
> 00177 | PyTuple_SET_ITEM(t, 0,
> PyInt_FromLong(pfd[i].fd));
> 00178 | PyTuple_SET_ITEM(t, 1,
> PyInt_FromLong(pfd[i].events));
> 00179>| Py_XDECREF(PyObject_CallObject(reg, t));
> 00180 | Py_DECREF(t);
> 00181 | }
> 00182 | }
> 00183 |
> 00184 | Py_XDECREF(reg);
>
> pyalsa/alsahcontrol.c:190
> 00185 | for (i = 0; i < count; i++) {
> 00186 | t = PyTuple_New(2);
> 00187 | if (t) {
> 00188 | PyTuple_SET_ITEM(t, 0,
> PyInt_FromLong(pfd[i].fd));
> 00189 | PyTuple_SET_ITEM(t, 1,
> PyInt_FromLong(pfd[i].events));
> 00190>| Py_XDECREF(PyObject_CallObject(reg, t));
> 00191 | Py_DECREF(t);
> 00192 | }
> 00193 | }
> 00194 |
> 00195 | Py_XDECREF(reg);
>
> pyalsa/alsaseq.c:3277
> 03272 | for (i = 0; i < count; i++) {
> 03273 | t = PyTuple_New(2);
> 03274 | if (t) {
> 03275 | PyTuple_SET_ITEM(t, 0, PyInt_FromLong(pfd[i].fd));
> 03276 | PyTuple_SET_ITEM(t, 1,
> PyInt_FromLong(pfd[i].events));
> 03277>| Py_XDECREF(PyObject_CallObject(reg, t));
> 03278 | Py_DECREF(t);
> 03279 | }
> 03280 | }
> 03281 |
> 03282 | Py_XDECREF(reg);
>
> python-4Suite-XML-1.0.2-14.fc17:
> Ft/Xml/src/domlette/refcounts.c:80
> 00075 | /* refcount = this */
> 00076 | expected = 1;
> 00077 | }
> 00078 | else {
> 00079 | sprintf(buf, "Unexpected object type '%.200s'"
> ,node->ob_type->tp_name);
> 00080>| Py_XDECREF(PyObject_CallMethod(tester, "error", "s", buf));
> 00081 | return 0;
> 00082 | }
> 00083 |
> 00084 | sprintf(buf, "%.200s refcounts", node->ob_type->tp_name);
> 00085 | return do_test(tester, buf, expected, node->ob_refcnt);
>
>
> [Note to self: I actually saw this because it uncovered a traceback in
> cpychecker, which I fixed as:
>
> http://git.fedorahosted.org/cgit/gcc-python-plugin.git/commit/?id=99fa820487e380b74c2eda1d97facdf2ee6d2f3a]
>
>
> [1] https://gcc-python-plugin.readthedocs.org/en/latest/cpychecker.html
>
> ----------
> components: Interpreter Core
> messages: 182107
> nosy: dmalcolm
> priority: normal
> severity: normal
> status: open
> title: Py_XDECREF() expands its argument multiple times
> versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4,
> Python 3.5
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue17206>
> _______________________________________
> _______________________________________________
> New-bugs-announce mailing list
> New-bugs-announce@python.org
> http://mail.python.org/mailman/listinfo/new-bugs-announce
>
msg182209 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013年02月16日 04:12
I wish we could use inline functions...
msg184430 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月18日 06:36
I've worked on this with Dave Malcolm @PyCon2013 sprints.
This patch is work in progress to make Py_XDECREF() and Py_XINCREF() expands their arguments once instead of multiple times.
Because patch is work in progress, it contains old version for ease of benchmarking. To enable the old version - define OLD_17206.
Tested so far only with: Clang on Mac OS X 10.7 x64.
msg184431 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月18日 06:46
Additionally, in macros Py_XINCREF and Py_XDECREF we've took an opportunity to increase readability by changing expression:
> if (item == NULL) ; else action(item);
to more natural inverted condition:
> if (item != NULL) action(item);
There is a chance that there may be a reason for this form of expression, so let me know if this is an issue.
msg184432 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月18日 06:47
Benchmark run on Clang Mac OS X 10.7 attached of comparison with and without patch 17206.diff.
msg184433 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月18日 06:49
Command used for benchmarking was:
python perf.py -b 2n3 -f ../cpython/baseline-clang/python.exe ../cpython/experiment-clang/python.exe | tee perf.log
msg184488 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年03月18日 18:43
I'd suggest less generic names for the temporary variables, just to minimise the chance of accidental name collisions. _py_xincref_tmp and _py_xdecref_tmp, perhaps?
msg184492 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月18日 18:59
That names were my first idea - but then I saw Py_CLEAR uses _py_tmp variable I used it.
Should I replace in Py_CLEAR to _py_clear_tmp as well?
msg184495 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年03月18日 19:15
Ignore me; I don't know what I was thinking. _py_tmp is fine. There's no issue with name clashes.
msg184833 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月21日 02:40
Compiled baseline and patched version with GCC in Ubuntu 12.10 (running in VMWare).
Benchmarking results are attached:
./perf.py -b 2n3 -f ../cpython/baseline/python ../cpython/experiment/python | tee perf-linux.log
msg184834 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月21日 02:51
Additionally, disassembled ceval.o and compared baseline with experiment (with applied patch): no actual differences found.
Attached archive contains ceval from both baseline and experiment builds:
 - cevalb.cc - processed file of baseline build
 - ceval.cc - processed file of experiment build
diff ceval.cc cevalb.cc - shows differences introduced by patch.
 - cevalb.o - object file of baseline build
 - ceval.o - object file of experiment build
 - cevalb.asm - disassembly of baseline object file
 - ceval.asm - disassembly of experiment object file
diff ceval.asm cevalb.asm - doesn't show any differences in generated assembly code.
msg184855 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013年03月21日 07:44
- The patch contains a "#ifdef OLD_17206" that should be removed.
- I know that these macros are already used everywhere, but a test for the new feature would be nice (in _testcapimodule.c)
For example, Py_DECREF(PyLong_FromLong(0)) does not fail or leak when tests are run with -R.
msg185396 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年03月28日 01:13
I'm not sure that it is a good idea to patch such major function (macro) in a minor version (ex: Python 2.7.x). I changed the Versions field to only select Python 3.4.
msg185614 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年03月31日 05:10
Updated patch - removed old code and addded tests to _testcapimoudule.c as Amaury suggested.
Amaury, why do you mention -R flag (from what I see it does hash randomization)? I would expect some flag, that would enforce memory leakage issues to fail test.
msg185623 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年03月31日 11:05
> Amaury, why do you mention -R flag
He's talking about the -R argument to regrtest:
"""
-R runs each test several times and examines sys.gettotalrefcount() to
see if the test appears to be leaking references. The argument should
be of the form stab:run:fname where 'stab' is the number of times the
test is run to let gettotalrefcount settle down, 'run' is the number
of times further it is run and 'fname' is the name of the file the
reports are written to. These parameters all have defaults (5, 4 and
"reflog.txt" respectively), and the minimal invocation is '-R :'.
"""
msg185625 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013年03月31日 11:56
Thanks for the updated patch. Some comments:
- In the tests, your functions should have return type PyObject* rather than void; you can use Py_RETURN_NONE as a convenient way to return something of the correct type. E.g.:
static PyObject *
test_decref_doesnt_leak(PyObject* self)
{
 Py_DECREF(PyLong_FromLong(0));
 Py_RETURN_NONE;
}
- Unfortunately, the test above *will* leak: even though your patch fixes Py_XDECREF, Py_DECREF will still evaluate the argument multiple times. So we get, for example:
iwasawa:cpython mdickinson$ ./python.exe -m test -R:: test_capi
[1/1] test_capi
beginning 9 repetitions
123456789
.........
test_capi leaked [1, 1, 1, 1] references, sum=4
1 test failed:
 test_capi
- It would be great to have tests for Py_INCREF and Py_XINCREF, too.
- The patch introduces two extra blank lines before Py_CLEAR.
- Not specifically about the patch, but: there's a comment in object.h that reads:
"""
*** WARNING*** The Py_DECREF macro must have a side-effect-free argument
since it may evaluate its argument multiple times. (The alternative
would be to mace it a proper function or assign it to a global temporary
variable first, both of which are slower; and in a multi-threaded
environment the global variable trick is not safe.)
"""
I don't understand why the author of that comment ruled out the possibility of a local temporary variable. Anyone have any ideas? A constraint of some flavours of pre-ANSI C, perhaps?
msg185627 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2013年03月31日 14:25
Mark, it was written 20 years ago. Who knows (or cares) why it was written that way?
Let's just write it correctly this time.
Py_INCREF, Py_DECREF, Py_XDECREF and Py_XINCREF should all expand their argument exactly once.
Py_CLEAR should expand its argument exactly twice, once as an rvalue, then once as an lvalue.
So could we fix Py_DECREF and Py_CLEAR as well, please?
msg185643 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2013年03月31日 19:29
For the record: I care. Generally speaking CPython is a lovingly
crafted source tree, and the choices its architects made are nearly
always sensible and well-reasoned. When I see things like this, things
that seem kind of dumb on first glance, I worry that we're missing
something.
It could be that it did have a sensible reason fifteen years ago, like
some bone-headed compiler generated awful code--and the macros *are*
used everywhere, so maybe performance genuinely suffered. Or maybe
the macros were originally intended to be usable as rvalues; it looks
like Py_INCREF still permits that, and changing it to be a statement
could be considered a breaking API change.
But we can't let ourselves become hidebound and superstitious. I see
no reason why a local temporary variable shouldn't be fine here, and
perhaps in all the macros Mark cites. So don't get me wrong, I'm +1.
(The documentation on Py_INCREF makes no mention of whether it's legal
as an rvalue, and all the examples use it as a statement. Perhaps we
should document it explicitly so for 3.4.)
msg185689 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2013年03月31日 22:30
On 31.03.2013 21:29, Larry Hastings wrote:
> 
> Larry Hastings added the comment:
> 
> For the record: I care. Generally speaking CPython is a lovingly
> crafted source tree, and the choices its architects made are nearly
> always sensible and well-reasoned. When I see things like this, things
> that seem kind of dumb on first glance, I worry that we're missing
> something.
Did someone test the patch on a use case where the performance of
this macro matters a lot ? E.g. deallocating large lists or
dictionaries.
Todays CPUs have plenty of registers available, so it's possible
that the patch actually speeds up things, since the decref'ed
variable may well be an indirect reference. At the time Py_DECREF
was written, registers were precious golden flasks, only to be
filled with the most valuable of things ;-)
Just want to be sure we're not missing something...
msg185889 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年04月03日 05:07
Fixed Py_DECREF and Py_CLEAR as well. 
Added tests for Py_INCREF and Py_XINCREF (if somebody has a better idea how to tests that INCREF doesn't leak - please, let me know).
Removed comment that Py_DECREF evaluate it's argument multiple times as not relevant anymore.
About considerations from performance point of view - I've made toy example (only this defines and main function) to test how gcc optimizer behaves in different cases - from what I see, if expression is like this (which is majority of cases in the code):
 PyObject* obj = Foo();
 Py_XDECREF(obj)
assembly code that will be produced (with -O3) is the same before and after patch.
msg186386 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013年04月09日 06:42
As I wrote in issue17589, there are some extension modules (pytables) that which assume that Py_INCREF is an expression:
 return Py_INCREF(x), x;
and Py_RETURN_NONE is also defined with a comma expression.
Oh, and Cython:
#define __Pyx_PyBool_FromLong(b) ((b) ? (Py_INCREF(Py_True), Py_True) : (Py_INCREF(Py_False), Py_False))
msg189362 - (view) Author: Illia Polosukhin (ilblackdragon) * Date: 2013年05月16日 12:14
Amaury, 
 I didn't update Py_INCREF macro in this patch (because it doesn't expand it's argument multiple times) - so the examples you are showing will be working fine. 
 I've updated Py_XINCREF, but it can't be used as an expression anyway.
msg189379 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013年05月16日 13:27
The last patch (17206-3.diff) has tests for the 4 macros, and looks good to me.
msg190156 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013年05月27日 21:46
New changeset aff41a6421c2 by Benjamin Peterson in branch 'default':
don't expand the operand to Py_XINCREF/XDECREF/CLEAR/DECREF multiple times (closes #17206)
http://hg.python.org/cpython/rev/aff41a6421c2 
msg190623 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年06月04日 22:29
"AMD64 Windows7 SP1 3.x" buildbot is crashing in test_marshal. It looks like a regression introduced by the changeset aff41a6421c2:
[ 88/375] test_marshal
Traceback (most recent call last):
 File "../lib/test/regrtest.py", line 1611, in <module>
 main_in_temp_cwd()
 File "../lib/test/regrtest.py", line 1586, in main_in_temp_cwd
 main()
 File "../lib/test/regrtest.py", line 774, in main
 raise Exception("Child error on {}: {}".format(test, result[1]))
Exception: Child error on test_marshal: Exit code 3221225725
http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x/builds/2009/steps/test/logs/stdio 
msg190628 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013年06月04日 22:48
Hum, a stack overflow?
msg191364 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月17日 19:10
Indeed, after debugging, it is a stack overflow caused by the introduction of the different temporary variables in the Py_XINCREF, Py_XDECREF, and Py_DECREF macros.
I've attached an updated patch that reuses the _py_tmp variable for those temporary assignments thus keeping the required stack size down.
msg191731 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年06月23日 21:01
> I've attached an updated patch that reuses the _py_tmp variable for those temporary assignments thus keeping the required stack size down.
I don't understand how it would change the size of the C stack, could you please explain? object.patch looks like refactoring to simplify the work of the C preprocessor.
msg191752 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月24日 11:19
At least in a debug build, the MSVC 64-bit compiler seems to allocate space for each unique variable declared in the function body. Therefore, by changing the temporary variables to be named identically, the amount of required space is minimized.
The refactoring of Py_DECREF is needed to prevent an error for local variable defined before use when Py_DECREF is nested within another macro already defining _py_tmp (it would expand to roughly PyObject *_py_tmp = _py_tmp).
msg191780 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013年06月24日 16:04
We've "fixed" this debug-problem in other cases simply by skipping the test in question on debug builds.
msg191785 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月24日 16:44
Except, in this case, it actually crashes the interpreter. I would hope to think that it isn't common practice to just hide crashers especially when there is a fix available.
msg191786 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013年06月24日 16:47
Could you please wrap your new macros in the do .. while boilerplate?
msg191791 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月24日 17:20
FYI, however, the new macro is designed as an internal implementation detail along the lines of the other _Py_* macros. That is, just a tiny piece of a larger function to be used at your own risk.
Either way, I've uploaded another version with the do { } while (0) wrapping.
msg191792 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013年06月24日 17:47
Yes, but in case someone else mindlessly starts using it elsewhere, let's spare them the macro bugz.
msg191877 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2013年06月25日 19:15
I think blaming the crash in test_marshall
http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x/builds/2009/steps/test/logs/stdio
on the new macros (change aff41a6421c2) is correct.
It may be the proximate cause, but that doesn't make it the underlying one. The interpreter should be more resilient against minor changes to the amount of C stack used per call.
Can we get a full stack trace (for the C stack not the Python one) for the failure?
msg191878 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013年06月25日 19:25
New changeset c31bec42e411 by Victor Stinner in branch 'default':
Issue #17206: test.regrtest and test.script_helper enable faulthandler module
http://hg.python.org/cpython/rev/c31bec42e411 
msg191879 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月25日 19:44
I'm unsure as to how to get a stack trace from Visual Studio, but I can assure you that the changes introduced by the change aff41a6421c2 *is* the cause of the failure in test_marshal.
In a nutshell, the overflow happens in a recursion on the marhsal.c:r_object() function for the test_marshal test of BugsTestCase.test_loads_recursion().
msg191880 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013年06月25日 19:54
New changeset 1a9367d5aabc by Victor Stinner in branch 'default':
Issue #17206: Fix test_cmd_line and test_faulthandler for my previous change
http://hg.python.org/cpython/rev/1a9367d5aabc 
msg191881 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年06月25日 19:58
> Can we get a full stack trace (for the C stack not the Python one) for the failure?
Yes, using a debugger on Windows x64. I'm waiting for a renewal of my MSDN account to get Visual Studio (full version, not the Express version limited to 32-bit).
Until this, I enabled faulthandler in subprocesses created by test.regrtest and test.script_helper because it may help this issue, but also others future crashes.
msg191883 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年06月25日 20:12
"I enabled faulthandler in subprocesses created by test.regrtest and test.script_helper..."
Oh, it doesn't help for this issue. On Windows, faulthandler is unable to dump the Python traceback on a stack overflow. On UNIX, it allocates a diffrent stack for its signal handler, and so the signal handler can run even on a stack overflow.
The Windows exit code 0xC00000FD (3221225725) means "Stack overflow / exhaustion".
[ 81/375] test_marshal
Traceback (most recent call last):
 File "../lib/test/regrtest.py", line 1618, in <module>
 main_in_temp_cwd()
 File "../lib/test/regrtest.py", line 1593, in main_in_temp_cwd
 main()
 File "../lib/test/regrtest.py", line 775, in main
 raise Exception("Child error on {}: {}".format(test, result[1]))
Exception: Child error on test_marshal: Exit code 3221225725 
msg191888 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月25日 22:00
Here are some results after a detailed investigation:
- in debug, each variable declaration in a block is allocated on the stack
- in debug, r_object() uses 1416 bytes of stack 
- in release, r_object() uses 512 bytes of stack
- default stack size is 1MB
- stack for python_d.exe has been already up'ed to 2100000
So it seems that the best option would be to increase the stack size used when linking (/STACK:). I would suggest increasing it to 3MB using /STACK:3145728.
msg191919 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年06月26日 21:46
Here is some additional analysis of recursive functions in the 'pythoncore' MSVC project:
Ratio Release Debug Filename:Function Name
 Stack Stack
-----------------------------------------------
 1.000	 32	 32	_sre.asm:_validate_inner
 1.159	352	 408	ast.asm:set_context
 1.167	 48	 56	compile.asm:stackdepth_walk
 1.188	128	 152	memoryobject.asm:copy_rec
 1.250	160	 200	ast.asm:num_stmts
 1.308	104	 136	firstsets.asm:calcfirstset
 1.312	128	 168	posixmodule.asm:win32_xstat_impl_w
 1.400	 40	 56	node.asm:freechildren
 1.667	 72	 120	memoryobject.asm:tolist_rec
 1.750	 32	 56	listnode.asm:list1node
 1.750	 32	 56	parsermodule.asm:parser_compare_nodes
 1.750	 32	 56	parsermodule.asm:validate_factor
 1.750	 32	 56	parsermodule.asm:validate_not_test
 1.750	 32	 56	rotatingtree.asm:RotatingTree_Enum
 1.750	 32	 56	typeobject.asm:solid_base
 1.786	112	 200	bytearrayobject.asm:bytearray_setslice
 1.900	 80	 152	compile.asm:compiler_comprehension_generator
 2.143	 56	 120	pythonrun.asm:print_exception_recursive
 2.167	 48	 104	arraymodule.asm:array_ass_slice
 2.250	 32	 72	ast.asm:validate_slice
 2.250	 32	 72	getargs.asm:skipitem
 2.250	 32	 72	node.asm:sizeofchildren
 2.250	 32	 72	parsermodule.asm:validate_test
 2.250	 32	 72	setobject.asm:set_issuperset
 2.278	144	 328	listobject.asm:list_ass_slice
 2.417	 96	 232	arraymodule.asm:array_ass_subscr
 2.429	 56	 136	longobject.asm:PyLong_AsLongLong
 2.429	 56	 136	parsermodule.asm:node2tuple
 2.429	 56	 136	typeobject.asm:mro_subclasses
 2.500	 80	 200	bytearrayobject.asm:bytearray_ass_subscript
 2.600	 40	 104	errors.asm:PyErr_GivenExceptionMatches
 2.750	 32	 88	import.asm:update_code_filenames
 2.750	 32	 88	setobject.asm:set_richcompare
 2.750	 32	 88	symtable.asm:symtable_visit_slice
 2.750	 32	 88	typeobject.asm:PyType_Modified
 2.766	512	1416	marshal.asm:r_object
 3.125	 64	 200	longobject.asm:long_rshift
 3.250	 32	 104	compile.asm:compiler_with
 3.250	 32	 104	setobject.asm:set_issubset
 3.250	 32	 104	typeobject.asm:assign_version_tag
 3.750	 32	 120	abstract.asm:abstract_issubclass
 3.750	 32	 120	typeobject.asm:PyType_Ready
 3.833	 48	 184	ast.asm:ast_for_expr
 4.250	 32	 136	compile.asm:compiler_visit_expr
 4.250	 32	 136	mathmodule.asm:factorial_partial_product
 4.500	 80	 360	genobject.asm:gen_throw
 4.692	312	1464	symtable.asm:symtable_visit_stmt
 5.250	 32	 168	_collectionsmodule.asm:deque_extend
 5.250	 32	 168	_collectionsmodule.asm:deque_extendleft
 5.250	 32	 168	ast.asm:alias_for_import_name
 5.750	 32	 184	typeobject.asm:merge_class_dict
 6.250	 32	 200	abstract.asm:PyObject_IsInstance
 6.250	 32	 200	abstract.asm:PyObject_IsSubclass
 6.250	 32	 200	ast.asm:validate_expr
 6.500	 48	 312	Python-ast.asm:obj2ast_slice
 7.182	 88	 632	parsermodule.asm:build_node_children
 7.250	 32	 232	errors.asm:PyErr_NormalizeException
 9.167	 48	 440	symtable.asm:symtable_visit_expr
10.250	 32	 328	_json.asm:encoder_listencode_obj
10.344	256	2648	Python-ast.asm:obj2ast_expr
15.955	176	2808	Python-ast.asm:obj2ast_stmt
31.750	 32	1016	Python-ast.asm:ast2obj_expr
msg192167 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年07月02日 00:16
"So it seems that the best option would be to increase the stack size used when linking (/STACK:). I would suggest increasing it to 3MB using /STACK:3145728."
Does Python allocate the whole stack (physical pages) at startup, or is it done on demand as on Linux?
msg192204 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年07月02日 15:28
The /STACK:reserve[,commit] linker option uses 'reserve' as the upper limit of bytes to allow the executable to allocate. The optional 'commit' value (default of 4096) is the amount of physical memory allocated as needed.
Therefore the reserve could be as large as needed to prevent stack overflows without impacting the actually memory footprint of the executable. Maybe we should consider increasing the reserve to match that of Linux (8MB)? From the list I posted above, the huge increase in stack use in Python-ast.c could lead to other overflows for certain expressions.
Attached is the patch to the project file to increase the stack reserve to 4MB. The actual value doesn't impact the memory requirements of the debug build except in edge cases that would otherwise result in stack overflow exceptions.
msg192205 - (view) Author: Jeremy Kloth (jkloth) * Date: 2013年07月02日 15:32
I forgot to note that with the increased stack reserve my previous patches to Py_DECREF() and related macros are no longer needed.
msg192206 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013年07月02日 15:45
Jeremy: "stack for python_d.exe has been already up'ed to 2100000"
Indeed, but not recently (changeset 97361d917d22):
<< Issue 2286: bump up the stack size of the 64-bit debug python_d.exe to 2100000. The default value of 200000 causes a stack overflow at 1965 iterations of r_object() in marshal.c, 35 iterations before the 2000 limit enforced by MAX_MARSHAL_STACK_DEPTH. >>
Oh it looks like it was the same issue: "Stack overflow exception caused by test_marshal on Windows x64". See issue #2286.
Amaury wrote in this issue (in 2008, msg63540):
"Can you try to raise the stack size on x64 builds? If 2Mb is enough for 32bit, 4Mb should be good in your case."
Jeremy: "Therefore the reserve could be as large as needed to prevent stack overflows without impacting the actually memory footprint of the executable. Maybe we should consider increasing the reserve to match that of Linux (8MB)?"
Ah ok, if the memory footprint is unchanged, yes please increase the arbitrary limit. I would suggest 8 MB.
msg192507 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013年07月07日 00:50
New changeset fcf079dece0d by Victor Stinner in branch 'default':
Issue #17206: On Windows, increase the stack size from 2 MB to 4.2 MB to fix
http://hg.python.org/cpython/rev/fcf079dece0d 
History
Date User Action Args
2022年04月11日 14:57:41adminsetgithub: 61408
2022年01月30日 17:51:28Mark.Shannonsetpull_requests: + pull_request29209
2016年09月25日 15:52:53SilentGhostsetstatus: open -> closed
type: behavior
resolution: fixed
2013年07月07日 00:50:07python-devsetmessages: + msg192507
2013年07月02日 15:45:01vstinnersetmessages: + msg192206
2013年07月02日 15:32:57jklothsetmessages: + msg192205
2013年07月02日 15:28:24jklothsetfiles: + project.diff

messages: + msg192204
2013年07月02日 00:16:12vstinnersetmessages: + msg192167
2013年06月26日 21:46:11jklothsetmessages: + msg191919
2013年06月25日 22:00:32jklothsetmessages: + msg191888
2013年06月25日 20:12:08vstinnersetmessages: + msg191883
2013年06月25日 19:58:58vstinnersetmessages: + msg191881
2013年06月25日 19:54:49python-devsetmessages: + msg191880
2013年06月25日 19:44:47jklothsetmessages: + msg191879
2013年06月25日 19:25:13python-devsetmessages: + msg191878
2013年06月25日 19:15:41Mark.Shannonsetmessages: + msg191877
2013年06月24日 17:47:48benjamin.petersonsetmessages: + msg191792
2013年06月24日 17:20:36jklothsetfiles: + object-2.patch

messages: + msg191791
2013年06月24日 16:47:27benjamin.petersonsetmessages: + msg191786
2013年06月24日 16:44:06jklothsetmessages: + msg191785
2013年06月24日 16:04:30benjamin.petersonsetmessages: + msg191780
2013年06月24日 11:19:58jklothsetmessages: + msg191752
2013年06月23日 21:01:18vstinnersetmessages: + msg191731
2013年06月17日 19:10:26jklothsetfiles: + object.patch
nosy: + jkloth
messages: + msg191364

2013年06月04日 22:48:45amaury.forgeotdarcsetmessages: + msg190628
2013年06月04日 22:29:43vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg190623
2013年05月27日 21:46:44python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg190156

resolution: fixed
stage: resolved
2013年05月16日 13:27:01amaury.forgeotdarcsetmessages: + msg189379
2013年05月16日 12:54:27Arfreversetnosy: + Arfrever
2013年05月16日 12:14:51ilblackdragonsetmessages: + msg189362
2013年04月09日 06:42:04amaury.forgeotdarcsetmessages: + msg186386
2013年04月07日 10:11:54asvetlovsetnosy: + asvetlov
2013年04月03日 05:07:35ilblackdragonsetfiles: + 17206-3.diff

messages: + msg185889
2013年03月31日 22:30:58lemburgsetnosy: + lemburg
messages: + msg185689
2013年03月31日 19:29:49larrysetmessages: + msg185643
2013年03月31日 14:25:17Mark.Shannonsetnosy: + Mark.Shannon
messages: + msg185627
2013年03月31日 11:56:21mark.dickinsonsetmessages: + msg185625
2013年03月31日 11:05:53mark.dickinsonsetmessages: + msg185623
2013年03月31日 05:10:53ilblackdragonsetfiles: + 17206-2.diff

messages: + msg185614
2013年03月28日 01:13:50vstinnersetnosy: + vstinner

messages: + msg185396
versions: - Python 3.1, Python 2.7, Python 3.2, Python 3.3, Python 3.5
2013年03月21日 07:44:16amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg184855
2013年03月21日 02:51:17ilblackdragonsetfiles: + asmdiff_17206.zip

messages: + msg184834
2013年03月21日 02:40:23ilblackdragonsetfiles: + perf-linux.log

messages: + msg184833
2013年03月18日 19:15:32mark.dickinsonsetmessages: + msg184495
2013年03月18日 18:59:53ilblackdragonsetmessages: + msg184492
2013年03月18日 18:43:55mark.dickinsonsetnosy: + mark.dickinson
messages: + msg184488
2013年03月18日 06:49:52ilblackdragonsetmessages: + msg184433
2013年03月18日 06:47:28ilblackdragonsetfiles: + perf.log

messages: + msg184432
2013年03月18日 06:46:04ilblackdragonsetnosy: + larry
messages: + msg184431
2013年03月18日 06:36:49ilblackdragonsetfiles: + 17206.diff

nosy: + ilblackdragon
messages: + msg184430

keywords: + patch
2013年02月16日 04:12:29benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg182209
2013年02月16日 01:40:43jceasetnosy: + jcea
2013年02月14日 16:32:30brett.cannonsetnosy: + brett.cannon
messages: + msg182111
2013年02月14日 16:07:37dmalcolmcreate

AltStyle によって変換されたページ (->オリジナル) /