homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Compiling recursive Python ASTs crash the interpreter
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, belopolsky, benjamin.peterson, eric.snow, gregory.p.smith, jkloth, kj, lukasz.langa, meador.inge, miss-islington, pablogsal, terry.reedy
Priority: Keywords: patch

Created on 2011年02月03日 05:02 by benjamin.peterson, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 20594 merged BTaskaya, 2020年06月02日 10:20
PR 26521 merged miss-islington, 2021年06月03日 20:01
PR 26522 merged BTaskaya, 2021年06月03日 20:35
PR 26550 merged BTaskaya, 2021年06月05日 18:39
PR 26554 closed BTaskaya, 2021年06月05日 23:23
PR 26578 jkloth, 2021年06月08日 16:24
PR 26604 merged BTaskaya, 2021年06月08日 17:00
PR 26605 merged BTaskaya, 2021年06月08日 17:03
PR 26607 merged BTaskaya, 2021年06月08日 17:04
Messages (27)
msg127783 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011年02月03日 05:02
You don't want to know why I was thinking about this...
$ ./python 
Python 3.2rc2+ (py3k:88302, Feb 1 2011, 19:02:10) 
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
Segmentation fault
msg127797 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011年02月03日 17:00
Looks like a stack overflow caused by an infinite recursion. I am not sure if it is possible to add cycle detection code without sacrificing performance or setting some arbitrary limits.
I wonder: Why ast nodes need to be mutable?
msg127798 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011年02月03日 17:08
2011年2月3日 Alexander Belopolsky <report@bugs.python.org>:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> Looks like a stack overflow caused by an infinite recursion. I am not sure if it is possible to add cycle detection code without sacrificing performance or setting some arbitrary limits.
Yes, it's definitely low priority. It's probably easier to crash the
interpreter by producing differently malformed ast anyway.
>
> I wonder: Why ast nodes need to be mutable?
So people can change them.
msg127799 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011年02月03日 17:21
On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
..
>> I wonder: Why ast nodes need to be mutable?
>
> So people can change them.
Well, they are hashable, so this needs to be done carefully. Is this
necessary for AST-based optimizations? Does Python actually change
AST after it has been created? Note that for some optimizations it
may be more appropriate to build a new tree rather than mutate the old
one. Depending on the algorithm, you may or may not need to change
the nodes after they have been created in the process.
msg127800 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011年02月03日 17:27
2011年2月3日 Alexander Belopolsky <report@bugs.python.org>:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
> <report@bugs.python.org> wrote:
> ..
>>> I wonder: Why ast nodes need to be mutable?
>>
>> So people can change them.
>
> Well, they are hashable, so this needs to be done carefully. Is this
> necessary for AST-based optimizations? Does Python actually change
> AST after it has been created? Note that for some optimizations it
> may be more appropriate to build a new tree rather than mutate the old
> one. Depending on the algorithm, you may or may not need to change
> the nodes after they have been created in the process.
Other people are, though. The hash is by identity anyway.
msg127801 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011年02月03日 17:28
2011年2月3日 Alexander Belopolsky <report@bugs.python.org>:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
> <report@bugs.python.org> wrote:
> ..
>>> I wonder: Why ast nodes need to be mutable?
>>
>> So people can change them.
>
> Well, they are hashable, so this needs to be done carefully. Is this
> necessary for AST-based optimizations? Does Python actually change
> AST after it has been created? Note that for some optimizations it
> may be more appropriate to build a new tree rather than mutate the old
> one. Depending on the algorithm, you may or may not need to change
> the nodes after they have been created in the process.
Other people are, though. The hash is by identity anyway.
msg127816 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011年02月03日 20:14
Alex: If the node attributes were not mutable, it would be extremely awkward, not to say inefficient, to mutate an already existing AST as returned by ast.parse().
The AST objects in the _ast module aren't what Python works with internally, anyway. When calling ast.parse(), the AST is converted to Python objects (these are defined in Python-ast.c), and compile()ing such an object converts them back to the internal tree representation. This conversion is where recursions would need to be handled.
msg155978 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012年03月16日 00:33
i haven't confirmed if it is this exact bug but I believe a coworker just ran into something similar. he wrote code to use the ast to remove docstrings from code before passing it to compile() (as that saves a noticable amount of memory). in the case the ast for code like:
def foo():
 """this is a docstring."""
Removing the docstring and passing such a thing to compile triggers a problem. A workaround was to add a pass in such cases:
if (node.body and isinstance(node.body[0], ast.Expr) and
 isinstance(node.body[0].value, ast.Str)):
 docstring = node.body.pop(0)
 if len(node.body) == 0:
 # An empty body will sometimes provoke a segfault when you call
 # compile on the code object.
 node.body.append(ast.Pass(lineno=docstring.lineno,
 col_offset=docstring.col_offset))
regardless, it'd be better if compile() *never* crashed on strange input.
msg155986 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012年03月16日 01:20
Have him try on 3.3. This should be less of an issue there where there is an AST validator. It doesn't fix this bug, but it does fix most accidental AST construction bugs.
msg358952 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2019年12月28日 17:41
We can probably implement something like this to prevent this happening
diff --git a/Parser/asdl_c.py b/Parser/asdl_c.py
index daac0966f5..f9da52da7f 100755
--- a/Parser/asdl_c.py
+++ b/Parser/asdl_c.py
@@ -559,6 +559,11 @@ class Obj2ModVisitor(PickleVisitor):
 self.emit("asdl_seq_SET(%s, i, val);" % field.name, depth+2)
 self.emit("}", depth+1)
 else:
+ self.emit("if (obj == tmp) {", depth+1)
+ self.emit("PyErr_SetString(PyExc_RuntimeError, \"Recursing fields "
+ "are not supported for AST nodes.\");", depth+2, reflow=False)
+ self.emit("goto failed;", depth+2)
+ self.emit("}", depth+1)
 self.emit("res = obj2ast_%s(tmp, &%s, arena);" %
 (field.type, field.name), depth+1)
 self.emit("if (res != 0) goto failed;", depth+1)
msg359122 - (view) Author: (ppperry) Date: 2019年12月31日 19:06
What about indirect cycles like below:
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> f = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = f
>>> f.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
(I tested, this also crashes)
msg373078 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020年07月06日 08:40
With 3.9 on Windows, using Benjamin's example, I do not get the Windows equivalent of a seg fault. However, execution stops at compile with no exception, including SystemExit.
These examples amount to limited fuzz testing of compile(). I think it should raise something like "SyntaxError: recursive ast" or even 'bad ast' if malformed non-recursive asts have the same issue.
msg373084 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020年07月06日 09:31
> With 3.9 on Windows, using Benjamin's example, I do not get the Windows equivalent of a seg fault. However, execution stops at compile with no exception, including SystemExit.
I can still reproduce on Linux,
 $ python
Python 3.10.0a0 (heads/bpo-xxxxx:f2947e354c, May 21 2020, 18:54:57) 
[GCC 9.2.1 20191008] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
[1] 15320 segmentation fault (core dumped) python
> These examples amount to limited fuzz testing of compile(). I think it should raise something like "SyntaxError: recursive ast" or even 'bad ast' if malformed non-recursive asts have the same issue.
I dont think it would be easy to locate such errors before they happen, instead I propose (actually already proposed in PR 20594) to add recursion guards to places where this might happen. This can prevent crashes on both direct and indirect cycles
>>> import ast
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while traversing 'expr' node
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> f = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = f
>>> f.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while traversing 'expr' node
msg395040 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年06月03日 20:01
New changeset f3491242e41933aa9529add7102edb68b80a25e9 by Batuhan Taskaya in branch 'main':
bpo-11105: Do not crash when compiling recursive ASTs (GH-20594)
https://github.com/python/cpython/commit/f3491242e41933aa9529add7102edb68b80a25e9
msg395046 - (view) Author: miss-islington (miss-islington) Date: 2021年06月03日 20:27
New changeset 976598d36bd180024c5f0edf1f7ec0f0b436380f by Miss Islington (bot) in branch '3.10':
bpo-11105: Do not crash when compiling recursive ASTs (GH-20594)
https://github.com/python/cpython/commit/976598d36bd180024c5f0edf1f7ec0f0b436380f
msg395050 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年06月03日 21:22
New changeset de58b319af3a72440a74e807cf8a1194ed0c6d8c by Batuhan Taskaya in branch '3.9':
[3.9] bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) (GH-26522)
https://github.com/python/cpython/commit/de58b319af3a72440a74e807cf8a1194ed0c6d8c
msg395172 - (view) Author: Ken Jin (kj) * (Python committer) Date: 2021年06月05日 17:36
The newly added test ``test_recursion_direct`` seems to trigger a stack overflow on windows in debug mode instead of a RecursionError. Release mode isn't affected and the test passes there.
One of the buildbots reflects this too: https://buildbot.python.org/all/#/builders/146/builds/337/steps/4/logs/stdio
I can avoid the crash by lowering the recursion limit in Python from 1000 to 500. The stack size for a window build is currently set to 2MB, which is usually lesser than *nix 8MB. So I think an easy solution is to increase the stack size for windows builds.
I'm guessing release builds aren't affected because some of the Py_EnterRecursiveCall helper functions are probably inlined and thus use less of the stack.
Opinions are greatly appreciated.
msg395173 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年06月05日 17:44
Batuhan, can you take a look?
msg395174 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月05日 17:47
> Batuhan, can you take a look?
Yes.
msg395177 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月05日 18:24
> The stack size for a window build is currently set to 2MB, which is usually lesser than *nix 8MB. So I think an easy solution is to increase the stack size for windows builds.
> I'm guessing release builds aren't affected because some of the Py_EnterRecursiveCall helper functions are probably inlined and thus use less of the stack.
> Opinions are greatly appreciated.
I don't think that we should make a global change for this case, AFAIK some of the core parts of the interpreter maintain their own recursion checks with different handling of windows limits. E.g;
https://github.com/python/cpython/blob/fa106a685c1f199aca5be5c2d0277a14cc9057bd/Python/marshal.c#L25-L40
We might need to end up with the same motion and do the handling by ourselves. Wdyt @pablogsal @kj?
msg395189 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月06日 00:11
After playing with it for a couple hours and without much success of creating a test environment (only using buildbots), I decided not to introduce hard limits. Even though they make the original tests to pass, they don't solve the problem overall and also more important part is that the 'hard limits' might cause regressions for people who do compile(<AST>) calls. 
For normal windows builds (as @kj noted) something might work in the current revision and we might just break it with introducing hard limits. Since the trees are tend to get really branchy, I don't think it is a good idea.
I'm open to any proposals/plans 
Extra: In the worst case that we can't come up with something (the AST converter functions are really long 2000+ LoC C functions so it is possible that there might be stuff that eats a lot of space on the stack), we can either
 a) revert => not a good option, this is not a regression on the python itself. It is a fix for other os's and windows release builds
 b) always skip the test on windows => we can do that but it might be counterintuitive for the future
 c) use a really low recursion limit for the test_recursion_* for windows => I'm open to fallback to this if nothing comes up. 
we might need to revert this though as is it is not a regression. It used to crash with the same exact error, just outside of the test suite, and now since it works for linux/macos/others + windows for release builds I wonder whether can just skip the test on windows and keep it as is in the worst scenario).
msg395190 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021年06月06日 00:20
> b) always skip the test on windows => we can do that but it might be counterintuitive for the future
Well, is not that the test is flaky technically, this means that the feature doesn't work on Windows (non release builds). So the reasoning has to be why we want/need to not support this on Windows. Otherwise we need to customize the limit on debug builds.
msg395341 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月08日 16:55
New changeset e58d762c1fb4ad5e021d016c80c2bc4513632d2f by Batuhan Taskaya in branch 'main':
bpo-11105: reduce the recursion limit for tests (GH-26550)
https://github.com/python/cpython/commit/e58d762c1fb4ad5e021d016c80c2bc4513632d2f
msg395342 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月08日 17:39
New changeset 8004c4570b1d1277ea8754e22b5eb60e63f5026c by Batuhan Taskaya in branch 'main':
bpo-11105: document the new test.support.infinite_recursion context manager (GH-26604)
https://github.com/python/cpython/commit/8004c4570b1d1277ea8754e22b5eb60e63f5026c
msg395343 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月08日 17:39
New changeset bd6f0d3eadfe5623657db6aeb69b94d21f86f4a0 by Batuhan Taskaya in branch '3.10':
[3.10] bpo-11105: reduce the recursion limit for tests. (GH-26607)
https://github.com/python/cpython/commit/bd6f0d3eadfe5623657db6aeb69b94d21f86f4a0
msg395344 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月08日 17:39
New changeset 87f502231c6d5b04a4d8aa23fba24fcf5303aebb by Batuhan Taskaya in branch '3.9':
[3.9] bpo-11105: reduce the recursion limit for tests. (GH-26605)
https://github.com/python/cpython/commit/87f502231c6d5b04a4d8aa23fba24fcf5303aebb
msg395345 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2021年06月08日 17:42
The issue has been solved, all buildbots should now pass. Will continue to monitor the situation. Thanks for the report @kj!
History
Date User Action Args
2022年04月11日 14:57:12adminsetgithub: 55314
2021年06月08日 17:42:44BTaskayasetstatus: pending -> closed
stage: patch review -> resolved
2021年06月08日 17:42:26BTaskayasetpriority: release blocker ->
status: open -> pending
resolution: fixed
messages: + msg395345
2021年06月08日 17:39:54BTaskayasetmessages: + msg395344
2021年06月08日 17:39:38BTaskayasetmessages: + msg395343
2021年06月08日 17:39:24BTaskayasetmessages: + msg395342
2021年06月08日 17:04:24BTaskayasetpull_requests: + pull_request25190
2021年06月08日 17:03:37BTaskayasetpull_requests: + pull_request25188
2021年06月08日 17:00:30BTaskayasetpull_requests: + pull_request25187
2021年06月08日 16:55:17BTaskayasetmessages: + msg395341
2021年06月08日 16:24:49jklothsetnosy: + jkloth
pull_requests: + pull_request25186
2021年06月06日 00:35:57ppperrysetnosy: - ppperry
2021年06月06日 00:20:05pablogsalsetmessages: + msg395190
2021年06月06日 00:11:57BTaskayasetmessages: + msg395189
2021年06月05日 23:23:06BTaskayasetpull_requests: + pull_request25141
2021年06月05日 18:39:56BTaskayasetstage: patch review
pull_requests: + pull_request25137
2021年06月05日 18:24:13BTaskayasetmessages: + msg395177
2021年06月05日 18:10:56BTaskayasetstatus: closed -> open

nosy: + lukasz.langa
priority: normal -> release blocker
resolution: fixed -> (no value)
stage: resolved -> (no value)
2021年06月05日 17:47:20BTaskayasetmessages: + msg395174
2021年06月05日 17:44:42pablogsalsetmessages: + msg395173
2021年06月05日 17:36:39kjsetnosy: + kj
messages: + msg395172
2021年06月03日 21:25:00pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021年06月03日 21:22:38pablogsalsetmessages: + msg395050
2021年06月03日 20:35:03BTaskayasetpull_requests: + pull_request25116
2021年06月03日 20:34:05BTaskayasetpriority: low -> normal
title: Compiling evil ast crashes interpreter -> Compiling recursive Python ASTs crash the interpreter
components: + Interpreter Core, - None
versions: + Python 3.9, Python 3.11
2021年06月03日 20:27:09miss-islingtonsetmessages: + msg395046
2021年06月03日 20:01:12miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request25115
2021年06月03日 20:01:10pablogsalsetnosy: + pablogsal
messages: + msg395040
2020年09月19日 19:02:53georg.brandlsetnosy: - georg.brandl
2020年07月06日 09:31:57BTaskayasetmessages: + msg373084
2020年07月06日 08:40:47terry.reedysetnosy: + terry.reedy

messages: + msg373078
versions: + Python 3.10, - Python 2.7, Python 3.9
2020年06月02日 10:20:26BTaskayasetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request19824
2019年12月31日 19:06:17ppperrysetnosy: + ppperry
messages: + msg359122
2019年12月28日 17:41:29BTaskayasetversions: + Python 3.9, - Python 3.2, Python 3.3
2019年12月28日 17:41:12BTaskayasetnosy: + BTaskaya
messages: + msg358952
2012年03月16日 03:24:55eric.snowsetnosy: + eric.snow
2012年03月16日 01:20:34benjamin.petersonsetmessages: + msg155986
2012年03月16日 00:33:04gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg155978
2011年12月24日 18:53:33ezio.melottisetstage: test needed
components: + None
versions: - Python 3.1
2011年08月12日 04:29:27meador.ingesetnosy: + meador.inge
2011年02月03日 20:14:53georg.brandlsetnosy: + georg.brandl
messages: + msg127816
2011年02月03日 17:28:06benjamin.petersonsetmessages: + msg127801
2011年02月03日 17:27:13benjamin.petersonsetmessages: + msg127800
2011年02月03日 17:21:42belopolskysetmessages: + msg127799
2011年02月03日 17:08:02benjamin.petersonsetmessages: + msg127798
2011年02月03日 17:00:32belopolskysetnosy: + belopolsky
messages: + msg127797
2011年02月03日 05:02:39benjamin.petersoncreate

AltStyle によって変換されたページ (->オリジナル) /