homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PyOS_InterruptOccurred() now requires to hold the GIL: PyOS_Readline() crash
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: JelleZijlstra, benjamin.peterson, corona10, db3l, eryksun, remi.lapeyre, serhiy.storchaka, stutzbach, vstinner
Priority: normal Keywords: 3.9regression, patch

Created on 2020年05月30日 14:19 by JelleZijlstra, last changed 2022年04月11日 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 20549 closed corona10, 2020年05月31日 04:58
PR 20571 merged vstinner, 2020年06月01日 13:29
PR 20578 merged vstinner, 2020年06月01日 18:13
PR 20579 merged vstinner, 2020年06月01日 18:32
PR 20599 merged vstinner, 2020年06月02日 14:02
PR 20613 merged vstinner, 2020年06月03日 15:08
PR 20616 merged vstinner, 2020年06月03日 15:55
PR 20618 merged vstinner, 2020年06月03日 16:36
PR 20779 merged vstinner, 2020年06月10日 14:25
PR 20785 merged vstinner, 2020年06月10日 16:58
PR 20787 merged vstinner, 2020年06月10日 17:27
Messages (31)
msg370386 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2020年05月30日 14:19
$ gdb ./python 
...
(gdb) r
...
Python 3.9.0b1 (tags/v3.9.0b1:97fe9cfd9f8, May 30 2020, 05:26:48) 
...
>>> import io
>>> io.FileIO(0).name
0
>>> 
Program received signal SIGSEGV, Segmentation fault.
_PyInterpreterState_GET () at ./Include/internal/pycore_pystate.h:100
100 return tstate->interp;
(gdb) bt
#0 _PyInterpreterState_GET () at ./Include/internal/pycore_pystate.h:100
#1 PyOS_InterruptOccurred () at ./Modules/signalmodule.c:1785
#2 0x0000000000673905 in my_fgets (buf=buf@entry=0xa40780 "8LJ367円377円177円", len=len@entry=100, fp=fp@entry=0x7ffff74a48e0 <_IO_2_1_stdin_>) at Parser/myreadline.c:87
#3 0x000000000067397b in PyOS_StdioReadline (sys_stdin=sys_stdin@entry=0x7ffff74a48e0 <_IO_2_1_stdin_>, sys_stdout=sys_stdout@entry=0x7ffff74a5620 <_IO_2_1_stdout_>, 
 prompt=prompt@entry=0x7ffff6f8d8e0 ">>> ") at Parser/myreadline.c:269
#4 0x0000000000673b4f in PyOS_Readline (sys_stdin=0x7ffff74a48e0 <_IO_2_1_stdin_>, sys_stdout=0x7ffff74a5620 <_IO_2_1_stdout_>, prompt=0x7ffff6f8d8e0 ">>> ") at Parser/myreadline.c:355
#5 0x00000000005d90d9 in tok_nextc (tok=0xa3fd30) at Parser/tokenizer.c:856
#6 0x00000000005dad02 in tok_get (tok=tok@entry=0xa3fd30, p_start=p_start@entry=0x7fffffffd508, p_end=p_end@entry=0x7fffffffd510) at Parser/tokenizer.c:1194
#7 0x00000000005dcb69 in PyTokenizer_Get (tok=0xa3fd30, p_start=p_start@entry=0x7fffffffd508, p_end=p_end@entry=0x7fffffffd510) at Parser/tokenizer.c:1842
#8 0x0000000000653c73 in _PyPegen_fill_token (p=0x7ffff6f8f540) at Parser/pegen/pegen.c:551
#9 0x0000000000670355 in statement_newline_rule (p=0x7ffff6f8f540) at Parser/pegen/parse.c:1143
#10 interactive_rule (p=0x7ffff6f8f540) at Parser/pegen/parse.c:725
#11 _PyPegen_parse (p=p@entry=0x7ffff6f8f540) at Parser/pegen/parse.c:19388
#12 0x0000000000654b32 in _PyPegen_run_parser (p=0x7ffff6f8f540) at Parser/pegen/pegen.c:1037
#13 0x0000000000654e4c in _PyPegen_run_parser_from_file_pointer (fp=fp@entry=0x70f74a48e0, start_rule=start_rule@entry=80, filename_ob=filename_ob@entry=0x7ffff6f84eb0, 
 enc=enc@entry=0x7ffff704ec60 "utf-8", ps1=ps1@entry=0x7ffff6f8d8e0 ">>> ", ps2=ps2@entry=0x90000000d0 <error: Cannot access memory at address 0x90000000d0>, flags=0x7fffffffd7b8, 
 errcode=0x7fffffffd6e4, arena=0x7ffff6ff6b30) at Parser/pegen/pegen.c:1097
#14 0x00000000005d6bea in PyPegen_ASTFromFileObject (fp=0x70f74a48e0, fp@entry=0x7ffff74a48e0 <_IO_2_1_stdin_>, filename_ob=filename_ob@entry=0x7ffff6f84eb0, mode=80, mode@entry=256, 
 enc=enc@entry=0x7ffff704ec60 "utf-8", ps1=ps1@entry=0x7ffff6f8d8e0 ">>> ", ps2=0x90000000d0 <error: Cannot access memory at address 0x90000000d0>, ps2@entry=0x7ffff6f8dbe0 "... ", 
 flags=<optimized out>, errcode=<optimized out>, arena=<optimized out>) at Parser/pegen/peg_api.c:52
#15 0x00000000005460d9 in PyRun_InteractiveOneObjectEx (fp=fp@entry=0x7ffff74a48e0 <_IO_2_1_stdin_>, filename=filename@entry=0x7ffff6f84eb0, flags=flags@entry=0x7fffffffd7b8) at Python/pythonrun.c:243
#16 0x000000000054631e in PyRun_InteractiveLoopFlags (fp=fp@entry=0x7ffff74a48e0 <_IO_2_1_stdin_>, filename_str=filename_str@entry=0x673e32 "<stdin>", flags=flags@entry=0x7fffffffd7b8)
 at Python/pythonrun.c:122
#17 0x0000000000546d4c in PyRun_AnyFileExFlags (fp=0x7ffff74a48e0 <_IO_2_1_stdin_>, filename=0x673e32 "<stdin>", closeit=0, flags=0x7fffffffd7b8) at Python/pythonrun.c:81
#18 0x0000000000429fb7 in pymain_run_stdin (cf=0x7fffffffd7b8, config=0x9bd800) at Modules/main.c:467
#19 pymain_run_python (exitcode=0x7fffffffd7b0) at Modules/main.c:556
#20 Py_RunMain () at Modules/main.c:632
#21 0x000000000042a2d6 in pymain_main (args=0x7fffffffd8a0) at Modules/main.c:662
#22 Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:686
#23 0x00007ffff7100830 in __libc_start_main (main=0x41ef30 <main>, argc=1, argv=0x7fffffffd9f8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd9e8)
 at ../csu/libc-start.c:291
#24 0x0000000000429089 in _start ()
(gdb) 
Same happens with Python 3.9.0b1+ (heads/3.9:588efc29c5d, May 30 2020, 14:16:10) (current HEAD of the 3.9 branch).
In previous versions of Python this would exit the interpreter but not segfault.
msg370387 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020年05月30日 15:01
Thanks for report, I can reproduce it
msg370388 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020年05月30日 16:11
Simpler reproducer:
import os
os.close(0)
msg370392 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020年05月30日 17:16
FYI this change fix this issue.
--- a/Modules/signalmodule.c
+++ b/Modules/signalmodule.c
@@ -1782,7 +1782,11 @@ PyOS_FiniInterrupts(void)
 int
 PyOS_InterruptOccurred(void)
 {
- PyInterpreterState *interp = _PyInterpreterState_GET();
+ PyThreadState *tstate = _PyThreadState_GET();
+ if (!tstate) {
+ return 0;
+ }
+ PyInterpreterState *interp = tstate->interp;
 if (!_Py_ThreadCanHandleSignals(interp)) {
 return 0;
 }
msg370549 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 12:53
The following change modified PyOS_InterruptOccurred() to require the hold the GIL:
commit d83168854e19d0381fa57db25fca6c622917624f
Author: Victor Stinner <vstinner@python.org>
Date: Fri Mar 20 14:50:35 2020 +0100
 bpo-40010: Optimize pending calls in multithreaded applications (GH-19091)
 
 If a thread different than the main thread schedules a pending call
 (Py_AddPendingCall()), the bytecode evaluation loop is no longer
 interrupted at each bytecode instruction to check for pending calls
 which cannot be executed. Only the main thread can execute pending
 calls.
 
 Previously, the bytecode evaluation loop was interrupted at each
 instruction until the main thread executes pending calls.
 
 * Add _Py_ThreadCanHandlePendingCalls() function.
 * SIGNAL_PENDING_CALLS() now only sets eval_breaker to 1 if the
 current thread can execute pending calls. Only the main thread can
 execute pending calls.
--
PyOS_InterruptOccurred() is part of the limited C API, but it's not even documented :-( So it's not easy to say if it's a backward incompatible change. At least, as the author of the change, I can say that the change was deliberate.
bpo-40010 rationale is that only the main thread of the main interpreter must call Python signal handlers. So when another thread or another interpreter (running the main thread) asks "should the Python signal handlers be executed", the answer is always "no", even if yes, Python got a signal (which requires to execute at least one Python signal handler).
The change is that PyOS_InterruptOccurred() now requires to hold the GIL to get current Python thread state: _PyThreadState_GET() returns NULL when the GIL is released.
Modifying PyOS_InterruptOccurred() to always return 0 if the GIL is released (if _PyThreadState_GET() returns NULL) is wrong: if the function is called by the main thread of the main interpreter, it must return 1.
--
By the way, I rewrote the C signal handler in Python 3.9 to not require to get the current Python thread state. In Python 3.8, it was modified and started to require the current Python thread if writing into the wakeup file descriptor failed to schedule a "pending call".
commit b54a99d6432de93de85be2b42a63774f8b4581a0
Author: Victor Stinner <vstinner@python.org>
Date: Wed Apr 8 23:35:05 2020 +0200
 bpo-40082: trip_signal() uses the main interpreter (GH-19441)
 
 Fix the signal handler: it now always uses the main interpreter,
 rather than trying to get the current Python thread state.
 
 The following function now accepts an interpreter, instead of a
 Python thread state:
 
 * _PyEval_SignalReceived()
 * _Py_ThreadCanHandleSignals()
 * _PyEval_AddPendingCall()
 * COMPUTE_EVAL_BREAKER()
 * SET_GIL_DROP_REQUEST(), RESET_GIL_DROP_REQUEST()
 * SIGNAL_PENDING_CALLS(), UNSIGNAL_PENDING_CALLS()
 * SIGNAL_PENDING_SIGNALS(), UNSIGNAL_PENDING_SIGNALS()
 * SIGNAL_ASYNC_EXC(), UNSIGNAL_ASYNC_EXC()
 
 Py_AddPendingCall() now uses the main interpreter if it fails to the
 current Python thread state.
 
 Convert _PyThreadState_GET() and PyInterpreterState_GET_UNSAFE()
 macros to static inline functions.
msg370550 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 13:00
> PyOS_InterruptOccurred() now requires to hold the GIL
This change impacted gdb:
* https://sourceware.org/pipermail/gdb-patches/2020-May/169110.html
* https://bugzilla.redhat.com/show_bug.cgi?id=1829702
gdbpy_check_quit_flag() called PyOS_InterruptOccurred() without holding the GIL. It worked in Python 3.8, but started to crash in Python 3.9 beta1.
gdb had another issue, it started by always releasing the GIL and always called the Python C API with the GIL released (if I understood correctly).
gdb was fixed by calling PyGILState_Ensure()/PyGILState_Release() when calling the Python C API, especially PyOS_InterruptOccurred().
Maybe the minimum would be to *document* that the GIL must be held to call PyOS_InterruptOccurred(), and that it's not a new requirements.
msg370552 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 14:02
New changeset 3026cad59b87751a9215111776cac8e819458fce by Victor Stinner in branch 'master':
bpo-40826: Add _Py_EnsureTstateNotNULL() macro (GH-20571)
https://github.com/python/cpython/commit/3026cad59b87751a9215111776cac8e819458fce
msg370553 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 15:03
I can reproduce PyOS_InterruptOccurred() crash in Python 3.8 if I remove readline.cpython-38d-x86_64-linux-gnu.so and I disable EINTR error checking in my_fgets():
diff --git a/Parser/myreadline.c b/Parser/myreadline.c
index 43e5583b8b..2712dedacd 100644
--- a/Parser/myreadline.c
+++ b/Parser/myreadline.c
@@ -73,7 +73,7 @@ my_fgets(char *buf, int len, FILE *fp)
 clearerr(fp);
 return -1; /* EOF */
 }
-#ifdef EINTR
+#if 0
 if (err == EINTR) {
 int s;
 PyEval_RestoreThread(_PyOS_ReadlineTState);
vstinner@apu$ ./python
Python 3.8.3+ (heads/3.8-dirty:00a240bf7f, Jun 1 2020, 17:00:22) 
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
^C
Erreur de segmentation (core dumped)
I cannot reproduce the issue in Python 3.7.
---
Python 3.7:
---
int
PyOS_InterruptOccurred(void)
{
 if (_Py_atomic_load_relaxed(&Handlers[SIGINT].tripped)) {
 if (PyThread_get_thread_ident() != main_thread)
 return 0;
 _Py_atomic_store_relaxed(&Handlers[SIGINT].tripped, 0);
 return 1;
 }
 return 0;
}
---
Python 3.8:
---
static int
is_main(_PyRuntimeState *runtime)
{
 unsigned long thread = PyThread_get_thread_ident();
 PyInterpreterState *interp = _PyRuntimeState_GetThreadState(runtime)->interp;
 return (thread == runtime->main_thread
 && interp == runtime->interpreters.main);
}
int
PyOS_InterruptOccurred(void)
{
 if (_Py_atomic_load_relaxed(&Handlers[SIGINT].tripped)) {
 _PyRuntimeState *runtime = &_PyRuntime;
 if (!is_main(runtime)) {
 return 0;
 }
 _Py_atomic_store_relaxed(&Handlers[SIGINT].tripped, 0);
 return 1;
 }
 return 0;
}
---
is_main() function was added in Python 3.8 by:
commit 64d6cc826dacebc2493b1bb5e8cb97828eb76f81
Author: Eric Snow <ericsnowcurrently@gmail.com>
Date: Sat Feb 23 15:40:43 2019 -0700
 bpo-35724: Explicitly require the main interpreter for signal-handling. (GH-11530)
 
 Ensure that the main interpreter is active (in the main thread) for signal-handling operations. This is increasingly relevant as people use subinterpreters more.
 
 https://bugs.python.org/issue35724 
msg370554 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 15:23
It seems like this issue was introduced by this change:
commit 728189884e0e128c4ffc57b785b04584d57a90c0
Author: Victor Stinner <vstinner@python.org>
Date: Thu Mar 26 22:28:11 2020 +0100
 bpo-38644: Pass tstate explicitly in signalmodule.c (GH-19184)
 
 PyOS_InterruptOccurred() now checks _Py_ThreadCanHandleSignals()
 before checking if SIGINT is tripped.
 Include/internal/pycore_pyerrors.h | 2 +
 Modules/signalmodule.c | 152 ++++++++++++++++++++++---------------
 Python/ceval.c | 4 +-
 3 files changed, 93 insertions(+), 65 deletions(-)
Before:
---
static inline int
_Py_IsMainInterpreter(PyThreadState* tstate)
{ 
 /* Use directly _PyRuntime rather than tstate->interp->runtime, since
 this function is used in performance critical code path (ceval) */
 return (tstate->interp == _PyRuntime.interpreters.main);
} 
static inline int 
_Py_ThreadCanHandleSignals(PyThreadState *tstate)
{ 
 return (_Py_IsMainThread() && _Py_IsMainInterpreter(tstate));
} 
 
static int 
thread_can_handle_signals(void)
{ 
 PyThreadState *tstate = _PyThreadState_GET(); 
 return _Py_ThreadCanHandleSignals(tstate);
} 
int
PyOS_InterruptOccurred(void)
{
 if (_Py_atomic_load_relaxed(&Handlers[SIGINT].tripped)) {
 if (!thread_can_handle_signals()) {
 return 0;
 }
 _Py_atomic_store_relaxed(&Handlers[SIGINT].tripped, 0);
 return 1;
 }
 return 0;
}
---
After:
--- 
static inline int 
_Py_IsMainInterpreter(PyThreadState* tstate)
{ 
 /* Use directly _PyRuntime rather than tstate->interp->runtime, since 
 this function is used in performance critical code path (ceval) */ 
 return (tstate->interp == _PyRuntime.interpreters.main);
}
static inline int 
_Py_ThreadCanHandleSignals(PyThreadState *tstate) 
{ 
 return (_Py_IsMainThread() && _Py_IsMainInterpreter(tstate));
}
int
PyOS_InterruptOccurred(void)
{
 PyThreadState *tstate = _PyThreadState_GET();
 if (!_Py_ThreadCanHandleSignals(tstate)) {
 return 0;
 }
 if (!_Py_atomic_load_relaxed(&Handlers[SIGINT].tripped)) {
 return 0;
 }
 _Py_atomic_store_relaxed(&Handlers[SIGINT].tripped, 0);
 return 1;
}
---
The difference is that tstate->interp is now checked *before* checking if SIGINT signal is tripped.
msg370555 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 15:36
So PyOS_InterruptOccurred() must be called with the GIL held since 3.8, it's just that the nobody noticed the bug before.
If SIGGINT is tripped and the GIL is released, PyOS_InterruptOccurred() does also crash in Python 3.8.
--
One way to see the bug in Python 3.8 without having to trip SIGINT.
Apply this patch:
diff --git a/Modules/signalmodule.c b/Modules/signalmodule.c
index 0c9a2671fe..b850af3163 100644
--- a/Modules/signalmodule.c
+++ b/Modules/signalmodule.c
@@ -1729,11 +1729,12 @@ PyOS_FiniInterrupts(void)
 int
 PyOS_InterruptOccurred(void)
 {
+ _PyRuntimeState *runtime = &_PyRuntime;
+ if (!is_main(runtime)) {
+ return 0;
+ }
+
 if (_Py_atomic_load_relaxed(&Handlers[SIGINT].tripped)) {
- _PyRuntimeState *runtime = &_PyRuntime;
- if (!is_main(runtime)) {
- return 0;
- }
 _Py_atomic_store_relaxed(&Handlers[SIGINT].tripped, 0);
 return 1;
 }
$ make
$ find -name "*readline*so" -delete
$ ./python
Python 3.8.3+ (heads/3.8-dirty:00a240bf7f, Jun 1 2020, 17:24:09) 
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os; os.close(0)
>>>
Erreur de segmentation (core dumped)
---
Reproduce the bug in Python 3.8 without modifying the code, using gdb to trigger events SIGINT at the right place:
$ gdb ./python
(gdb) b my_fgets
Breakpoint 1 at 0x68b941: file Parser/myreadline.c, line 39.
(gdb) run
Python 3.8.3+ (heads/3.8:00a240bf7f, Jun 1 2020, 17:27:24) 
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
Breakpoint 1, my_fgets (...)
(gdb) p (void)close(0)
(gdb) delete 1
(gdb) signal SIGINT
Continuing with signal SIGINT.
Program received signal SIGSEGV, Segmentation fault.
is_main (runtime=0x7ff6c0 <_PyRuntime>) at ./Modules/signalmodule.c:193
193	 PyInterpreterState *interp = _PyRuntimeState_GetThreadState(runtime)->interp;
msg370556 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 15:38
I add "Python 3.8" version: it's also affected.
msg370569 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 18:34
New changeset cbe129692293251e7fbcea9ff0d822824d90c140 by Victor Stinner in branch 'master':
bpo-40826: PyOS_InterruptOccurred() requires GIL (GH-20578)
https://github.com/python/cpython/commit/cbe129692293251e7fbcea9ff0d822824d90c140
msg370575 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 18:57
See also bpo-40839: Disallow calling PyDict_GetItem() with the GIL released.
msg370576 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月01日 18:59
New changeset c353764fd564e401cf47a5d9efab18c72c60014e by Victor Stinner in branch 'master':
bpo-40826: Fix GIL usage in PyOS_Readline() (GH-20579)
https://github.com/python/cpython/commit/c353764fd564e401cf47a5d9efab18c72c60014e
msg370661 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月03日 12:40
New changeset fa7ab6aa0f9a4f695e5525db5a113cd21fa93787 by Victor Stinner in branch 'master':
bpo-40826: Add _PyOS_InterruptOccurred(tstate) function (GH-20599)
https://github.com/python/cpython/commit/fa7ab6aa0f9a4f695e5525db5a113cd21fa93787
msg370676 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月03日 15:49
New changeset 5d2396c8cf68fba0a949c6ce474a505e3aba9c1f by Victor Stinner in branch '3.9':
[3.9] bpo-40826: Fix GIL usage in PyOS_Readline() (GH-20613)
https://github.com/python/cpython/commit/5d2396c8cf68fba0a949c6ce474a505e3aba9c1f
msg370679 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月03日 16:28
New changeset 6f7346bb3983cd7a6aa97eeeafffb3cecd5292b8 by Victor Stinner in branch '3.8':
[3.9] bpo-40826: Fix GIL usage in PyOS_Readline() (GH-20613) (GH-20616)
https://github.com/python/cpython/commit/6f7346bb3983cd7a6aa97eeeafffb3cecd5292b8
msg370682 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月03日 18:17
New changeset 6d62dc1ea4e191b8486e80a85ca0694215375424 by Victor Stinner in branch '3.9':
[3.9] bpo-40826: PyOS_InterruptOccurred() requires GIL (GH-20578) (GH-20618)
https://github.com/python/cpython/commit/6d62dc1ea4e191b8486e80a85ca0694215375424
msg370683 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月03日 18:18
Ok, I fixed bugs in the REPL on Linux and Windows in 3.8, 3.9 and master branches.
I also modified PyOS_InterruptOccurred() to fail with a fatal error and a nice error message, rather than crashing with a core dump, if it's called with the GIL released. I changed in 3.9 and master branches.
Thanks Jelle Zijlstra for the bug report! I'm now closing the issue.
msg370692 - (view) Author: David Bolen (db3l) * Date: 2020年06月04日 01:34
I haven't had a chance to look too deeply, but the final set of commits (starting with fa7ab6aa) appear to be the common thread with all branches now failing with timeout exceptions in test_repl on the Windows 10 buildbot.
The first instance was in the 3.x branch (https://buildbot.python.org/all/#/builders/129/builds/1191) but they all seem similar.
-- David
msg370694 - (view) Author: David Bolen (db3l) * Date: 2020年06月04日 02:53
It appears the recent commit is causing a CRT exception dialog in test_close_stdin (test_repl.TestInteractiveInterpreter). The dialog can't get answered, which is what leads to the eventual timeout.
The assertion is "_osfile(fh) & FOPEN" from osfinfo.cpp on line 258. It actually appears twice (if the first is ignored). If both are ignored the test passes.
I was somewhat surprised to see the dialog, as I believe CRT exception dialogs are disabled during test runs (others are present in test logs). Perhaps the child interpreter used by test_repl doesn't inherit that behavior. If so, this was probably always a risk, there just weren't any assertions occurring before this in the child interpreter.
msg370696 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2020年06月04日 07:08
> Perhaps the child interpreter used by test_repl doesn't inherit 
> that behavior.
The OS error mode is inherited, unless the child is created with CREATE_DEFAULT_ERROR_MODE flag. But the CRT error mode isn't inherited. (For the spawn* family, in principle the CRT could pass its error modes in reserved STARTUPINFO fields, like it does for file descriptors, but it doesn't do that, and subprocess wouldn't be privy to that protocol anyway.) 
For example:
 >>> import sys, subprocess
 >>> from msvcrt import *
 >>> SEM_FAILCRITICALERRORS, CRTDBG_MODE_FILE, CRTDBG_MODE_WNDW
 (1, 1, 4)
 >>> SetErrorMode(SEM_FAILCRITICALERRORS)
 0
 >>> CrtSetReportMode(CRT_ERROR, CRTDBG_MODE_FILE)
 4
The return values are the previous values, which are respectively 0 (default) for the OS and CRTDBG_MODE_WNDW (message box) for the CRT. Now check the initial values in a child process:
 >>> subprocess.call(f'{sys.executable} -q')
 >>> from msvcrt import *
 >>> SetErrorMode(SEM_FAILCRITICALERRORS)
 1
 >>> CrtSetReportMode(CRT_ERROR, CRTDBG_MODE_FILE)
 4
The OS error mode of the parent was inherited, but the parent's CRT error mode was not.
libregrtest has a convenient function to suppress error reporting in the child. For example:
 >>> import sys, subprocess
 >>> subprocess.call(f'{sys.executable} -q')
 >>> import os
 >>> from test.libregrtest import setup
 >>> setup.suppress_msvcrt_asserts(0)
 >>> os.close(0)
 >>>
 0
msg370716 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月04日 16:31
I am confused. Code which uses the closed file descriptor 0 uses "_Py_BEGIN_SUPPRESS_IPH":
 _Py_BEGIN_SUPPRESS_IPH
 hStdIn = (HANDLE)_get_osfhandle(fileno(sys_stdin));
 hStdErr = (HANDLE)_get_osfhandle(fileno(stderr));
 _Py_END_SUPPRESS_IPH
and
 _Py_BEGIN_SUPPRESS_IPH
 handle = (HANDLE)_get_osfhandle(fileno(fp));
 _Py_END_SUPPRESS_IPH
Macros defined as:
extern _invalid_parameter_handler _Py_silent_invalid_parameter_handler;
#define _Py_BEGIN_SUPPRESS_IPH { _invalid_parameter_handler _Py_old_handler = \
 _set_thread_local_invalid_parameter_handler(_Py_silent_invalid_parameter_handler);
#define _Py_END_SUPPRESS_IPH _set_thread_local_invalid_parameter_handler(_Py_old_handler); }
with PC/invalid_parameter_handler.c:
static void __cdecl _silent_invalid_parameter_handler(
 wchar_t const* expression,
 wchar_t const* function,
 wchar_t const* file,
 unsigned int line,
 uintptr_t pReserved) { }
_invalid_parameter_handler _Py_silent_invalid_parameter_handler = _silent_invalid_parameter_handler;
The purpose of _Py_BEGIN_SUPPRESS_IPH is to suppress such popup, no?
msg370718 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2020年06月04日 17:59
> The purpose of _Py_BEGIN_SUPPRESS_IPH is to suppress such popup, no?
That macro suppresses the CRT's invalid parameter handler, which by default terminates abnormally with a fastfail (status code 0xC0000409), even in a release build. 
In a debug build, we still have message boxes from failed asserts that call _CrtDbgReportW. For example, the following stack trace shows _CrtDbgReportW called from _get_osfhandle:
 Call Site
 win32u!NtUserWaitMessage
 user32!DialogBox2
 user32!InternalDialogBox
 user32!SoftModalMessageBox
 user32!MessageBoxWorker
 user32!MessageBoxTimeoutW
 user32!MessageBoxW
 ucrtbased!__acrt_MessageBoxW
 ucrtbased!__crt_char_traits<wchar_t>::message_box<HWND__ * __ptr64,
 wchar_t const * __ptr64 const & __ptr64,
 wchar_t const * __ptr64 const & __ptr64,
 unsigned int const & __ptr64>
 ucrtbased!common_show_message_box<wchar_t>
 ucrtbased!__acrt_show_wide_message_box
 ucrtbased!common_message_window<wchar_t>
 ucrtbased!__acrt_MessageWindowW
 ucrtbased!_VCrtDbgReportW
 ucrtbased!_CrtDbgReportW
 ucrtbased!_get_osfhandle
 python310_d!PyOS_StdioReadline
_get_osfhandle validates that the fd is open via
 _VALIDATE_CLEAR_OSSERR_RETURN(_osfile(fh) & FOPEN, EBADF, -1);
which uses the following macro:
 #define _VALIDATE_CLEAR_OSSERR_RETURN(expr, errorcode, retexpr) \
 { \
 int _Expr_val=!!(expr); \
 _ASSERT_EXPR((_Expr_val), _CRT_WIDE(#expr)); \
 if (!(_Expr_val)) \
 { \
 _doserrno = 0L; \
 errno = errorcode; \
 _INVALID_PARAMETER(_CRT_WIDE(#expr)); \
 return retexpr; \
 } \
 }
In a debug build, _ASSERT_EXPR expands as:
 #define _ASSERT_EXPR(expr, msg) \
 (void)( \
 (!!(expr)) || \
 (1 != _CrtDbgReportW(_CRT_ASSERT, _CRT_WIDE(__FILE__), __LINE__, NULL, L"%ls", msg)) || \
 (_CrtDbgBreak(), 0) \
 )
which is where _CrtDbgReportW gets called. By default it reports the assertion failure with a message box. The suppression of this in libregrtest either ignores this report or sends it to stderr:
 for m in [msvcrt.CRT_WARN, msvcrt.CRT_ERROR, msvcrt.CRT_ASSERT]:
 if verbose:
 msvcrt.CrtSetReportMode(m, msvcrt.CRTDBG_MODE_FILE)
 msvcrt.CrtSetReportFile(m, msvcrt.CRTDBG_FILE_STDERR)
 else:
 msvcrt.CrtSetReportMode(m, 0)
msvcrt.CrtSetReportMode and msvcrt.CrtSetReportFile only exist in a debug build.
msg371169 - (view) Author: David Bolen (db3l) * Date: 2020年06月10日 07:21
So a script I use on the buildbot intended to prevent hanging on assertions was failing to work in this particular case, which I've corrected. That changes the failure mode for new Win10 buildbot runs.
The test in question (test_close_stdin) still fails, but does so immediately (return code not 0, as the child process aborts) rather than timing out. 
If the assertion is expected, and something to be ignored, the test itself still needs to handle that.
msg371223 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月10日 16:49
New changeset f6e58aefde2e57e4cb11ea7743955da53a3f1e80 by Victor Stinner in branch 'master':
bpo-40826: Fix test_repl.test_close_stdin() on Windows (GH-20779)
https://github.com/python/cpython/commit/f6e58aefde2e57e4cb11ea7743955da53a3f1e80
msg371224 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月10日 17:00
Thanks for the bug report David Bolen and thanks for the analysis Eryk Sun. The bug should be be fixed in the master branch, and backports to 3.8 and 3.9 are coming.
msg371226 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月10日 17:25
New changeset 4a4f660cfde8b683634c53e6214a6baa51de43b1 by Victor Stinner in branch '3.9':
bpo-40826: Fix test_repl.test_close_stdin() on Windows (GH-20779) (GH-20785)
https://github.com/python/cpython/commit/4a4f660cfde8b683634c53e6214a6baa51de43b1
msg371227 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月10日 17:45
New changeset c7a6c7b5279f766e65b7cc9dc5bebb73acee6672 by Victor Stinner in branch '3.8':
bpo-40826: Fix test_repl.test_close_stdin() on Windows (GH-20779) (GH-20785) (GH-20787)
https://github.com/python/cpython/commit/c7a6c7b5279f766e65b7cc9dc5bebb73acee6672
msg371233 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2020年06月10日 18:41
Why doesn't SuppressCrashReport suppress the hard error dialog (i.e. SEM_FAILCRITICALERRORS)? This dialog gets created by the NtRaiseHardError system call. For example:
 >>> NtRaiseHardError = ctypes.windll.ntdll.NtRaiseHardError
 >>> response = (ctypes.c_ulong * 1)()
With the default OS error mode, the following raises a hard error dialog for STATUS_UNSUCCESSFUL (0xC0000001), with abort/retry/ignore options:
 >>> NtRaiseHardError(0xC000_0001, 0, 0, None, 0, response)
 0
 >>> response[0]
 2
The response value is limited to abort (2), retry (7), ignore (4) -- or simply returned (0) when the process hard error mode is disabled: 
 >>> msvcrt.SetErrorMode(msvcrt.SEM_FAILCRITICALERRORS)
 0
 >>> NtRaiseHardError(0xC000_0001, 0, 0, None, 0, response)
 0
 >>> response[0]
 0
NtRaiseHardError also checks for an error-mode override flag (0x1000_0000) in the status code, which is used in cases such as WinAPI FatalAppExitW. For example, the following will raise the dialog regardless of the hard error mode:
 >>> NtRaiseHardError(0xC000_0001 | 0x1000_0000, 0, 0, None, 0, response)
 0
 >>> response[0]
 2
msg371239 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020年06月10日 20:13
> Why doesn't SuppressCrashReport suppress the hard error dialog (i.e. SEM_FAILCRITICALERRORS)? This dialog gets created by the NtRaiseHardError system call.
Would you mind to open a separated issue?
History
Date User Action Args
2022年04月11日 14:59:31adminsetgithub: 85003
2020年06月10日 20:13:01vstinnersetmessages: + msg371239
2020年06月10日 18:41:22eryksunsetmessages: + msg371233
2020年06月10日 17:46:11vstinnersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020年06月10日 17:45:53vstinnersetmessages: + msg371227
2020年06月10日 17:27:35vstinnersetpull_requests: + pull_request19982
2020年06月10日 17:25:04vstinnersetmessages: + msg371226
2020年06月10日 17:00:22vstinnersetmessages: + msg371224
2020年06月10日 16:58:19vstinnersetpull_requests: + pull_request19981
2020年06月10日 16:49:29vstinnersetmessages: + msg371223
2020年06月10日 14:25:28vstinnersetstage: resolved -> patch review
pull_requests: + pull_request19975
2020年06月10日 07:21:24db3lsetmessages: + msg371169
2020年06月04日 17:59:45eryksunsetmessages: + msg370718
2020年06月04日 16:31:56vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg370716
2020年06月04日 07:08:46eryksunsetnosy: + eryksun
messages: + msg370696
2020年06月04日 02:53:35db3lsetmessages: + msg370694
2020年06月04日 01:34:32db3lsetnosy: + db3l
messages: + msg370692
2020年06月03日 18:18:16vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg370683

stage: patch review -> resolved
2020年06月03日 18:17:13vstinnersetmessages: + msg370682
2020年06月03日 16:36:53vstinnersetpull_requests: + pull_request19844
2020年06月03日 16:28:26vstinnersetmessages: + msg370679
2020年06月03日 15:55:20vstinnersetpull_requests: + pull_request19842
2020年06月03日 15:49:29vstinnersetmessages: + msg370676
2020年06月03日 15:08:28vstinnersetpull_requests: + pull_request19840
2020年06月03日 12:40:03vstinnersetmessages: + msg370661
2020年06月02日 14:02:25vstinnersetpull_requests: + pull_request19827
2020年06月01日 18:59:43vstinnersetmessages: + msg370576
2020年06月01日 18:57:09vstinnersetmessages: + msg370575
2020年06月01日 18:34:23vstinnersetmessages: + msg370569
2020年06月01日 18:32:13vstinnersetpull_requests: + pull_request19818
2020年06月01日 18:13:34vstinnersetpull_requests: + pull_request19817
2020年06月01日 15:38:55vstinnersetmessages: + msg370556
versions: + Python 3.8
2020年06月01日 15:36:27vstinnersetmessages: + msg370555
2020年06月01日 15:23:52vstinnersetmessages: + msg370554
2020年06月01日 15:03:02vstinnersetmessages: + msg370553
2020年06月01日 14:02:47vstinnersetmessages: + msg370552
2020年06月01日 13:29:12vstinnersetpull_requests: + pull_request19811
2020年06月01日 13:00:20vstinnersetmessages: + msg370550
2020年06月01日 12:55:31vstinnersettitle: Segfaults when close file descriptor 0 -> PyOS_InterruptOccurred() now requires to hold the GIL: PyOS_Readline() crash
2020年06月01日 12:53:37vstinnersetmessages: + msg370549
2020年05月31日 04:58:39corona10setkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request19793
2020年05月30日 17:16:57corona10setmessages: + msg370392
2020年05月30日 16:11:19serhiy.storchakasettitle: Segfaults on io.FileIO(0).name in 3.9 -> Segfaults when close file descriptor 0
nosy: + serhiy.storchaka

messages: + msg370388

components: + Interpreter Core
2020年05月30日 15:01:45corona10setmessages: + msg370387
stage: needs patch
2020年05月30日 15:01:08corona10setnosy: + corona10
2020年05月30日 14:37:15remi.lapeyresetnosy: + vstinner, remi.lapeyre
2020年05月30日 14:32:01xtreaksettype: crash
2020年05月30日 14:19:46JelleZijlstracreate

AltStyle によって変換されたページ (->オリジナル) /