This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2020年03月27日 17:16 by vstinner, last changed 2022年04月11日 14:59 by admin.
| Messages (13) | |||
|---|---|---|---|
| msg365173 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2020年03月27日 17:16 | |
At fork, Python calls PyOS_AfterFork_Child() in the child process which indirectly calls _PyThreadState_DeleteExcept() whichs calls release_sentinel() of the thread which releases the thread state lock (threading.Thread._tstate_lock). Problem: using a lock after fork is unsafe and can crash. That's exactly what happens randomly on AIX when stressing ThreadJoinOnShutdown.test_reinit_tls_after_fork() of test_threading: https://bugs.python.org/issue40068#msg365031 There are different options to solve this issue: * Reset _tstate_lock before using it... not sure that it's worth it, since we are going to delete the threading.Thread object with its _tstate_lock object anymore. After calling fork, the child process has exactly 1 thread: all other threads have been removed. * Modify release_sentinel() to not use the lock: avoid PyThread_release_lock() call. |
|||
| msg365175 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2020年03月27日 17:18 | |
See also bpo-40089: "Add _at_fork_reinit() method to locks". |
|||
| msg367277 - (view) | Author: Hai Shi (shihai1991) * (Python triager) | Date: 2020年04月25日 17:54 | |
>since we are going to delete the threading.Thread object with its _tstate_lock object anymore. Do we have any pep or discuss record about this plan? > * Modify release_sentinel() to not use the lock: avoid PyThread_release_lock() call. Hm, I am not sure about it. Looks like we can not remove it directly. |
|||
| msg367422 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2020年04月27日 13:42 | |
> Do we have any pep or discuss record about this plan? "since we are going to delete the threading.Thread object with its _tstate_lock object anyway" sentence is a description of the current _PyThreadState_DeleteExcept() implementation. |
|||
| msg367496 - (view) | Author: Hai Shi (shihai1991) * (Python triager) | Date: 2020年04月28日 05:12 | |
> "since we are going to delete the threading.Thread object with its _tstate_lock object anyway" sentence is a description of the current _PyThreadState_DeleteExcept() implementation. Oh, got it. thanks for your explanation :) |
|||
| msg389937 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2021年03月31日 22:21 | |
I marked bpo-43665 "AIX: test_importlib regression (ENV change)" as a duplicate of this issue. test_multiprocessing_pool_circular_import() of test_importlib now triggers this crash as well. |
|||
| msg389966 - (view) | Author: Michael Felt (Michael.Felt) * | Date: 2021年04月01日 09:52 | |
Adding 3.10. While a (sort of) duplicate I also would like to add that before revision "7cb033c423b65def1632d6c3c747111543b342a2" this was not showing up as an issue with test_importlib. my issue was with test_importlib suddenly going into error. As this seem to be a long-standing issue is it perhaps a possibility to change the test so that the bot can go green again? |
|||
| msg389968 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2021年04月01日 09:59 | |
> As this seem to be a long-standing issue is it perhaps a possibility to change the test so that the bot can go green again? This issue is a real crash and it seems quite easy to get it. I prefer to not hide the bug in the test suite. The bug must be fixed. |
|||
| msg389969 - (view) | Author: Michael Felt (Michael.Felt) * | Date: 2021年04月01日 09:59 | |
OK: further. Two options are suggested: There are different options to solve this issue: * Reset _tstate_lock before using it... not sure that it's worth it, since we are going to delete the threading.Thread object with its _tstate_lock object anymore. After calling fork, the child process has exactly 1 thread: all other threads have been removed. * Modify release_sentinel() to not use the lock: avoid PyThread_release_lock() call. ** as to option 1 - it is 'worth it' if it stops the crashes ** This is deeper than I usually go in Python code - but I'll make an effort - help is appreciated. |
|||
| msg389972 - (view) | Author: Michael Felt (Michael.Felt) * | Date: 2021年04月01日 11:20 | |
OK. Please explain. Looking at tstate assignment
In posixmodule.c:PyOSAfterFork_Child()
PyStatus status;
_PyRuntimeState *runtime = &_PyRuntime;
...
PyThreadState *tstate = _PyThreadState_GET();
and later calls
status = _PyRuntimeState_ReInitThreads(runtime);
Yet in Posix/ceval.c
PyStatus
_PyEval_ReInitThreads(PyThreadState *tstate)
{
_PyRuntimeState *runtime = tstate->interp->runtime;
** this looks like runtime->interp->runtime
And then we get down to:
/* Destroy all threads except the current one */
_PyThreadState_DeleteExcept(runtime, tstate);
Is this correct - as it looks like:
_PyThreadState_DeleteExcept(runtime->interp->runtime, runtime) -- where runtime == &_PyRuntime;
|
|||
| msg390407 - (view) | Author: Michael Felt (Michael.Felt) * | Date: 2021年04月07日 09:08 | |
Willing to spend more time on this - but the variable names chosen blind me - looks like a circle. And, thinking about the address in the core dump starting with 0xd (segment 13) - confuses me somewhat - as from memory - I thought the shared library code/data converged on segment 14 (ie, addresses would begin with 0xe). So, perhaps it is the 0xd* address that is killing everything. See: https://bugs.python.org/issue43665#msg389923 -- as I am also still confused re: this line - looks suspect: with all arguments at 0xdddddddd PyVectorcall_Call(callable = 0xdddddddd, tuple = 0xdddddddd, kwargs = 0xdddddddd), line 255 in "call.c" And also this line, and lines like it - notice the value for what looks like a variable for argument_count: _PyEval_Vector(tstate = 0x22023cf0, con = 0x2000120c, locals = 0x22023960, args = 0x220239f0, argcount = 3508213100, kwnames = 0x22023cf0), line 46 in "pycore_ceval.h" _PyFunction_Vectorcall(func = 0x220239f0, stack = (nil), nargsf = 570572016, kwnames = 0x220239f0), line 361 in "call.c" method_vectorcall(method = 0x10103bdc, args = (nil), nargsf = 0, kwnames = 0x20045994), line 119 in "abstract.h" |
|||
| msg394225 - (view) | Author: David Edelsohn (David.Edelsohn) * | Date: 2021年05月23日 20:31 | |
How could release_sentinel() be structured to not call PyThread_release_lock()? This seems to be a situation where _PyThreadState_DeleteExcept() is deleting all thread states. thread__set_sentinel() sets release_sentinel() as its on_delete hook. The thread state is being deleted, release_sentinel() is called and discovers that the lock is still held.
if (lock->locked) {
PyThread_release_lock(lock->lock_lock);
Do you suggest something like _PyThread_at_fork_reinit() and leak the memory? To not release the lock in the thread of the child process?
|
|||
| msg394266 - (view) | Author: David Edelsohn (David.Edelsohn) * | Date: 2021年05月24日 20:16 | |
It seems that PyOS_AfterFork_Child() needs to do something like
PyThreadState *tstate = PyThreadState_Get();
PyObject *wr = _PyObject_CAST(tstate->on_delete_data);
PyObject *obj = PyWeakref_GET_OBJECT(wr);
lockobject *lock;
if (obj != Py_None) {
lock = (lockobject *) obj;
if (lock->locked) {
/* Leak memory on purpose. */
lock->locked = 0;
}
}
before the call to _PyEval_ReInitThreads.
Maybe encapsulate that as PyThread_ReInitThreads().
The locks in the threads in the child need to be cleared before _PyThreadState_DeleteExcept() so that Python does not try to release the locks in the child.
|
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:59:28 | admin | set | github: 84273 |
| 2021年05月24日 20:58:45 | erlendaasland | set | nosy:
+ erlendaasland |
| 2021年05月24日 20:16:41 | David.Edelsohn | set | messages: + msg394266 |
| 2021年05月23日 20:31:08 | David.Edelsohn | set | messages: + msg394225 |
| 2021年05月20日 19:27:31 | David.Edelsohn | set | nosy:
+ David.Edelsohn |
| 2021年04月07日 09:08:26 | Michael.Felt | set | messages: + msg390407 |
| 2021年04月01日 11:20:55 | Michael.Felt | set | messages: + msg389972 |
| 2021年04月01日 09:59:19 | Michael.Felt | set | messages: + msg389969 |
| 2021年04月01日 09:59:09 | vstinner | set | messages: + msg389968 |
| 2021年04月01日 09:52:42 | Michael.Felt | set | nosy:
+ Michael.Felt messages: + msg389966 versions: + Python 3.10 |
| 2021年03月31日 22:21:37 | vstinner | set | messages: + msg389937 |
| 2021年03月31日 22:20:40 | vstinner | link | issue43665 superseder |
| 2020年04月28日 05:12:33 | shihai1991 | set | messages: + msg367496 |
| 2020年04月27日 13:42:26 | vstinner | set | messages: + msg367422 |
| 2020年04月25日 17:54:07 | shihai1991 | set | messages: + msg367277 |
| 2020年04月17日 17:13:22 | shihai1991 | set | nosy:
+ shihai1991 |
| 2020年03月27日 17:18:10 | vstinner | set | messages: + msg365175 |
| 2020年03月27日 17:16:55 | vstinner | create | |