On 2021-12-19 20:06, Tigran Aivazian wrote:
> So far I have narrowed it down to a block of code in solve.py doing a lot of multi-threaded FFT (i.e. with fft(..., threads=6) of pyFFTW), as well as numpy exp() and other functions and heavy pure-Python list manipulation (yes, lists, not numpy arrays). All of this together (or some one part of it, yet to be discovered) is behaving as if there was some global lock taken behind the scene (i.e. inside Python interpreter), so that when multiple instances of the script (which I loosely called "threads" in previous posts, but here correct myself as the word "threads" is used more appropriately in the context of FFT in this message) are executed in parallel, they slow each other down in 3.10, but not so in 3.8.
>
> So this is definitely a very interesting 3.10 degradation problem. I will try to investigate some more tomorrow...

"is behaving as if there was some global lock taken behind the scene (i.e. inside Python interpreter)"? The Python interpreter does have the GIL (Global Interpreter Lock). It can't execute Python bytecode in parallel; instead it timeshares between the threads. The GIL is released during I/O and by some extensions while they're processing, but when they want to return, or to use the Python API, they need to acquire the GIL again. The only way to get true parallelism for Python code itself in CPython is to use multiprocessing, where the work runs in multiple processes.
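To make the GIL point concrete, here is a minimal sketch. It is not taken from solve.py: count_primes(), timed_run() and the workload size are made-up names and values, used only for illustration. It runs the same CPU-bound, pure-Python function once in a thread pool and once in a process pool. On a standard (GIL-enabled) CPython build with several cores, I would expect the thread pool to take roughly as long as running the jobs one after another, and the process pool to be noticeably faster, though the exact timings depend on the machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound, pure-Python work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        # all() over an empty range is True, so 2 and 3 are counted correctly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_run(executor_cls, limits):
    """Run count_primes over `limits` in the given executor.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=len(limits)) as pool:
        results = list(pool.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical workload: four identical jobs, one per worker.
    limits = [200_000] * 4
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = timed_run(cls, limits)
        print(f"{cls.__name__}: {seconds:.2f}s, results={results}")
```

The threads all contend for the one GIL of their interpreter, whereas each worker process has its own interpreter and its own GIL.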