Created on 2011-06-19 17:25 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue.patch | rosslagerwall, 2012-01-03 06:12 | patch |
| itest.py | rosslagerwall, 2012-01-03 06:12 | test program |
| Messages (9) | |||
|---|---|---|---|
| msg138648 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-06-19 17:25 | |
[271/356/1] test_concurrent_futures
Traceback (most recent call last):
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/queues.py", line 268, in _feed
    send(obj)
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/connection.py", line 229, in send
    self._send_bytes(memoryview(buf))
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/connection.py", line 423, in _send_bytes
    self._send(struct.pack("=i", len(buf)))
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/connection.py", line 392, in _send
    n = write(self._handle, buf)
OSError: [Errno 32] Broken pipe
Timeout (1:00:00)!
Thread 0x00000954:
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 237 in wait
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/queues.py", line 252 in _feed
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 690 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 710 in _bootstrap
Thread 0x00000953:
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/forking.py", line 146 in poll
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/forking.py", line 166 in wait
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/multiprocessing/process.py", line 150 in join
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/concurrent/futures/process.py", line 208 in shutdown_worker
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/concurrent/futures/process.py", line 264 in _queue_management_worker
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 690 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 710 in _bootstrap
Thread 0x00000001:
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 237 in wait
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/threading.py", line 851 in join
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/concurrent/futures/process.py", line 395 in shutdown
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/test_concurrent_futures.py", line 67 in tearDown
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 407 in _executeTestPart
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 463 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 514 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 105 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 67 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 105 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 67 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1166 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1254 in _run_suite
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1280 in run_unittest
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/test_concurrent_futures.py", line 628 in test_main
  File "./Lib/test/regrtest.py", line 1043 in runtest_inner
  File "./Lib/test/regrtest.py", line 841 in runtest
  File "./Lib/test/regrtest.py", line 668 in main
  File "./Lib/test/regrtest.py", line 1618 in <module>
*** Error code 1
make: Fatal error: Command failed for target `buildbottest'
program finished with exit code 1

See commit e6e7e42efdc2 of the issue #12310. |
|||
| msg138765 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-06-20 23:27 | |
Message on a Stack Overflow thread:
"I have suffered from the same problem, even if connecting on localhost in python 2.7.1. After a day of debugging i found the cause and a workaround:
Cause: BaseProxy class has thread local storage which caches the connection, which is reused for future connections causing "broken pipe" errors even on creating a new Manager
Workaround: Delete the cached connection before reconnecting
if address in BaseProxy._address_to_local:
    del BaseProxy._address_to_local[self.address][0].connection"
http://stackoverflow.com/questions/3649458/broken-pipe-when-using-python-multiprocessing-managers-basemanager-syncmanager/5884967#5884967
---
See also maybe the (closed) issue #11663: multiprocessing doesn't detect killed processes |
|||
| msg138766 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-06-20 23:37 | |
Connection._send_bytes() has a comment about broken pipes:
def _send_bytes(self, buf):
    # For wire compatibility with 3.2 and lower
    n = len(buf)
    self._send(struct.pack("=i", len(buf)))
    # The condition is necessary to avoid "broken pipe" errors
    # when sending a 0-length buffer if the other end closed the pipe.
    if n > 0:
        self._send(buf)
But the OSError(32, "Broken pipe") occurs on sending the buffer size (a chunk of 4 bytes: self._send(struct.pack("=i", len(buf)))), not on sending the buffer content.
See also maybe the (closed) issue #9205: Parent process hanging in multiprocessing if children terminate unexpectedly
|
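The observation above (the error already occurs on the 4-byte size header) can be illustrated outside multiprocessing. This is only a minimal POSIX sketch under stated assumptions: it uses a plain `os.pipe()` instead of a `Connection` object, and the helper name `write_header_to_closed_pipe` is hypothetical, not part of any library. It relies on CPython ignoring SIGPIPE by default, so the failed write surfaces as an `OSError`.

```python
import errno
import os
import struct


def write_header_to_closed_pipe(payload=b""):
    """Write a 4-byte length header to a pipe whose read end is closed.

    Returns the errno of the resulting OSError, or None if the write
    unexpectedly succeeded. Hypothetical helper, for illustration only.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the other end of the pipe going away
    try:
        # Same header format as in _send_bytes(): native-order 32-bit length.
        os.write(write_fd, struct.pack("=i", len(payload)))
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    err = write_header_to_closed_pipe()
    print(err, errno.errorcode.get(err))
```

Because the header is written before any payload, guarding only the payload write with `if n > 0` cannot prevent this error once the reader is gone.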
|||
| msg138767 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-06-21 00:42 | |
Ah, submitting a new task after the manager has been shut down fails with OSError(32, 'Broken pipe'). Example:
---------------
from multiprocessing.managers import BaseManager

class MathsClass(object):
    def foo(self):
        return 42

class MyManager(BaseManager):
    pass

MyManager.register('Maths', MathsClass)

if __name__ == '__main__':
    manager = MyManager()
    manager.start()
    maths = manager.Maths()
    maths.foo()
    manager.shutdown()
    try:
        maths.foo()
    finally:
        manager.shutdown()
---------------
This example doesn't hang, but this issue is about concurrent.futures, not multiprocessing.
|
|||
| msg138768 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-06-21 01:05 | |
Oh, I think that I found a deadlock (or something like that):
----------------------------
import concurrent.futures
import faulthandler
import os
import signal
import time

def work(n):
    time.sleep(0.1)

def main():
    faulthandler.register(signal.SIGUSR1)
    print("pid: %s" % os.getpid())
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in executor.map(work, range(100)):
            print("shutdown")
            executor.shutdown()
            print("shutdown--")

if __name__ == '__main__':
    main()
----------------------------
Trace:
----------------------------
Thread 0x00007fbfc83bd700:
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 237 in wait
  File "/home/haypo/prog/HG/cpython/Lib/multiprocessing/queues.py", line 252 in _feed
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 690 in run
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 737 in _bootstrap_inner
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 710 in _bootstrap
Thread 0x00007fbfc8bbe700:
  File "/home/haypo/prog/HG/cpython/Lib/multiprocessing/queues.py", line 101 in put
  File "/home/haypo/prog/HG/cpython/Lib/concurrent/futures/process.py", line 268 in _queue_management_worker
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 690 in run
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 737 in _bootstrap_inner
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 710 in _bootstrap
Current thread 0x00007fbfcc2e3700:
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 237 in wait
  File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 851 in join
  File "/home/haypo/prog/HG/cpython/Lib/concurrent/futures/process.py", line 395 in shutdown
  File "/home/haypo/prog/HG/cpython/Lib/concurrent/futures/_base.py", line 570 in __exit__
  File "y.py", line 17 in main
  File "y.py", line 20 in <module>
----------------------------
There are two child processes, but both are zombies (displayed as "<defunct>" by ps). Send a SIGUSR1 signal to the frozen process to display the traceback (thanks to faulthandler).
|
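The faulthandler technique used above can be tried in isolation. The following is a minimal sketch, not the code from the report: it registers a SIGUSR1 handler, raises the signal from inside the same process with `signal.raise_signal()` (Python 3.8+) instead of sending it from another terminal with `kill -USR1 <pid>`, and captures the dump in a temporary file. The function name `dump_traceback_on_sigusr1` is hypothetical, and `faulthandler.register()` is not available on Windows.

```python
import faulthandler
import signal
import tempfile


def dump_traceback_on_sigusr1():
    """Register a SIGUSR1 handler, trigger it, and return the dumped text.

    Hypothetical helper, for illustration only.
    """
    with tempfile.TemporaryFile() as log:
        # faulthandler writes directly to the file descriptor, so the file
        # must stay open for as long as the handler is registered.
        faulthandler.register(signal.SIGUSR1, file=log, all_threads=True)
        try:
            # Stand-in for an external "kill -USR1 <pid>".
            signal.raise_signal(signal.SIGUSR1)
        finally:
            faulthandler.unregister(signal.SIGUSR1)
        log.seek(0)
        return log.read().decode("utf-8", "replace")


if __name__ == "__main__":
    print(dump_traceback_on_sigusr1())
```

In a hung process the handler would normally write to stderr (the default), which is how the trace above was obtained without attaching a debugger.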
|||
| msg150497 | Author: Ross Lagerwall (rosslagerwall) (Python committer) | Date: 2012-01-03 06:12 | |
Retrieving the result of a future after the executor has been shut down can cause a hang. It seems like this regression was introduced in a76257a99636. This regression exists only for ProcessPoolExecutor. The problem is that even if there are pending work items, the processes are still signaled to shut down, leaving the pending work items permanently unfinished. The attached patch simply removes the call to shut down the processes when there are pending work items. |
|||
| msg150498 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012-01-03 14:33 | |
Well, I was sure I had added this code for a reason, but the tests seem to run without it... Just a comment: the test isn't ProcessPoolExecutor-specific, so it should really be in the generic tests. |
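The scenario discussed in the two messages above (results retrieved after the executor has been shut down, checked in an executor-agnostic way) could be sketched roughly as follows. This is an illustrative sketch, not the attached itest.py or the committed test: the function name `results_after_shutdown`, the task count, and the 60-second timeout are assumptions. The timeout is there only so that a regression shows up as an exception rather than a hang.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def results_after_shutdown(executor_cls, tasks=20, delay=0.01):
    """Submit work, shut the executor down, then collect every result.

    Hypothetical helper: executor_cls is any Executor subclass, so the
    same check applies to thread and process pools alike.
    """
    executor = executor_cls(max_workers=2)
    # More tasks than workers, so some items are still pending at shutdown.
    futures = [executor.submit(time.sleep, delay) for _ in range(tasks)]
    # shutdown() waits by default; pending items are expected to complete.
    executor.shutdown()
    # Guard timeout: fail loudly instead of blocking forever.
    return [future.result(timeout=60) for future in futures]


if __name__ == "__main__":
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        print(cls.__name__, len(results_after_shutdown(cls)))
```

Passing the executor class as a parameter keeps the check generic, in the spirit of the comment above, while still exercising ProcessPoolExecutor, the only executor the regression affected.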
|||
| msg150851 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012-01-08 06:43 | |
New changeset 26389e9efa9c by Ross Lagerwall in branch '3.2':
Issue #12364: Fix a hang in concurrent.futures.ProcessPoolExecutor.
http://hg.python.org/cpython/rev/26389e9efa9c

New changeset 25f879011102 by Ross Lagerwall in branch 'default':
Merge with 3.2 for #12364.
http://hg.python.org/cpython/rev/25f879011102 |
|||
| msg150853 | Author: Ross Lagerwall (rosslagerwall) (Python committer) | Date: 2012-01-08 09:35 | |
Thanks! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:18 | admin | set | github: 56573 |
| 2012-01-08 09:35:49 | rosslagerwall | set | status: open -> closed type: behavior messages: + msg150853 assignee: rosslagerwall resolution: fixed stage: resolved |
| 2012-01-08 06:43:54 | python-dev | set | nosy: + python-dev messages: + msg150851 |
| 2012-01-03 14:33:51 | pitrou | set | messages: + msg150498 |
| 2012-01-03 06:12:54 | rosslagerwall | set | files: + itest.py |
| 2012-01-03 06:12:19 | rosslagerwall | set | files: + issue.patch nosy: + rosslagerwall messages: + msg150497 keywords: + patch |
| 2011-07-05 12:22:26 | vstinner | set | title: Timeout (1 hour) in test_concurrent_futures.tearDown() on sparc solaris10 gcc 3.x -> Deadlock in test_concurrent_futures |
| 2011-06-21 01:05:34 | vstinner | set | messages: + msg138768 |
| 2011-06-21 00:42:34 | vstinner | set | messages: + msg138767 |
| 2011-06-20 23:37:03 | vstinner | set | messages: + msg138766 |
| 2011-06-20 23:27:39 | vstinner | set | messages: + msg138765 |
| 2011-06-19 17:25:15 | vstinner | create | |