homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: test_sendall_interrupted() of test_socket hangs on FreeBSD
Type: Stage:
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: neologix, python-dev, vstinner
Priority: normal Keywords:

Created on 2011年04月03日 19:44 by vstinner, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Messages (7)
msg132871 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011年04月03日 19:44
I added a timeout of 30 minutes to regrtest. On "x86 FreeBSD 7.2 3.x" and "x86 FreeBSD 3.x" buildbot, test_sendall_interrupted() of test_socket does timeout after 30 minutes:
----------
...
[201/354] test_socket
Thread 0x28401040:
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_socket.py", line 729 in check_sendall_interrupted
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_socket.py", line 740 in test_sendall_interrupted
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 387 in _executeTestPart
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 442 in run
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 494 in __call__
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1078 in run
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1166 in _run_suite
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1192 in run_unittest
 File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_socket.py", line 2052 in test_main
 File "./Lib/test/regrtest.py", line 1032 in runtest_inner
 File "./Lib/test/regrtest.py", line 826 in runtest
 File "./Lib/test/regrtest.py", line 650 in main
 File "./Lib/test/regrtest.py", line 1607 in <module>
----------
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%207.2%203.x/builds/1644/steps/test/logs/stdio
another example:
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%203.x/builds/1344/steps/test/logs/stdio
It looks like the test should be done in less than 5 seconds. Extract of the code (with_timeout=False for test_sendall_interrupted):
-------------------------
def check_sendall_interrupted(self, with_timeout):
 # socketpair() is not stricly required, but it makes things easier.
 if not hasattr(signal, 'alarm') or not hasattr(socket, 'socketpair'):
 self.skipTest("signal.alarm and socket.socketpair required for this test")
 # Our signal handlers clobber the C errno by calling a math function
 # with an invalid domain value.
 def ok_handler(*args):
 self.assertRaises(ValueError, math.acosh, 0)
 def raising_handler(*args):
 self.assertRaises(ValueError, math.acosh, 0)
 1 // 0
 c, s = socket.socketpair()
 old_alarm = signal.signal(signal.SIGALRM, raising_handler)
 try:
 if with_timeout:
 # Just above the one second minimum for signal.alarm
 c.settimeout(1.5)
 with self.assertRaises(ZeroDivisionError):
 signal.alarm(1)
 c.sendall(b"x" * (1024**2)) <~~~~ HERE
 if with_timeout:
 signal.signal(signal.SIGALRM, ok_handler)
 signal.alarm(1)
 self.assertRaises(socket.timeout, c.sendall, b"x" * (1024**2))
 finally:
 signal.signal(signal.SIGALRM, old_alarm)
 c.close()
 s.close()
-------------------------
I suppose that sendall() should be interrupted by SIGALRM. sendall() calls PyErr_CheckSignals() after each system call to send(), and exits with an error if PyErr_CheckSignals() returns NULL. If send() fails with EINTR error, sendall() retries calls send() again (in a loop).
I don't know if this hang is new or not, and if it is (or not) a border effect of the timeout feature of regrtest...
But test_signal don't always fail, it looks like a race condition.
msg132883 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011年04月03日 20:58
This test assumes that send will necessarily return if interrupted by a signal, but the kernel can automatically restart the syscall when no data has been committed (instead of returning -1 with errno set to EINTR).
And, AFAIK, that's exactly what FreeBSD does, see http://www.freebsd.org/cgi/man.cgi?query=siginterrupt&apropos=0&sektion=0&manpath=FreeBSD+8.2-RELEASE&format=html :
"""
The siginterrupt() function is used to change the system call restart
 behavior when a system call is interrupted by the specified signal. If
 the flag is false (0), then system calls will be restarted if they are
 interrupted by the specified signal and no data has been transferred yet.
 System call restart has been the default behavior since 4.2BSD, and is
 the default behaviour for signal(3) on FreeBSD.
"""
And http://www.gsp.com/cgi-bin/man.cgi?section=2&topic=sigaction :
"""
If a signal is caught during the system calls listed below, the call may be forced to terminate with the error EINTR, the call may return with a data transfer shorter than requested, or the call may be restarted. Restart of pending calls is requested by setting the SA_RESTART bit in sa_flags. The affected system calls include open(2), read(2), write(2), sendto(2), recvfrom(2), sendmsg(2) and recvmsg(2) on a communications channel or a slow device (such as a terminal, but not a regular file) and during a wait(2) or ioctl(2). However, calls that have already committed are not restarted, but instead return a partial success (for example, a short read count).
"""
So if the signal arrives while some data has been transferred, send will return with a partial write, but if the signal arrives before any data has been written, then you'll never see EINTR and remain stuck forever (unless SA_RESTART is unset).
Note that POSIX seems to require write to return with EINTR if interrupted before any data is written, see http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html :
"""
If write() is interrupted by a signal before it writes any data, it shall return -1 with errno set to [EINTR].
"""
But send and sendto man pages don't require this behaviour.
msg132889 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011年04月03日 21:27
As issue #11755, this issue is a regression introduced by regrtest timeout (issue #11727): test_socket doesn't fail without regrtest timeout.
msg132897 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011年04月03日 22:21
Issue #11755 (test_itimer_real() of test_signal hang on FreeBSD) has been marked as a duplicate of this issue.
msg132904 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011年04月03日 23:21
The problem is that faulthandler's thread (faulthandler_thread) receives the SIGALRM signal: the signal interrupts sem_timedwait() which returns EINTR. PyThread_acquire_lock_timed() retries sem_timedwait() and so the other thread executing sendall() is not interrupted.
The solution is to configure which signals are handled by faulthandler_thread() using pthread_sigmask():
http://hg.python.org/sandbox/haypo/rev/4257fdfa5661
I forced a build on x86 FreeBSD (custom) buildbot:
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%20custom
I already tested sigprocmask() on FreeBSD (in my VM) and it works as expected: test_socket doesn't block anymore.
msg132921 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011年04月04日 09:05
New changeset ebc03d7e7110 by Victor Stinner in branch 'default':
Issue #11753: faulthandler thread uses pthread_sigmask()
http://hg.python.org/cpython/rev/ebc03d7e7110 
msg132922 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011年04月04日 09:08
test_socket and test_signal succeed on "x86 FreeBSD custom":
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%20custom/builds/4
(there are other issues, but there are not related)
I pushed the fix in Python 3.3 (ebc03d7e7110).
History
Date User Action Args
2022年04月11日 14:57:15adminsetgithub: 55962
2011年04月04日 09:08:05vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg132922
2011年04月04日 09:05:53python-devsetnosy: + python-dev
messages: + msg132921
2011年04月03日 23:21:59vstinnersetmessages: + msg132904
2011年04月03日 22:21:01vstinnersetmessages: + msg132897
2011年04月03日 21:27:24vstinnersetmessages: + msg132889
2011年04月03日 20:58:50neologixsetnosy: + neologix
messages: + msg132883
2011年04月03日 19:44:52vstinnercreate

AltStyle によって変換されたページ (->オリジナル) /