
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Intermittent segfaults on PPC64 AIX 3.x
Type: crash Stage:
Components: Extension Modules Versions: Python 3.6
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: David.Edelsohn, skrah, vstinner
Priority: normal Keywords:

Created on 2015-09-30 09:26 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (11)
msg251919 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-30 09:26
This buildbot has low free memory. Maybe some part of _decimal doesn't handle an allocation failure?
http://buildbot.python.org/all/builders/PPC64%20AIX%203.x/builds/4173/steps/test/logs/stdio
...
[307/399/10] test_decimal
Fatal Python error: Segmentation fault
Current thread 0x00000001 (most recent call first):
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_decimal.py", line 444 in eval_equation
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_decimal.py", line 321 in eval_line
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_decimal.py", line 299 in eval_file
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_decimal.py", line 5591 in <lambda>
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/case.py", line 600 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/case.py", line 648 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/runner.py", line 176 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/support/__init__.py", line 1775 in _run_suite
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/support/__init__.py", line 1809 in run_unittest
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_decimal.py", line 5598 in test_main
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/runtest.py", line 160 in runtest_inner
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/runtest.py", line 113 in runtest
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 292 in run_tests_sequential
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 334 in run_tests
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 365 in main
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 407 in main
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 429 in main_in_temp_cwd
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/__main__.py", line 3 in <module>
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/runpy.py", line 85 in _run_code
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/runpy.py", line 170 in _run_module_as_main
make: *** [buildbottest] Segmentation fault (core dumped)
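For context, the "Fatal Python error: Segmentation fault" header and the Python-level traceback above follow the output format of the standard faulthandler module. A minimal sketch of enabling it by hand, assuming a standalone script rather than the buildbot's regrtest run; the log file name is hypothetical:

```python
import faulthandler
import os
import tempfile

# Hypothetical log location (not from the report). faulthandler needs a
# real file descriptor that stays open for the lifetime of the process.
log_path = os.path.join(tempfile.gettempdir(), "crash_traceback.log")
crash_log = open(log_path, "w")

# Install handlers for fatal signals such as SIGSEGV so that a crash
# writes the Python traceback of every thread to the log file.
faulthandler.enable(file=crash_log, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
```

The traceback only shows the Python frames that were active at the time of the crash; locating the faulting C code still needs a core dump or a debugger.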
msg252039 - Author: Stefan Krah (skrah) * (Python committer) Date: 2015-10-01 16:47
Usually these segfaults are toolchain bugs (I've had at least 8,
including gcc, suncc, libc...).
Just a couple of observations:
 - The bot builds with -DCONFIG_32=1 -DANSI=1 despite being PPC64.
 - When we had an AIX snakebite machine, the xlc compile worked on
 AIX (using about 50 obscure command line arguments).
 - In the default build, libmpdec functions use a lot of stack
 memory (for optimization while avoiding alloca). But there are
 no recursive tests, so a stack overflow would seem unlikely.
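The first observation (a 32-bit configuration on 64-bit hardware) can be confirmed from inside the interpreter. This is a small sketch, not part of the report; the helper name is made up for illustration:

```python
import struct
import sys


def pointer_bits() -> int:
    """Return the pointer width of the running interpreter in bits."""
    return struct.calcsize("P") * 8


bits = pointer_bits()
print(f"interpreter pointer width: {bits}-bit")
# sys.maxsize is 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit build.
print("32-bit build according to sys.maxsize:", sys.maxsize <= 2**31 - 1)
```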
msg252074 - Author: David Edelsohn (David.Edelsohn) * Date: 2015-10-02 00:00
The system has 128 GB of memory. The process limits are set to unlimited for data. AIX defaults to 32-bit, although all processors are 64-bit, so the buildbot runs as 32-bit. What does "low free memory" on the buildbot mean?
I'm surprised that Python requires a huge amount of memory for the tests. It's possible that Python needs to be built with special options to allow additional malloc space (-bmaxdata:0xN0000000).
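To make the 0xN0000000 notation concrete: in the 32-bit AIX process model the address space is divided into 256 MB segments, and the leading hex digit N selects how many of them are reserved for the heap. The helper below is only arithmetic on that value; its name is hypothetical, and it does not check the platform-specific upper limit on N:

```python
SEGMENT_BYTES = 0x10000000  # one 256 MiB segment


def maxdata_flag(n_segments: int) -> tuple[str, int]:
    """Return the linker option and heap size in MiB for N segments.

    Only checks that N is a single non-zero hex digit; the usable
    maximum depends on the AIX release and is not verified here.
    """
    if not 1 <= n_segments <= 0xF:
        raise ValueError("N must be a single non-zero hex digit")
    value = n_segments * SEGMENT_BYTES
    return f"-bmaxdata:0x{value:08X}", value // 2**20


flag, heap_mib = maxdata_flag(8)
print(flag, "->", heap_mib, "MiB of heap")
```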
msg252111 - Author: Stefan Krah (skrah) * (Python committer) Date: 2015-10-02 12:28
I've checked: test_decimal does not require abnormal amounts of
memory or stack. On Linux/x86 a stack size of 256 KB (default 8192 KB)
is sufficient, and memory requirements aren't that high.
We assumed that there is some memory limit on the buildbot, since
in a later run test #pwmx330 failed with MemoryError.
The easiest way to debug this is to rerun the whole test suite
under gdb with the same random seed as in
http://buildbot.python.org/all/builders/PPC64%20AIX%203.x/builds/4173/steps/test/logs/stdio 
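Whether a stack or data limit applies to the test process can be checked from Python with the resource module. This is a sketch for Unix-like systems, not something taken from the buildbot configuration; the formatting helper is made up for illustration:

```python
import resource


def fmt(limit: int) -> str:
    """Format an rlimit value, which the kernel reports in bytes."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"


# RLIMIT_STACK corresponds to "ulimit -s", RLIMIT_DATA to "ulimit -d".
for name in ("RLIMIT_STACK", "RLIMIT_DATA"):
    soft, hard = resource.getrlimit(getattr(resource, name))
    print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```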
msg252113 - Author: Stefan Krah (skrah) * (Python committer) Date: 2015-10-02 12:37
And the segfaults are apparently somewhat random. This is beginning
to look like an issue unrelated to decimal that was perhaps recently
introduced (in which case "hg bisect" would be the fastest
way to debug).
http://buildbot.python.org/all/builders/PPC64%20AIX%203.x/builds/4183/steps/test/logs/stdio
[129/399/3] test_email
Fatal Python error: Segmentation fault
Current thread 0x00000001 (most recent call first):
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/email/utils.py", line 57 in _has_surrogates
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/email/message.py", line 264 in get_payload
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_email/test_email.py", line 3463 in test_long_lines
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/case.py", line 600 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/case.py", line 648 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 122 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/suite.py", line 84 in __call__
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/unittest/runner.py", line 176 in run
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/support/__init__.py", line 1775 in _run_suite
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/support/__init__.py", line 1809 in run_unittest
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/runtest.py", line 159 in test_runner
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/runtest.py", line 160 in runtest_inner
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/runtest.py", line 113 in runtest
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 289 in run_tests_sequential
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 331 in run_tests
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 362 in main
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 404 in main
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/libregrtest/main.py", line 426 in main_in_temp_cwd
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/__main__.py", line 3 in <module>
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/runpy.py", line 85 in _run_code
 File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/runpy.py", line 170 in _run_module_as_main
msg252114 - Author: David Edelsohn (David.Edelsohn) * Date: 2015-10-02 13:42
As we have seen with similar issues on other targets, this likely is due to the random order of tests. In another case, the timezone was not being restored properly by GLIBC. Another test is leaving the process in a state that somehow evokes this failure from test_decimal.
msg252115 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-02 13:44
I suggest isolating the tests using -j1: see my issue #25285.
(Currently, -j1 doesn't use subprocesses.)
msg252116 - Author: Stefan Krah (skrah) * (Python committer) Date: 2015-10-02 13:52
If you have time, you could use an explicit seed (and gdb):
# test_email segfault:
./python -m test -j 1 -u all -W --randseed 5634141
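The reason an explicit seed helps is that the randomized test order is a deterministic function of the seed, so the same seed replays the same sequence and any state leaked by an earlier test is reproduced too. A sketch of the idea with a made-up test list; this is not regrtest's own code:

```python
import random


def shuffled_order(tests, seed):
    """Return the tests in the pseudo-random order implied by the seed."""
    rng = random.Random(seed)
    order = list(tests)
    rng.shuffle(order)
    return order


# Illustrative subset; the real run shuffles all 399 test modules.
tests = ["test_decimal", "test_email", "test_time", "test_os", "test_sys"]
first = shuffled_order(tests, 5634141)
second = shuffled_order(tests, 5634141)
print("same order on both runs:", first == second)
```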
msg252210 - Author: Stefan Krah (skrah) * (Python committer) Date: 2015-10-03 13:56
> It's possible that Python needs to be built with special options to allow additional malloc space (-bmaxdata:0xN0000000).
It seems to be the case, see Misc/README.AIX. This could explain the
MemoryErrors, but not the segfaults.
Are computed-gotos stable on gcc-AIX? The README recommends disabling
them for xlc.
I'm also not sure how well Python supports threads on AIX. Often
these problems go away on unsupported platforms when configuring
--without-threads.
msg252211 - Author: David Edelsohn (David.Edelsohn) * Date: 2015-10-03 14:06
Misc/README.AIX comments about XLC do not apply to GCC.
One can adjust the memory space at normal link time with -Wl,-bmaxdata:0xN0000000. This trades off heap for shared memory segments. One does not need the extra ldedit step, which stuffs the same value into the application header.
msg262464 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-03-25 23:31
The origin of the crash is unknown. Since I haven't seen the crash recently, I'm closing the issue.
History
Date                 User              Action  Args
2022-04-11 14:58:22  admin             set     github: 69463
2016-03-25 23:31:57  vstinner          set     status: open -> closed; resolution: out of date; messages: + msg262464
2015-10-03 14:06:53  David.Edelsohn    set     messages: + msg252211
2015-10-03 13:56:30  skrah             set     messages: + msg252210
2015-10-02 13:52:49  skrah             set     messages: + msg252116
2015-10-02 13:44:35  vstinner          set     messages: + msg252115
2015-10-02 13:42:46  David.Edelsohn    set     messages: + msg252114
2015-10-02 12:37:03  skrah             set     messages: + msg252113; title: test_decimal sometimes crash on PPC64 AIX 3.x -> Intermittent segfaults on PPC64 AIX 3.x
2015-10-02 12:28:16  skrah             set     messages: + msg252111
2015-10-02 00:00:10  David.Edelsohn    set     messages: + msg252074
2015-10-01 16:47:41  skrah             set     nosy: + David.Edelsohn; messages: + msg252039
2015-10-01 04:31:46  serhiy.storchaka  set     type: crash; components: + Extension Modules
2015-09-30 22:43:49  vstinner          set     nosy: + skrah
2015-09-30 09:26:05  vstinner          create
