Created on 2011-04-30 06:43 by skrah, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| freebsd-amd64-log.txt | skrah, 2011-05-02 18:23 | |
| Messages (10) | |||
|---|---|---|---|
| msg134839 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-04-30 06:43 | |
The FreeBSD-AMD64 bot exhibits sporadic hanging in unspecific places. FreeBSD is running under kvm in the background. When the hanging occurs, the virtual machine uses 100% CPU and I can't log in via ssh, so I have to kill the kvm process. The fact that the ssh login fails if a user process is misbehaving seems like a FreeBSD/kvm issue to me. However, this problem did not occur when I set up the bot a couple of weeks ago. I've started a series of older revision builds to see if anything recent causes this. |
|||
| msg134890 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-04-30 23:15 | |
> The FreeBSD-AMD64 bot exhibits sporadic hanging in unspecific places.
You can try a shorter regrtest timeout: edit Lib/test/regrtest.py near:
if hasattr(faulthandler, 'dump_tracebacks_later'): timeout = 60*60
(or use the --timeout option of regrtest.py).
If you have access to a terminal (using ssh), you can also register a signal handler that dumps the traceback: edit regrtest.py to add "import signal; faulthandler.register(signal.SIGUSR1, all_threads=True)" after "faulthandler.enable()". Then use "kill -USR1 pid" to dump the traceback.
Or the problem is an infinite loop while dumping the traceback because of a timeout :-D In that case, disable the timeout with the --timeout=0 option of regrtest.py. |
|||
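
The suggestion above can be sketched as a small helper. This is only an illustration, not the code in regrtest.py: the function name `install_hang_diagnostics` and its parameters are hypothetical. It uses the `faulthandler` API as released in Python 3.3 and later, where the watchdog call is spelled `dump_traceback_later` (the message above quotes the development-time name).

```python
import faulthandler
import signal
import sys


def install_hang_diagnostics(timeout=60 * 60, file=sys.stderr):
    """Arm faulthandler so that a hung process can report where it is stuck.

    `timeout` is in seconds; 0 disables the watchdog, mirroring --timeout=0.
    `file` must have a real file descriptor and must stay open while armed.
    Returns True if a SIGUSR1 handler was registered (POSIX only).
    """
    # Dump tracebacks on fatal errors such as SIGSEGV.
    faulthandler.enable(file=file)

    registered = False
    # faulthandler.register() and SIGUSR1 do not exist on Windows.
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        # "kill -USR1 <pid>" now dumps the traceback of every thread.
        faulthandler.register(signal.SIGUSR1, file=file, all_threads=True)
        registered = True

    if timeout > 0:
        # Watchdog: dump all tracebacks once `timeout` seconds have passed.
        faulthandler.dump_traceback_later(timeout, file=file, exit=False)
    else:
        faulthandler.cancel_dump_traceback_later()
    return registered
```

Calling `install_hang_diagnostics()` early in a test run and then sending `kill -USR1 <pid>` from a second terminal should print the current stack of each thread, provided the machine is still responsive enough to deliver the signal.
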
| msg134901 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-05-01 06:03 | |
Thanks Victor, I can try some of that. Could this also be a problem with the buildbot software or a networking problem? The Ubuntu PPC bot might have the same issue. Here the tests appear to be finished but the clean doesn't start: http://www.python.org/dev/buildbot/all/builders/PPC%20Ubuntu%203.1/builds/387/steps/test/logs/stdio http://www.python.org/dev/buildbot/all/builders/PPC%20Ubuntu%203.1/builds/387 |
|||
| msg134922 | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2011-05-01 19:36 | |
That might be another instance of this: http://thread.gmane.org/gmane.comp.python.devel/123698 You might want to bring this up on python-dev. |
|||
| msg134997 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-05-02 18:23 | |
Going through the logs, this indeed looks like a buildbot software issue to me. I attach the logs that correspond to this incident: http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%208.2%203.2/builds/85
After
...
2011-04-30 01:10:56+0200 [Broker,client] closing stdin
2011-04-30 01:10:56+0200 [Broker,client] using PTY: False
...
normally you should see:
... [-] command finished with signal None, exit code 0, elapsedTime:
But there is nothing until I restarted the bot. |
|||
| msg135084 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-05-03 22:15 | |
Another instance:
2011-05-03 20:18:08+0200 [Broker,client] closing stdin
2011-05-03 20:18:08+0200 [Broker,client] using PTY: False
2011-05-03 20:20:38+0200 [-] sending app-level keepalive
Again this is missing:
... [-] command finished with signal None, exit code 0, elapsedTime:
Also, as we speak the Ubuntu PPC bot is hanging as well: http://www.python.org/dev/buildbot/all/builders/PPC%20Ubuntu%202.7/builds/386/steps/test/logs/stdio
Antoine, do you have access to the server logs for the relevant times? My bot is on CEST. |
|||
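
The pattern described in the two messages above (a command starts, but no "command finished" line ever follows) can be checked mechanically. The following is a rough sketch only. It assumes that every command logs one "using PTY:" line when it starts and one "command finished with signal" line when it ends, and that commands do not overlap; the thread shows these lines but does not guarantee that invariant. The function name is hypothetical.

```python
import re

# Markers taken from the log excerpts quoted in the thread.
START = re.compile(r"using PTY:")
FINISH = re.compile(r"command finished with signal")


def find_unfinished_commands(lines):
    """Return the start lines of commands that have no matching finish line."""
    unfinished = []
    pending = None  # start line of the command currently running, if any
    for line in lines:
        if START.search(line):
            if pending is not None:
                # A new command started before the previous one finished.
                unfinished.append(pending)
            pending = line.rstrip("\n")
        elif FINISH.search(line):
            pending = None
    if pending is not None:
        # The log ended while a command was still running.
        unfinished.append(pending)
    return unfinished
```

Passing the lines of a slave log file to this function, for example `find_unfinished_commands(open(path))` with a path of your choosing, returns the start lines that were never followed by a completion message.
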
| msg135085 | Author: Barry A. Warsaw (barry) * (Python committer) | Date: 2011-05-03 22:40 | |
My Ubuntu PPC server is having hardware problems. It will just intermittently shut off. I've reset the SMU and the PRAM, vacuumed out the guts, reseated the RAM, pulled any possibly problematic 3rd party boards, and it still crashes. I was watching the syslog and it didn't look like a thermal shutdown, though it acted like that. The only thing I can think of is a power supply problem, so I'm going to see if I can find an inexpensive replacement. In the meantime, this machine will be offline for a couple of weeks at least. |
|||
| msg135174 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-05-05 07:10 | |
The FreeBSD bot had these error messages in the log files:
1) kernel: swap_pager: indefinite wait buffer: device
2) Approaching the limit on PV entries, consider increasing either the vm.pmap.shpgperproc or the vm.pmap.pv_entry_max sysctl.
I set up the bot from scratch with these changes:
a) Use a swap partition (2 GB) instead of a swap file (2 GB).
b) Use these sysctls: kern.ipc.shm_use_phys=1 vm.pmap.shpgperproc=4096 vm.pmap.pv_entry_max=16777216
c) Use a self-compiled Python 2.7 instead of the system Python 2.6.
Let's see how that works out. Error 1) is bad; perhaps FreeBSD does not play well with the qcow2 image format under high load. |
|||
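
To confirm that the three settings from item b) are actually in place after rebuilding a machine, a configuration file can be compared against them. This is a sketch under assumptions: the thread does not say where the values were stored, so the helper takes the file contents as a string, and it assumes a simple `name=value` format with `#` comments. The function names are hypothetical.

```python
# Values quoted in the message above.
WANTED = {
    "kern.ipc.shm_use_phys": "1",
    "vm.pmap.shpgperproc": "4096",
    "vm.pmap.pv_entry_max": "16777216",
}


def parse_settings(text):
    """Parse 'name=value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


def missing_or_different(text, wanted=WANTED):
    """Map each setting that is absent or wrong to (found, expected)."""
    found = parse_settings(text)
    return {
        name: (found.get(name), value)
        for name, value in wanted.items()
        if found.get(name) != value
    }
```

An empty result means all three values match; otherwise each entry shows what was found (or `None`) next to the expected value.
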
| msg135175 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-05-05 07:36 | |
On second thought, I don't want to debug possible qcow2 issues, so I made another change: d) Use raw format for the image. |
|||
| msg135421 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2011-05-07 09:06 | |
I think the FreeBSD bot changes are working out fine. The Ubuntu-PPC issues were unrelated, so I'm closing this. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:16 | admin | set | github: 56171 |
| 2011-05-07 09:06:55 | skrah | set | status: open -> closed; messages: + msg135421; keywords: + buildbot; resolution: fixed; stage: resolved |
| 2011-05-05 07:36:05 | skrah | set | messages: + msg135175 |
| 2011-05-05 07:10:24 | skrah | set | messages: + msg135174 |
| 2011-05-03 22:40:58 | barry | set | messages: + msg135085 |
| 2011-05-03 22:17:09 | skrah | set | nosy: + barry |
| 2011-05-03 22:15:40 | skrah | set | messages: + msg135084; title: FreeBSD-AMD64 bot sporadic hanging -> Buildbot reliability |
| 2011-05-02 18:23:41 | skrah | set | files: + freebsd-amd64-log.txt; messages: + msg134997 |
| 2011-05-01 19:36:41 | ned.deily | set | nosy: + ned.deily; messages: + msg134922 |
| 2011-05-01 06:03:43 | skrah | set | messages: + msg134901 |
| 2011-04-30 23:15:44 | vstinner | set | nosy: + vstinner; messages: + msg134890 |
| 2011-04-30 06:43:10 | skrah | create | |