Created on 2011-08-12 00:25 by Michael.Hall, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| test_case.zip | Michael.Hall, 2011-08-12 18:11 | Test Case |
| Messages (7) | |||
|---|---|---|---|
| msg141932 | Author: Michael Hall (Michael.Hall) | Date: 2011-08-12 00:25 | |
I recently switched to Ubuntu 11.04 from OpenSUSE 11.4. When I run a project that I wrote a couple of days ago under OpenSUSE using the multiprocessing library, it hangs, which it did not do under OpenSUSE. Specifically, I am using two queues: work_queue, from which the children get jobs, and results_queue, where they place their results before calling JoinableQueue.task_done() and grabbing the next job. I use the "poison pill" technique to terminate the children: a None object is placed at the end of the queue for each child, and when a child gets one of these terminating objects it calls task_done() again (to account for the None object) and exits. After spawning all of the children (one per physical CPU), the main process joins work_queue in order to wait for all of its children to finish. This is pretty much a cookie-cutter multiprocessing implementation that I've used successfully for years under OpenSUSE, but for some odd reason the exact same code does not work under Ubuntu. I would try porting to Python 3.x, but the rest of my research team is still using 2.7, so that's not really an option right now. |
|||
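The setup described in msg141932 can be sketched roughly as follows. This is a minimal illustration of the poison-pill pattern with a JoinableQueue, not the code from test_case.zip: the names worker and run_jobs, the start_method parameter, and the squaring "job" are assumptions made for the example, and it is written for Python 3 rather than the reporter's 2.7.

```python
import multiprocessing as mp


def worker(work_queue, results_queue):
    """Consume jobs until a None sentinel (the "poison pill") arrives."""
    while True:
        job = work_queue.get()
        if job is None:
            # Account for the sentinel itself, then exit.
            work_queue.task_done()
            break
        results_queue.put(job * job)  # placeholder for the real computation
        work_queue.task_done()


def run_jobs(jobs, n_workers=2, start_method=None):
    """Hypothetical driver: wait for the children via work_queue.join()."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    work_queue = ctx.JoinableQueue()
    results_queue = ctx.Queue()

    jobs = list(jobs)
    for job in jobs:
        work_queue.put(job)
    for _ in range(n_workers):
        work_queue.put(None)  # one sentinel per child

    children = [
        ctx.Process(target=worker, args=(work_queue, results_queue))
        for _ in range(n_workers)
    ]
    for child in children:
        child.start()

    # Blocks until task_done() has been called once for every item put,
    # sentinels included.
    work_queue.join()

    # Drain the results before joining the children so that no child is
    # left waiting to flush its queue buffer.
    results = [results_queue.get() for _ in jobs]
    for child in children:
        child.join()
    return sorted(results)
```

With this sketch, run_jobs(range(5)) returns [0, 1, 4, 9, 16]. On platforms that start children with spawn or forkserver, the call has to sit under an `if __name__ == "__main__":` guard.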
| msg141933 | Author: Michael Hall (Michael.Hall) | Date: 2011-08-12 00:30 | |
Edit: Sorry, I should have been clearer. The hang occurs after the first child process exits, at which point all four children become zombies (none of the others exit; they just zombify immediately), and the main process sits there waiting forever for the rest of the children to clear out the queue, which of course never happens. |
|||
| msg141972 | Author: Meador Inge (meador.inge) (Python committer) | Date: 2011-08-12 16:50 | |
Michael, it is hard to tell from your description alone where the bug is. Could you provide more detailed reproduction steps with a test case that exhibits the issue? |
|||
| msg141982 | Author: Michael Hall (Michael.Hall) | Date: 2011-08-12 18:11 | |
Okay, I have attached the code I've been using. Don't worry about what it does (it's a biology thing); just follow these steps: 1. Make sure you have numpy and scipy installed. 2. Extract the zip file. 3. Run it with ./svm_main.py test_obligate.dat test_transient.dat. The method svm_main.grid_search and the module grid_search_process are probably the only things you need to pay attention to; everything else is problem-specific. |
|||
| msg142090 | Author: Michael Hall (Michael.Hall) | Date: 2011-08-15 00:55 | |
I tried switching from joining on the work_queue to just joining on the individual child processes, and it seems to work now. Weird. Anyway, it'd be nice to see the JoinableQueue fixed, but it's not pressing any more. |
|||
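The workaround reported in msg142090, joining the individual child processes instead of the work queue, might look something like the sketch below. It is an assumption about the shape of that change, not the reporter's actual code: pill_worker, run_jobs_joining_children, and start_method are hypothetical names, and the squaring stands in for the real work. Because nothing joins the queue any more, a plain Queue is enough and task_done() is not needed.

```python
import multiprocessing as mp


def pill_worker(work_queue, results_queue):
    """Process jobs until the None sentinel is received."""
    # iter(callable, sentinel) keeps calling get() until it returns None.
    for job in iter(work_queue.get, None):
        results_queue.put(job * job)  # placeholder for the real computation


def run_jobs_joining_children(jobs, n_workers=2, start_method=None):
    """Hypothetical driver: wait on the child processes, not on the queue."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    work_queue = ctx.Queue()
    results_queue = ctx.Queue()

    jobs = list(jobs)
    for job in jobs:
        work_queue.put(job)
    for _ in range(n_workers):
        work_queue.put(None)  # one sentinel per child

    children = [
        ctx.Process(target=pill_worker, args=(work_queue, results_queue))
        for _ in range(n_workers)
    ]
    for child in children:
        child.start()

    # Collect every result first: a child that still has buffered queue
    # items will not terminate until they have been consumed.
    results = [results_queue.get() for _ in jobs]

    for child in children:
        child.join()
    return sorted(results), [child.exitcode for child in children]
```

The order matters here: the multiprocessing documentation warns that joining a process before its queued items have been consumed can deadlock, so the results are drained before any child is joined.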
| msg235625 | Author: Davin Potts (davin) (Python committer) | Date: 2015-02-09 17:53 | |
Thank you for the provided test case, but because it depends upon compiled code (the libsvm.so.2 file you supplied) it: (1) makes me wonder whether the problem arises inside the supplied library (perhaps it was not rebuilt properly on your Ubuntu 11.04 system after migrating to it from OpenSUSE 11.4; the timestamp on the libsvm.so.2 file appears to support this suspicion); (2) does not give us a reasonably concise test case that we can debug and begin to understand. Would it be possible to supply a simpler demonstration of the issue that perhaps involves only Python code? I realize this issue is quite stale now and that you (Michael) have already reported discovering a workaround. |
|||
| msg240478 | Author: Davin Potts (davin) (Python committer) | Date: 2015-04-11 14:41 | |
Closing this very stale issue as out of date: there has been no response from the OP since the request, months ago, for enough information to proceed. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:20 | admin | set | github: 56947 |
| 2015-04-11 14:41:57 | davin | set | status: pending -> closed; resolution: out of date; messages: + msg240478 |
| 2015-02-10 14:51:10 | davin | set | status: open -> pending |
| 2015-02-09 17:53:03 | davin | set | nosy: + davin; messages: + msg235625 |
| 2011-08-15 00:55:51 | Michael.Hall | set | messages: + msg142090 |
| 2011-08-12 18:11:39 | Michael.Hall | set | files: + test_case.zip; messages: + msg141982 |
| 2011-08-12 16:50:27 | meador.inge | set | nosy: + meador.inge, jnoller; messages: + msg141972; stage: test needed |
| 2011-08-12 00:30:41 | Michael.Hall | set | type: behavior; messages: + msg141933 |
| 2011-08-12 00:25:42 | Michael.Hall | create | |