Author neologix
Recipients loewis, nadeem.vawda, neologix, pitrou, python-dev, vstinner
Date 2012-02-26.11:55:49
Message-id <1330257351.04.0.329686909156.issue14107@psf.upfronthosting.co.za>
In-reply-to
Content
"""
Thread 0x00002ba588709700:
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1168 in consumer
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/threading.py", line 682 in run
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/threading.py", line 729 in _bootstrap_inner
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/threading.py", line 702 in _bootstrap
Current thread 0x00002ba57b2d4260:
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1198 in stop
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1240 in wrapper
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/case.py", line 385 in _executeTestPart
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/case.py", line 440 in run
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/case.py", line 492 in __call__
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 105 in run
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 67 in __call__
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 105 in run
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 67 in __call__
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/runner.py", line 168 in run
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1369 in _run_suite
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1403 in run_unittest
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/test_bigmem.py", line 1252 in test_main
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/regrtest.py", line 1221 in runtest_inner
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/regrtest.py", line 907 in runtest
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/regrtest.py", line 710 in main
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/__main__.py", line 13 in <module>
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/runpy.py", line 73 in _run_code
 File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/runpy.py", line 160 in _run_module_as_main
"""
There's a problem with the _file_watchdog thread:
if the pipe gets full (because the consumer thread doesn't get to run
often enough), the watchdog thread will block on the write() to the
pipe.
Then, the main thread tries to stop the watchdog:
"""
static void
cancel_file_watchdog(void)
{
    /* Notify cancellation */
    PyThread_release_lock(watchdog.cancel_event);
    /* Wait for thread to join */
    PyThread_acquire_lock(watchdog.running, 1);
    PyThread_release_lock(watchdog.running);
    /* The main thread should always hold the cancel_event lock */
    PyThread_acquire_lock(watchdog.cancel_event, 1);
}
"""
The `cancel_event` lock is released, but the watchdog thread is stuck
on the write().
The only thing that could wake it up is a read() from the consumer
thread, but the main thread - the one calling cancel_file_watchdog()
- blocks when acquiring the `running` lock: since the GIL is not
released, the consumer thread can't run, so it doesn't drain the pipe,
and game over...
"""
    /* We can't do anything if the consumer is too slow, just bail out */
    if (write(watchdog.wfd, (void *) &x, sizeof(x)) < sizeof(x))
        break;
    if (write(watchdog.wfd, data, data_len) < data_len)
        break;
"""
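AFAICT, this bail-out can never trigger, because the write end of the
pipe is not in non-blocking mode: a blocking write() just waits instead
of returning a short count. Putting the write end in non-blocking mode
would solve this issue.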
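To illustrate the difference non-blocking mode makes, here is a minimal Python sketch (not the faulthandler C code). The helper name `fill_pipe_nonblocking` and the chunk size are made up for this example. It assumes a POSIX system and Python 3.5+ for `os.set_blocking`. It fills a pipe that nobody reads: the write fails with `BlockingIOError` once the pipe is full, where a blocking descriptor would hang.

```python
import os

def fill_pipe_nonblocking(chunk=b"x" * 512):
    """Write to a pipe that nobody reads until the kernel refuses more data.

    Returns the number of bytes accepted before the pipe was full.
    Hypothetical helper for illustration only.
    """
    rfd, wfd = os.pipe()
    # Non-blocking write end: a full pipe raises instead of blocking.
    os.set_blocking(wfd, False)
    written = 0
    try:
        while True:
            try:
                written += os.write(wfd, chunk)
            except BlockingIOError:
                # Pipe is full. With a blocking fd, os.write() would
                # hang here until some reader drained the pipe.
                return written
    finally:
        os.close(rfd)
        os.close(wfd)
```

In the C code, the equivalent would presumably be setting O_NONBLOCK on watchdog.wfd, so that the existing short-write check can actually bail out.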
Otherwise, there are two things I don't understand:
1. IIUC, the goal of the watchdog thread is to collect memory
consumption in a timely manner: that's now the case, but since the
information is printed by a regular thread (which needs the GIL), it
doesn't bring any improvement (because the output can be delayed for
arbitrarily long), or am I missing something?
2. instead of using a thread and the faulthandler infrastructure to run
GIL-less, why not simply use a subprocess? It could then simply
parse /proc/<PID>/statm at a regular interval, and print stats to
stdout. It would also solve point 1.
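A rough sketch of what the statm parsing in such a subprocess might look like, assuming Linux and the field order documented in proc(5). The names `STATM_FIELDS`, `parse_statm` and `read_statm` are hypothetical, not part of any existing patch.

```python
import os

# Field order of /proc/<PID>/statm according to proc(5); all values are
# expressed in pages.
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")

def parse_statm(text, page_size):
    """Convert the contents of a statm file (page counts) to bytes."""
    pages = [int(field) for field in text.split()]
    return {name: count * page_size
            for name, count in zip(STATM_FIELDS, pages)}

def read_statm(pid):
    """Read the current memory usage of process `pid` (Linux only)."""
    with open("/proc/%d/statm" % pid) as f:
        return parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

The monitoring process would then call read_statm() on the test process's PID in a loop, sleeping between samples, and print e.g. the "resident" value to stdout. Since it runs in its own process, it never has to wait for the test's GIL.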
History
Date                 User      Action  Args
2012-02-26 11:55:51  neologix  set     recipients: + neologix, loewis, pitrou, vstinner, nadeem.vawda, python-dev
2012-02-26 11:55:51  neologix  set     messageid: <1330257351.04.0.329686909156.issue14107@psf.upfronthosting.co.za>
2012-02-26 11:55:50  neologix  link    issue14107 messages
2012-02-26 11:55:49  neologix  create