[Python-ideas] reducing multiprocessing.Queue contention

Charles-François Natali cf.natali at gmail.com
Wed Jan 23 12:16:14 CET 2013


Hello,
Currently, multiprocessing.Queue put() and get() methods hold locks
for the entire duration of the writing/reading to the backing
Connection (which can be a pipe, unix domain socket, or whatever it's
called on Windows).
For example, here's what the feeder thread does:
"""
    else:
        wacquire()
        try:
            send(obj)
            # Delete references to object. See issue16284
            del obj
        finally:
            wrelease()
"""
Connection.send() has to serialize the data using pickle before
writing it to the underlying file descriptor, and Connection.recv()
has to unpickle it after reading.
While the locking is necessary to guarantee atomic read/write (well,
it's not necessary if you're writing fewer than PIPE_BUF bytes to a
pipe, and writes seem atomic on Windows), the locks don't have to be
held while the data is serialized.
Although I haven't made any measurements, my gut feeling is that this
serialization can take a non-negligible part of the overall
sending/receiving time for large data items. If that's the case, then
holding the lock only for the duration of the read()/write() syscall
(and not during serialization) could reduce contention when
sending/receiving large data.
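
A minimal sketch of the idea, not the actual Queue code: the object is
pickled before the lock is taken, and only the raw write (or read) happens
under the lock. The helper names send_with_short_lock() and
recv_with_short_lock() are hypothetical, and threading.Lock stands in for
the multiprocessing locks so the example runs in a single process.

```python
import pickle
import threading
from multiprocessing import Pipe


def send_with_short_lock(conn, lock, obj):
    """Pickle obj outside the lock; hold the lock only for the write."""
    # Serialization can be slow for large objects, so do it unlocked.
    data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    with lock:
        # send_bytes() writes the already-serialized payload as one message.
        conn.send_bytes(data)


def recv_with_short_lock(conn, lock):
    """Hold the lock only for the read; unpickle after releasing it."""
    with lock:
        data = conn.recv_bytes()
    return pickle.loads(data)


# Single-process demonstration over a one-way pipe.
reader, writer = Pipe(duplex=False)
wlock = threading.Lock()  # assumption: stand-in for the queue's write lock
rlock = threading.Lock()  # assumption: stand-in for the queue's read lock

send_with_short_lock(writer, wlock, {"payload": list(range(5))})
result = recv_with_short_lock(reader, rlock)
```

Whether this helps in practice depends on how the pickling time compares
with the time spent in the syscall, which would need to be measured.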
One way to do that would be to refactor the code a bit to provide
maybe a (private) AtomicConnection, which would encapsulate the
necessary locking. Another advantage is that this would hide the
platform-dependent code inside Connection (right now, Queue only uses
a lock for sending on Unix platforms, since write is apparently atomic
on Windows).
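
To make the proposal a bit more concrete, here is a rough sketch of what
such a wrapper might look like. Everything here is an assumption: the class
and its constructor arguments are hypothetical, threading.Lock stands in for
the multiprocessing locks, and passing wlock=None models the platform
where no write lock is needed.

```python
import pickle
import threading
from contextlib import nullcontext
from multiprocessing import Pipe


class AtomicConnection:
    """Hypothetical wrapper that hides the locking around a Connection."""

    def __init__(self, conn, rlock=None, wlock=None):
        self._conn = conn
        # A missing lock becomes a no-op context manager, so callers never
        # need to know whether the platform requires locking.
        self._rlock = rlock if rlock is not None else nullcontext()
        self._wlock = wlock if wlock is not None else nullcontext()

    def send(self, obj):
        # Serialize first, then lock only around the actual write.
        data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        with self._wlock:
            self._conn.send_bytes(data)

    def recv(self):
        # Lock only around the actual read, then deserialize.
        with self._rlock:
            data = self._conn.recv_bytes()
        return pickle.loads(data)


# Single-process demonstration: wrap each end of a one-way pipe.
raw_reader, raw_writer = Pipe(duplex=False)
reader = AtomicConnection(raw_reader, rlock=threading.Lock())
writer = AtomicConnection(raw_writer, wlock=threading.Lock())

writer.send([1, 2, 3])
received = reader.recv()
```

With something like this, Queue would just call send() and recv() and would
not have to carry its own platform check for the write lock.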
Thoughts?

