This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.
Created on 2013-01-24 15:49 by neologix, last changed 2022-04-11 14:57 by admin. This issue is now closed.
**Files**

| File name | Uploaded | Description | Edit |
|---|---|---|---|
| queues_contention.diff | neologix, 2013-01-24 15:49 | | |
| multi_queue.py | neologix, 2013-01-24 15:49 | | |
| locked_send_recv.patch | pitrou, 2013-01-27 18:41 | | review |
| queues_contention-1.diff | neologix, 2013-03-21 13:51 | | review |
| queues_contention-3.diff | neologix, 2013-03-21 15:25 | | |
| queues_contention.diff | neologix, 2013-03-24 11:26 | | review |
| forkingpickler.diff | neologix, 2013-03-24 11:26 | | review |
**Messages (20)**
**msg180532** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-01-24 15:49

Here's an implementation of the idea posted on python-ideas (http://mail.python.org/pipermail/python-ideas/2013-January/018846.html).

The principle is really simple: we serialize/deserialize the objects before/after holding the locks, which reduces contention.

Here are the results of a benchmark running from 1 reader/1 writer up to 4 readers/4 writers, on an 8-core box.

Without the patch:

```
$ ./python /tmp/multi_queue.py
took 0.8340198993682861 seconds with 1 workers
took 1.956531047821045 seconds with 2 workers
took 3.175778865814209 seconds with 3 workers
took 4.277260780334473 seconds with 4 workers
```

With the patch:

```
$ ./python /tmp/multi_queue.py
took 0.7945001125335693 seconds with 1 workers
took 0.7428359985351562 seconds with 2 workers
took 0.7897098064422607 seconds with 3 workers
took 1.1860828399658203 seconds with 4 workers
```

I tried Richard's suggestion of serializing the data inside put(), but this reduces performance quite notably:

```
$ ./python /tmp/multi_queue.py
took 1.412883996963501 seconds with 1 workers
took 1.3212130069732666 seconds with 2 workers
took 1.2271699905395508 seconds with 3 workers
took 1.4817359447479248 seconds with 4 workers
```

Although I didn't analyse it further, I guess one reason could be that if the serializing is done in put(), the feeder thread has nothing to do but wait for data to become available in the buffer, send it, and block until there's more to do: basically, it almost doesn't use its time slice, and spends its time blocking and doing context switches.

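For illustration, the pattern being described boils down to pickling outside the critical section, so that concurrent putters and getters only contend for the short raw-bytes transfer. Here is a minimal, self-contained sketch of the idea (the class name and structure are illustrative, not the actual patch):

```python
import multiprocessing as mp
import pickle

class LowContentionSimpleQueue:
    """Sketch only: serialize/deserialize outside the locks, so the locks
    are held just long enough to move raw bytes through the pipe."""

    def __init__(self):
        self._reader, self._writer = mp.Pipe(duplex=False)
        self._rlock = mp.Lock()
        self._wlock = mp.Lock()

    def put(self, obj):
        data = pickle.dumps(obj)             # serialize *before* taking the lock
        with self._wlock:                    # lock held only for the raw send
            self._writer.send_bytes(data)

    def get(self):
        with self._rlock:                    # lock held only for the raw recv
            data = self._reader.recv_bytes()
        return pickle.loads(data)            # deserialize *after* releasing it
```

With this layout, several writers can pickle large objects in parallel on different cores and only serialize briefly on the pipe itself, which is what the benchmark numbers above reflect.
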
**msg180533** - Author: Antoine Pitrou (pitrou) (Python committer) - Date: 2013-01-24 16:44

> Here's an implementation of the idea posted on python-ideas
> (http://mail.python.org/pipermail/python-ideas/2013-January/018846.html).
>
> The principle is really simple: we serialize/deserialize the objects
> before/after holding the locks, which reduces contention.

I would like to suggest again my idea of doing it in Connection instead, with new methods (e.g. locked_send and locked_recv), especially given that a thread-safe Connection can be useful in user code (I'm in this situation currently).

**msg180781** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-01-27 18:29

> I would like to suggest again my idea of doing it in Connection instead,
> with new methods (e.g. locked_send and locked_recv), especially given
> that a thread-safe Connection can be useful in user code (I'm in this
> situation currently).

I intended to do this initially, but it turned out to be much more intrusive than I initially thought, so I opted for a simpler approach. While it's probably a good idea to implement it in Connection, I don't really like the idea of adding new distinct methods:

- it would require allocating locks for every connection, which wouldn't be used most of the time
- since locks are implemented atop POSIX semaphores, and some platforms only support a handful of them, it could trigger failures
- it's not really just adding locked_send() and locked_recv(): you must implement locked versions of send_bytes/send/recv_bytes/recv_bytes_into/recv; also, if we want to implement timed and non-blocking receive (which Queue.get supports), it should probably be handled here

So basically, I think the right way to do it would be to define an abstract ConcurrentConnection or AtomicConnection, which would override the Connection methods to make them thread/process safe. We'd also need to expose an AtomicPipe...

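To make the proposal concrete, a wrapper along these lines is roughly what is being debated. This is a sketch under stated assumptions: the name AtomicConnection comes from the message above, the body is illustrative, and a single lock guards both directions for simplicity:

```python
import multiprocessing as mp
import pickle

class AtomicConnection:
    """Sketch only: wrap an existing Connection and make each operation
    atomic via a private lock, pickling outside the critical section."""

    def __init__(self, conn):
        self._conn = conn
        # One multiprocessing.Lock per wrapped connection; on POSIX this
        # consumes a semaphore, which is the allocation cost the message
        # above objects to.
        self._lock = mp.Lock()

    def locked_send(self, obj):
        data = pickle.dumps(obj)             # pickle before taking the lock
        with self._lock:
            self._conn.send_bytes(data)

    def locked_recv(self):
        with self._lock:
            data = self._conn.recv_bytes()
        return pickle.loads(data)            # unpickle after releasing it
```

The objection in the message stands out clearly here: a complete version would also need locked variants of send_bytes/recv_bytes/recv_bytes_into, plus timeout and non-blocking support, which is why the simpler queue-level approach was chosen.
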
**msg180782** - Author: Antoine Pitrou (pitrou) (Python committer) - Date: 2013-01-27 18:41

For the record, I tried the Connection approach, and what I ended up with is attached (locked_send_recv.patch).

**msg181266** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-02-03 13:00

> For the record, I tried the Connection approach, and what I ended up
> with is attached (locked_send_recv.patch).

I don't really like the API. Having to pass an external lock is IMO a bad idea; it should be a private instance field. Also, for consistency we'd probably need send_bytes/recv_bytes.

**msg183486** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-04 19:24

So, what do you think? Is the simple version, which offloads the serialization to the queue, enough, or should we go for a full-blown atomic Connection/Pipe/etc.? I find the performance gain quite appreciable: basically, the queue didn't scale at all before, and now it scales with the number of cores.

**msg183487** - Author: Antoine Pitrou (pitrou) (Python committer) - Date: 2013-03-04 19:25

IMHO the simple version is good enough.

**msg183493** - Author: Richard Oudkerk (sbt) (Python committer) - Date: 2013-03-04 19:48

It looks like queues_contention.diff has the line

```
obj = pickle.dumps(obj)
```

in both _feed() and put(). Might that be why the third set of benchmarks was slower than the second?

**msg183494** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-04 20:01

> It looks like queues_contention.diff has the line
>
>     obj = pickle.dumps(obj)
>
> in both _feed() and put(). Might that be why the third set of benchmarks
> was slower than the second?

_feed() is a Queue method; put() is its SimpleQueue counterpart. Am I missing something?

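To make the distinction explicit: in the patch, Queue keeps its feeder thread and pickles there, while SimpleQueue has no feeder thread and pickles directly in put(). A rough sketch of where each `pickle.dumps` call lives (illustrative, not the actual diff):

```python
import pickle

def _feed(buffer, send_bytes, writelock):
    # Sketch of Queue._feed(): runs in the background feeder thread.
    # Queue.put() only appends to `buffer`, so callers never pay for
    # serialization themselves.
    while True:
        obj = buffer.get()
        data = pickle.dumps(obj)       # pickled in the feeder thread
        with writelock:
            send_bytes(data)

def put(self, obj):
    # Sketch of SimpleQueue.put(): no feeder thread, so put() itself
    # pickles, then sends under the write lock.
    data = pickle.dumps(obj)
    with self._wlock:
        self._writer.send_bytes(data)
```
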
**msg183500** - Author: Richard Oudkerk (sbt) (Python committer) - Date: 2013-03-04 22:09

On 04/03/2013 8:01pm, Charles-François Natali wrote:
> > It looks like queues_contention.diff has the line
> >
> >     obj = pickle.dumps(obj)
> >
> > in both _feed() and put(). Might that be why the third set of benchmarks
> > was slower than the second?
>
> _feed() is a Queue method; put() is its SimpleQueue counterpart. Am I
> missing something?

No. I only looked at the diff and assumed both changes were for Queue.

Since you marked issue 10886 as superseded, do you intend to do the pickling in put()?

**msg183523** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-05 13:30

> No. I only looked at the diff and assumed both changes were for Queue.

OK, great.

> Since you marked issue 10886 as superseded, do you intend to do the
> pickling in put()?

Actually no, I'll reopen it. I find the performance hit too significant, though.

**msg184871** - Author: Antoine Pitrou (pitrou) (Python committer) - Date: 2013-03-21 13:55

By the way, I forgot to mention it previously, but multiprocessing.connection uses a custom pickler (ForkingPickler). By replacing it with plain pickle.dumps() calls, you may produce regressions, since some types won't be sendable anymore (although the test suite might not check for this).

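The eventual fix (see the changesets below) was to give ForkingPickler dumps() and loads() entry points and use those instead of plain pickle, so multiprocessing's registered reducers still apply. A small usage sketch: note that in current CPython the class lives in multiprocessing.reduction, whereas at the time of this issue it was being moved into multiprocessing.forking:

```python
import pickle
from multiprocessing.reduction import ForkingPickler

obj = {"answer": 42}

# ForkingPickler applies multiprocessing's registered reducers, so types
# with custom reduction (connections, sockets, ...) remain sendable,
# which a bare pickle.dumps() would break.
payload = ForkingPickler.dumps(obj)   # the classmethod added by this issue
restored = pickle.loads(payload)      # ForkingPickler.loads is plain loads
print(restored)
```
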
**msg184875** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-21 15:25

> By the way, I forgot to mention it previously, but
> multiprocessing.connection uses a custom pickler (ForkingPickler).

Thanks, I didn't know. Here's a patch using ForkingPickler. I did a bit of refactoring to move the pickling code from connection.py to forking.py.

**msg185128** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-24 11:26

I'm splitting the patches:
- one which adds loads() and dumps() to ForkingPickler
- the contention-reduction patch

I'd like to commit them soon.

**msg185130** - Author: Richard Oudkerk (sbt) (Python committer) - Date: 2013-03-24 12:03

The old code deleted obj in the feeder thread as soon as it was sent, at lines 247 and 253 -- see issue #16284. I think that should be retained.

Apart from that, LGTM.

**msg185131** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-24 12:16

> The old code deleted obj in the feeder thread as soon as it was sent,
> at lines 247 and 253 -- see issue #16284. I think that should be retained.

The object is overwritten by the pickled data, so it's not necessary anymore, no?

**msg185136** - Author: Richard Oudkerk (sbt) (Python committer) - Date: 2013-03-24 13:33

On 24/03/2013 12:16pm, Charles-François Natali wrote:
> The object is overwritten by the pickled data, so it's not necessary
> anymore, no?

Yes, you are right.

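The point being conceded: rebinding `obj` to the pickled bytes drops the last strong reference to the original object, just as the old explicit `del obj` did. A standalone demonstration (not code from the patch) makes this easy to verify:

```python
import pickle
import weakref

class Payload:
    pass

obj = Payload()
ref = weakref.ref(obj)

# Rebinding the only name to the pickled bytes releases the last strong
# reference, so in CPython the original object is reclaimed immediately
# by reference counting -- no explicit `del obj` needed.
obj = pickle.dumps(obj)

print(ref() is None)   # True: the original Payload is already gone
```
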
**msg185139** - Author: Roundup Robot (python-dev) (Python triager) - Date: 2013-03-24 14:24

New changeset bedb4cbdd311 by Charles-François Natali in branch 'default':
Issue #17025: Add dumps() and loads() to ForkingPickler.
http://hg.python.org/cpython/rev/bedb4cbdd311

**msg185214** - Author: Roundup Robot (python-dev) (Python triager) - Date: 2013-03-25 17:21

New changeset 5022ee7e13a2 by Charles-François Natali in branch 'default':
Issue #17025: multiprocessing: Reduce Queue and SimpleQueue contention.
http://hg.python.org/cpython/rev/5022ee7e13a2

**msg185217** - Author: Charles-François Natali (neologix) (Python committer) - Date: 2013-03-25 17:43

Committed, thanks!

**History**

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:57:40 | admin | set | github: 61227 |
| 2013-03-25 17:43:58 | neologix | set | status: open -> closed; resolution: fixed; messages: + msg185217; stage: commit review -> resolved |
| 2013-03-25 17:21:19 | python-dev | set | messages: + msg185214 |
| 2013-03-24 14:24:50 | python-dev | set | nosy: + python-dev; messages: + msg185139 |
| 2013-03-24 13:33:03 | sbt | set | messages: + msg185136 |
| 2013-03-24 12:16:02 | neologix | set | messages: + msg185131 |
| 2013-03-24 12:03:36 | sbt | set | messages: + msg185130 |
| 2013-03-24 11:26:37 | neologix | set | files: + queues_contention.diff, forkingpickler.diff; messages: + msg185128 |
| 2013-03-21 15:25:56 | neologix | set | files: + queues_contention-3.diff; messages: + msg184875 |
| 2013-03-21 13:55:21 | pitrou | set | messages: + msg184871 |
| 2013-03-21 13:51:14 | neologix | set | files: + queues_contention-1.diff; stage: commit review; versions: + Python 3.4 |
| 2013-03-05 13:31:23 | neologix | unlink | issue10886 superseder |
| 2013-03-05 13:30:53 | neologix | set | messages: + msg183523 |
| 2013-03-04 22:09:55 | sbt | set | messages: + msg183500 |
| 2013-03-04 20:01:26 | neologix | set | messages: + msg183494 |
| 2013-03-04 19:48:53 | sbt | set | messages: + msg183493 |
| 2013-03-04 19:25:51 | pitrou | set | messages: + msg183487 |
| 2013-03-04 19:24:13 | neologix | set | messages: + msg183486 |
| 2013-02-23 11:03:11 | neologix | link | issue10886 superseder |
| 2013-02-03 13:00:49 | neologix | set | messages: + msg181266 |
| 2013-01-27 18:41:32 | pitrou | set | files: + locked_send_recv.patch; messages: + msg180782 |
| 2013-01-27 18:29:39 | neologix | set | messages: + msg180781 |
| 2013-01-24 16:44:54 | pitrou | set | messages: + msg180533 |
| 2013-01-24 15:49:20 | neologix | set | files: + multi_queue.py |
| 2013-01-24 15:49:06 | neologix | create | |