homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Add `Executor.filter` to concurrent.futures
Type: enhancement Stage:
Components: Documentation Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: bquinlan, cool-RR, docs@python, jnoller, ncoghlan, paul.moore, pitrou, rhettinger, sbt
Priority: normal Keywords: patch

Created on 2015年05月14日 21:26 by cool-RR, last changed 2022年04月11日 14:58 by admin.

Files
File name Uploaded Description Edit
1.patch cool-RR, 2015年05月14日 21:26 review
issue24195.stoneleaf.01.patch ethan.furman, 2015年05月15日 02:24 review
2.patch cool-RR, 2015年05月15日 07:36
filter_example.py cool-RR, 2015年05月18日 07:30
Messages (32)
msg243217 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年05月14日 21:26
`Executor.filter` is to `filter` what `Executor.map` is to `map`.
See Python-ideas thread:
https://groups.google.com/forum/#!topic/python-ideas/EBOC5YCWPyo
Patch attached. I don't know how to run the Python test suite (I'm guessing it involves building Python and I don't know how to do that.) Please tell me whether the tests I wrote pass.
I'm guessing that other than that, the main thing missing in my patch is documentation. If there's agreement that this feature is desirable and the implementation is good, I'll be happy to write the docs. Let me know.
msg243218 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年05月14日 21:28
Also, I notice Python 3.5 feature freeze is a bit over a week away, and I hope we can get that in so it could go in Python 3.5. (Assuming it goes in at all.)
msg243221 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015年05月14日 22:03
You should add docs for the new method, as well.
msg243240 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年05月15日 02:24
Updated the tests (had to use real defs, not lambdas, and the expected results for filter_exception weren't right).
Tests pass.
Get some docs written! :)
(More reviews would also be good. ;)
msg243242 - (view) Author: Brian Quinlan (bquinlan) * (Python committer) Date: 2015年05月15日 03:29
This feature seems unnecessary to me but couldn't the implementation be simplified to work in terms of map? i.e. (pseudocode):
def filter(self, fn, iterable, timeout=None):
 l = list(iterable)
 return (item for (item, keep) in zip(l, self.map(fn, l, timeout)) if keep)
msg243254 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年05月15日 07:36
Patch with documentation attached. (I don't know how to concatenate patches, so 2.patch contains only the documentation, while 1.patch has the implementation and the tests (but Ethan's patch is better.))
Brian, regarding your simpler implementation based on `Executor.map`: It looks good to me, but I'm not sure if it passes the tests. If it does then I don't mind if that's the implementation that would be used.
msg243256 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015年05月15日 08:21
Just as a note - to test a pure Pthon patch like this, you can apply the patch to your installed Python 3.4 installation, and just run the test using that. There should be no need to build your own Python.
>python C:\Apps\Python34\Lib\test\test_concurrent_futures.py
test_cancel (__main__.FutureTests) ... ok
test_cancelled (__main__.FutureTests) ... ok
test_done (__main__.FutureTests) ... ok
test_done_callback_already_cancelled (__main__.FutureTests) ... ok
test_done_callback_already_failed (__main__.FutureTests) ... ok
test_done_callback_already_successful (__main__.FutureTests) ... ok
[... etc]
msg243260 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015年05月15日 09:24
This feature looks unnecessary to me as well. Adding features has a non-zero cost in maintenance.
msg243272 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年05月15日 15:59
Short History:
=============
(Ram Rachum)
What do you think about adding a method: `Executor.filter`?
I was using something like this: 
 my_things = [thing for thing in things if some_condition(thing)]
But the problem was that `some_condition` took a long time to run waiting on I/O, which is a great candidate for parallelizing with ThreadPoolExecutor. I made it work using `Executor.map` and some improvizing, but it would be nicer if I could do:
 with concurrent.futures.ThreadPoolExecutor(100) as executor:
 my_things = executor.filter(some_condition, things)
And have the condition run in parallel on all the threads.
(Nick Coughlan)
I think this is sufficiently tricky to get right that it's worth adding filter() as a parallel to the existing map() API.
msg243274 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015年05月15日 16:30
> (Nick Coughlan)
Nick Coghlan isn't currently on this issue. Unless you were talking about another, separate Nick Coughlan that I don't know of? :)
msg243276 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015年05月15日 16:33
But to answer your argument:
> I think this is sufficiently tricky to get right that it's worth
> adding filter() as a parallel to the existing map() API.
"Tricky to get right" is not a good criterion. The question is whether it's useful or not. Only the OP has had that need AFAIK.
msg243281 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年05月15日 16:55
That comment was from an email by Nick, not to Nick. But now I've added him. ;)
msg243296 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2015年05月16日 06:13
filter() usage has always been less common than map() usage, and we see a similar pattern in comprehension usage as well (i.e. [f(x) for x in y] is a far more common construct than [x for x in p(y)]). That "less common" status doesn't keep us from providing filter() as builtin, or syntactic support for filtering in the comprehension syntax.
As a result, the main question I'd like to see a clear and authoritative answer to is "Given 'seq2 = filter(p, seq)' or 'seq2 = [x for seq if p(x)]', what's the concurrent.futures based parallel execution syntax in cases where the filtering key is expensive to calculate?"
I'd be quite OK with Brian's 2-line implementation going into the concurrent.futures documentation as a filtering recipe, similar to the way Raymond uses the recipes in the itertools documentation to help minimise complexity growth in the core API.
I *don't* mind if Brian's judgement is that it doesn't rise to the level of being worth including as a core feature in its own right, as I agree that the typical case of filtering functions is that they're fast, and when they're not, it's often a sign that data model denormalisation may be desirable in order to cache the relevant derived property.
msg243298 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2015年05月16日 07:58
Folks could probably guess what I meant, but both filter comprehensions in my comment should have read: "[x for x in seq if p(x)]"
msg243388 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年05月17日 07:52
Okay, let me understand what the opinions of people are about this feature.
Ram Rachum: +1
Antoine Pitrou: -1
Nick Coghlan: Is that a +1 or a +0?
Ethan and Paul, you've participated but I didn't get your opinion on whether you want to see this feature go in. 
If there are more -1s than +1s we can close this ticket.
msg243390 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015年05月17日 08:02
I'm not a heavy user of concurrent.futures, so I don't have a strong opinion. I've never felt a need for this function, though, so I guess I'm -0. A recipe in the docs would be good, though.
msg243436 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2015年05月17日 23:48
I'm now +1 on a recipe in the docs, -0 on a concrete method on Executor
that delegates to the map() method.
msg243463 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年05月18日 07:30
Looks like this a recipe in the docs is where this ticket is headed.
I took my original example for `Executor.filter` and changed it using Brian's suggestion so now it uses `Executor.map`. Please check it out. If someone other than me feels comfortable in putting it into the documentation page, I'll be happy if you could do that.
Possible issues:
This example uses `requests`, which isn't in the standard library.
I would say that this `for` line:
 for person in (person for person, filter_result in
 zip(people, executor.map(has_wikipedia_page, people))
 if filter_result):
Is a bit ugly, but since the consensus here is that this is better than implementing `Executor.filter`, so be it.
msg243493 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年05月18日 16:18
I'd rather see an `Executor.filter()` myself, but it's Brian Quinlan's call.
msg243871 - (view) Author: Brian Quinlan (bquinlan) * (Python committer) Date: 2015年05月23日 00:03
Ethan: I'm -0 so I'd happily go with the consensus. Does anyone have a strong opinion?
Ram: What did you mean by "Possible issues"? Did you mean that to be an example using the executor.map() technique?
msg243886 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2015年05月23日 05:03
I vary between +0 and -0 for the addition of the concrete method.
When I'm at +0, the main rationale is that we *don't* have the "Where do we stop?" risk here that itertools faces, as we're just replicating the synchronous builtins.
When I'm at -0, the main rationale is that a recipe works with *any* version of Python that provides concurrent.futures (including any version of the PyPI backport), and is hence useful immediately, while a method would only work the version where we added it.
msg243888 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年05月23日 06:06
It looks like we are pretty much neutral between the +1's and -1's.
Antoine seems to be opposed on general principles against bloat, while I am for it on general principles of completeness.
The recipe could still go in the docs for people to use on previous versions of Python.
If I was a module maintainer, and the other maintainers were not opposed, I would add it.
(So, Brian, care to have a co-maintainer? ;)
msg243890 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015年05月23日 06:53
I would like to tip the balance in favor of the -1. Antoine is right about this one.
msg243891 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年05月23日 07:25
Raymond: Thank you. So the discussion is back on adding a recipe to the docs.
Brian: When I said "possible issues", I followed that with a couple of issues with the example I uploaded (filter_example.py).
msg244914 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年06月06日 15:30
A problem I just realized with Brian's 2-line implementation of `filter`: It doesn't work for iterables which aren't sequences, since it attempts to exhaust the iterable twice. So if you have a non-sequence you'll have to make it into a sequence first.
msg245004 - (view) Author: Brian Quinlan (bquinlan) * (Python committer) Date: 2015年06月08日 14:31
Actually it immediately converts the iterable into a list. Recall:
def filter(self, fn, iterable, timeout=None):
 l = list(iterable) # iterable => list
 return (item for (item, keep) in zip(l, self.map(fn, l, timeout)) if keep)
msg245016 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年06月08日 16:39
The recipe will work in simple situations.
In more complex situations:
 - infinite iterator
 - really large iterator (millions of items)
 - high work load to produce next iteration
the recipe will either have a severe impact on performance or fail entirely.
msg245017 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年06月08日 16:44
Ethan: That's what I also thought, but then I figured that you're creating N futures anyway with an item attached to each, so it's on the same scale as creating a list, no?
msg245025 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年06月08日 17:10
The recipe creates a list before it ever starts processing, while Executor.filter() starts processing with the first item.
msg245026 - (view) Author: Ram Rachum (cool-RR) * Date: 2015年06月08日 17:11
Right, so it might be garbage-collecting the items before the futures are done being created, so the whole list wouldn't need to be saved in memory. I understand now.
msg246029 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015年07月01日 07:25
Brian, given my comments in msg245016 are you willing to add the Executor.filter() function?
msg341635 - (view) Author: Brian Quinlan (bquinlan) * (Python committer) Date: 2019年05月06日 20:06
Hey Ethan, I'm really sorry about dropping the ball on this. I've been burnt out on Python stuff for the last couple of years.
When we left this, it looked like the -1s were in the majority and no one new has jumped on to support `filter`.
If you wanted to add this, I wouldn't object. But I've been inactive so long that I don't think that I should make the decision.
History
Date User Action Args
2022年04月11日 14:58:16adminsetgithub: 68383
2019年05月06日 20:06:01bquinlansetmessages: + msg341635
2015年07月21日 07:07:46ethan.furmansetnosy: - ethan.furman
2015年07月01日 07:25:40ethan.furmansetmessages: + msg246029
versions: + Python 3.6, - Python 3.5
2015年06月08日 17:11:33cool-RRsetmessages: + msg245026
2015年06月08日 17:10:15ethan.furmansetmessages: + msg245025
2015年06月08日 16:44:05cool-RRsetmessages: + msg245017
2015年06月08日 16:39:08ethan.furmansetmessages: + msg245016
2015年06月08日 14:31:53bquinlansetmessages: + msg245004
2015年06月06日 15:30:54cool-RRsetmessages: + msg244914
2015年05月23日 07:37:47rhettingersetassignee: docs@python

components: + Documentation, - Library (Lib)
nosy: + docs@python
2015年05月23日 07:25:14cool-RRsetmessages: + msg243891
2015年05月23日 06:53:18rhettingersetnosy: + rhettinger
messages: + msg243890
2015年05月23日 06:06:26ethan.furmansetmessages: + msg243888
2015年05月23日 05:03:28ncoghlansetmessages: + msg243886
2015年05月23日 00:03:00bquinlansetmessages: + msg243871
2015年05月18日 16:18:28ethan.furmansetmessages: + msg243493
2015年05月18日 07:30:58cool-RRsetfiles: + filter_example.py

messages: + msg243463
2015年05月17日 23:48:53ncoghlansetmessages: + msg243436
2015年05月17日 08:02:56paul.mooresetmessages: + msg243390
2015年05月17日 07:52:27cool-RRsetmessages: + msg243388
2015年05月16日 07:58:55ncoghlansetmessages: + msg243298
2015年05月16日 06:13:36ncoghlansetmessages: + msg243296
2015年05月15日 16:55:10ethan.furmansetnosy: + ncoghlan
messages: + msg243281
2015年05月15日 16:33:51pitrousetmessages: + msg243276
2015年05月15日 16:30:51pitrousetmessages: + msg243274
2015年05月15日 15:59:36ethan.furmansetmessages: + msg243272
2015年05月15日 09:24:16pitrousetmessages: + msg243260
2015年05月15日 08:21:59paul.mooresetmessages: + msg243256
2015年05月15日 07:36:59cool-RRsetfiles: + 2.patch

messages: + msg243254
2015年05月15日 03:29:19bquinlansetmessages: + msg243242
2015年05月15日 03:08:16ethan.furmansetnosy: + bquinlan, jnoller, sbt

title: Add `Executor.filter` -> Add `Executor.filter` to concurrent.futures
2015年05月15日 02:24:54ethan.furmansetfiles: + issue24195.stoneleaf.01.patch

messages: + msg243240
2015年05月15日 00:46:52ethan.furmansetnosy: + ethan.furman
2015年05月14日 22:42:52rhettingersetnosy: + pitrou
2015年05月14日 22:03:22paul.mooresetnosy: + paul.moore
messages: + msg243221
2015年05月14日 21:28:15cool-RRsetmessages: + msg243218
2015年05月14日 21:26:41cool-RRcreate

AltStyle によって変換されたページ (->オリジナル) /