
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Add filter to multiprocessing.Pool
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: Mike.Drob, christian.leichsenring, davin, sbt, travis.thieman
Priority: normal Keywords:

Created on 2014-11-13 17:35 by Mike.Drob, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg231124 - Author: Mike Drob (Mike.Drob) Date: 2014-11-13 17:35
Being able to use a pool to easily run 'map' over an iterable is very powerful, but it would also be nice to run 'filter' (or potentially 'ifilter' or 'filter_async', in keeping with the patterns already present).
msg231216 - Author: Travis Thieman (travis.thieman) Date: 2014-11-15 20:28
Why is it insufficient to run a synchronous 'filter' over the list returned by 'Pool.map'? These functional constructs are inherently composable, and we should favor composing simple implementations of each rather than implementing special cases of them throughout the stdlib.
I think there's a clear reason for 'map' to be parallelizable because the function you're applying over the iterable could be quite expensive. 'filter' would only benefit from this if the comparison you're running is expensive, which seems like an unlikely and ill-advised use case. You can also rewrite your expensive 'filter' as a 'map' if you really need to.
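The composition described in this message might look like the following minimal sketch, assuming Python 3. The names `is_expensive_match` and `parallel_filter` are hypothetical and not part of `multiprocessing`; the predicate is only a stand-in for a costly check. The predicate runs in the workers through `Pool.map`, and the selection itself happens in the parent process.

```python
from multiprocessing import Pool


def is_expensive_match(n):
    """Hypothetical costly predicate: a naive primality test."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def parallel_filter(pool, predicate, items):
    """Hypothetical helper: evaluate predicate via pool.map, then select locally."""
    items = list(items)
    flags = pool.map(predicate, items)  # the expensive part runs in the workers
    return [item for item, keep in zip(items, flags) if keep]


if __name__ == "__main__":
    with Pool(4) as pool:
        # prints [2, 3, 5, 7, 11, 13, 17, 19]
        print(parallel_filter(pool, is_expensive_match, range(20)))
```

Note that this still materializes one boolean per input item in the parent, so it addresses the cost of the predicate, not the size of the intermediate list.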
msg235687 - Author: Davin Potts (davin) (Python committer) Date: 2015-02-10 14:48
The points made by Travis are clear and solid.
Closing as this functionality is already handled well and no exceptional situations are being argued for that would require a special case.
msg377746 - Author: Christian Leichsenring (christian.leichsenring) Date: 2020-10-01 12:06
The main point the OP didn't make is exactly the issue that Pool.map returns a list which is potentially very large given that multiprocessing is used to process large amounts of data.
So IMHO either there should be the possibility to exclude elements from being saved in memory (i.e. Pool.filter) or Pool.map shouldn't return a list but just an iterable.
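For the memory concern raised here, one possible approach is the existing `Pool.imap`, which returns an iterator rather than a list. The sketch below assumes Python 3; `check` and `lazy_parallel_filter` are hypothetical names, and the predicate is a placeholder. Each worker returns an `(item, keep)` pair so the parent can discard rejected items as they arrive instead of collecting every result first.

```python
from multiprocessing import Pool


def check(n):
    """Hypothetical predicate wrapper: return (item, keep) for the parent."""
    return n, n % 3 == 0


def lazy_parallel_filter(pool, tagged_predicate, items, chunksize=1):
    """Hypothetical helper: yield kept items one at a time from pool.imap."""
    for item, keep in pool.imap(tagged_predicate, items, chunksize):
        if keep:
            yield item


if __name__ == "__main__":
    with Pool(4) as pool:
        for value in lazy_parallel_filter(pool, check, range(10)):
            print(value)
```

This avoids building the full result list in the caller, but it is not a strict memory bound: results the consumer has not yet read can still queue up inside the pool.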
History
Date                 User                    Action  Args
2022-04-11 14:58:10  admin                   set     github: 67053
2020-10-01 12:06:16  christian.leichsenring  set     nosy: + christian.leichsenring; messages: + msg377746
2015-02-10 14:48:47  davin                   set     status: open -> closed; nosy: + davin; messages: + msg235687; resolution: rejected; stage: resolved
2014-11-15 20:28:36  travis.thieman          set     nosy: + travis.thieman; messages: + msg231216
2014-11-13 23:03:46  ned.deily               set     nosy: + sbt
2014-11-13 17:35:27  Mike.Drob               create
