This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Compileall script: add option to use multiple cores
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.5

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: brett.cannon
Nosy List: Claudiu.Popa, Jim.Jewett, brett.cannon, dholth, eric.araujo, gregory.p.smith, python-dev
Priority: low
Keywords: patch

Created on 2012-10-01 20:46 by dholth, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
compileall_v1.patch | Claudiu.Popa, 2013-12-10 12:30 |
issue16104.patch | Claudiu.Popa, 2014-03-10 19:39 | Remove trailing whitespaces
issue16104_1.patch | Claudiu.Popa, 2014-03-12 07:26 |
issue16104_2.patch | Claudiu.Popa, 2014-03-12 21:26 | Minor doc fixes.
issue16104_3.patch | Claudiu.Popa, 2014-03-12 22:08 |
issue16104_4.patch | Claudiu.Popa, 2014-03-12 22:42 |
issue16104_5.patch | Claudiu.Popa, 2014-03-13 16:53 |
issue16104_6.patch | Claudiu.Popa, 2014-03-13 17:48 |
issue16104_7.patch | Claudiu.Popa, 2014-03-13 19:18 | Skip a couple of tests if concurrent.futures is unavailable
issue16104_8.patch | Claudiu.Popa, 2014-04-24 06:20 |
issue16104_9.patch | Claudiu.Popa, 2014-04-27 13:00 |
issue16104_10.patch | Claudiu.Popa, 2014-04-27 14:06 |
issue16104_11.patch | Claudiu.Popa, 2014-04-27 16:13 | Minor doc update
issue16104_12.patch | Claudiu.Popa, 2014-04-30 09:16 |
16104.patch | Claudiu.Popa, 2014-09-10 06:58 | Update the patch for the tip.
Messages (28)
msg171744 - Author: Daniel Holth (dholth) Date: 2012-10-01 20:46
compileall would speed up approximately linearly with additional CPU cores. There should be an option to use them.
The noisy output would have to change. Right now it prints "compiling" and then "done" synchronously with the actual work.
msg171758 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2012-10-01 23:16
This should probably use concurrent.futures instead of multiprocessing directly, but yes it would be useful.
Then again, the whole module should probably be rewritten to use importlib as well.
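The concurrent.futures approach suggested here could look roughly like the sketch below. It is not the patch attached to this issue: `compile_files` and its `executor_cls` parameter are hypothetical names chosen for illustration. Each file is handed to `py_compile.compile` in a worker and the results are gathered in the parent, which also means the parent could do all the printing instead of each worker.

```python
import py_compile
from concurrent.futures import ProcessPoolExecutor


def compile_files(paths, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Byte-compile each path in a pool of workers (hypothetical helper).

    Returns a list of (source_path, pyc_path) pairs; pyc_path is None when
    py_compile could not compile the file.
    """
    paths = list(paths)
    with executor_cls(max_workers=max_workers) as executor:
        # py_compile.compile is a module-level function, so it can be
        # pickled and sent to worker processes.
        results = list(executor.map(py_compile.compile, paths))
    return list(zip(paths, results))
```

With the default executor class, a script calling `compile_files` should do so under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.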
msg205805 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2013-12-10 12:30
Hello!
Here's a draft patch. It adds a new *processes* parameter to *compile_dir* and a new command line parameter as well.
msg213200 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2014-03-12 05:50
Patch looks good. Some comments on Rietveld.
msg213209 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-12 07:26
Thank you for the review, Éric! Here's the updated patch.
msg213298 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2014-03-12 21:05
FTR, py_compile and compileall use importlib in 3.4.
msg213301 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2014-03-12 21:50
This looks ready to me.
One thing: "make -j0" is the spelling for "run using all available cores", whereas "compileall -j0" will use one process. I don’t know if this should be documented, changed or ignored.
msg213303 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2014-03-12 21:52
I vote for changed, so that -j0 uses all available cores as reported by os.cpu_count().
msg213304 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-12 21:53
I agree. I'll modify the patch.
msg213307 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2014-03-12 22:11
+ if args.processes <= 0:
Is that correct? For make, I think I’ve always seen "-j0", not negative values.
Could you add a test for -j0? (i.e. check that "compileall -j0" calls the function with "processes=os.cpu_count()")
msg213308 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-12 22:12
regrtest does that, checking for j <= 0.
msg213317 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-12 22:42
Here's a test that -j0 == os.cpu_count().
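The -j0 handling discussed in the last few messages can be sketched as follows. The parser, the default of 1, and the `effective_processes` helper are illustrative assumptions, not code taken from the patch. `os.cpu_count()` can return None, so the sketch falls back to 1 in that case.

```python
import argparse
import os


def build_parser():
    # Hypothetical stand-in for the compileall command line.
    parser = argparse.ArgumentParser(prog="compileall-sketch")
    parser.add_argument("-j", dest="processes", type=int, default=1,
                        help="number of processes; 0 means all available cores")
    return parser


def effective_processes(requested):
    # Mirror the "<= 0" check quoted above: zero (or a negative value)
    # selects every core the OS reports.
    if requested <= 0:
        return os.cpu_count() or 1
    return requested
```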
msg213340 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2014-03-13 00:28
Importing ProcessPoolExecutor at the top level means compileall will crash on systems which don't have multiprocessing support.
msg213417 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-13 16:53
Here's a new patch which addresses Éric's last comments.
Antoine, I don't have at my disposal a system without multiprocessing support. How does it crash?
msg213419 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2014-03-13 16:59
> Here's a new patch which addresses Éric's last comments.
> Antoine, I don't have at my disposal a system without multiprocessing support. How does it crash?
Neither do I, but you will probably get an ImportError of some sort.
msg213422 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-13 17:48
Here's a new version which catches ImportError for concurrent.futures and raises ValueError in `compile_dir` if `processes` was specified and concurrent.futures is unavailable. The only issue is that I don't know if this should be a ValueError or not. For instance, zipfile uses RuntimeError if `lzma` is unavailable.
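A minimal sketch of the guarded import described in this message, assuming a module-level fallback to None. `check_processes` is a hypothetical helper, and ValueError is used only because this message proposes it; the exception type was still an open question at this point.

```python
try:
    from concurrent.futures import ProcessPoolExecutor
except ImportError:
    # Platforms without multiprocessing support end up here, so merely
    # importing this module no longer fails.
    ProcessPoolExecutor = None


def check_processes(processes, executor_cls=ProcessPoolExecutor):
    """Raise if parallel compilation was requested but is unavailable."""
    if processes is not None and executor_cls is None:
        raise ValueError("multiprocessing support is not available")
    return processes
```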
msg214450 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-03-22 08:31
What can I do to move this forward? I believe all concerns have been addressed and it seems ready to me.
msg217118 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-04-24 06:20
Added a new version of the patch which incorporates suggestions made by Jim. Thanks for the review!
msg217173 - Author: Jim Jewett (Jim.Jewett) (Python triager) Date: 2014-04-25 21:43
ProcessPoolExecutor already defaults to using cpu_count if max_workers is None. Consistency with that might be useful too. (and a default of 1 to mean nothing in parallel is sensible...)
msg217261 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-04-27 13:00
Added a new patch with improvements suggested by Jim. Thanks!
I removed the special handling of processes=1, because a single worker can still be useful: it acts as a background worker which processes the files received from _walk_dir. Also, compile_dir now checks that it receives a positive *processes* value, otherwise it raises a ValueError. As a side note, I just found that ProcessPoolExecutor / ThreadPoolExecutor don't verify the number of workers they are given, leading to certain types of errors (see issue21362 for more details).
Jim, the default for processes is still None, meaning "do not use multiple processes". ProcessPoolExecutor exists to run things in parallel, so it is natural for it to treat None as os.cpu_count(). Here we want the user to be explicit about wanting multiple processes or not.
msg217264 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-04-27 14:06
Added a new patch with fixes proposed by Berker Peksag. Thanks for the review. Hopefully, this is the last iteration of this patch.
msg217399 - Author: Jim Jewett (Jim.Jewett) (Python triager) Date: 2014-04-28 19:20
Trying to put bounds on the disagreements. Does anyone disagree with any of the following:
(1) compileall currently runs single-threaded in a single process.
(2) This enhancement intends to allow parallelization by process.
(3) Users MAY need to express whether they (require/forbid/are expressly apathetic concerning) parallelization.
(3A) There is some doubt that this even needs to be user-controlled.
(3B) If it is user-controlled, the patch proposes adding a "processes" parameter to do this.
(3C) There have been suggestions of other names (notably "workers"), but *if* it is user-controlled, the idea of a new parameter is not controversial.
(4) Users MAY need to control the degree of parallelization.
(4A) If so, setting the value of the new parameter to a positive integer > 1 is an acceptable solution.
(4B) There is not yet consensus on how to represent "Use multi-processing, with the default degree for this system.", "Do NOT use multiprocessing.", or "I don't care."
(4C) Suggested values have included 1, 0, -1, any negative number, None, and specific strings. The precise mapping between some of these and the three cases of 4B is not agreed.
(5) If multiprocessing is explicitly requested, what should happen when it is not available?
(5A) Fall back to the current way, without multi-processing.
(5B) Fall back to the current way, without multi-processing, but issue a Warning.
(5C) Raise an Exception. (ValueError, ImportError, NotImplemented?)
(6) Portions of the documentation unrelated to this should be fixed. But ideally, that would be done separately, and it will NOT be a pre-requisite to this patch.
---------------------------------------------------
Another potential value set
None (the default) ==> let the system parallelize as best it can -- as it does in multiprocessing. If the system picks "not in parallel at all", that is also OK, and no warning is raised.
0 ==> Do not parallelize.
positive integers ==> Use that many processes.
negative ==> ValueError
Would these uses of 0 and negative be too surprising for someone?
msg217586 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-04-30 09:16
Updated patch according to the python-dev thread:
- processes renamed to workers
- `workers` defaults to 1
- When `workers` is equal to 0, then `os.cpu_count` will be used
- When `workers` > 1, multiple processes will be used
- When `workers` == 1, run normally (no multiple processes)
- Negative values raise a ValueError
- Will raise NotImplementedError if multiprocessing can't be used
(when `workers` equals 0 or is > 1)
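The rules in this list can be condensed into one small function. This is a sketch of the described behaviour only, not the committed code: `resolve_workers` and its `multiprocessing_available` flag are hypothetical, and the fallback to 1 when `os.cpu_count()` returns None is an added assumption.

```python
import os


def resolve_workers(workers=1, multiprocessing_available=True):
    """Return the number of processes to use (hypothetical helper)."""
    if workers < 0:
        raise ValueError("workers must be greater than or equal to 0")
    if workers == 1:
        # Default: compile in the current process, no pool involved.
        return 1
    if not multiprocessing_available:
        # Applies to workers == 0 and workers > 1 alike.
        raise NotImplementedError("multiprocessing support is not available")
    if workers == 0:
        # os.cpu_count() may return None; treat that as a single core.
        return os.cpu_count() or 1
    return workers
```

In Python 3.5 the feature is exposed as the `workers` argument of `compileall.compile_dir` and as the `-j` command-line option.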
msg226684 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-09-10 06:58
If there is nothing left to do for this patch, can it be committed?
msg226822 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-09-12 14:40
New changeset 9efefcab817e by Brett Cannon in branch 'default':
Issue #16104: Allow compileall to do parallel bytecode compilation.
http://hg.python.org/cpython/rev/9efefcab817e
msg226823 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2014-09-12 14:40
Thanks for the patch, Claudiu!
msg226824 - Author: PCManticore (Claudiu.Popa) (Python triager) Date: 2014-09-12 14:41
Thank you for committing it. :-)
msg362786 - Author: Gregory P. Smith (gregory.p.smith) (Python committer) Date: 2020-02-27 08:02
This caused a regression in behavior. compileall.compile_dir()'s ddir= parameter no longer does the right thing for any subdirectories.
https://bugs.python.org/issue39769 
History
Date | User | Action | Args
2022-04-11 14:57:36 | admin | set | github: 60308
2020-02-27 08:02:38 | gregory.p.smith | set | nosy: + gregory.p.smith; messages: + msg362786
2014-09-12 14:41:27 | Claudiu.Popa | set | messages: + msg226824
2014-09-12 14:40:30 | brett.cannon | set | status: open -> closed; resolution: fixed; messages: + msg226823; stage: commit review -> resolved
2014-09-12 14:40:06 | python-dev | set | nosy: + python-dev; messages: + msg226822
2014-09-10 14:25:18 | brett.cannon | set | assignee: brett.cannon
2014-09-10 06:58:58 | Claudiu.Popa | set | files: + 16104.patch; messages: + msg226684
2014-04-30 09:16:12 | Claudiu.Popa | set | files: + issue16104_12.patch; messages: + msg217586
2014-04-28 19:21:32 | pitrou | set | nosy: - pitrou
2014-04-28 19:20:44 | Jim.Jewett | set | messages: + msg217399
2014-04-27 16:13:19 | Claudiu.Popa | set | files: + issue16104_11.patch
2014-04-27 14:06:47 | Claudiu.Popa | set | files: + issue16104_10.patch; messages: + msg217264
2014-04-27 13:33:27 | steven.daprano | set | nosy: - steven.daprano
2014-04-27 13:00:19 | Claudiu.Popa | set | files: + issue16104_9.patch; messages: + msg217261
2014-04-25 21:43:54 | Jim.Jewett | set | nosy: + Jim.Jewett; messages: + msg217173
2014-04-24 06:20:06 | Claudiu.Popa | set | files: + issue16104_8.patch; messages: + msg217118
2014-04-23 23:01:34 | terry.reedy | set | title: Use multiprocessing in compileall script -> Compileall script: add option to use multiple cores
2014-03-22 08:31:37 | Claudiu.Popa | set | messages: + msg214450
2014-03-13 19:18:06 | Claudiu.Popa | set | files: + issue16104_7.patch
2014-03-13 17:48:33 | Claudiu.Popa | set | files: + issue16104_6.patch; messages: + msg213422
2014-03-13 16:59:15 | pitrou | set | messages: + msg213419
2014-03-13 16:53:43 | Claudiu.Popa | set | files: + issue16104_5.patch; messages: + msg213417
2014-03-13 00:28:13 | pitrou | set | nosy: + pitrou; messages: + msg213340
2014-03-12 22:42:18 | Claudiu.Popa | set | files: + issue16104_4.patch; messages: + msg213317
2014-03-12 22:12:27 | Claudiu.Popa | set | messages: + msg213308
2014-03-12 22:11:35 | eric.araujo | set | messages: + msg213307
2014-03-12 22:08:05 | Claudiu.Popa | set | files: + issue16104_3.patch
2014-03-12 21:53:40 | Claudiu.Popa | set | messages: + msg213304
2014-03-12 21:52:17 | brett.cannon | set | messages: + msg213303
2014-03-12 21:50:58 | eric.araujo | set | messages: + msg213301; stage: patch review -> commit review
2014-03-12 21:26:01 | Claudiu.Popa | set | files: + issue16104_2.patch
2014-03-12 21:05:58 | eric.araujo | set | messages: + msg213298
2014-03-12 07:26:02 | Claudiu.Popa | set | files: + issue16104_1.patch; messages: + msg213209
2014-03-12 05:50:28 | eric.araujo | set | stage: patch review; type: enhancement; versions: + Python 3.5, - Python 3.4
2014-03-12 05:50:10 | eric.araujo | set | nosy: + eric.araujo; messages: + msg213200
2014-03-10 19:39:54 | Claudiu.Popa | set | files: + issue16104.patch
2013-12-10 12:30:35 | Claudiu.Popa | set | files: + compileall_v1.patch; nosy: + Claudiu.Popa; messages: + msg205805; keywords: + patch
2013-04-28 02:49:41 | brett.cannon | set | assignee: brett.cannon -> (no value)
2013-03-26 18:18:17 | brett.cannon | set | assignee: brett.cannon
2012-10-02 01:41:12 | steven.daprano | set | nosy: + steven.daprano
2012-10-01 23:16:10 | brett.cannon | set | priority: normal -> low; versions: + Python 3.4; nosy: + brett.cannon; messages: + msg171758; components: + Library (Lib)
2012-10-01 20:46:54 | dholth | create |
