homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Improve Python 3.3 startup time
Type: performance Stage:
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, barry, eric.araujo, eric.smith, eric.snow, flox, georg.brandl, giampaolo.rodola, loewis, nadeem.vawda, neologix, pitrou, python-dev, rhettinger, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2011年01月28日 22:49 by pitrou, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
import_stat.diff neologix, 2011年12月05日 18:32 review
Messages (17)
msg127359 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年01月28日 22:49
Here are the system calls when importing a single pure Python module:
stat("/home/antoine/py3k/py3k/Lib/copyreg", 0x7fff1ed1f740) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyreg.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyregmodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyreg.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyregmodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyreg.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyregmodule.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/copyreg.py", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0664, st_size=6633, ...}) = 0
open("/home/antoine/py3k/py3k/Lib/__pycache__/copyreg.cpython-32.pyc", O_RDONLY) = 6
fstat(6, {st_mode=S_IFREG|0664, st_size=5595, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f28e6381000
read(6, "l\f\r\n0円24円212円Hc0円0円0円0円0円0円0円0円0円0円0円0円\f0円0円0円@0円0円0円s307円0円"..., 4096) = 4096
fstat(6, {st_mode=S_IFREG|0664, st_size=5595, ...}) = 0
This is per possible directory, which means that it can be repeated for each dir on sys.path:
stat("readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory)
open("readline.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readlinemodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readline.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readlinemodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readline.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readlinemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readline.py", O_RDONLY) = -1 ENOENT (No such file or directory)
open("readline.pyc", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/antoine/py3k/py3k/Lib/readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readline.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readlinemodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readline.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readlinemodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readline.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readlinemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readline.py", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/readline.pyc", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/antoine/py3k/py3k/Lib/plat-linux2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("/home/antoine/py3k/py3k/Lib/plat-linux2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("/home/antoine/py3k/py3k/Lib/plat-linux2/readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readlinemodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readlinemodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readlinemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.py", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.pyc", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2/readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory)
open("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2/readline.cpython-32m.so", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0775, st_size=59970, ...}) = 0
futex(0x7f28e605e0ec, FUTEX_WAKE_PRIVATE, 2147483647) = 0
open("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2/readline.cpython-32m.so", O_RDONLY) = 4
Something has gone out of control here. Why do we need to check so many alternative locations?
msg127522 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011年01月30日 09:19
> Something has gone out of control here. Why do we need to check so many alternative locations?
What change do you propose?
msg127523 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年01月30日 10:56
> > Something has gone out of control here. Why do we need to check so many alternative locations?
> 
> What change do you propose?
First, I don't understand why we need to check both "foo.so" and
"foomodule.so". Second, I don't understand why we need to check both
"foo.cpython32.so" and "foo.so". If you fix both these, the number of
stat() calls per would-be extension module goes down from 6 to 2 per
directory path.
Then there seems to be a couple of redundant stat/fstat calls on some
files.
msg148884 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011年12月05日 18:32
Here's a trivial patch reducing the number of calls to open.
before:
"""
$ strace -c -e open ./python -c ""
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000049 0 392 306 open
------ ----------- ----------- --------- --------- ----------------
100.00 0.000049 392 306 total
"""
after:
"""
$ strace -c -e open ./python -c ""
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000024 0 86 open
------ ----------- ----------- --------- --------- ----------------
100.00 0.000024 86 total
"""
As for the flury of tentative locations, I don't feel like modifying this since I'm not familiar enough with the import machinery.
msg148890 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011年12月05日 21:42
> First, I don't understand why we need to check both "foo.so" and
> "foomodule.so".
Because we always did, i.e. changing it now may break backwards compatibility. Now, as for why we always had *module.so also: it may be that calling an extension module foo.so might not be feasible if it needs to link with a system library called foo.so (rather than libfoo.so). 
I don't know how serious this problem is in practice - we may consider deprecating-then-removing *module.so from the import machinery.
As for checking both "foo.so" and "foo.cpython-32m.so": that's for backwards compatibility also. Existing build processes may produce foo.so, as they did in previous Python versions (in particular, if they are not based on distutils, but, say, make). Unfortunately, PEP 3149 did not specify a deprecation scheduling. You could ask Barry - I suppose that was an oversight, and not actually deliberate.
Of course, with .cpython-32m.so being present, removing *module.so would be easy enough, except again for breaking existing projects which might use that name.
msg148895 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月06日 00:33
> Here's a trivial patch reducing the number of calls to open.
Looks good to me.
msg148981 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011年12月07日 18:16
New changeset a541bda2f5e2 by Charles-François Natali in branch 'default':
Issue #11051: Reduce the number of syscalls per import.
http://hg.python.org/cpython/rev/a541bda2f5e2 
msg148982 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011年12月07日 18:27
I would have appreciated had you considered my review before pushing that change.
msg148983 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011年12月07日 18:43
> I would have appreciated had you considered my review before pushing
> that change.
Sorry, I didn't receive an email notification for your review, so I didn't know you had done one. I'll add a comment.
msg148998 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011年12月07日 22:20
I am a little puzzled by the patch. In logic, 'A and B' is equivalent to 'not A or not B'. But in the patch,
- if (_Py_stat(filename, &statbuf) == 0 && /it exists */
- S_ISDIR(statbuf.st_mode)) /* it's a directory */
+ if (_Py_stat(filename, &statbuf) != 0 || S_ISDIR(statbuf.st_mode))
you seem to change to 'not A or B', without negating B. Is this intentional? What am I missing? Some subtle effect of lazy evaluation? Or an intentional change in the logic?
msg148999 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011年12月07日 22:23
By the way, _Py_stat() can fail because of a Python error: -1 result is not handled in import.c :-( It may leak to the "XXX undetected error" message.
msg149000 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011年12月07日 22:30
OK, I gather from the comment added in the second patch that you intentionally changed the logic to restart the loop more often.
msg153642 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年02月18日 12:22
As noted by Antoine somewhere else (don't remember if it was on the bug tracker or mailing list), most of the startup time is due to site import:
"""
cf@neobox:~/python/cpython$ time ./python -c ''
[44249 refs]
real 0m0.445s
user 0m0.376s
sys 0m0.032s
cf@neobox:~/python/cpython$ time ./python -S -c ''
[35211 refs]
real 0m0.181s
user 0m0.144s
sys 0m0.016s
"""
And that's mostly user time, the stat() calls don't weight much here.
Here's the output of a profiling:
"""
 Ordered by: cumulative time
 ncalls tottime percall cumtime percall filename:lineno(function)
 2/1 0.006 0.003 0.303 0.303 {built-in method exec}
 1 0.004 0.004 0.303 0.303 test_startup.py:1(<module>)
 1 0.000 0.000 0.267 0.267 site.py:509(main)
 1 0.000 0.000 0.237 0.237 site.py:254(addusersitepackages)
 1 0.000 0.000 0.236 0.236 site.py:231(getusersitepackages)
 1 0.006 0.006 0.226 0.226 site.py:217(getuserbase)
 1 0.011 0.011 0.214 0.214 sysconfig.py:1(<module>)
 1 0.004 0.004 0.122 0.122 configparser.py:120(<module>)
 30 0.006 0.000 0.120 0.004 {built-in method __build_class__}
 9 0.000 0.000 0.116 0.013 re.py:204(compile)
 9 0.002 0.000 0.116 0.013 functools.py:185(wrapper)
 9 0.000 0.000 0.113 0.013 re.py:256(_compile)
 9 0.001 0.000 0.113 0.013 sre_compile.py:488(compile)
 1 0.001 0.001 0.096 0.096 configparser.py:555(RawConfigParser)
"""
On my (somewhat old) laptop, sysconfig import consumes 1/3 of the 
total startup time.
msg153693 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012年02月19日 10:15
I changed the title of the issue because what matters is the total startup time, not how it is improved :-)
msg153697 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012年02月19日 10:35
Using the following patch, "./python -s -c pass" is 2x faster than without the patch!
diff --git a/Lib/site.py b/Lib/site.py
--- a/Lib/site.py
+++ b/Lib/site.py
@@ -518,7 +518,8 @@ def main():
 known_paths = removeduppaths()
 if ENABLE_USER_SITE is None:
 ENABLE_USER_SITE = check_enableusersite()
- known_paths = addusersitepackages(known_paths)
+ if ENABLE_USER_SITE:
+ known_paths = addusersitepackages(known_paths)
 known_paths = addsitepackages(known_paths)
 if sys.platform == 'os2emx':
 setBEGINLIBPATH()
-s with the patch doesn't import the sysconfig module. So if we want to speed up Python, the first step is to optimize it, at least the function getting the user site packages directory.
msg153736 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012年02月19日 23:47
I opened the issue #14057 for the sysconfig parser.
msg160876 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年05月16日 15:42
This is out of date now that importlib is the default import system.
History
Date User Action Args
2022年04月11日 14:57:12adminsetgithub: 55260
2012年07月22日 10:21:18floxsetnosy: + flox
2012年05月16日 15:42:09pitrousetstatus: open -> closed
resolution: out of date
messages: + msg160876
2012年02月19日 23:47:36vstinnersetmessages: + msg153736
2012年02月19日 10:35:35vstinnersetmessages: + msg153697
2012年02月19日 10:15:35vstinnersetmessages: + msg153693
title: system calls per import -> Improve Python 3.3 startup time
2012年02月19日 08:33:29eric.araujosetnosy: + eric.araujo
2012年02月18日 12:22:58neologixsetmessages: + msg153642
2011年12月07日 22:30:56terry.reedysetmessages: + msg149000
2011年12月07日 22:23:09vstinnersetnosy: + vstinner
messages: + msg148999
2011年12月07日 22:20:28terry.reedysetnosy: + terry.reedy
messages: + msg148998
2011年12月07日 18:43:32neologixsetmessages: + msg148983
2011年12月07日 18:27:09loewissetmessages: + msg148982
2011年12月07日 18:16:32python-devsetnosy: + python-dev
messages: + msg148981
2011年12月07日 01:45:35Arfreversetnosy: + Arfrever
2011年12月06日 00:33:42pitrousetmessages: + msg148895
2011年12月05日 21:42:50loewissetmessages: + msg148890
2011年12月05日 20:37:31eric.snowsetnosy: + eric.snow
2011年12月05日 18:32:51neologixsetfiles: + import_stat.diff

nosy: + neologix
messages: + msg148884

keywords: + patch
2011年01月30日 22:13:30giampaolo.rodolasetnosy: + giampaolo.rodola
2011年01月30日 10:56:18pitrousetnosy: loewis, barry, georg.brandl, rhettinger, pitrou, eric.smith, nadeem.vawda
messages: + msg127523
2011年01月30日 09:38:06georg.brandlsetnosy: + georg.brandl
2011年01月30日 09:30:06rhettingersetnosy: + rhettinger
2011年01月30日 09:19:20loewissetnosy: loewis, barry, pitrou, eric.smith, nadeem.vawda
messages: + msg127522
2011年01月30日 09:04:50nadeem.vawdasetnosy: + nadeem.vawda
2011年01月28日 23:04:57eric.smithsetnosy: + eric.smith
2011年01月28日 22:49:32pitroucreate

AltStyle によって変換されたページ (->オリジナル) /