homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: import hashlib makes many programs slow
Type: Stage: resolved
Components: Library (Lib), SSL Versions: Python 3.7, Python 3.6, Python 3.3, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: third party
Dependencies: Superseder:
Assigned To: christian.heimes Nosy List: bmwiedemann, christian.heimes, vstinner
Priority: normal Keywords:

Created on 2017年05月05日 03:23 by bmwiedemann, last changed 2022年04月11日 14:58 by admin. This issue is now closed.

Messages (13)
msg293039 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017年05月05日 03:23
Steps to Reproduce:
echo import hashlib > test.py
time python -m cProfile -s tottime test.py 2>&1 | head
Actual Results:
shows 27ms spent in hashlib.py
The problem goes away when dropping everything after line 133
in hashlib.py
see also issue 21288 
msg293073 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017年05月05日 08:16
traced it down a bit further.
nearly all that time is spent in
import _hashlib
which probably means, it is spent in functions like PyInit__hashlib in Modules/_hashopenssl.c
I can see in strace that after loading libssl, libcrypto and libz,
it calls getpid and opens /dev/urandom 8 times
and most of the time is spent in the middle of these:
grep open...dev.uran strace.out
1493970277.542629 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
1493970277.542847 open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_CLOEXEC) = 4
1493970277.543370 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
1493970277.543967 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
1493970277.592232 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
1493970277.592518 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
1493970277.597778 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
1493970277.598629 open("/dev/urandom", O_RDONLY|O_CLOEXEC) = 4
msg293087 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017年05月05日 09:42
I cannot reproduce the open /dev/urandom calls locally. You seem to be running an old Python version and/or old Kernel. Python prefers getrandom() over /dev/urandom. The urandom code caches the fd for /dev/urandom.
$ strace ./python -c 'import hashlib' 2>&1 | grep random
getrandom("\xe0\xa5\x00\xda\xe1\xce\x1a\x9c\xe7\x61\x71\xf2\x1d\x15\x63\x19\xdc\x60\x70\x3c\xef\xf6\xcb\xdb", 24, GRND_NONBLOCK) = 24
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3
The O_RDONLY|O_NOCTTY|O_NONBLOCK|O_CLOEXEC comes from OpenSSL.
What's your Python version, OpenSSL version, Kernel version and which platform/distribution are you on?
msg293090 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017年05月05日 10:00
What is your Python version? It looks like an old version.
msg293091 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017年05月05日 10:01
What is your OS? Which kind of computer is it? I'm surprised that you noticed open calls on /dev/urandom, it should be very quick, no?
msg293092 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017年05月05日 10:12
I rather suspect it's OpenSSL. It takes a bit to load and initialize libcrypto.so. On my system it takes about 17ms to import _hashlib. There is nothing we can do about it in CPython.
msg293102 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017年05月05日 12:18
I'm running openSUSE Tumbleweed with python-2.7.13 and python3-3.6.1
on an Intel Pentium N3530 with linux-4.10.12
There the total time of the "import _hashlib" oneliner varies between 100 and 250 ms (python3 is a bit slower and the machine is slower when running on battery)
the urandom calls themselves are not slow, but probably the processing in between. Can't that be done on demand on first use?
This is (one minor part of) slowing down python CLI tools like OpenStack's
cinder help
msg293103 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017年05月05日 12:51
> the urandom calls themselves are not slow, but probably the processing in between. Can't that be done on demand on first use?
If the time is spent on loading OpenSSL, maybe you should try to
import hashlib on demand?
msg293114 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017年05月05日 15:57
This might work in some places, but if you look at real-world users like
urllib3/util/ssl_.py
> from hashlib import md5, sha1, sha256
or
twisted/words/protocols/jabber/xmlstream.py
> from hashlib import sha1
It would probably not need any openssl at all, but still gets the full penalty time from the import.
So you get all or nothing.
It would be nice to just get access to those few hashes you need for your protocal (because you certainly do not want to reimplement it).
msg293117 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017年05月05日 16:08
If import of hashlib is slowed down by OpenSSL, then import of ssl module will be even slower.
Am 5. Mai 2017 17:57:41 MESZ schrieb "Bernhard M. Wiedemann" <report@bugs.python.org>:
>
>Bernhard M. Wiedemann added the comment:
>
>This might work in some places, but if you look at real-world users
>like
>urllib3/util/ssl_.py
>> from hashlib import md5, sha1, sha256
>
>or
>twisted/words/protocols/jabber/xmlstream.py
>> from hashlib import sha1
>
>It would probably not need any openssl at all, but still gets the full
>penalty time from the import.
>So you get all or nothing.
>It would be nice to just get access to those few hashes you need for
>your protocal (because you certainly do not want to reimplement it).
>
>----------
>
>_______________________________________
>Python tracker <report@bugs.python.org>
><http://bugs.python.org/issue30276>
>_______________________________________
msg293123 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017年05月05日 16:42
I ran a quick benchmark on my laptop, Fedora 25:
haypo@selma$ python3 -m perf command -o ref.json -- python3 -c 'pass' && python3 -m perf command -o hashlib.json -- python3 -c 'import hashlib' && python3 -m perf compare_to ref.json hashlib.json --table
+-----------+---------+------------------------------+
| Benchmark | ref | hashlib |
+===========+=========+==============================+
| command | 26.7 ms | 30.2 ms: 1.13x slower (+13%) |
+-----------+---------+------------------------------+
Importing hashlib only takes 3.5 ms on my laptop.
msg293129 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017年05月05日 17:18
getting to a similar size on Debian, so I'll move this into the openSUSE bugtracker to find out why it is so slow there.
msg293555 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017年05月12日 16:17
Tracking this in https://bugzilla.opensuse.org/show_bug.cgi?id=1038906
now.
It turned out to be a problem with (our?) openssl-1.0.2
which Debian stable does not have yet
and Debian/9 (stretch) has openssl-1.1.0 without this problem
The nicest python-based reproducer so far is
time python -S -c '' ; time python -S -c 'import _hashlib'
History
Date User Action Args
2022年04月11日 14:58:46adminsetgithub: 74462
2017年05月12日 16:17:46bmwiedemannsetmessages: + msg293555
2017年05月05日 17:18:32bmwiedemannsetstatus: open -> closed
resolution: third party
messages: + msg293129

stage: resolved
2017年05月05日 16:42:42vstinnersetmessages: + msg293123
2017年05月05日 16:08:52christian.heimessetmessages: + msg293117
2017年05月05日 15:57:41bmwiedemannsetmessages: + msg293114
2017年05月05日 12:51:07vstinnersetmessages: + msg293103
2017年05月05日 12:18:11bmwiedemannsetmessages: + msg293102
2017年05月05日 10:12:54christian.heimessetmessages: + msg293092
2017年05月05日 10:01:53vstinnersetmessages: + msg293091
2017年05月05日 10:00:45vstinnersetmessages: + msg293090
2017年05月05日 09:42:29christian.heimessetnosy: + vstinner
messages: + msg293087
2017年05月05日 08:16:13bmwiedemannsetmessages: + msg293073
2017年05月05日 03:23:43bmwiedemanncreate

AltStyle によって変換されたページ (->オリジナル) /