This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2019年08月07日 20:36 by Alexander.Pyhalov, last changed 2022年04月11日 14:59 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| 1.py | Alexander.Pyhalov, 2019年08月17日 02:47 | |||
| Messages (15) | |||
|---|---|---|---|
| msg349197 - (view) | Author: Alexander Pyhalov (Alexander.Pyhalov) | Date: 2019年08月07日 20:36 | |
We've moved illumos-gate wsdiff python tool from Python 2 to Python 3. The tool does the following - for each file from old and new proto area compares file attributes to find differences in binary otput (spawning elfdump, dump and other utilities). Performance has degraded in two times when commands.getstatusoutput() was replaced by subprocess.getstatusoutput(). os.popen() now is Popen() wrapper, so it also has poor performance. Even naive popen() implementation using os.system() and os.mkstemp() behaved much more efficiently (about two times faster e.g. 20 minuts vs 40 minutes for single tool pass). |
|||
| msg349207 - (view) | Author: Josh Rosenberg (josh.r) * (Python triager) | Date: 2019年08月08日 01:26 | |
Have you tried switching to using Popen itself (or run, which keeps it to one layer of convenience wrapping)? subprocess.getstatusoutput is three layers of wrapping (it calls check_output, which in turn calls run, which in turn calls Popen), and (unlike Python 2) has to decode all the output. run would avoid two layers of wrapping, can optionally receive the raw bytes instead of decoding to str, and avoids needing to wrap the whole thing in a shell (which system, older popen, and getstatusoutput all do). Beyond that, it looks like when 3.8 releases, Popen should get *much* faster if the call meets certain conditions, see https://docs.python.org/3.8/whatsnew/3.8.html#optimizations for details. If you can make your use cases conform to those requirements (e.g. by using shutil.which to determine the absolute paths to your utilities instead of relying on PATH lookup), the speed up should eliminate (or more than eliminate) the performance regression you're seeing (per #35537, on macOS, which got the least benefit, it was still a 1.7x improvement; on other OSes, the multiplier hits 61x or 106x). |
|||
| msg349211 - (view) | Author: Alexander Pyhalov (Alexander.Pyhalov) | Date: 2019年08月08日 04:18 | |
I've tried to rewrite subporcess.getstatusoutput() usage with subprocess.Popen() and switch to shell=False, it didn't help, so I doubti it getstatusoutput() overhead, it's Popen() issue. |
|||
| msg349857 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2019年08月16日 12:33 | |
Can you post a simple script showcasing the performance degradation? |
|||
| msg349893 - (view) | Author: Alexander Pyhalov (Alexander.Pyhalov) | Date: 2019年08月17日 02:47 | |
I couldn't reproduce entire test, as wsdiff script is rather large, but here's the simplified version. If Popen is used more often, difference is much more significant. # Using workaround $ python3.5 ~/tmp/1.py 1 10.780487537384033 # Without workaround $ python3.5 ~/tmp/1.py 13.347045660018921 # Using python2.7 $ python2.7 ~/tmp/1.py 9.83385586739 |
|||
| msg349894 - (view) | Author: Alexander Pyhalov (Alexander.Pyhalov) | Date: 2019年08月17日 02:53 | |
Even if I use import subprocess process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, universal_newlines=True) return process.stdout.read() difference in times are the same. |
|||
| msg349897 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2019年08月17日 09:18 | |
I don't seen any significant difference here (Ubuntu 18.04): $ time python2 sp.py 10.3433089256 real 0m10,362s user 0m6,565s sys 0m4,372s $ time python3 sp.py 11.746907234191895 real 0m11,781s user 0m7,356s sys 0m5,239s |
|||
| msg350064 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2019年08月21日 11:06 | |
This issue lacks a lot of information: * What is your operating system (name and version)? On Linux, what is your Linux kernel version? * Which Python version did you try? * Which command are you running? * Do you use a shell? * Do you use bytes (default) or Unicode (universal_newlines=True or text=True)? * Can you provide a minimum reproducer? I don't know how to use msg349894: cmd is not defined. Attached 1.py uses 2 commands, appararently both use a shell: * "find /usr/bin -type f 2>/dev/null" * "objdump '%s'" "objdump '%s'" is unsafe and can lead to shell injection: try to avoid the usage of a shell. Use subprocess.Popen directly, or an helper which doesn't use shell=True. |
|||
| msg351343 - (view) | Author: Nick Henderson (nickhendo) | Date: 2019年09月09日 03:20 | |
We have recently bumped into a similar problem. Using FreeBSD, subprocess calls were taking more than 10 times the usual time to execute after migrating to python3.6. After some digging, the default for 'close_fds' was changed to 'True'. On linux, this actually made things faster, but for unix, much slower. Passing 'close_fds=False' solved this for us. |
|||
| msg351352 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2019年09月09日 07:26 | |
I create bpo-38061 "FreeBSD: Optimize subprocess.Popen(close_fds=True) using closefrom()". |
|||
| msg367162 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2020年04月24日 00:28 | |
> We've moved illumos-gate wsdiff python tool from Python 2 to Python 3. I guess that you're talking about https://github.com/illumos/illumos-gate/blob/master/usr/src/tools/scripts/wsdiff.py This project is part of illumos: "An open-source Unix operating system". It's a fork of Solaris. Python does not officially support Solaris not Illumos. If someone provides a patch, we can review it. But I'm not aware of anyone working actively on supporting Illumos. In the meanwhile, there are two workarounds: * Use subprocess.Popen(close_fds=False): it should be safe since PEP 446 has been implemented in Python 3.4... except for C extension modules and C libraries which don't implement it. * Reduce the maximum number of file descriptors. On Linux and FreeBSD (if fdescfs kernel module is loaded and mounted), Python 3 iterates on /proc/self/fd/. On FreeBSD, the closefrom() syscall can be used: PR 19697. But I'm not aware of any more efficient solution for Illumos. Since the latest comment of the original poster was last August, I close the issue. |
|||
| msg367181 - (view) | Author: John Levon (movement) | Date: 2020年04月24日 09:43 | |
closefrom() is on both Solaris and illumos too - and might even have originated there as an API - so if that's the issue, it should be trivially fixable |
|||
| msg367182 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2020年04月24日 09:53 | |
I think we would accept a PR since it would probably be trivial, but someone has to submit it :-) |
|||
| msg367185 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2020年04月24日 10:09 | |
I merged FreeBSD changes: New changeset 162c567d164b5742c0d1f3f8bd8c8bab9c117cd0 by Victor Stinner in branch 'master': bpo-38061: os.closerange() uses closefrom() on FreeBSD (GH-19696) https://github.com/python/cpython/commit/162c567d164b5742c0d1f3f8bd8c8bab9c117cd0 New changeset e6f8abd500751a834b6fff4f107ecbd29f2184fe by Victor Stinner in branch 'master': bpo-38061: subprocess uses closefrom() on FreeBSD (GH-19697) https://github.com/python/cpython/commit/e6f8abd500751a834b6fff4f107ecbd29f2184fe Free feel to propose a more generic way to detect when closefrom() function is available ;-) Or just a patch to add "|| defined(sun)". |
|||
| msg369358 - (view) | Author: John Levon (movement) | Date: 2020年05月19日 14:39 | |
I checked, and the supposition this is due to lack of closefrom() doesn't seem to be correct. Running the test case and looking at 'truss' output, there is no large number of close() that one would expect if this was the issue. I don't see Alexander's 2-time speedup however: root@piano:/export/src/cpython# /export/python3/bin/python3 ./1.py 11.679689645767212 root@piano:/export/src/cpython# vi 1.p^C root@piano:/export/src/cpython# /export/python3/bin/python3 ./1.py foo 10.402687549591064 root@piano:/export/src/cpython# python2.7 ./1.py 10.0434100628 Any difference doesn't seem to be distinguishable from noise on my system. If I run this: from __future__ import print_function import os import sys import time def getoutput(cmd): # Hand-crafted variant if len(sys.argv) >1: import shlex, tempfile f, fpath = tempfile.mkstemp() status = os.system("{ " + cmd + "; } >" + shlex.quote(fpath) + " 2>&1") with os.fdopen(f, "r") as tfile: output = tfile.read() os.unlink(fpath) if output[-1:] == '\n': output = output[:-1] return output else: import subprocess p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=None, close_fds=True) return p.communicate()[0] t = time.time() for file in getoutput("find /usr/bin -type f 2>/dev/null").decode().split('\n'): diff = getoutput("/usr/bin/objdump '%s' 2>/dev/null" % file) then I get more variation than can be measured by changing close_fds. Running something similar (no decode()) under python 2.7 is *slower* than python3. So, something other than closefrom() is going on here. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:59:18 | admin | set | github: 81971 |
| 2020年05月19日 14:39:22 | movement | set | messages:
+ msg369358 title: subprocess.Popen() is extremely slow (with close_fds=True which is the default) on Illumos -> subprocess.Popen() is sometimes slower in python3 under illumos |
| 2020年04月24日 10:09:09 | vstinner | set | messages: + msg367185 |
| 2020年04月24日 09:53:28 | pitrou | set | messages: + msg367182 |
| 2020年04月24日 09:43:37 | movement | set | messages: + msg367181 |
| 2020年04月24日 00:28:51 | vstinner | set | status: open -> closed title: subprocess.Popen() is extremely slow -> subprocess.Popen() is extremely slow (with close_fds=True which is the default) on Illumos messages: + msg367162 resolution: out of date stage: resolved |
| 2019年09月09日 07:26:12 | vstinner | set | messages: + msg351352 |
| 2019年09月09日 03:20:52 | nickhendo | set | nosy:
+ nickhendo messages: + msg351343 |
| 2019年08月21日 11:06:13 | vstinner | set | nosy:
+ vstinner messages: + msg350064 |
| 2019年08月17日 09:18:39 | pitrou | set | messages: + msg349897 |
| 2019年08月17日 02:53:20 | Alexander.Pyhalov | set | messages: + msg349894 |
| 2019年08月17日 02:47:34 | Alexander.Pyhalov | set | files:
+ 1.py messages: + msg349893 |
| 2019年08月16日 12:33:05 | pitrou | set | nosy:
+ pitrou messages: + msg349857 |
| 2019年08月15日 18:48:06 | movement | set | nosy:
+ movement |
| 2019年08月08日 04:18:34 | Alexander.Pyhalov | set | messages: + msg349211 |
| 2019年08月08日 01:26:09 | josh.r | set | nosy:
+ josh.r messages: + msg349207 |
| 2019年08月07日 20:36:00 | Alexander.Pyhalov | create | |