Project Euler and other coding contests often have a maximum run time, or people boast of how fast their particular solution runs. With Python, sometimes the approaches are somewhat kludgey, i.e., adding timing code to __main__.
What is a good way to profile how long a Python program takes to run?
Python includes a profiler called cProfile. It not only gives the total running time, but also times each function separately and tells you how many times each function was called, making it easy to determine where you should make optimizations.
You can call it from within your code, or from the interpreter, like this:
import cProfile
cProfile.run('foo()')
Even more usefully, you can invoke cProfile when running a script:
python -m cProfile myscript.py
Or when running a module:
python -m cProfile -m mymodule
To make it even easier, I made a little batch file called 'profile.bat':
python -m cProfile %1
So all I have to do is run:
profile euler048.py
And I get this:
1007 function calls in 0.061 CPU seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.061 0.061 <string>:1(<module>)
1000 0.051 0.000 0.051 0.000 euler048.py:2(<lambda>)
1 0.005 0.005 0.061 0.061 euler048.py:2(<module>)
1 0.000 0.000 0.061 0.061 {execfile}
1 0.002 0.002 0.053 0.053 {map}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 0.000 0.000 {range}
1 0.003 0.003 0.003 0.003 {sum}
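If you'd rather sort and inspect these stats from Python instead of reading the raw dump, the standard-library pstats module works on a Profile object directly; a minimal sketch (foo here is just a placeholder workload):

```python
import cProfile
import io
import pstats

def foo():
    # placeholder workload
    return sum(i * i for i in range(10000))

profiler = cProfile.Profile()
profiler.enable()
foo()
profiler.disable()

# Sort by cumulative time and print the top 5 entries to a string buffer
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```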
For more information, check out this tutorial from PyCon 2013 titled Python Profiling, also available via YouTube.
- Also it is useful to sort the results, which can be done with the -s switch, for example: -s time. You can use cumulative/name/time/file sorting options. – Jiri, Feb 25, 2009
- It is also worth noting that you can use the cProfile module from IPython using the magic function %prun (profile run). First import your module, and then call the main function with %prun: import euler048; %prun euler048.main() – RussellStewart, Mar 31, 2014
- For visualizing cProfile dumps (created by python -m cProfile -o <out.profile> <script>), RunSnakeRun, invoked as runsnake <out.profile>, is invaluable. – lily, May 5, 2014
- @NeilG even for Python 3, cProfile is still recommended over profile. – trichoplax, Jan 4, 2015
- For visualizing cProfile dumps, RunSnakeRun hasn't been updated since 2011 and doesn't support Python 3. You should use snakeviz instead. – Giacomo Pigani, Dec 11, 2017
A while ago I made pycallgraph, which generates a visualisation from your Python code. Edit: I've updated the example to work with 3.3, the latest release as of this writing.
After a pip install pycallgraph and installing GraphViz, you can run it from the command line:
pycallgraph graphviz -- ./mypythonscript.py
Or, you can profile particular parts of your code:
from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput
with PyCallGraph(output=GraphvizOutput()):
code_to_profile()
Either of these will generate a pycallgraph.png file similar to the image below:
- Are you coloring based on the number of calls? If so, you should color based on time, because the function with the most calls isn't always the one that takes the most time. – red, Aug 6, 2013
- @red You can customise colours however you like, and even independently for each measurement. For example: red for calls, blue for time, green for memory usage. – gak, Aug 6, 2013
- Getting this error: Traceback (most recent call last): /pycallgraph.py", line 90, in generate output.done() File "/net_downloaded/pycallgraph-develop/pycallgraph/output/graphviz.py", line 94, in done source = self.generate() File "/net_downloaded/pycallgraph-develop/pycallgraph/output/graphviz.py", line 143, in generate indent_join.join(self.generate_attributes()), File "/net_downloaded/pycallgraph-develop/pycallgraph/output/graphviz.py", line 169, in generate_attributes section, self.attrs_from_dict(attrs), ValueError: zero length field name in format – Ciasto piekarz, Aug 18, 2014
- I updated this to mention that you need to install GraphViz for things to work as described. On Ubuntu this is just sudo apt-get install graphviz. – mlissner, Nov 18, 2015
- The GitHub page states that this project is abandoned... :( – A. Rabus, Nov 12, 2019
It's worth pointing out that using the profiler only works (by default) on the main thread, and you won't get any information from other threads if you use them. This can be a bit of a gotcha as it is completely unmentioned in the profiler documentation.
If you also want to profile threads, you'll want to look at the threading.setprofile() function in the docs.
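For a rough idea of how threading.setprofile() behaves, the callback you register is a sys.setprofile-style hook installed into every thread started afterwards; this sketch only counts call events per thread rather than doing full profiling:

```python
import collections
import threading

call_counts = collections.Counter()

def profile_hook(frame, event, arg):
    # sys.setprofile-style callback: count 'call' events,
    # keyed by (thread name, function name)
    if event == 'call':
        key = (threading.current_thread().name, frame.f_code.co_name)
        call_counts[key] += 1

threading.setprofile(profile_hook)  # applies only to threads started after this

def work():
    sum(range(1000))

t = threading.Thread(target=work, name='worker')
t.start()
t.join()
threading.setprofile(None)  # stop installing the hook in new threads

print(call_counts[('worker', 'work')])  # number of times work() ran in the worker thread
```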
You could also create your own threading.Thread subclass to do it:
class ProfiledThread(threading.Thread):
    # Overrides threading.Thread.run()
    def run(self):
        profiler = cProfile.Profile()
        try:
            return profiler.runcall(threading.Thread.run, self)
        finally:
            profiler.dump_stats('myprofile-%d.profile' % (self.ident,))
and use that ProfiledThread class instead of the standard one. It might give you more flexibility, but I'm not sure it's worth it, especially if you are using third-party code which wouldn't use your class.
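A minimal usage sketch of that idea (the work function is a placeholder): each thread writes its own myprofile-&lt;ident&gt;.profile file, which can then be loaded back with the standard-library pstats:

```python
import cProfile
import pstats
import threading

class ProfiledThread(threading.Thread):
    # Same approach as above: profile this thread's run() with cProfile
    def run(self):
        profiler = cProfile.Profile()
        try:
            return profiler.runcall(threading.Thread.run, self)
        finally:
            profiler.dump_stats('myprofile-%d.profile' % (self.ident,))

def work():
    # placeholder workload
    sum(i * i for i in range(100000))

t = ProfiledThread(target=work)
t.start()
t.join()

# Load the per-thread stats file back for inspection
stats = pstats.Stats('myprofile-%d.profile' % (t.ident,))
stats.sort_stats('cumulative').print_stats(5)
```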
- I don't see any reference to runcall in the documentation either. Giving a look at cProfile.py, I'm not sure why you use the threading.Thread.run function nor self as argument. I'd have expected to see a reference to another thread's run method here. – PypeBros, Nov 9, 2011
- It's not in the documentation, but it is in the module. See hg.python.org/cpython/file/6bf07db23445/Lib/cProfile.py#l140. That allows you to profile a specific function call, and in our case we want to profile the Thread's target function, which is what the threading.Thread.run() call executes. But as I said in the answer, it's probably not worth it to subclass Thread, since any third-party code won't use it; instead, use threading.setprofile(). – Joe Shaw, Nov 9, 2011
- Wrapping the code with profiler.enable() and profiler.disable() seems to work quite well, too. That's basically what runcall does, and it doesn't enforce any number of arguments or similar things. – PypeBros, Nov 10, 2011
- I combined my own stackoverflow.com/questions/10748118/… with ddaa.net/blog/python/lsprof-calltree and it kind of works ;-) – Dima Tisnek, Jul 11, 2012
- Joe, do you know how the profiler plays with asyncio in Python 3.4? – Nick Chammas, Jun 17, 2015
Simplest and quickest way to find where all the time is going.
1. pip install snakeviz
2. python -m cProfile -o temp.dat <PROGRAM>.py
3. snakeviz temp.dat
Draws a pie chart in a browser. Biggest piece is the problem function. Very simple.
- See also zaxliu’s answer, which provides a link to the tool and example output. – Melebius, Jun 8, 2020
- Using this on Windows, I created a bat script for PyCharm integration, and it works like a charm! Thank you. – Andrea, Nov 25, 2020
The Python wiki is a great page for profiling resources: http://wiki.python.org/moin/PythonSpeed/PerformanceTips#Profiling_Code
as are the Python docs: http://docs.python.org/library/profile.html
As shown by Chris Lawlor, cProfile is a great tool and can easily be used to print to the screen:
python -m cProfile -s time mine.py <args>
or to file:
python -m cProfile -o output.file mine.py <args>
PS> If you are using Ubuntu, make sure to install python-profiler:
apt-get install python-profiler
If you output to file you can get nice visualizations using the following tools
PyCallGraph : a tool to create call graph images
install:
pip install pycallgraph
run:
pycallgraph mine.py args
view:
gimp pycallgraph.png
You can use whatever you like to view the png file; I used gimp.
Unfortunately I often get
dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.257079 to fit
which makes my images unusably small. So I generally create svg files:
pycallgraph -f svg -o pycallgraph.svg mine.py <args>
PS> make sure to install graphviz (which provides the dot program):
pip install graphviz
Alternative Graphing using gprof2dot via @maxy / @quodlibetor :
pip install gprof2dot
python -m cProfile -o profile.pstats mine.py
gprof2dot -f pstats profile.pstats | dot -Tsvg -o mine.svg
- graphviz is also required if you are using OS X. – Vaibhav Mishra, Jan 30, 2014
- Project was archived on GitHub and appears to be no longer maintained. github.com/gak/pycallgraph – dre-hh, Feb 24, 2021
- pip install pycallgraph gave me: error: subprocess-exited-with-error ... error in pycallgraph setup command: use_2to3 is invalid. – Bill, Jun 24, 2024
@Maxy's comment on this answer helped me out enough that I think it deserves its own answer: I already had cProfile-generated .pstats files and I didn't want to re-run things with pycallgraph, so I used gprof2dot, and got pretty svgs:
$ sudo apt-get install graphviz
$ git clone https://github.com/jrfonseca/gprof2dot
$ ln -s "$PWD"/gprof2dot/gprof2dot.py ~/bin
$ cd $PROJECT_DIR
$ gprof2dot.py -f pstats profile.pstats | dot -Tsvg -o callgraph.svg
and BLAM!
It uses dot (the same thing that pycallgraph uses) so output looks similar. I get the impression that gprof2dot loses less information though:
- Good approach, works really well as you can view SVG in Chrome etc. and scale it up/down. The third line has a typo; it should be: ln -s `pwd`/gprof2dot/gprof2dot.py $HOME/bin (or use ln -s $PWD/gprof2dot/gprof2dot.py ~/bin in most shells; the grave accent is taken as formatting in the first version). – RichVel, Jan 4, 2013
- Ah, good point. I get ln's argument order wrong almost every time. – quodlibetor, Jan 4, 2013
- The trick is to remember that ln and cp have the same argument order: think of it as 'copying file1 to file2 or dir2, but making a link'. – RichVel, Jan 4, 2013
- That makes sense; I think the use of "TARGET" in the manpage throws me. – quodlibetor, Jan 4, 2013
- Thanks @quodlibetor! On Win 10, depending on the conda or pip install, the command line editor might claim that dot is not recognizable. Setting a PATH for dot is not advisable, e.g. as per github.com/ContinuumIO/anaconda-issues/issues/1666. One can use the full path of the Graphviz dot instead, e.g.: i) python -m cProfile -o profile.pstats main.py ii) gprof2dot -f pstats profile.pstats | "C:\Program Files (x86)\Graphviz2.38\bin\dot.exe" -Tsvg -o gprof2dot_pstats.svg – Sven Haile, May 22, 2019
I ran into a handy tool called SnakeViz when researching this topic. SnakeViz is a web-based profiling visualization tool. It is very easy to install and use. The usual way I use it is to generate a stat file with %prun and then do analysis in SnakeViz.
The main viz technique used is a sunburst chart, as shown below, in which the hierarchy of function calls is arranged as layers of arcs, with time info encoded in their angular widths.
The best thing is that you can interact with the chart. For example, to zoom in, one can click on an arc, and the arc and its descendants will be enlarged as a new sunburst to display more details.
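Outside IPython, an equivalent stats file can be produced with plain cProfile and then opened in SnakeViz (the filename and workload below are arbitrary):

```python
import cProfile
import pstats

# Dump profiling stats for a placeholder workload to a file
cProfile.run('sum(i * i for i in range(10**5))', 'program.prof')

# The same file SnakeViz reads can also be inspected with pstats:
stats = pstats.Stats('program.prof')
stats.sort_stats('cumulative').print_stats(3)

# then, from a shell:  snakeviz program.prof
```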
- CodeCabbie's answer includes the (short) installation instructions and shows how to (easily) use SnakeViz. – Oren Milman, Aug 18, 2019
- Here I've read an IMHO good guide on how to use profiling for Python in a Jupyter notebook: towardsdatascience.com/speed-up-jupyter-notebooks-20716cbe2025 – Alex Martian, Sep 4, 2019
cProfile is great for profiling, while kcachegrind is great for visualizing the results. The pyprof2calltree in between handles the file conversion.
python -m cProfile -o script.profile script.py
pyprof2calltree -i script.profile -o script.calltree
kcachegrind script.calltree
Required system packages: kcachegrind (Linux), qcachegrind (macOS)
Setup on Ubuntu:
apt-get install kcachegrind
pip install pyprof2calltree
The result:
- Mac users: brew install qcachegrind and substitute each kcachegrind with qcachegrind in the description for successful profiling. – Kevin Katzke, Dec 17, 2017
- I had to do this to get it to work: export QT_X11_NO_MITSHM=1 – Yonatan Simson, Dec 4, 2019
- Out of the bunch of solutions listed here, this one worked best with large profile data. gprof2dot is not interactive and does not show the overall CPU time (only relative percentages); tuna and snakeviz die on larger profiles; pycallgraph is archived and no longer maintained. – dre-hh, Feb 24, 2021
- @YonatanSimson You probably run kcachegrind in a docker container, which doesn't share IPC with the host by default. Another way to fix that is to run the docker container with --ipc=host. – Maxim Egorushkin, May 26, 2021
I recently created tuna for visualizing Python runtime and import profiles; this may be helpful here.
Install with
pip install tuna
Create a runtime profile
python3 -m cProfile -o program.prof yourfile.py
or an import profile (Python 3.7+ required)
python3 -X importtime yourfile.py 2> import.log
Then just run tuna on the file
tuna program.prof
- This was the first solution that worked well for me. – MRule, Feb 17, 2023
- Kudos and thanks for creating Tuna. Among the solutions shown on this page, yours and pyinstrument were the best at figuring out what was taking up the most time in my app. – Nav, Jun 19, 2023
Also worth mentioning is the GUI cProfile dump viewer RunSnakeRun. It allows you to sort and select, thereby zooming in on the relevant parts of the program. The sizes of the rectangles in the picture are proportional to the time taken. If you mouse over a rectangle, it highlights that call in the table and everywhere on the map. When you double-click on a rectangle, it zooms in on that portion and shows you who calls that portion and what that portion calls.
The descriptive information is very helpful. It shows you the code for that bit which can be helpful when you are dealing with built-in library calls. It tells you what file and what line to find the code.
Also want to point out that the OP said 'profiling' but it appears he meant 'timing'. Keep in mind programs will run slower when profiled.
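If plain timing rather than profiling is what's wanted, the standard-library timeit module measures a snippet without profiler overhead; a minimal sketch:

```python
import timeit

# Time a snippet directly; timeit disables garbage collection during
# the runs and repeats the snippet to reduce noise
elapsed = timeit.timeit('sum(i * i for i in range(1000))', number=1000)
print('%.4f s for 1000 runs' % elapsed)

# repeat() returns one total per round; the minimum is the usual best estimate
best = min(timeit.repeat('sum(i * i for i in range(1000))', number=1000, repeat=3))
```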
- This screenshot is evidence that engineers need to be taught at least one chapter on how to use colors in GUIs. I've worked with interaction designers who'd turn away in disgust on seeing this splosh of random colors. – Nav, Jun 19, 2023
pprofile
line_profiler (already presented here) also inspired pprofile, which is described as:
Line-granularity, thread-aware deterministic and statistic pure-python profiler
It provides line granularity like line_profiler, is pure Python, can be used as a standalone command or a module, and can even generate callgrind-format files that can be easily analyzed with [k|q]cachegrind.
vprof
There is also vprof, a Python package described as:
[...] providing rich and interactive visualizations for various Python program characteristics such as running time and memory usage.
-
- Haven't tried pprofile, but I'm upvoting vprof. Its "code heatmap" mode is similar to the Matlab profiler. Currently, correct usage on Windows is not in the readme, but in vprof's GitHub issues: py -m vprof -c <config> <src> – root, Jan 8, 2022
A nice profiling module is the line_profiler (called using the script kernprof.py). It can be downloaded here.
My understanding is that cProfile only gives information about the total time spent in each function, so individual lines of code are not timed. This is an issue in scientific computing, since often a single line can take a lot of time. Also, as I remember, cProfile didn't catch the time I was spending in, say, numpy.dot.
- Note that the original repository has been archived. The currently maintained version is here: github.com/pyutils/line_profiler – mirekphd, Apr 20, 2022
The terminal-only (and simplest) solution, in case all those fancy UIs fail to install or run: ignore cProfile completely and replace it with pyinstrument, which will collect and display the tree of calls right after execution.
Install:
$ pip install pyinstrument
Profile and display result:
$ python -m pyinstrument ./prog.py
Works with Python 2 and 3.
[EDIT] The documentation of the API, for profiling only a part of the code, can be found here.
- Thank you, I think your answer should be much higher :) – erhosen, Apr 17, 2022
- This is exactly the thing I was looking for. Thank you. – AutomaticHourglass, Feb 18, 2023
With a statistical profiler like austin, no instrumentation is required, meaning that you can get profiling data out of a Python application simply with
austin python3 my_script.py
The raw output isn't very useful, but you can pipe that to flamegraph.pl to get a flame graph representation of that data that gives you a breakdown of where the time (measured in microseconds of real time) is being spent.
austin python3 my_script.py | flamegraph.pl > my_script_profile.svg
Alternatively, you can also use the web application Speedscope.app for quick visualisation of the collected samples. If you have pprof installed, you can also get austin-python (with e.g. pipx install austin-python) and use austin2pprof to convert to the pprof format.
However, if you have VS Code installed, you could use the Austin extension for a more interactive experience, with source code heat maps, top functions, and collected call stacks:
Austin VS Code extension
If you'd rather use the terminal, you can also use the TUI, that also has a live graph mode:
Austin TUI graph mode
There are a lot of great answers, but they either use the command line or some external program for profiling and/or sorting the results. I really missed a way I could use in my IDE (Eclipse PyDev) without touching the command line or installing anything. So here it is.
Profiling without command line
def count():
    from math import sqrt
    for x in range(10**5):
        sqrt(x)

if __name__ == '__main__':
    import cProfile, pstats
    cProfile.run("count()", "{}.profile".format(__file__))
    s = pstats.Stats("{}.profile".format(__file__))
    s.strip_dirs()
    s.sort_stats("time").print_stats(10)
See docs or other answers for more info.
- For example, the profile prints {map} or {xxx}. How do I know the method {xxx} is called from which file? My profile prints that {method 'compress' of 'zlib.Compress' objects} takes most of the time, but I don't use zlib, so I guess some call to a numpy function may use it. How do I know exactly which file and line takes so much time? – machen, Oct 28, 2017
- This isn't fair... I dunno why this great answer has so few upvotes... much more useful than the other high-upvoted ones :/ – BPL, Jul 16, 2020
Following Joe Shaw's answer about multi-threaded code not working as expected, I figured that the runcall method in cProfile is merely doing self.enable() and self.disable() calls around the profiled function call, so you can simply do that yourself and have whatever code you want in between, with minimal interference with existing code.
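A minimal sketch of that enable()/disable() pattern (the workload between the calls is a placeholder):

```python
import cProfile
import pstats

profiler = cProfile.Profile()

profiler.enable()
# ... any code you want profiled, no function wrapper required ...
total = sum(i * i for i in range(100000))
profiler.disable()

# Build stats directly from the Profile object and show the hottest entries
stats = pstats.Stats(profiler)
stats.sort_stats('time').print_stats(5)
```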
- Excellent tip! A quick peek at cProfile.py's source code reveals that's exactly what runcall() does. Being more specific: after creating a Profile instance with prof = cProfile.Profile(), immediately call prof.disable(), and then just add prof.enable() and prof.disable() calls around the section of code you want profiled. – martineau, Oct 21, 2012
- This is very helpful, but it seems the code that is actually between enable and disable is not profiled -- only the functions it calls. Do I have this right? I'd have to wrap that code in a function call for it to count toward any of the numbers in print_stats(). – Bob Stein, May 9, 2017
For getting quick profile stats in an IPython notebook, one can embed line_profiler and memory_profiler straight into their notebooks.
Another useful package is Pympler. It is a powerful profiling package that's capable of tracking classes, objects, functions, memory leaks, etc. Examples below; docs attached.
Get it!
!pip install line_profiler
!pip install memory_profiler
!pip install pympler
Load it!
%load_ext line_profiler
%load_ext memory_profiler
Use it!
%time
%time print('Outputs CPU time,Wall Clock time')
#CPU times: user 2 μs, sys: 0 ns, total: 2 μs Wall time: 5.96 μs
Gives:
- CPU times: CPU level execution time
- sys times: system level execution time
- total: CPU time + system time
- Wall time: Wall Clock Time
%timeit
%timeit -r 7 -n 1000 print('Outputs execution time of the snippet')
#1000 loops, best of 7: 7.46 ns per loop
- Gives the best time out of the given number of runs (r), looping (n) times each.
- Outputs details on system caching:
- When code snippets are executed multiple times, the system caches a few operations and doesn't execute them again, which may hamper the accuracy of the profile reports.
%prun
%prun -s cumulative 'Code to profile'
Gives:
- number of function calls(ncalls)
- has entries per function call(distinct)
- time taken per call(percall)
- time elapsed till that function call(cumtime)
- name of the func/module called etc...
%memit
%memit 'Code to profile'
#peak memory: 199.45 MiB, increment: 0.00 MiB
Gives:
- Memory usage
%lprun
#Example function
def fun():
    for i in range(10):
        print(i)

#Usage: %lprun -f <name_of_the_function> function
%lprun -f fun fun()
Gives:
- Line wise stats
sys.getsizeof
sys.getsizeof('code to profile')
# 64 bytes
Returns the size of an object in bytes.
asizeof() from pympler
from pympler import asizeof
obj = [1,2,("hey","ha"),3]
print(asizeof.asizeof(obj,stats=4))
pympler.asizeof can be used to investigate how much memory certain Python objects consume. In contrast to sys.getsizeof, asizeof sizes objects recursively.
tracker from pympler
from pympler import tracker
tr = tracker.SummaryTracker()
def fun():
    li = [1,2,3]
    di = {"ha":"haha","duh":"Umm"}
fun()
tr.print_diff()
Tracks the lifetime of a function.
The Pympler package consists of a huge number of high-utility functions for profiling code, not all of which can be covered here. See the attached documentation for verbose profile implementations.
Pympler doc
Recently I created a plugin for PyCharm with which you can easily analyse and visualise the results of line_profiler in the PyCharm editor. line_profiler has been mentioned in other answers as well and is a great tool to analyse exactly how much time is spent by the Python interpreter in certain lines.
The PyCharm plugin I've created can be found here: https://plugins.jetbrains.com/plugin/16536-line-profiler
It needs a helper package in your Python environment called line-profiler-pycharm, which can be installed with pip or by the plugin itself.
After installing the plugin in PyCharm:
- Decorate any function you want to profile with the line_profiler_pycharm.profile decorator
- Run with the 'Profile Lines' runner
Screenshot of results: Line Profiler Pycharm results
- Thanks for this tool, it's great! – birnbaum, Jul 23, 2024
In Virtaal's source there's a very useful class and decorator that can make profiling (even for specific methods/functions) very easy. The output can then be viewed very comfortably in KCacheGrind.
- Thank you for this gem. FYI: this can be used as a standalone module with any code; the Virtaal code base is not required. Just save the file to profiling.py and import profile_func(). Use @profile_func() as a decorator on any function you need to profile, and voilà. :) – Amjith, Oct 6, 2011
If you want to make a cumulative profiler, meaning to run the function several times in a row and watch the sum of the results, you can use this cumulative_profiler decorator.
It's Python >= 3.6 specific, but you can remove nonlocal for it to work on older versions.
import cProfile, pstats

class _ProfileFunc:
    def __init__(self, func, sort_stats_by):
        self.func = func
        self.profile_runs = []
        self.sort_stats_by = sort_stats_by

    def __call__(self, *args, **kwargs):
        pr = cProfile.Profile()
        pr.enable()  # this is the profiling section
        retval = self.func(*args, **kwargs)
        pr.disable()
        self.profile_runs.append(pr)
        ps = pstats.Stats(*self.profile_runs).sort_stats(self.sort_stats_by)
        return retval, ps

def cumulative_profiler(amount_of_times, sort_stats_by='time'):
    def real_decorator(function):
        def wrapper(*args, **kwargs):
            nonlocal function, amount_of_times, sort_stats_by  # for python 2.x remove this row
            profiled_func = _ProfileFunc(function, sort_stats_by)
            for i in range(amount_of_times):
                retval, ps = profiled_func(*args, **kwargs)
            ps.print_stats()
            return retval  # returns the results of the function
        return wrapper

    if callable(amount_of_times):  # in case you don't want to specify the amount of times
        func = amount_of_times  # amount_of_times is the function in here
        amount_of_times = 5  # the default amount
        return real_decorator(func)
    return real_decorator
Example
profiling the function baz
import time

@cumulative_profiler
def baz():
    time.sleep(1)
    time.sleep(2)
    return 1

baz()

baz ran 5 times and printed this:
20 function calls in 15.003 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
10 15.003 1.500 15.003 1.500 {built-in method time.sleep}
5 0.000 0.000 15.003 3.001 <ipython-input-9-c89afe010372>:3(baz)
5 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
specifying the amount of times
@cumulative_profiler(3)
def baz():
...
cProfile is great for quick profiling, but most of the time it ended for me with errors. The function runctx solves this problem by correctly initializing the environment and variables. Hope it can be useful for someone:
import cProfile
cProfile.runctx('foo()', None, locals())
gprof2dot_magic
Magic function for gprof2dot
to profile any Python statement as a DOT graph in JupyterLab or Jupyter Notebook.
GitHub repo: https://github.com/mattijn/gprof2dot_magic
installation
Make sure you have the Python package gprof2dot_magic:
pip install gprof2dot_magic
Its dependencies gprof2dot and graphviz will be installed as well.
usage
To enable the magic function, first load the gprof2dot_magic module:
%load_ext gprof2dot_magic
and then profile any line statement as a DOT graph as such:
%gprof2dot print('hello world')
I just developed my own profiler, inspired by pypref_time:
https://github.com/modaresimr/auto_profiler
Update Version 2
Install:
pip install auto_profiler
Quick Start:
from auto_profiler import Profiler

with Profiler():
    your_function()
Using it in Jupyter lets you have a real-time view of the elapsed times.
Real Time view of auto profiler in jupyter
Update Version 1
By adding a decorator it will show a tree of time-consuming functions
@Profiler(depth=4)
Install by: pip install auto_profiler
Example
import time  # line number 1
import random
from auto_profiler import Profiler, Tree

def f1():
    mysleep(.6 + random.random())

def mysleep(t):
    time.sleep(t)

def fact(i):
    f1()
    if i == 1:
        return 1
    return i * fact(i - 1)

def main():
    for i in range(5):
        f1()
    fact(3)

with Profiler(depth=4):
    main()
Example Output
Time [Hits * PerHit] Function name [Called from] [function location]
-----------------------------------------------------------------------
8.974s [1 * 8.974] main [auto-profiler/profiler.py:267] [/test/t2.py:30]
├── 5.954s [5 * 1.191] f1 [/test/t2.py:34] [/test/t2.py:14]
│ └── 5.954s [5 * 1.191] mysleep [/test/t2.py:15] [/test/t2.py:17]
│ └── 5.954s [5 * 1.191] <time.sleep>
|
|
| # The rest is for the example recursive function call fact
└── 3.020s [1 * 3.020] fact [/test/t2.py:36] [/test/t2.py:20]
├── 0.849s [1 * 0.849] f1 [/test/t2.py:21] [/test/t2.py:14]
│ └── 0.849s [1 * 0.849] mysleep [/test/t2.py:15] [/test/t2.py:17]
│ └── 0.849s [1 * 0.849] <time.sleep>
└── 2.171s [1 * 2.171] fact [/test/t2.py:24] [/test/t2.py:20]
├── 1.552s [1 * 1.552] f1 [/test/t2.py:21] [/test/t2.py:14]
│ └── 1.552s [1 * 1.552] mysleep [/test/t2.py:15] [/test/t2.py:17]
└── 0.619s [1 * 0.619] fact [/test/t2.py:24] [/test/t2.py:20]
└── 0.619s [1 * 0.619] f1 [/test/t2.py:21] [/test/t2.py:14]
- Wow, for such a cool profiler, why not more stars on GitHub? – Dan Nissenbaum, Sep 23, 2021
- @DanNissenbaum I'm so happy to hear of your interest. I don't have a big network, so no one knows this tool. I hope I can continue maintaining this project by hearing such interest 😊 – Ali, Sep 24, 2021
My way is to use yappi (https://github.com/sumerc/yappi). It's especially useful combined with an RPC server, where (even just for debugging) you register methods to start, stop and print profiling information, e.g. in this way:
@staticmethod
def startProfiler():
    yappi.start()

@staticmethod
def stopProfiler():
    yappi.stop()

@staticmethod
def printProfiler():
    stats = yappi.get_stats(yappi.SORTTYPE_TTOT, yappi.SORTORDER_DESC, 20)
    statPrint = '\n'
    namesArr = [len(str(stat[0])) for stat in stats.func_stats]
    log.debug("namesArr %s", str(namesArr))
    maxNameLen = max(namesArr)
    log.debug("maxNameLen: %s", maxNameLen)
    for stat in stats.func_stats:
        nameAppendSpaces = [' ' for i in range(maxNameLen - len(stat[0]))]
        log.debug('nameAppendSpaces: %s', nameAppendSpaces)
        blankSpace = ''
        for space in nameAppendSpaces:
            blankSpace += space
        log.debug("adding spaces: %s", len(nameAppendSpaces))
        statPrint = statPrint + str(stat[0]) + blankSpace + " " + str(stat[1]).ljust(8) + "\t" + str(
            round(stat[2], 2)).ljust(8 - len(str(stat[2]))) + "\t" + str(round(stat[3], 2)) + "\n"
    log.log(1000, "\nname" + ''.ljust(maxNameLen - 4) + " ncall \tttot \ttsub")
    log.log(1000, statPrint)
Then, while your program is working, you can start the profiler at any time by calling the startProfiler RPC method, and dump the profiling information to a log file by calling printProfiler (or modify the RPC method to return it to the caller), getting output like this:
2014年02月19日 16:32:24,128-|SVR-MAIN |-(Thread-3 )-Level 1000:
name ncall ttot tsub
2014年02月19日 16:32:24,128-|SVR-MAIN |-(Thread-3 )-Level 1000:
C:\Python27\lib\sched.py.run:80 22 0.11 0.05
M:02円_documents\_repos09円_aheadRepos\apps\ahdModbusSrv\pyAheadRpcSrv\xmlRpc.py.iterFnc:293 22 0.11 0.0
M:02円_documents\_repos09円_aheadRepos\apps\ahdModbusSrv\serverMain.py.makeIteration:515 22 0.11 0.0
M:02円_documents\_repos09円_aheadRepos\apps\ahdModbusSrv\pyAheadRpcSrv\PicklingXMLRPC.py._dispatch:66 1 0.0 0.0
C:\Python27\lib\BaseHTTPServer.py.date_time_string:464 1 0.0 0.0
c:\users\zasiec~1\appdata\local\temp\easy_install-hwcsr1\psutil-1.1.2-py2.7-win32.egg.tmp\psutil\_psmswindows.py._get_raw_meminfo:243 4 0.0 0.0
C:\Python27\lib\SimpleXMLRPCServer.py.decode_request_content:537 1 0.0 0.0
c:\users\zasiec~1\appdata\local\temp\easy_install-hwcsr1\psutil-1.1.2-py2.7-win32.egg.tmp\psutil\_psmswindows.py.get_system_cpu_times:148 4 0.0 0.0
<string>.__new__:8 220 0.0 0.0
C:\Python27\lib\socket.py.close:276 4 0.0 0.0
C:\Python27\lib\threading.py.__init__:558 1 0.0 0.0
<string>.__new__:8 4 0.0 0.0
C:\Python27\lib\threading.py.notify:372 1 0.0 0.0
C:\Python27\lib\rfc822.py.getheader:285 4 0.0 0.0
C:\Python27\lib\BaseHTTPServer.py.handle_one_request:301 1 0.0 0.0
C:\Python27\lib\xmlrpclib.py.end:816 3 0.0 0.0
C:\Python27\lib\SimpleXMLRPCServer.py.do_POST:467 1 0.0 0.0
C:\Python27\lib\SimpleXMLRPCServer.py.is_rpc_path_valid:460 1 0.0 0.0
C:\Python27\lib\SocketServer.py.close_request:475 1 0.0 0.0
c:\users\zasiec~1\appdata\local\temp\easy_install-hwcsr1\psutil-1.1.2-py2.7-win32.egg.tmp\psutil\__init__.py.cpu_times:1066 4 0.0 0.0
It may not be very useful for short scripts, but it helps to optimize server-type processes, especially since the printProfiler
method can be called multiple times over the process's lifetime to profile and compare, e.g., different program usage scenarios.
In newer versions of yappi, the following code will work:
@staticmethod
def printProfile():
    yappi.get_func_stats().print_all()
-
Shouldn't it be named the Stupendous Yappi?Therealstubot– Therealstubot07/11/2014 22:43:15Commented Jul 11, 2014 at 22:43
-
Unfortunately the code above works only with version 0.62, which is not available on PyPI. The module needs to be compiled from the 0.62 sources available here: github.com/nirs/yappi/releases or use the build I made for Windows in the repo forked for that purpose github.com/Girgitt/yappi/releasesMr. Girgitt– Mr. Girgitt03/26/2018 15:33:49Commented Mar 26, 2018 at 15:33
-
compatibility with version 1.0 can be easily provided - at least for print output - by modifying the printProfiler function:
def printProfiler():
    if not yappi_available:
        return
    stats = yappi.get_func_stats()
    stats.print_all(columns={0: ("name", 90), 1: ("ncall", 5), 2: ("tsub", 8), 3: ("ttot", 8), 4: ("tavg", 8)})
(OK after trying couple times to insert code block into the comment I gave up. this is unbelievably difficult for a programming-oriented Q&A site.)Mr. Girgitt– Mr. Girgitt11/26/2019 20:12:40Commented Nov 26, 2019 at 20:12
Scalene is a new Python profiler that covers many use cases and has a minimal performance impact:
https://github.com/plasma-umass/scalene
It can profile CPU, GPU and memory utilisation at a very granular level. It also notably supports multi-threaded / parallelized Python code.
A new tool to handle profiling in Python is PyVmMonitor: http://www.pyvmmonitor.com/
It has some unique features such as
- Attach profiler to a running (CPython) program
- On demand profiling with Yappi integration
- Profile on a different machine
- Multiple processes support (multiprocessing, django...)
- Live sampling/CPU view (with time range selection)
- Deterministic profiling through cProfile/profile integration
- Analyze existing PStats results
- Open DOT files
- Programmatic API access
- Group samples by method or line
- PyDev integration
- PyCharm integration
Note: it's commercial, but free for open source.
To add on to https://stackoverflow.com/a/582337/1070617,
I wrote this module that allows you to use cProfile and view its output easily. More here: https://github.com/ymichael/cprofilev
$ python -m cprofilev /your/python/program
# Go to http://localhost:4000 to view collected statistics.
Also see: http://ymichael.com/2014/03/08/profiling-python-with-cprofile.html on how to make sense of the collected statistics.
It depends on what you want to see out of profiling. Simple time metrics can be given by (in bash):
time python python_prog.py
Even '/usr/bin/time' can output detailed metrics by using the '--verbose' flag.
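If you prefer to time a specific block from inside the script rather than the whole process, a minimal sketch using only the standard library's high-resolution clock (the workload here is just an illustrative placeholder):

```python
import time

start = time.perf_counter()
total = sum(i * i for i in range(1_000_000))  # placeholder workload to measure
elapsed = time.perf_counter() - start
print(f"computed {total} in {elapsed:.4f} seconds")
```

time.perf_counter is preferable to time.time for measuring durations because it is monotonic and has the highest available resolution.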
To check the time metrics for each function and to better understand how much time is spent in functions, you can use the built-in cProfile module.
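As a sketch of how that looks when driven from code rather than the command line, cProfile can be paired with the standard library's pstats module to sort and print the collected timings (the `work` function here is just a stand-in for your own code):

```python
import cProfile
import io
import pstats

def work(n):
    # stand-in workload to profile
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
work(100_000)
profiler.disable()

# Render the stats into a string, sorted by cumulative time,
# showing only the top 5 entries.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats(
    pstats.SortKey.CUMULATIVE).print_stats(5)
report = buffer.getvalue()
print(report)
```

This is handy when you want to profile only one part of a larger program instead of wrapping the whole script with `python -m cProfile`.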
Going into more detailed metrics, time is not the only one to worry about: you may also care about memory, threads, etc.
Profiling options:
1. line_profiler is another commonly used profiler, for finding timing metrics line by line.
2. memory_profiler is a tool for profiling memory usage.
3. heapy (from the Guppy project) profiles how objects in the heap are used.
These are some of the common ones I tend to use. But if you want to find out more, try reading this book. It is a pretty good book on starting out with performance in mind. You can move on to advanced topics such as using Cython and JIT (just-in-time) compiled Python.
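For basic memory profiling along the lines of what memory_profiler offers, the standard library's tracemalloc module (available since Python 3.4) can be a quick first step with no install, sketched here on an illustrative allocation:

```python
import tracemalloc

tracemalloc.start()
data = [bytes(1000) for _ in range(1000)]  # allocate roughly 1 MB
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"current: {current} bytes, peak: {peak} bytes")
```

tracemalloc can also take snapshots and attribute allocations to source lines via `take_snapshot()` and `statistics('lineno')`, which gets you closer to memory_profiler's per-line view.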
Ever want to know what the hell that Python script is doing? Enter the Inspect Shell. Inspect Shell lets you print/alter globals and run functions without interrupting the running script. Now with auto-complete and command history (Linux only).
Inspect Shell is not a pdb-style debugger.
https://github.com/amoffat/Inspect-Shell
You could use that (and your wristwatch).
There's also a statistical profiler called statprof
. It's a sampling profiler, so it adds minimal overhead to your code and gives line-based (not just function-based) timings. It's better suited to soft real-time applications like games, but may have less precision than cProfile.
The version on PyPI is a bit old, so you can install it with pip
by specifying the git repository:
pip install git+git://github.com/bos/statprof.py@1a33eba91899afe17a8b752c6dfdec6f05dd0c01
You can run it like this:
import statprof
with statprof.profile():
my_questionable_function()