I am trying to speed up saving my charts to images. Right now I create a cStringIO object and save the chart into it with savefig, but I would really appreciate any help improving this approach. I have to do this operation dozens of times, and savefig is very slow; there must be a better way of doing it. I read something about saving the chart as an uncompressed raw image, but I have no idea how to do that. I don't particularly care about Agg, so I can switch to another, faster backend too.
For example:
RAM = cStringIO.StringIO()
CHART = plt.figure(....
**code for creating my chart**
CHART.savefig(RAM, format='png')
I have been using matplotlib with FigureCanvasAgg backend.
Thanks!
2 Answers
If you just want a raw buffer, try fig.canvas.print_rgb, fig.canvas.print_raw, etc. (the difference between the two is that raw is RGBA, whereas rgb is RGB; there's also print_png, print_ps, etc.).
This will use fig.dpi instead of the default dpi value for savefig (100 dpi). Still, even comparing fig.canvas.print_raw(f) and fig.savefig(f, format='raw', dpi=fig.dpi), the fig.canvas.print_raw version is only insignificantly faster, since it doesn't bother resetting the color of the axis patch, etc., that savefig does by default.
Regardless, though, most of the time spent saving a figure in a raw format is just drawing the figure, which there's no way to get around.
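For readers on Python 3, where cStringIO no longer exists, here is a minimal sketch of grabbing the raw buffer. It assumes the Agg canvas and swaps in io.BytesIO for cStringIO; the figure size and the plotted data are placeholders of mine, not from the question. The second option uses canvas.buffer_rgba(), which avoids the file-like object entirely.

```python
import io

import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

# Build a figure directly on the Agg canvas (no pyplot, no GUI needed).
# Size, dpi and data are placeholder values for illustration.
fig = Figure(figsize=(4, 3), dpi=100)
canvas = FigureCanvasAgg(fig)
ax = fig.add_subplot(111)
ax.plot([0, 1, 2], [0, 1, 4])

# Option 1: write raw RGBA bytes into an in-memory file object.
ram = io.BytesIO()
canvas.print_raw(ram)
raw_bytes = ram.getvalue()

# Option 2: skip the file object and view the canvas buffer as an array.
canvas.draw()
rgba = np.asarray(canvas.buffer_rgba())  # shape: (height, width, 4)

width, height = canvas.get_width_height()
print(len(raw_bytes) == width * height * 4, rgba.shape)
```

Either way the figure still has to be drawn, so this mainly saves the encoding step rather than the rendering itself.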
At any rate, as a pointless-but-fun example, consider the following:
import matplotlib.pyplot as plt
import numpy as np
import cStringIO
plt.ion()
fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)
for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    fig.canvas.draw()
[Image: Brownian walk animation]
If we look at the raw draw time:
import matplotlib.pyplot as plt
import numpy as np
import cStringIO
fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)
for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    fig.canvas.draw()
This takes ~25 seconds on my machine.
If we instead dump a raw RGBA buffer to a cStringIO buffer, it's actually marginally faster at ~22 seconds (This is only true because I'm using an interactive backend! Otherwise it would be equivalent.):
import matplotlib.pyplot as plt
import numpy as np
import cStringIO
fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)
for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    ram = cStringIO.StringIO()
    fig.canvas.print_raw(ram)
    ram.close()
If we compare this to using savefig, with a comparably set dpi:
import matplotlib.pyplot as plt
import numpy as np
import cStringIO
fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)
for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    ram = cStringIO.StringIO()
    fig.savefig(ram, format='raw', dpi=fig.dpi)
    ram.close()
This takes ~23.5 seconds. Basically, savefig just sets some default parameters and calls print_raw in this case, so there's very little difference.
Now, if we compare a raw image format with a compressed image format (png), we see a much more significant difference:
import matplotlib.pyplot as plt
import numpy as np
import cStringIO
fig = plt.figure()
ax = fig.add_subplot(111)
num = 50
max_dim = 10
x = max_dim / 2 * np.ones(num)
s, c = 100 * np.random.random(num), np.random.random(num)
scat = ax.scatter(x,x,s,c)
ax.axis([0,max_dim,0,max_dim])
ax.set_autoscale_on(False)
for i in xrange(1000):
    xy = np.random.random(2*num).reshape(num,2) - 0.5
    offsets = scat.get_offsets() + 0.3 * xy
    offsets.clip(0, max_dim, offsets)
    scat.set_offsets(offsets)
    scat._sizes += 30 * (np.random.random(num) - 0.5)
    scat._sizes.clip(1, 300, scat._sizes)
    ram = cStringIO.StringIO()
    fig.canvas.print_png(ram)
    ram.close()
This takes ~52 seconds! Obviously, there's a lot of overhead in compressing an image.
At any rate, this is probably a needlessly complex example... I think I just wanted to avoid actual work...
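If you want to repeat the comparison on your own machine, a rough Python 3 sketch might look like the following. The helper time_saves, the repeat count, and the placeholder figure are my own assumptions, and the timings will differ from the numbers quoted above; it only times the three save paths (print_raw, savefig with format='raw', and print_png) on one static figure.

```python
import io
import time

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

# Placeholder figure; swap in your own chart.
fig = Figure(figsize=(4, 3), dpi=100)
canvas = FigureCanvasAgg(fig)
ax = fig.add_subplot(111)
ax.scatter(range(50), range(50))


def time_saves(save, repeats=20):
    """Hypothetical helper: time `repeats` calls of save(buf).

    Returns (elapsed_seconds, bytes_written_by_last_call).
    """
    start = time.perf_counter()
    for _ in range(repeats):
        buf = io.BytesIO()
        save(buf)
        data = buf.getvalue()
    return time.perf_counter() - start, data


results = {
    "print_raw": time_saves(canvas.print_raw),
    "savefig raw": time_saves(
        lambda b: fig.savefig(b, format="raw", dpi=fig.dpi)
    ),
    "print_png": time_saves(canvas.print_png),
}
for name, (seconds, data) in results.items():
    print(f"{name:12s} {seconds:.3f} s  {len(data)} bytes")
```

Every call redraws the figure, so the draw cost is included in all three timings; the difference between the rows is roughly the encoding overhead.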
- Nice example Joe, even if it might be overkill. I'm wondering if you saved the frames drawn by each iteration on the disk and then compiled them offline into an animated gif, or is there some way of compiling the drawn frames "in-stream" into an animated gif? I don't mean using the animation module, as I'd like to save animations produced by interactive (mouse-event driven) plots. – achennu, Apr 16, 2013 at 7:32
- Well, did some searching and I suppose your suggestion might be that shown here: stackoverflow.com/a/14986894/467522, right? – achennu, Apr 16, 2013 at 7:37
- Actually, this particular gif was made by just saving each iteration and compiling them offline (with imagemagick's convert). (I think this example predates the release of a matplotlib version with the animation module.) At any rate, it should be possible to use ffmpeg to create an animated gif, but if I recall correctly, saving as a gif using the animation module doesn't work quite correctly. (I may be misremembering, and it may have been fixed by now, regardless. It's been a while since I've tried.) – Joe Kington, Apr 17, 2013 at 3:14
- Realize this is an old thread but wondering if there's a way to avoid cStringIO. Any pure Matplotlib solution? – so860, Feb 12, 2020 at 16:21
I needed to quickly generate lots of plots as well. I found that multiprocessing improved the plotting speed with the number of cores available. For example, if 100 plots took 10 seconds in one process, it took ~3 seconds when the task was split across 4 cores.
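A minimal sketch of that idea, assuming each plot can be built independently from a small picklable input. The functions render_png and render_many, the seed-based random data, and the pool size are all hypothetical choices for illustration, not the code I actually used.

```python
import io
from multiprocessing import Pool

import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_png(seed):
    """Render one placeholder scatter plot and return it as PNG bytes."""
    rng = np.random.default_rng(seed)
    # Use Figure + Agg canvas directly so workers share no pyplot state.
    fig = Figure(figsize=(4, 3), dpi=100)
    canvas = FigureCanvasAgg(fig)
    ax = fig.add_subplot(111)
    ax.scatter(rng.random(50), rng.random(50))
    buf = io.BytesIO()
    canvas.print_png(buf)
    return buf.getvalue()


def render_many(seeds, processes=4):
    """Render one plot per seed, spread across a pool of worker processes."""
    with Pool(processes=processes) as pool:
        return pool.map(render_png, seeds)


if __name__ == "__main__":
    images = render_many(range(20))
    print(len(images), "plots rendered")
```

The speedup depends on how heavy each plot is; for very cheap plots the cost of starting processes and sending the image bytes back can eat most of the gain.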
format='raw' or format='rgba'. It looks like they produce the same output.