Hi everybody,

About two years ago, I wrote a backend for matplotlib on Mac OS X. This is a native backend for Mac OS X, meaning that most of it is implemented in C (Objective-C, to be precise) to fully make use of Apple's Quartz graphics-rendering technology. I have been using this backend for my own work for the past two years, and recently I updated it for use with matplotlib 0.98.3. To make this backend available to other matplotlib users, I submitted it as a patch to sourceforge (patch 2179017 on the matplotlib issue tracker):

http://sourceforge.net/tracker/index.php?func=detail&aid=2179017&group_id=80706&atid=560722

I'd be interested in any feedback on this backend from matplotlib users and developers.

Some information about this backend:

1) No 3rd party library is needed other than Python and NumPy.
2) The backend is interactive both with python and with ipython.
3) The backend was written in C for optimal performance, and depending on the application it can be much, much faster than existing backends.
4) One drawback compared to the existing cocoa-agg backend is that the latter allows easy integration of matplotlib into a larger cocoa application, whereas my backend only cares about matplotlib.

Some history about this backend:

I used to be one of the developers of pygist, a scientific visualization package for Python that was written about ten years ago and is still in use. About two years ago I switched to matplotlib (http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/python/gist). Whereas matplotlib has a vastly superior range of high-level plotting capabilities, pygist excelled at sheer speed. This was achieved by having three backends (Windows, Mac OS X, X11) written in C for optimal performance, which at the same time avoided the need for external 3rd party libraries for drawing. I believe that matplotlib can achieve the same speed and performance by similarly having a native backend.

Enjoy!

--Michiel.
Michiel de Hoon wrote: > I wrote a backend for matplotlib on Mac OS X. This is a native > backend for Mac OS X very nice! > 4) One drawback compared to the existing cocoa-agg backend is that > the latter allows easy integration of matplotlib into a larger cocoa > application, whereas my backend only cares about matplotlib. well, as far as many of us are concerned, matplotlib IS an embeddable plotting package. I suppose you could say that your backend only cares about pylab. Is there any possibility to embed it in another app? I know that wx, for instance, can pass a Window handle to other libs, so that you can have that window managed and drawn by other code, while wx handles the rest of the app -- would this kind of thing be possible with your code with wx, and Cocoa, and QT, and ? I imagine GTK would be off the table as it is using X11, though I suppose if you are using Apple's X11, it could even be possible there. > Whereas matplotlib has a vastly superior range of high-level plotting > capabilities, pygist excelled at sheer speed. This was achieved by > having three backends (Windows, Mac OS X, X11) written in C for > optimal performance, I'm still curious where all this speed comes from. MPL already uses Agg for a lot, and it's generally reported to be as fast as many native drawing APIs (though maybe not quartz?) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chr...@no...
--- On Tue, 10/28/08, Christopher Barker <Chr...@no...> wrote:
> > 4) One drawback compared to the existing cocoa-agg
> > backend is that the latter allows easy integration
> > of matplotlib into a larger cocoa application,
> > whereas my backend only cares about matplotlib.
> ...
> Is there any possibility to embed it in another app? I know
> that wx, for instance, can pass a Window handle to other
> libs, so that you can have that window managed and drawn
> by other code, while wx handles the rest of the app --
> would this kind of thing be possible with your code with
> wx, and Cocoa, and QT, and ?

That is a good idea. I'll check whether it is possible. Basically, if the embedding app can let other code manage a window by passing along the window handle, then it should be possible.

> I'm still curious where all this speed comes from. MPL
> already uses Agg for a lot, and it's generally reported
> to be as fast as many native drawing APIs (though maybe
> not quartz?)

At this point, most of it is coming from having complete control over the event loop, which allows the backend to avoid superfluous calls to draw().

Best,
--Michiel
Michiel de Hoon wrote:
> --- On Tue, 10/28/08, Christopher Barker <Chr...@no...> wrote:
>> I'm still curious where all this speed comes from.
> At this point, most of it is coming from having complete control over
> the event loop, which allows to avoid superfluous calls to draw().

Well, what would be really nice is if we could figure out how to get rid of some of these superfluous calls to draw() in all the back-ends! I have noticed a bunch of extras in wxAgg, but had a hard time untangling it all.

Also, OS-X does double buffer itself, so there may be extra work being done there in other back-ends -- essentially triple buffering.

oh well.

-Chris
Nice work ... and an ambitious effort.

I've gotten it running, and am a bit perplexed by some of the performance I'm seeing. Specifically, the following bit takes well over twice as long to run as does WxAgg. Does this align with others' testing?

The only difference I detect is that the Mac backend puts up a window on the call to plt.figure, while WxAgg waits until the call to show().

Thanks,
Eric

# for testing macosx backend only
import matplotlib
matplotlib.use('macosx')

import matplotlib.pyplot as plt
f=plt.figure()
import numpy as np
x=np.arange(1e4)
y=np.sin(x)
ax=f.add_subplot(111)
sc=ax.scatter(x,y,c=x**2.0)
plt.show()
On Tue, Oct 28, 2008 at 12:24 PM, Eric Bruning <eri...@gm...> wrote: > Nice work ... and an ambitious effort. > > I've gotten it running, and am a bit perplexed by some of the > performance I'm seeing. Specifically, the following bit takes well > over twice as long to run as does WxAgg. Does this align with others' > testing? I haven't had a chance to look at the code yet, but I suspect he hasn't implemented the path collection draw method. If it's not implemented, we fall back on drawing each path separately, which is a lot slower. scatter ultimately triggers a call to Renderer.draw_path_collection which has a default implementation and a specialization in backend_agg. JDH
--- On Tue, 10/28/08, John Hunter <jd...@gm...> wrote: > I haven't had a chance to look at the code yet, but I > suspect he > hasn't implemented the path collection draw method. If > it's not > implemented, we fall back on drawing each path separately, > which is a > lot slower. scatter ultimately triggers a call to > Renderer.draw_path_collection which has a default > implementation and a > specialization in backend_agg. > Good point. Indeed I was not aware of the draw_path_collection method and I have not implemented it. I will implement this method and report back with the timings for Eric's example. Thanks! --Michiel.
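The fallback John describes can be pictured with a toy model. This is only a rough sketch: `ToyRenderer`, `BatchRenderer`, and the `backend_calls` counter are hypothetical names made up for illustration, and matplotlib's real `draw_path_collection` takes many more arguments than shown here. The point is just that a default implementation which loops over `draw_path` makes one trip into the drawing layer per path, while a specialized override can make one trip for the whole collection.

```python
# Toy model of the fallback described above; NOT matplotlib's real API.
# ToyRenderer and BatchRenderer are hypothetical names for illustration.


class ToyRenderer:
    """Base renderer: only knows how to draw a single path."""

    def __init__(self):
        self.backend_calls = 0  # trips into the (imaginary) drawing layer

    def draw_path(self, path, color):
        self.backend_calls += 1  # one trip per path

    def draw_path_collection(self, paths, colors):
        # Default implementation: fall back to one draw_path call per path.
        for path, color in zip(paths, colors):
            self.draw_path(path, color)


class BatchRenderer(ToyRenderer):
    """Specialized renderer: handles the whole collection in one call."""

    def draw_path_collection(self, paths, colors):
        # Assumption: the drawing layer accepts all paths in a single call.
        self.backend_calls += 1


paths = [[(i, 0), (i, 1)] for i in range(10000)]
colors = ["k"] * len(paths)

fallback = ToyRenderer()
fallback.draw_path_collection(paths, colors)

batched = BatchRenderer()
batched.draw_path_collection(paths, colors)

print(fallback.backend_calls, batched.backend_calls)
```

With 10,000 scatter markers the fallback makes 10,000 calls and the override makes one, which is the kind of gap that would explain a scatter plot running slower on a backend that lacks the specialization.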
Dear all,

I have now implemented the draw_path_collection, draw_quad_mesh, and draw_markers methods in the Mac OS X native backend (see patch 2179017 at
http://sourceforge.net/tracker/?func=detail&atid=560722&aid=2179017&group_id=80706). Some timings are below. In terms of raw drawing speed, the native backend can be either faster or slower than agg. On the other hand, the native backend can be considerably faster if the agg backend uses many calls to draw(); the native backend can avoid these superfluous calls because it has complete control over the event loop (see the third example below).

Best,
--Michiel.

# Scatter plot
n = 1e6
import matplotlib.pyplot as plt
f=plt.figure()
import numpy as np
x=np.arange(n)
y=np.sin(x)
ax=f.add_subplot(111)
plt.scatter(x,y,c=x**2.0)

# Time in seconds
# n        100,000  1,000,000  2,000,000  3,000,000
# MacOSX         6         45         92        140
# WxAgg          7         56        112        172
# TkAgg          9         56        113        172
# GtkAgg         7         55        111        173

# Quad mesh
import numpy as np
from matplotlib.pyplot import figure, show, savefig
from matplotlib import cm, colors
from numpy import ma

n = 1000
x = np.cos(np.linspace(-1.5,1.5,n))
y = np.linspace(-1.5,1.5,n*2)
X,Y = np.meshgrid(x,y);
Qx = np.sin(Y**2) - np.cos(X)
Qz = np.sin(Y) + np.sin(X)
Qx = (Qx + 1.1)
# Y**2 rather than Y**3: a cube goes negative and makes sqrt() return NaN
Z = np.sqrt(X**2 + Y**2)/5;
Z = (Z - Z.min()) / (Z.max() - Z.min())

# The color array can include masked values:
Zm = ma.masked_where(np.fabs(Qz) < 0.5*np.amax(Qz), Z)

fig = figure()
ax = fig.add_subplot(111)
ax.set_axis_bgcolor("#bdb76b")
ax.pcolormesh(Qx,Qz,Z)
show()

# Timings in seconds
# n      Mac OS X   TkAgg
# 500           6       6
# 1000         23       7
# 2000        138      40

# Subplots
from pylab import *
figure()
x=np.arange(20)
y=1+x**2
n = 4
for i in range(n*n):
    subplot(n,n,i+1)
    bar(x,y,log=True)
    xlim(-5,25)
    ylim(1,1e4)

# Timings in seconds
# n   Mac OS X   TkAgg
# 2          2       6
# 3          3      23
# 4          5      66
Hi Michiel,

This looks great -- in particular I am intrigued by the final timing results which show your backend 12 times faster than tkagg. I am not sure where this speedup is coming from -- do you have some ideas? Because you are creating lots-o-subplots in that example, there is a lot of overhead at the python layer (many axes, many ticks, etc) so I don't see how a faster backend could generate such a significant improvement. What kind of timings do you see if you issue a plot rather than bar call in that example? One thing about bar in particular is that we draw lots of separate rectangles, each with their own gc, and it has been on my wishlist for some time to do this as a collection. If you are handling gc creation, etc, in C, that may account for a big part of the difference.

Since the new macosx backend was released in 0.98.5, I also need to decide whether this patch belongs on the branch, and hence will get pushed out as early as today in a bugfix release when some changes JJ and Michael are working on are ready, or the trunk, in which case it could be months. In favor of the trunk: this is more of a feature enhancement than a bugfix, and patches to the branch should be bugfixes with an eye to stability of the released code, though a good argument could be made that this is a bugfix. In favor of the branch: it is brand new code advertised as beta in 0.98.5 and so it is unlikely that anyone is using it seriously yet, and since it is beta, we should get as much of it out there ASAP so folks can test and pound on it. I'm in favor of branch, but I wanted to bring this up here since we are fairly new to the branch/trunk release maintenance game and want to get some input and provide some color about which patches should go where, especially in gray areas like this.

JDH
John Hunter wrote:
> Since the new macosx backend was released in 0.98.5, I also need to
> decide whether this patch belongs on the branch, and hence will get
> pushed out as early as today in a bugfix release when some changes JJ
> and Michael are working on are ready, or the trunk, in which case it
> could be months. In favor of the trunk: this is more of a feature
> enhancement than a bugfix, and patches to the branch should be
> bugfixes with an eye to stability of the released code, though a good
> argument could be made that this is a bugfix. In favor of the branch:
> it is brand new code advertised as beta in 0.98.5 and so it is
> unlikely that anyone is using it seriously yet, and since it is beta,
> we should get as much of it out there ASAP so folks can test and pound
> on it. I'm in favor of branch, but I wanted to bring this up here
> since we are fairly new to the branch/trunk release maintenance game
> and want to get some input and provide some color about which patches
> should go where, especially in gray areas like this.

I'm +1 on going ahead and putting this on the branch, for the reasons you mentioned.
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma
Hi Michiel,

+1 to Chris Barker's request for information on where Agg makes extra calls to draw(). The 20% speedup in scatter performance is nice, and is clearly related to Agg.

Any idea why the pcolormesh example is so much slower in Mac OS X than TkAgg?

Thanks for your continued work on this.

-Eric
> This looks great -- in particular I am intrigued by the
> final timing results which show your backend 12 times
> faster than tkagg. I am not sure where this speedup is
> coming from -- do you have some ideas?

In this example, I am drawing 16 subplots in a 4x4 grid. With Tkagg, I am noticing that the first few subplots appear quickly, but subsequent subplots get slower and slower. I think that this is due to how the event loop works. In my understanding, tkagg redraws the window when a subplot is added. So to draw subplot 16, tkagg also needs to redraw subplots 1..15, causing the progressive slowdown. The native backend draws all 16 at once, and draws each of them only once. Using plot() instead of bar() doesn't really make a difference; the same slowdown happens there with the agg backends.

In principle, it should be possible to avoid these redraws with the agg and other backends, but it depends on how much of the underlying event loop is exposed by Tkinter/gtk/wx. Basically, instead of calling figManager.show() from draw_if_interactive(), we'd have to call it from inside the Tkinter/gtk/wx event loop just before the event loop starts waiting for events. However, it depends on whether the functionality to insert calls into the event loop is available on Tkinter/gtk/wx.

> Since the new macosx backend was released in 0.98.5, I also
> need to decide whether this patch belongs on the branch, and hence
> will get pushed out as early as today in a bugfix release when some
> changes JJ and Michael are working on are ready, or the trunk, in
> which case it could be months.
> .... I'm in favor of branch, ...

Me too. :-).

--Michiel
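The progressive slowdown described for the 4x4 example can be put into numbers with a small counting model. This is a back-of-the-envelope sketch, not a measurement: it does not call matplotlib, the function names are made up, and it assumes exactly one full-figure redraw per added subplot (in interactive mode each of `bar`, `xlim`, and `ylim` may trigger further redraws, so the real count could be higher).

```python
# Counting model only; it does not call matplotlib.
# Assumption: one full-figure redraw happens each time a subplot is added.


def draws_with_redraw_per_subplot(n_subplots):
    """Total subplot renders if the whole figure is redrawn after each add."""
    total = 0
    for existing in range(1, n_subplots + 1):
        total += existing  # every subplot that exists so far is redrawn
    return total


def draws_with_single_redraw(n_subplots):
    """Total subplot renders if the figure is drawn once at the end."""
    return n_subplots


for n in (2, 3, 4):
    k = n * n
    print(n, draws_with_redraw_per_subplot(k), draws_with_single_redraw(k))
```

Under that assumption the redraw-per-subplot strategy grows as k(k+1)/2 in the number of subplots k, while a single deferred draw grows as k, which is at least consistent in shape with the widening gap in the reported timings.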
On Fri, Dec 19, 2008 at 2:20 AM, Michiel de Hoon <mjl...@ya...> wrote: >> This looks great -- in particular I am intrigued by the >> final timing results which show your backend 12 times >> faster than tkagg. I am not sure where this speedup is >> coming from -- do you have some ideas? > > In this example, I am drawing 16 subplots in a 4x4 grid. With Tkagg, I am noticing that the first few subplots appear quickly, but subsequent subplots get slower and slower. I think that this is due to how the event loop works. In my understanding, tkagg redraws the window when a subplot is added. So to draw subplot 16, tkagg also needs to redraw subplots 1..15, causing the progressive slowdown. The native backend draws all 16 at once, and draws each of them only once. Using plot() instead of bar() doesn't really make a difference; the same slowdown happens there with the agg backends. Could you post the script you are using to do the profiling? The call to subplot/plot/bar should not trigger a draw, unless "interactive" is set to True http://matplotlib.sourceforge.net/users/shell.html Interactive is not the best word, but it is the rc parameter meaning "you are using mpl from the interactive prompt and want every pyplot command to update the plot". If the macosx backend is not doing this it should. If tkagg is issuing draw commands on pyplot commands when interactive is False, it is a bug that we should be able to fix. Thanks, JDH
> Could you post the script you are using to do the
> profiling?

This is the code that I was using:

from pylab import *
import numpy
figure()
x=numpy.arange(20)
y=1+x**2
n = 4
for i in range(n*n):
    subplot(n,n,i+1)
    bar(x,y,log=True)
    xlim(-5,25)
    ylim(1,1e4)

> The call to subplot/plot/bar should not trigger a draw, unless
> "interactive" is set to True

I was doing the profiling with "interactive" set to True (both for the Agg backends and for the Mac OS X native backend). With "interactive" set to False, I don't see any significant speed difference between Agg and the native backend.

> Interactive is not the best word, but it is the rc
> parameter meaning "you are using mpl from the interactive
> prompt and want every pyplot command to update the plot".
> If the macosx backend is not doing this it should.

In its current form, the MacOSX backend assumes that mpl is being used from the interactive prompt and the plot is updated whenever there are no further Python commands (in other words, when Python is waiting for the user to type in the next Python command). Maybe this is a naive question, but why would a user want every pyplot command to update the plot?

--Michiel.
On Dec 19, 2008, at 7:52 AM, John Hunter wrote: > Could you post the script you are using to do the profiling? The call > to subplot/plot/bar should not trigger a draw, unless "interactive" is > set to True > > http://matplotlib.sourceforge.net/users/shell.html > > Interactive is not the best word, but it is the rc parameter meaning > "you are using mpl from the interactive prompt and want every pyplot > command to update the plot". If the macosx backend is not doing this > it should. If tkagg is issuing draw commands on pyplot commands when > interactive is False, it is a bug that we should be able to fix. The interactive backends (wx, tk, gtk) all handle draw_idle in a way which delays the drawing until there are no more commands to run. By changing draw_if_interactive to use draw_idle instead of draw, wouldn't this automatically smooth over the performance issues without the user having to toggle interactive in their scripts? - Paul
> The interactive backends (wx, tk, gtk) all handle draw_idle > in a way which delays the drawing until there are no > more commands to run. > > By changing draw_if_interactive to use draw_idle instead of > draw, > wouldn't this automatically smooth over the performance > issues without > the user having to toggle interactive in their scripts? I just tried this approach with the tkagg and gtk backends, and indeed, this does solve the performance issue in interactive mode. --Michiel.
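The effect of switching draw_if_interactive from draw to draw_idle can be sketched with a toy canvas. Everything here is an assumption made for illustration: `ToyCanvas` and `run_idle_handlers` are hypothetical names, and real toolkits schedule the idle callback through their own event loops rather than through an explicit method call.

```python
# Toy canvas showing how deferring draws coalesces many requests into one.
# ToyCanvas and run_idle_handlers are hypothetical; this is not backend code.


class ToyCanvas:
    def __init__(self):
        self.draw_count = 0
        self._draw_pending = False

    def draw(self):
        """Render immediately (the expensive operation)."""
        self.draw_count += 1

    def draw_idle(self):
        """Request a render for the next time the event loop is idle."""
        self._draw_pending = True

    def run_idle_handlers(self):
        """Stand-in for the GUI event loop running out of work."""
        if self._draw_pending:
            self._draw_pending = False
            self.draw()


eager = ToyCanvas()
lazy = ToyCanvas()

# 16 subplots with 4 pyplot commands each (subplot, bar, xlim, ylim).
for _ in range(16 * 4):
    eager.draw()      # every command renders right away
    lazy.draw_idle()  # every command only marks the canvas as stale

lazy.run_idle_handlers()  # no more commands, so the loop goes idle

print(eager.draw_count, lazy.draw_count)
```

The eager canvas renders 64 times and the lazy one renders once, which matches the observation that the change removes the interactive-mode slowdown without the user having to toggle interactive.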
On Fri, Dec 19, 2008 at 9:07 AM, Paul Kienzle <pau...@ni...> wrote: >> Interactive is not the best word, but it is the rc parameter meaning >> "you are using mpl from the interactive prompt and want every pyplot >> command to update the plot". If the macosx backend is not doing this >> it should. If tkagg is issuing draw commands on pyplot commands when >> interactive is False, it is a bug that we should be able to fix. > > The interactive backends (wx, tk, gtk) all handle draw_idle in a way > which delays the drawing until there are no more commands to run. This seems like a reasonable change and I have added it to the trunk. It would be nice to get a canvas.draw_idle on the qt backend, so perhaps Darren you can add this to your list if you get some free time. JDH
> This seems like a reasonable change and I have added it to > the trunk. > It would be nice to get a canvas.draw_idle on the qt > backend, so > perhaps Darren you can add this to your list if you get > some free > time. > I have written such a function for the qt4 backend; see patch #2468809 at https://sourceforge.net/tracker/index.php?func=detail&aid=2468809&group_id=80706&atid=560722 I am not a big qt4 user, so it would be good if somebody else could look at this patch before adding it to the trunk. --Michiel.
On Fri, Dec 26, 2008 at 8:40 AM, Michiel de Hoon <mjl...@ya...> wrote: > I have written such a function for the qt4 backend; see patch #2468809 at > > https://sourceforge.net/tracker/index.php?func=detail&aid=2468809&group_id=80706&atid=560722 > > I am not a big qt4 user, so it would be good if somebody else could look at this patch before adding it to the trunk. I would like to apply this patch, but I am not a qt user either, so if someone could test this and get back to us, that would be great. JDH
On Tue, Dec 30, 2008 at 10:57 AM, John Hunter <jd...@gm...> wrote: > On Fri, Dec 26, 2008 at 8:40 AM, Michiel de Hoon <mjl...@ya...> wrote: > >> I have written such a function for the qt4 backend; see patch #2468809 at >> >> https://sourceforge.net/tracker/index.php?func=detail&aid=2468809&group_id=80706&atid=560722 >> >> I am not a big qt4 user, so it would be good if somebody else could look at this patch before adding it to the trunk. > > I would like to apply this patch, but I am not a qt user either, so if > someone could test this and get back to us, that would be great. Never had any luck getting a tester, so I went ahead and committed this to the trunk. I should probably get a working qt backend for testing on one of the machines I use.... JDH
On Sat, Jan 10, 2009 at 3:09 PM, John Hunter <jd...@gm...> wrote:
> On Tue, Dec 30, 2008 at 10:57 AM, John Hunter <jd...@gm...> wrote:
> > On Fri, Dec 26, 2008 at 8:40 AM, Michiel de Hoon <mjl...@ya...> wrote:
> >> I have written such a function for the qt4 backend; see patch #2468809 at
> >>
> >> https://sourceforge.net/tracker/index.php?func=detail&aid=2468809&group_id=80706&atid=560722
> >>
> >> I am not a big qt4 user, so it would be good if somebody else could look
> >> at this patch before adding it to the trunk.
> >
> > I would like to apply this patch, but I am not a qt user either, so if
> > someone could test this and get back to us, that would be great.
>
> Never had any luck getting a tester, so I went ahead and committed
> this to the trunk. I should probably get a working qt backend for
> testing on one of the machines I use....

I'm sorry John, I didn't see your original request for testing.

I tried running the following with interactive on and off:

from pylab import *
import numpy
figure()
x=numpy.arange(20)
y=1+x**2
n = 4
for i in range(n*n):
    subplot(n,n,i+1)
    bar(x,y,log=True)
    xlim(-5,25)
    ylim(1,1e4)

I didn't notice any significant difference in speed in the two modes with the qt4agg backend, and the figures looked fine. Is there anything else I should be looking for?

Darren
On Sat, Jan 10, 2009 at 2:49 PM, Darren Dale <dsd...@gm...> wrote:
>> Never had any luck getting a tester, so I went ahead and committed
>> this to the trunk. I should probably get a working qt backend for
>> testing on one of the machines I use....
>
> I'm sorry John, I didn't see your original request for testing.
>
> I tried running the following with interactive on and off:
> I didn't notice any significant difference in speed in the two modes with
> the qt4agg backend, and the figures looked fine. Is there anything else I
> should be looking for?

No, that should do it -- thanks for taking a look.

JDH
John, Sometime in January, we are going to spend some time fixing a few minor MPL bugs we've hit and a probably work on a few enhancements (I'll send you a list in Jan before we start anything - it's nothing major). We're also going to work on writing a set of tests that try various plots w/ units. I was thinking this would be a good time to introduce a standard test harness into the MPL CM tree. I think we should: 1) Select a standard test harness. The two big hitters seem to be unittest and nose. unittest has the advantage that it's shipped w/ Python. nose seems to do better with automatic discovery of test cases. 2) Establish a set of testing requirements. Naming conventions, usage conventions, etc. Things like tests should never print anything to the screen (i.e. correct behavior is encoded in the test case) or rely on a GUI unless that's what is being tested (allows tests to be run w/o an X-server). Basically write some documentation for the test system that includes how to use it and what's required of people when they add tests. 3) Write a test 'template' for people to use. This would define a test case and put TODO statements or something like it in place for people to fill in. More than one might be good for various classes of tests (maybe an image comparison template for testing agg drawing and a non-plot template for testing basic computations like transforms?). Some things we do on my project for our Python test systems: We put all unit tests in a 'test' directory inside the python package being tested. The disadvantage of this is that potentially large tests are inside the code to be delivered (though a nice delivery script can easily strip them out). The advantage of this is that it makes coverage checking easier. You can run the test case for a package and then check the coverage in the module w/o trying to figure out which things should be coverage checked or not. 
If you put the test cases in a different directory tree, then it's much harder to identify coverage sources. Though in our case we have hundreds of python modules - in MPL's case, there is really just MPL, projections, backends, and numerix so maybe that's not too much of a problem.

Automatic coverage isn't a must-have, but it is really nice. I've found that it actually causes developers to write more tests because they can run the coverage and get a "score" that other people will see. It's also a good way to check a new submission to see if the developer has done basic testing of the code.

For our tests, we require that the test never print anything to the screen, clean up any of its output files (i.e. leave the directory in the same state it was before), and only report that the test passed or failed and if it failed, add some error message. The key thing is that the conditions for correctness are encoded into the test itself. We have a command line option that gets passed to the test cases to say "don't clean up" so that you can examine the output from a failing test case w/o modifying the test code. This option is really useful when an image comparison fails.

We've wrapped the basic python unittest package. It's pretty simple and reasonably powerful. I doubt there is anything MPL would be doing that it can't handle. The auto-discovery of nose is nice but unnecessary in my opinion. As long as people follow a standard way of doing things, auto-discovery is fairly easy. Of course if you prefer nose and don't mind the additional tool requirement, that's fine too.

Some things that are probably needed:

- command line executable that runs the tests.
- support flags for running only some tests
- support flags for running only tests that don't need a GUI backend (require Agg?). This allows automated testing and visual testing to be combined.
  GUI tests could be placed in identified directories and then only run when requested since by their nature they require specific backends and user interaction.
- nice report on test pass/fail status
- hooks to add coverage checking and reporting in the future
- test utilities
  - image comparison tools
  - ??? basically anything that helps w/ testing and could be common across test cases

As a first cut, I would suggest something like this:

.../test/run.py
         mplTest/
         test_unit/
         test_transform/
         test_...

The run script would execute all/some of the tests. Any common test code would be put in the mplTest directory. Any directory named 'test_XXX' is for test cases where 'XXX' is some category name that can be used in the run script to run a subset of cases. Inside each test_XXX directory, one unittest class per file. The run script would find the .py files in the test_XXX directories, import them, find all the unittest classes, and run them. The run script also sets up sys.path so that the mplTest package is available.

Links:
http://docs.python.org/library/unittest.html
http://somethingaboutorange.com/mrl/projects/nose/
http://kbyanc.blogspot.com/2007/06/pythons-unittest-module-aint-that-bad.html

coverage checking:
http://nedbatchelder.com/code/modules/coverage.html
http://darcs.idyll.org/~t/projects/figleaf/doc/

Thoughts?
Ted

ps: looking at the current unit directory, it looks like at least one test (nose_tests) is using nose even though it's not supplied w/ MPL. Most of the tests do something and show a plot but the correct behavior is never written into the test.
On Mon, Dec 22, 2008 at 11:45 AM, Drain, Theodore R <the...@jp...> wrote:
> John,
> Sometime in January, we are going to spend some time fixing a few minor MPL bugs we've hit and probably work on a few enhancements (I'll send you a list in Jan before we start anything - it's nothing major). We're also going to work on writing a set of tests that try various plots w/ units. I was thinking this would be a good time to introduce a standard test harness into the MPL CM tree.

Hey Ted -- Sorry I haven't gotten back to you yet. These proposals sound good. I have only very limited experience with unit testing and you have tons, so I don't have a lot to add to what you've already written, but I have a few inline comments below.

> I think we should:
>
> 1) Select a standard test harness. The two big hitters seem to be unittest and nose. unittest has the advantage that it's shipped w/ Python. nose seems to do better with automatic discovery of test cases.

I prefer nose. I've used both a bit and find nose much more intuitive and easy to use. The fact that ipython, numpy, and scipy are all using nose makes the choice fairly compelling, especially if some of your image-specific tests could be ported w/o too much headache.

> 2) Establish a set of testing requirements. Naming conventions, usage conventions, etc. Things like tests should never print anything to the screen (i.e. correct behavior is encoded in the test case) or rely on a GUI unless that's what is being tested (allows tests to be run w/o an X-server). Basically write some documentation for the test system that includes how to use it and what's required of people when they add tests.
>
> 3) Write a test 'template' for people to use. This would define a test case and put TODO statements or something like it in place for people to fill in.
> More than one might be good for various classes of tests (maybe an image comparison template for testing agg drawing and a non-plot template for testing basic computations like transforms?).
>
> Some things we do on my project for our Python test systems:
>
> We put all unit tests in a 'test' directory inside the python package being tested. The disadvantage of this is that potentially large tests are inside the code to be delivered (though a nice delivery script can easily strip them out). The advantage of this is that it makes coverage checking easier. You can run the test case for a package and then check the coverage in the module w/o trying to figure out which things should be coverage checked or not. If you put the test cases in a different directory tree, then it's much harder to identify coverage sources. Though in our case we have hundreds of python modules - in MPL's case, there is really just MPL, projections, backends, and numerix so maybe that's not too much of a problem.
>
> Automatic coverage isn't a must-have, but it is really nice. I've found that it actually causes developers to write more tests because they can run the coverage and get a "score" that other people will see. It's also a good way to check a new submission to see if the developer has done basic testing of the code.

All of the above sounds reasonable and I don't have strong opinions on any of it, so I will defer to those who write the initial framework and tests.

> For our tests, we require that the test never print anything to the screen, clean up any of its output files (i.e. leave the directory in the same state it was before), and only report that the test passed or failed and if it failed, add some error message. The key thing is that the conditions for correctness are encoded into the test itself.
> We have a command line option that gets passed to the test cases to say "don't clean up" so that you can examine the output from a failing test case w/o modifying the test code. This option is really useful when an image comparison fails.
>
> We've wrapped the basic python unittest package. It's pretty simple and reasonably powerful. I doubt there is anything MPL would be doing that it can't handle. The auto-discovery of nose is nice but unnecessary in my opinion. As long as people follow a standard way of doing things, auto-discovery is fairly easy. Of course if you prefer nose and don't mind the additional tool requirement, that's fine too. Some things that are probably needed:
>
> - command line executable that runs the tests.
> - support flags for running only some tests
> - support flags for running only tests that don't need a GUI backend
>   (require Agg?). This allows automated testing and visual testing to be
>   combined. GUI tests could be placed in identified directories and then
>   only run when requested since by their nature they require specific backends
>   and user interaction.
> - nice report on test pass/fail status
> - hooks to add coverage checking and reporting in the future
> - test utilities
>   - image comparison tools
>   - ??? basically anything that helps w/ testing and could be common across
>     test cases
>
> As a first cut, I would suggest something like this:
>
> .../test/run.py
>          mplTest/
>          test_unit/
>          test_transform/
>          test_...
>
> The run script would execute all/some of the tests. Any common test code would be put in the mplTest directory. Any directory named 'test_XXX' is for test cases where 'XXX' is some category name that can be used in the run script to run a subset of cases. Inside each test_XXX directory, one unittest class per file. The run script would find the .py files in the test_XXX directories, import them, find all the unittest classes, and run them.
> The run script also sets up sys.path so that the mplTest package is available.
>
> Links:
> http://docs.python.org/library/unittest.html
> http://somethingaboutorange.com/mrl/projects/nose/
> http://kbyanc.blogspot.com/2007/06/pythons-unittest-module-aint-that-bad.html
>
> coverage checking:
> http://nedbatchelder.com/code/modules/coverage.html
> http://darcs.idyll.org/~t/projects/figleaf/doc/
>
> Thoughts?
> Ted
>
> ps: looking at the current unit directory, it looks like at least one test (nose_tests) is using nose even though it's not supplied w/ MPL. Most of the tests do something and show a plot but the correct behavior is never written into the test.

My fault -- I wrote some tests to make sure all the different kwargs variants were processed properly, but since we did not have a "correctness of output" framework in place, punted on that part. I think having coverage of the myriad ways of setting properties is of some value.

On the issue of units (not unit testing but unit support, which is motivating your writing of unit tests) I think we may need a new approach. The current approach is to put unitized data into the artists, and update the converted data at the artist layer. I don't know that this is the proper design. For this approach to work, every scalar and array quantity must support units at the artist layer, and all the calculations that are done at the plotting layer (e.g. error bar) to set up these artists must be careful to preserve unitized data throughout. So it is burdensome on the artist layer and on the plotting function layer.

The problem is compounded because most of the other developers are not really aware of how to use the units interface, which I take responsibility for because they have oft asked for a design document, which I have yet to provide because I am unhappy with the design. So new code tends to break functions that once had unit support. Which is why we need unit tests ....
I think everything might be easier if mpl had an intermediate class layer PlotItem for plot types, e.g. XYPlot, BarChart, ErrorBar as we already do for Legend. The plotting functions would instantiate these objects with the input arguments and track unit data through the reference to the axis. These plot objects would contain all the artist primitives which would store their data in native floating point, which would remove the burden on the artists from handling units and put it all in the plot creation/update logic. The objects would store references to all of the original inputs, and would update the primitive artists on unit changes. The basic problem is that the unitized data must live somewhere, and I am not sure that the low level primitive artists are the best place for that -- it may be a better idea to keep this data at the level of a PlotItem and let the primitive artists handle already converted floating point data. This is analogous to the current approach of passing transformed data to the backends to make it easier to write new backends. I need to chew on this some more.

But this question aside, by all means fire away on creating the unit tests.

JDH
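
To make the PlotItem idea above a little more concrete, here is a minimal sketch of how such a layer might keep the original unitized inputs while the primitive artist only ever sees plain floats. Everything in it is an assumption made for illustration: `LineArtist`, `XYPlot`, `TO_METRES` and `convert` are hypothetical names, none of this is existing matplotlib API, and the unit change is pushed directly to the plot object rather than tracked through a reference to the axis as the proposal suggests.

```python
class LineArtist:
    """Stand-in for a primitive artist: stores plain floats only."""

    def __init__(self):
        self.xdata = []
        self.ydata = []

    def set_data(self, x, y):
        self.xdata = [float(v) for v in x]
        self.ydata = [float(v) for v in y]


# Hypothetical conversion table: factor from each length unit to metres.
TO_METRES = {"m": 1.0, "cm": 0.01, "km": 1000.0}


def convert(values, from_unit, to_unit):
    """Convert a sequence of numbers between two known length units."""
    factor = TO_METRES[from_unit] / TO_METRES[to_unit]
    return [v * factor for v in values]


class XYPlot:
    """Hypothetical PlotItem: owns the original inputs and their unit,
    and refreshes its primitive artist whenever the display unit changes."""

    def __init__(self, x, y, y_unit, display_unit):
        self._x = list(x)          # original inputs are kept untouched
        self._y = list(y)
        self._y_unit = y_unit
        self.line = LineArtist()   # primitive artist, floats only
        self.set_display_unit(display_unit)

    def set_display_unit(self, unit):
        """Re-convert from the stored originals and update the artist."""
        self.display_unit = unit
        self.line.set_data(self._x, convert(self._y, self._y_unit, unit))
```

For example, `XYPlot([0, 1], [1.0, 2.0], "km", "m")` hands the artist y-values in metres, and a later `set_display_unit("km")` recomputes them from the stored originals instead of from the already converted floats, so repeated unit changes do not accumulate rounding error.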
On Tue, Jan 6, 2009 at 10:02 AM, John Hunter <jd...@gm...> wrote:
> On Mon, Dec 22, 2008 at 11:45 AM, Drain, Theodore R <the...@jp...> wrote:
> > 1) Select a standard test harness. The two big hitters seem to be unittest and nose. unittest has the advantage that it's shipped w/ Python. nose seems to do better with automatic discovery of test cases.
>
> I prefer nose. I've used both a bit and find nose much more intuitive and easy to use. The fact that ipython, numpy, and scipy are all using nose makes the choice fairly compelling, especially if some of your image specific tests could be ported w/o too much headache.

I also prefer to use nose. nose can still be used to discover tests written with unittest.

> On the issue of units (not unit testing but unit support which is motivating your writing of unit test) I think we may need a new approach. The current approach is to put unitized data into the artists, and update the converted data at the artist layer. I don't know that this is the proper design. For this approach to work, every scalar and array quantity must support units at the artist layer, and all the calculations that are done at the plotting layer (eg error bar) to setup these artists must be careful to preserve unitized data throughout. So it is burdensome on the artist layer and on the plotting function layer.
>
> The problem is compounded because most of the other developers are not really aware of how to use the units interface, which I take responsibility for because they have oft asked for a design document, which I have yet to provide because I am unhappy with the design. So new code tends to break functions that once had unit support. Which is why we need unit tests ....

I'm not ready to make a proper announcement yet, but I spent most of my free time over the break continuing development on my Quantities package, which supports units and unit transformations, along with some new support for error propagation that I just put together yesterday. There is some information at packages.python.org/quantities, if anyone is interested. Probably the best place to start is the tutorial in the documentation. I'm planning on requesting comments and feedback from scipy-dev or numpy-discussion after a little more development and optimization. The basic interface and functionality are in place, though. If you guys have a chance to try out the package and provide some early feedback, I would really appreciate it.
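
Since nose can collect plain unittest.TestCase classes, a test template written against unittest would work under either harness. The sketch below illustrates the requirements discussed in this thread: nothing is printed, output files are cleaned up, correctness is encoded as assertions, and a switch keeps the output for inspection. The thread only says that a "don't clean up" command line option exists; the environment variable `MPLTEST_KEEP_OUTPUT` and the class name `TemplateTest` are assumptions made for this example.

```python
import os
import shutil
import tempfile
import unittest

# Hypothetical "don't clean up" switch. The thread describes a command
# line option; an environment variable is used here only for brevity.
KEEP_OUTPUT = os.environ.get("MPLTEST_KEEP_OUTPUT") == "1"


class TemplateTest(unittest.TestCase):
    """Template: copy this class and fill in the TODO items."""

    def setUp(self):
        # Every test writes into its own scratch directory, so the
        # source tree is left in the state it was in before the run.
        self.workdir = tempfile.mkdtemp(prefix="mpltest_")
        if not KEEP_OUTPUT:
            self.addCleanup(shutil.rmtree, self.workdir, ignore_errors=True)

    def test_todo(self):
        # TODO: replace this placeholder with the computation under test.
        path = os.path.join(self.workdir, "result.txt")
        with open(path, "w") as f:
            f.write("42")
        # Correct behaviour is encoded in the assertion rather than
        # printed; the message is only reported when the check fails.
        with open(path) as f:
            self.assertEqual(f.read(), "42", "unexpected file contents")
```

An image comparison template could follow the same shape, writing the rendered figure into `self.workdir` and asserting on its difference from a stored baseline.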