matplotlib-devel

From: vehemental <jim...@gm...> - 2009-06-17 14:05:53
Hello,
I'm using matplotlib for various tasks and it works beautifully, but on some
occasions I have to visualize large datasets (in the range of 10M data points)
using imshow or regular plots, and the system starts to choke a bit at that point.
I would like to be consistent and not use different tools for
basically similar tasks,
so I'd like some pointers regarding rendering performance. I would be
interested in getting involved in development if there is something to be done.
To active developers: what's the general feeling, does matplotlib have room to
spare in its rendering performance,
or is it pretty much tied to the speed of Agg right now?
Is there something to gain from using the multiprocessing module now
included by default in Python 2.6?
Or even from going as far as using something like pyGPU for fast vectorized
computations?
I've seen previous discussions about OpenGL becoming a backend at some point
in the future.
Would it really stand up compared to the current backends? Are there any clues
about that right now?
Thanks for any input! :D
Bye
-- 
View this message in context: http://www.nabble.com/Large-datasets-performance....-tp24074329p24074329.html
Sent from the matplotlib - devel mailing list archive at Nabble.com.
From: Nicolas R. <Nic...@lo...> - 2009-06-17 14:26:09
Hello,
To give you some hints on performance using OpenGL, you can have a look
at glumpy: http://www.loria.fr/~rougier/tmp/glumpy.tgz
(It requires pyglet for the OpenGL backend.)
It is not yet finished but it is usable. The current version can
display static numpy float32 arrays up to 8000x8000 and dynamic numpy
float32 arrays around 500x500, depending on GPU hardware (dynamic means
that you update the image at around 30 fps).
The idea behind glumpy is to translate a numpy array directly into a
texture and to use shaders to do the colormap transformation and
filtering (nearest, bilinear or bicubic).
Nicolas
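To make the colormap step concrete, here is a minimal CPU-side sketch in numpy of the lookup that a shader would perform per pixel. It is not glumpy's code: the function name apply_colormap, the grayscale table and the nearest-entry lookup are assumptions chosen for illustration only.

```python
import numpy as np

def apply_colormap(Z, lut, vmin=None, vmax=None):
    """Map a 2-D float array to colors through a lookup table.

    Hypothetical helper: normalizes Z to [0, 1], then picks the
    nearest entry of `lut` (shape (N, 3)) for every element.
    """
    Z = np.asarray(Z, dtype=np.float32)
    vmin = float(Z.min()) if vmin is None else vmin
    vmax = float(Z.max()) if vmax is None else vmax
    span = vmax - vmin
    if span == 0:
        # Constant image: everything maps to the first table entry.
        norm = np.zeros_like(Z)
    else:
        norm = np.clip((Z - vmin) / span, 0.0, 1.0)
    idx = np.rint(norm * (len(lut) - 1)).astype(np.intp)
    return lut[idx]

# Assumed 256-entry grayscale table; a real colormap would differ.
lut = np.repeat(np.arange(256, dtype=np.uint8)[:, None], 3, axis=1)
Z = np.random.rand(500, 500).astype(np.float32)
rgb = apply_colormap(Z, lut)
print(rgb.shape, rgb.dtype)
```

On a GPU the same lookup runs in the fragment shader against the uploaded texture, so only the raw float32 array has to cross the bus.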
From: Michael D. <md...@st...> - 2009-06-17 14:34:13
vehemental wrote:
> Hello,
>
> I'm using matplotlib for various tasks beautifully...but on some occasions,
> I have to visualize large datasets (in the range of 10M data points) (using
> imshow or regular plots)...system start to choke a bit at that point...
> 
The first thing I would check is whether your system becomes starved for 
memory at this point and virtual memory swapping kicks in.
A common technique for faster plotting of image data is to downsample it 
before passing it to matplotlib. Same with line plots -- they can be 
decimated. There is newer/faster path simplification code in SVN trunk 
that may help with complex line plots (when the path.simplify rcParam is 
True). I would suggest starting with that as a baseline to see how much 
performance it already gives over the released version.
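A minimal sketch of the two reductions mentioned above, done in numpy before the data reaches matplotlib. The helper names downsample_image and decimate_minmax are hypothetical, and block averaging and a per-bucket min/max envelope are only one possible choice of method each.

```python
import numpy as np

def downsample_image(Z, factor):
    """Block-average a 2-D array by an integer factor on both axes."""
    rows = (Z.shape[0] // factor) * factor
    cols = (Z.shape[1] // factor) * factor
    Z = Z[:rows, :cols]  # drop the ragged edge, if any
    blocks = Z.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def decimate_minmax(y, n_buckets):
    """Keep the min and max of each bucket so spikes survive decimation."""
    y = np.asarray(y)
    if n_buckets < 1 or n_buckets > len(y):
        raise ValueError("n_buckets must be between 1 and len(y)")
    size = len(y) // n_buckets
    y = y[:size * n_buckets].reshape(n_buckets, size)
    out = np.empty(2 * n_buckets, dtype=y.dtype)
    out[0::2] = y.min(axis=1)  # lower envelope
    out[1::2] = y.max(axis=1)  # upper envelope
    return out

small = downsample_image(np.arange(16.0).reshape(4, 4), 2)
envelope = decimate_minmax(np.array([1, 5, 2, 0, 9, 3]), 2)
print(small)
print(envelope)
```

The min/max variant emits each bucket's minimum before its maximum regardless of their order in time, which is usually good enough for an overview plot but not for precise inspection.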
> I would like to be consistent somehow and not use different tools for
> basically similar tasks...
> so I'd like some pointers regarding rendering performance...as I would be
> interested to be involved in dev is there is something to be done....
>
> To active developers, what's the general feel does matplotlib have room to
> spare in its rendering performance?...
> 
I've spent a lot of time optimizing the Agg backend (which is already 
one of the fastest software-only approaches out there), and I'm out of 
obvious ideas. But a fresh set of eyes may find new things. An 
advantage of Agg that shouldn't be overlooked is that it works 
identically everywhere.
> or is it pretty tied down to the speed of Agg right now?
> Is there something to gain from using the multiprocessing module now
> included by default in 2.6?
> 
Probably not. If the work of rendering were to be divided among cores, 
that would probably be done at the C++ level anyway to see any gains. 
As it is, the problem with plotting many points generally tends to be 
limited by memory bandwidth anyway, not processor speed.
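As a rough illustration of the data volumes involved, the raw size of the arrays discussed in this thread can be computed directly. This is plain arithmetic, not a measurement of matplotlib's actual memory use; the helper name array_size_mib is hypothetical.

```python
import numpy as np

def array_size_mib(shape, dtype):
    """Raw size of an array of the given shape and dtype, in MiB."""
    count = 1
    for n in shape:
        count *= n
    return count * np.dtype(dtype).itemsize / 2.0**20

# 10M data points stored as float64 (one coordinate array)
print(round(array_size_mib((10000000,), "float64"), 2))
# An 8000x8000 float32 image
print(round(array_size_mib((8000, 8000), "float32"), 2))
```

Intermediate copies made while transforming and rendering the data come on top of these figures.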
> or even go as far as using something like pyGPU for fast vectorized
> computations...?
> 
Perhaps. But again, the computation isn't the bottleneck -- it's 
usually a memory bandwidth starvation issue in my experience. Using a 
GPU may only make matters worse. Note that I consider that approach 
distinct from just using OpenGL to colormap and render the image as a 
texture. That approach may bear some fruit -- but only for image 
plots. Vector graphics acceleration with GPUs is still difficult to do 
in high quality across platforms and chipsets while still beating software for speed.
> I've seen around previous discussions about OpenGL being a backend in some
> future...
> 
> would it really stand up compared to the current backends? is there clues
> about that right now?
>
> thanks for any inputs! :D
> bye
> 
Hope this helps,
Mike
-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA
From: Jimmy P. <jim...@gm...> - 2009-06-17 14:56:13
On 2009-06-17, Michael Droettboom <md...@st...> wrote:
> vehemental wrote:
>
>> Hello,
>>
>> I'm using matplotlib for various tasks beautifully...but on some
>> occasions,
>> I have to visualize large datasets (in the range of 10M data points)
>> (using
>> imshow or regular plots)...system start to choke a bit at that point...
>>
>>
> The first thing I would check is whether your system becomes starved for
> memory at this point and virtual memory swapping kicks in.
The Python process is sitting at around 300 MB of memory consumption, so there
should be plenty of memory left,
but I will look more closely at what's happening.
I would assume the memory bandwidth is not very high, given the cheapness
of the computer I'm using :D
>
>
> A common technique for faster plotting of image data is to downsample it
> before passing it to matplotlib. Same with line plots -- they can be
> decimated. There is newer/faster path simplification code in SVN trunk that
> may help with complex line plots (when the path.simplify rcParam is True).
> I would suggest starting with that as a baseline to see how much
> performance it already gives over the released version.
Yes, that totally makes sense: no need to visualize 3 million points if you can
only display 200,000.
I'm already doing that to some extent, but it takes time on its own. At
least I have ways to reduce this time if needed.
I'll try the SVN version and see if I can extract some improvements.
>
> I would like to be consistent somehow and not use different tools for
>> basically similar tasks...
>> so I'd like some pointers regarding rendering performance...as I would be
>> interested to be involved in dev is there is something to be done....
>>
>> To active developers, what's the general feel does matplotlib have room to
>> spare in its rendering performance?...
>>
>>
> I've spent a lot of time optimizing the Agg backend (which is already one
> of the fastest software-only approaches out there), and I'm out of obvious
> ideas. But a fresh set of eyes may find new things. An advantage of Agg
> that shouldn't be overlooked is that is works identically everywhere.
>
>> or is it pretty tied down to the speed of Agg right now?
>> Is there something to gain from using the multiprocessing module now
>> included by default in 2.6?
>>
>>
> Probably not. If the work of rendering were to be divided among cores,
> that would probably be done at the C++ level anyway to see any gains. As it
> is, the problem with plotting many points generally tends to be limited by
> memory bandwidth anyway, not processor speed.
>
>> or even go as far as using something like pyGPU for fast vectorized
>> computations...?
>>
>>
> Perhaps. But again, the computation isn't the bottleneck -- it's usually a
> memory bandwidth starvation issue in my experience. Using a GPU may only
> make matters worse. Note that I consider that approach distinct from just
> using OpenGL to colormap and render the image as a texture. That approach
> may bear some fruit -- but only for image plots. Vector graphics
> acceleration with GPUs is still difficult to do in high quality across
> platforms and chipsets and beat software for speed.
>
So if I hear you correctly, the Matplotlib/Agg combination is not terribly
slower than a C plotting library that also uses Agg to render would be,
and we are talking more about hardware limitations, right?
>
> I've seen around previous discussions about OpenGL being a backend in some
>> future...
>> would it really stand up compared to the current backends? is there clues
>> about that right now?
>>
>
Thanks Nicolas, I'll take a closer look at glumpy.
I can probably gather some info by comparing an imshow to the
equivalent in OpenGL.
>
>> thanks for any inputs! :D
>> bye
>>
>>
> Hope this helps,
it did! thanks
jimmy
>
> Mike
>
> --
> Michael Droettboom
> Science Software Branch
> Operations and Engineering Division
> Space Telescope Science Institute
> Operated by AURA for NASA
>
>
From: Jimmy P. <jim...@gm...> - 2009-06-17 16:07:23
The demo-animation.py worked beautifully out of the box at 150 fps.
I increased the array size a bit to 1200x1200 and it still runs at around 40 fps.
Very interesting...
jimmy
From: Gökhan S. <gok...@gm...> - 2009-06-17 15:10:38
On Wed, Jun 17, 2009 at 9:25 AM, Nicolas Rougier
<Nic...@lo...> wrote:
>
> Hello,
>
> To give you some hints on performances using OpenGL, you can have a look
> at glumpy: http://www.loria.fr/~rougier/tmp/glumpy.tgz
> (It requires pyglet for the OpenGL backend).
>
> It is not yet finished but it is usable. Current version allows to
> visualize static numpy float32 array up to 8000x8000 and dynamic numpy
> float32 array around 500x500 depending on GPU hardware (dynamic means
> that you update image at around 30 fps/second).
>
> The idea behind glumpy is to directly translate a numpy array into a
> texture and to use shaders to make the colormap transformation and
> filtering (nearest, bilinear or bicubic).
>
> Nicolas
Nicolas,
How do you run the demo scripts in glumpy?
I get errors both with IPython's run and with python script_name.py:
In [1]: run demo-simple.py
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/home/gsever/glumpy/demo-simple.py in <module>()
 20 #
 21 #
-----------------------------------------------------------------------------
---> 22 import glumpy
 23 import numpy as np
 24 import pyglet, pyglet.gl as gl
/home/gsever/glumpy/glumpy/__init__.py in <module>()
 23 import colormap
 24 from color import Color
---> 25 from image import Image
 26 from trackball import Trackball
 27 from app import app, proxy
/home/gsever/glumpy/glumpy/image.py in <module>()
 25
 26
---> 27 class Image(object):
 28 ''' '''
 29 def __init__(self, Z, format=None, cmap=colormap.IceAndFire,
vmin=None,
/home/gsever/glumpy/glumpy/image.py in Image()
 119 return self._cmap
 120
--> 121 @cmap.setter
 122 def cmap(self, cmap):
 123 ''' Colormap to be used to represent the array. '''
AttributeError: 'property' object has no attribute 'setter'
WARNING: Failure executing file: <demo-simple.py>
[gsever@ccn glumpy]$ python demo-cube.py
Traceback (most recent call last):
 File "demo-cube.py", line 22, in <module>
 import glumpy
 File "/home/gsever/glumpy/glumpy/__init__.py", line 25, in <module>
 from image import Image
 File "/home/gsever/glumpy/glumpy/image.py", line 27, in <module>
 class Image(object):
 File "/home/gsever/glumpy/glumpy/image.py", line 121, in Image
 @cmap.setter
AttributeError: 'property' object has no attribute 'setter'
I have Python 2.5.2.
From: Nicolas R. <Nic...@lo...> - 2009-06-17 15:29:19
I think the property setter decorator is only available in Python 2.6 and
later. I modified the sources and put them in the same place. It should be OK now.
Nicolas
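For reference, a property with a setter can be written without the @cmap.setter decorator by passing the getter and setter to property() directly, which also works on Python 2.5. The sketch below is an assumption about what such a change could look like, not the actual modification made to glumpy; the class body and the _get_cmap/_set_cmap names are hypothetical.

```python
class Image(object):
    """Stripped-down stand-in for an image class with a cmap property."""

    def __init__(self, cmap):
        self._cmap = cmap

    def _get_cmap(self):
        return self._cmap

    def _set_cmap(self, cmap):
        self._cmap = cmap

    # Equivalent to @property plus @cmap.setter, but valid before 2.6.
    cmap = property(_get_cmap, _set_cmap,
                    doc="Colormap to be used to represent the array.")

img = Image("IceAndFire")
img.cmap = "Grey"
print(img.cmap)
```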