Thread: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-07-31 18:10:37

In some examples, I have been moving example functions and data into a
module, so that they can be run from anywhere. Many other examples
still rely on a relative path in the examples dir. E.g., if I go to the
gallery, download the source for the axes grid toolkit example
simple_rgb.py, and try to run it from my desktop, I get the error "No
module named demo_image". While I know how to get the data, a naive
user will not. So in some examples I have been adopting this approach,
e.g. in examples/pylab_examples/scatter_demo2.py:
 import matplotlib
 datafile = matplotlib.get_example_data('goog.npy')
These examples will run anywhere mpl is installed. Another approach
would be to write a version of get_example_data that checks locally for a
datafile and, if it is not where you expect it to be, attempts a
urlretrieve to a temp file.
The gallery is becoming the go-to place for most users of the website,
and I would like as many examples as possible to run after a simple
download to the desktop. I am sensitive to packagers who may not
want to ship large amounts of data w/ the main library, so we may want
to minimize the amount we ship in mpl-data, which
matplotlib.get_example_data uses, but it may be a good idea to set up a
new svn directory at the top level (mpl_data) and write a urllib-enabled
matplotlib.get_example_data that fetches it from the repo if
it can't find it locally.
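[Editor's sketch] The "check locally, else urlretrieve to a temp file" idea could look roughly like the following in modern Python. This is only an illustration: the parameters local_dir, url_base and retrieve are hypothetical names chosen here, not part of matplotlib, and the function is not the implementation that was later committed.

```python
import os
import tempfile
import urllib.request


def get_example_data(fname, local_dir, url_base,
                     retrieve=urllib.request.urlretrieve):
    """Return a path to fname, preferring a local copy over a download.

    local_dir, url_base and retrieve are assumptions made for this sketch.
    """
    local_path = os.path.join(local_dir, fname)
    if os.path.exists(local_path):
        # The data file is where the example expects it: use it directly.
        return local_path

    # Not found locally: download it into a fresh temporary file.
    fd, tmp_path = tempfile.mkstemp(suffix="_" + os.path.basename(fname))
    os.close(fd)
    retrieve(url_base.rstrip("/") + "/" + fname, tmp_path)
    return tmp_path
```

Passing the download function as an argument keeps the sketch testable without network access; by default it uses urllib.request.urlretrieve.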
JDH

Re: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-07-31 19:35:29

On Fri, Jul 31, 2009 at 1:10 PM, John Hunter<jd...@gm...> wrote:
> The gallery is becoming the go-to place for most users of the website,
> and I would like as many examples as possible to run after a simple
> download to the desktop. I am sensitive to packagers who may not
> want to ship large amounts of data w/ the main library, so we may want
> to minimize the amount we ship in mpl-data, which
> matplotlib.get_example_data uses, but it may be a good idea to set up a
> new svn directory at the top level (mpl_data) and write a urllib-enabled
> matplotlib.get_example_data that fetches it from the repo if
> it can't find it locally.
OK, I committed a first pass at this to HEAD. I created a new svn
directory called mpl_data
 svn co https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/mpl_data
and a cbook.get_mpl_data function, as used in this example::
 import matplotlib.cbook as cbook
 import matplotlib.pyplot as plt
 fname = cbook.get_mpl_data('lena.png', asfileobj=False)
 print('fname: %s' % fname)
 im = plt.imread(fname)
 plt.imshow(im)
 plt.show()
The function will check ~/.matplotlib/mpl_data and fetch it using
urllib from svn HEAD if it is not there, caching in the process. It
would be nice to support an svn revision (w/o relying on svn) as I
note in this comment in get_mpl_data:
 # TODO: how to handle stale data in the cache that has been
 # updated from svn -- is there a clean http way to get the current
 # revision number that will not leave us at the mercy of html
 # changes at sf?
If others agree w/ the basic concept, we should port over as many of the
examples that require data as we can, removing data from examples/data and
lib/matplotlib/mpl-data/example as we go. This will result in smaller
tarballs and binaries, and make the examples more portable.
JDH

Re: [matplotlib-devel] example data in example code

From: Josh H. <jh...@vn...> - 2009-08-04 16:18:01

So, I just downloaded 0.99 rc1 and wanted to play with the axesgrid examples and
got the results you reported below in your example. I am in fact naive, and
it's not clear to me how to get around this problem of the demo_image module
not being found. What is the solution?
Thanks,
Josh
John Hunter-4 wrote:
> 
> In some examples, I have been moving example functions and data into a
> module, so that they can be run from anywhere. Many other examples
> still rely on a relative path in the examples dir. E.g., if I go to the
> gallery, download the source for the axes grid toolkit example
> simple_rgb.py, and try to run it from my desktop, I get the error "No
> module named demo_image". While I know how to get the data, a naive
> user will not. So in some examples I have been adopting this approach,
> e.g. in examples/pylab_examples/scatter_demo2.py:
> 
> import matplotlib
> datafile = matplotlib.get_example_data('goog.npy')
> 
> These examples will run anywhere mpl is installed. Another approach
> would be to write a version of get_example_data that checks locally for a
> datafile and, if it is not where you expect it to be, attempts a
> urlretrieve to a temp file.
> 
> The gallery is becoming the go-to place for most users of the website,
> and I would like as many examples as possible to run after a simple
> download to the desktop. I am sensitive to packagers who may not
> want to ship large amounts of data w/ the main library, so we may want
> to minimize the amount we ship in mpl-data, which
> matplotlib.get_example_data uses, but it may be a good idea to set up a
> new svn directory at the top level (mpl_data) and write a urllib-enabled
> matplotlib.get_example_data that fetches it from the repo if
> it can't find it locally.
> 
> JDH
> 
-----
Josh Hemann
Statistical Advisor 
http://www.vni.com/ Visual Numerics 
jh...@vn... | P 720.407.4214 | F 720.407.4199 

Re: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-08-04 17:06:11

On Tue, Aug 4, 2009 at 11:17 AM, Josh Hemann<jh...@vn...> wrote:
>
> So, I just downloaded 0.99 rc1 and wanted to play with the axesgrid examples and
> got the results you reported below in your example. I am in fact naive, and
> it's not clear to me how to get around this problem of the demo_image module
> not being found. What is the solution?
The solution is to get the examples directory and run it from there,
where it will have the example data. Although I added support for
having auto-fetched data in svn, we haven't ported the examples over
to use it yet. If you have svn you can grab the examples dir with
 svn co https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/matplotlib/examples mpl_examples
and then run the examples in their directory, eg examples/axes_grid.
Then they should be able to see their data.
JDH

Re: [matplotlib-devel] example data in example code

From: Jouni K. S. <jk...@ik...> - 2009-08-04 19:46:06

John Hunter <jd...@gm...> writes:
> # TODO: how to handle stale data in the cache that has been
> # updated from svn -- is there a clean http way to get the current
> # revision number that will not leave us at the mercy of html
> # changes at sf?
The mod_dav_svn server sends an ETag header that happens to contain the
revision number where the file was last modified, and a Last-Modified
header that contains the date of that revision. The clean http way to
make use of these is to make a conditional request - I hacked up a
processor class for urllib2 that does this, and checked it in.
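[Editor's sketch] A conditional request of the kind described above can be illustrated with Python 3's urllib.request (the successor of urllib2). This is not the processor class that was checked in; the function names and the opener parameter are assumptions made for the illustration. The request sends the cached ETag and Last-Modified values back to the server, and a 304 reply means the cached copy is still current.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server skip unchanged content."""
    req = urllib.request.Request(url)
    if etag is not None:
        req.add_header("If-None-Match", etag)
    if last_modified is not None:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_changed(url, etag=None, last_modified=None,
                     opener=urllib.request.urlopen):
    """Return (data, etag, last_modified); data is None if the cache is fresh.

    opener is a hypothetical hook so the function can be tried offline.
    """
    req = build_conditional_request(url, etag, last_modified)
    try:
        with opener(req) as resp:
            return (resp.read(),
                    resp.headers.get("ETag"),
                    resp.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: keep using the cached file and its metadata.
            return None, etag, last_modified
        raise
```

A caller would store the returned ETag and Last-Modified values next to the cached file and pass them in on the next request.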
-- 
Jouni K. Seppänen
http://www.iki.fi/jks

Re: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-08-05 11:52:45

On Tue, Aug 4, 2009 at 2:45 PM, Jouni K. Seppänen<jk...@ik...> wrote:
> The mod_dav_svn server sends an ETag header that happens to contain the
> revision number where the file was last modified, and a Last-Modified
> header that contains the date of that revision. The clean http way to
> make use of these is to make a conditional request - I hacked up a
> processor class for urllib2 that does this, and checked it in.
Wow, that is really clever and cool. Nicely done. I added
mpl_data/testdata.csv, which is easier to modify than lena.png, to test
the revision control, and it worked beautifully
(examples/misc/mpl_data_test.py).
I didn't understand this part of the code:
 fn = rightmost
 while os.path.exists(self.in_cache_dir(fn)):
     fn = rightmost + '.' + str(random.randint(0,9999999))
When would there be a name clash that would require the randint to be appended?
Also, how hard would it be to add support for a directory structure?
I see you are getting the filename from the url as the last thing past
the '/'. Is there any way to generalize this so a relative path could
be supported in the svn repo and local cache dir?
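[Editor's sketch] One way to generalize from "last thing past the '/'" to a relative path is to strip the base URL's path from the file's URL and reuse the remainder under the cache directory. The function below is a hypothetical helper written for illustration (cache_path_for is not a matplotlib name), and it refuses paths that would escape the cache directory.

```python
import posixpath
from pathlib import Path
from urllib.parse import urlsplit


def cache_path_for(url, url_base, cache_dir):
    """Map a URL under url_base to the matching path under cache_dir."""
    base_path = urlsplit(url_base).path.rstrip("/") + "/"
    url_path = urlsplit(url).path
    if not url_path.startswith(base_path):
        raise ValueError("%r is not under %r" % (url, url_base))

    # Relative part of the URL, e.g. 'testdir/subdir/testsub.csv'.
    rel = posixpath.normpath(url_path[len(base_path):])
    if rel in (".", "..") or rel.startswith("../"):
        raise ValueError("%r does not name a file below the base URL" % url)

    return Path(cache_dir).joinpath(*rel.split("/"))
```

Because the relative path is preserved, two files with the same name in different subdirectories no longer collide in the cache.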
JDH

Re: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-08-05 12:11:23

On Tue, Aug 4, 2009 at 2:45 PM, Jouni K. Seppänen<jk...@ik...> wrote:
> John Hunter <jd...@gm...> writes:
>
>>   # TODO: how to handle stale data in the cache that has been
>>   # updated from svn -- is there a clean http way to get the current
>>   # revision number that will not leave us at the mercy of html
>>   # changes at sf?
>
> The mod_dav_svn server sends an ETag header that happens to contain the
> revision number where the file was last modified, and a Last-Modified
> header that contains the date of that revision. The clean http way to
> make use of these is to make a conditional request - I hacked up a
> processor class for urllib2 that does this, and checked it in.
Also, it would be preferable for the returned file object to
support the "seek" method. This is what cbook.to_filehandle checks
for, and what mlab.csv2rec uses to rewind the file after doing a data
introspection pass through to get the data types. E.g.,
>>> import matplotlib.mlab as mlab
>>> import matplotlib.cbook as cbook
>>> r = mlab.csv2rec( cbook.get_mpl_data('testdata.csv') )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jdhunter/dev/lib/python2.6/site-packages/matplotlib/mlab.py", line 2108, in csv2rec
    fh = cbook.to_filehandle(fname)
  File "/Users/jdhunter/dev/lib/python2.6/site-packages/matplotlib/cbook.py", line 339, in to_filehandle
    raise ValueError('fname must be a string or file handle')
ValueError: fname must be a string or file handle
Perhaps we could return a plain file handle pointing to the cached data?
JDH

Re: [matplotlib-devel] example data in example code

From: Ryan M. <rm...@gm...> - 2009-08-05 13:25:04

On Wed, Aug 5, 2009 at 7:11 AM, John Hunter <jd...@gm...> wrote:
> >>> import matplotlib.mlab as mlab
> >>> import matplotlib.cbook as cbook
> >>> r = mlab.csv2rec( cbook.get_mpl_data('testdata.csv') )
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Users/jdhunter/dev/lib/python2.6/site-packages/matplotlib/mlab.py", line 2108, in csv2rec
>     fh = cbook.to_filehandle(fname)
>   File "/Users/jdhunter/dev/lib/python2.6/site-packages/matplotlib/cbook.py", line 339, in to_filehandle
>     raise ValueError('fname must be a string or file handle')
> ValueError: fname must be a string or file handle
>
> Perhaps we could return a plain file handle pointing to the cached data?
Another option is to use StringIO to create a new file-like object after
read()-ing in all the data.
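[Editor's sketch] The StringIO suggestion amounts to reading the whole response and wrapping it in an in-memory buffer. In Python 3 the equivalent lives in the io module; make_seekable below is a hypothetical helper name used only for this illustration.

```python
import io


def make_seekable(fileobj):
    """Copy a read-only stream into an in-memory buffer that supports seek()."""
    data = fileobj.read()
    if isinstance(data, str):
        # Text streams go into StringIO, byte streams into BytesIO.
        return io.StringIO(data)
    return io.BytesIO(data)
```

The trade-off against returning a handle to the cached file is that the whole file is held in memory, which is fine for small sample data but less so for large images.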
Ryan
-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

Re: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-08-05 15:21:39

On Wed, Aug 5, 2009 at 7:11 AM, John Hunter<jd...@gm...> wrote:
> Perhaps we could return a plain file handle pointing to the cached data?
OK, I've made a few changes to the code, so Jouni, you will probably
want to review them:
* I renamed the svn repo and function to be "sample_data" rather than
"mpl_data" to avoid confusion with lib/matplotlib/mpl-data. The svn
repo, the examples and the cbook function have all been renamed. The
repo is ::
 svn co https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/sample_data
 and the examples are::
 johnh@udesktop191:mpl> ls examples/misc/sam*.py
 examples/misc/sample_data_demo.py examples/misc/sample_data_test.py
* I added support for nested subdirs, so you can now do, as in
examples/misc/sample_data_test.py::
 datafile = 'testdir/subdir/testsub.csv'
 fh = cbook.get_sample_data(datafile)
* I commented out the random number appending, because I do not see
the use case, but we can re-add it when you enlighten me :-)
* I always return a file handle to the cached file, so seek works, and
is exercised in examples/misc/sample_data_test.py
It is probably worth doing a little more work to make the processor
plus the "get_sample_data" function all part of one class, so other
people can reuse it with other repos and other dirs. Eg, something
like the following in cbook::
 myserver = ViewVCCacheServer(mycachedir, myurlbase)
 get_sample_data = myserver.get_sample_data
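[Editor's sketch] The proposed class could be laid out roughly as follows. Only the class name, the two constructor arguments and the get_sample_data method come from the message above; the body, the fetch hook and the asfileobj handling are assumptions made for illustration and do not reproduce the code in cbook. Cache freshness checks are left out.

```python
import os
import urllib.request


class ViewVCCacheServer:
    """Serve files from url_base through a local cache directory (sketch)."""

    def __init__(self, cache_dir, url_base, fetch=None):
        self.cache_dir = cache_dir
        self.url_base = url_base.rstrip("/") + "/"
        # fetch is a hypothetical hook: a callable mapping a URL to bytes.
        self._fetch = fetch if fetch is not None else self._download

    @staticmethod
    def _download(url):
        with urllib.request.urlopen(url) as resp:
            return resp.read()

    def get_sample_data(self, fname, asfileobj=True):
        """Return a seekable file handle (or the path) for the cached fname."""
        path = os.path.join(self.cache_dir, *fname.split("/"))
        if not os.path.exists(path):
            # Download first, so a failed fetch leaves no empty cache file.
            data = self._fetch(self.url_base + fname)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as out:
                out.write(data)
        return open(path, "rb") if asfileobj else path
```

Binding `get_sample_data = myserver.get_sample_data` at module level, as suggested above, would then expose one server instance as a plain function.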

Re: [matplotlib-devel] example data in example code

From: Jouni K. S. <jk...@ik...> - 2009-08-09 19:02:10

John Hunter <jd...@gm...> writes:
> * I commented out the random number appending, because I do not see
> the use case, but we can re-add it when you enlighten me :-)
I did that in case someone wanted to retrieve files from several
different locations -- my version of the cache handler was not tied to
any particular base URL. Since all cached files were in one flat
directory, there was the danger of filename collisions. 
> * I added support for nested subdirs, so you can now do, as in
> examples/misc/sample_data_test.py::
>
> datafile = 'testdir/subdir/testsub.csv'
> fh = cbook.get_sample_data(datafile)
I think mirroring a directory structure is somewhat more complicated
than caching a set of arbitrary URLs in a flat cache directory. For
example, I think the remove_stale_files method will need to be changed
to walk all subdirectories, and handling cases such as having a
subdirectory named foo that is replaced by a file named foo could be
complicated.
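[Editor's sketch] A stale-file sweep that walks all subdirectories might look like the following. It is an illustration only: the signature (a cache directory plus a set of known relative paths) is an assumption, not the actual remove_stale_files method, and it does not address the case of a subdirectory being replaced by a file of the same name.

```python
import os


def remove_stale_files(cache_dir, known_files):
    """Delete cached files whose '/'-separated relative path is not known.

    Returns the sorted list of relative paths that were removed.
    """
    removed = []
    # Bottom-up traversal, so emptied subdirectories can be removed too.
    for dirpath, dirnames, filenames in os.walk(cache_dir, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, cache_dir).replace(os.sep, "/")
            if rel not in known_files:
                os.remove(full)
                removed.append(rel)
        if dirpath != cache_dir and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return sorted(removed)
```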
One thing that's still missing is off-line usage: if the user does not
have net connectivity at the moment but does have the file in the cache,
it should not cause an error.
Perhaps the base URL should be 
http://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/sample_data/
instead of
http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/sample_data/
to avoid dependency on the viewvc service of SourceForge.
-- 
Jouni K. Seppänen
http://www.iki.fi/jks

Re: [matplotlib-devel] example data in example code

From: John H. <jd...@gm...> - 2009-08-10 01:21:54

On Sun, Aug 9, 2009 at 2:01 PM, Jouni K. Seppänen<jk...@ik...> wrote:
> I think mirroring a directory structure is somewhat more complicated
> than caching a set of arbitrary URLs in a flat cache directory. For
> example, I think the remove_stale_files method will need to be changed
> to walk all subdirectories, and handling cases such as having a
> subdirectory named foo that is replaced by a file named foo could be
> complicated.
>
> One thing that's still missing is off-line usage: if the user does not
> have net connectivity at the moment but does have the file in the cache,
> it should not cause an error.
>
> Perhaps the base URL should be
> http://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/sample_data/
> instead of
> http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/sample_data/
> to avoid dependency on the viewvc service of SourceForge.
Would you like to take a crack at these fixes? I have scipy coming up
and need to start getting my tutorial material together, so I am not
going to have a lot of time for bug fixes, though I would be happy to
get as many fixes and patches in next week and try to get one bugfix
0.99.1 out before scipy.
JDH

Re: [matplotlib-devel] example data in example code

From: Jouni K. S. <jk...@ik...> - 2009-08-15 20:05:02

John Hunter <jd...@gm...> writes:
> Would you like to take a crack at these fixes? [...] I would be happy
> to get as many fixes and patches in next week and try to get one
> bugfix 0.99.1 out before scipy.
I started fixing the issues. It's not complete yet, but the current
state should be usable. However, this work is not on the 0.99 branch,
only on the trunk.
Handling the off-line use case is harder than I thought: at least on my
laptop the urlopen call just blocks, so there needs to be a timeout
mechanism.
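[Editor's sketch] In Python 3, urlopen accepts a timeout argument, so the off-line case can be sketched as "try the network briefly, fall back to the cache on failure". The function name, the default timeout and the opener hook below are assumptions made for illustration; this is not the fix that went into trunk.

```python
import os
import urllib.request


def open_cached_or_fetch(url, cache_path, timeout=5.0,
                         opener=urllib.request.urlopen):
    """Refresh cache_path from url if possible, else use the cached copy."""
    try:
        with opener(url, timeout=timeout) as resp:
            data = resp.read()
    except OSError:
        # URLError and socket timeouts are OSError subclasses, so this covers
        # "no network" as well as "the call blocked for too long".
        if os.path.exists(cache_path):
            return open(cache_path, "rb")
        raise
    with open(cache_path, "wb") as out:
        out.write(data)
    return open(cache_path, "rb")
```

Note that this simple version also falls back to the cache on HTTP errors such as 404, which may or may not be what is wanted.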
-- 
Jouni K. Seppänen
http://www.iki.fi/jks
