5

I'm writing a little GIS tool in Python and I want to display shapefiles. I've successfully gotten matplotlib working using this answer, but it is very slow and heavy to load any shapefile that happens to be over 5mb. The code follows:

if len(shpFilePath) > 0:
 try:
 shp = shapefile.Reader(shpFilePath)
 plt.figure()
 try:
 for shape in shp.shapeRecords():
 x = [i[0] for i in shape.shape.points[:]]
 y = [i[1] for i in shape.shape.points[:]]
 plt.plot(x, y)
 plt.show(1)
 except AssertionError:
 self.error_popup('Error', 'Shapefile does not contain points.',
 'Check feature type and try again')
 except shapefile.ShapefileException:
 self.error_popup('Error', 'Shapefile not found.', 'Ensure that the path is correct and try again.')

Is there anything I can do to improve the speed and the memory that this consumes? I'm sure the problem lies in the creation of the x and y lists.

I loaded a 10mb shapefile yesterday and it ended up using nearly 1gb of RAM just to display it. I've been looking at other libraries but it seems like matplotlib is the most frequently recommended one.

asked Jul 20, 2016 at 12:27

4 Answers 4

6

This is not a problem of Matplotlib but your script and the module you use for reading shapefiles

1) You know that there are points in the geometries of the Polygon shapefile thus eliminate try... except

2) you load and read the shapefile twice for x and y (memory)

for shape in shp.shapeRecords():
 xy = [i for i in shape.shape.points[:]]
 x = [i[0] for i in xy]
 y = [i[1] for i in xy]

or directly

 for shape in shp.shapeRecords(): 
 xy = [i for i in shape.shape.points[:]]
 x,y = zip(*[(j[0],j[1]) for j in xy])

enter image description here

3) You can also use the Geo_interface (look at Plot shapefile with matplotlib)

 for shape in shp.shapeRecords():
 poly = shape.shape.__geo_interface__
 print(poly)
 {'type': 'Polygon', 'coordinates': (((203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)),)}

And you have the GeoJSON representation of the geometry (Polygon). You can plot the Polygon as in the reference

enter image description here

The LinearRing of the Polygon

 x,y = zip(*[(i[0],i[1]) for i in poly['coordinates'][0]])

enter image description here

And the nodes of the Polygon

enter image description here

4) The problem of Pyshp (shapefile) is that it loads the complete shapefile into memory and if the shapefile is too big...
You can use a generator (read the layer one feature by one feature)

def records(filename): 
 # generator 
 reader = shapefile.Reader(filename) 
 for sr in reader.shapeRecords(): 
 geom = sr.shape.__geo_interface__ 
 yield geom
 features = records("a_polygon.shp")
 features.next()
 {'type': 'Polygon', 'coordinates': (((203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)),)}

Or directly

 shapes = shapefile.Reader('a_polygon.shp')
 shapes.iterShapes().next().__geo_interface__
 {'type': 'Polygon', 'coordinates': (((203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)),)}

5) Or use a Python module that directly uses generators/iterators :Fiona

import fiona
shapes = fiona.open("a_polygon.shp")
first = shapes.next() # for for feature in shapes
print(first)
{'geometry': {'type': 'Polygon', 'coordinates': [[(203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)]]}, 'type': 'Feature', 'id': '0', 'properties': OrderedDict([(u'id', None)])}
 print(first['geometry']['coordinates']
 [[(203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)]]
answered Jul 20, 2016 at 17:30
2
  • Thank you for the awesome response! It hadn't occurred to me that I was loading the data twice. The information about using a generator was helpful as well. Commented Jul 20, 2016 at 20:35
  • A good answer, but bad advice on #1. try/except doesn't add much overhead, and it is never a good idea to run without exception handling Commented Sep 22, 2016 at 12:55
4

You can use geopandas for plotting as discussed in this answer.

You can also use pyshp as in following code

from descartes import PolygonPatch
import shapefile
sf=shapefile.Reader('shapefile')
fig = plt.figure() 
ax = fig.gca() 
for poly in sf.shapes():
 poly_geo=poly.__geo_interface__
 ax.add_patch(PolygonPatch(poly_geo, fc='#ffffff', ec='#000000', alpha=0.5, zorder=2 ))
ax.axis('scaled')
plt.show()
answered Jan 2, 2018 at 6:49
3

Using geopandas, the code would look like:

import geopandas
df = geopandas.read_file(shpFilePath)
df.plot()

and this should also be faster (at least starting from geopandas 0.3.0)

answered Feb 6, 2018 at 15:54
2

Dowloading the shape file repository of Berlin, here are some of the solutions.

joris

This solution only takes 20s to execute.

import geopandas
import matplotlib.pyplot as plt
df = geopandas.read_file('berlin-latest-free.shp/gis_osm_landuse_a_free_1.dbf')
df.plot()
plt.savefig('test2.png')

enter image description here

It looks a bit as if the projection was not done at all or not done correctly.

ldocao

https://gis.stackexchange.com/a/152331/70242 - takes super long to execute

import matplotlib.pyplot as plt
import shapefile as shp # Requires the pyshp package
sf = shp.Reader('berlin-latest-free.shp/gis_osm_landuse_a_free_1.dbf')
plt.figure(dpi=300)
for shape in sf.shapeRecords():
 x = [i[0] for i in shape.shape.points[:]]
 y = [i[1] for i in shape.shape.points[:]]
 plt.plot(x, y)
plt.savefig('test.png')

enter image description here

answered Oct 26, 2018 at 14:55

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.