4

I have a series of Easting and Northing coordinates in pandas dataframe along with other columns of information. I know the coordinates fall within and administrative area and want to match the coordinates and rest of columns to that administrative area, I have the shapefile of the administrative boundaries. I'm familiar with R and visually this looks easy, I can plot the coordinate and see it fall into the area. However i have n=8,335 records to match and need to use Python as all other records are loaded and the analysis i have is better suited to Python.

So far i have work flow as;

Convert Easting and Northing to lat and long

Somehow use Shapely to tell me if point is in polygon / I understand that matplotlib.path.Path.contains_points executes faster.

Assign point to polygon and add new column to dataframe with the polgon reference in it.

Use pandas merge to join dataframe to others. Project to map again and check visually.

Does that sound like a sensible approach or is there a more efficient route?

PolyGeo
65.5k29 gold badges115 silver badges349 bronze badges
asked Apr 24, 2016 at 21:25

1 Answer 1

8

If you're working with spatial data and Pandas you should take a look at GeoPandas.

The example below demonstrates how to perform a spatial join in GeoPandas (which uses Shapely). A GeoDataFrame object is created from a list of cities and their coordinates and is joined to an ESRI Shapefile containing countries.

import pandas
import geopandas
import geopandas.tools
from shapely.geometry import Point
# Create a DataFrame with some cities, including their location
raw_data = [
 ("London", 51.5, -0.1),
 ("Paris", 48.9, 2.4),
 ("San Francisco", 37.8, -122.4),
]
places = pandas.DataFrame(raw_data, columns=["name", "latitude", "longitude"])
# Create the geometry column from the coordinates
# Remember that longitude is east-west (i.e. X) and latitude is north-south (i.e. Y)
places["geometry"] = places.apply(lambda row: Point(row["longitude"], row["latitude"]), axis=1)
del(places["latitude"], places["longitude"])
# Convert to a GeoDataFrame
places = geopandas.GeoDataFrame(places, geometry="geometry")
# Declare the coordinate system for the places GeoDataFrame
# GeoPandas doesn't do any transformations automatically when performing
# the spatial join. The layers are already in the same CRS (WGS84) so no
# transformation is needed.
places.crs = {"init": "epsg:4326"}
# Load the countries polygons
countries = geopandas.GeoDataFrame.from_file("ne_110m_admin_0_countries.shp")
# Drop all columns except the name and polygon geometry
countries = countries[["name", "geometry"]]
# Perform the spatial join
result = geopandas.tools.sjoin(places, countries, how="left")
# Print the results...
print(result.head())

Note that the spatial join feature is still fairly new and hasn't made it into the stable branch yet - you'll need to download and install the development version.

https://github.com/geopandas/geopandas

The result looks like this:

 name_left geometry index_right name_right
0 London POINT (-0.1 51.5) 57 United Kingdom
1 Paris POINT (2.4 48.9) 55 France
2 San Francisco POINT (-122.4 37.8) 168 United States

You can also use GeoPandas to plot the data with matplotlib:

import matplotlib.pyplot as plt
fig, ax = plt.subplots(1)
countries.plot(ax=ax, color="#cccccc")
places.plot(ax=ax, markersize=5, color="#cc0000")
plt.show()

places and countries

answered May 10, 2016 at 19:56
1
  • How would i accomplish this with entire states? Meaning creating borders with longitude and latitude. Maybe with a multi-line polygon of some sort. Commented Mar 2, 2017 at 19:36

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.