I have two datasets, one with address points (lat,long) coordinates. And another one with line string geometry (each linestring represents a road). I am trying to find the road name for each address point. I'm trying to do spatial join or find the road line with shortest distance to the address point using shapely line.distance(point). How can I achieve this in python? One thing I can think of is to find the distance between the address point and every line in road dataset. And get the one with minimal distance. Is there an alternate approach to this?
Both the dataset has lat long points in wgs84 projection system.
Line coordinate example :
LINESTRING (168.0473, -44.40220, 168.0485, -44.40117, 168.0486, -44.40085, 168.0487, -44.40067)
Point coordinate example:
(168.04665153,-44.3990252)
-
1rtree might be useful for indexing your lines and points. There is a nearest neighbour search which might help identify which road is the closest to your point. Or at least narrow it down.Fezter– Fezter2017年08月14日 06:25:21 +00:00Commented Aug 14, 2017 at 6:25
-
Any example please?ds_user– ds_user2017年08月14日 06:27:10 +00:00Commented Aug 14, 2017 at 6:27
1 Answer 1
If you have datasets with (long,lat) coordinates, you need to project them prior to distance calculations. For instance, next code opens a point and a line layers previous to these calculations. Results (all distances and minimal distances and its points) are printed by console.
import fiona
from shapely.geometry import shape, Point, LineString
import pyproj
srcProj = pyproj.Proj(init='EPSG:4326')
dstProj = pyproj.Proj(init='EPSG:32612')
path1 = '/home/zeito/pyqgis_data/points_test_4326.shp'
path2 = '/home/zeito/pyqgis_data/new_lines_4326.shp'
points = fiona.open(path1)
line = fiona.open(path2)
points = [ (shape(feat["geometry"]).xy[0][0], shape(feat["geometry"]).xy[1][0])
for feat in points ]
lines = [ zip(shape(feat["geometry"]).coords.xy[0], shape(feat["geometry"]).coords.xy[1])
for feat in line ]
proj_lines = [ [] for i in range(len(lines)) ]
for i, item in enumerate(lines):
for element in item:
x = element[0]
y = element[1]
x, y = pyproj.transform(srcProj, dstProj, x, y)
proj_lines[i].append((x, y))
proj_points = []
for point in points:
x = point[0]
y = point[1]
x, y = pyproj.transform(srcProj, dstProj, x, y)
proj_points.append(Point(x,y))
distances = [ [] for i in range(len(lines)) ]
for i, line in enumerate(proj_lines):
for point in proj_points:
distances[i].append(LineString(line).distance(point))
print distances
for i, list in enumerate(distances):
print "minimal distance: {:.2f} to this point: {}".format(min(list), points[i])
After running the code, it was gotten:
[[240.76129848041424, 82.17367496992877, 534.8119728316819], [180.2963692590486, 601.7275241365959, 1155.1004277619784]]
minimal distance: 82.17 to this point: (-112.17112916813936, 40.17311785635686)
minimal distance: 180.30 to this point: (-112.1560813471297, 40.170581085323384)
Layers look like at next image. QuickWKT plugin of QGIS helped to corroborate that these determined points are corresponding to minimal distance.
-
Thanks for your answer. I will try this now, Is it possible to get the id of the line string with minimal distance? Ultimately, i am trying to find which road the address point falls under. May be the output to be point id and line id as a dataframe?ds_user– ds_user2017年08月14日 23:47:10 +00:00Commented Aug 14, 2017 at 23:47
-
1And how efficient this approach will be? Because, I have 2 million points and 2 million linestrings. So calculating distance between each point and each line will be highly time assuming. What do you suggest?ds_user– ds_user2017年08月14日 23:49:22 +00:00Commented Aug 14, 2017 at 23:49