Generate sample coordinates inside a Polygon

Question 1

I have a Polygon named as poly. I attempted to randomly select 5 coordinate points that lies inside the polygon.

import numpy as np
from shapely.geometry import Polygon, Point
poly = Polygon([(141.4378366,-25.95915986), (165.4279876,-29.43400298), (163.1382942,-47.65345814), (133.1675418,-42.99807751)])
minx, miny, maxx, maxy = poly.bounds 
longs = np.arange(minx, maxx, 0.002); lats = np.arange(miny, maxy, 0.002) 
longs = np.tile(longs,3).ravel(); lats = np.repeat(lats,3).ravel()
coords = np.array([(x,y) for x,y in zip(longs,lats)])
points = [Point(xy) for xy in coords]
check = [xy.within(poly) for xy in points]
pointsInside = coords[check]
ranIdx = np.random.choice(len(pointsInside),5,replace=False) 
result = pointsInside[ranIdx]
print result

I think my code is ineffective. Are there any ideas for a straight and elegant implementation?

Question 2

Relevant cs.stackexchange.com/questions/14007/…

Question 3

@MrE In my problem just selecting 5 points inside the polygon is sufficient and it does not require that they are exactly uniformly and randomly distributed.

Question 4

Rejection sampling was proposed in comments on the other answer. The problem with rejection sampling is that the area of a polygon can be an arbitrarily small fraction of its bounding box, for example:

def ε_poly(ε):
 "Return a polygon that occupies a fraction ε of its bounding box."
 assert 0 < ε <= 1
 return Polygon([(0, 0), (1, 0), (ε, ε), (0, 1)])

Rejection sampling will take on average \1ドル\over ε\$ attempts to generate one sample point inside ε_poly(ε), and \1ドル\over ε\$ can be arbitrarily large.

A more robust approach is as follows:

Triangulate the polygon and calculate the area of each triangle.
For each sample:
Pick the triangle \$t\$ containing the sample, using random selection weighted by the area of each triangle.
Pick a random point uniformly in the triangle, as follows:
1. Pick a random point \$x, y\$ uniformly in the unit square.
2. If \$x + y > 1\$, use the point \1ドル-x, 1-y\$ instead. The effect of this is to ensure that the point is chosen uniformly in the unit right triangle with vertices \$(0, 0), (0, 1), (1, 0)\$
3. Apply the appropriate affine transformation to transform the unit right triangle to the triangle \$t\$.

Here's one possible implementation, using shapely.ops.triangulate and shapely.affinity.affine_transform:

import random
from shapely.affinity import affine_transform
from shapely.geometry import Point, Polygon
from shapely.ops import triangulate
def random_points_in_polygon(polygon, k):
 "Return list of k points chosen uniformly at random inside the polygon."
 areas = []
 transforms = []
 for t in triangulate(polygon):
 areas.append(t.area)
 (x0, y0), (x1, y1), (x2, y2), _ = t.exterior.coords
 transforms.append([x1 - x0, x2 - x0, y2 - y0, y1 - y0, x0, y0])
 points = []
 for transform in random.choices(transforms, weights=areas, k=k):
 x, y = [random.random() for _ in range(2)]
 if x + y > 1:
 p = Point(1 - x, 1 - y)
 else:
 p = Point(x, y)
 points.append(affine_transform(p, transform))
 return points

This is fine if \$k\$ is small, as indicated by the OP. (If \$k\$ is large, then you would want to vectorize the construction of the points in NumPy.)

Concern was expressed in comments that the affine transformation would cause the random points to get closer along one axis than the other, due to different scaling on the two axes. But that's not a good way to think about it, because closest points are not preserved by affine transforms. (A point might have a different nearest neighbour after the transform.) Instead, remember that affine transformations are linear in both axes, so that the chance of a random point appearing in a region remains proportional to the area of the region after the transform. This means that if the points were uniformly distributed in the unit right triangle, they remain uniformly distributed in the transformed triangle. Here's a quick demo:

import matplotlib.pyplot as plt
triangle = Polygon([(0, 0), (5, 0), (0, 1)])
points = random_points_in_polygon(triangle, 500)
plt.gca().set_aspect('equal')
plt.plot(*np.array(triangle.boundary.coords).T, 'g')
coords = np.array([p.coords for p in points]).reshape(-1, 2)
plt.plot(*coords.T, 'b.')
plt.show()

Slender green triangle with 500 random blue points

If the figure doesn't convince you, then here's some code that prints the mean absolute offsets along the two axes between the random points and their nearest neighbours:

import scipy.spatial
tree = scipy.spatial.KDTree(coords)
_, neighbours = tree.query(coords, [2])
print(np.mean(np.abs(coords - coords[neighbours[:,0]]), axis=0))

Typical output shows that the means on the two axes are very close even though the \$x\$-coordinates have been scaled and the \$y\$-coordinates have not:

[0.02201801 0.02486154]

Question 5

Good answer, but I can see one problem with this is the distribution of points inside each triangle. The points are choosen uniformly in the space of [(0,0),(0,1),(1,0)] before the transformation, but will obviously not be uniformly distributed after transformation. For example, if the transformation scaled down in 'x' direction, the points could be closely packed in 'x' and more sparse in 'y' direction.

Question 6

@ApiwatChantawibul: Your "obviously" is doing a lot of work there! See updated answer.

Question 7

I see you give an experiment to refute. But I maintain my position. It helps to think in extreme case to see the problem. Let the average 'perpendicular-to-axis' distance to the nearest point be $d_x,d_y$ in original 'unit' triangle. They should be equal for now. Now, imagine the transformation scale one direction down so much the triangle is almost a line. We can obviously see that the new average 'perpendicular-to-axis' distance to the nearest point in one direction has stayed the same while the other has reduced to almost 0.

Question 8

@ApiwatChantawibul You are assuming that the nearest neighbour stays the same after the affine transformation. But that is not the case: the transformation may make a different point a nearer neighbour.

Question 9

@ApiwatChantawibul I added a numerical demonstration that the mean absolute distance to the nearest neighbour is approximately the same on the x and y axes. This ought to convince you.

Question 10

Style Nitpicks

Put spaces after commas / Keep Lines within 120 Chars

poly = Polygon([(141.4378366,-25.95915986), (165.4279876,-29.43400298), (163.1382942,-47.65345814), (133.1675418,-42.99807751)])

becomes

poly = Polygon([(141.4378366, -25.95915986), (165.4279876, -29.43400298), (163.1382942, -47.65345814),
 (133.1675418, -42.99807751)])

Use Pythonic Underscores

minx, miny, maxx, maxy = poly.bounds

becomes

min_x, min_y, max_x, max_y = poly.bounds

Eschew Semicolons (Personal Opinion)

You can use semicolons like this:

longs = np.arange(minx, maxx, 0.002); lats = np.arange(miny, maxy, 0.002)

But personally, I just wouldn't in Python. Eww.

An Alternative Algorithm

EDIT: Having read @MrE's and @MartinR's comments, I now propose this rejection sampling method. Although, this could miss frequently in a polygon with a large bounding box relative to its area; .e.g. an 8-point Christmas star with a small inner circle.

def random_points_within(poly, num_points):
 min_x, min_y, max_x, max_y = poly.bounds
 points = []
 while len(points) < num_points:
 random_point = Point([random.uniform(min_x, max_x), random.uniform(min_y, max_y)])
 if (random_point.within(poly)):
 points.append(random_point)
 return points

Another alternative that never misses, but may not be well distributed.

This was my original idea, but it didn't look so good in the morning.

Firstly, work out how to select one point within a polygon.

def random_point_within(poly):
 min_x, min_y, max_x, max_y = poly.bounds
 x = random.uniform(min_x, max_x)
 x_line = LineString([(x, min_y), (x, max_y)])
 x_line_intercept_min, x_line_intercept_max = x_line.intersection(poly).xy[1].tolist()
 y = random.uniform(x_line_intercept_min, x_line_intercept_max)
 return Point([x, y])

Then simply call that in a list comprehension to generate however many points you desire.

points = [random_point_within(poly) for i in range(5)]
checks = [point.within(poly) for point in points]

My approach is to select x randomly within the polygon, then constrain y.

This approach doesn't require NumPy and always produces a point within the polygon.

Question 11

I liked your code besides the way you have used linestring ratherthan finding the way to directy capture the x and y positions...

Question 12

I might be mistaken, but I think that your random point selection is not uniform with respect to the polygon. E.g. for the triangle with vertices (0,0), (1,0), (0,1) the chance to get a point with x > 0.5 is 50%, but the part of the triangle with x > 0.5 is only 25% of the total triangle area.

Question 13

The simplest method to get uniformly distributed samples is rejection sampling: 1. Find the bounding box of the polygon. 2. Generate a uniform sample in the bounding box 3. If sample is inside polygon, return sample, else go to 2.

Question 14

@MrE Could you explain me how to generate samples within the bounding box, not necessarily uniformly random though.

Question 15

I personally prefer seeing 80-100 chars as a limit. 120 as an absolute max for rare cases or if heavily nested.

Question 16

used this function with raster based 10m spacing

may be useful:

def random_points_within_poygon_and_10m_sbuffer(poly, num_points):
min_x, min_y, max_x, max_y = poly.bounds
min_lon = min_x
max_lon = max_x
min_lat = min_y
max_lat = max_y
# obtain utm bounds and dimentions
utmZone = int((np.floor((min_lon+ 180)/6)% 60) + 1)
fieldProj = Proj("+proj=utm +zone="+str(utmZone)+", +north +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
UTMx_min, UTMy_max = fieldProj(min_lon, max_lat)
UTMx_max, UTMy_max = fieldProj(max_lon, max_lat)
UTMx_min, UTMy_min = fieldProj(min_lon, min_lat)
UTMx_min = math.floor(UTMx_min*0.2)*5.0 + 2.5 # make it standard grid to match img
UTMy_max = math.ceil(UTMy_max*0.2)*5.0 - 2.5 # make it standard grid to match img
utm_npix_x = int(0.1*(math.ceil((UTMx_max - UTMx_min)*0.1)*10))
utm_npix_y = int(0.1*(math.ceil((UTMy_max - UTMy_min)*0.1)*10))
#set spacing raster
spacing_utm_grid = np.arange(utm_npix_y*utm_npix_x).reshape((utm_npix_y, utm_npix_x))
spacing_utm_grid[:,:] = 0
points = []
while len(points) < num_points:
 pair_coordinates = [random.uniform(min_x, max_x), random.uniform(min_y, max_y)]
 random_point = Point(pair_coordinates)
 if (random_point.within(poly)):
 utm_x, utm_y = fieldProj(pair_coordinates[0], pair_coordinates[1])
 utm_x_pix = math.floor((utm_x - UTMx_min)/10)
 utm_y_pix = math.floor((UTMy_max -utm_y)/10)
 if(spacing_utm_grid[utm_y_pix, utm_x_pix]==0):
 points.append(random_point)
 spacing_utm_grid[utm_y_pix,utm_x_pix]=1
return points

Question 17

Welcome to Code Review! You have presented an alternative solution, but haven't reviewed the code. Please explain your reasoning (how your solution works and why it is better than the original) so that the author and other readers can learn from your thought process. Please read Why are alternative solutions not welcome?

Gareth Rees Gareth Rees 50.1k3 gold badges130 silver badges210 bronze badges · Answer 1 · 2018-09-24 15:23:17Z

Rejection sampling was proposed in comments on the other answer. The problem with rejection sampling is that the area of a polygon can be an arbitrarily small fraction of its bounding box, for example:

def ε_poly(ε):
 "Return a polygon that occupies a fraction ε of its bounding box."
 assert 0 < ε <= 1
 return Polygon([(0, 0), (1, 0), (ε, ε), (0, 1)])

Rejection sampling will take on average \1ドル\over ε\$ attempts to generate one sample point inside ε_poly(ε), and \1ドル\over ε\$ can be arbitrarily large.

A more robust approach is as follows:

Triangulate the polygon and calculate the area of each triangle.
For each sample:
Pick the triangle \$t\$ containing the sample, using random selection weighted by the area of each triangle.
Pick a random point uniformly in the triangle, as follows:
1. Pick a random point \$x, y\$ uniformly in the unit square.
2. If \$x + y > 1\$, use the point \1ドル-x, 1-y\$ instead. The effect of this is to ensure that the point is chosen uniformly in the unit right triangle with vertices \$(0, 0), (0, 1), (1, 0)\$
3. Apply the appropriate affine transformation to transform the unit right triangle to the triangle \$t\$.

Here's one possible implementation, using shapely.ops.triangulate and shapely.affinity.affine_transform:

import random
from shapely.affinity import affine_transform
from shapely.geometry import Point, Polygon
from shapely.ops import triangulate
def random_points_in_polygon(polygon, k):
 "Return list of k points chosen uniformly at random inside the polygon."
 areas = []
 transforms = []
 for t in triangulate(polygon):
 areas.append(t.area)
 (x0, y0), (x1, y1), (x2, y2), _ = t.exterior.coords
 transforms.append([x1 - x0, x2 - x0, y2 - y0, y1 - y0, x0, y0])
 points = []
 for transform in random.choices(transforms, weights=areas, k=k):
 x, y = [random.random() for _ in range(2)]
 if x + y > 1:
 p = Point(1 - x, 1 - y)
 else:
 p = Point(x, y)
 points.append(affine_transform(p, transform))
 return points

This is fine if \$k\$ is small, as indicated by the OP. (If \$k\$ is large, then you would want to vectorize the construction of the points in NumPy.)

Concern was expressed in comments that the affine transformation would cause the random points to get closer along one axis than the other, due to different scaling on the two axes. But that's not a good way to think about it, because closest points are not preserved by affine transforms. (A point might have a different nearest neighbour after the transform.) Instead, remember that affine transformations are linear in both axes, so that the chance of a random point appearing in a region remains proportional to the area of the region after the transform. This means that if the points were uniformly distributed in the unit right triangle, they remain uniformly distributed in the transformed triangle. Here's a quick demo:

import matplotlib.pyplot as plt
triangle = Polygon([(0, 0), (5, 0), (0, 1)])
points = random_points_in_polygon(triangle, 500)
plt.gca().set_aspect('equal')
plt.plot(*np.array(triangle.boundary.coords).T, 'g')
coords = np.array([p.coords for p in points]).reshape(-1, 2)
plt.plot(*coords.T, 'b.')
plt.show()

Slender green triangle with 500 random blue points

If the figure doesn't convince you, then here's some code that prints the mean absolute offsets along the two axes between the random points and their nearest neighbours:

import scipy.spatial
tree = scipy.spatial.KDTree(coords)
_, neighbours = tree.query(coords, [2])
print(np.mean(np.abs(coords - coords[neighbours[:,0]]), axis=0))

Typical output shows that the means on the two axes are very close even though the \$x\$-coordinates have been scaled and the \$y\$-coordinates have not:

[0.02201801 0.02486154]

Good answer, but I can see one problem with this is the distribution of points inside each triangle. The points are choosen uniformly in the space of [(0,0),(0,1),(1,0)] before the transformation, but will obviously not be uniformly distributed after transformation. For example, if the transformation scaled down in 'x' direction, the points could be closely packed in 'x' and more sparse in 'y' direction.
@ApiwatChantawibul: Your "obviously" is doing a lot of work there! See updated answer.
I see you give an experiment to refute. But I maintain my position. It helps to think in extreme case to see the problem. Let the average 'perpendicular-to-axis' distance to the nearest point be $d_x,d_y$ in original 'unit' triangle. They should be equal for now. Now, imagine the transformation scale one direction down so much the triangle is almost a line. We can obviously see that the new average 'perpendicular-to-axis' distance to the nearest point in one direction has stayed the same while the other has reduced to almost 0.
@ApiwatChantawibul You are assuming that the nearest neighbour stays the same after the affine transformation. But that is not the case: the transformation may make a different point a nearer neighbour.
@ApiwatChantawibul I added a numerical demonstration that the mean absolute distance to the nearest neighbour is approximately the same on the x and y axes. This ought to convince you.

user27318user27318 · Answer 2 · 2014-11-14 13:16:55Z

Style Nitpicks

Put spaces after commas / Keep Lines within 120 Chars

poly = Polygon([(141.4378366,-25.95915986), (165.4279876,-29.43400298), (163.1382942,-47.65345814), (133.1675418,-42.99807751)])

becomes

poly = Polygon([(141.4378366, -25.95915986), (165.4279876, -29.43400298), (163.1382942, -47.65345814),
 (133.1675418, -42.99807751)])

Use Pythonic Underscores

minx, miny, maxx, maxy = poly.bounds

becomes

min_x, min_y, max_x, max_y = poly.bounds

Eschew Semicolons (Personal Opinion)

You can use semicolons like this:

longs = np.arange(minx, maxx, 0.002); lats = np.arange(miny, maxy, 0.002)

But personally, I just wouldn't in Python. Eww.

An Alternative Algorithm

EDIT: Having read @MrE's and @MartinR's comments, I now propose this rejection sampling method. Although, this could miss frequently in a polygon with a large bounding box relative to its area; .e.g. an 8-point Christmas star with a small inner circle.

def random_points_within(poly, num_points):
 min_x, min_y, max_x, max_y = poly.bounds
 points = []
 while len(points) < num_points:
 random_point = Point([random.uniform(min_x, max_x), random.uniform(min_y, max_y)])
 if (random_point.within(poly)):
 points.append(random_point)
 return points

Another alternative that never misses, but may not be well distributed.

This was my original idea, but it didn't look so good in the morning.

Firstly, work out how to select one point within a polygon.

def random_point_within(poly):
 min_x, min_y, max_x, max_y = poly.bounds
 x = random.uniform(min_x, max_x)
 x_line = LineString([(x, min_y), (x, max_y)])
 x_line_intercept_min, x_line_intercept_max = x_line.intersection(poly).xy[1].tolist()
 y = random.uniform(x_line_intercept_min, x_line_intercept_max)
 return Point([x, y])

Then simply call that in a list comprehension to generate however many points you desire.

points = [random_point_within(poly) for i in range(5)]
checks = [point.within(poly) for point in points]

My approach is to select x randomly within the polygon, then constrain y.

This approach doesn't require NumPy and always produces a point within the polygon.

I liked your code besides the way you have used linestring ratherthan finding the way to directy capture the x and y positions...
I might be mistaken, but I think that your random point selection is not uniform with respect to the polygon. E.g. for the triangle with vertices (0,0), (1,0), (0,1) the chance to get a point with x > 0.5 is 50%, but the part of the triangle with x > 0.5 is only 25% of the total triangle area.
The simplest method to get uniformly distributed samples is rejection sampling: 1. Find the bounding box of the polygon. 2. Generate a uniform sample in the bounding box 3. If sample is inside polygon, return sample, else go to 2.
@MrE Could you explain me how to generate samples within the bounding box, not necessarily uniformly random though.
I personally prefer seeing 80-100 chars as a limit. 120 as an absolute max for rare cases or if heavily nested.

Val Val 1 · Answer 3 · 2018-08-23 20:00:53Z

used this function with raster based 10m spacing

may be useful:

def random_points_within_poygon_and_10m_sbuffer(poly, num_points):
min_x, min_y, max_x, max_y = poly.bounds
min_lon = min_x
max_lon = max_x
min_lat = min_y
max_lat = max_y
# obtain utm bounds and dimentions
utmZone = int((np.floor((min_lon+ 180)/6)% 60) + 1)
fieldProj = Proj("+proj=utm +zone="+str(utmZone)+", +north +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
UTMx_min, UTMy_max = fieldProj(min_lon, max_lat)
UTMx_max, UTMy_max = fieldProj(max_lon, max_lat)
UTMx_min, UTMy_min = fieldProj(min_lon, min_lat)
UTMx_min = math.floor(UTMx_min*0.2)*5.0 + 2.5 # make it standard grid to match img
UTMy_max = math.ceil(UTMy_max*0.2)*5.0 - 2.5 # make it standard grid to match img
utm_npix_x = int(0.1*(math.ceil((UTMx_max - UTMx_min)*0.1)*10))
utm_npix_y = int(0.1*(math.ceil((UTMy_max - UTMy_min)*0.1)*10))
#set spacing raster
spacing_utm_grid = np.arange(utm_npix_y*utm_npix_x).reshape((utm_npix_y, utm_npix_x))
spacing_utm_grid[:,:] = 0
points = []
while len(points) < num_points:
 pair_coordinates = [random.uniform(min_x, max_x), random.uniform(min_y, max_y)]
 random_point = Point(pair_coordinates)
 if (random_point.within(poly)):
 utm_x, utm_y = fieldProj(pair_coordinates[0], pair_coordinates[1])
 utm_x_pix = math.floor((utm_x - UTMx_min)/10)
 utm_y_pix = math.floor((UTMy_max -utm_y)/10)
 if(spacing_utm_grid[utm_y_pix, utm_x_pix]==0):
 points.append(random_point)
 spacing_utm_grid[utm_y_pix,utm_x_pix]=1
return points

Welcome to Code Review! You have presented an alternative solution, but haven't reviewed the code. Please explain your reasoning (how your solution works and why it is better than the original) so that the author and other readers can learn from your thought process. Please read Why are alternative solutions not welcome?

Stack Exchange Network

Generate sample coordinates inside a Polygon

3 Answers 3

Style Nitpicks

Put spaces after commas / Keep Lines within 120 Chars

Use Pythonic Underscores

Eschew Semicolons (Personal Opinion)

An Alternative Algorithm

Another alternative that never misses, but may not be well distributed.

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Generate sample coordinates inside a Polygon

3 Answers 3

Style Nitpicks

Put spaces after commas / Keep Lines within 120 Chars

Use Pythonic Underscores

Eschew Semicolons (Personal Opinion)

An Alternative Algorithm

Another alternative that never misses, but may not be well distributed.

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions