I am looking for a numpy array based alternative to the GRASS r.cost function.
I have a friction surface raster where the value of each cell corresponds to travel time. I would like to find the cells within n travel time of a user specified cell, so essentially the cumulative sum in multiple directions until the max travel time is achieved. I have written a PyQGIS script that does this using the GRASS r.cost function (specifying the max cost) which works well.
However, eventually this procedure will be part of a bigger Python (not PyQGIS) script where rasters are ingested using rioxarray and then processed as numpy arrays. I know that I can use GRASS here as well, but I'm wondering if there is a simpler alternative? I would like to minimise barriers for users, ideally they will not need to install GRASS or QGIS. I've been searching fruitlessly for options...
Any suggestions?
3 Answers 3
Read the raster into a numpy array with rasterio then use skimage.graph.route_through_array to find least cost path.
import matplotlib.pyplot as plt
import rasterio
from rasterio.plot import show
from skimage.graph import route_through_array
cost_raster = r"C:\Users\bera\Desktop\gistest\np_cost\cost_new.tif"
dataset = rasterio.open(cost_raster)
meta = dataset.meta
arr = dataset.read(1)
#Plot the cost raster
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8,8))
show(source=dataset, ax=ax, cmap="inferno", transform=meta["transform"], vmax=30)
img = ax.get_images()[0]
plt.colorbar(img, ax=ax, fraction=0.02, label="Cost")
#Define the start and end points for which the least cost path will be calculated
start_x, start_y = 801765, 7545417
end_x, end_y = 800771, 7546770
#Plot them
start_point = ax.scatter(start_x, start_y, color="red", marker="o", s=40,
label="Start point", zorder=2)
end_point = ax.scatter(end_x, end_y, color="lime", marker="s", s=40,
label="End point", zorder=2)
#Input to route_through array is array row and column indexes, not the georeferenced
# coordinates. Find the array row and column indexes corresponding to the
# start and endpoints
start_row_ix, start_col_ix = dataset.index(x=start_x, y=start_y)
end_row_ix, end_col_ix = dataset.index(x=end_x, y=end_y)
#Calculate least cost path
path, cost = route_through_array(array=arr, start=(start_row_ix, start_col_ix),
end=(end_row_ix, end_col_ix))
#Translate row and column indexes to coordinates, to be able to plot them
path_row_indexes = [x[0] for x in path]
path_col_indexes = [x[1] for x in path]
path_x_coords, path_y_coords = dataset.xy(row=path_row_indexes, col=path_col_indexes)
cost_path = ax.plot(path_x_coords, path_y_coords, ls=(0,(2,4)), color="cyan", lw=2,
label="Least cost path", zorder=1)[0]
#Add a legend and cost label
ax.legend(handles=[start_point, end_point, cost_path], labelcolor="white",
facecolor="black")
ax.annotate(text=f"Cost: {round(cost, 0)}", xy=(end_x+50, end_y),
color="white", weight="bold")
For those with the same issue, I am now using the gdistance package with good results.
Note that for r.cost the input raster values should refer to the time taken to traverse the whole cell.
For gdistance the input raster values should correspond to minutes required to move a metre in that cell.
This is not exactly the same but it may be helpful for some people (especially those with dependency issues relating to GDAL.
def count_connected_ones(arr):
# Create a structuring element for 8-connectivity
struct = generate_binary_structure(2, 2)
# Label connected components
labeled_arr, num_labels = label(arr, struct)
# Replace the original ones with the maximum values
d = {0:0}
for i in range(1,num_labels+1):
d[i] = np.count_nonzero(labeled_arr==i)
lookup_func = np.vectorize(lambda x: d.get(x, x))
# Apply the lookup function to the array
result = lookup_func(labeled_arr)
return result