I'm working on an A* pathfinding algorithm to solve a programming problem, and part of the constraint is the runtime. I need to find a path in a 2D grid with blocked or non-blocked tiles. You can't move diagonally.
I have succeeded to make a working prototype of the function which can and will solve the problems encountered, but it lacks optimisation.
I think that I need to optimise my heuristic function. For now, I calculate candidly: the distance between the current tile and the goal if no tile were blocked, plus the current cost.
n.distGoal = abs(n.x - nTargetX) + abs(n.y - nTargetY) + n.cost;
Interestingly, I've found an answer that does not include the cost when calculating its heuristic. When trying it out, my algorithm doesn't work anymore.
Have you any clues on how I could make the function more optimised?
Here are a few more snippets of my code, to help understand and maybe spot an error that causes long runtime:
Finding the neighbours:
set<tile> neighbours;
for(int k=-1;k<2;k+=2){
if(!(cur.x+k < 0|| cur.x+k>=nMapWidth)){ //if we're out of bonds, there's nowhere to go
struct tile neigh;
neigh.x = cur.x+k;
neigh.y = cur.y; //yum
neigh.mapWidth = nMapWidth;
int positionInMap = neigh.x + (neigh.y*nMapWidth);
if(!(pMap[positionInMap] == 0)){ //if we can walk there
neigh.cost = cur.cost + 1;
neighbours.insert(neigh);
}
}
if(!(cur.y+k < 0 || cur.y+k>=nMapHeight)){ //if we're out of bonds, there's nowhere to go
struct tile neigh;
neigh.x = cur.x;
neigh.y = cur.y+k;
neigh.mapWidth = nMapWidth;
int positionInMap = neigh.x + (neigh.y*nMapWidth);
if(!(pMap[positionInMap] == 0)){ //if we can walk there
neigh.cost = cur.cost + 1;
neighbours.insert(neigh);
}
}
}
Reconstructing the path when we reached the end:
void buildPath(map<tile,tile> cameFrom, tile cur, int* pOutBuffer){ //Reconstructs the path
int i = cur.distGoal - 1;
pOutBuffer[i] = cur.x + cur.y*cur.mapWidth;
map<tile,tile>::iterator it = cameFrom.find(cur);
while (it != cameFrom.end()){
i--;
if(i<0){
break;
}
cur = it->second;
pOutBuffer[i] = cur.x + cur.y*cur.mapWidth;
it = cameFrom.find(cur);
}
}
1 Answer 1
Interestingly, I've found an answer that does not include the cost when calculating its heuristic. When trying it out, my algorithm doesn't work anymore.
Let's make sure that you understand the algorithm. It is:
- We have a queue where the entries in the queue are a possible path and an estimate of the cost of that path.
- The queue keeps itself sorted by cost, lowest cost path first, highest cost path last
Basically the idea is to assume that the lowest-cost path is the one to explore first; only if that does not work out do we try the next-lowest.
So now let's talk about the cost. The estimated cost is the sum of:
- The known cost of the path so far
- The estimated cost of getting to the goal from the end of the path
If you've already gone 100 squares (known) and you think you have 50 more to go (estimated), the total estimated cost is 150.
You MUST include the known cost in with the estimated cost of the remainder; if you did not then you would, say, take a path known to be at least 100 long and put it in the queue as though it had a cost of 50!
So you must compute the estimated cost in two parts: the known cost of the path, and the estimated cost of the remainder. It is the second part that is usually called the "heuristic", because it's a guess. The known cost can be computed exactly.
I think that I need to optimise my heuristic function. For now, I calculate candidly : the distance between the current tile and the goal if no tile were blocked, plus the current cost.
The key feature of the heuristic is that it must never overestimate. A heuristic which has this property is called admissible.
It sounds like your heuristic is admissible. If you can only move orthogonally, not diagonally, then the Manhattan distance is admissible.
Your heuristic is called the Manhattan distance because it's the distance between two points in New York City, if you're constrained to moving on the grid rather than going diagonally.
If you can move diagonally then the Manhattan distance can overestimate, and is not admissible. If diagonal movement is allowed then use the Euclidean distance: the straight-line distance between two points.
One way you can test your algorithm is to use zero as your heuristic; just say that the estimated cost is the known cost plus nothing. That is an admissible heuristic because it always under-estimates. A-star with a zero heuristic is also called Dijkstra's Algorithm.
You say you want to optimize your heuristic. The best way to optimize the heuristic is to make it as accurate as possible without ever overestimating, while at the same time making it fast to compute. The Manhattan distance is a great heuristic because it is very fast to compute, is often exact, and never overestimates.
The more accurate the heuristic is, the fewer paths the algorithm has to consider before finding the best one. So accuracy is important.
I would stick to using the Manhattan distance and profile your code to figure out where the performance problem is. Likely the problem is not with your heuristic at all, but with maintaining the heap invariant in the priority queue, or with figuring out which nodes are already visited.
If you're interested in this subject you might want to read my series of articles from 2007 on implementing a-star in C#.
neigh.cost = cur.cost + 1
, as all tiles have either a cost of 1 or are impassable. \$\endgroup\$visited
boolean. The code is significantly faster! But apparently, it doesn't solve corner cases. Thanks for your help, I'm back to work I guess! \$\endgroup\$