I have implemented A* search in MATLAB, but I am looking for ways to increase the speed and optimize it. I have tried using a priority queue but I found it doesn't work that well, so I am using a different way to implement the search. I will explain the details, so it might get a bit long. I appreciate your patience.
The grid that I am performing the search on is called workSpace
. I am using the cell indexes, so by checking workSpace[inx] == 0
I can tell if the cell is occupied or not, 0 -> free
and 1 -> occupied
. This is the main body of the A*. I am passing the work space, the index for the start cell, and the index of the goal cell. As well as the heauristic h
, and cost g
functions. nNodes
is the total number of nodes, which I use to find the successor nodes.
function [visitedNodes, f, cameFrom] = aStar(workSpace, startIndx, goalIndx, nNodes, h, g)
dim = sqrt(nNodes);
node = startIndx;
cameFrom(nNodes, 1) = 0;
cameFrom(node) = node;
closedSet(nNodes, 1) = 0;
openSet(nNodes, 1) = 0;
costSoFar(nNodes, 1) = 0;
f = inf(nNodes, 1);
openSet(node) = 1;
costSoFar(node) = 0;
f(node) = 0;
visitedNodes = 0;
while sum(openSet) ~= 0
[~, minFIndx] = min(f);
f(minFIndx) = inf;
currentNode = minFIndx;
if currentNode == goalIndx
disp('goal Found')
return
end
openSet(currentNode) = 0;
closedSet(currentNode) = 1;
childNodes = search.getNeighboursByIndx(workSpace, currentNode, nNodes, dim);
for i = 1:numel(childNodes)
if closedSet(childNodes(i)) == 1
continue
end
tentativeGScore = costSoFar(currentNode) + g(currentNode);
if openSet(childNodes(i)) ~= 1 || tentativeGScore < costSoFar(childNodes(i))
cameFrom(childNodes(i)) = currentNode;
costSoFar(childNodes(i)) = tentativeGScore;
f(childNodes(i)) = costSoFar(childNodes(i)) + h(childNodes(i));
if openSet(childNodes(i)) == 0
openSet(childNodes(i)) = 1;
end
end
end
end
end
As I mentioned, I am not using a priority queue. I am using the below mechanism to simulate the priority queue.
[~, minFIndx] = min(f);
f(minFIndx) = inf;
currentNode = minFIndx;
min
searches through f
, which is f = g+h
, and returns the index of the lowest cell and then I set the value of that cell to inf
so it doesn't come up again in the next round. I use the below function to get the successors, it is also very simple:
function successors = getNeighboursByIndx(workSpace, nodeIndx, nNodes, dim)
delta = [ 1; dim;...
-1; -dim];
neighbours = bsxfun(@plus, delta, nodeIndx);
% Create the successor matrix and check if all neighbours are within the grid/freeSpace
% if not, don't add them to the successors matrix
successors(4, 1) = 0;
for i=1:4
% (1) the index can't be negative
% (2) the index should be smaller than the total number of nodes in the grid
% (3) the index should not be on the wall around the working space
% (1) (2) (3)
if (neighbours(i) > 0) && (neighbours(i) <= nNodes) && (mod(neighbours(i), dim) ~= 1)
if ~(workSpace(neighbours(i)) == 1) % if the index is not in the wallSpace(1) it is in the freeSpace(0)
successors(i) = neighbours(i);
end
end
end
% remove the neighbours that are not eligible as a successor
% again, `successors` contains the indexes of neighbouring cells
successors(successors == 0) = [];
end
This function is very simple. I use only 4-neighbours, Up(1)-Down(-1) | Right(dim)-Left(-dim).
The current algorithm completes the search on 1681 cells in around 0.05 seconds
. The profiler is telling me that the getNeighboursByIndx
function takes almost 50% of the total time. In this specific work space it gets called 411 times. Please let me know if it is not clear and you need more information.
Edit: I am running the search on a dynamic workSpace
, the state of a cell is not static and it might change from a free cell to an occupied one or vice-versa. This is the algorithm I am using:
set all cells in workSpace to free `0`
while True:
Check for changes in the workSpace
if thereIsAChange
update the workSpace
perform the search on the new workSpace
end
end
I don't change the state of the workSpace
during the search. Is it a bad idea?
2 Answers 2
You say getNeighboursByIndx
takes 50% of the total time. That means that you should try to reduce either:
- The number of calls to
getNeighboursByIndx
- The time it takes to run
getNeighboursByIndx
There's a lot of data I don't have, so trying to improve the number of calls to getNeighboursByIndx
is hard. Improving the performance of the function itself however, should be possible.
If you look at the function you'll see that you are doing a lot of equality checks inside a loop, where only one element at a time is checked against some other value. This is a prime example of something that should be vectorized.
If you use &
instead of &&
you can avoid the loop and if
's quite simply. From the documentation:
expr1 && expr2
represents a logical AND operation that employs short-circuiting behavior.
Where as &
:
A & B
performs a logical AND of arrays A and B and returns an array containing elements set to either logical 1 (true) or logical 0 (false).
What you can do here is to create a logical mask, with true
and false
in the positions corresponding to the logical conditions, and use that mask to populate successor
:
mask = neighbours > 0 & neighbours <= nNodes & (mod(neighbours, dim) ~= 1) ...
& ~(workSpace(neighbours) == 1)
successors(mask) = neighbours(mask);
You can do the same simplification quite easily with the for loop for i = 1:numel(childNodes)
too.
Also, you don't need to use bsxfun
when adding scalars to a vector (I'm assuming nodeIndx
is a scalar).
The getNeighboursByIndx
function can thus be rewritten as:
function successors = getNeighboursByIndx(workSpace, nodeIndx, nNodes, dim)
delta = [ 1; dim; -1; -dim];
neighbours = delta + nodeIndx;
mask = neighbours > 0 & neighbours <= nNodes & (mod(neighbours, dim) ~= 1) ...
& ~(workSpace(neighbours) == 1)
successors = neighbours(mask);
end
Much faster, much simpler, and much cleaner! =)
-
\$\begingroup\$ The index in neighbours might not be in the range so
(workSpace(neighbours) == 1)
might make the code crash. You should use 2 different masks. But even then it is not faster than the for loop. It's almost twice slower than the for loop. But it indeed looks nicer. \$\endgroup\$Ali– Ali2016年07月23日 13:38:18 +00:00Commented Jul 23, 2016 at 13:38
The biggest thing that I found is that you are changeing the array size of successor, which you don't need.
Further the map is static during a single search. If it isn't A* can't guarantee optimality, as we are never able to close a node (the path currently concidered best may become blocked during the while
loop).
Therefore it may make sense to preallocate the neightbours, if the work space has many blocked paths (like a maze). Additionally instead of removing cases where we walk against a wall, just walk to the current position. A* will sort those out for us.
Some micro optimizations (not in any particular order):
- Inline
search.getNeighboursByIndx(workSpace, currentNode, nNodes, dim)
- its not that big and assearch
is an object or struct, it is painfully slow to access. closedSet
andopenSet
are both boolean arrays. Initialize and treat them as such usingfalse(dimension)
andtrue(dimension)
this makes comparisons faster (else the double is "casted" to a boolean every time)- change the check in the while loop from
sum(openSet) ~= 0
toany(openSet)
depending on change above. See: here - do not reaccess
childsNodes(i)
every time. It's marginally faster to store it in a variable and reference that one. - if you have a LOT (say over 100k) nodes, then
max()
will be outperformed by a priority queue if you use a heap as backbone ( O(n) for max and O(log(n)) for heap). However I don't know of any pure matlab implementation of a heap. DO NOT usesort()
for the queue. That is O(n*log(n)) (Quick Sort implementation) and really hurts.
I couldn't get the code to display as one large block in this post. So here is the pastebin version. Feel free to edit it.
-
\$\begingroup\$ You are right, my bad. Matlab is implementing quick sort, see here \$\endgroup\$FirefoxMetzger– FirefoxMetzger2016年08月09日 11:47:33 +00:00Commented Aug 9, 2016 at 11:47
workSpace
during the search. I have edited my question. \$\endgroup\$