A* search in MATLAB

Question 1

I have implemented A* search in MATLAB, but I am looking for ways to increase the speed and optimize it. I have tried using a priority queue but I found it doesn't work that well, so I am using a different way to implement the search. I will explain the details, so it might get a bit long. I appreciate your patience.

The grid that I am performing the search on is called workSpace. I am using the cell indexes, so by checking workSpace[inx] == 0 I can tell if the cell is occupied or not, 0 -> free and 1 -> occupied. This is the main body of the A*. I am passing the work space, the index for the start cell, and the index of the goal cell. As well as the heauristic h, and cost g functions. nNodes is the total number of nodes, which I use to find the successor nodes.

function [visitedNodes, f, cameFrom] = aStar(workSpace, startIndx, goalIndx, nNodes, h, g)
 dim = sqrt(nNodes);
 node = startIndx;
 cameFrom(nNodes, 1) = 0;
 cameFrom(node) = node;
 closedSet(nNodes, 1) = 0;
 openSet(nNodes, 1) = 0;
 costSoFar(nNodes, 1) = 0; 
 f = inf(nNodes, 1);
 openSet(node) = 1;
 costSoFar(node) = 0;
 f(node) = 0;
 visitedNodes = 0;
 while sum(openSet) ~= 0
 [~, minFIndx] = min(f);
 f(minFIndx) = inf;
 currentNode = minFIndx;
 if currentNode == goalIndx
 disp('goal Found')
 return
 end
 openSet(currentNode) = 0;
 closedSet(currentNode) = 1;
 childNodes = search.getNeighboursByIndx(workSpace, currentNode, nNodes, dim);
 for i = 1:numel(childNodes)
 if closedSet(childNodes(i)) == 1
 continue
 end
 tentativeGScore = costSoFar(currentNode) + g(currentNode);
 if openSet(childNodes(i)) ~= 1 || tentativeGScore < costSoFar(childNodes(i)) 
 cameFrom(childNodes(i)) = currentNode;
 costSoFar(childNodes(i)) = tentativeGScore;
 f(childNodes(i)) = costSoFar(childNodes(i)) + h(childNodes(i));
 if openSet(childNodes(i)) == 0
 openSet(childNodes(i)) = 1; 
 end
 end
 end
 end
end

As I mentioned, I am not using a priority queue. I am using the below mechanism to simulate the priority queue.

 [~, minFIndx] = min(f);
 f(minFIndx) = inf;
 currentNode = minFIndx;

min searches through f, which is f = g+h, and returns the index of the lowest cell and then I set the value of that cell to inf so it doesn't come up again in the next round. I use the below function to get the successors, it is also very simple:

function successors = getNeighboursByIndx(workSpace, nodeIndx, nNodes, dim)
 delta = [ 1; dim;... 
 -1; -dim];
 neighbours = bsxfun(@plus, delta, nodeIndx);
 % Create the successor matrix and check if all neighbours are within the grid/freeSpace
 % if not, don't add them to the successors matrix
 successors(4, 1) = 0;
 for i=1:4 
 % (1) the index can't be negative 
 % (2) the index should be smaller than the total number of nodes in the grid
 % (3) the index should not be on the wall around the working space
 % (1) (2) (3)
 if (neighbours(i) > 0) && (neighbours(i) <= nNodes) && (mod(neighbours(i), dim) ~= 1) 
 if ~(workSpace(neighbours(i)) == 1) % if the index is not in the wallSpace(1) it is in the freeSpace(0)
 successors(i) = neighbours(i);
 end
 end
 end
 % remove the neighbours that are not eligible as a successor
 % again, `successors` contains the indexes of neighbouring cells
 successors(successors == 0) = [];
end

This function is very simple. I use only 4-neighbours, Up(1)-Down(-1) | Right(dim)-Left(-dim).

The current algorithm completes the search on 1681 cells in around 0.05 seconds. The profiler is telling me that the getNeighboursByIndx function takes almost 50% of the total time. In this specific work space it gets called 411 times. Please let me know if it is not clear and you need more information.

Edit: I am running the search on a dynamic workSpace, the state of a cell is not static and it might change from a free cell to an occupied one or vice-versa. This is the algorithm I am using:

set all cells in workSpace to free `0`
while True:
 Check for changes in the workSpace
 if thereIsAChange
 update the workSpace
 perform the search on the new workSpace
 end
end

I don't change the state of the workSpace during the search. Is it a bad idea?

Question 2

Why don't you memoize the neighbours? It seems there's no variation in the connectivity of the items.

Question 3

@N74 Actually the reason I am trying to improve the performance is that the workspace is not a static environment and the occupied cells are changing, so I must check for the eligibility of the neighbours. Thanks.

Question 4

In this case you can try to make a function to update the neighbours when the workspace changes.

Question 5

@N74 That's somehow what I'm doing, but I thought it is not a good idea to change the workSpace during the search. I have edited my question.

Question 6

You say getNeighboursByIndx takes 50% of the total time. That means that you should try to reduce either:

The number of calls to getNeighboursByIndx
The time it takes to run getNeighboursByIndx

There's a lot of data I don't have, so trying to improve the number of calls to getNeighboursByIndx is hard. Improving the performance of the function itself however, should be possible.

If you look at the function you'll see that you are doing a lot of equality checks inside a loop, where only one element at a time is checked against some other value. This is a prime example of something that should be vectorized.

If you use & instead of && you can avoid the loop and if's quite simply. From the documentation:

expr1 && expr2 represents a logical AND operation that employs short-circuiting behavior.

Where as &:

A & B performs a logical AND of arrays A and B and returns an array containing elements set to either logical 1 (true) or logical 0 (false).

What you can do here is to create a logical mask, with true and false in the positions corresponding to the logical conditions, and use that mask to populate successor:

mask = neighbours > 0 & neighbours <= nNodes & (mod(neighbours, dim) ~= 1) ...
& ~(workSpace(neighbours) == 1)
successors(mask) = neighbours(mask);

You can do the same simplification quite easily with the for loop for i = 1:numel(childNodes) too.

Also, you don't need to use bsxfun when adding scalars to a vector (I'm assuming nodeIndx is a scalar).

The getNeighboursByIndx function can thus be rewritten as:

function successors = getNeighboursByIndx(workSpace, nodeIndx, nNodes, dim)
delta = [ 1; dim; -1; -dim];
neighbours = delta + nodeIndx;
mask = neighbours > 0 & neighbours <= nNodes & (mod(neighbours, dim) ~= 1) ...
 & ~(workSpace(neighbours) == 1)
 successors = neighbours(mask);
end

Much faster, much simpler, and much cleaner! =)

Question 7

The index in neighbours might not be in the range so (workSpace(neighbours) == 1) might make the code crash. You should use 2 different masks. But even then it is not faster than the for loop. It's almost twice slower than the for loop. But it indeed looks nicer.

Question 8

The biggest thing that I found is that you are changeing the array size of successor, which you don't need.

Further the map is static during a single search. If it isn't A* can't guarantee optimality, as we are never able to close a node (the path currently concidered best may become blocked during the while loop).

Therefore it may make sense to preallocate the neightbours, if the work space has many blocked paths (like a maze). Additionally instead of removing cases where we walk against a wall, just walk to the current position. A* will sort those out for us.

Some micro optimizations (not in any particular order):

Inline search.getNeighboursByIndx(workSpace, currentNode, nNodes, dim) - its not that big and as search is an object or struct, it is painfully slow to access.
closedSet and openSet are both boolean arrays. Initialize and treat them as such using false(dimension) and true(dimension) this makes comparisons faster (else the double is "casted" to a boolean every time)
change the check in the while loop from sum(openSet) ~= 0 to any(openSet) depending on change above. See: here
do not reaccess childsNodes(i) every time. It's marginally faster to store it in a variable and reference that one.
if you have a LOT (say over 100k) nodes, then max() will be outperformed by a priority queue if you use a heap as backbone ( O(n) for max and O(log(n)) for heap). However I don't know of any pure matlab implementation of a heap. DO NOT use sort() for the queue. That is O(n*log(n)) (Quick Sort implementation) and really hurts.

I couldn't get the code to display as one large block in this post. So here is the pastebin version. Feel free to edit it.

Question 9

You are right, my bad. Matlab is implementing quick sort, see here

Stewie Griffin Stewie Griffin 2,0771 gold badge18 silver badges34 bronze badges · Answer 1 · 2016-07-12 12:49:21Z

You say getNeighboursByIndx takes 50% of the total time. That means that you should try to reduce either:

The number of calls to getNeighboursByIndx
The time it takes to run getNeighboursByIndx

There's a lot of data I don't have, so trying to improve the number of calls to getNeighboursByIndx is hard. Improving the performance of the function itself however, should be possible.

If you look at the function you'll see that you are doing a lot of equality checks inside a loop, where only one element at a time is checked against some other value. This is a prime example of something that should be vectorized.

If you use & instead of && you can avoid the loop and if's quite simply. From the documentation:

expr1 && expr2 represents a logical AND operation that employs short-circuiting behavior.

Where as &:

A & B performs a logical AND of arrays A and B and returns an array containing elements set to either logical 1 (true) or logical 0 (false).

What you can do here is to create a logical mask, with true and false in the positions corresponding to the logical conditions, and use that mask to populate successor:

mask = neighbours > 0 & neighbours <= nNodes & (mod(neighbours, dim) ~= 1) ...
& ~(workSpace(neighbours) == 1)
successors(mask) = neighbours(mask);

You can do the same simplification quite easily with the for loop for i = 1:numel(childNodes) too.

Also, you don't need to use bsxfun when adding scalars to a vector (I'm assuming nodeIndx is a scalar).

The getNeighboursByIndx function can thus be rewritten as:

function successors = getNeighboursByIndx(workSpace, nodeIndx, nNodes, dim)
delta = [ 1; dim; -1; -dim];
neighbours = delta + nodeIndx;
mask = neighbours > 0 & neighbours <= nNodes & (mod(neighbours, dim) ~= 1) ...
 & ~(workSpace(neighbours) == 1)
 successors = neighbours(mask);
end

Much faster, much simpler, and much cleaner! =)

The index in neighbours might not be in the range so (workSpace(neighbours) == 1) might make the code crash. You should use 2 different masks. But even then it is not faster than the for loop. It's almost twice slower than the for loop. But it indeed looks nicer.

FirefoxMetzger FirefoxMetzger 1,1016 silver badges11 bronze badges · Answer 2 · 2016-08-09 10:13:46Z

The biggest thing that I found is that you are changeing the array size of successor, which you don't need.

Further the map is static during a single search. If it isn't A* can't guarantee optimality, as we are never able to close a node (the path currently concidered best may become blocked during the while loop).

Therefore it may make sense to preallocate the neightbours, if the work space has many blocked paths (like a maze). Additionally instead of removing cases where we walk against a wall, just walk to the current position. A* will sort those out for us.

Some micro optimizations (not in any particular order):

Inline search.getNeighboursByIndx(workSpace, currentNode, nNodes, dim) - its not that big and as search is an object or struct, it is painfully slow to access.
closedSet and openSet are both boolean arrays. Initialize and treat them as such using false(dimension) and true(dimension) this makes comparisons faster (else the double is "casted" to a boolean every time)
change the check in the while loop from sum(openSet) ~= 0 to any(openSet) depending on change above. See: here
do not reaccess childsNodes(i) every time. It's marginally faster to store it in a variable and reference that one.
if you have a LOT (say over 100k) nodes, then max() will be outperformed by a priority queue if you use a heap as backbone ( O(n) for max and O(log(n)) for heap). However I don't know of any pure matlab implementation of a heap. DO NOT use sort() for the queue. That is O(n*log(n)) (Quick Sort implementation) and really hurts.

I couldn't get the code to display as one large block in this post. So here is the pastebin version. Feel free to edit it.

You are right, my bad. Matlab is implementing quick sort, see here

Stack Exchange Network

A* search in MATLAB

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

A* search in MATLAB

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions