For my project I can have 0-500 moving squares. The largest possible square width is at most 3 times the smallest, and a square can move at up to 3 times its own width per second. About 1 square is added and about 1 square is deleted every second.
What I'm trying to implement now is a neural network to drive the speed and direction of the squares. Each square will have its own neural network, which takes as input the x and y positions of the 5 squares nearest to it. To find the nearest squares, I take the Euclidean distance between the centers of the squares' bodies.
My problem is that computing the distance from each square to every other square - to find its 5 nearest neighbors - is going to make my program run at a snail's pace.
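To be concrete, the brute-force pass I mean looks roughly like this (a minimal sketch; `squares` and `five_nearest_brute` are names I'm using for illustration, with `squares` holding the center coordinates):

```python
import math

def five_nearest_brute(squares, i):
    """O(n) work per square, O(n^2) overall: compare square i to all others."""
    cx, cy = squares[i]
    others = [(math.hypot(x - cx, y - cy), j)
              for j, (x, y) in enumerate(squares) if j != i]
    # Sort by distance and keep the indices of the 5 closest squares.
    return [j for _, j in sorted(others)[:5]]
```

With 500 squares that is roughly 250,000 distance computations per update, repeated every frame.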
So what is a good data structure I could use to reduce the number of checks I'm performing?
Also, I've heard that k-d trees are good at finding nearest neighbors. However, because the squares are going to be moving a lot, I'm not sure it would actually be the best data structure to use.
2 Answers
Since you only measure distances from the centers of the squares, you can restate your problem as follows: Maintain a dynamic collection $X$ of $n$ points in Euclidean space so that you can quickly find the $k$ nearest neighbors in $X \setminus \{p\}$ of a query point $p \in X$.
Using the data structure described here you can maintain such a collection in $O(\log^3 n)$ expected amortized time per insertion and $O(\log^6 n)$ expected amortized time per deletion, after preprocessing requiring $O(n \log^2 n)$ expected time.
The data structure supports nearest neighbor queries (given a point $q$, report the point in $X$ that is closest to $q$) in $O(\log^2 n)$ worst-case time. Then, you can report the $k$ nearest neighbors of $p \in X$ in $O(k \log^6 n)$ expected amortized time as follows:
- Delete $p$ from $X$ in $O(\log^6 n)$ expected amortized time.
- Repeat $k$ times:
  - Query the data structure for the nearest neighbor $p' \in X$ of $p$ in $O(\log^2 n)$ worst-case time.
  - Report $p'$ as one of the $k$ nearest neighbors of $p$.
  - Delete $p'$ from $X$ in $O(\log^6 n)$ expected amortized time.
- Reinsert all deleted points into $X$ in $O(k \log^3 n)$ expected amortized time.
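The delete / query / reinsert pattern above can be sketched as follows. The `DynamicNN` class here is a brute-force stand-in I'm assuming purely for illustration; the linked structure would replace it with the same interface and give the polylogarithmic bounds quoted above:

```python
import math

class DynamicNN:
    """Stand-in for a dynamic nearest-neighbor structure.
    Brute-force here; a real implementation gives polylog operations."""
    def __init__(self, points=()):
        self.pts = set(points)

    def insert(self, p):
        self.pts.add(p)

    def delete(self, p):
        self.pts.discard(p)

    def nearest(self, q):
        # Nearest point currently stored, by Euclidean distance.
        return min(self.pts, key=lambda p: math.dist(p, q))

def k_nearest(ds, p, k):
    """k nearest neighbors of p, excluding p itself."""
    ds.delete(p)                 # remove the query point from X
    found = []
    for _ in range(k):
        q = ds.nearest(p)        # nearest among the remaining points
        found.append(q)
        ds.delete(q)             # remove it so the next query skips it
    for q in found:              # restore the structure to its old state
        ds.insert(q)
    ds.insert(p)
    return found
```

The points come back in increasing order of distance, since each query returns the closest point still present.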
You may be able to borrow from methods for n-body simulation, such as the Barnes-Hut method, which relies on an octree (or, in 2D, a quadtree). You can update a quadtree or octree efficiently as points move, use it to speed up nearest-neighbor queries, and it is simple to implement.
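As a rough illustration, here is a minimal point quadtree with insertion and a best-first k-nearest-neighbor query. This is a sketch, not the full Barnes-Hut machinery; the node capacity and the square region bounds are assumptions:

```python
import heapq
import math

class QuadTree:
    """Point quadtree over a square region centered at (cx, cy)
    with half-width `half`; stores (x, y) square centers."""
    CAPACITY = 4  # max points per leaf before it splits (assumed)

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.points = []
        self.children = None  # four sub-quadrants once split

    def insert(self, p):
        x, y = p
        if abs(x - self.cx) > self.half or abs(y - self.cy) > self.half:
            return False  # point lies outside this node's region
        if self.children is None:
            if len(self.points) < self.CAPACITY:
                self.points.append(p)
                return True
            self._split()
        return any(child.insert(p) for child in self.children)

    def _split(self):
        h = self.half / 2
        self.children = [QuadTree(self.cx + dx * h, self.cy + dy * h, h)
                         for dx in (-1, 1) for dy in (-1, 1)]
        pts, self.points = self.points, []
        for p in pts:
            any(child.insert(p) for child in self.children)

    def _min_dist(self, q):
        # Distance from q to this node's bounding box (0 if q is inside).
        dx = max(abs(q[0] - self.cx) - self.half, 0)
        dy = max(abs(q[1] - self.cy) - self.half, 0)
        return math.hypot(dx, dy)

    def knn(self, q, k):
        """Best-first search: visit nodes in order of box distance to q,
        stopping once no unvisited box can beat the current k-th distance."""
        best = []  # max-heap of (-dist, point), size <= k
        frontier = [(0.0, id(self), self)]  # id() breaks comparison ties
        while frontier:
            d, _, node = heapq.heappop(frontier)
            if len(best) == k and d > -best[0][0]:
                break  # every remaining box is farther than the k-th point
            for p in node.points:
                dp = math.hypot(p[0] - q[0], p[1] - q[1])
                if len(best) < k:
                    heapq.heappush(best, (-dp, p))
                elif dp < -best[0][0]:
                    heapq.heapreplace(best, (-dp, p))
            for child in (node.children or []):
                heapq.heappush(frontier, (child._min_dist(q), id(child), child))
        return [p for _, p in sorted(best, reverse=True)]  # nearest first
```

For moving squares, a common approach is to rebuild the tree each frame (cheap for a few hundred points) or to reinsert only the points that crossed a cell boundary.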
It suffices to store the centers of the squares in the data structure. Their width is irrelevant for purposes of finding the 5 nearest squares.
You can find an overview of algorithms for nearest neighbor search here: https://en.wikipedia.org/wiki/Nearest_neighbor_search