Given $n$ points $x_1, x_2, \ldots, x_n \in \mathbb{R}^k,ドル where $k$ may be large, is it possible to devise a fast algorithm of the following form
(1) Preparation: take the $n$ points as input and perform whatever preprocessing is necessary.
(2) Query: given a vector $w$ and a constant $b$ (together specifying a half-space), report the subset of points $\{x_i \mid w^\top x_i \geq b\}$.
such that the asymptotic complexity of each query is less than $\Theta(k \cdot n)$?
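For concreteness, the baseline to beat is the trivial scan over all points, which costs $\Theta(k \cdot n)$ per query. A minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def halfspace_report(X, w, b):
    """Naive query: one pass over all n points, Theta(k * n) work per query.

    X: (n, k) array of points, w: (k,) normal vector, b: scalar threshold.
    Returns the rows x of X with w . x >= b.
    """
    return X[X @ w >= b]

# Example: four points in the plane, half-space x-coordinate >= 0.5.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
print(halfspace_report(X, np.array([1.0, 0.0]), 0.5))
```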
There are no restrictions on the hyperplane, and an arbitrary hyperplane can be specified at query time, correct? — Joe, Apr 21, 2014
How much space do you allow your preprocessing to use? — Joe, Apr 21, 2014
2 Answers
The keywords you may be looking for are simplex range searching and half-space range searching; see Chapter 16 of Computational Geometry: Algorithms and Applications. If you are interested in general simplices (intersections of multiple half-spaces), then two data structures in particular might interest you: partition trees and cutting trees.
The following paper on partition trees achieves query time $O(n^{1-1/d}(\log n)^{O(1)}),ドル preprocessing time $O(n \log n),ドル and space $O(n)$:
Matoušek, Jiří. "Efficient partition trees." Discrete & Computational Geometry 8.1 (1992): 315-334.
In a cutting tree, you consider the dual problem (each point of the input is represented by a line in the dual plane) and trade off space for query time: you get $O(\log n)$ query time at a cost of $O(n^{2+\epsilon})$ space.
However, if you really only care about a single half-space in your query, you can do better. In The Power of Geometric Duality, Chazelle gives an optimal data structure for half-space range reporting in the plane with $O(\log n + k)$ query time and $O(n)$ space.
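For intuition about the duality these structures rely on: the standard planar transform sends the point $(a, b)$ to the line $y = ax - b$ and the line $y = mx + c$ to the point $(m, -c),ドル and it preserves above/below incidences, so reporting input points above a query line becomes reporting dual lines above a query point. A quick randomized check of this property (a sketch; the function names are made up):

```python
import random

def dual_point(m, c):
    # Dual of the line y = m*x + c is the point (m, -c).
    return (m, -c)

def dual_line(a, b):
    # Dual of the point (a, b) is the line y = a*x - b, stored as (slope, intercept).
    return (a, -b)

def above(point, line):
    # True iff the point lies strictly above the line y = m*x + c.
    (x, y), (m, c) = point, line
    return y > m * x + c

# The duality preserves above/below: p above L  <=>  dual of L above dual of p.
random.seed(0)
for _ in range(1000):
    p = (random.uniform(-5, 5), random.uniform(-5, 5))
    L = (random.uniform(-5, 5), random.uniform(-5, 5))  # (slope, intercept)
    assert above(p, L) == above(dual_point(*L), dual_line(*p))
print("duality check passed")
```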
I think for the last solution you mention, $k$ is the number of points to be reported, not the dimension of the space. It seems that to achieve $O(\log n)$ query time in high-dimensional space of dimension $d,ドル at least $n^d$ preprocessing time is required; see Chazelle, "Cutting hyperplanes for divide-and-conquer". — Strin, Apr 23, 2014
Yes, you're right. I use $d$ for dimension and $k$ for output size in my answer, which are sort of the canonical variable choices in the literature. Sorry, I forgot that the question used $k$ for the dimension. — Joe, Apr 24, 2014
This is a bit of a stretch, but you can compute the arrangement of the hyperplanes $w \cdot x_i = w \cdot x_j$ (in the space of weight vectors $w$) for all pairs of points $x_i, x_j$ in your set, I believe in $O(n^{2k-1})$ time; search for "arrangements of hyperplanes" to confirm the details. In any case, the arrangement can certainly be computed in $n^{O(k)}$ time, assuming $k$ is a constant.

Each full-dimensional cone $C$ of the arrangement induces a unique ordering of the points: every functional $w$ in $C$ sorts the points the same way. Let $w(C)$ be the centroid of the generators of the cone $C$ (with all vectors normalized to length 1ドル$). It is not absolutely guaranteed, but given a query vector $w,ドル the ordering induced by $w$ should be very close to the ordering induced by $w(C),ドル where $C$ minimizes the distance between $w$ and $w(C)$.

Thus, for $n$ large compared to $k,ドル you can build a range tree over the centroids $w(C)$ and find the nearest neighbor of a query vector $w$ in $O((\log n)^{k-1})$ time, for constant $k$. Once you know the closest $w(C),ドル the ordering of the points induced by weight vectors in $C$ is already precomputed, so you can binary search for the last point in the ordering satisfying $w \cdot x \geq b$; this takes $O(\log n)$ time (again for constant $k$). The overall query time is $O((\log n)^{k-1}),ドル which beats $O(n)$ when $n$ is large compared to $k$.

One caveat: the ordering might be slightly off if $w$ lies in a cone $C$ whose centroid $w(C)$ is not the centroid closest to $w$. However, you can prove that the true cone is either the cone containing $w$ or a neighbor of it, so if your points are in general position, the computed set of points satisfying $w \cdot x \geq b$ is off by at most one point.
However, I must confess that even in dimension $k = 3$ or 4ドル,ドル this seems like an awful lot of preprocessing time and storage just to bring the query complexity down to $O((\log n)^{k-1})$: you must compute the entire arrangement of hyperplanes and then build a range tree over every cone centroid, and the storage for the range tree is especially costly. Moreover, if $k > \log n + 1,ドル the query complexity is already worse than $O(n),ドル so you need $n$ to be at least exponential in $k$.