An $\varepsilon$-net for a range space $(X,\mathcal{R})$ is a subset $N \subseteq X$ such that $N \cap R \neq \emptyset$ for every range $R \in \mathcal{R}$ with $|X \cap R| \ge \varepsilon |X|$.
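As a concrete illustration of this definition, here is a minimal Python check of the $\varepsilon$-net condition. It is only a sketch: the function name `is_epsilon_net` and the representation of ranges as frozensets are my own choices, not anything from the literature.

```python
from typing import Iterable

def is_epsilon_net(X: set, ranges: Iterable[frozenset], N: set, eps: float) -> bool:
    """Return True iff N hits every range R with |X ∩ R| >= eps * |X|."""
    n = len(X)
    return all(N & R for R in ranges if len(X & R) >= eps * n)

# Tiny example: points 0..9 and eps = 0.3, so only ranges containing
# at least 3 points of X are "heavy" and must be hit by N.
X = set(range(10))
ranges = [frozenset({0, 1, 2, 3}), frozenset({7, 8, 9}), frozenset({4})]
N = {1, 8}
print(is_epsilon_net(X, ranges, N, 0.3))  # True: both heavy ranges are hit
```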
Given a range space $(X,\mathcal{R})$ of VC-dimension $d,ドル an $\varepsilon$-net of size $O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon}\right)$ can be computed in time $O(d)^{3d}\left(\frac{1}{\varepsilon^2}\log\frac{d}{\varepsilon}\right)^d|X|$ (see [1], Theorem 4.6).
To what extent is the $O(d)^{3d}$ term intrinsic to this problem? Specifically, can it be improved to 2ドル^{O(d)}$? Are any lower bounds known?
A related question: are there general conditions on $(X,\mathcal{R})$ for which such an improvement is known to exist?
[1] Bernard Chazelle. The Discrepancy Method: Randomness and Complexity. Cambridge University Press, 2000.
Comment by Suresh Venkat (May 9, 2011): Nice question, and welcome, Don!
1 Answer
If you are OK with a randomized algorithm, then you can just sample $O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon\delta}\right)$ points uniformly at random, and this will be an $\varepsilon$-net with probability at least 1ドル-\delta$ (this is the classical $\varepsilon$-net theorem of Haussler and Welzl).
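A minimal sketch of this randomized construction in Python. The constant $c$ below is purely illustrative (the $\varepsilon$-net theorem only guarantees some absolute constant), and the function name is my own.

```python
import math
import random

def random_epsilon_net(points, eps, d, delta, rng=random.Random(0)):
    """Draw m = ceil(c * (d/eps) * log(d/(eps*delta))) uniform samples.
    The epsilon-net theorem guarantees the sample is an eps-net with
    probability at least 1 - delta, for a suitable absolute constant c."""
    c = 8  # illustrative constant, not the tight one from the theorem
    m = math.ceil(c * (d / eps) * math.log(d / (eps * delta)))
    pts = list(points)
    return {rng.choice(pts) for _ in range(m)}

# Example: 1000 points, eps = 0.1, VC-dimension 2, failure probability 0.01.
net = random_epsilon_net(range(1000), eps=0.1, d=2, delta=0.01)
```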
For the deterministic algorithm, I believe the $d^{O(d)}$ term arises as follows. You first partition the full set into subsets of size about $s = O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon}\right),ドル where $s$ is the size of the $\varepsilon$-net you hope to obtain in the end. Then you merge two such subsets and reduce the result back to half its size, repeating until a single set remains. The reduction step essentially considers all distinct ranges restricted to the merged set of size 2ドルs$; by the Sauer-Shelah bound there are $O((2s)^d)$ of them, and you need to "check" each one to see whether it has been hit. Since $s$ itself contains a factor of $d,ドル the term $s^d$ already hides a $d^{O(d)}$. So to improve this bound within the deterministic merge-reduce framework, you would somehow need to implicitly "check" all ranges without explicitly enumerating them. This seems difficult to do deterministically, but I am not sure I know an explicit lower bound; most work in this arena assumes $d$ is constant and abuses that heavily.
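To make the structure of this framework concrete, here is a skeleton in Python. It is only a sketch of the merge-reduce pattern, not Chazelle's actual algorithm: `restricted_ranges` and `halve` are hypothetical placeholders for the two steps described above (enumerating the distinct restrictions, and deterministically discarding half the points while still hitting every heavy range).

```python
def merge_reduce_net(points, s, restricted_ranges, halve):
    """Skeleton of the deterministic merge-reduce framework sketched above.

    points            : list of points of X
    s                 : target net size, s = O((d/eps) * log(d/eps))
    restricted_ranges : S -> the distinct restrictions of the ranges to S;
                        by the Sauer-Shelah lemma there are O(|S|^d) of them
    halve             : (S, ranges) -> a subset of S of about half the size
                        that still hits every heavy range; enumerating the
                        O((2s)^d) restrictions here is the d^{O(d)} bottleneck
    """
    # Partition the input into blocks of size about s.
    parts = [points[i:i + s] for i in range(0, len(points), s)]
    # Repeatedly merge blocks pairwise and reduce each merged block
    # (size about 2s) back to size about s, until one block remains.
    while len(parts) > 1:
        merged = []
        for i in range(0, len(parts) - 1, 2):
            union = parts[i] + parts[i + 1]
            merged.append(halve(union, restricted_ranges(union)))
        if len(parts) % 2 == 1:
            merged.append(parts[-1])  # carry an unpaired block forward
        parts = merged
    return parts[0]
```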