An $\varepsilon$-net for a range space $(X,\mathcal{R})$ is a subset $N \subseteq X$ such that $N \cap R \neq \emptyset$ for every range $R \in \mathcal{R}$ with $|X \cap R| \ge \varepsilon |X|$.
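As a concrete illustration of this definition, here is a minimal Python check of the $\varepsilon$-net condition. It is only a sketch: the function name `is_epsilon_net` and the representation of ranges as frozensets are my own choices, not anything from the literature.

```python
from typing import Iterable

def is_epsilon_net(X: set, ranges: Iterable[frozenset], N: set, eps: float) -> bool:
    """Return True iff N hits every range R with |X ∩ R| >= eps * |X|."""
    n = len(X)
    return all(N & R for R in ranges if len(X & R) >= eps * n)

# Tiny example: points 0..9 and eps = 0.3, so only ranges containing
# at least 3 points of X are "heavy" and must be hit by N.
X = set(range(10))
ranges = [frozenset({0, 1, 2, 3}), frozenset({7, 8, 9}), frozenset({4})]
N = {1, 8}
print(is_epsilon_net(X, ranges, N, 0.3))  # True: both heavy ranges are hit
```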
Given a range space $(X,\mathcal{R})$ of VC-dimension $d,ドル an $\varepsilon$-net of size $O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon}\right)$ can be computed in time $O(d)^{3d}\left(\frac{1}{\varepsilon^2}\log\frac{d}{\varepsilon}\right)^d|X|$ (see [1], Theorem 4.6).
To what extent is the $O(d)^{3d}$ term intrinsic to this problem? Specifically, can it be improved to 2ドル^{O(d)}$? Are any lower bounds known?
A related question: are there general conditions on $(X,\mathcal{R})$ for which such an improvement is known to exist?
[1] Bernard Chazelle. The Discrepancy Method: Randomness and Complexity. Cambridge University Press, 2000.
Comment by Suresh Venkat (May 9, 2011): Nice question, and welcome, Don!
1 Answer
If you are OK with a randomized algorithm, then you can just sample $O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon\delta}\right)$ points uniformly at random, and this will be an $\varepsilon$-net with probability at least 1ドル-\delta$ (this is the classical $\varepsilon$-net theorem of Haussler and Welzl).
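A minimal sketch of this randomized construction in Python. The constant $c$ below is purely illustrative (the $\varepsilon$-net theorem only guarantees some absolute constant), and the function name is my own.

```python
import math
import random

def random_epsilon_net(points, eps, d, delta, rng=random.Random(0)):
    """Draw m = ceil(c * (d/eps) * log(d/(eps*delta))) uniform samples.
    The epsilon-net theorem guarantees the sample is an eps-net with
    probability at least 1 - delta, for a suitable absolute constant c."""
    c = 8  # illustrative constant, not the tight one from the theorem
    m = math.ceil(c * (d / eps) * math.log(d / (eps * delta)))
    pts = list(points)
    return {rng.choice(pts) for _ in range(m)}

# Example: 1000 points, eps = 0.1, VC-dimension 2, failure probability 0.01.
net = random_epsilon_net(range(1000), eps=0.1, d=2, delta=0.01)
```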
For the deterministic algorithm, I believe the $d^{O(d)}$ term arises as follows. You first partition the full set into subsets of size about $s = O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon}\right),ドル where $s$ is the size of the $\varepsilon$-net you hope to obtain in the end. Then you merge two such subsets and reduce the result back to half its size, repeating until a single set remains. The reduction step essentially considers all distinct ranges restricted to the merged set of size 2ドルs$; by the Sauer-Shelah bound there are $O((2s)^d)$ of them, and you need to "check" each one to see whether it has been hit. Since $s$ itself contains a factor of $d,ドル the term $s^d$ already hides a $d^{O(d)}$. So to improve this bound within the deterministic merge-reduce framework, you would somehow need to implicitly "check" all ranges without explicitly enumerating them. This seems difficult to do deterministically, but I am not sure I know an explicit lower bound; most work in this arena assumes $d$ is constant and abuses that heavily.
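To make the structure of this framework concrete, here is a skeleton in Python. It is only a sketch of the merge-reduce pattern, not Chazelle's actual algorithm: `restricted_ranges` and `halve` are hypothetical placeholders for the two steps described above (enumerating the distinct restrictions, and deterministically discarding half the points while still hitting every heavy range).

```python
def merge_reduce_net(points, s, restricted_ranges, halve):
    """Skeleton of the deterministic merge-reduce framework sketched above.

    points            : list of points of X
    s                 : target net size, s = O((d/eps) * log(d/eps))
    restricted_ranges : S -> the distinct restrictions of the ranges to S;
                        by the Sauer-Shelah lemma there are O(|S|^d) of them
    halve             : (S, ranges) -> a subset of S of about half the size
                        that still hits every heavy range; enumerating the
                        O((2s)^d) restrictions here is the d^{O(d)} bottleneck
    """
    # Partition the input into blocks of size about s.
    parts = [points[i:i + s] for i in range(0, len(points), s)]
    # Repeatedly merge blocks pairwise and reduce each merged block
    # (size about 2s) back to size about s, until one block remains.
    while len(parts) > 1:
        merged = []
        for i in range(0, len(parts) - 1, 2):
            union = parts[i] + parts[i + 1]
            merged.append(halve(union, restricted_ranges(union)))
        if len(parts) % 2 == 1:
            merged.append(parts[-1])  # carry an unpaired block forward
        parts = merged
    return parts[0]
```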