Hidden constant in eps-sample size computation

Question 1

Given a range space $(X,R)$ with VC-Dimension $\le d,ドル we can create an $\varepsilon$-sample with probability at least 1ドル-\delta$ by sampling $ O\left(\frac{1}{\varepsilon^2}\left(d+\log\frac{1}{\delta}\right)\right)$ points with replacement from $X$.

Löffler and Phillips [1] showed experimentally that the constant hidden in the Big-Oh notation is at most 0.5. I was wondering whether there is any known theoretical, rather than experimental, upper bound to this constant.

Thanks in advance for any reference

[1] Löffler, M and Phillips, J.M. "Shape fitting on point sets with probability distributions". ESA'09

Question 2

I discovered after this publication that there is older work by Dvoretzky-Kiefer-Wolfowitz in 1956 that looks specifically at this one-dimensional case for sampling. There is also follow up by just Kiefer and Wolfowitz a year or two later that handles the (anchored) rectangle case.

For these specific settings, there has been some much more refined work looking at the actual constants. Here is a reference that might help:

P. Massart, "The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality", 1990.

I don't know off-hand any other work on the constants other than the 1-d or "anchored" rectangle cases

Jeff

Question 3

Thank you for your answer Jeff. My understanding was that c should be a global constant, that is the same independently of the points and ranges. This I got from reading Theorem 14.4.4 from Alon and Spencer:"There is a constant c...". Now my reading of your answer suggests me that the 0.5 you find in your work was only the constant for the range space you defined in your work. Am I understanding this correctly?

Question 4

Also I'm not sure I understand what you mean by "one-dimensional case for sampling".

Question 5

Yes, there should be a universal constant c that holds independent of any range (as long as its VC-dimension is bounded) and any point set. My experiments attempted to estimate this. However, certain point sets will allow for a smaller constants (for instance points that lie approximately on a line, any convex range will look like intervals, and act like the VC-dimension is only 2).

Question 6

By "one-dimensional case for sampling" I meant that this work referred to random sampling bounds. There are other deterministic approaches. In one dimension, sorting and taking 1ドル/\epsilon$ evenly spaced points will yield an $\epsilon$-sample deterministically. Here the constant is 1, and the dependence on $\epsilon$ is much better. More complicated results exist for other range spaces as well, but the constants are much less understood, and likely much worse.

Jeff Phillips 861 bronze badge · Accepted Answer · 2012-11-01 00:26:01Z

7

$\begingroup$

I discovered after this publication that there is older work by Dvoretzky-Kiefer-Wolfowitz in 1956 that looks specifically at this one-dimensional case for sampling. There is also follow up by just Kiefer and Wolfowitz a year or two later that handles the (anchored) rectangle case.

For these specific settings, there has been some much more refined work looking at the actual constants. Here is a reference that might help:

P. Massart, "The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality", 1990.

I don't know off-hand any other work on the constants other than the 1-d or "anchored" rectangle cases

Jeff

Share

Improve this answer

edited Nov 1, 2012 at 20:40

Kaveh's user avatar

Kaveh

22.2k8 gold badges87 silver badges187 bronze badges

answered Nov 1, 2012 at 0:26

Jeff Phillips's user avatar

Jeff Phillips

861 bronze badge

$\endgroup$

4

1

$\begingroup$ Thank you for your answer Jeff. My understanding was that c should be a global constant, that is the same independently of the points and ranges. This I got from reading Theorem 14.4.4 from Alon and Spencer:"There is a constant c...". Now my reading of your answer suggests me that the 0.5 you find in your work was only the constant for the range space you defined in your work. Am I understanding this correctly? $\endgroup$

Matteo
– Matteo

2012年11月01日 02:27:52 +00:00
Commented Nov 1, 2012 at 2:27
$\begingroup$ Also I'm not sure I understand what you mean by "one-dimensional case for sampling". $\endgroup$

Matteo
– Matteo

2012年11月01日 03:59:44 +00:00
Commented Nov 1, 2012 at 3:59
2

$\begingroup$ Yes, there should be a universal constant c that holds independent of any range (as long as its VC-dimension is bounded) and any point set. My experiments attempted to estimate this. However, certain point sets will allow for a smaller constants (for instance points that lie approximately on a line, any convex range will look like intervals, and act like the VC-dimension is only 2). $\endgroup$

Jeff Phillips
– Jeff Phillips

2012年11月09日 07:32:22 +00:00
Commented Nov 9, 2012 at 7:32
2

$\begingroup$ By "one-dimensional case for sampling" I meant that this work referred to random sampling bounds. There are other deterministic approaches. In one dimension, sorting and taking 1ドル/\epsilon$ evenly spaced points will yield an $\epsilon$-sample deterministically. Here the constant is 1, and the dependence on $\epsilon$ is much better. More complicated results exist for other range spaces as well, but the constants are much less understood, and likely much worse. $\endgroup$

Jeff Phillips
– Jeff Phillips

2012年11月09日 07:35:03 +00:00
Commented Nov 9, 2012 at 7:35

Add a comment |

Stack Exchange Network

Hidden constant in eps-sample size computation

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Hidden constant in eps-sample size computation

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions