Uniformly most powerful test
In statistical hypothesis testing, a uniformly most powerful (UMP) test is a hypothesis test which has the greatest power {\displaystyle 1-\beta } among all possible tests of a given size α. For example, according to the Neyman–Pearson lemma, the likelihood-ratio test is UMP for testing simple (point) hypotheses.
Setting
Let {\displaystyle X} denote a random vector (corresponding to the measurements), taken from a parametrized family of probability density functions or probability mass functions {\displaystyle f_{\theta }(x)}, which depends on the unknown deterministic parameter {\displaystyle \theta \in \Theta }. The parameter space {\displaystyle \Theta } is partitioned into two disjoint sets {\displaystyle \Theta _{0}} and {\displaystyle \Theta _{1}}. Let {\displaystyle H_{0}} denote the hypothesis that {\displaystyle \theta \in \Theta _{0}}, and let {\displaystyle H_{1}} denote the hypothesis that {\displaystyle \theta \in \Theta _{1}}. The binary test of hypotheses is performed using a test function {\displaystyle \varphi (x)} with a rejection region {\displaystyle R} (a subset of measurement space).
- {\displaystyle \varphi (x)={\begin{cases}1&{\text{if }}x\in R\\0&{\text{if }}x\in R^{c}\end{cases}}}
meaning that {\displaystyle H_{1}} is in force if the measurement {\displaystyle X\in R} and that {\displaystyle H_{0}} is in force if the measurement {\displaystyle X\in R^{c}}. Note that {\displaystyle R\cup R^{c}} is a disjoint covering of the measurement space.
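As a concrete illustration, the following minimal Python sketch implements such a test function for an assumed one-sided rejection region {\displaystyle R=(x_{0},\infty )}; the cutoff value is purely illustrative and not part of the general setting.

```python
# A minimal sketch of a (non-randomized) test function phi as the indicator
# of a hypothetical one-sided rejection region R = (x0, infinity).
def phi(x, x0=1.645):
    """Return 1 (decide H1) if x lies in R, else 0 (decide H0)."""
    return 1 if x > x0 else 0

print(phi(2.3))  # 2.3 in R   -> 1, H1 is in force
print(phi(0.4))  # 0.4 in R^c -> 0, H0 is in force
```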
Formal definition
A test function {\displaystyle \varphi (x)} is UMP of size {\displaystyle \alpha } if for any other test function {\displaystyle \varphi '(x)} satisfying
- {\displaystyle \sup _{\theta \in \Theta _{0}}\;\operatorname {E} [\varphi '(X)|\theta ]=\alpha '\leq \alpha =\sup _{\theta \in \Theta _{0}}\;\operatorname {E} [\varphi (X)|\theta ],}
we have
- {\displaystyle \forall \theta \in \Theta _{1},\quad \operatorname {E} [\varphi '(X)|\theta ]=1-\beta '(\theta )\leq 1-\beta (\theta )=\operatorname {E} [\varphi (X)|\theta ].}
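To make the definition concrete, the following sketch compares two size-{\displaystyle \alpha } tests in the toy model {\displaystyle X\sim N(\theta ,1)}, testing {\displaystyle H_{0}:\theta =0} vs. {\displaystyle H_{1}:\theta >0}. The model, the competing two-sided test, and {\displaystyle \alpha =0.05} are assumptions made for this sketch; the definition itself is model-free.

```python
# Compare the UMP one-sided test against another test of the same size in
# the illustrative model X ~ N(theta, 1), H0: theta = 0 vs H1: theta > 0.
from scipy.stats import norm

alpha = 0.05

def power_one_sided(theta):
    """E[phi(X)|theta] for phi = 1{X > z_{1-alpha}} (UMP here by Neyman-Pearson)."""
    return 1.0 - norm.cdf(norm.ppf(1 - alpha) - theta)

def power_two_sided(theta):
    """E[phi'(X)|theta] for phi' = 1{|X| > z_{1-alpha/2}}; also size alpha at theta = 0."""
    c = norm.ppf(1 - alpha / 2)
    return norm.cdf(-c - theta) + 1.0 - norm.cdf(c - theta)

# Both tests have size alpha, and at every point of Theta_1 the UMP test
# is at least as powerful, exactly as the definition requires.
for theta in (0.5, 1.0, 2.0):
    assert power_one_sided(theta) >= power_two_sided(theta)
    print(theta, round(power_one_sided(theta), 3), round(power_two_sided(theta), 3))
```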
The Karlin–Rubin theorem
The Karlin–Rubin theorem (named for Samuel Karlin and Herman Rubin) can be regarded as an extension of the Neyman–Pearson lemma to composite hypotheses.[1] Consider a scalar measurement having a probability density function parameterized by a scalar parameter θ, and define the likelihood ratio {\displaystyle l(x)=f_{\theta _{1}}(x)/f_{\theta _{0}}(x)}. If {\displaystyle l(x)} is monotone non-decreasing in {\displaystyle x} for every pair {\displaystyle \theta _{1}\geq \theta _{0}} (meaning that the greater {\displaystyle x} is, the more likely {\displaystyle H_{1}} is), then the threshold test:
- {\displaystyle \varphi (x)={\begin{cases}1&{\text{if }}x>x_{0}\\0&{\text{if }}x<x_{0}\end{cases}}}
- where {\displaystyle x_{0}} is chosen such that {\displaystyle \operatorname {E} _{\theta _{0}}\varphi (X)=\alpha }
is the UMP test of size α for testing {\displaystyle H_{0}:\theta \leq \theta _{0}{\text{ vs. }}H_{1}:\theta >\theta _{0}.}
Note that exactly the same test is also UMP for testing {\displaystyle H_{0}:\theta =\theta _{0}{\text{ vs. }}H_{1}:\theta >\theta _{0}.}
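A minimal sketch of this threshold test, assuming the illustrative model {\displaystyle X\sim N(\theta ,\sigma ^{2})} (whose likelihood ratio is monotone in {\displaystyle x}) and example values for {\displaystyle \theta _{0}}, {\displaystyle \sigma }, and {\displaystyle \alpha }:

```python
# Karlin-Rubin threshold test for the assumed model X ~ N(theta, sigma^2).
import numpy as np
from scipy.stats import norm

theta0, sigma, alpha = 0.0, 1.0, 0.05

# Choose x0 so that E_{theta0}[phi(X)] = P_{theta0}(X > x0) = alpha.
x0 = norm.ppf(1 - alpha, loc=theta0, scale=sigma)

def phi(x):
    """UMP threshold test for H0: theta <= theta0 vs H1: theta > theta0."""
    return 1 if x > x0 else 0

# Sanity check: the rejection probability under theta0 matches alpha.
rng = np.random.default_rng(0)
draws = rng.normal(theta0, sigma, size=200_000)
print(x0, (draws > x0).mean())  # ~1.645, ~0.05
```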
Important case: exponential family
Although the Karlin–Rubin theorem may seem weak because of its restriction to a scalar parameter and a scalar measurement, it turns out that there exists a host of problems for which the theorem holds. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with
- {\displaystyle f_{\theta }(x)=g(\theta )h(x)\exp(\eta (\theta )T(x))}
has a monotone non-decreasing likelihood ratio in the sufficient statistic {\displaystyle T(x)}, provided that {\displaystyle \eta (\theta )} is non-decreasing.
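To see why, write the likelihood ratio for any pair {\displaystyle \theta _{1}\geq \theta _{0}}; the factor {\displaystyle h(x)} cancels:

- {\displaystyle l(x)={\frac {f_{\theta _{1}}(x)}{f_{\theta _{0}}(x)}}={\frac {g(\theta _{1})}{g(\theta _{0})}}\exp \left([\eta (\theta _{1})-\eta (\theta _{0})]T(x)\right),}

which is non-decreasing in {\displaystyle T(x)} because {\displaystyle \eta (\theta _{1})-\eta (\theta _{0})\geq 0} whenever {\displaystyle \eta } is non-decreasing.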
Example
Let {\displaystyle X=(X_{0},\ldots ,X_{M-1})} denote i.i.d. normally distributed {\displaystyle N}-dimensional random vectors with mean {\displaystyle \theta m} and covariance matrix {\displaystyle R}. We then have
- {\displaystyle {\begin{aligned}f_{\theta }(X)={}&(2\pi )^{-MN/2}|R|^{-M/2}\exp \left\{-{\frac {1}{2}}\sum _{n=0}^{M-1}(X_{n}-\theta m)^{T}R^{-1}(X_{n}-\theta m)\right\}\\[4pt]={}&(2\pi )^{-MN/2}|R|^{-M/2}\exp \left\{-{\frac {1}{2}}\sum _{n=0}^{M-1}\left(\theta ^{2}m^{T}R^{-1}m\right)\right\}\\[4pt]&\exp \left\{-{\frac {1}{2}}\sum _{n=0}^{M-1}X_{n}^{T}R^{-1}X_{n}\right\}\exp \left\{\theta m^{T}R^{-1}\sum _{n=0}^{M-1}X_{n}\right\}\end{aligned}}}
which is exactly in the form of the exponential family shown in the previous section, with the sufficient statistic being
- {\displaystyle T(X)=m^{T}R^{-1}\sum _{n=0}^{M-1}X_{n}.}
Thus, we conclude that the test
- {\displaystyle \varphi (T)={\begin{cases}1&T>t_{0}\\0&T<t_{0}\end{cases}}\qquad \operatorname {E} _{\theta _{0}}\varphi (T)=\alpha }
is the UMP test of size {\displaystyle \alpha } for testing {\displaystyle H_{0}:\theta \leq \theta _{0}} vs. {\displaystyle H_{1}:\theta >\theta _{0}}.
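A sketch of this example in Python follows; the particular values of {\displaystyle N}, {\displaystyle M}, {\displaystyle m}, {\displaystyle R}, {\displaystyle \theta _{0}}, and {\displaystyle \alpha } are illustrative assumptions, not taken from the text. Since {\displaystyle T(X)} is a linear function of Gaussian vectors, under {\displaystyle \theta =\theta _{0}} it is Gaussian with mean {\displaystyle M\theta _{0}m^{T}R^{-1}m} and variance {\displaystyle Mm^{T}R^{-1}m}, which fixes the threshold {\displaystyle t_{0}}.

```python
# UMP test for the i.i.d. Gaussian-vector example, with assumed values.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
N, M, theta0, alpha = 3, 50, 0.0, 0.05
m = np.array([1.0, 0.5, -0.5])
R = np.diag([1.0, 2.0, 0.5])          # covariance matrix (diagonal here)
Rinv_m = np.linalg.solve(R, m)
q = m @ Rinv_m                        # q = m^T R^{-1} m

# Under theta0, T ~ N(M*theta0*q, M*q), so t0 is a Gaussian quantile.
t0 = M * theta0 * q + np.sqrt(M * q) * norm.ppf(1 - alpha)

theta_true = 0.2                      # draw the data from H1 for the demo
X = rng.multivariate_normal(theta_true * m, R, size=M)
T = Rinv_m @ X.sum(axis=0)            # sufficient statistic T(X)
print(f"T = {T:.2f}, t0 = {t0:.2f}, reject H0: {T > t0}")
```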
Further discussion
In general, UMP tests do not exist for vector parameters or for two-sided tests (a test in which the alternative hypothesis lies on both sides of the null). The reason is that in these situations, the most powerful test of a given size for one possible value of the parameter (e.g. for {\displaystyle \theta _{1}} where {\displaystyle \theta _{1}>\theta _{0}}) is different from the most powerful test of the same size for a different value of the parameter (e.g. for {\displaystyle \theta _{2}} where {\displaystyle \theta _{2}<\theta _{0}}). As a result, no test is uniformly most powerful in these situations, as the numerical sketch below illustrates.
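The following sketch makes this concrete for the two-sided problem {\displaystyle H_{0}:\theta =0} vs. {\displaystyle H_{1}:\theta \neq 0} in the assumed toy model {\displaystyle X\sim N(\theta ,1)}: the size-{\displaystyle \alpha } test that is best against {\displaystyle \theta >0} is poor against {\displaystyle \theta <0}, and vice versa.

```python
# Why no UMP test exists for H0: theta = 0 vs H1: theta != 0 (X ~ N(theta, 1)).
from scipy.stats import norm

alpha = 0.05
c = norm.ppf(1 - alpha)

def power_right(theta):   # most powerful against theta > 0: reject when X > c
    return 1.0 - norm.cdf(c - theta)

def power_left(theta):    # most powerful against theta < 0: reject when X < -c
    return norm.cdf(-c - theta)

for theta in (-1.0, 1.0):
    print(theta, round(power_right(theta), 3), round(power_left(theta), 3))
# At theta = 1 the right-tailed test wins; at theta = -1 the left-tailed
# test wins, so neither is uniformly most powerful over all of H1.
```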
References
- ^ Casella, G.; Berger, R. L. (2008). Statistical Inference. Brooks/Cole. ISBN 0-495-39187-5. (Theorem 8.3.17)
Further reading
- Ferguson, T. S. (1967). "Sec. 5.2: Uniformly most powerful tests". Mathematical Statistics: A decision theoretic approach. New York: Academic Press.
- Mood, A. M.; Graybill, F. A.; Boes, D. C. (1974). "Sec. IX.3.2: Uniformly most powerful tests". Introduction to the theory of statistics (3rd ed.). New York: McGraw-Hill.
- Scharf, L. L. (1991). "Sec. 4.7". Statistical Signal Processing. Addison-Wesley.