Fisher–Tippett–Gnedenko theorem
In statistics, the Fisher–Tippett–Gnedenko theorem (also the Fisher–Tippett theorem or the extreme value theorem) is a general result in extreme value theory regarding the asymptotic distribution of extreme order statistics. The maximum of a sample of iid random variables, after proper renormalization, can converge in distribution only to one of three possible distribution families: the Gumbel distribution, the Fréchet distribution, or the Weibull distribution. Credit for the extreme value theorem and its convergence details is given to Fréchet (1927),[1] Fisher and Tippett (1928),[2] von Mises (1936),[3][4] and Gnedenko (1943).[5]
The role of the extremal types theorem for maxima is similar to that of the central limit theorem for averages, except that the central limit theorem applies to the average of a sample from any distribution with finite variance, while the Fisher–Tippett–Gnedenko theorem only states that if the distribution of a normalized maximum converges, then the limit has to belong to one of a particular class of distributions. It does not state that the distribution of the normalized maximum does converge.
Statement
Let {\displaystyle X_{1},X_{2},\ldots ,X_{n}} be a sample of n independent and identically-distributed random variables, each with cumulative distribution function {\displaystyle F}. Suppose that there exist two sequences of real numbers {\displaystyle a_{n}>0} and {\displaystyle b_{n}\in \mathbb {R} } such that the following limit converges to a non-degenerate distribution function:
- {\displaystyle \lim _{n\to \infty }\mathbb {P} \left({\frac {\max\{X_{1},\dots ,X_{n}\}-b_{n}}{a_{n}}}\leq x\right)=G(x),}
or equivalently:
- {\displaystyle \lim _{n\to \infty }{\bigl (}F(a_{n}x+b_{n}){\bigr )}^{n}=G(x).}
In such circumstances, the limiting function {\displaystyle G} is the cumulative distribution function of a distribution belonging to either the Gumbel, the Fréchet, or the Weibull distribution family.[6]
In other words, if the limit above converges, then up to a linear change of coordinates {\displaystyle G(x)} will assume either the form:[7]
- {\displaystyle G_{\gamma }(x)=\exp {\big (}\!-(1+\gamma x)^{-1/\gamma }{\big )}\quad {\text{for }}\gamma \neq 0,}
with the non-zero parameter {\displaystyle \gamma } satisfying {\displaystyle 1+\gamma x>0} for every {\displaystyle x} in the support of {\displaystyle G_{\gamma }} (the condition {\displaystyle 1+\gamma x>0} delimits that support). Otherwise it has the form:
- {\displaystyle G_{0}(x)=\exp {\bigl (}\!-\exp(-x){\bigr )}\quad {\text{for }}\gamma =0.}
This is the cumulative distribution function of the generalized extreme value distribution (GEV) with extreme value index {\displaystyle \gamma }. The GEV distribution groups the Gumbel, Fréchet, and Weibull distributions into a single composite form.
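The composite form can be made concrete with a short sketch. The following Python snippet (our own illustration, not from the cited sources) evaluates the GEV cumulative distribution function {\displaystyle G_{\gamma }} and checks numerically that the {\displaystyle \gamma \neq 0} formula approaches the Gumbel formula as {\displaystyle \gamma \to 0}:

```python
import math

def gev_cdf(x, gamma):
    """CDF of the generalized extreme value distribution G_gamma."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))          # Gumbel case G_0
    t = 1.0 + gamma * x
    if t <= 0.0:
        # Outside the support: the CDF is 0 (gamma > 0) or 1 (gamma < 0).
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

# As gamma -> 0, G_gamma(x) approaches the Gumbel CDF G_0(x).
for g in (0.1, 0.01, 0.001):
    print(g, gev_cdf(1.0, g), gev_cdf(1.0, 0.0))
```

The sign convention for {\displaystyle \gamma } here follows the article: positive {\displaystyle \gamma } gives the Fréchet family, negative the Weibull family.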
Conditions of convergence
The Fisher–Tippett–Gnedenko theorem is a statement about the convergence of the limiting distribution {\displaystyle G(x)} above. The study of conditions for convergence of {\displaystyle G} to particular cases of the generalized extreme value distribution began with von Mises (1936)[3][5][4] and was further developed by Gnedenko (1943).[5]
Let {\displaystyle F} be the distribution function of {\displaystyle X}, and {\displaystyle X_{1},\dots ,X_{n}} an i.i.d. sample thereof. Also let {\displaystyle x_{\mathsf {max}}} be the upper endpoint of the support of {\displaystyle F} (the population maximum): {\displaystyle x_{\mathsf {max}}\equiv \sup\{x\mid F(x)<1\}}.
Then the limiting distribution of the normalized sample maximum, given by {\displaystyle G} above, will be one of the following three types:[7]
- Fréchet distribution ({\displaystyle \gamma >0}): For strictly positive {\displaystyle \gamma >0}, the limiting distribution converges if and only if {\displaystyle x_{\mathsf {max}}=\infty } and
- {\displaystyle \lim _{t\rightarrow \infty }{\frac {1-F(ut)}{1-F(t)}}=u^{-1/\gamma }\ } for all {\displaystyle u>0}.
- In this case, possible sequences that will satisfy the theorem conditions are {\displaystyle b_{n}=0} and {\displaystyle a_{n}=F^{-1}\!\left(1-{\tfrac {1}{n}}\right)}. Strictly positive {\displaystyle \gamma } corresponds to what is called a heavy-tailed distribution.
- Gumbel distribution ({\displaystyle \gamma =0}): For {\displaystyle \gamma =0}, and with {\displaystyle x_{\mathsf {max}}} either finite or infinite, the limiting distribution converges if and only if
- {\displaystyle \lim _{t\rightarrow x_{\mathsf {max}}}{\frac {1-F(t+u\,{\tilde {g}}(t))}{1-F(t)}}=\mathrm {e} ^{-u}} for all {\displaystyle u>0} with {\displaystyle {\tilde {g}}(t)\equiv {\frac {\int _{t}^{x_{\mathsf {max}}}{\bigl (}1-F(s){\bigr )}\,\mathrm {d} s}{1-F(t)}}}.
- Possible sequences here are {\displaystyle b_{n}=F^{-1}\!\left(1-{\tfrac {1}{n}}\right)} and {\displaystyle a_{n}={\tilde {g}}{\bigl (}F^{-1}\!\left(1-{\tfrac {1}{n}}\right){\bigr )}}.
- Weibull distribution ({\displaystyle \gamma <0}): For strictly negative {\displaystyle \gamma <0}, the limiting distribution converges if and only if {\displaystyle x_{\mathsf {max}}<\infty } (is finite) and
- {\displaystyle \lim _{t\rightarrow 0^{+}}{\frac {1-F(x_{\mathsf {max}}-ut)}{1-F(x_{\mathsf {max}}-t)}}=u^{-1/\gamma }} for all {\displaystyle u>0}.
- Note that for this case the exponential term {\displaystyle -1/\gamma } is strictly positive, since {\displaystyle \gamma } is strictly negative.
- Possible sequences here are {\displaystyle b_{n}=x_{\mathsf {max}}} and {\displaystyle a_{n}=x_{\mathsf {max}}-F^{-1}\!\left(1-{\tfrac {1}{n}}\right)}.
Note that the Gumbel form {\displaystyle G_{0}} is the limit of the general form {\displaystyle G_{\gamma }} as {\displaystyle \gamma } goes to zero.
Examples
Fréchet distribution
The Cauchy distribution with scale parameter {\displaystyle \pi } has density function:
- {\displaystyle f(x)={\frac {1}{\ \pi ^{2}+x^{2}\ }}\ ,}
and its cumulative distribution function is:
- {\displaystyle F(x)={\frac {\ 1\ }{2}}+{\frac {1}{\ \pi \ }}\arctan \left({\frac {x}{\ \pi \ }}\right)~.}
A little calculus shows that the right tail's cumulative distribution {\displaystyle 1-F(x)} is asymptotic to {\displaystyle {\frac {1}{x}}}, or
- {\displaystyle \ln F(x)\rightarrow {\frac {-1~}{\ x\ }}\quad {\mathsf {~as~}}\quad x\rightarrow \infty \ ,}
so we have
- {\displaystyle \ln \left(F(x)^{n}\right)=n\ln F(x)\sim {\frac {-n~}{\ x\ }}~.}
Thus we have
- {\displaystyle F(x)^{n}\approx \exp \left({\frac {-n~}{\ x\ }}\right)}
and letting {\displaystyle \ u\equiv {\frac {x}{\ n\ }}-1\ } (so that {\displaystyle x=n\,u+n} and {\displaystyle -{\tfrac {n}{x}}=-{\tfrac {1}{1+u}}}), we get
- {\displaystyle \lim _{n\to \infty }{\Bigl (}\ F(n\ u+n)^{n}\ {\Bigr )}=\exp \left({\tfrac {-1~}{\ 1+u\ }}\right)=G_{1}(u)\ }
for any {\displaystyle \ u>-1~.}
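This limit is easy to confirm numerically. The Python sketch below (ours, assuming the scaled Cauchy CDF above) compares {\displaystyle F(nu+n)^{n}} with {\displaystyle G_{1}(u)} for growing {\displaystyle n}:

```python
import math

def F(x):
    # CDF of the Cauchy distribution with scale pi, as in the example above.
    return 0.5 + math.atan(x / math.pi) / math.pi

u = 1.0
for n in (10, 1_000, 100_000):
    print(n, F(n * u + n) ** n)          # approaches the limit below
print("limit G_1(u):", math.exp(-1.0 / (1.0 + u)))
```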
Gumbel distribution
Let us take the standard normal distribution, with cumulative distribution function
- {\displaystyle F(x)={\frac {1}{2}}\operatorname {erfc} \left({\frac {-x~}{\ {\sqrt {2\ }}\ }}\right)~.}
We have
- {\displaystyle \ln F(x)\rightarrow -{\frac {\ \exp \left(-{\tfrac {1}{2}}x^{2}\right)\ }{{\sqrt {2\pi \ }}\ x}}\quad {\mathsf {~as~}}\quad x\rightarrow \infty }
and thus
- {\displaystyle \ln \left(\ F(x)^{n}\ \right)=n\ln F(x)\rightarrow -{\frac {\ n\exp \left(-{\tfrac {1}{2}}x^{2}\right)\ }{{\sqrt {2\pi \ }}\ x}}\quad {\mathsf {~as~}}\quad x\rightarrow \infty ~.}
Hence we have
- {\displaystyle F(x)^{n}\approx \exp \left(-\ {\frac {\ n\ \exp \left(-{\tfrac {1}{2}}x^{2}\right)\ }{\ {\sqrt {2\pi \ }}\ x\ }}\right)~.}
If we define {\displaystyle \ c_{n}\ } as the value that exactly satisfies
- {\displaystyle {\frac {\ n\exp \left(-\ {\tfrac {1}{2}}c_{n}^{2}\right)\ }{\ {\sqrt {2\pi \ }}\ c_{n}\ }}=1\ ,}
then around {\displaystyle \ x=c_{n}\ }
- {\displaystyle {\frac {\ n\ \exp \left(-\ {\tfrac {1}{2}}x^{2}\right)\ }{{\sqrt {2\pi \ }}\ x}}\approx \exp \left(\ c_{n}\ (c_{n}-x)\ \right)~.}
As {\displaystyle \ n\ } increases, this becomes a good approximation for a wider and wider range of {\displaystyle \ c_{n}\ (c_{n}-x)\ } so letting {\displaystyle \ u\equiv c_{n}\ (x-c_{n})\ } we find that
- {\displaystyle \lim _{n\to \infty }{\biggl (}\ F\left({\tfrac {u}{~c_{n}\ }}+c_{n}\right)^{n}\ {\biggr )}=\exp \!{\Bigl (}-\exp(-u){\Bigr )}=G_{0}(u)~.}
Equivalently,
- {\displaystyle \lim _{n\to \infty }\mathbb {P} \ {\Biggl (}{\frac {\ \max\{X_{1},\ \ldots ,\ X_{n}\}-c_{n}\ }{\left({\frac {1}{~c_{n}\ }}\right)}}\leq u{\Biggr )}=\exp \!{\Bigl (}-\exp(-u){\Bigr )}=G_{0}(u)~.}
With this result, we see retrospectively that we need {\displaystyle \ \ln c_{n}\approx {\frac {\ \ln \ln n\ }{2}}\ } and then
- {\displaystyle c_{n}\approx {\sqrt {2\ln n\ }}\ ,}
so the maximum is expected to climb toward infinity ever more slowly.
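These steps can be verified numerically. The Python sketch below (our own illustration) solves the defining equation for {\displaystyle c_{n}} by fixed-point iteration and compares {\displaystyle F\left({\tfrac {u}{c_{n}}}+c_{n}\right)^{n}} with {\displaystyle G_{0}(u)}; convergence is notoriously slow for the normal distribution:

```python
import math

def F(x):
    # Standard normal CDF via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def c_of(n):
    # Solve n * exp(-c**2 / 2) / (sqrt(2*pi) * c) = 1 by fixed-point iteration.
    c = math.sqrt(2.0 * math.log(n))
    for _ in range(50):
        c = math.sqrt(2.0 * math.log(n / (math.sqrt(2.0 * math.pi) * c)))
    return c

u = 1.0
for n in (100, 10_000, 1_000_000):
    cn = c_of(n)
    print(n, F(u / cn + cn) ** n)        # converges slowly to the limit below
print("limit G_0(u):", math.exp(-math.exp(-u)))   # ~ 0.6922 for u = 1
```

Even at {\displaystyle n=10^{6}} the value is still a few percent away from {\displaystyle G_{0}(u)}, illustrating the logarithmic rate of convergence for normal maxima.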
Weibull distribution
We may take the simplest example: the uniform distribution on {\displaystyle [0,1]}, with cumulative distribution function
- {\displaystyle F(x)=x\ } for {\displaystyle 0\leq x\leq 1~.}
As {\displaystyle \ x\rightarrow 1\ } we have
- {\displaystyle \ln {\Bigl (}\ F(x)^{n}\ {\Bigr )}=n\ \ln F(x)\ \rightarrow \ -n\,(\ 1-x\ )~.}
So for {\displaystyle \ x\approx 1\ } we have
- {\displaystyle \ F(x)^{n}\approx \exp {\bigl (}-n\,(\ 1-x\ ){\bigr )}~.}
Let {\displaystyle \ u\equiv 1-n\,(\ 1-x\ )\ }, so that {\displaystyle x={\tfrac {u}{n}}+1-{\tfrac {1}{n}}}, and get
- {\displaystyle \lim _{n\to \infty }{\Bigl (}\ F\!\left({\tfrac {\ u\ }{n}}+1-{\tfrac {\ 1\ }{n}}\right)\ {\Bigr )}^{n}=\exp \!{\bigl (}\ -(1-u)\ {\bigr )}=G_{-1}(u)}
for any {\displaystyle \ u<1~.}
Close examination of that limit shows that the expected maximum approaches 1 in inverse proportion to n .
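The convergence here can again be confirmed numerically. This Python sketch (ours) evaluates {\displaystyle F\!\left({\tfrac {u}{n}}+1-{\tfrac {1}{n}}\right)^{n}} for the uniform CDF and compares it with {\displaystyle G_{-1}(u)=\exp(-(1-u))}:

```python
import math

def F(x):
    # CDF of the uniform distribution on [0, 1].
    return min(max(x, 0.0), 1.0)

u = 0.5
for n in (10, 100, 10_000):
    print(n, F(u / n + 1.0 - 1.0 / n) ** n)   # approaches the limit below
print("limit G_{-1}(u):", math.exp(-(1.0 - u)))
```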
References
- [1] Fréchet, M. (1927). "Sur la loi de probabilité de l'écart maximum". Annales de la Société Polonaise de Mathématique. 6 (1): 93–116.
- [2] Fisher, R. A.; Tippett, L. H. C. (1928). "Limiting forms of the frequency distribution of the largest and smallest member of a sample". Mathematical Proceedings of the Cambridge Philosophical Society. 24 (2): 180–190. doi:10.1017/s0305004100015681.
- [3] von Mises, R. (1936). "La distribution de la plus grande de n valeurs" [The distribution of the largest of n values]. Rev. Math. Union Interbalcanique (in French). 1: 141–160.
- [4] Falk, Michael; Marohn, Frank (1993). "Von Mises conditions revisited". The Annals of Probability: 1310–1328.
- [5] Gnedenko, B. V. (1943). "Sur la distribution limite du terme maximum d'une série aléatoire". Annals of Mathematics. 44 (3): 423–453. doi:10.2307/1968974. JSTOR 1968974.
- [6] Mood, A. M. (1950). "5. Order Statistics". Introduction to the Theory of Statistics. New York, NY: McGraw-Hill. pp. 251–270.
- [7] de Haan, Laurens; Ferreira, Ana (2007). Extreme Value Theory: An Introduction. Springer.
Further reading
- Lee, Seyoon; Kim, Joseph H. T. (2018). "Exponentiated generalized Pareto distribution". Communications in Statistics – Theory and Methods. 48 (8): 2014–2038. arXiv:1708.01686. doi:10.1080/03610926.2018.1441418.