Fisher–Tippett–Gnedenko theorem

From Wikipedia, the free encyclopedia
Theorem in statistics
This article is about the extreme value theorem in statistics. For the result in calculus, see extreme value theorem.

In statistics, the Fisher–Tippett–Gnedenko theorem (also the Fisher–Tippett theorem or the extreme value theorem) is a general result in extreme value theory regarding the asymptotic distribution of extreme order statistics. The maximum of a sample of i.i.d. random variables, after proper renormalization, can only converge in distribution to one of three possible distribution families: the Gumbel distribution, the Fréchet distribution, or the Weibull distribution. Credit for the extreme value theorem and its convergence details is given to Fréchet (1927),[1] Fisher and Tippett (1928),[2] von Mises (1936),[3][4] and Gnedenko (1943).[5]

The role of the extremal types theorem for maxima is similar to that of the central limit theorem for averages, except that the central limit theorem applies to the average of a sample from any distribution with finite variance, while the Fisher–Tippett–Gnedenko theorem only states that if the distribution of a normalized maximum converges, then the limit has to be one of a particular class of distributions. It does not state that the distribution of the normalized maximum does converge.

Statement

Let $X_1, X_2, \ldots, X_n$ be a sample of $n$ independent and identically distributed random variables, each with cumulative distribution function $F$. Suppose that there exist two sequences of real numbers $a_n > 0$ and $b_n \in \mathbb{R}$ such that the following limit converges to a non-degenerate distribution function:

$$\lim_{n\to\infty}\mathbb{P}\left(\frac{\max\{X_1,\dots,X_n\}-b_n}{a_n}\leq x\right)=G(x),$$

or equivalently:

$$\lim_{n\to\infty}\bigl(F(a_n x+b_n)\bigr)^n=G(x).$$

In such circumstances, the limiting function $G$ is the cumulative distribution function of a distribution belonging to either the Gumbel, the Fréchet, or the Weibull distribution family.[6]

In other words, if the limit above converges, then up to a linear change of coordinates $G(x)$ will assume either the form:[7]

$$G_\gamma(x)=\exp\bigl(-(1+\gamma x)^{-1/\gamma}\bigr)\quad\text{for }\gamma\neq 0,$$

defined for all $x$ satisfying $1+\gamma x>0$, or else the form:

$$G_0(x)=\exp\bigl(-\exp(-x)\bigr)\quad\text{for }\gamma=0.$$

This is the cumulative distribution function of the generalized extreme value distribution (GEV) with extreme value index $\gamma$. The GEV distribution groups the Gumbel, Fréchet, and Weibull distributions into a single composite form.
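
The composite form can be evaluated directly. The following is a minimal sketch, not a reference implementation: the function name `gev_cdf` is hypothetical, and the values returned outside the region $1+\gamma x>0$ (0 below a lower endpoint, 1 above an upper endpoint) are the usual convention for extending a distribution function, assumed here rather than stated in the text.

```python
import math

def gev_cdf(x: float, gamma: float) -> float:
    """Standardized GEV distribution function G_gamma(x) (hypothetical helper)."""
    if gamma == 0.0:
        # Gumbel case: G_0(x) = exp(-exp(-x))
        return math.exp(-math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        # Outside the region 1 + gamma*x > 0 (assumed convention):
        # 0 below the lower endpoint (gamma > 0), 1 above the upper endpoint (gamma < 0).
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-(z ** (-1.0 / gamma)))

frechet_value = gev_cdf(1.0, 1.0)    # gamma > 0: Fréchet type
gumbel_value = gev_cdf(1.0, 0.0)     # gamma = 0: Gumbel type
weibull_value = gev_cdf(0.5, -1.0)   # gamma < 0: Weibull type
near_gumbel = gev_cdf(1.0, 1e-6)     # small gamma should be close to the Gumbel value
```

Evaluating with a very small non-zero `gamma` gives a value close to the Gumbel case, which illustrates how the single family covers all three types.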

Conditions of convergence

The Fisher–Tippett–Gnedenko theorem is a statement about the form of the limiting distribution $G(x)$ above, provided that a limit exists. The study of conditions under which the normalized maximum converges to particular cases of the generalized extreme value distribution began with von Mises (1936)[3][5][4] and was further developed by Gnedenko (1943).[5]

Let $F$ be the distribution function of $X$, and let $X_1,\dots,X_n$ be an i.i.d. sample thereof. Also let $x_{\mathsf{max}}$ be the population maximum: $x_{\mathsf{max}}\equiv\sup\{x\mid F(x)<1\}$.

The limiting distribution of the normalized sample maximum, given by $G$ above, will then be one of the following three types:[7]

  • Fréchet distribution ($\gamma>0$): For strictly positive $\gamma$, the normalized maximum converges to this type if and only if $x_{\mathsf{max}}=\infty$ and
$\lim_{t\to\infty}\frac{1-F(ut)}{1-F(t)}=u^{-1/\gamma}$ for all $u>0$.
In this case, possible sequences that will satisfy the theorem conditions are $b_n=0$ and $a_n=F^{-1}\!\left(1-\tfrac{1}{n}\right)$. Strictly positive $\gamma$ corresponds to what is called a heavy-tailed distribution.
  • Gumbel distribution ($\gamma=0$): For $\gamma=0$, and with $x_{\mathsf{max}}$ either finite or infinite, the normalized maximum converges to this type if and only if
$\lim_{t\to x_{\mathsf{max}}}\frac{1-F(t+u\,\tilde{g}(t))}{1-F(t)}=\mathrm{e}^{-u}$ for all $u>0$ with $\tilde{g}(t)\equiv\frac{\int_t^{x_{\mathsf{max}}}\bigl(1-F(s)\bigr)\,\mathrm{d}s}{1-F(t)}$.
Possible sequences here are $b_n=F^{-1}\!\left(1-\tfrac{1}{n}\right)$ and $a_n=\tilde{g}\bigl(F^{-1}\!\left(1-\tfrac{1}{n}\right)\bigr)$.
  • Weibull distribution ($\gamma<0$): For strictly negative $\gamma$, the normalized maximum converges to this type if and only if $x_{\mathsf{max}}<\infty$ (is finite) and
$\lim_{t\to 0^{+}}\frac{1-F(x_{\mathsf{max}}-ut)}{1-F(x_{\mathsf{max}}-t)}=u^{-1/\gamma}$ for all $u>0$.
Note that for this case the exponent $-1/\gamma$ is strictly positive, since $\gamma$ is strictly negative.
Possible sequences here are $b_n=x_{\mathsf{max}}$ and $a_n=x_{\mathsf{max}}-F^{-1}\!\left(1-\tfrac{1}{n}\right)$.

Note that the second formula (the Gumbel distribution) is the limit of the first (the Fréchet distribution) as $\gamma$ goes to zero.
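
As an illustration of the norming sequences for the Gumbel case, the sketch below uses the unit exponential distribution, which is an example chosen here and not taken from the text. For $F(x)=1-\mathrm{e}^{-x}$ the auxiliary function is $\tilde{g}(t)=1$ and $F^{-1}(1-\tfrac{1}{n})=\ln n$, so the recipe above gives $a_n=1$ and $b_n=\ln n$. All function names are hypothetical.

```python
import math

def exp_cdf(x: float) -> float:
    """CDF of the unit exponential distribution (illustrative choice of F)."""
    return 1.0 - math.exp(-x) if x > 0.0 else 0.0

def exp_quantile(p: float) -> float:
    """Quantile function F^{-1}(p) of the unit exponential distribution."""
    return -math.log(1.0 - p)

def normalized_max_cdf(x: float, n: int) -> float:
    """F(a_n * x + b_n)^n with the Gumbel-case sequences from the list above."""
    b_n = exp_quantile(1.0 - 1.0 / n)  # equals ln(n) up to rounding
    a_n = 1.0                          # g~(t) = 1 for every t in this example
    return exp_cdf(a_n * x + b_n) ** n

def gumbel_limit(x: float) -> float:
    """G_0(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

# Largest gap between the finite-n CDF and the limit over a few points.
max_gap = max(abs(normalized_max_cdf(x, 10**6) - gumbel_limit(x))
              for x in (-1.0, 0.0, 1.0, 2.0))
```

Here the finite-sample expression is $(1-\mathrm{e}^{-x}/n)^n$, so the gap to $G_0(x)$ shrinks quickly as `n` grows.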

Examples


Fréchet distribution

A Cauchy distribution with scale parameter $\pi$ has density function:

$$f(x)=\frac{1}{\pi^2+x^2},$$

and its cumulative distribution function is:

$$F(x)=\frac{1}{2}+\frac{1}{\pi}\arctan\left(\frac{x}{\pi}\right).$$

A little bit of calculus shows that the right tail probability $1-F(x)$ is asymptotic to $\frac{1}{x}$, or

$$\ln F(x)\sim-\frac{1}{x}\quad\text{as}\quad x\to\infty,$$

so we have

$$\ln\bigl(F(x)^n\bigr)=n\ln F(x)\sim-\frac{n}{x}.$$

Thus we have

$$F(x)^n\approx\exp\left(-\frac{n}{x}\right)$$

and letting $u\equiv\frac{x}{n}-1$ (and skipping some explanation)

$$\lim_{n\to\infty}\bigl(F(nu+n)^n\bigr)=\exp\left(-\frac{1}{1+u}\right)=G_1(u)$$

for any $u>-1$.
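
The limit can be checked numerically. This is a small sketch under the assumptions of the example (the Cauchy CDF given above, $a_n=b_n=n$); the function names are hypothetical.

```python
import math

def cauchy_cdf(x: float) -> float:
    """F(x) = 1/2 + arctan(x / pi) / pi, the CDF used in this example."""
    return 0.5 + math.atan(x / math.pi) / math.pi

def normalized_max_cdf(u: float, n: int) -> float:
    """F(n*u + n)^n, the distribution function of (max - n) / n."""
    return cauchy_cdf(n * u + n) ** n

def frechet_limit(u: float) -> float:
    """G_1(u) = exp(-1 / (1 + u)), valid for u > -1."""
    return math.exp(-1.0 / (1.0 + u))

# Largest gap between the finite-n value and the limit over a few points.
max_gap = max(abs(normalized_max_cdf(u, 10**6) - frechet_limit(u))
              for u in (0.0, 1.0, 3.0))
```

For large `n` the two functions agree closely at each test point, consistent with convergence to the Fréchet type with $\gamma=1$.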

Gumbel distribution

Let us take the standard normal distribution with cumulative distribution function

$$F(x)=\frac{1}{2}\operatorname{erfc}\left(\frac{-x}{\sqrt{2}}\right).$$

We have

$$\ln F(x)\sim-\frac{\exp\left(-\tfrac{1}{2}x^2\right)}{\sqrt{2\pi}\,x}\quad\text{as}\quad x\to\infty$$

and thus

$$\ln\bigl(F(x)^n\bigr)=n\ln F(x)\sim-\frac{n\exp\left(-\tfrac{1}{2}x^2\right)}{\sqrt{2\pi}\,x}\quad\text{as}\quad x\to\infty.$$

Hence we have

$$F(x)^n\approx\exp\left(-\frac{n\exp\left(-\tfrac{1}{2}x^2\right)}{\sqrt{2\pi}\,x}\right).$$

If we define $c_n$ as the value that exactly satisfies

$$\frac{n\exp\left(-\tfrac{1}{2}c_n^2\right)}{\sqrt{2\pi}\,c_n}=1,$$

then around $x=c_n$

$$\frac{n\exp\left(-\tfrac{1}{2}x^2\right)}{\sqrt{2\pi}\,x}\approx\exp\bigl(c_n(c_n-x)\bigr).$$

As $n$ increases, this becomes a good approximation for a wider and wider range of $c_n(c_n-x)$, so letting $u\equiv c_n(x-c_n)$ we find that

$$\lim_{n\to\infty}\left(F\left(\tfrac{u}{c_n}+c_n\right)^n\right)=\exp\bigl(-\exp(-u)\bigr)=G_0(u).$$

Equivalently,

$$\lim_{n\to\infty}\mathbb{P}\left(\frac{\max\{X_1,\ldots,X_n\}-c_n}{1/c_n}\leq u\right)=\exp\bigl(-\exp(-u)\bigr)=G_0(u).$$

With this result, we see retrospectively that we need $\ln c_n\approx\frac{\ln\ln n}{2}$ and then

$$c_n\approx\sqrt{2\ln n},$$

so the maximum is expected to climb toward infinity ever more slowly.
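
The construction can be reproduced numerically. The sketch below is one possible way to do it, not a method given in the text: it solves the defining equation for $c_n$ by bisection (an arbitrary choice of root finder) and evaluates $F(u/c_n+c_n)^n$ through the right tail to avoid rounding loss. Function names are hypothetical.

```python
import math

def normal_sf(x: float) -> float:
    """Right tail 1 - F(x) of the standard normal, computed via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def solve_c_n(n: int, iterations: int = 200) -> float:
    """Bisection for c_n in n*exp(-c^2/2) / (sqrt(2*pi)*c) = 1, in log form."""
    def h(c: float) -> float:
        # Log of the left-hand side; strictly decreasing in c for c > 0.
        return (math.log(n) - 0.5 * c * c
                - 0.5 * math.log(2.0 * math.pi) - math.log(c))
    lo, hi = 1e-9, math.sqrt(2.0 * math.log(n)) + 1.0  # h(lo) > 0 > h(hi) for n >= 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normalized_max_cdf(u: float, n: int) -> float:
    """F(u / c_n + c_n)^n, evaluated as exp(n * log(1 - tail))."""
    c_n = solve_c_n(n)
    x = u / c_n + c_n
    return math.exp(n * math.log1p(-normal_sf(x)))

def gumbel_limit(u: float) -> float:
    """G_0(u) = exp(-exp(-u))."""
    return math.exp(-math.exp(-u))

gap_small_n = abs(normalized_max_cdf(0.0, 10**2) - gumbel_limit(0.0))
gap_large_n = abs(normalized_max_cdf(0.0, 10**6) - gumbel_limit(0.0))
```

Comparing `gap_small_n` with `gap_large_n` shows the gap shrinking as `n` grows, but only slowly, which fits the observation that $c_n$ grows like $\sqrt{2\ln n}$.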

Weibull distribution

We may take the simplest example, a uniform distribution between 0 and 1, with cumulative distribution function

$$F(x)=x$$ for any $x$ value from 0 to 1.

For values of $x\to 1$ we have

$$\ln\bigl(F(x)^n\bigr)=n\ln F(x)\sim-n(1-x).$$

So for $x\approx 1$ we have

$$F(x)^n\approx\exp(nx-n).$$

Let $u\equiv 1-n(1-x)$ and get

$$\lim_{n\to\infty}\Bigl(F\!\left(\tfrac{u}{n}+1-\tfrac{1}{n}\right)\Bigr)^n=\exp\bigl(-(1-u)\bigr)=G_{-1}(u).$$

Close examination of that limit shows that the expected maximum approaches 1 in inverse proportion to $n$.
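
A short numerical check of this example follows, as a sketch with hypothetical function names. The closed form $n/(n+1)$ for the expected maximum of $n$ uniform variables is a standard order-statistics result that is not stated in the text; it is included only to illustrate the $1/n$ rate.

```python
import math

def uniform_cdf(x: float) -> float:
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

def normalized_max_cdf(u: float, n: int) -> float:
    """F(u/n + 1 - 1/n)^n, as in the limit above."""
    return uniform_cdf(u / n + 1.0 - 1.0 / n) ** n

def weibull_limit(u: float) -> float:
    """G_{-1}(u) = exp(-(1 - u)) for u < 1, and 1 at or beyond the upper endpoint."""
    return math.exp(-(1.0 - u)) if u < 1.0 else 1.0

def expected_max(n: int) -> float:
    """E[max] = n / (n + 1) for n uniform samples (standard result, assumed here)."""
    return n / (n + 1.0)

# Largest gap between the finite-n value and the limit over a few points.
max_gap = max(abs(normalized_max_cdf(u, 10**6) - weibull_limit(u))
              for u in (-2.0, 0.0, 0.5, 1.0, 2.0))
shortfall = 1.0 - expected_max(1000)  # equals 1 / 1001
```

The shortfall $1-\mathrm{E}[\max]$ equals $1/(n+1)$, matching the inverse proportionality to $n$.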

References

  1. Fréchet, M. (1927). "Sur la loi de probabilité de l'écart maximum". Annales de la Société Polonaise de Mathématique. 6 (1): 93–116.
  2. Fisher, R. A.; Tippett, L. H. C. (1928). "Limiting forms of the frequency distribution of the largest and smallest member of a sample". Mathematical Proceedings of the Cambridge Philosophical Society. 24 (2): 180–190. Bibcode:1928PCPS...24..180F. doi:10.1017/s0305004100015681. S2CID 123125823.
  3. von Mises, R. (1936). "La distribution de la plus grande de n valeurs" [The distribution of the largest of n values]. Rev. Math. Union Interbalcanique. 1 (in French): 141–160.
  4. Falk, Michael; Marohn, Frank (1993). "von Mises conditions revisited". The Annals of Probability: 1310–1328.
  5. Gnedenko, B.V. (1943). "Sur la distribution limite du terme maximum d'une serie aleatoire". Annals of Mathematics. 44 (3): 423–453. doi:10.2307/1968974. JSTOR 1968974.
  6. Mood, A.M. (1950). "5. Order Statistics". Introduction to the theory of statistics. New York, NY: McGraw-Hill. pp. 251–270.
  7. Haan, Laurens; Ferreira, Ana (2007). Extreme Value Theory: An introduction. Springer.
