
Generalized Pareto distribution

Family of probability distributions often used to model tails or extreme values
This article is about a particular family of continuous distributions referred to as the generalized Pareto distribution. For the hierarchy of generalized Pareto distributions, see Pareto distribution.
Generalized Pareto distribution

Probability density function
[Plot (Gpdpdf): GPD distribution functions for $\mu = 0$ and different values of $\sigma$ and $\xi$]

Cumulative distribution function
[Plot (Gpdcdf)]

Parameters: $\mu \in (-\infty, \infty)$ location (real); $\sigma \in (0, \infty)$ scale (real); $\xi \in (-\infty, \infty)$ shape (real)
Support: $x \geq \mu$ for $\xi \geq 0$; $\mu \leq x \leq \mu - \sigma/\xi$ for $\xi < 0$
PDF: $\frac{1}{\sigma}(1 + \xi z)^{-(1/\xi + 1)}$, where $z = \frac{x - \mu}{\sigma}$
CDF: $1 - (1 + \xi z)^{-1/\xi}$
Mean: $\mu + \frac{\sigma}{1 - \xi}$ ($\xi < 1$)
Median: $\mu + \frac{\sigma(2^{\xi} - 1)}{\xi}$
Mode: $\mu$
Variance: $\frac{\sigma^{2}}{(1 - \xi)^{2}(1 - 2\xi)}$ ($\xi < 1/2$)
Skewness: $\frac{2(1 + \xi)\sqrt{1 - 2\xi}}{1 - 3\xi}$ ($\xi < 1/3$)
Excess kurtosis: $\frac{3(1 - 2\xi)(2\xi^{2} + \xi + 3)}{(1 - 3\xi)(1 - 4\xi)} - 3$ ($\xi < 1/4$)
Entropy: $\log(\sigma) + \xi + 1$
MGF: $e^{\theta\mu} \sum_{j=0}^{\infty} \frac{(\theta\sigma)^{j}}{\prod_{k=0}^{j}(1 - k\xi)}$ ($k\xi < 1$)
CF: $e^{it\mu} \sum_{j=0}^{\infty} \frac{(it\sigma)^{j}}{\prod_{k=0}^{j}(1 - k\xi)}$ ($k\xi < 1$)
Method of moments: $\xi = \frac{1}{2}\left(1 - \frac{(\operatorname{E}[X] - \mu)^{2}}{\operatorname{Var}[X]}\right)$, $\sigma = (\operatorname{E}[X] - \mu)(1 - \xi)$
Expected shortfall: $\begin{cases} \mu + \sigma\left[\frac{(1-p)^{-\xi}}{1-\xi} + \frac{(1-p)^{-\xi} - 1}{\xi}\right] & \xi \neq 0 \\ \mu + \sigma[1 - \ln(1-p)] & \xi = 0 \end{cases}$[1]

In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location $\mu$, scale $\sigma$, and shape $\xi$.[2][3] Sometimes it is specified by only scale and shape[4] and sometimes only by its shape parameter. Some references give the shape parameter as $\kappa = -\xi$.[5]

With shape $\xi > 0$ and location $\mu = \sigma/\xi$, the GPD is equivalent to the Pareto distribution with scale $x_m = \sigma/\xi$ and shape $\alpha = 1/\xi$.
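This equivalence can be checked numerically. The following is a minimal sketch (assuming SciPy's `genpareto` and `pareto` distributions, whose shape arguments `c` and `b` correspond to $\xi$ and $\alpha$ respectively; the parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import genpareto, pareto

# Arbitrary illustrative values
xi, sigma = 0.5, 2.0
mu = sigma / xi  # location chosen as mu = sigma/xi

gpd = genpareto(c=xi, loc=mu, scale=sigma)  # GPD(mu = sigma/xi, sigma, xi)
par = pareto(b=1.0 / xi, scale=sigma / xi)  # Pareto with x_m = sigma/xi, alpha = 1/xi

x = np.linspace(mu, mu + 20.0, 200)
print(np.allclose(gpd.cdf(x), par.cdf(x)))  # expected: True
```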

Definition


The cumulative distribution function of $X \sim \text{GPD}(\mu, \sigma, \xi)$ ($\mu \in \mathbb{R}$, $\sigma > 0$, and $\xi \in \mathbb{R}$) is

$$F_{(\mu,\sigma,\xi)}(x) = \begin{cases} 1 - \left(1 + \frac{\xi(x-\mu)}{\sigma}\right)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - \exp\left(-\frac{x-\mu}{\sigma}\right) & \text{for } \xi = 0, \end{cases}$$

where the support of $X$ is $x \geq \mu$ when $\xi \geq 0$, and $\mu \leq x \leq \mu - \sigma/\xi$ when $\xi < 0$.

The probability density function (pdf) of $X \sim \text{GPD}(\mu, \sigma, \xi)$ is

$$f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi(x-\mu)}{\sigma}\right)^{-(1 + 1/\xi)},$$

again, for $x \geq \mu$ when $\xi \geq 0$, and $\mu \leq x \leq \mu - \sigma/\xi$ when $\xi < 0$.

The pdf is a solution of the following differential equation:[citation needed]

$$\begin{cases} f'(x)\left(-\mu\xi + \sigma + \xi x\right) + (\xi + 1) f(x) = 0, \\ f(0) = \frac{1}{\sigma}\left(1 - \frac{\mu\xi}{\sigma}\right)^{-\frac{1}{\xi} - 1} \end{cases}$$

The standard cumulative distribution function (cdf) of the GPD is defined using $z = \frac{x-\mu}{\sigma}$.[6]

$$F_{\xi}(z) = \begin{cases} 1 - (1 + \xi z)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - e^{-z} & \text{for } \xi = 0, \end{cases}$$

where the support is $z \geq 0$ for $\xi \geq 0$ and $0 \leq z \leq -1/\xi$ for $\xi < 0$. The corresponding probability density function (pdf) is

$$f_{\xi}(z) = \begin{cases} (1 + \xi z)^{-(1 + 1/\xi)} & \text{for } \xi \neq 0, \\ e^{-z} & \text{for } \xi = 0. \end{cases}$$
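For concreteness, the three-parameter forms above can be evaluated directly and cross-checked against a library implementation. The sketch below assumes SciPy, whose `genpareto` distribution uses the shape argument `c` for $\xi$ together with `loc` and `scale` for $\mu$ and $\sigma$; the parameter values are arbitrary:

```python
import numpy as np
from scipy.stats import genpareto

def gpd_cdf(x, mu, sigma, xi):
    """CDF of GPD(mu, sigma, xi) from the definition above."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return 1.0 - np.exp(-z)
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)

def gpd_pdf(x, mu, sigma, xi):
    """PDF of GPD(mu, sigma, xi) from the definition above."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return np.exp(-z) / sigma
    return (1.0 + xi * z) ** (-(1.0 / xi + 1.0)) / sigma

# Arbitrary illustrative parameter values
mu, sigma, xi = 0.0, 1.0, 0.25
x = np.linspace(mu, mu + 10.0, 101)

# Cross-check against SciPy's parameterization (c = xi, loc = mu, scale = sigma)
print(np.allclose(gpd_cdf(x, mu, sigma, xi), genpareto.cdf(x, c=xi, loc=mu, scale=sigma)))
print(np.allclose(gpd_pdf(x, mu, sigma, xi), genpareto.pdf(x, c=xi, loc=mu, scale=sigma)))
```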

Special cases

  • If $\xi = 0$, the GPD is the exponential distribution.
  • If $\xi > 0$, the GPD is the Pareto distribution with shape $\alpha = 1/\xi$.
  • If $\xi < 0$, the GPD is the power function distribution with shape $\alpha = -1/\xi$.
  • If $\xi = -1$, the GPD is the continuous uniform distribution $U(0, \sigma)$.[7]
  • If $X \sim \mathrm{GPD}(\mu = 0, \sigma, \xi)$, then $Y = \log(X) \sim \mathrm{exGPD}(\sigma, \xi)$ [1]. (exGPD stands for the exponentiated generalized Pareto distribution.)
  • The GPD is similar to the Burr distribution.
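The first few special cases can be verified numerically, as in the brief sketch below (assuming SciPy, which treats the $\xi = 0$ boundary of `genpareto` as the exponential limit; the scale value is arbitrary):

```python
import numpy as np
from scipy.stats import genpareto, expon, uniform

sigma = 2.0                      # arbitrary scale for illustration
x = np.linspace(0.0, sigma, 50)  # points inside the support in both cases

# xi = 0: GPD(0, sigma, 0) reduces to the exponential distribution with scale sigma
print(np.allclose(genpareto.cdf(x, c=0.0, scale=sigma), expon.cdf(x, scale=sigma)))

# xi = -1: GPD(0, sigma, -1) reduces to the uniform distribution on (0, sigma)
print(np.allclose(genpareto.cdf(x, c=-1.0, scale=sigma), uniform.cdf(x, loc=0.0, scale=sigma)))
```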

Prediction

  • It is often of interest to predict probabilities of out-of-sample data under the assumption that both the training data and the out-of-sample data follow a GPD.
  • Predictions of probabilities generated by substituting maximum likelihood estimates of the GPD parameters into the cumulative distribution function ignore parameter uncertainty. As a result, the probabilities are not well calibrated, do not reflect the frequencies of out-of-sample events, and, in particular, underestimate the probabilities of out-of-sample tail events.[8]
  • Predictions generated using the objective Bayesian approach of calibrating prior prediction have been shown to greatly reduce this underestimation, although not completely eliminate it.[8] Calibrating prior prediction is implemented in the R software package fitdistcp.[2]

Generating generalized Pareto random variables


Generating GPD random variables


If U is uniformly distributed on (0, 1], then

$$X = \mu + \frac{\sigma(U^{-\xi} - 1)}{\xi} \sim \mathrm{GPD}(\mu, \sigma, \xi \neq 0) \quad \text{and} \quad X = \mu - \sigma\ln(U) \sim \mathrm{GPD}(\mu, \sigma, \xi = 0).$$

Both formulas are obtained by inversion of the cdf.

The Pareto package in R and the gprnd command in the Matlab Statistics Toolbox can be used to generate generalized Pareto random numbers.
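As an illustration of the inversion method, the sketch below (assuming NumPy and SciPy; the parameter values are arbitrary) draws GPD samples from uniform variates and checks them against SciPy's `genpareto` with a Kolmogorov–Smirnov test:

```python
import numpy as np
from scipy.stats import genpareto, kstest

rng = np.random.default_rng(0)

def rgpd(n, mu, sigma, xi, rng):
    """Draw n GPD(mu, sigma, xi) variates by inverting the cdf."""
    u = rng.uniform(size=n)  # U ~ Uniform[0, 1); a draw of exactly 0 is practically negligible
    if xi == 0.0:
        return mu - sigma * np.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi

# Arbitrary illustrative parameters
mu, sigma, xi = 0.0, 1.0, 0.3
samples = rgpd(100_000, mu, sigma, xi, rng)

# A large KS p-value indicates the samples are consistent with the reference GPD
print(kstest(samples, genpareto(c=xi, loc=mu, scale=sigma).cdf))
```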

GPD as an Exponential-Gamma Mixture


A GPD random variable can also be expressed as an exponential random variable with a Gamma-distributed rate parameter:

If $X \mid \Lambda \sim \mathrm{Exp}(\Lambda)$ and $\Lambda \sim \mathrm{Gamma}(\alpha, \beta)$, then $X \sim \mathrm{GPD}(\xi = 1/\alpha,\ \sigma = \beta/\alpha)$.

Note, however, that since the parameters of the Gamma distribution must be greater than zero, this representation imposes the additional restriction that $\xi$ must be positive.

In addition to this mixture (or compound) expression, the generalized Pareto distribution can also be expressed as a simple ratio. Concretely, for $Y \sim \mathrm{Exp}(1)$ and $Z \sim \mathrm{Gamma}(1/\xi, 1)$, we have $\mu + \frac{\sigma Y}{\xi Z} \sim \mathrm{GPD}(\mu, \sigma, \xi)$. This is a consequence of the mixture representation after setting $\beta = \alpha$ and noting that the rate parameters of the exponential and gamma distributions are simply inverse multiplicative constants.
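The mixture representation can likewise be checked by simulation. In the sketch below (assuming NumPy and SciPy, and noting that NumPy's Gamma sampler is parameterized by shape and scale, so a rate $\beta$ enters as scale $1/\beta$; the parameter values are arbitrary), exponential draws with Gamma-distributed rates are compared with the corresponding GPD:

```python
import numpy as np
from scipy.stats import genpareto, kstest

rng = np.random.default_rng(1)

# Arbitrary Gamma parameters (shape alpha, rate beta)
alpha, beta = 4.0, 2.0
n = 100_000

# Lambda ~ Gamma(alpha, rate=beta); NumPy uses shape and scale = 1/rate
lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)

# X | Lambda ~ Exp(rate=Lambda), i.e. exponential with scale 1/Lambda
x = rng.exponential(scale=1.0 / lam)

# The mixture should match GPD(mu = 0, sigma = beta/alpha, xi = 1/alpha)
xi, sigma = 1.0 / alpha, beta / alpha
print(kstest(x, genpareto(c=xi, loc=0.0, scale=sigma).cdf))
```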

Exponentiated generalized Pareto distribution


The exponentiated generalized Pareto distribution (exGPD)

[Plot: the pdf of the $\mathrm{exGPD}(\sigma, \xi)$ (exponentiated generalized Pareto distribution) for different values of $\sigma$ and $\xi$]

If $X \sim \mathrm{GPD}(\mu = 0, \sigma, \xi)$, then $Y = \log(X)$ is distributed according to the exponentiated generalized Pareto distribution, denoted by $Y \sim \mathrm{exGPD}(\sigma, \xi)$.

The probability density function (pdf) of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ ($\sigma > 0$) is

$$g_{(\sigma,\xi)}(y) = \begin{cases} \frac{e^{y}}{\sigma}\left(1 + \frac{\xi e^{y}}{\sigma}\right)^{-1/\xi - 1} & \text{for } \xi \neq 0, \\ \frac{1}{\sigma}\, e^{y - e^{y}/\sigma} & \text{for } \xi = 0, \end{cases}$$

where the support is $-\infty < y < \infty$ for $\xi \geq 0$, and $-\infty < y \leq \log(-\sigma/\xi)$ for $\xi < 0$.

For all $\xi$, $\log\sigma$ acts as the location parameter. See the figure above for the pdf when the shape $\xi$ is positive.

The exGPD has finite moments of all orders for all $\sigma > 0$ and $-\infty < \xi < \infty$.

[Plot: the variance of the $\mathrm{exGPD}(\sigma, \xi)$ as a function of $\xi$. The variance depends only on $\xi$; the red dotted line marks the variance at $\xi = 0$, namely $\psi'(1) = \pi^{2}/6$]

The moment-generating function of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ is

$$M_{Y}(s) = \operatorname{E}\left[e^{sY}\right] = \begin{cases} -\frac{1}{\xi}\left(-\frac{\sigma}{\xi}\right)^{s} B(s+1,\, -1/\xi) & \text{for } -1 < s < \infty,\ \xi < 0, \\ \frac{1}{\xi}\left(\frac{\sigma}{\xi}\right)^{s} B(s+1,\, 1/\xi - s) & \text{for } -1 < s < 1/\xi,\ \xi > 0, \\ \sigma^{s}\,\Gamma(1+s) & \text{for } -1 < s < \infty,\ \xi = 0, \end{cases}$$

where $B(a, b)$ and $\Gamma(a)$ denote the beta function and gamma function, respectively.

The expected value of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ depends on both the scale $\sigma$ and shape $\xi$ parameters, with $\xi$ entering through the digamma function:

$$\operatorname{E}[Y] = \begin{cases} \log\left(-\frac{\sigma}{\xi}\right) + \psi(1) - \psi(-1/\xi + 1) & \text{for } \xi < 0, \\ \log\sigma - \log\xi + \psi(1) - \psi(1/\xi) & \text{for } \xi > 0, \\ \log\sigma + \psi(1) & \text{for } \xi = 0. \end{cases}$$

Note that for any fixed value of $\xi \in (-\infty, \infty)$, $\log\sigma$ plays the role of a location parameter under the exponentiated generalized Pareto distribution.

The variance of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ depends on the shape parameter $\xi$ only, through the polygamma function of order 1 (also called the trigamma function):

$$\operatorname{Var}[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi + 1) & \text{for } \xi < 0, \\ \psi'(1) + \psi'(1/\xi) & \text{for } \xi > 0, \\ \psi'(1) & \text{for } \xi = 0. \end{cases}$$

See the figure above for the variance as a function of $\xi$. Note that $\psi'(1) = \pi^{2}/6 \approx 1.644934$.
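A short simulation can confirm these expressions for a positive shape parameter. The sketch below (assuming NumPy and SciPy; the parameter choices are arbitrary) log-transforms GPD samples and compares the sample mean and variance of $Y$ with the digamma and trigamma formulas above:

```python
import numpy as np
from scipy.stats import genpareto
from scipy.special import digamma, polygamma

rng = np.random.default_rng(2)

# Arbitrary parameters with positive shape
sigma, xi = 1.5, 0.4
y = np.log(genpareto.rvs(c=xi, loc=0.0, scale=sigma, size=1_000_000, random_state=rng))

# Theoretical mean and variance of Y ~ exGPD(sigma, xi) for xi > 0
mean_theory = np.log(sigma) - np.log(xi) + digamma(1) - digamma(1.0 / xi)
var_theory = polygamma(1, 1) + polygamma(1, 1.0 / xi)

print(y.mean(), mean_theory)   # the two values should be close
print(y.var(), var_theory)
```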

Note that the roles of the scale parameter $\sigma$ and the shape parameter $\xi$ under $Y \sim \mathrm{exGPD}(\sigma, \xi)$ are separately interpretable, which may lead to more robust and efficient estimation of $\xi$ than working with $X \sim \mathrm{GPD}(\sigma, \xi)$ directly [3]. Under $X \sim \mathrm{GPD}(\mu = 0, \sigma, \xi)$ the roles of the two parameters are intertwined (at least up to the second central moment); see the formula for the variance $\operatorname{Var}(X)$, in which both parameters appear.

Hill's estimator


Assume that $X_{1:n} = (X_{1}, \cdots, X_{n})$ are $n$ observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution $F$ whose tail distribution is regularly varying with tail-index $1/\xi$ (hence, the corresponding shape parameter is $\xi$). To be specific, the tail distribution is described as

$$\bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \quad \text{for some } \xi > 0, \text{ where } L \text{ is a slowly varying function.}$$

Estimating the shape parameter $\xi$ is of particular interest in extreme value theory, especially when $\xi$ is positive (the heavy-tailed case).

Let $F_{u}$ be the conditional excess distribution function of these observations. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions $F$, and large $u$, $F_{u}$ is well approximated by the generalized Pareto distribution (GPD), which motivates Peaks Over Threshold (POT) methods for estimating $\xi$: the GPD plays the key role in the POT approach.

A well-known estimator based on the POT methodology is Hill's estimator. It is formulated as follows. For $1 \leq i \leq n$, write $X_{(i)}$ for the $i$-th largest value of $X_{1}, \cdots, X_{n}$. Then, with this notation, the Hill estimator (see page 190 of Embrechts et al. [4]) based on the $k$ upper order statistics is defined as

$$\widehat{\xi}_{k}^{\text{Hill}} = \widehat{\xi}_{k}^{\text{Hill}}(X_{1:n}) = \frac{1}{k-1} \sum_{j=1}^{k-1} \log\left(\frac{X_{(j)}}{X_{(k)}}\right), \quad \text{for } 2 \leq k \leq n.$$

In practice, the Hill estimator is used as follows. First, compute $\widehat{\xi}_{k}^{\text{Hill}}$ at each integer $k \in \{2, \cdots, n\}$, and plot the ordered pairs $\{(k, \widehat{\xi}_{k}^{\text{Hill}})\}_{k=2}^{n}$. Then select from the set of Hill estimators $\{\widehat{\xi}_{k}^{\text{Hill}}\}_{k=2}^{n}$ those values that are roughly constant with respect to $k$: these stable values are regarded as reasonable estimates of the shape parameter $\xi$. If $X_{1}, \cdots, X_{n}$ are i.i.d., then the Hill estimator is a consistent estimator of the shape parameter $\xi$ [5].
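A minimal sketch of this procedure (assuming NumPy; the Pareto-tailed test sample and the range of $k$ are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def hill_estimator(x, k):
    """Hill estimator based on the k upper order statistics (2 <= k <= n)."""
    x_desc = np.sort(x)[::-1]                  # X_(1) >= X_(2) >= ... >= X_(n)
    return np.mean(np.log(x_desc[: k - 1] / x_desc[k - 1]))

# Illustrative heavy-tailed sample: classical Pareto with true xi = 0.5 (tail index 2)
n, true_xi = 10_000, 0.5
x = rng.pareto(1.0 / true_xi, size=n) + 1.0    # NumPy's pareto is Lomax; adding 1 gives the classical Pareto

# Hill plot: estimates over a range of k; look for a region where they stabilize
ks = np.arange(2, 1001)
estimates = np.array([hill_estimator(x, k) for k in ks])
print(estimates[ks == 500])                    # should be roughly 0.5
```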

Note that the Hill estimator $\widehat{\xi}_{k}^{\text{Hill}}$ makes use of a log-transformation of the observations $X_{1:n} = (X_{1}, \cdots, X_{n})$. (The Pickands estimator $\widehat{\xi}_{k}^{\text{Pickands}}$ also employs a log-transformation, but in a slightly different way [6].)


References

  1. Norton, Matthew; Khokhlov, Valentyn; Uryasev, Stan (2019). "Calculating CVaR and bPOE for common probability distributions with application to portfolio optimization and density estimation" (PDF). Annals of Operations Research. 299 (1–2). Springer: 1281–1315. arXiv:1811.11301. doi:10.1007/s10479-019-03373-1. S2CID 254231768. Archived from the original (PDF) on 31 March 2023. Retrieved 27 February 2023.
  2. Coles, Stuart (12 December 2001). An Introduction to Statistical Modeling of Extreme Values. Springer. p. 75. ISBN 9781852334598.
  3. Dargahi-Noubary, G. R. (1989). "On tail estimation: An improved method". Mathematical Geology. 21 (8): 829–842. Bibcode:1989MatGe..21..829D. doi:10.1007/BF00894450. S2CID 122710961.
  4. Hosking, J. R. M.; Wallis, J. R. (1987). "Parameter and Quantile Estimation for the Generalized Pareto Distribution". Technometrics. 29 (3): 339–349. doi:10.2307/1269343. JSTOR 1269343.
  5. Davison, A. C. (30 September 1984). "Modelling Excesses over High Thresholds, with an Application". In de Oliveira, J. Tiago (ed.). Statistical Extremes and Applications. Kluwer. p. 462. ISBN 9789027718044.
  6. Embrechts, Paul; Klüppelberg, Claudia; Mikosch, Thomas (1997). Modelling extremal events for insurance and finance. Springer. p. 162. ISBN 9783540609315.
  7. Castillo, Enrique; Hadi, Ali S. (1997). "Fitting the generalized Pareto distribution to data". Journal of the American Statistical Association. 92 (440): 1609–1620.
  8. Jewson, Stephen; Sweeting, Trevor; Jewson, Lynne (20 February 2025). "Reducing reliability bias in assessments of extreme weather risk using calibrating priors". Advances in Statistical Climatology, Meteorology and Oceanography. 11 (1): 1–22. Bibcode:2025ASCMO..11....1J. doi:10.5194/ascmo-11-1-2025. ISSN 2364-3579.
