Zeta distribution

From Wikipedia, the free encyclopedia
zeta
Probability mass function
[Plot of the zeta PMF on a log-log scale. The function is defined only at positive integer values of k; the connecting lines do not indicate continuity.]
Cumulative distribution function
[Plot of the zeta CDF.]
Parameters $s \in (1, \infty)$
Support $k \in \{1, 2, \ldots\}$
PMF $\frac{1/k^{s}}{\zeta(s)}$
CDF $\frac{H_{k,s}}{\zeta(s)}$
Mean $\frac{\zeta(s-1)}{\zeta(s)}$ for $s > 2$
Mode $1$
Variance $\frac{\zeta(s)\zeta(s-2) - \zeta(s-1)^{2}}{\zeta(s)^{2}}$ for $s > 3$
Entropy $\sum_{k=1}^{\infty} \frac{1/k^{s}}{\zeta(s)} \log(k^{s}\zeta(s))$
MGF does not exist
CF $\frac{\operatorname{Li}_{s}(e^{it})}{\zeta(s)}$
PGF $\frac{\operatorname{Li}_{s}(z)}{\zeta(s)}$

In probability theory and statistics, the zeta distribution is a discrete probability distribution. If X is a zeta-distributed random variable with parameter s, then the probability that X takes the positive integer value k is given by the probability mass function

$$f_{s}(k) = \frac{k^{-s}}{\zeta(s)}$$

where ζ(s) is the Riemann zeta function (which is undefined for s = 1).

The multiplicities of distinct prime factors of X are independent random variables.

Because the Riemann zeta function is the sum of the terms $k^{-s}$ over all positive integers k, it appears as the normalization constant of the Zipf distribution. The terms "Zipf distribution" and "zeta distribution" are often used interchangeably. However, although the zeta distribution is a probability distribution in its own right, it is not associated with Zipf's law with the same exponent.

Definition

The zeta distribution is defined for positive integers $k \geq 1$, and its probability mass function is given by

$$P(x = k) = \frac{1}{\zeta(s)} k^{-s},$$

where $s > 1$ is the parameter, and $\zeta(s)$ is the Riemann zeta function.

The cumulative distribution function is given by

$$P(x \leq k) = \frac{H_{k,s}}{\zeta(s)},$$

where $H_{k,s}$ is the generalized harmonic number

$$H_{k,s} = \sum_{i=1}^{k} \frac{1}{i^{s}}.$$
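
The PMF and CDF above can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the helper names (`zeta`, `generalized_harmonic`, `zeta_pmf`, `zeta_cdf`) and the `n_terms` cutoff are illustrative choices, not part of any particular library. The zeta function is approximated by a direct partial sum plus the leading Euler-Maclaurin tail terms, which is an assumption of this sketch rather than the only way to compute it.

```python
import math


def zeta(s, n_terms=1000):
    """Approximate the Riemann zeta function for real s > 1.

    Sums the first n_terms terms directly, then adds an Euler-Maclaurin
    estimate of the remaining tail sum_{k > n_terms} k^(-s).
    """
    if s <= 1:
        raise ValueError("the zeta distribution requires s > 1")
    n = n_terms
    head = sum(k ** -s for k in range(1, n + 1))
    tail = n ** (1 - s) / (s - 1) - 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def generalized_harmonic(k, s):
    """Generalized harmonic number H_{k,s} = sum_{i=1}^{k} i^(-s)."""
    return sum(i ** -s for i in range(1, k + 1))


def zeta_pmf(k, s):
    """P(x = k) = k^(-s) / zeta(s) for a positive integer k."""
    return k ** -s / zeta(s)


def zeta_cdf(k, s):
    """P(x <= k) = H_{k,s} / zeta(s) for a positive integer k."""
    return generalized_harmonic(k, s) / zeta(s)


if __name__ == "__main__":
    s = 2.0
    for k in (1, 2, 3, 10):
        print(k, zeta_pmf(k, s), zeta_cdf(k, s))
```

For s = 2 the normalizing constant is ζ(2) = π²/6, so the probability of the value 1 is 6/π², which gives a quick sanity check on the approximation.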

Moments

The nth raw moment is defined as the expected value of $X^{n}$:

$$m_{n} = E(X^{n}) = \frac{1}{\zeta(s)} \sum_{k=1}^{\infty} \frac{1}{k^{s-n}}$$

The series on the right is just a series representation of the Riemann zeta function, but it only converges for values of $s - n$ that are greater than unity. Thus:

$$m_{n} = \begin{cases} \zeta(s-n)/\zeta(s) & \text{for } n < s - 1 \\ \infty & \text{for } n \geq s - 1 \end{cases}$$

The ratio of the zeta functions is well defined even for n > s − 1, because the series representation of the zeta function can be analytically continued. This does not change the fact that the moments are specified by the series itself, and are therefore not finite for n ≥ s − 1.
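
The case distinction for the raw moments can be sketched in Python as follows. The function names and the `n_terms` cutoff are hypothetical, and the zeta function is again approximated by a partial sum with an Euler-Maclaurin tail correction (an assumption of this sketch). The code returns infinity whenever n ≥ s − 1 instead of evaluating the analytically continued ratio.

```python
import math


def zeta(s, n_terms=1000):
    """Approximate zeta(s) for real s > 1 (partial sum plus tail estimate)."""
    if s <= 1:
        raise ValueError("series representation requires s > 1")
    n = n_terms
    head = sum(k ** -s for k in range(1, n + 1))
    tail = n ** (1 - s) / (s - 1) - 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def raw_moment(n, s):
    """n-th raw moment E[X^n] of the zeta distribution with parameter s.

    Finite only when n < s - 1; otherwise the defining series diverges.
    """
    if n < s - 1:
        return zeta(s - n) / zeta(s)
    return math.inf


def variance(s):
    """Variance m_2 - m_1^2, computed only where it is finite (s > 3)."""
    if s <= 3:
        raise ValueError("the variance is finite only for s > 3")
    return raw_moment(2, s) - raw_moment(1, s) ** 2


if __name__ == "__main__":
    s = 4.5
    for n in (1, 2, 3, 4):
        print(n, raw_moment(n, s))  # n = 4 is not below s - 1, so it prints inf
    print("variance:", variance(s))
```

With s = 4.5 the first three moments are finite and the fourth is infinite, matching the condition n < s − 1.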

Moment generating function

The moment generating function is defined as

$$M(t; s) = E(e^{tX}) = \frac{1}{\zeta(s)} \sum_{k=1}^{\infty} \frac{e^{tk}}{k^{s}}.$$

The series is just the definition of the polylogarithm, valid for $e^{t} < 1$, so that

$$M(t; s) = \frac{\operatorname{Li}_{s}(e^{t})}{\zeta(s)} \text{ for } t < 0.$$

Since this does not converge on an open interval containing $t = 0$, the moment generating function does not exist.
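
The one-sided convergence can be illustrated numerically. In the sketch below (hypothetical helper names, standard library only), `mgf` truncates the series for t < 0, where the terms decay exponentially, and `series_term` exposes a single term so that its growth for t > 0 can be inspected. The truncation length `n_terms` is an assumption and would need to grow as t approaches 0 from below.

```python
import math


def zeta(s, n_terms=1000):
    """Approximate zeta(s) for real s > 1 (partial sum plus tail estimate)."""
    if s <= 1:
        raise ValueError("series representation requires s > 1")
    n = n_terms
    head = sum(k ** -s for k in range(1, n + 1))
    tail = n ** (1 - s) / (s - 1) - 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def series_term(t, k, s):
    """Single term e^(t k) / k^s of the series defining the MGF."""
    return math.exp(t * k) / k ** s


def mgf(t, s, n_terms=2000):
    """Truncated-series approximation of M(t; s) for t <= 0."""
    if t > 0:
        raise ValueError("the series diverges for t > 0")
    if t == 0:
        return 1.0  # E[e^0] = 1 for any distribution
    partial = sum(series_term(t, k, s) for k in range(1, n_terms + 1))
    return partial / zeta(s)


if __name__ == "__main__":
    s = 2.0
    print("M(-1; 2) is approximately", mgf(-1.0, s))
    # For t > 0 the individual terms grow without bound as k increases.
    for k in (10, 100, 1000):
        print(k, series_term(0.1, k, s))
```

Because the terms for any t > 0 eventually grow exponentially, no partial sum can settle to a finite value, which is the numerical counterpart of the statement that the series fails to converge on any open interval around t = 0.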

The case s = 1

ζ(1) is infinite, being the harmonic series, so the case s = 1 is not meaningful. However, if A is any set of positive integers that has a density, i.e. if

$$\lim_{n \to \infty} \frac{N(A, n)}{n}$$

exists, where N(A, n) is the number of members of A less than or equal to n, then

$$\lim_{s \to 1^{+}} P(X \in A)$$

is equal to that density.
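
As a concrete illustration, take A to be the set of multiples of a fixed integer m, which has density 1/m. Summing the PMF over the multiples of m gives Σ_j (mj)^(−s) / ζ(s) = m^(−s), which tends to 1/m as s → 1⁺. The sketch below checks this closed form against a brute-force sum at s = 3 and then evaluates it for s approaching 1; the function names and the `n_terms` cutoff are hypothetical.

```python
def zeta(s, n_terms=1000):
    """Approximate zeta(s) for real s > 1 (partial sum plus tail estimate)."""
    if s <= 1:
        raise ValueError("series representation requires s > 1")
    n = n_terms
    head = sum(k ** -s for k in range(1, n + 1))
    tail = n ** (1 - s) / (s - 1) - 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def prob_multiple_brute(m, s, n_terms=200_000):
    """P(m divides X) by summing the PMF over the first n_terms multiples of m.

    The truncation is only accurate when s is well above 1.
    """
    numerator = sum((m * j) ** -s for j in range(1, n_terms + 1))
    return numerator / zeta(s)


def prob_multiple_closed(m, s):
    """Closed form P(m divides X) = m^(-s)."""
    return m ** -s


if __name__ == "__main__":
    print(prob_multiple_brute(2, 3.0), prob_multiple_closed(2, 3.0))
    # As s decreases toward 1, the probability approaches the density 1/2.
    for s in (3.0, 2.0, 1.5, 1.1, 1.01, 1.001):
        print(s, prob_multiple_closed(2, s))
```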

The latter limit can also exist in some cases in which A does not have a density. For example, if A is the set of all positive integers whose first digit is d, then A has no density, but nonetheless the second limit given above exists and is proportional to

$$\log(d+1) - \log(d) = \log\left(1 + \frac{1}{d}\right),$$

which is Benford's law.

Infinite divisibility

The zeta distribution can be constructed from a sequence of independent random variables with geometric distributions. Let $p$ be a prime number and $X(p^{-s})$ be a random variable with a geometric distribution of parameter $p^{-s}$, namely

$$\mathbb{P}\left(X(p^{-s}) = k\right) = p^{-ks}(1 - p^{-s})$$

If the random variables $(X(p^{-s}))_{p \in \mathcal{P}}$ are independent, then the random variable $Z_{s}$ defined by

$$Z_{s} = \prod_{p \in \mathcal{P}} p^{X(p^{-s})}$$

has the zeta distribution: $\mathbb{P}\left(Z_{s} = n\right) = \frac{1}{n^{s}\zeta(s)}$.
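
The construction can be checked numerically by multiplying the geometric probabilities of the prime exponents of n. The sketch below is deterministic and makes two assumptions that are not in the text above: the infinite product over primes is truncated at a hypothetical `prime_bound`, and the exponents range over k = 0, 1, 2, …, so that primes not dividing n contribute the factor (1 − p^(−s)). The truncated product slightly overestimates the exact value n^(−s)/ζ(s).

```python
def zeta(s, n_terms=1000):
    """Approximate zeta(s) for real s > 1 (partial sum plus tail estimate)."""
    if s <= 1:
        raise ValueError("series representation requires s > 1")
    n = n_terms
    head = sum(k ** -s for k in range(1, n + 1))
    tail = n ** (1 - s) / (s - 1) - 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def prob_from_prime_exponents(n, s, prime_bound=10_000):
    """P(Z_s = n) as a product of geometric probabilities over primes.

    Each prime p <= prime_bound contributes p^(-k s) (1 - p^(-s)), where k is
    the exponent of p in n (k = 0 if p does not divide n).
    """
    prob = 1.0
    remaining = n
    for p in primes_up_to(prime_bound):
        exponent = 0
        while remaining % p == 0:
            remaining //= p
            exponent += 1
        prob *= p ** (-exponent * s) * (1 - p ** -s)
    if remaining != 1:
        raise ValueError("n has a prime factor above prime_bound")
    return prob


if __name__ == "__main__":
    s = 2.0
    for n in (1, 2, 12, 30):
        print(n, prob_from_prime_exponents(n, s), n ** -s / zeta(s))
```

The agreement of the two columns reflects the Euler product for the zeta function, in which the product of (1 − p^(−s)) over all primes equals 1/ζ(s).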

Stated differently, the random variable $\log(Z_{s}) = \sum_{p \in \mathcal{P}} X(p^{-s}) \, \log(p)$ is infinitely divisible with Lévy measure given by the following sum of Dirac masses:

$$\Pi_{s}(dx) = \sum_{p \in \mathcal{P}} \sum_{k \geqslant 1} \frac{p^{-ks}}{k} \delta_{k \log(p)}(dx)$$
