Beta negative binomial distribution
| Beta negative binomial | |
|---|---|
| Parameters | {\displaystyle \alpha >0} shape (real); {\displaystyle \beta >0} shape (real); {\displaystyle r>0} number of successes until the experiment is stopped (integer, but can be extended to real) |
| Support | {\displaystyle k\in \{0,1,2,\ldots \}} |
| PMF | {\displaystyle {\frac {\mathrm {B} (r+k,\alpha +\beta )}{\mathrm {B} (r,\alpha )}}{\frac {\Gamma (k+\beta )}{k!\;\Gamma (\beta )}}} |
| Mean | {\displaystyle {\begin{cases}{\frac {r\beta }{\alpha -1}}&{\text{if}}\ \alpha >1\\\infty &{\text{otherwise}}\ \end{cases}}} |
| Variance | {\displaystyle {\begin{cases}{\frac {r\beta (r+\alpha -1)(\beta +\alpha -1)}{(\alpha -2){(\alpha -1)}^{2}}}&{\text{if}}\ \alpha >2\\\infty &{\text{otherwise}}\ \end{cases}}} |
| Skewness | {\displaystyle {\begin{cases}{\frac {(2r+\alpha -1)(2\beta +\alpha -1)}{(\alpha -3){\sqrt {\frac {r\beta (r+\alpha -1)(\beta +\alpha -1)}{\alpha -2}}}}}&{\text{if}}\ \alpha >3\\\infty &{\text{otherwise}}\ \end{cases}}} |
| MGF | does not exist |
| CF | {\displaystyle {}_{2}F_{1}(\beta ,r;\alpha +\beta +r;e^{it}){\frac {(\alpha )^{(r)}}{(\alpha +\beta )^{(r)}}}\!} where {\displaystyle (x)^{(r)}={\frac {\Gamma (x+r)}{\Gamma (x)}}} is the Pochhammer symbol and {\displaystyle {}_{2}F_{1}} is the hypergeometric function. |
| PGF | {\displaystyle {}_{2}F_{1}(\beta ,r;\alpha +\beta +r;z){\frac {(\alpha )^{(r)}}{(\alpha +\beta )^{(r)}}}} |
In probability theory, a beta negative binomial distribution is the probability distribution of a discrete random variable {\displaystyle X} equal to the number of failures needed to get {\displaystyle r} successes in a sequence of independent Bernoulli trials. The probability {\displaystyle p} of success on each trial stays constant within any given experiment but varies across different experiments following a beta distribution. Thus the distribution is a compound probability distribution.
This distribution has also been called the inverse Markov-Pólya distribution and the generalized Waring distribution,[1] and is often abbreviated as the BNB distribution. A shifted form of the distribution has been called the beta-Pascal distribution.[1]
If parameters of the beta distribution are {\displaystyle \alpha } and {\displaystyle \beta }, and if
- {\displaystyle X\mid p\sim \mathrm {NB} (r,p),}
where
- {\displaystyle p\sim {\textrm {B}}(\alpha ,\beta ),}
then the marginal distribution of {\displaystyle X} (in Bayesian terms, the predictive distribution of {\displaystyle X} under a beta prior or posterior on {\displaystyle p}) is a beta negative binomial distribution:
- {\displaystyle X\sim \mathrm {BNB} (r,\alpha ,\beta ).}
In the above, {\displaystyle \mathrm {NB} (r,p)} is the negative binomial distribution and {\displaystyle {\textrm {B}}(\alpha ,\beta )} is the beta distribution.
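The compound construction above can be illustrated by simulation. The following is a minimal sketch using only the Python standard library; the function name sample_bnb, the parameter values and the seed are illustrative assumptions, and an integer {\displaystyle r} is assumed so that the Bernoulli trials can be simulated directly.

```python
import random

def sample_bnb(r, alpha, beta, rng):
    """Draw one BNB(r, alpha, beta) variate by compounding (integer r assumed)."""
    p = rng.betavariate(alpha, beta)      # success probability for this experiment
    failures = successes = 0
    while successes < r:                  # Bernoulli(p) trials until r successes
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures                       # X = number of failures observed

rng = random.Random(0)                    # fixed seed, for reproducibility only
r, alpha, beta = 3, 5.0, 2.0              # illustrative values; alpha > 2 gives finite variance
draws = [sample_bnb(r, alpha, beta, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
theoretical_mean = r * beta / (alpha - 1) # mean of the BNB distribution, valid for alpha > 1
print(sample_mean, theoretical_mean)
```

With these values the sample mean should land close to the theoretical mean of 1.5, up to Monte Carlo error.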
Definition and derivation
Denoting by {\displaystyle f_{X|p}(k|r,q)} and {\displaystyle f_{p}(q|\alpha ,\beta )} the probability mass function of the negative binomial distribution and the density of the beta distribution respectively, we obtain the PMF {\displaystyle f(k|\alpha ,\beta ,r)} of the BNB distribution by marginalization:
- {\displaystyle {\begin{aligned}f(k|\alpha ,\beta ,r)\;=&\;\int _{0}^{1}f_{X|p}(k|r,q)\cdot f_{p}(q|\alpha ,\beta )\mathrm {d} q\\=&\;\int _{0}^{1}{\binom {k+r-1}{k}}(1-q)^{k}q^{r}\cdot {\frac {q^{\alpha -1}(1-q)^{\beta -1}}{\mathrm {B} (\alpha ,\beta )}}\mathrm {d} q\\=&\;{\frac {1}{\mathrm {B} (\alpha ,\beta )}}{\binom {k+r-1}{k}}\int _{0}^{1}q^{\alpha +r-1}(1-q)^{\beta +k-1}\mathrm {d} q\end{aligned}}}
Noting that the integral evaluates to:
- {\displaystyle \int _{0}^{1}q^{\alpha +r-1}(1-q)^{\beta +k-1}\mathrm {d} q={\frac {\Gamma (\alpha +r)\Gamma (\beta +k)}{\Gamma (\alpha +\beta +k+r)}}}
we can arrive at the following formulas by relatively simple manipulations.
If {\displaystyle r} is an integer, then the PMF can be written in terms of the beta function:
- {\displaystyle f(k|\alpha ,\beta ,r)={\binom {r+k-1}{k}}{\frac {\mathrm {B} (\alpha +r,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}}.
More generally, the PMF can be written
- {\displaystyle f(k|\alpha ,\beta ,r)={\frac {\Gamma (r+k)}{k!\;\Gamma (r)}}{\frac {\mathrm {B} (\alpha +r,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}}
or
- {\displaystyle f(k|\alpha ,\beta ,r)={\frac {\mathrm {B} (r+k,\alpha +\beta )}{\mathrm {B} (r,\alpha )}}{\frac {\Gamma (k+\beta )}{k!\;\Gamma (\beta )}}}.
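As a numerical sanity check, the two general forms of the PMF can be evaluated on the log scale and compared. This is a sketch rather than a reference implementation; the helper names log_beta, bnb_pmf and bnb_pmf_alt and the parameter values are assumptions made for illustration.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bnb_pmf(k, r, alpha, beta):
    """Gamma(r+k) / (k! Gamma(r)) * B(alpha+r, beta+k) / B(alpha, beta)."""
    log_coef = lgamma(r + k) - lgamma(k + 1) - lgamma(r)
    return exp(log_coef + log_beta(alpha + r, beta + k) - log_beta(alpha, beta))

def bnb_pmf_alt(k, r, alpha, beta):
    """B(r+k, alpha+beta) / B(r, alpha) * Gamma(k+beta) / (k! Gamma(beta))."""
    log_coef = lgamma(k + beta) - lgamma(k + 1) - lgamma(beta)
    return exp(log_beta(r + k, alpha + beta) - log_beta(r, alpha) + log_coef)

r, alpha, beta = 2.5, 5.0, 2.0            # illustrative values; r need not be an integer

# Largest disagreement between the two forms over k = 0..199
max_gap = max(abs(bnb_pmf(k, r, alpha, beta) - bnb_pmf_alt(k, r, alpha, beta))
              for k in range(200))

# Truncated sum of the PMF; the neglected tail is negligible for alpha = 5
total = sum(bnb_pmf(k, r, alpha, beta) for k in range(20000))
print(max_gap, total)
```

Working with lgamma rather than the gamma function itself avoids overflow for large arguments.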
PMF expressed with Gamma
Using the properties of the Beta function, the PMF with integer {\displaystyle r} can be rewritten as:
- {\displaystyle f(k|\alpha ,\beta ,r)={\binom {r+k-1}{k}}{\frac {\Gamma (\alpha +r)\Gamma (\beta +k)\Gamma (\alpha +\beta )}{\Gamma (\alpha +r+\beta +k)\Gamma (\alpha )\Gamma (\beta )}}}.
More generally, the PMF can be written as
- {\displaystyle f(k|\alpha ,\beta ,r)={\frac {\Gamma (r+k)}{k!\;\Gamma (r)}}{\frac {\Gamma (\alpha +r)\Gamma (\beta +k)\Gamma (\alpha +\beta )}{\Gamma (\alpha +r+\beta +k)\Gamma (\alpha )\Gamma (\beta )}}}.
PMF expressed with the rising Pochhammer symbol
The PMF is often also presented in terms of the Pochhammer symbol for integer {\displaystyle r}:
- {\displaystyle f(k|\alpha ,\beta ,r)={\frac {r^{(k)}\alpha ^{(r)}\beta ^{(k)}}{k!(\alpha +\beta )^{(r+k)}}}}
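One consequence of the Pochhammer form is a simple ratio between consecutive probabilities, {\displaystyle f(k+1)/f(k)=(r+k)(\beta +k)/((k+1)(\alpha +\beta +r+k))}, with {\displaystyle f(0)=\alpha ^{(r)}/(\alpha +\beta )^{(r)}}. The sketch below uses this recurrence to tabulate the PMF; the function name bnb_pmf_table and the parameter values are illustrative assumptions.

```python
from math import lgamma, exp

def bnb_pmf_table(kmax, r, alpha, beta):
    """PMF values f(0), ..., f(kmax) via the ratio recurrence implied by the Pochhammer form."""
    # f(0) = alpha^{(r)} / (alpha+beta)^{(r)}, where x^{(r)} = Gamma(x+r) / Gamma(x)
    f = exp(lgamma(alpha + r) - lgamma(alpha)
            - lgamma(alpha + beta + r) + lgamma(alpha + beta))
    table = [f]
    for k in range(kmax):
        # f(k+1) / f(k) = (r+k)(beta+k) / ((k+1)(alpha+beta+r+k))
        f *= (r + k) * (beta + k) / ((k + 1) * (alpha + beta + r + k))
        table.append(f)
    return table

table = bnb_pmf_table(10, r=3, alpha=5.0, beta=2.0)
print(table[:3])
```

For these values, f(0) = (5·6·7)/(7·8·9) = 5/12 and f(1) = f(0)·(3·2)/(1·10) = 1/4.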
Properties
Factorial moments
The k-th factorial moment of a beta negative binomial random variable X is defined for {\displaystyle k<\alpha } and in this case is equal to
- {\displaystyle \operatorname {E} {\bigl [}(X)_{k}{\bigr ]}={\frac {\Gamma (r+k)}{\Gamma (r)}}{\frac {\Gamma (\beta +k)}{\Gamma (\beta )}}{\frac {\Gamma (\alpha -k)}{\Gamma (\alpha )}}.}
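The mean and variance can be recovered from the first two factorial moments, since {\displaystyle \operatorname {Var} (X)=\operatorname {E} [(X)_{2}]+\operatorname {E} [X]-\operatorname {E} [X]^{2}}. The following sketch checks this numerically against the closed-form variance; the function name factorial_moment and the parameter values are assumptions chosen for illustration.

```python
from math import lgamma, exp

def factorial_moment(k, r, alpha, beta):
    """k-th factorial moment E[(X)_k] of BNB(r, alpha, beta); requires k < alpha."""
    if k >= alpha:
        raise ValueError("the k-th factorial moment is only defined for k < alpha")
    return exp(lgamma(r + k) - lgamma(r)
               + lgamma(beta + k) - lgamma(beta)
               + lgamma(alpha - k) - lgamma(alpha))

r, alpha, beta = 3, 5.0, 2.0              # illustrative values with alpha > 2

m1 = factorial_moment(1, r, alpha, beta)  # E[X]
m2 = factorial_moment(2, r, alpha, beta)  # E[X(X-1)]

mean = m1
variance = m2 + m1 - m1 ** 2              # Var(X) = E[X(X-1)] + E[X] - E[X]^2

# Closed-form variance for alpha > 2, for comparison
closed_form_variance = (r * beta * (r + alpha - 1) * (beta + alpha - 1)
                        / ((alpha - 2) * (alpha - 1) ** 2))
print(mean, variance, closed_form_variance)
```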
Non-identifiable
The beta negative binomial is non-identifiable: swapping {\displaystyle r} and {\displaystyle \beta } in the above density or characteristic function leaves it unchanged. Estimation therefore requires that a constraint be placed on {\displaystyle r}, {\displaystyle \beta } or both.
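The symmetry in {\displaystyle r} and {\displaystyle \beta } is easy to confirm numerically. The sketch below evaluates the Gamma-function form of the PMF with the two parameters exchanged; the function name bnb_pmf and the parameter values are illustrative assumptions.

```python
from math import lgamma, exp

def bnb_pmf(k, r, alpha, beta):
    """BNB PMF in its Gamma-function form, evaluated on the log scale."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

r, alpha, beta = 2.5, 4.0, 7.0            # illustrative values with r != beta

# Compare BNB(r, alpha, beta) with BNB(beta, alpha, r) over k = 0..99
max_diff = max(abs(bnb_pmf(k, r, alpha, beta) - bnb_pmf(k, beta, alpha, r))
               for k in range(100))
print(max_diff)                           # zero up to floating-point rounding
```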
Relation to other distributions
The beta negative binomial distribution contains the beta geometric distribution as a special case when either {\displaystyle r=1} or {\displaystyle \beta =1}. It can therefore approximate the geometric distribution arbitrarily well. It also approximates the negative binomial distribution arbitrarily well for large {\displaystyle \alpha } and {\displaystyle \beta }, with {\displaystyle \alpha /(\alpha +\beta )} held fixed at the success probability. It can therefore approximate the Poisson distribution arbitrarily well for large {\displaystyle \alpha }, {\displaystyle \beta } and {\displaystyle r}.
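The negative binomial limit can be illustrated numerically. The sketch below assumes that {\displaystyle \alpha } and {\displaystyle \beta } grow together so that the beta mean {\displaystyle \alpha /(\alpha +\beta )} stays at a fixed success probability {\displaystyle p}; the function names, the scale values and the truncation point kmax are illustrative assumptions.

```python
from math import lgamma, exp, log

def bnb_pmf(k, r, alpha, beta):
    """BNB PMF in its Gamma-function form, evaluated on the log scale."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

def nb_pmf(k, r, p):
    """Negative binomial PMF: probability of k failures before the r-th success."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + r * log(p) + k * log(1 - p))

def total_variation(r, p, scale, kmax=2000):
    """Total variation distance between BNB and NB, truncated at kmax terms."""
    alpha, beta = p * scale, (1 - p) * scale   # beta mean stays at p as scale grows
    return 0.5 * sum(abs(bnb_pmf(k, r, alpha, beta) - nb_pmf(k, r, p))
                     for k in range(kmax))

# The distance should shrink as the beta distribution concentrates around p
distances = [total_variation(r=3, p=0.4, scale=s) for s in (10, 1000, 100000)]
print(distances)
```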
Heavy tailed
By Stirling's approximation to the beta function, it can be easily shown that for large {\displaystyle k}
- {\displaystyle f(k|\alpha ,\beta ,r)\sim {\frac {\Gamma (\alpha +r)}{\Gamma (r)\mathrm {B} (\alpha ,\beta )}}{\frac {k^{r-1}}{(\beta +k)^{r+\alpha }}}}
which implies that the beta negative binomial distribution is heavy tailed and that moments of order greater than or equal to {\displaystyle \alpha } do not exist.
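The quality of the tail approximation can be checked by taking the ratio of the exact PMF to the asymptotic expression, which should approach 1 as {\displaystyle k} grows. This is a minimal sketch; the function names and the parameter values are assumptions made for illustration.

```python
from math import lgamma, exp, log

def log_bnb_pmf(k, r, alpha, beta):
    """Logarithm of the exact BNB PMF (Gamma-function form)."""
    return (lgamma(r + k) - lgamma(k + 1) - lgamma(r)
            + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
            - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

def log_tail_approx(k, r, alpha, beta):
    """Logarithm of Gamma(alpha+r) / (Gamma(r) B(alpha, beta)) * k^(r-1) / (beta+k)^(r+alpha)."""
    log_b = lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)
    log_const = lgamma(alpha + r) - lgamma(r) - log_b
    return log_const + (r - 1) * log(k) - (r + alpha) * log(beta + k)

r, alpha, beta = 3, 5.0, 2.0              # illustrative values

# Ratio exact / approximation at increasingly large k
ratios = [exp(log_bnb_pmf(k, r, alpha, beta) - log_tail_approx(k, r, alpha, beta))
          for k in (10, 1000, 100000)]
print(ratios)
```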
Beta geometric distribution
The beta geometric distribution is an important special case of the beta negative binomial distribution occurring for {\displaystyle r=1}. In this case the PMF simplifies to
- {\displaystyle f(k|\alpha ,\beta )={\frac {\mathrm {B} (\alpha +1,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}}.
This distribution is used in some Buy Till you Die (BTYD) models.
Further, when {\displaystyle \beta =1} the beta geometric reduces to the Yule–Simon distribution. However, it is more common to define the Yule–Simon distribution in terms of a shifted version of the beta geometric. In particular, if {\displaystyle X\sim BG(\alpha ,1)} then {\displaystyle X+1\sim YS(\alpha )}.
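The shift relation can be verified numerically. The sketch below assumes the usual parameterization of the Yule–Simon PMF on {\displaystyle j=1,2,\ldots }, namely {\displaystyle \rho \,\mathrm {B} (j,\rho +1)}, which is not stated above; the function names and the value of {\displaystyle \alpha } are likewise illustrative.

```python
from math import lgamma, exp, log

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_geometric_pmf(k, alpha, beta):
    """Beta geometric PMF on k = 0, 1, 2, ...: B(alpha+1, beta+k) / B(alpha, beta)."""
    return exp(log_beta(alpha + 1, beta + k) - log_beta(alpha, beta))

def yule_simon_pmf(j, rho):
    """Yule–Simon PMF on j = 1, 2, ... (assumed form): rho * B(j, rho + 1)."""
    return exp(log(rho) + log_beta(j, rho + 1))

alpha = 2.5                               # illustrative value

# P(X + 1 = j) for X ~ BG(alpha, 1) should match the Yule–Simon PMF at j
max_diff = max(abs(beta_geometric_pmf(j - 1, alpha, 1.0) - yule_simon_pmf(j, alpha))
               for j in range(1, 200))
print(max_diff)                           # zero up to floating-point rounding
```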
Beta negative binomial as a Pólya urn model
In the case when the three parameters {\displaystyle r,\alpha } and {\displaystyle \beta } are positive integers, the beta negative binomial can also be motivated by an urn model, or more specifically a basic Pólya urn model. Consider an urn initially containing {\displaystyle \alpha } red balls (the stopping color) and {\displaystyle \beta } blue balls. At each step of the model, a ball is drawn at random from the urn and replaced, along with one additional ball of the same color. The process is repeated until {\displaystyle r} red balls have been drawn. The number {\displaystyle X} of blue balls drawn is then distributed according to a {\displaystyle \mathrm {BNB} (r,\alpha ,\beta )}. Note that at the end of the experiment the urn always contains the fixed number {\displaystyle r+\alpha } of red balls and the random number {\displaystyle X+\beta } of blue balls.
By the non-identifiability property, {\displaystyle X} can be equivalently generated with the urn initially containing {\displaystyle \alpha } red balls (the stopping color) and {\displaystyle r} blue balls and stopping when {\displaystyle \beta } red balls are observed.
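The urn scheme lends itself to direct simulation. The following sketch, using only the Python standard library, draws from the urn and compares the empirical frequencies of the first few outcomes with the PMF; the function names, the parameter values, the sample size and the seed are illustrative assumptions.

```python
import random
from math import lgamma, exp

def polya_urn_draw(r, alpha, beta, rng):
    """Run the urn until r red balls are drawn; return the number of blue draws."""
    red, blue = alpha, beta               # initial urn contents
    red_drawn = blue_drawn = 0
    while red_drawn < r:
        if rng.random() < red / (red + blue):
            red += 1                      # red ball returned with one extra red ball
            red_drawn += 1
        else:
            blue += 1                     # blue ball returned with one extra blue ball
            blue_drawn += 1
    return blue_drawn

def bnb_pmf(k, r, alpha, beta):
    """BNB PMF in its Gamma-function form, evaluated on the log scale."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

rng = random.Random(1)                    # fixed seed, for reproducibility only
r, alpha, beta = 3, 5, 2                  # positive integers, as the urn model requires
n = 20000
draws = [polya_urn_draw(r, alpha, beta, rng) for _ in range(n)]

empirical = [draws.count(k) / n for k in range(5)]
exact = [bnb_pmf(k, r, alpha, beta) for k in range(5)]
print(empirical)
print(exact)
```

The empirical frequencies should agree with the exact probabilities up to Monte Carlo error.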
References
- Johnson, N.L.; Kotz, S.; Kemp, A.W. (1993) Univariate Discrete Distributions, 2nd edition, Wiley. ISBN 0-471-54897-9 (Section 6.2.3)
- Kemp, C.D.; Kemp, A.W. (1956) "Generalized hypergeometric distributions", Journal of the Royal Statistical Society, Series B, 18, 202–211
- Wang, Zhaoliang (2011) "One mixed negative binomial distribution with application", Journal of Statistical Planning and Inference, 141 (3), 1153–1160. doi:10.1016/j.jspi.2010.09.020
External links
- Interactive graphic: Univariate Distribution Relationships