Beta negative binomial distribution

Beta Negative Binomial
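Parameters	$\alpha >0$ shape (real); $\beta >0$ shape (real); $r>0$ number of successes until the experiment is stopped (integer but can be extended to real)
Support	$k\in \{0,1,2,\ldots \}$
PMF	${\frac {\mathrm {B} (r+k,\alpha +\beta )}{\mathrm {B} (r,\alpha )}}{\frac {\Gamma (k+\beta )}{k!\;\Gamma (\beta )}}$
Mean	${\frac {r\beta }{\alpha -1}}$ for $\alpha >1$
Variance	${\frac {r\beta (r+\alpha -1)(\beta +\alpha -1)}{(\alpha -2)(\alpha -1)^{2}}}$ for $\alpha >2$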
Skewness
MGF	does not exist
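CF	${\frac {\Gamma (\alpha +r)\Gamma (\alpha +\beta )}{\Gamma (\alpha +\beta +r)\Gamma (\alpha )}}\,{}_{2}F_{1}(r,\beta ;\alpha +\beta +r;e^{it})$, where $\Gamma$ is the gamma function and ${}_{2}F_{1}$ is the hypergeometric function.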

Last updated June 11, 2023

In probability theory, a beta negative binomial distribution is the probability distribution of a discrete random variable $X$ equal to the number of failures needed to get $r$ successes in a sequence of independent Bernoulli trials. The probability $p$ of success on each trial stays constant within any given experiment but varies across different experiments following a beta distribution. Thus the distribution is a compound probability distribution.
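The two-stage description above can be sketched as a sampler. This is a minimal illustration, not a reference implementation: the function name `sample_bnb` and the parameter values are assumptions made for the example, $r$ is taken to be a positive integer, and only the Python standard library is used.

```python
import random

def sample_bnb(r, alpha, beta, rng):
    """Draw one value: p ~ Beta(alpha, beta), then count failures before the r-th success."""
    p = rng.betavariate(alpha, beta)  # success probability for this experiment
    failures = successes = 0
    while successes < r:
        if rng.random() < p:          # one Bernoulli trial with the fixed p
            successes += 1
        else:
            failures += 1
    return failures

# Illustrative (hypothetical) parameter choice: r = 3, alpha = 5, beta = 2.
rng = random.Random(0)
draws = [sample_bnb(3, 5.0, 2.0, rng) for _ in range(20000)]
print(sum(draws) / len(draws))        # sample mean of the simulated failure counts
```

For these parameters the sample mean should land close to $r\beta /(\alpha -1)=1.5$, up to simulation noise.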

Definition and derivation

Denoting by $f_{X|p}(k|r,q)$ and $f_{p}(q|\alpha ,\beta )$ the densities of the negative binomial and beta distributions respectively, we obtain the PMF $f(k|\alpha ,\beta ,r)$ of the BNB distribution by marginalization:

f(k|\alpha ,\beta ,r)=\int _{0}^{1}f_{X|p}(k|r,q)\cdot f_{p}(q|\alpha ,\beta )\mathrm {d} q=\int _{0}^{1}{\binom {k+r-1}{k}}(1-q)^{k}q^{r}\cdot {\frac {q^{\alpha -1}(1-q)^{\beta -1}}{\mathrm {B} (\alpha ,\beta )}}\mathrm {d} q={\frac {1}{\mathrm {B} (\alpha ,\beta )}}{\binom {k+r-1}{k}}\int _{0}^{1}q^{\alpha +r-1}(1-q)^{\beta +k-1}\mathrm {d} q

Noting that the integral evaluates to:

\int _{0}^{1}q^{\alpha +r-1}(1-q)^{\beta +k-1}\mathrm {d} q={\frac {\Gamma (\alpha +r)\Gamma (\beta +k)}{\Gamma (\alpha +\beta +k+r)}}

we can arrive at the following formulas by relatively simple manipulations.

If $r$ is an integer, then the PMF can be written in terms of the beta function:

f(k|\alpha ,\beta ,r)={\binom {r+k-1}{k}}{\frac {\mathrm {B} (\alpha +r,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}

More generally, the PMF can be written

f(k|\alpha ,\beta ,r)={\frac {\Gamma (r+k)}{k!\;\Gamma (r)}}{\frac {\mathrm {B} (\alpha +r,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}

or

f(k|\alpha ,\beta ,r)={\frac {\mathrm {B} (r+k,\alpha +\beta )}{\mathrm {B} (r,\alpha )}}{\frac {\Gamma (k+\beta )}{k!\;\Gamma (\beta )}}
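The formulas of this section can be checked numerically. The sketch below evaluates both general forms of the PMF through log-gamma values and compares them with a midpoint-rule approximation of the marginalization integral. The function names, the quadrature scheme and the parameter values are assumptions made for illustration only.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bnb_pmf(k, alpha, beta, r):
    """Gamma(r+k) / (k! Gamma(r)) * B(alpha+r, beta+k) / B(alpha, beta)."""
    log_coeff = lgamma(r + k) - lgamma(k + 1) - lgamma(r)
    return exp(log_coeff + log_beta(alpha + r, beta + k) - log_beta(alpha, beta))

def bnb_pmf_alt(k, alpha, beta, r):
    """B(r+k, alpha+beta) / B(r, alpha) * Gamma(k+beta) / (k! Gamma(beta))."""
    log_coeff = lgamma(k + beta) - lgamma(k + 1) - lgamma(beta)
    return exp(log_coeff + log_beta(r + k, alpha + beta) - log_beta(r, alpha))

def bnb_pmf_quadrature(k, alpha, beta, r, n=20000):
    """Midpoint-rule estimate of the integral of NB(k | r, q) * Beta(q | alpha, beta) over (0, 1)."""
    nb_coeff = exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r))
    beta_norm = exp(log_beta(alpha, beta))
    total = 0.0
    for i in range(n):
        q = (i + 0.5) / n
        nb = nb_coeff * (1 - q) ** k * q ** r
        beta_pdf = q ** (alpha - 1) * (1 - q) ** (beta - 1) / beta_norm
        total += nb * beta_pdf
    return total / n

# Illustrative parameters with a non-integer r.
alpha, beta, r = 3.0, 2.0, 2.5
for k in range(4):
    print(k, bnb_pmf(k, alpha, beta, r), bnb_pmf_alt(k, alpha, beta, r),
          bnb_pmf_quadrature(k, alpha, beta, r))
```

Working with logarithms of gamma values avoids overflow for large $k$ or large parameters; the three columns should agree to within the quadrature error.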

PMF expressed with Gamma

Using the properties of the beta function, the PMF with integer $r$ can be rewritten as:

f(k|\alpha ,\beta ,r)={\binom {r+k-1}{k}}{\frac {\Gamma (\alpha +r)\Gamma (\beta +k)\Gamma (\alpha +\beta )}{\Gamma (\alpha +r+\beta +k)\Gamma (\alpha )\Gamma (\beta )}}

More generally, the PMF can be written as

f(k|\alpha ,\beta ,r)={\frac {\Gamma (r+k)}{k!\;\Gamma (r)}}{\frac {\Gamma (\alpha +r)\Gamma (\beta +k)\Gamma (\alpha +\beta )}{\Gamma (\alpha +r+\beta +k)\Gamma (\alpha )\Gamma (\beta )}}

PMF expressed with the rising Pochhammer symbol

For integer $r$, the PMF is often also presented in terms of the rising Pochhammer symbol $x^{(n)}=x(x+1)\cdots (x+n-1)$:

f(k|\alpha ,\beta ,r)={\frac {r^{(k)}\alpha ^{(r)}\beta ^{(k)}}{k!(\alpha +\beta )^{(r+k)}}}
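As a quick consistency check, the rising-factorial form can be compared with the gamma-function form. The sketch below assumes a positive integer $r$; the helper names `rising`, `bnb_pmf_pochhammer` and `bnb_pmf_gamma` are hypothetical.

```python
from math import exp, factorial, lgamma

def rising(x, n):
    """Rising factorial x^(n) = x (x+1) ... (x+n-1) for a non-negative integer n."""
    result = 1.0
    for i in range(n):
        result *= x + i
    return result

def bnb_pmf_pochhammer(k, alpha, beta, r):
    """PMF in rising-factorial form; r must be a positive integer here."""
    numerator = rising(r, k) * rising(alpha, r) * rising(beta, k)
    return numerator / (factorial(k) * rising(alpha + beta, r + k))

def bnb_pmf_gamma(k, alpha, beta, r):
    """PMF in gamma-function form, evaluated through log-gamma values."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

# Illustrative parameters: non-integer alpha and beta, integer r.
for k in range(5):
    print(k, bnb_pmf_pochhammer(k, 2.5, 1.5, 3), bnb_pmf_gamma(k, 2.5, 1.5, 3))
```

The two columns should coincide up to floating-point rounding, since $x^{(n)}=\Gamma (x+n)/\Gamma (x)$.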

Properties

Factorial moments

The $k$-th factorial moment of a beta negative binomial random variable $X$ is defined for $k<\alpha$ and in this case is equal to

\operatorname {E} {\bigl [}(X)_{k}{\bigr ]}={\frac {\Gamma (r+k)}{\Gamma (r)}}{\frac {\Gamma (\beta +k)}{\Gamma (\beta )}}{\frac {\Gamma (\alpha -k)}{\Gamma (\alpha )}}.
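As a worked instance of this formula, setting $k=1$ and $k=2$ gives the first two factorial moments, from which the mean and variance follow. This is a sketch assuming $\alpha >1$ for the mean and $\alpha >2$ for the variance, using the identity $\operatorname {Var} (X)=\operatorname {E} [X(X-1)]+\operatorname {E} [X]-\operatorname {E} [X]^{2}$:

```latex
\begin{aligned}
\operatorname{E}[X] &= \frac{\Gamma(r+1)}{\Gamma(r)}\,\frac{\Gamma(\beta+1)}{\Gamma(\beta)}\,\frac{\Gamma(\alpha-1)}{\Gamma(\alpha)}
  = \frac{r\beta}{\alpha-1}, \\
\operatorname{E}[X(X-1)] &= \frac{r(r+1)\,\beta(\beta+1)}{(\alpha-1)(\alpha-2)}, \\
\operatorname{Var}(X) &= \frac{r(r+1)\,\beta(\beta+1)}{(\alpha-1)(\alpha-2)} + \frac{r\beta}{\alpha-1} - \frac{r^{2}\beta^{2}}{(\alpha-1)^{2}}
  = \frac{r\beta\,(r+\alpha-1)(\beta+\alpha-1)}{(\alpha-2)(\alpha-1)^{2}}.
\end{aligned}
```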

Non-identifiable

The beta negative binomial is non-identifiable: swapping $r$ and $\beta$ in the above density or characteristic function leaves it unchanged. Estimation therefore requires that a constraint be placed on $r$, $\beta$ or both.
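The symmetry in $r$ and $\beta$ is easy to confirm numerically. A minimal sketch, with a hypothetical helper `bnb_pmf` implementing the gamma-function form of the PMF and arbitrarily chosen parameter values:

```python
from math import exp, lgamma

def bnb_pmf(k, alpha, beta, r):
    """PMF in gamma-function form, evaluated through log-gamma values."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

# Same alpha, with the roles of r and beta exchanged between the two columns.
alpha = 2.5
for k in range(5):
    print(k, bnb_pmf(k, alpha, beta=1.5, r=4.0), bnb_pmf(k, alpha, beta=4.0, r=1.5))
```

Both columns should agree for every $k$, so data generated from the distribution cannot distinguish the pair $(r,\beta )=(4,1.5)$ from $(1.5,4)$.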

Relation to other distributions

The beta negative binomial distribution contains the beta geometric distribution as a special case when either $r=1$ or $\beta =1$. It can therefore approximate the geometric distribution arbitrarily well. It also approximates the negative binomial distribution arbitrarily well when $\alpha$ and $\beta$ are both large, since the beta distribution of $p$ then concentrates around $\alpha /(\alpha +\beta )$. It can therefore approximate the Poisson distribution arbitrarily well for large $\alpha$, $\beta$ and $r$.
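The negative binomial limit can be illustrated numerically. The sketch below is an assumption-laden example rather than a proof: it fixes a target success probability $p$, sets $\alpha =sp$ and $\beta =s(1-p)$ for a growing scale $s$ (so that the beta distribution concentrates at $p$), and reports the largest pointwise gap between the two PMFs. The names `max_gap` and `scale` are hypothetical.

```python
from math import exp, lgamma, log

def bnb_pmf(k, alpha, beta, r):
    """Beta negative binomial PMF in gamma-function form."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

def nb_pmf(k, r, p):
    """Negative binomial PMF: k failures before the r-th success, success probability p."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r) + r * log(p) + k * log(1 - p))

def max_gap(scale, r=3.0, p=0.4, kmax=50):
    """Largest |BNB - NB| over k = 0..kmax when alpha = scale*p and beta = scale*(1-p)."""
    alpha, beta = scale * p, scale * (1 - p)
    return max(abs(bnb_pmf(k, alpha, beta, r) - nb_pmf(k, r, p)) for k in range(kmax + 1))

for scale in (10, 1000, 100000):
    print(scale, max_gap(scale))
```

The printed gap should shrink as the scale grows.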

Heavy tailed

By Stirling's approximation to the beta function, it can be easily shown that for large $k$

f(k|\alpha ,\beta ,r)\sim {\frac {\Gamma (\alpha +r)}{\Gamma (r)\mathrm {B} (\alpha ,\beta )}}{\frac {k^{r-1}}{(\beta +k)^{r+\alpha }}}

which implies that the beta negative binomial distribution is heavy tailed, with $f(k|\alpha ,\beta ,r)$ decaying like $k^{-(\alpha +1)}$, and that moments of order greater than or equal to $\alpha$ do not exist.
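The asymptotic form can be compared with the exact PMF at large $k$. The following sketch works in log space to avoid underflow; the function names and the parameter values are assumptions chosen for illustration.

```python
from math import exp, lgamma, log

def log_bnb_pmf(k, alpha, beta, r):
    """Logarithm of the exact PMF in gamma-function form."""
    return (lgamma(r + k) - lgamma(k + 1) - lgamma(r)
            + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
            - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

def log_tail_approx(k, alpha, beta, r):
    """Logarithm of Gamma(alpha+r) / (Gamma(r) B(alpha, beta)) * k^(r-1) / (beta+k)^(r+alpha)."""
    log_b = lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)
    log_const = lgamma(alpha + r) - lgamma(r) - log_b
    return log_const + (r - 1) * log(k) - (r + alpha) * log(beta + k)

def tail_ratio(k, alpha, beta, r):
    """Exact PMF divided by the large-k approximation."""
    return exp(log_bnb_pmf(k, alpha, beta, r) - log_tail_approx(k, alpha, beta, r))

def log_log_slope(k, alpha, beta, r):
    """Slope of log f against log k between k and 2k."""
    return (log_bnb_pmf(2 * k, alpha, beta, r) - log_bnb_pmf(k, alpha, beta, r)) / log(2)

alpha, beta, r = 3.0, 2.0, 2.0
for k in (10, 1000, 100000):
    print(k, tail_ratio(k, alpha, beta, r), log_log_slope(k, alpha, beta, r))
```

The ratio should approach 1 and the slope should approach $-(\alpha +1)$ as $k$ grows, which is the power-law decay responsible for the missing higher moments.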

Beta geometric distribution

The beta geometric distribution is an important special case of the beta negative binomial distribution occurring for $r=1$. In this case the PMF simplifies to

f(k|\alpha ,\beta )={\frac {\mathrm {B} (\alpha +1,\beta +k)}{\mathrm {B} (\alpha ,\beta )}}

This distribution is used in some Buy Till You Die (BTYD) models.

Further, when $\beta =1$ the beta geometric reduces to the Yule–Simon distribution. However, it is more common to define the Yule–Simon distribution in terms of a shifted version of the beta geometric. In particular, if $X\sim BG(\alpha ,1)$ then $X+1\sim YS(\alpha )$.
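The shift can be made explicit with a short calculation. This sketch uses $\mathrm {B} (\alpha ,1)=1/\alpha$ and assumes the usual Yule–Simon PMF $\rho \,\mathrm {B} (j,\rho +1)$ on $j=1,2,\ldots$, here with $\rho =\alpha$:

```latex
\begin{aligned}
f(k\mid\alpha,1) &= \frac{\mathrm{B}(\alpha+1,\,k+1)}{\mathrm{B}(\alpha,1)}
  = \alpha\,\mathrm{B}(\alpha+1,\,k+1), \qquad k = 0,1,2,\ldots \\
\Pr(X+1 = j) &= f(j-1\mid\alpha,1) = \alpha\,\mathrm{B}(j,\,\alpha+1), \qquad j = 1,2,3,\ldots
\end{aligned}
```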

Beta negative binomial as a Pólya urn model

When the three parameters $r$, $\alpha$ and $\beta$ are positive integers, the beta negative binomial can also be motivated by an urn model, specifically a basic Pólya urn model. Consider an urn initially containing $\alpha$ red balls (the stopping color) and $\beta$ blue balls. At each step, a ball is drawn at random from the urn and returned to it along with one additional ball of the same color. The process is repeated until $r$ red balls have been drawn. The number $X$ of blue balls drawn is then distributed according to $\mathrm {BNB} (r,\alpha ,\beta )$. At the end of the experiment, the urn always contains the fixed number $r+\alpha$ of red balls and the random number $X+\beta$ of blue balls.

By the non-identifiability property, $X$ can be equivalently generated with the urn initially containing $\alpha$ red balls (the stopping color) and $r$ blue balls and stopping when $\beta$ red balls are observed.
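The urn scheme lends itself to a direct simulation. The sketch below is illustrative only: the function names, the seed and the parameter values are assumptions, and the empirical frequencies are compared with the gamma-function form of the PMF.

```python
import random
from math import exp, lgamma

def polya_urn_blue_draws(r, alpha, beta, rng):
    """Run the urn until r red balls are drawn; return the number of blue balls drawn."""
    red, blue = alpha, beta
    reds_drawn = blues_drawn = 0
    while reds_drawn < r:
        if rng.random() < red / (red + blue):
            red += 1            # red ball returned with one extra red ball
            reds_drawn += 1
        else:
            blue += 1           # blue ball returned with one extra blue ball
            blues_drawn += 1
    return blues_drawn

def bnb_pmf(k, alpha, beta, r):
    """Beta negative binomial PMF in gamma-function form."""
    return exp(lgamma(r + k) - lgamma(k + 1) - lgamma(r)
               + lgamma(alpha + r) + lgamma(beta + k) + lgamma(alpha + beta)
               - lgamma(alpha + r + beta + k) - lgamma(alpha) - lgamma(beta))

def empirical_frequencies(r, alpha, beta, n_runs, seed=0):
    """Relative frequency of each observed blue-draw count over n_runs urn experiments."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_runs):
        x = polya_urn_blue_draws(r, alpha, beta, rng)
        counts[x] = counts.get(x, 0) + 1
    return {k: c / n_runs for k, c in counts.items()}

freqs = empirical_frequencies(r=3, alpha=4, beta=2, n_runs=50000)
for k in range(5):
    print(k, freqs.get(k, 0.0), bnb_pmf(k, 4, 2, 3))
```

The simulated frequencies should match the PMF up to sampling noise. Rerunning with the initial blue count and the stopping count exchanged gives a way to observe the non-identifiability property empirically.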

Notes

Johnson et al. (1993)

References

Johnson, N.L.; Kotz, S.; Kemp, A.W. (1993) Univariate Discrete Distributions, 2nd edition, Wiley, ISBN 0-471-54897-9 (Section 6.2.3)
Kemp, C.D.; Kemp, A.W. (1956) "Generalized hypergeometric distributions", Journal of the Royal Statistical Society, Series B, 18, 202–211
Wang, Zhaoliang (2011) "One mixed negative binomial distribution with application", Journal of Statistical Planning and Inference, 141 (3), 1153–1160, doi:10.1016/j.jspi.2010.09.020

External links

Interactive graphic: Univariate Distribution Relationships

