Compound Poisson distribution

Last updated May 21, 2024

In probability theory, a compound Poisson distribution is the probability distribution of the sum of a number of independent identically-distributed random variables, where the number of terms to be added is itself a Poisson-distributed variable. The result can be either a continuous or a discrete distribution.

Definition

Suppose that

N\sim \operatorname {Poisson} (\lambda ),

i.e., N is a random variable whose distribution is a Poisson distribution with expected value λ, and that

X_{1},X_{2},X_{3},\dots

are identically distributed random variables that are mutually independent and also independent of N. Then the probability distribution of the sum of $N$ i.i.d. random variables

Y=\sum _{n=1}^{N}X_{n}

is a compound Poisson distribution.

If N = 0, the sum has no terms, so Y = 0. Hence the conditional distribution of Y given N = 0 is a degenerate distribution.

The compound Poisson distribution is obtained by marginalising the joint distribution of (Y,N) over N, and this joint distribution can be obtained by combining the conditional distribution Y | N with the marginal distribution of N.
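The two-stage construction above (draw N, then sum N independent copies of X) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the helper names `sample_poisson` and `sample_compound_poisson` are hypothetical, the Poisson draw uses Knuth's multiplication method (adequate for small λ), and the exponential summand with rate 2 is an arbitrary choice made for the example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw N ~ Poisson(lam) by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_compound_poisson(lam, draw_x, rng):
    """Draw Y = X_1 + ... + X_N; an empty sum (N = 0) gives Y = 0."""
    n = sample_poisson(lam, rng)
    return sum(draw_x(rng) for _ in range(n))


rng = random.Random(0)
# Hypothetical summand distribution: X ~ Exponential with rate 2 (mean 0.5).
samples = [sample_compound_poisson(3.0, lambda r: r.expovariate(2.0), rng)
           for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
# Fraction of draws with Y = 0, which for continuous positive X is the N = 0 case.
zero_fraction = sum(1 for y in samples if y == 0) / len(samples)
```

With these choices the sample mean should lie close to λ E(X) = 1.5, and the fraction of exact zeros close to P(N = 0) = e^{-3}, which illustrates the point mass at zero coming from the degenerate N = 0 case.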

Properties

The expected value and the variance of the compound distribution can be derived directly from the law of total expectation and the law of total variance. Thus

\operatorname {E} (Y)=\operatorname {E} \left[\operatorname {E} (Y\mid N)\right]=\operatorname {E} \left[N\operatorname {E} (X)\right]=\operatorname {E} (N)\operatorname {E} (X),

{\begin{aligned}\operatorname {Var} (Y)&=\operatorname {E} \left[\operatorname {Var} (Y\mid N)\right]+\operatorname {Var} \left[\operatorname {E} (Y\mid N)\right]=\operatorname {E} \left[N\operatorname {Var} (X)\right]+\operatorname {Var} \left[N\operatorname {E} (X)\right],\\[6pt]&=\operatorname {E} (N)\operatorname {Var} (X)+\left(\operatorname {E} (X)\right)^{2}\operatorname {Var} (N).\end{aligned}}

Then, since E(N) = Var(N) if N is Poisson-distributed, these formulae can be reduced to

\operatorname {E} (Y)=\operatorname {E} (N)\operatorname {E} (X),

\operatorname {Var} (Y)=\operatorname {E} (N)(\operatorname {Var} (X)+(\operatorname {E} (X))^{2})=\operatorname {E} (N){\operatorname {E} (X^{2})}.
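As a small numerical check of these formulae, the sketch below evaluates the mean and variance of Y from λ and the first two moments of X. The function name `compound_poisson_mean_var` and the input values are hypothetical, chosen only for illustration.

```python
def compound_poisson_mean_var(lam, mean_x, var_x):
    """Return (E[Y], Var[Y]) for Y = X_1 + ... + X_N with N ~ Poisson(lam)."""
    mean_y = lam * mean_x
    # Var(Y) = E(N) Var(X) + E(X)^2 Var(N), with E(N) = Var(N) = lam.
    var_y = lam * var_x + mean_x ** 2 * lam
    return mean_y, var_y


# Hypothetical example: lam = 4, summands with mean 2 and variance 3.
mean_y, var_y = compound_poisson_mean_var(4.0, 2.0, 3.0)
second_moment_x = 3.0 + 2.0 ** 2   # E(X^2) = Var(X) + E(X)^2
```

Here the variance computed from the law of total variance agrees with the reduced form E(N) E(X²).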

The probability distribution of Y can be determined in terms of characteristic functions:

\varphi _{Y}(t)=\operatorname {E} (e^{itY})=\operatorname {E} \left(\left(\operatorname {E} (e^{itX}\mid N)\right)^{N}\right)=\operatorname {E} \left((\varphi _{X}(t))^{N}\right),\,

and hence, using the probability-generating function of the Poisson distribution, we have

\varphi _{Y}(t)={\textrm {e}}^{\lambda (\varphi _{X}(t)-1)}.\,
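The closed form can be evaluated numerically. The following sketch assumes, purely for illustration, that X is exponential with rate 2, whose characteristic function is rate / (rate − it); the function names are hypothetical. It then recovers E(Y) from the derivative of φ_Y at zero, using φ_Y′(0) = i E(Y).

```python
import cmath


def cf_exponential(t, rate):
    """Characteristic function of X ~ Exponential(rate): rate / (rate - i t)."""
    return rate / (rate - 1j * t)


def cf_compound_poisson(t, lam, cf_x):
    """phi_Y(t) = exp(lam * (phi_X(t) - 1))."""
    return cmath.exp(lam * (cf_x(t) - 1.0))


LAM, RATE = 3.0, 2.0   # hypothetical parameter values


def phi_y(t):
    return cf_compound_poisson(t, LAM, lambda s: cf_exponential(s, RATE))


# phi_Y'(0) = i E(Y), so a central difference recovers the mean numerically.
h = 1e-5
mean_estimate = ((phi_y(h) - phi_y(-h)) / (2 * h) / 1j).real
```

The estimate should agree with λ E(X) = 3 × 0.5 = 1.5 up to the discretisation error of the finite difference.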

An alternative approach is via cumulant generating functions:

K_{Y}(t)=\ln \operatorname {E} [e^{tY}]=\ln \operatorname {E} [\operatorname {E} [e^{tY}\mid N]]=\ln \operatorname {E} [e^{NK_{X}(t)}]=K_{N}(K_{X}(t)).\,

Via the law of total cumulance it can be shown that, if the mean of the Poisson distribution is λ = 1, the cumulants of Y are the same as the moments of X₁.^[citation needed]
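The same conclusion can be read off the cumulant generating function, under the added assumption (not stated above) that the moment generating function $M_{X}(t)=\operatorname {E} [e^{tX}]$ is finite in a neighbourhood of $t=0$. Since $K_{N}(s)=\lambda (e^{s}-1)$ for a Poisson variable,

K_{Y}(t)=K_{N}(K_{X}(t))=\lambda \left(e^{K_{X}(t)}-1\right)=\lambda \left(M_{X}(t)-1\right)=\lambda \sum _{n=1}^{\infty }\operatorname {E} (X^{n}){\frac {t^{n}}{n!}},

so the $n$-th cumulant of Y is $\lambda \operatorname {E} (X^{n})$, which reduces to $\operatorname {E} (X^{n})$ when λ = 1. The cases n = 1 and n = 2 recover the mean and variance formulae given earlier.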

Every infinitely divisible probability distribution is a limit of compound Poisson distributions.^[1] Moreover, every compound Poisson distribution is itself infinitely divisible, as follows from the definition.

Discrete compound Poisson distribution

When $X_{1},X_{2},X_{3},\dots$ are positive integer-valued i.i.d. random variables with $P(X_{1}=k)=\alpha _{k}\ (k=1,2,\ldots )$, the compound Poisson distribution is called a discrete compound Poisson distribution^[2]^[3]^[4] (or stuttering-Poisson distribution^[5]). A discrete random variable $Y$ whose probability generating function satisfies

P_{Y}(z)=\sum \limits _{i=0}^{\infty }P(Y=i)z^{i}=\exp \left(\sum \limits _{k=1}^{\infty }\alpha _{k}\lambda (z^{k}-1)\right),\quad (|z|\leq 1)

is said to have a discrete compound Poisson (DCP) distribution with parameters $(\alpha _{1}\lambda ,\alpha _{2}\lambda ,\ldots )\in \mathbb {R} ^{\infty }$ (where ${\textstyle \sum _{i=1}^{\infty }\alpha _{i}=1}$, with ${\textstyle \alpha _{i}\geq 0,\lambda >0}$), which is denoted by

Y\sim {\text{DCP}}(\lambda {\alpha _{1}},\lambda {\alpha _{2}},\ldots )

Moreover, if $X\sim {\operatorname {DCP} }(\lambda {\alpha _{1}},\ldots ,\lambda {\alpha _{r}})$, we say $X$ has a discrete compound Poisson distribution of order $r$. When $r=1,2$, the DCP distribution becomes the Poisson distribution and the Hermite distribution, respectively. When $r=3,4$, it becomes the triple stuttering-Poisson distribution and the quadruple stuttering-Poisson distribution, respectively.^[6] Other special cases include the shifted geometric distribution, the negative binomial distribution, the geometric Poisson distribution, the Neyman type A distribution, and the Luria–Delbrück distribution arising in the Luria–Delbrück experiment. For more special cases of the DCP distribution, see the review paper^[7] and the references therein.
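The probability mass function of a DCP distribution can be computed from the generating function. Differentiating P_Y(z) and comparing coefficients of z^{n-1} gives the recursion n P(Y=n) = λ Σ_{k=1}^{n} k α_k P(Y=n−k), with P(Y=0) = e^{−λ}. The sketch below implements this recursion; the function name `dcp_pmf` and the example weights are hypothetical.

```python
import math


def dcp_pmf(lam, alphas, n_max):
    """Return [P(Y=0), ..., P(Y=n_max)] for Y ~ DCP(lam*alpha_1, lam*alpha_2, ...).

    alphas[k-1] is alpha_k = P(X_1 = k); entries beyond the list are taken as 0.
    """
    pmf = [math.exp(-lam)]   # P(Y = 0) = P_Y(0) = exp(-lam)
    for n in range(1, n_max + 1):
        total = 0.0
        for k in range(1, min(n, len(alphas)) + 1):
            total += k * alphas[k - 1] * pmf[n - k]
        pmf.append(lam * total / n)
    return pmf


# Order r = 1 reduces to the ordinary Poisson distribution.
poisson_pmf = dcp_pmf(2.0, [1.0], 10)
# Order r = 2 (a Hermite distribution) with hypothetical weights.
hermite_pmf = dcp_pmf(2.0, [0.7, 0.3], 60)
```

For r = 1 the output matches the Poisson probabilities e^{−λ} λ^n / n!, and for the order-2 example the probabilities sum to approximately 1 with mean λ(α₁ + 2α₂) = 2.6.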

Feller's characterization of the compound Poisson distribution states that a non-negative integer-valued random variable $X$ is infinitely divisible if and only if its distribution is a discrete compound Poisson distribution.^[8] The negative binomial distribution is discrete infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n there exist discrete i.i.d. random variables X₁, ..., X_n whose sum has the same distribution as X. The shifted geometric distribution is a discrete compound Poisson distribution, since it is a special case of the negative binomial distribution.

This distribution can model batch arrivals (such as in a bulk queue^[5]^[9]). The discrete compound Poisson distribution is also widely used in actuarial science for modelling the distribution of the total claim amount.^[3]

When some of the $\alpha _{k}$ are negative, the distribution is called a discrete pseudo compound Poisson distribution.^[3] Any discrete random variable $Y$ whose probability generating function satisfies

G_{Y}(z)=\sum \limits _{i=0}^{\infty }P(Y=i)z^{i}=\exp \left(\sum \limits _{k=1}^{\infty }\alpha _{k}\lambda (z^{k}-1)\right),\quad (|z|\leq 1)

has a discrete pseudo compound Poisson distribution with parameters $(\lambda _{1},\lambda _{2},\ldots )=:(\alpha _{1}\lambda ,\alpha _{2}\lambda ,\ldots )\in \mathbb {R} ^{\infty }$ where ${\textstyle \sum _{i=1}^{\infty }{\alpha _{i}}=1}$ and ${\textstyle \sum _{i=1}^{\infty }{\left|{\alpha _{i}}\right|}<\infty }$ , with ${\alpha _{i}}\in \mathbb {R} ,\lambda >0$ .

Compound Poisson Gamma distribution

If X has a gamma distribution, of which the exponential distribution is a special case, then the conditional distribution of Y | N is again a gamma distribution whenever N ≥ 1. The marginal distribution of Y is a Tweedie distribution^[10] with variance power 1 < p < 2 (this can be proved by comparing characteristic functions). To be more explicit, if

N\sim \operatorname {Poisson} (\lambda ),

and

X_{i}\sim \operatorname {\Gamma } (\alpha ,\beta )

i.i.d., then the distribution of

Y=\sum _{i=1}^{N}X_{i}

is a reproductive exponential dispersion model $ED(\mu ,\sigma ^{2})$ with

{\begin{aligned}\operatorname {E} [Y]&=\lambda {\frac {\alpha }{\beta }}=:\mu ,\\[4pt]\operatorname {Var} [Y]&=\lambda {\frac {\alpha (1+\alpha )}{\beta ^{2}}}=:\sigma ^{2}\mu ^{p}.\end{aligned}}

The Tweedie parameters $\mu ,\sigma ^{2},p$ map to the Poisson and gamma parameters $\lambda ,\alpha ,\beta$ as follows:

{\begin{aligned}\lambda &={\frac {\mu ^{2-p}}{(2-p)\sigma ^{2}}},\\[4pt]\alpha &={\frac {2-p}{p-1}},\\[4pt]\beta &={\frac {\mu ^{1-p}}{(p-1)\sigma ^{2}}}.\end{aligned}}
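The mapping can be checked numerically by converting Tweedie parameters to Poisson and gamma parameters and then recomputing the mean and variance of Y from the formulae above. This is a minimal sketch; the function name `tweedie_to_poisson_gamma` and the parameter values are hypothetical.

```python
def tweedie_to_poisson_gamma(mu, sigma2, p):
    """Map Tweedie (mu, sigma^2, p), 1 < p < 2, to (lam, alpha, beta)."""
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson-gamma case requires 1 < p < 2")
    lam = mu ** (2.0 - p) / ((2.0 - p) * sigma2)
    alpha = (2.0 - p) / (p - 1.0)
    beta = mu ** (1.0 - p) / ((p - 1.0) * sigma2)
    return lam, alpha, beta


# Hypothetical parameter values, chosen only for illustration.
mu, sigma2, p = 10.0, 2.0, 1.5
lam, alpha, beta = tweedie_to_poisson_gamma(mu, sigma2, p)

mean_y = lam * alpha / beta                       # should equal mu
var_y = lam * alpha * (1.0 + alpha) / beta ** 2   # should equal sigma2 * mu**p
```

The recomputed mean and variance agree with μ and σ²μ^p up to floating-point rounding, confirming that the two parameterisations are consistent.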

Compound Poisson processes

A compound Poisson process with rate $\lambda >0$ and jump size distribution G is a continuous-time stochastic process $\{\,Y(t):t\geq 0\,\}$ given by

Y(t)=\sum _{i=1}^{N(t)}D_{i},

where the sum is by convention equal to zero as long as N(t) = 0. Here, $\{\,N(t):t\geq 0\,\}$ is a Poisson process with rate $\lambda$, and $\{\,D_{i}:i\geq 1\,\}$ are independent and identically distributed random variables, with distribution function G, which are also independent of $\{\,N(t):t\geq 0\,\}$.^[11]
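A sample path of such a process can be simulated by using the standard fact that the inter-arrival times of a Poisson process with rate λ are i.i.d. exponential with rate λ. The sketch below is illustrative only: the function names are hypothetical, and the jump distribution G is taken to be exponential with mean 1 purely as an example.

```python
import random


def simulate_compound_poisson_path(rate, draw_jump, horizon, rng):
    """Return (jump_times, values); Y(t) = values[i] on [jump_times[i], jump_times[i+1])."""
    jump_times, values = [0.0], [0.0]   # Y(0) = 0 because N(0) = 0
    t = rng.expovariate(rate)           # first arrival of the Poisson process
    while t <= horizon:
        jump_times.append(t)
        values.append(values[-1] + draw_jump(rng))
        t += rng.expovariate(rate)      # i.i.d. exponential inter-arrival times
    return jump_times, values


def value_at(jump_times, values, t):
    """Evaluate the right-continuous step path at time t."""
    current = values[0]
    for jump_time, value in zip(jump_times, values):
        if jump_time > t:
            break
        current = value
    return current


rng = random.Random(42)
# Hypothetical jump distribution G: exponential with mean 1.
times, path = simulate_compound_poisson_path(
    2.0, lambda r: r.expovariate(1.0), 5.0, rng)
```

The path is stored as a step function: it stays at zero until the first arrival and then increases by one draw from G at each arrival time, so `value_at(times, path, t)` returns Y(t) for any t in [0, 5].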

The discrete version of the compound Poisson process can be used in survival analysis for frailty models.^[12]

Applications

A compound Poisson distribution in which the summands have an exponential distribution was used by Revfeim to model the total rainfall in a day: each day contains a Poisson-distributed number of events, each of which contributes an exponentially distributed amount of rainfall.^[13] Thompson applied the same model to monthly total rainfalls.^[14]

There have been applications to insurance claims^[15]^[16] and x-ray computed tomography.^[17]^[18]^[19]


References

[1] Lukacs, E. (1970). Characteristic functions. London: Griffin. ISBN 0-85264-170-2.
[2] Johnson, N.L., Kemp, A.W., and Kotz, S. (2005). Univariate Discrete Distributions, 3rd Edition, Wiley, ISBN 978-0-471-27246-5.
[3] Huiming, Zhang; Yunxiao Liu; Bo Li (2014). "Notes on discrete compound Poisson model with applications to risk theory". Insurance: Mathematics and Economics. 59: 325–336. doi:10.1016/j.insmatheco.2014.09.012.
[4] Huiming, Zhang; Bo Li (2016). "Characterizations of discrete compound Poisson distributions". Communications in Statistics - Theory and Methods. 45 (22): 6789–6802. doi:10.1080/03610926.2014.901375. S2CID 125475756.
[5] Kemp, C. D. (1967). "'Stuttering – Poisson' distributions". Journal of the Statistical and Social Enquiry of Ireland. 21 (5): 151–157. hdl:2262/6987.
[6] Patel, Y. C. (1976). "Estimation of the parameters of the triple and quadruple stuttering-Poisson distributions". Technometrics. 18 (1): 67–73.
[7] Wimmer, G.; Altmann, G. (1996). "The multiple Poisson distribution, its characteristics and a variety of forms". Biometrical Journal. 38 (8): 995–1011.
[8] Feller, W. (1968). An Introduction to Probability Theory and its Applications. Vol. I (3rd ed.). New York: Wiley.
[9] Adelson, R. M. (1966). "Compound Poisson Distributions". Journal of the Operational Research Society. 17 (1): 73–75. doi:10.1057/jors.1966.8.
[10] Jørgensen, Bent (1997). The theory of dispersion models. Chapman & Hall. ISBN 978-0412997112.
[11] S. M. Ross (2007). Introduction to Probability Models (ninth ed.). Boston: Academic Press. ISBN 978-0-12-598062-3.
[12] Ata, N.; Özel, G. (2013). "Survival functions for the frailty models based on the discrete compound Poisson process". Journal of Statistical Computation and Simulation. 83 (11): 2105–2116. doi:10.1080/00949655.2012.679943. S2CID 119851120.
[13] Revfeim, K. J. A. (1984). "An initial model of the relationship between rainfall events and daily rainfalls". Journal of Hydrology. 75 (1–4): 357–364. Bibcode:1984JHyd...75..357R. doi:10.1016/0022-1694(84)90059-3.
[14] Thompson, C. S. (1984). "Homogeneity analysis of a rainfall series: an application of the use of a realistic rainfall model". J. Climatology. 4 (6): 609–619. Bibcode:1984IJCli...4..609T. doi:10.1002/joc.3370040605.
[15] Jørgensen, Bent; Paes De Souza, Marta C. (January 1994). "Fitting Tweedie's compound poisson model to insurance claims data". Scandinavian Actuarial Journal. 1994 (1): 69–93. doi:10.1080/03461238.1994.10413930.
[16] Smyth, Gordon K.; Jørgensen, Bent (29 August 2014). "Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling". ASTIN Bulletin. 32 (1): 143–157. doi:10.2143/AST.32.1.1020.
[17] Whiting, Bruce R. (3 May 2002). Antonuk, Larry E.; Yaffe, Martin J. (eds.). "Signal statistics in x-ray computed tomography". Medical Imaging 2002: Physics of Medical Imaging. 4682. International Society for Optics and Photonics: 53–60. Bibcode:2002SPIE.4682...53W. doi:10.1117/12.465601. S2CID 116487704.
[18] Elbakri, Idris A.; Fessler, Jeffrey A. (16 May 2003). Sonka, Milan; Fitzpatrick, J. Michael (eds.). "Efficient and accurate likelihood for iterative image reconstruction in x-ray computed tomography". Medical Imaging 2003: Image Processing. 5032. SPIE: 1839–1850. Bibcode:2003SPIE.5032.1839E. CiteSeerX 10.1.1.419.3752. doi:10.1117/12.480302. S2CID 12215253.
[19] Whiting, Bruce R.; Massoumzadeh, Parinaz; Earl, Orville A.; O'Sullivan, Joseph A.; Snyder, Donald L.; Williamson, Jeffrey F. (24 August 2006). "Properties of preprocessed sinogram data in x-ray computed tomography". Medical Physics. 33 (9): 3290–3303. Bibcode:2006MedPh..33.3290W. doi:10.1118/1.2230762. PMID 17022224.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
