Geometric distribution

Geometric
Probability mass function (figure: Geometric pmf.svg)
Cumulative distribution function (figure: Geometric cdf.svg)

In each row below, the first expression refers to the distribution of the number of trials $X$ (support $\{1, 2, 3, \dots\}$) and the second to the distribution of the number of failures $Y$ (support $\{0, 1, 2, \dots\}$).

Parameters: $0 < p \leq 1$ success probability (real)
Support: $k$ trials where $k \in \{1, 2, 3, \dots\}$; $k$ failures where $k \in \{0, 1, 2, \dots\}$
PMF: $(1-p)^{k-1} p$; $(1-p)^k p$
CDF: $1-(1-p)^{\lfloor x \rfloor}$ for $x \geq 1$, $0$ for $x < 1$; $1-(1-p)^{\lfloor x \rfloor + 1}$ for $x \geq 0$, $0$ for $x < 0$
Mean: $\frac{1}{p}$; $\frac{1-p}{p}$
Median: $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil$ (not unique if $-1/\log_2(1-p)$ is an integer); $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil - 1$ (not unique if $-1/\log_2(1-p)$ is an integer)
Mode: $1$; $0$
Variance: $\frac{1-p}{p^2}$
Skewness: $\frac{2-p}{\sqrt{1-p}}$
Excess kurtosis: $6 + \frac{p^2}{1-p}$
Entropy: $\frac{-(1-p)\log(1-p) - p \log p}{p}$
MGF: $\frac{p e^t}{1-(1-p)e^t}$ for $t < -\ln(1-p)$; $\frac{p}{1-(1-p)e^t}$ for $t < -\ln(1-p)$
CF: $\frac{p e^{it}}{1-(1-p)e^{it}}$; $\frac{p}{1-(1-p)e^{it}}$
PGF: $\frac{p z}{1-(1-p)z}$; $\frac{p}{1-(1-p)z}$
In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions:

- the probability distribution of the number $X$ of Bernoulli trials needed to get one success, supported on the set $\{1, 2, 3, \dots\}$;
- the probability distribution of the number $Y = X - 1$ of failures before the first success, supported on the set $\{0, 1, 2, \dots\}$.

Which of these is called the geometric distribution is a matter of convention and convenience.

These two different geometric distributions should not be confused with each other. Often, the name shifted geometric distribution is adopted for the former one (distribution of $X$); however, to avoid ambiguity, it is considered wise to indicate which is intended, by mentioning the support explicitly.

The geometric distribution gives the probability that the first occurrence of success requires $k$ independent trials, each with success probability $p$. If the probability of success on each trial is $p$, then the probability that the $k$-th trial is the first success is

$$\Pr(X = k) = (1-p)^{k-1} p$$

for $k = 1, 2, 3, 4, \dots$

The above form of the geometric distribution is used for modeling the number of trials up to and including the first success. By contrast, the following form of the geometric distribution is used for modeling the number of failures until the first success:

$$\Pr(Y = k) = \Pr(X = k + 1) = (1-p)^k p$$

for $k = 0, 1, 2, 3, \dots$

In either case, the sequence of probabilities is a geometric sequence.

For example, suppose an ordinary die is thrown repeatedly until the first time a "1" appears. The probability distribution of the number of times it is thrown is supported on the infinite set $\{1, 2, 3, \dots\}$ and is a geometric distribution with $p = 1/6$.
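As a quick check, these probabilities can be computed in R; a minimal sketch, assuming base R's convention that dgeom counts failures before the first success (so the probability that the first "1" appears on throw k is dgeom(k - 1, 1/6)):

k <- 1:6
dgeom(k - 1, prob = 1/6)   # (5/6)^(k-1) * (1/6) for k = 1, ..., 6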

The geometric distribution is denoted by Geo(p) where $0 < p \leq 1$. [1]

Definitions

Consider a sequence of trials, where each trial has only two possible outcomes (designated failure and success). The probability of success is assumed to be the same for each trial. In such a sequence of trials, the geometric distribution is useful for modeling the number of failures before the first success, since the experiment can continue for an indefinite number of trials until a success occurs, unlike the binomial distribution, which has a fixed number of trials. The distribution gives the probability that there are zero failures before the first success, one failure before the first success, two failures before the first success, and so on. [2]

Assumptions: When is the geometric distribution an appropriate model?

The geometric distribution is an appropriate model if the following assumptions are true. [3]

- The phenomenon being modeled is a sequence of independent trials.
- There are only two possible outcomes for each trial, often designated success or failure.
- The probability of success, p, is the same for every trial.

If these conditions are true, then the geometric random variable Y is the count of the number of failures before the first success. The possible number of failures before the first success is 0, 1, 2, 3, and so on. In the graphs above, this formulation is shown on the right.

An alternative formulation is that the geometric random variable X is the total number of trials up to and including the first success, and the number of failures is X − 1. In the graphs above, this formulation is shown on the left.

Probability outcomes examples

The general formula to calculate the probability of k failures before the first success, where the probability of success is p and the probability of failure is q = 1 − p, is

$$\Pr(Y = k) = q^k\,p$$

for $k = 0, 1, 2, 3, \dots$
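As a one-line sanity check that these probabilities sum to one, the geometric series gives

$$\sum_{k=0}^{\infty} q^k p = \frac{p}{1-q} = \frac{p}{p} = 1.$$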

E1) A doctor is seeking an antidepressant for a newly diagnosed patient. Suppose that, of the available anti-depressant drugs, the probability that any particular drug will be effective for a particular patient is p = 0.6. What is the probability that the first drug found to be effective for this patient is the first drug tried, the second drug tried, and so on? What is the expected number of drugs that will be tried to find one that is effective?

The probability that the first drug works. There are zero failures before the first success. Y = 0 failures. The probability Pr(zero failures before first success) is simply the probability that the first drug works:

$$\Pr(Y = 0) = q^0\,p = (0.4)^0\,(0.6) = 0.6.$$

The probability that the first drug fails, but the second drug works. There is one failure before the first success. Y = 1 failure. The probability for this sequence of events is Pr(first drug fails) × Pr(second drug succeeds), which is given by

$$\Pr(Y = 1) = q^1\,p = (0.4)^1\,(0.6) = 0.24.$$

The probability that the first drug fails, the second drug fails, but the third drug works. There are two failures before the first success. Y = 2 failures. The probability for this sequence of events is Pr(first drug fails) × Pr(second drug fails) × Pr(third drug succeeds), which is given by

$$\Pr(Y = 2) = q^2\,p = (0.4)^2\,(0.6) = 0.096.$$
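The three probabilities above can be verified with base R's dgeom (a sketch; dgeom uses the failures-before-first-success convention):

dgeom(0:2, prob = 0.6)   # 0.600 0.240 0.096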

E2) A newlywed couple plans to have children and will continue until the first girl. What is the probability that there are zero boys before the first girl, one boy before the first girl, two boys before the first girl, and so on?

The probability of having a girl (success) is p = 0.5 and the probability of having a boy (failure) is q = 1 − p = 0.5.

The probability of no boys before the first girl is

$$\Pr(Y = 0) = (0.5)^0\,(0.5) = 0.5.$$

The probability of one boy before the first girl is

$$\Pr(Y = 1) = (0.5)^1\,(0.5) = 0.25.$$

The probability of two boys before the first girl is

$$\Pr(Y = 2) = (0.5)^2\,(0.5) = 0.125.$$

and so on.

Properties

Moments and cumulants

The expected value for the number of independent trials to get the first success, and the variance of a geometrically distributed random variable X, are

$$\operatorname{E}(X) = \frac{1}{p}, \qquad \operatorname{var}(X) = \frac{1-p}{p^2}.$$

Similarly, the expected value and variance of the geometrically distributed random variable Y = X − 1 (see the definition of the distribution above) are

$$\operatorname{E}(Y) = \frac{1-p}{p}, \qquad \operatorname{var}(Y) = \frac{1-p}{p^2}.$$
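A short Monte Carlo sanity check of these formulas in R (a sketch; rgeom simulates the failures variant Y, so X = Y + 1):

set.seed(42)
p <- 0.3
y <- rgeom(1e6, p)      # simulated failures before first success
mean(y + 1); 1 / p      # E(X) versus 1/p
var(y); (1 - p) / p^2   # var(Y) versus (1 - p)/p^2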

Proof

Expected value of X

Consider the expected value $\operatorname{E}(X)$ of X as above, i.e. the average number of trials until a success. On the first trial, we either succeed with probability $p$, or we fail with probability $1-p$. If we fail, the remaining mean number of trials until a success is identical to the original mean; this follows from the fact that all trials are independent. From this we get the formula

$$\operatorname{E}(X) = p \cdot 1 + (1-p)\,\bigl(1 + \operatorname{E}(X)\bigr),$$

which, if solved for $\operatorname{E}(X)$, gives

$$\operatorname{E}(X) = \frac{1}{p}.$$

Expected value of Y

That the expected value of Y as above is (1 − p)/p can be trivially seen from $\operatorname{E}(Y) = \operatorname{E}(X) - 1 = \frac{1}{p} - 1 = \frac{1-p}{p}$, which follows from the linearity of expectation, or can be shown in the following way:

$$\operatorname{E}(Y) = \sum_{k=0}^{\infty} k\,p\,(1-p)^k = p\,(1-p) \sum_{k=1}^{\infty} k\,(1-p)^{k-1} = p\,(1-p)\left(-\frac{d}{dp} \sum_{k=0}^{\infty} (1-p)^k\right) = p\,(1-p)\left(-\frac{d}{dp}\,\frac{1}{p}\right) = p\,(1-p)\,\frac{1}{p^2} = \frac{1-p}{p}.$$

The interchange of summation and differentiation is justified by the fact that convergent power series converge uniformly on compact subsets of the set of points where they converge.

Let μ = (1 − p)/p be the expected value of Y. Then the cumulants $\kappa_n$ of the probability distribution of Y satisfy the recursion

$$\kappa_{n+1} = \mu(\mu + 1)\,\frac{d\kappa_n}{d\mu}.$$
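As a consistency check (not part of the original derivation), one application of the recursion starting from $\kappa_1 = \mu$ recovers the variance given above:

$$\kappa_2 = \mu(\mu + 1)\,\frac{d\kappa_1}{d\mu} = \mu(\mu + 1) = \frac{1-p}{p} \cdot \frac{1}{p} = \frac{1-p}{p^2} = \operatorname{var}(Y).$$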

Expected value examples

E3) A patient is waiting for a suitable matching kidney donor for a transplant. If the probability that a randomly selected donor is a suitable match is p = 0.1, what is the expected number of donors who will be tested before a matching donor is found?

With p = 0.1, the mean number of failures before the first success is E(Y) = (1 − p)/p = (1 − 0.1)/0.1 = 9.

For the alternative formulation, where X is the number of trials up to and including the first success, the expected value is E(X) = 1/p = 1/0.1 = 10.

For E1 above, with p = 0.6, the mean number of failures before the first success is E(Y) = (1 − p)/p = (1 − 0.6)/0.6 ≈ 0.67.

Higher-order moments

The moments for the number of failures before the first success are given by

$$\operatorname{E}(Y^n) = \sum_{k=0}^{\infty} k^n\,(1-p)^k\,p = p\,\operatorname{Li}_{-n}(1-p) \qquad (n \neq 0),$$

where $\operatorname{Li}_{-n}(1-p)$ is the polylogarithm function.
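A numeric spot check in R for n = 2, truncating the series (a sketch; for n = 2 the closed form $p\,\operatorname{Li}_{-2}(q)$ reduces to $q(1+q)/p^2$ with q = 1 − p):

p <- 0.3; q <- 1 - p
k <- 0:2000               # truncate the series; the tail is negligible here
sum(k^2 * dgeom(k, p))    # E(Y^2) summed directly
q * (1 + q) / p^2         # closed form for n = 2; both lines should agree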

General properties

The geometric distribution supported on {0, 1, 2, 3, ... } is the only memoryless discrete distribution. Note that the geometric distribution supported on {1, 2, ... } is not memoryless.
The decimal digits of the geometrically distributed random variable Y are a sequence of independent (and not identically distributed) random variables. For example, the hundreds digit D has this probability distribution:

$$\Pr(D = d) = \frac{q^{100d}\,(1 - q^{100})}{1 - q^{1000}}, \qquad d = 0, 1, \dots, 9,$$

where q = 1 − p, and similarly for the other digits, and, more generally, similarly for numeral systems with bases other than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
If $X_1, \dots, X_r$ are independent geometrically distributed variables with parameter p, then the sum $Z = \sum_{m=1}^{r} X_m$ follows a negative binomial distribution with parameters r and p. [6]
If $X_1, \dots, X_n$ are independent geometrically distributed variables (with possibly different success parameters $p_m$), then their minimum $W = \min_m X_m$ is also geometrically distributed, with parameter $p = 1 - \prod_m (1 - p_m)$. [7]
Suppose 0 < r < 1, and for k = 1, 2, 3, ... the random variable $X_k$ has a Poisson distribution with expected value $r^k / k$. Then $\sum_{k=1}^{\infty} k\,X_k$ has a geometric distribution taking values in the set {0, 1, 2, ...}, with expected value r/(1 − r).[ citation needed ]
If Y is exponentially distributed with parameter λ, then $X = \lfloor Y \rfloor$, where $\lfloor \cdot \rfloor$ is the floor (or greatest integer) function, is a geometrically distributed random variable with parameter p = 1 − e^{−λ} (thus λ = −ln(1 − p) [8] ) and taking values in the set {0, 1, 2, ...}. This can be used to generate geometrically distributed pseudorandom numbers by first generating exponentially distributed pseudorandom numbers from a uniform pseudorandom number generator: then $\lfloor \ln(U) / \ln(1-p) \rfloor$ is geometrically distributed with parameter p, if U is uniformly distributed in [0, 1].
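A minimal R sketch of this inverse-transform generator, compared against dgeom (assuming the failures convention throughout):

set.seed(7)
p <- 0.3
u <- runif(1e5)
x <- floor(log(u) / log(1 - p))   # geometric via the floor of an exponential
table(x)[1:4] / length(x)         # empirical Pr(X = 0), ..., Pr(X = 3)
dgeom(0:3, p)                     # theoretical values for comparison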

More generally, if p = λ/n, where λ is a parameter, then as n → ∞ the distribution of X/n approaches an exponential distribution with rate λ:

$$\Pr(X/n > x) = \Pr(X > nx) = (1-p)^{nx} = \left(1 - \frac{\lambda}{n}\right)^{nx} \to e^{-\lambda x} \quad \text{as } n \to \infty,$$

therefore the distribution function of X/n converges to $1 - e^{-\lambda x}$, which is that of an exponential random variable.
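A numeric illustration of this limit in R (a sketch; pgeom counts failures, so the trials variable X corresponds to failures + 1):

lambda <- 2; n <- 1e4; p <- lambda / n
x <- 1.5
pgeom(floor(n * x) - 1, p)   # Pr(X/n <= x) for the trials variant
pexp(x, rate = lambda)       # limiting exponential CDF at x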

Statistical inference

Parameter estimation

For both variants of the geometric distribution, the parameter p can be estimated by equating the expected value with the sample mean. This is the method of moments, which in this case happens to yield maximum likelihood estimates of p. [9] [10]

Specifically, for the first variant let $k = k_1, \dots, k_n$ be a sample where $k_i \geq 1$ for $i = 1, \dots, n$. Then p can be estimated as

$$\widehat{p} = \left(\frac{1}{n}\sum_{i=1}^n k_i\right)^{-1} = \frac{n}{\sum_{i=1}^n k_i}.$$

In Bayesian inference, the Beta distribution is the conjugate prior distribution for the parameter p. If this parameter is given a Beta(α, β) prior, then the posterior distribution is

$$p \sim \mathrm{Beta}\!\left(\alpha + n,\ \beta + \sum_{i=1}^n (k_i - 1)\right).$$

The posterior mean E[p] approaches the maximum likelihood estimate as α and β approach zero.

In the alternative case, let $k_1, \dots, k_n$ be a sample where $k_i \geq 0$ for $i = 1, \dots, n$. Then p can be estimated as

$$\widehat{p} = \left(1 + \frac{1}{n}\sum_{i=1}^n k_i\right)^{-1} = \frac{n}{n + \sum_{i=1}^n k_i}.$$

The posterior distribution of p given a Beta(α, β) prior is [11]

$$p \sim \mathrm{Beta}\!\left(\alpha + n,\ \beta + \sum_{i=1}^n k_i\right).$$

Again the posterior mean E[p] approaches the maximum likelihood estimate as α and β approach zero.
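A small R sketch of this conjugate update for the failures variant (the prior hyperparameters alpha and beta here are illustrative choices, not from the original text):

set.seed(1)
k <- rgeom(50, 0.3)       # simulated sample of failure counts
alpha <- 1; beta <- 1     # flat Beta(1, 1) prior
alpha_post <- alpha + length(k)
beta_post  <- beta + sum(k)
alpha_post / (alpha_post + beta_post)   # posterior mean estimate of p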

For either estimate of $\widehat{p}$ using maximum likelihood, the bias is equal to

$$b \equiv \operatorname{E}\left[\,\widehat{p}_{\mathrm{mle}} - p\,\right] = \frac{p\,(1-p)}{n},$$

which yields the bias-corrected maximum likelihood estimator

$$\widehat{p}^{\,*}_{\mathrm{mle}} = \widehat{p}_{\mathrm{mle}} - \widehat{b}.$$
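A brief R sketch of the plug-in bias correction (the sample here is simulated for illustration; the failures-variant estimator is used):

set.seed(2)
k <- rgeom(20, 0.4)                # failures before first success
n <- length(k)
p_hat <- n / (n + sum(k))          # maximum likelihood estimate
b_hat <- p_hat * (1 - p_hat) / n   # estimated bias
p_hat - b_hat                      # bias-corrected estimate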

Computational methods

Geometric distribution using R

The R function dgeom(k,prob) calculates the probability that there are k failures before the first success, where the argument "prob" is the probability of success on each trial.

For example,

dgeom(0, 0.6) = 0.6

dgeom(1, 0.6) = 0.24

R uses the convention that k is the number of failures, so that the number of trials up to and including the first success is k + 1.

The following R code creates a graph of the geometric distribution from Y = 0 to 10, with p = 0.6.

Y <- 0:10
plot(Y, dgeom(Y, 0.6), type = "h", ylim = c(0, 1),
     main = "Geometric distribution for p = 0.6",
     ylab = "Pr(Y = y)",
     xlab = "Y = Number of failures before first success")

Geometric distribution using Excel

The geometric distribution, for the number of failures before the first success, is a special case of the negative binomial distribution, for the number of failures before s successes.

The Excel function NEGBINOMDIST(number_f, number_s, probability_s) calculates the probability of k = number_f failures before s = number_s successes where p = probability_s is the probability of success on each trial. For the geometric distribution, let number_s = 1 success. [12]

For example,

=NEGBINOMDIST(0, 1, 0.6) = 0.6
=NEGBINOMDIST(1, 1, 0.6) = 0.24

Like R, Excel uses the convention that k is the number of failures, so that the number of trials up to and including the first success is k + 1.


Related Research Articles

Binomial distribution

In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own Boolean-valued outcome: success or failure. A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment, and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

Negative binomial distribution

In probability theory and statistics, the negative binomial distribution is a discrete probability distribution that models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of successes occurs. For example, we can define rolling a 6 on a die as a success, and rolling any other number as a failure, and ask how many failure rolls will occur before we see the third success. In such a case, the probability distribution of the number of failures that appear will be a negative binomial distribution.

Exponential distribution

In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution of the distance between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate; the distance parameter could be any meaningful mono-dimensional measure of the process, such as time between production errors, or length along a roll of fabric in the weaving manufacturing process. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless. In addition to being used for the analysis of Poisson point processes it is found in various other contexts.

Pareto distribution

The Pareto distribution, named after the Italian civil engineer, economist, and sociologist Vilfredo Pareto, is a power-law probability distribution that is used in description of social, quality control, scientific, geophysical, actuarial, and many other types of observable phenomena; the principle originally applied to describing the distribution of wealth in a society, fitting the trend that a large portion of wealth is held by a small fraction of the population. The Pareto principle or "80-20 rule" stating that 80% of outcomes are due to 20% of causes was named in honour of Pareto, but the concepts are distinct, and only Pareto distributions with shape value of $\log_4 5 \approx 1.16$ precisely reflect it. Empirical observation has shown that this 80-20 distribution fits a wide range of cases, including natural phenomena and human activities.

Law of large numbers

In probability theory, the law of large numbers (LLN) is a mathematical theorem that states that the average of the results obtained from a large number of independent and identical random samples converges to the true value, if it exists. More formally, the LLN states that given a sample of independent and identically distributed values, the sample mean converges to the true mean.

Hypergeometric distribution

In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of $k$ successes in $n$ draws, without replacement, from a finite population of size $N$ that contains exactly $K$ objects with that feature, wherein each draw is either a success or a failure. In contrast, the binomial distribution describes the probability of $k$ successes in $n$ draws with replacement.

Bernoulli distribution

In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability $p$ and the value 0 with probability $q = 1 - p$. Less formally, it can be thought of as a model for the set of possible outcomes of any single experiment that asks a yes–no question. Such questions lead to outcomes that are Boolean-valued: a single bit whose value is success/yes/true/one with probability p and failure/no/false/zero with probability q. It can be used to represent a coin toss where 1 and 0 would represent "heads" and "tails", respectively, and p would be the probability of the coin landing on heads. In particular, unfair coins would have $p \neq 1/2$.

Weibull distribution

In probability theory and statistics, the Weibull distribution is a continuous probability distribution. It models a broad range of random variables, largely in the nature of a time to failure or time between events. Examples are maximum one-day rainfalls and the time a user spends on a web page.

Beta distribution

In probability theory and statistics, the beta distribution is a family of continuous probability distributions defined on the interval [0, 1] or (0, 1) in terms of two positive parameters, denoted by alpha (α) and beta (β), that appear as exponents of the variable and its complement to 1, respectively, and control the shape of the distribution.

In probability theory, the probability generating function of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability generating functions are often employed for their succinct description of the sequence of probabilities Pr(X = i) in the probability mass function for a random variable X, and to make available the well-developed theory of power series with non-negative coefficients.

Gumbel distribution

In probability theory and statistics, the Gumbel distribution is used to model the distribution of the maximum of a number of samples of various distributions.

In probability theory and statistics, the cumulants $\kappa_n$ of a probability distribution are a set of quantities that provide an alternative to the moments of the distribution. Any two probability distributions whose moments are identical will have identical cumulants as well, and vice versa.

Joint probability distribution

Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered for any given number of random variables. The joint distribution encodes the marginal distributions, i.e. the distributions of each of the individual random variables and the conditional probability distributions, which deal with how the outputs of one random variable are distributed when given information on the outputs of the other random variable(s).

Laplace distribution

In probability theory and statistics, the Laplace distribution is a continuous probability distribution named after Pierre-Simon Laplace. It is also sometimes called the double exponential distribution, because it can be thought of as two exponential distributions spliced together along the abscissa, although the term is also sometimes used to refer to the Gumbel distribution. The difference between two independent identically distributed exponential random variables is governed by a Laplace distribution, as is a Brownian motion evaluated at an exponentially distributed random time. Increments of Laplace motion or a variance gamma process evaluated over the time scale also have a Laplace distribution.

Beta-binomial distribution

In probability theory and statistics, the beta-binomial distribution is a family of discrete probability distributions on a finite support of non-negative integers arising when the probability of success in each of a fixed or known number of Bernoulli trials is either unknown or random. The beta-binomial distribution is the binomial distribution in which the probability of success at each of n trials is not fixed but randomly drawn from a beta distribution. It is frequently used in Bayesian statistics, empirical Bayes methods and classical statistics to capture overdispersion in binomial type distributed data.

Poisson distribution

In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time if these events occur with a known constant mean rate and independently of the time since the last event. It can also be used for the number of events in other types of intervals than time, and in dimension greater than 1.

In probability theory and statistics, the Poisson binomial distribution is the discrete probability distribution of a sum of independent Bernoulli trials that are not necessarily identically distributed. The concept is named after Siméon Denis Poisson.

A geometric stable distribution or geo-stable distribution is a type of leptokurtic probability distribution. Geometric stable distributions were introduced in Klebanov, L. B., Maniya, G. M., and Melamed, I. A. (1985), "A problem of Zolotarev and analogs of infinitely divisible and stable distributions in a scheme for summing a random number of random variables". These distributions are analogues of stable distributions for the case when the number of summands is random, independent of the distribution of the summands, and has a geometric distribution. The geometric stable distribution may be symmetric or asymmetric. A symmetric geometric stable distribution is also referred to as a Linnik distribution. The Laplace distribution and asymmetric Laplace distribution are special cases of the geometric stable distribution. The Mittag-Leffler distribution is also a special case of a geometric stable distribution.

In probability theory and statistics, an inverse distribution is the distribution of the reciprocal of a random variable. Inverse distributions arise in particular in the Bayesian context of prior distributions and posterior distributions for scale parameters. In the algebra of random variables, inverse distributions are special cases of the class of ratio distributions, in which the numerator random variable has a degenerate distribution.

Negative hypergeometric distribution

In probability theory and statistics, the negative hypergeometric distribution describes probabilities for when sampling from a finite population without replacement in which each sample can be classified into two mutually exclusive categories like Pass/Fail or Employed/Unemployed. As random selections are made from the population, each subsequent draw decreases the population, causing the probability of success to change with each draw. Unlike the standard hypergeometric distribution, which describes the number of successes in a fixed sample size, in the negative hypergeometric distribution, samples are drawn until $r$ failures have been found, and the distribution describes the probability of finding $k$ successes in such a sample. In other words, the negative hypergeometric distribution describes the likelihood of $k$ successes in a sample with exactly $r$ failures.

References

  1. Dekking, Michel (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. London: Springer. pp. 48–50, 61–62, 152. ISBN 9781852338961. OCLC 262680588.
  2. Holmes, Alexander; Illowsky, Barbara; Dean, Susan (29 November 2017). Introductory Business Statistics. Houston, Texas: OpenStax.
  3. Raikar, Sanat Pai (31 August 2023). "Geometric distribution". Encyclopedia Britannica.
  4. Park, Sung Y.; Bera, Anil K. (June 2009). "Maximum entropy autoregressive conditional heteroskedasticity model". Journal of Econometrics. 150 (2): 219–230. doi:10.1016/j.jeconom.2008.12.014.
  5. Gallager, R.; van Voorhis, D. (March 1975). "Optimal source codes for geometrically distributed integer alphabets (Corresp.)". IEEE Transactions on Information Theory. 21 (2): 228–230. doi:10.1109/TIT.1975.1055357. ISSN   0018-9448.
  6. Pitman, Jim (1993). Probability. Springer. p. 372.
  7. Ciardo, Gianfranco; Leemis, Lawrence M.; Nicol, David (1 June 1995). "On the minimum of independent geometrically distributed random variables". Statistics & Probability Letters. 23 (4): 313–326. doi:10.1016/0167-7152(94)00130-Z. hdl: 2060/19940028569 . S2CID   1505801.
  8. "Wolfram-Alpha: Computational Knowledge Engine". www.wolframalpha.com.
  9. Casella, George; Berger, Roger L. (2002). Statistical Inference (2nd ed.). pp. 312–315. ISBN 0-534-24312-6.
  10. "MLE Examples: Exponential and Geometric Distributions Old Kiwi - Rhea". www.projectrhea.org. Retrieved 2019-11-17.
  11. "3. Conjugate families of distributions" (PDF). Archived (PDF) from the original on 2010-04-08.
  12. "3.5 Geometric Probability Distribution using Excel Spreadsheet". Statistics LibreTexts. 2021-07-24. Retrieved 2023-10-20.