Lukacs's proportion-sum independence theorem

Last updated March 18, 2023

In statistics, Lukacs's proportion-sum independence theorem is a result that is used when studying proportions, in particular the Dirichlet distribution. It is named after Eugene Lukacs.^[1]

The theorem

If Y₁ and Y₂ are non-degenerate, independent, positive random variables, then the random variables

W = Y_{1} + Y_{2} \quad \text{and} \quad P = \frac{Y_{1}}{Y_{1} + Y_{2}}

are independently distributed if and only if both Y₁ and Y₂ have gamma distributions with the same scale parameter.
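The statement can be probed numerically. The following is a minimal sketch, not a proof: it draws gamma-distributed Y₁ and Y₂ with Python's standard `random.gammavariate(shape, scale)` and estimates the sample correlation between W and P. The helper names `pearson` and `sum_proportion_corr`, the shape and scale values, the sample size and the seed are all illustrative assumptions. Note that a near-zero correlation is only a necessary symptom of independence, while a clearly non-zero correlation does rule independence out.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def sum_proportion_corr(shape1, scale1, shape2, scale2, n=50_000, seed=0):
    """Estimate corr(W, P) for W = Y1 + Y2 and P = Y1 / (Y1 + Y2),
    with Y1 and Y2 drawn as independent gamma variables."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    ws, ps = [], []
    for _ in range(n):
        y1 = rng.gammavariate(shape1, scale1)
        y2 = rng.gammavariate(shape2, scale2)
        ws.append(y1 + y2)
        ps.append(y1 / (y1 + y2))
    return pearson(ws, ps)

# Same scale parameter (1.5): the theorem says W and P are independent.
same_scale = sum_proportion_corr(2.0, 1.5, 3.0, 1.5)
# Different scale parameters (1.5 vs 6.0): independence must fail.
diff_scale = sum_proportion_corr(2.0, 1.5, 3.0, 6.0)

print(f"same scale:       corr(W, P) = {same_scale:+.3f}")
print(f"different scales: corr(W, P) = {diff_scale:+.3f}")
```

With a common scale the estimated correlation should sit within sampling noise of zero, whereas with unequal scales it should be clearly negative, because a large Y₂ inflates W while shrinking P.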

Corollary

Suppose Y_i, i = 1, ..., k, are non-degenerate, independent, positive random variables. Then each of the k − 1 random variables

P_{i} = \frac{Y_{i}}{\sum_{j=1}^{k} Y_{j}}, \quad i = 1, \ldots, k-1,

is independent of

W = \sum_{j=1}^{k} Y_{j}

if and only if all the Y_i have gamma distributions with the same scale parameter.^[2]
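The corollary can be illustrated in the same spirit. The sketch below is an assumption-laden simulation rather than a derivation: `proportions_and_sum` and `pearson` are hypothetical helper names, and the shapes, scales, sample size and seed are arbitrary. It computes all k proportions (the corollary only needs k − 1 of them, since they sum to 1) and compares their sample correlations with W under a common scale and under mixed scales. It also checks the standard fact that, with a common scale, the proportions follow a Dirichlet distribution whose means are the shape parameters divided by their total.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def proportions_and_sum(shapes, scales, n=40_000, seed=1):
    """Draw n vectors of independent gamma variables Y_1..Y_k and return
    (list of k proportion samples P_i = Y_i / W, list of totals W)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    props = [[] for _ in shapes]
    totals = []
    for _ in range(n):
        ys = [rng.gammavariate(a, s) for a, s in zip(shapes, scales)]
        w = sum(ys)
        totals.append(w)
        for i, y in enumerate(ys):
            props[i].append(y / w)
    return props, totals

shapes = [1.0, 2.0, 4.0]

# Common scale: every P_i should be uncorrelated with W.
props, totals = proportions_and_sum(shapes, [2.0, 2.0, 2.0])
corrs_common = [pearson(p, totals) for p in props]
mean_props = [sum(p) / len(p) for p in props]  # Dirichlet means: shape_i / sum(shapes)

# Mixed scales: at least one P_i should be visibly correlated with W.
props_mixed, totals_mixed = proportions_and_sum(shapes, [2.0, 2.0, 10.0])
corrs_mixed = [pearson(p, totals_mixed) for p in props_mixed]

print("common scale corr(P_i, W):", [f"{c:+.3f}" for c in corrs_common])
print("mixed scales corr(P_i, W):", [f"{c:+.3f}" for c in corrs_mixed])
print("mean proportions:", [f"{m:.3f}" for m in mean_props])
```

As before, vanishing correlations do not by themselves establish independence; the simulation only shows behavior consistent with the corollary in one direction and a clear violation in the other.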

References

[1] Lukacs, Eugene (1955). "A characterization of the gamma distribution". Annals of Mathematical Statistics. 26 (2): 319–324. doi:10.1214/aoms/1177728549.
[2] Mosimann, James E. (1962). "On the compound multinomial distribution, the multivariate $\beta$ distribution, and correlation among proportions". Biometrika. 49 (1 and 2): 65–82. doi:10.1093/biomet/49.1-2.65. JSTOR 2333468.

Ng, W. N.; Tian, G-L; Tang, M-L (2011). Dirichlet and Related Distributions. John Wiley & Sons, Ltd. ISBN 978-0-470-68819-9. page 64. Lukacs's proportion-sum independence theorem and the corollary with a proof.
