In statistics, the generalized Dirichlet distribution (GD) is a generalization of the Dirichlet distribution with a more general covariance structure and almost twice the number of parameters. Random vectors with a GD distribution are completely neutral. [1]
The density function of $p_1, \ldots, p_{k-1}$ is

$$\left[\prod_{i=1}^{k-1} B(a_i, b_i)\right]^{-1} p_k^{b_{k-1}-1}\prod_{i=1}^{k-1}\left[ p_i^{a_i-1}\left(\sum_{j=i}^{k} p_j\right)^{b_{i-1}-a_i-b_i}\right]$$

where we define $p_k = 1 - \sum_{i=1}^{k-1} p_i$. Here $B(x, y)$ denotes the Beta function. This reduces to the standard Dirichlet distribution if $b_{i-1} = a_i + b_i$ for $2 \leq i \leq k-1$ ($b_0$ is arbitrary).
For example, if $k = 4$, then the density function of $p_1, p_2, p_3$ is

$$\frac{p_1^{a_1-1}\, p_2^{a_2-1}\, p_3^{a_3-1}\, p_4^{b_3-1}\left(p_2 + p_3 + p_4\right)^{b_1 - a_2 - b_2}\left(p_3 + p_4\right)^{b_2 - a_3 - b_3}}{B(a_1, b_1)\, B(a_2, b_2)\, B(a_3, b_3)}$$

where $p_1 + p_2 + p_3 < 1$ and $p_4 = 1 - p_1 - p_2 - p_3$.
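For concreteness, the following sketch (assuming NumPy and SciPy, with arbitrarily chosen parameter values) evaluates the general product form of the density and the expanded $k = 4$ expression above at the same point; the two evaluations agree:

```python
import numpy as np
from scipy.special import beta as B

def gd_pdf(p, a, b):
    """Connor and Mosimann's generalized Dirichlet density of (p_1, ..., p_{k-1})."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    p_full = np.append(p, 1.0 - p.sum())             # p_k = 1 - (p_1 + ... + p_{k-1})
    density = p_full[-1] ** (b[-1] - 1.0)
    for i in range(len(a)):                           # 0-based; the article's index is i + 1
        tail = p_full[i:].sum()                       # sum of p_j for j >= i
        expo = (b[i - 1] - a[i] - b[i]) if i > 0 else 0.0   # first factor has tail = 1, so b_0 is arbitrary
        density *= p_full[i] ** (a[i] - 1.0) * tail ** expo / B(a[i], b[i])
    return density

# Illustrative parameters and point for k = 4 (three free proportions).
a = [2.0, 3.0, 1.5]
b = [4.0, 2.5, 3.0]
p = [0.2, 0.3, 0.1]
p4 = 1.0 - sum(p)

# The expanded k = 4 density, written out exactly as in the text.
expanded = (p[0]**(a[0]-1) * p[1]**(a[1]-1) * p[2]**(a[2]-1) * p4**(b[2]-1)
            * (p[1] + p[2] + p4)**(b[0] - a[1] - b[1])
            * (p[2] + p4)**(b[1] - a[2] - b[2])
            / (B(a[0], b[0]) * B(a[1], b[1]) * B(a[2], b[2])))

print(gd_pdf(p, a, b), expanded)   # the two evaluations agree
```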
Connor and Mosimann define the PDF as they did for the following reason. Define random variables $z_1, \ldots, z_{k-1}$ with $z_1 = p_1$, $z_2 = p_2/(1 - p_1)$, $z_3 = p_3/(1 - (p_1 + p_2))$, and in general $z_i = p_i/(1 - (p_1 + \cdots + p_{i-1}))$. Then $p_1, \ldots, p_{k-1}$ have the generalized Dirichlet distribution as parametrized above, if the $z_i$ are independent beta with parameters $a_i, b_i$, $i = 1, \ldots, k-1$.
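This construction also gives a direct sampling recipe: draw the $z_i$ independently from the corresponding beta distributions and peel off the remaining probability mass one component at a time. A minimal sketch assuming NumPy (the parameter values are illustrative):

```python
import numpy as np

def sample_gd(a, b, size, seed=None):
    """Sample (p_1, ..., p_{k-1}) from the generalized Dirichlet distribution via
    the Connor and Mosimann construction: z_i ~ Beta(a_i, b_i) independently,
    p_1 = z_1 and p_i = z_i * (1 - p_1 - ... - p_{i-1})."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    z = rng.beta(a, b, size=(size, len(a)))   # independent beta draws, one column per i
    p = np.empty_like(z)
    remaining = np.ones(size)                 # 1 - (p_1 + ... + p_{i-1})
    for i in range(len(a)):
        p[:, i] = z[:, i] * remaining
        remaining -= p[:, i]
    return p

samples = sample_gd([2.0, 3.0, 1.5], [4.0, 2.5, 3.0], size=100_000, seed=0)
print(samples.mean(axis=0))   # compare with the mean formula given below
```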
Wong [2] gives the slightly more concise form for $x_1 + \cdots + x_k \leq 1$ and $x_i \geq 0$:

$$f(x_1, \ldots, x_k) = \prod_{i=1}^{k} \frac{x_i^{\alpha_i - 1}\left(1 - x_1 - \cdots - x_i\right)^{\gamma_i}}{B(\alpha_i, \beta_i)}$$

where $\gamma_j = \beta_j - \alpha_{j+1} - \beta_{j+1}$ for $1 \leq j \leq k-1$ and $\gamma_k = \beta_k - 1$. Note that Wong defines a distribution over a $k$-dimensional simplex (implicitly defining $x_{k+1} = 1 - \sum_{i=1}^{k} x_i$), while Connor and Mosimann use a $(k-1)$-dimensional space with $p_k = 1 - \sum_{i=1}^{k-1} p_i$ implicitly defined.
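The following sketch (again assuming SciPy, with the same illustrative point and parameters as above) evaluates Wong's product form; it returns the same value as the Connor and Mosimann evaluation, as expected:

```python
import numpy as np
from scipy.special import beta as B

def wong_gd_pdf(x, alpha, beta_):
    """Wong's form of the generalized Dirichlet density over (x_1, ..., x_k),
    with x_{k+1} = 1 - sum(x) implicit."""
    x, alpha, beta_ = (np.asarray(v, dtype=float) for v in (x, alpha, beta_))
    k = len(x)
    gamma = np.empty(k)
    gamma[:-1] = beta_[:-1] - alpha[1:] - beta_[1:]   # gamma_j = beta_j - alpha_{j+1} - beta_{j+1}
    gamma[-1] = beta_[-1] - 1.0                       # gamma_k = beta_k - 1
    density = 1.0
    for i in range(k):
        density *= (x[i] ** (alpha[i] - 1.0)
                    * (1.0 - x[: i + 1].sum()) ** gamma[i]
                    / B(alpha[i], beta_[i]))
    return density

# Same illustrative point and parameters as before; Wong's x is Connor and
# Mosimann's (p_1, ..., p_{k-1}), and alpha_i, beta_i play the roles of a_i, b_i.
print(wong_gd_pdf([0.2, 0.3, 0.1], [2.0, 3.0, 1.5], [4.0, 2.5, 3.0]))
```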
If $X = (X_1, \ldots, X_k) \sim GD_k(\alpha_1, \ldots, \alpha_k;\, \beta_1, \ldots, \beta_k)$, then

$$E\left[X_1^{r_1} X_2^{r_2} \cdots X_k^{r_k}\right] = \prod_{j=1}^{k} \frac{\Gamma(\alpha_j + \beta_j)\,\Gamma(\alpha_j + r_j)\,\Gamma(\beta_j + \delta_j)}{\Gamma(\alpha_j)\,\Gamma(\beta_j)\,\Gamma(\alpha_j + \beta_j + r_j + \delta_j)}$$

where $\delta_j = r_{j+1} + r_{j+2} + \cdots + r_k$ for $j = 1, 2, \ldots, k-1$ and $\delta_k = 0$. Thus

$$E(X_j) = \frac{\alpha_j}{\alpha_j + \beta_j}\prod_{m=1}^{j-1}\frac{\beta_m}{\alpha_m + \beta_m}.$$
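The moment formula is easy to check numerically. A brief sketch assuming SciPy, in which the closed-form mean $E(X_j)$ is compared with the general moment function evaluated at $r = e_j$ (the parameter values are the same illustrative ones as above):

```python
import numpy as np
from scipy.special import gamma as G

def gd_moment(r, alpha, beta_):
    """General mixed moment E[X_1^{r_1} ... X_k^{r_k}] of GD_k(alpha; beta)."""
    r, alpha, beta_ = (np.asarray(v, dtype=float) for v in (r, alpha, beta_))
    delta = np.append(np.cumsum(r[::-1])[::-1][1:], 0.0)   # delta_j = r_{j+1} + ... + r_k, delta_k = 0
    return np.prod(G(alpha + beta_) * G(alpha + r) * G(beta_ + delta)
                   / (G(alpha) * G(beta_) * G(alpha + beta_ + r + delta)))

alpha = np.array([2.0, 3.0, 1.5])
beta_ = np.array([4.0, 2.5, 3.0])

# E(X_j) from the closed-form mean and from the moment function with r = e_j.
for j in range(len(alpha)):
    mean_closed = alpha[j] / (alpha[j] + beta_[j]) * np.prod(beta_[:j] / (alpha[:j] + beta_[:j]))
    r = np.zeros(len(alpha))
    r[j] = 1.0
    print(mean_closed, gd_moment(r, alpha, beta_))   # the two values agree
```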
As stated above, if $b_{i-1} = a_i + b_i$ for $2 \leq i \leq k-1$, then the distribution reduces to a standard Dirichlet distribution. This condition is different from the usual case, in which setting the additional parameters of the generalized distribution to zero results in the original distribution. However, in the case of the GDD, this results in a very complicated density function.
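The reduction can be verified numerically: choosing Dirichlet parameters $c_1, \ldots, c_k$ and setting $a_i = c_i$ and $b_i = c_{i+1} + \cdots + c_k$ satisfies the condition, and the generalized Dirichlet density then coincides with the ordinary Dirichlet density. A small sketch assuming SciPy (the density helper is restated so the snippet runs on its own):

```python
import numpy as np
from scipy.special import beta as B
from scipy.stats import dirichlet

def gd_pdf(p, a, b):
    """Connor and Mosimann's generalized Dirichlet density of (p_1, ..., p_{k-1})."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    p_full = np.append(p, 1.0 - p.sum())
    density = p_full[-1] ** (b[-1] - 1.0)
    for i in range(len(a)):
        expo = (b[i - 1] - a[i] - b[i]) if i > 0 else 0.0   # first factor has tail 1, so b_0 is arbitrary
        density *= p_full[i] ** (a[i] - 1.0) * p_full[i:].sum() ** expo / B(a[i], b[i])
    return density

# Dirichlet parameters c, and (a, b) chosen so that b_{i-1} = a_i + b_i holds.
c = np.array([2.0, 3.0, 1.5, 4.0])
a = c[:-1]                              # a_i = c_i
b = np.cumsum(c[::-1])[::-1][1:]        # b_i = c_{i+1} + ... + c_k

p = np.array([0.2, 0.3, 0.1])
print(gd_pdf(p, a, b), dirichlet(c).pdf(p))   # the two densities agree
```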
Suppose $Y = (Y_1, \ldots, Y_k)$ is generalized Dirichlet, and that $(Z \mid Y)$ is multinomial with $n$ trials (here $Z = (Z_1, \ldots, Z_k)$). Writing $Y_j = y_j$ for $1 \leq j \leq k-1$ and $y_k = 1 - \sum_{j=1}^{k-1} y_j$, the joint posterior of $Y$ given $Z$ is a generalized Dirichlet distribution with

$$f\left(y_1, \ldots, y_{k-1} \mid z_1, \ldots, z_k\right) = \left[\prod_{j=1}^{k-1} B\left(a'_j, b'_j\right)\right]^{-1} y_k^{b'_{k-1}-1}\prod_{j=1}^{k-1}\left[ y_j^{a'_j-1}\left(\sum_{i=j}^{k} y_i\right)^{b'_{j-1}-a'_j-b'_j}\right]$$

where $a'_j = a_j + z_j$ and $b'_j = b_j + \sum_{i=j+1}^{k} z_i$ for $1 \leq j \leq k-1$.
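In other words, the update only requires the observed counts. A minimal sketch of this conjugate update assuming NumPy (the prior values and counts are hypothetical):

```python
import numpy as np

def gd_posterior(a, b, z):
    """Conjugate update of a generalized Dirichlet prior (a_1..a_{k-1}, b_1..b_{k-1})
    after observing multinomial counts z = (z_1, ..., z_k)."""
    a, b, z = (np.asarray(v, dtype=float) for v in (a, b, z))
    tail = np.cumsum(z[::-1])[::-1]        # tail[j] = z_{j+1} + ... + z_k (0-based j)
    return a + z[:-1], b + tail[1:]        # a'_j = a_j + z_j,  b'_j = b_j + sum of z_i for i > j

# Hypothetical prior over k = 4 categories (three parameter pairs) and counts from n = 12 trials.
a_post, b_post = gd_posterior([2.0, 3.0, 1.5], [4.0, 2.5, 3.0], z=[5, 2, 4, 1])
print(a_post, b_post)   # updated a'_j and b'_j for each of the k - 1 components
```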
Wong gives the following system as an example of how the Dirichlet and generalized Dirichlet distributions differ. He posits that a large urn contains balls of $k+1$ different colours. The proportion of each colour is unknown. Write $X_j$ for the proportion of the balls with colour $j$ in the urn.
Experiment 1. Analyst 1 believes that $X = (X_1, \ldots, X_{k+1}) \sim D(\alpha_1, \ldots, \alpha_{k+1})$ (ie, $X$ is Dirichlet with parameters $\alpha_i$). The analyst then makes $k+1$ glass boxes and puts $\alpha_i$ marbles of colour $i$ in box $i$ (it is assumed that the $\alpha_i$ are integers $\geq 1$). Then analyst 1 draws a ball from the urn, observes its colour (say colour $j$) and puts it in box $j$. He can identify the correct box because they are transparent and the colours of the marbles within are visible. The process continues until $n$ balls have been drawn. The posterior distribution is then Dirichlet, with parameters given by the number of marbles and balls in each box.
Experiment 2. Analyst 2 believes that $X$ follows a generalized Dirichlet distribution: $X \sim GD(\alpha_1, \ldots, \alpha_k;\, \beta_1, \ldots, \beta_k)$. All parameters are again assumed to be positive integers. The analyst makes $k+1$ wooden boxes. The boxes have two areas: one for balls and one for marbles. The balls are coloured but the marbles are not coloured. Then for $j = 1, \ldots, k$, he puts $\alpha_j$ balls of colour $j$, and $\beta_j$ marbles, into box $j$. He then puts a ball of colour $k+1$ in box $k+1$. The analyst then draws a ball from the urn. Because the boxes are wood, the analyst cannot tell which box to put the ball in (as he could in experiment 1 above); he also has a poor memory and cannot remember which box contains which colour balls. He has to discover which box is the correct one to put the ball in. He does this by opening box 1 and comparing the balls in it to the drawn ball. If the colours differ, the box is the wrong one. The analyst places a marble in box 1 and proceeds to box 2. He repeats the process until the balls in the box match the drawn ball, at which point he places the ball in the box with the other balls of matching colour. The analyst then draws another ball from the urn and repeats until $n$ balls are drawn. The posterior is then generalized Dirichlet, with the $\alpha$ parameters given by the number of balls, and the $\beta$ parameters by the number of marbles, in each box.
Note that in experiment 2, changing the order of the boxes has a non-trivial effect, unlike experiment 1.
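A short simulation of experiment 2 (assuming NumPy; the urn composition and prior values below are hypothetical) confirms that the ball and marble counts produced by this procedure coincide with the conjugate update given in the Bayesian analysis above. Box $k+1$ only ever receives balls of the implicit colour and carries no parameters, so it is not tracked:

```python
import numpy as np

rng = np.random.default_rng(0)

k = 3                                    # number of parametrized colours; colour k+1 is the implicit one
alpha = np.array([2, 3, 1])              # prior: alpha_j balls of colour j placed in box j
beta_ = np.array([4, 2, 3])              # prior: beta_j uncoloured marbles placed in box j
true_props = np.array([0.3, 0.2, 0.1, 0.4])   # unknown urn composition over the k+1 colours
n = 50                                   # number of balls drawn

balls = alpha.copy()                     # coloured balls currently in boxes 1..k
marbles = beta_.copy()                   # marbles currently in boxes 1..k

draws = rng.choice(k + 1, size=n, p=true_props)   # drawn colours, 0-indexed: 0..k
for colour in draws:
    # Boxes are opened in order; every mismatch costs that box one marble.
    marbles[:min(colour, k)] += 1
    if colour < k:
        balls[colour] += 1               # the ball joins the box with the matching colour

# The conjugate update computed directly from the draw counts z.
z = np.bincount(draws, minlength=k + 1)
print(balls,   alpha + z[:k])                          # posterior alpha values: identical
print(marbles, beta_ + np.cumsum(z[::-1])[::-1][1:])   # posterior beta values: identical
```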