In probability and statistics, the generalized beta distribution [1] is a continuous probability distribution with four shape parameters (it is customary to make the scale parameter explicit as a fifth parameter, while the location parameter is usually left implicit), including more than thirty named distributions as limiting or special cases. It has been used to model income distributions and stock returns, as well as in regression analysis. The exponential generalized beta (EGB) distribution follows directly from the GB and generalizes other common distributions.
A generalized beta random variable, Y, is defined by the following probability density function:

GB(y; a, b, c, p, q) = |a| y^(ap−1) (1 − (1−c)(y/b)^a)^(q−1) / [ b^(ap) B(p,q) (1 + c(y/b)^a)^(p+q) ] for 0 < y^a < b^a/(1−c),

and zero otherwise. Here the parameters satisfy 0 ≤ c ≤ 1, with b, p, and q positive. The function B(p,q) is the beta function. The parameter b is the scale parameter and can thus be set to 1 without loss of generality, but it is usually made explicit as in the function above (while the location parameter is usually left implicit and set to 0).
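As a brief numerical sketch (not part of the original text), the pdf above can be implemented directly and checked for normalization; the function name and parameter values below are illustrative choices.

```python
# Numerical sketch of the GB pdf defined above (function name and
# parameter values are illustrative, not from the source).
from scipy.integrate import quad
from scipy.special import beta as beta_fn

def gb_pdf(y, a, b, c, p, q):
    """GB(y; a, b, c, p, q) on the support 0 < y^a < b^a/(1-c)."""
    t = (y / b) ** a
    if t <= 0 or (1 - c) * t >= 1:
        return 0.0  # outside the support
    return (abs(a) * y ** (a * p - 1) * (1 - (1 - c) * t) ** (q - 1)
            / (b ** (a * p) * beta_fn(p, q) * (1 + c * t) ** (p + q)))

# The density should integrate to 1 over its support; for a > 0 the
# support is the interval (0, b / (1 - c)^(1/a)).
a, b, c, p, q = 2.0, 1.0, 0.5, 2.0, 3.0
upper = b / (1 - c) ** (1 / a)
total, _ = quad(lambda y: gb_pdf(y, a, b, c, p, q), 0, upper)
print(total)  # ≈ 1
```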
It can be shown that the hth moment can be expressed as follows:

E(Y^h) = ( b^h B(p + h/a, q) / B(p,q) ) · ₂F₁(p + h/a, h/a; p + q + h/a; c),

where ₂F₁ denotes the hypergeometric series (which converges for all h if c < 1, or for all h with h/a < q if c = 1).
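The moment expression can be verified against direct numerical integration of the density; the sketch below uses SciPy's Gauss hypergeometric function and the same illustrative parameter values as before.

```python
# Check of the moment formula above against numerical integration
# (parameter values are illustrative).
from scipy.integrate import quad
from scipy.special import beta as beta_fn, hyp2f1

def gb_pdf(y, a, b, c, p, q):
    t = (y / b) ** a
    if t <= 0 or (1 - c) * t >= 1:
        return 0.0
    return (abs(a) * y ** (a * p - 1) * (1 - (1 - c) * t) ** (q - 1)
            / (b ** (a * p) * beta_fn(p, q) * (1 + c * t) ** (p + q)))

a, b, c, p, q, h = 2.0, 1.0, 0.5, 2.0, 3.0, 1.0

# E(Y^h) = b^h B(p + h/a, q)/B(p,q) * 2F1(p + h/a, h/a; p + q + h/a; c)
closed_form = (b ** h * beta_fn(p + h / a, q) / beta_fn(p, q)
               * hyp2f1(p + h / a, h / a, p + q + h / a, c))

upper = b / (1 - c) ** (1 / a)
numeric, _ = quad(lambda y: y ** h * gb_pdf(y, a, b, c, p, q), 0, upper)
```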
The generalized beta encompasses many distributions as limiting or special cases. These are depicted in the GB distribution tree shown above. Listed below are its three direct descendants, or sub-families.
The generalized beta of the first kind is defined by the following pdf:

GB1(y; a, b, p, q) = |a| y^(ap−1) (1 − (y/b)^a)^(q−1) / [ b^(ap) B(p,q) ]

for 0 < y^a < b^a, where b, p, and q are positive. It is easily verified that

GB1(y; a, b, p, q) = GB(y; a, b, c = 0, p, q).
The moments of the GB1 are given by

E(Y^h) = b^h B(p + h/a, q) / B(p,q).
The GB1 includes the beta of the first kind (B1), generalized gamma (GG), and Pareto as special cases:

B1(y; b, p, q) = GB1(y; a = 1, b, p, q),
GG(y; a, β, p) = lim_{q→∞} GB1(y; a, b = q^(1/a) β, p, q),
PA(y; b, p) = GB1(y; a = −1, b, p, q = 1).
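The reduction of the GB1 to the standard beta distribution (a = 1, b = 1) can be confirmed numerically; the sketch below compares an explicit implementation against `scipy.stats.beta`.

```python
# GB1 with a = 1 and b = 1 reduces to the standard beta distribution;
# a quick numerical check against scipy.stats.beta.
import numpy as np
from scipy.special import beta as beta_fn
from scipy.stats import beta as beta_dist

def gb1_pdf(y, a, b, p, q):
    t = (y / b) ** a
    if t <= 0 or t >= 1:
        return 0.0  # outside 0 < y^a < b^a
    return (abs(a) * y ** (a * p - 1) * (1 - t) ** (q - 1)
            / (b ** (a * p) * beta_fn(p, q)))

p, q = 2.0, 3.0
ys = np.linspace(0.05, 0.95, 10)
gb1_vals = np.array([gb1_pdf(y, 1.0, 1.0, p, q) for y in ys])
beta_vals = beta_dist.pdf(ys, p, q)
```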
The GB2 is defined by the following pdf:

GB2(y; a, b, p, q) = |a| y^(ap−1) / [ b^(ap) B(p,q) (1 + (y/b)^a)^(p+q) ]

for y > 0, and zero otherwise. One can verify that

GB2(y; a, b, p, q) = GB(y; a, b, c = 1, p, q).
The moments of the GB2 are given by

E(Y^h) = b^h B(p + h/a, q − h/a) / B(p,q).
The GB2 is also known as the Generalized Beta Prime (Patil, Boswell, and Ratnaparkhi, 1984), [2] the transformed beta (Venter, 1983), [3] the generalized F (Kalbfleisch and Prentice, 1980), [4] and is a special case (μ≡0) of the Feller–Pareto (Arnold, 1983) [5] distribution. The GB2 nests common distributions such as the generalized gamma (GG), Burr type 3, Burr type 12, Dagum, lognormal, Weibull, gamma, Lomax, F statistic, Fisk or Rayleigh, chi-square, half-normal, half-Student's t, exponential, asymmetric log-Laplace, log-Laplace, power function, and the log-logistic. [6]
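One of the nested cases above, the log-logistic (Fisk) distribution, corresponds to the GB2 with p = q = 1; the sketch below checks this against `scipy.stats.fisk` (parameter values are illustrative).

```python
# The Fisk (log-logistic) distribution as the GB2 special case p = q = 1;
# numerical check against scipy.stats.fisk.
import numpy as np
from scipy.special import beta as beta_fn
from scipy.stats import fisk

def gb2_pdf(y, a, b, p, q):
    if y <= 0:
        return 0.0
    t = (y / b) ** a
    return (abs(a) * y ** (a * p - 1)
            / (b ** (a * p) * beta_fn(p, q) * (1 + t) ** (p + q)))

a_shape, scale = 3.0, 2.0
ys = np.linspace(0.5, 6.0, 12)
gb2_vals = np.array([gb2_pdf(y, a_shape, scale, 1.0, 1.0) for y in ys])
fisk_vals = fisk.pdf(ys, c=a_shape, scale=scale)  # scipy's c plays the role of a
```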
The beta family of distributions (B) is defined by: [1]

B(y; b, c, p, q) = y^(p−1) (1 − (1−c)(y/b))^(q−1) / [ b^p B(p,q) (1 + c(y/b))^(p+q) ]

for 0 < y < b/(1−c), and zero otherwise. Its relation to the GB is seen below:

B(y; b, c, p, q) = GB(y; a = 1, b, c, p, q).
The beta family includes the beta of the first and second kind [7] (B1 and B2, where the B2 is also referred to as the beta prime), which correspond to c = 0 and c = 1, respectively. Setting c = 0, b = 1 yields the standard two-parameter beta distribution.
The generalized gamma distribution (GG) is a limiting case of the GB2. Its pdf is defined by: [8]

GG(y; a, β, p) = lim_{q→∞} GB2(y; a, b = q^(1/a) β, p, q) = |a| y^(ap−1) e^(−(y/β)^a) / ( β^(ap) Γ(p) )

for y > 0, with the hth moments given by

E(Y^h) = β^h Γ(p + h/a) / Γ(p).
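The GG density and moment formula can be checked against SciPy's `gengamma`, whose (a, c) parameters correspond to (p, a) in the notation here, with `scale` playing the role of β (values below are illustrative).

```python
# The GG pdf and its hth moment formula checked against scipy.stats.gengamma.
import math
import numpy as np
from scipy.stats import gengamma

a_par, beta_scale, p_par = 2.0, 1.5, 0.8

def gg_pdf(y, a, beta, p):
    if y <= 0:
        return 0.0
    return (abs(a) * y ** (a * p - 1) * math.exp(-((y / beta) ** a))
            / (beta ** (a * p) * math.gamma(p)))

ys = np.linspace(0.1, 4.0, 10)
mine = np.array([gg_pdf(y, a_par, beta_scale, p_par) for y in ys])
scipys = gengamma.pdf(ys, a=p_par, c=a_par, scale=beta_scale)

# hth moment: E(Y^h) = beta^h * Gamma(p + h/a) / Gamma(p)
h = 2.0
moment_formula = beta_scale ** h * math.gamma(p_par + h / a_par) / math.gamma(p_par)
moment_scipy = gengamma.moment(2, a=p_par, c=a_par, scale=beta_scale)
```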
As noted earlier, the GB distribution family tree visually depicts the special and limiting cases (see McDonald and Xu (1995)).
The Pareto (PA) distribution is the following limiting case of the generalized gamma:

PA(y; β, θ) = lim_{a→−∞} GG(y; a, β, p = −θ/a) = θ β^θ / y^(θ+1) for y > β.
The power (P) distribution is the following limiting case of the generalized gamma:

P(y; β, θ) = lim_{a→∞} GG(y; a, β, p = θ/a) = θ y^(θ−1) / β^θ for 0 < y < β,

which is equivalent to the power function distribution with exponent θ and scale β.
The asymmetric log-Laplace distribution (also referred to as the double Pareto distribution [9]) is defined by: [10]

ALL(y; b, λ₁, λ₂) = lim_{a→∞} GB2(y; a, b, p = λ₁/a, q = λ₂/a), with pdf ( λ₁λ₂ / (λ₁ + λ₂) ) · y^(λ₁−1) / b^(λ₁) for 0 < y < b, and ( λ₁λ₂ / (λ₁ + λ₂) ) · b^(λ₂) / y^(λ₂+1) for y ≥ b,

where the hth moments are given by

E(Y^h) = b^h λ₁λ₂ / ( (λ₁ + h)(λ₂ − h) ) for −λ₁ < h < λ₂.

When λ₁ = λ₂, this is equivalent to the log-Laplace distribution.
Letting Y ~ GB(y; a, b, c, p, q) (without location parameter), the random variable Z = ln(Y), with re-parametrization δ = ln(b) and σ = 1/a, is distributed as an exponential generalized beta (EGB), with the following pdf:

EGB(z; δ, σ, c, p, q) = e^(p(z−δ)/σ) (1 − (1−c) e^((z−δ)/σ))^(q−1) / ( |σ| B(p,q) (1 + c e^((z−δ)/σ))^(p+q) )

for −∞ < (z−δ)/σ < ln(1/(1−c)), and zero otherwise. The EGB includes generalizations of the Gompertz, Gumbel, extreme value type I, logistic, Burr-2, exponential, and normal distributions. The parameter δ is the location parameter of the EGB (while b is the scale parameter of the GB), and σ is the scale parameter of the EGB (while a is a shape parameter of the GB); the EGB thus has three shape parameters (c, p, q).
Included is a figure showing the relationship between the EGB and its special and limiting cases. [11]
Using similar notation as above, the moment-generating function of the EGB can be expressed as follows:

M_Z(t) = e^(δt) ( B(p + tσ, q) / B(p,q) ) · ₂F₁(p + tσ, tσ; p + q + tσ; c).
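As a sketch under the assumption c = 1 (the case with support on the whole real line) and σ > 0, the EGB density can be implemented in log space for numerical stability and its normalization checked by integration; names and parameter values below are illustrative.

```python
# Sketch of the EGB density in the c = 1 case (sigma > 0); the support is
# then all of R, and the normalization is checked numerically.
import math
from scipy.integrate import quad
from scipy.special import beta as beta_fn

def egb2_pdf(z, delta, sigma, p, q):
    w = (z - delta) / sigma
    # log(1 + e^w) computed stably (softplus) to avoid overflow for large w
    softplus = math.log1p(math.exp(w)) if w < 50 else w
    log_pdf = p * w - (p + q) * softplus - math.log(abs(sigma) * beta_fn(p, q))
    return math.exp(log_pdf)

delta, sigma, p, q = 0.5, 1.2, 2.0, 3.0
total, _ = quad(lambda z: egb2_pdf(z, delta, sigma, p, q), -math.inf, math.inf)
```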
A multivariate generalized beta pdf extends the univariate distributions listed above. For n variables y = (y_1, …, y_n), define parameter vectors a = (a_1, …, a_n), b = (b_1, …, b_n), c = (c_1, …, c_n), and p = (p_1, …, p_n), where each b_i and p_i is positive, and 0 ≤ c_i ≤ 1. The parameter q is assumed to be positive, and define the function B(p_1, …, p_n, q) = Γ(p_1)⋯Γ(p_n)Γ(q) / Γ(p_1 + ⋯ + p_n + q).
The pdf of the multivariate generalized beta () may be written as follows:
where for and when = .
Like the univariate generalized beta distribution, the multivariate generalized beta includes several distributions in its family as special cases. By imposing certain constraints on the parameter vectors, the following distributions can be easily derived. [12]
When each c_i is equal to 0, the MGB function simplifies to the multivariate generalized beta of the first kind (MGB1), which is defined by:
where .
In the case where each c_i is equal to 1, the MGB simplifies to the multivariate generalized beta of the second kind (MGB2), with the pdf defined below:
when y_i > 0 for all i.
The multivariate generalized gamma (MGG) pdf can be derived from the MGB pdf by substituting b_i = q^(1/a_i) β_i and taking the limit as q → ∞, with Stirling's approximation for the gamma function, yielding the following function:
which is the product of independently but not necessarily identically distributed generalized gamma random variables.
Similar pdfs can be constructed for other variables in the family tree shown above, simply by placing an M in front of each pdf name and finding the appropriate limiting and special cases of the MGB as indicated by the constraints and limits of the univariate distribution. Additional multivariate pdfs in the literature include the Dirichlet distribution (standard form), the multivariate inverted beta and inverted Dirichlet (Dirichlet type 2) distribution, and the multivariate Burr distribution.
The marginal density functions of the MGB1 and MGB2, respectively, are the generalized beta distributions of the first and second kind, and are given as follows:
The flexibility provided by the GB family has been used in modeling the distributions of income and stock returns, among other applications; members of the EGB family have seen similar use. [1] [6]
The GB2 and several of its special and limiting cases have been widely used as models for the distribution of income. For some early examples see Thurow (1970), [13] Dagum (1977), [14] Singh and Maddala (1976), [15] and McDonald (1984). [6] Maximum likelihood estimation using individual, grouped, or top-coded data is easily performed with these distributions.
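As an illustrative sketch of such an estimation (simulated data, not real income figures), the Fisk distribution, a GB2 special case often applied to income, can be fit by maximum likelihood with SciPy:

```python
# Illustrative maximum likelihood fit of the Fisk (log-logistic) distribution,
# a GB2 special case used for income data, to simulated observations.
import numpy as np
from scipy.stats import fisk

rng = np.random.default_rng(0)
true_shape, true_scale = 3.0, 2.0
incomes = fisk.rvs(c=true_shape, scale=true_scale, size=20000, random_state=rng)

# Fix the location at 0 (income is nonnegative); estimate shape and scale by MLE.
shape_hat, loc_hat, scale_hat = fisk.fit(incomes, floc=0)
```

With a sample this large, the estimates should land close to the true parameters.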
Measures of inequality, such as the Gini index (G), Pietra index (P), and Theil index (T), can be expressed in terms of the distributional parameters, as given by McDonald and Ransom (2008). [16]
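Such indices can also be evaluated numerically from the cdf. The sketch below uses the standard identity G = 1 − (1/μ)∫₀^∞ (1 − F(y))² dy and the gamma distribution (a GB special case), whose Gini index has the known closed form Γ(p + 1/2) / (Γ(p + 1)√π); the function name is our own.

```python
# Gini index from a cdf via G = 1 - (1/mu) * integral of (1 - F(y))^2,
# checked against the closed form for the gamma distribution.
import math
from scipy.integrate import quad
from scipy.stats import gamma

def gini_from_cdf(dist):
    """Gini index of a nonnegative scipy.stats distribution."""
    mu = dist.mean()
    integral, _ = quad(lambda y: dist.sf(y) ** 2, 0, math.inf)
    return 1 - integral / mu

p = 2.0
numeric = gini_from_cdf(gamma(p))
closed_form = math.gamma(p + 0.5) / (math.gamma(p + 1) * math.sqrt(math.pi))
```

For p = 1 (the exponential distribution) the same routine returns the well-known value 1/2.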
The hazard function, h(s), where f(s) is a pdf and F(s) the corresponding cdf, is defined by

h(s) = f(s) / ( 1 − F(s) ).
Hazard functions are useful in many applications, such as modeling unemployment duration, the failure time of products or life expectancy. Taking a specific example, if s denotes the length of life, then h(s) is the rate of death at age s, given that an individual has lived up to age s. The shape of the hazard function for human mortality data might appear as follows: decreasing mortality in the first few months of life, then a period of relatively constant mortality and finally an increasing probability of death at older ages.
Special cases of the generalized beta distribution offer more flexibility in modeling the shape of the hazard function, which can call for "∪" or "∩" shapes or strictly increasing (denoted by I) or decreasing (denoted by D) lines. The generalized gamma is "∪"-shaped for a > 1 and p < 1/a, "∩"-shaped for a < 1 and p > 1/a, I-shaped for a > 1 and p > 1/a, and D-shaped for a < 1 and p < 1/a. [17] This is summarized in the figure below. [18] [19]
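These shapes can be inspected numerically; the sketch below evaluates h(s) = f(s)/(1 − F(s)) for the generalized gamma with `scipy.stats.gengamma` (its (a, c) parameters correspond to (p, a) in the text) at two illustrative points.

```python
# Numerical illustration of generalized gamma hazard shapes:
# h(s) = f(s) / (1 - F(s)), via scipy.stats.gengamma.
import numpy as np
from scipy.stats import gengamma

def gg_hazard(s, a, p, beta=1.0):
    # scipy's a is p here; scipy's c is a here
    return (gengamma.pdf(s, a=p, c=a, scale=beta)
            / gengamma.sf(s, a=p, c=a, scale=beta))

s = np.array([0.5, 1.5])

# a > 1 and p > 1/a: increasing (I-shaped) hazard, e.g. a = 2, p = 1 (Weibull).
h_inc = gg_hazard(s, a=2.0, p=1.0)

# a < 1 and p < 1/a: decreasing (D-shaped) hazard, e.g. a = 0.5, p = 1.
h_dec = gg_hazard(s, a=0.5, p=1.0)
```

In the a = 2, p = 1 case the hazard reduces to the Weibull form h(s) = 2s.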