Pearson distribution

Diagram of the Pearson system, showing distributions of types I, III, VI, V, and IV in terms of β1 (squared skewness) and β2 (traditional kurtosis)

The Pearson distribution is a family of continuous probability distributions. It was first published by Karl Pearson in 1895 and subsequently extended by him in 1901 and 1916 in a series of articles on biostatistics.



The Pearson system was originally devised in an effort to model visibly skewed observations. It was well known at the time how to adjust a theoretical model to fit the first two cumulants or moments of observed data: Any probability distribution can be extended straightforwardly to form a location-scale family. Except in pathological cases, a location-scale family can be made to fit the observed mean (first cumulant) and variance (second cumulant) arbitrarily well. However, it was not known how to construct probability distributions in which the skewness (standardized third cumulant) and kurtosis (standardized fourth cumulant) could be adjusted equally freely. This need became apparent when trying to fit known theoretical models to observed data that exhibited skewness. Pearson's examples include survival data, which are usually asymmetric.

In his original paper, Pearson (1895, p. 360) identified four types of distributions (numbered I through IV) in addition to the normal distribution (which was originally known as type V). The classification depended on whether the distributions were supported on a bounded interval, on a half-line, or on the whole real line; and whether they were potentially skewed or necessarily symmetric. A second paper (Pearson 1901) fixed two omissions: it redefined the type V distribution (originally just the normal distribution, but now the inverse-gamma distribution) and introduced the type VI distribution. Together the first two papers cover the five main types of the Pearson system (I, III, IV, V, and VI). In a third paper, Pearson (1916) introduced further special cases and subtypes (VII through XII).

Rhind (1909, pp. 430–432) devised a simple way of visualizing the parameter space of the Pearson system, which was subsequently adopted by Pearson (1916, plate 1 and pp. 430ff., 448ff.). The Pearson types are characterized by two quantities, commonly referred to as β1 and β2. The first is the square of the skewness: β1 = γ1², where γ1 is the skewness, or third standardized moment. The second is the traditional kurtosis, or fourth standardized moment: β2 = γ2 + 3. (Modern treatments define kurtosis γ2 in terms of cumulants instead of moments, so that for a normal distribution we have γ2 = 0 and β2 = 3. Here we follow the historical precedent and use β2.) The diagram shows which Pearson type a given concrete distribution (identified by a point (β1, β2)) belongs to.
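As a rough illustration of the two coordinates, the following sketch estimates β1 and β2 from a sample using plain (biased) central moments. The function name `pearson_betas` is hypothetical, and other moment estimators could reasonably be used instead.

```python
def pearson_betas(data):
    """Return (beta1, beta2): squared skewness and traditional kurtosis.

    Uses the plain central moments mu_k = (1/n) * sum((x - mean)**k);
    this is one possible estimator, chosen here for simplicity.
    """
    values = list(data)
    n = len(values)
    if n == 0:
        raise ValueError("data must not be empty")
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu3 = sum((x - mean) ** 3 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n
    if mu2 == 0:
        raise ValueError("variance is zero; beta1 and beta2 are undefined")
    beta1 = mu3 ** 2 / mu2 ** 3   # square of the skewness gamma1
    beta2 = mu4 / mu2 ** 2        # equals gamma2 + 3
    return beta1, beta2
```

The resulting pair (β1, β2) is the point one would look up in the diagram of the Pearson system.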

Many of the skewed and/or non-mesokurtic distributions familiar to us today were still unknown in the early 1890s. What is now known as the beta distribution had been used by Thomas Bayes as a posterior distribution of the parameter of a Bernoulli distribution in his 1763 work on inverse probability. The beta distribution gained prominence due to its membership in Pearson's system and was known until the 1940s as the Pearson type I distribution. [1] (Pearson's type II distribution is a special case of type I, but is usually no longer singled out.) The gamma distribution originated from Pearson's work (Pearson 1893, p. 331; Pearson 1895, pp. 357, 360, 373–376) and was known as the Pearson type III distribution, before acquiring its modern name in the 1930s and 1940s. [2] Pearson's 1895 paper introduced the type IV distribution, which contains Student's t-distribution as a special case, predating William Sealy Gosset's subsequent use by several years. His 1901 paper introduced the inverse-gamma distribution (type V) and the beta prime distribution (type VI).


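A Pearson density p is defined to be any valid solution to the differential equation (cf. Pearson 1895, p. 381)

$$\frac{p'(x)}{p(x)} + \frac{a + (x - \lambda)}{b_2 (x - \lambda)^2 + b_1 (x - \lambda) + b_0} = 0. \qquad (1)$$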

According to Ord, [3] Pearson devised the underlying form of Equation (1) on the basis of, firstly, the formula for the derivative of the logarithm of the density function of the normal distribution (which gives a linear function) and, secondly, from a recurrence relation for values in the probability mass function of the hypergeometric distribution (which yields the linear-divided-by-quadratic structure).

In Equation (1), the parameter a determines a stationary point, and hence under some conditions a mode of the distribution, since

$$p'(\lambda - a) = 0$$

follows directly from the differential equation.

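Since we are confronted with a first-order linear differential equation with variable coefficients, its solution is straightforward. Writing x for x − λ, we obtain

$$p(x) \propto \exp\left( -\int \frac{x + a}{b_2 x^2 + b_1 x + b_0} \, dx \right).$$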

The integral in this solution simplifies considerably when certain special cases of the integrand are considered. Pearson (1895, p. 367) distinguished two main cases, determined by the sign of the discriminant (and hence the number of real roots) of the quadratic function

$$f(x) = b_2 x^2 + b_1 x + b_0. \qquad (2)$$
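The case distinction can be sketched in a few lines. The helper below is illustrative only: the name `classify_quadratic` and its return convention are assumptions, and it takes the quadratic to be f(x) = b2 x² + b1 x + b0 with b2 ≠ 0.

```python
import math

def classify_quadratic(b0, b1, b2):
    """Classify f(x) = b2*x**2 + b1*x + b0 by the sign of its discriminant.

    Returns ("case 1", None) when f has no real roots, and
    ("case 2", (lower_root, upper_root)) otherwise.
    """
    if b2 == 0:
        raise ValueError("b2 must be nonzero for f to be a genuine quadratic")
    disc = b1 * b1 - 4.0 * b2 * b0
    if disc < 0:
        return "case 1", None          # negative discriminant
    root = math.sqrt(disc)
    r1 = (-b1 - root) / (2.0 * b2)
    r2 = (-b1 + root) / (2.0 * b2)
    return "case 2", (min(r1, r2), max(r1, r2))
```

For example, `classify_quadratic(1.0, 0.0, 1.0)` falls under case 1, while `classify_quadratic(-2.0, 1.0, 1.0)` falls under case 2 with roots −2 and 1.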

Particular types of distribution

Case 1, negative discriminant

The Pearson type IV distribution

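If the discriminant of the quadratic function (2) is negative ($b_1^2 - 4 b_2 b_0 < 0$), it has no real roots. Then define

$$y = x + \frac{b_1}{2 b_2}, \qquad \alpha = \frac{\sqrt{4 b_2 b_0 - b_1^2}}{2 b_2}.$$

Observe that α is a well-defined real number and α ≠ 0, because by assumption $4 b_2 b_0 - b_1^2 > 0$ and therefore b2 ≠ 0. Applying these substitutions, the quadratic function (2) is transformed into

$$f(x) = b_2 \left( y^2 + \alpha^2 \right).$$

The absence of real roots is obvious from this formulation, because α² is necessarily positive.

We now express the solution to the differential equation (1), taken in the form p′(x)/p(x) = −(a + x)/f(x) with x measured from λ, as a function of y:

$$p(y) \propto \exp\left( -\frac{1}{b_2} \int \frac{y + a - \frac{b_1}{2 b_2}}{y^2 + \alpha^2} \, dy \right).$$

Pearson (1895, p. 362) called this the "trigonometrical case", because the integral

$$\int \frac{y + \frac{2 b_2 a - b_1}{2 b_2}}{y^2 + \alpha^2} \, dy = \frac{1}{2} \ln\left( y^2 + \alpha^2 \right) + \frac{2 b_2 a - b_1}{2 b_2 \alpha} \arctan\left( \frac{y}{\alpha} \right) + C_0$$

involves the inverse trigonometric arctan function. Then

$$p(y) \propto \exp\left[ -\frac{1}{2 b_2} \ln\left( 1 + \frac{y^2}{\alpha^2} \right) - \frac{2 b_2 a - b_1}{2 b_2^2 \alpha} \arctan\left( \frac{y}{\alpha} \right) \right].$$

Finally, let

$$m = \frac{1}{2 b_2}, \qquad \nu = \frac{2 b_2 a - b_1}{2 b_2^2 \alpha}.$$

Applying these substitutions, we obtain the parametric function:

$$p(y) \propto \left[ 1 + \frac{y^2}{\alpha^2} \right]^{-m} \exp\left[ -\nu \arctan\left( \frac{y}{\alpha} \right) \right].$$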
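This unnormalized density has support on the entire real line. It depends on a scale parameter α > 0 and shape parameters m > 1/2 and ν. One parameter was lost when we chose to find the solution to the differential equation (1) as a function of y rather than x. We therefore reintroduce a fourth parameter, namely the location parameter λ. We have thus derived the density of the Pearson type IV distribution:

$$p(x) = \frac{\left| \dfrac{\Gamma\left( m + \frac{\nu}{2} i \right)}{\Gamma(m)} \right|^2}{\alpha \, \mathrm{B}\left( m - \frac{1}{2}, \frac{1}{2} \right)} \left[ 1 + \left( \frac{x - \lambda}{\alpha} \right)^2 \right]^{-m} \exp\left[ -\nu \arctan\left( \frac{x - \lambda}{\alpha} \right) \right].$$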

The normalizing constant involves the complex Gamma function (Γ) and the Beta function (B). Notice that the location parameter λ here is not the same as the original location parameter introduced in the general formulation, but is related via

$$\lambda = \lambda_{\text{original}} - \frac{b_1}{2 b_2}.$$
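Because the closed-form constant needs a Gamma function with complex argument, a numerical check can be handy. The sketch below assumes the type IV kernel [1 + ((x − λ)/α)²]^(−m) exp[−ν arctan((x − λ)/α)] and estimates its normalizing constant with a simple trapezoid rule. The function names, the integration range, and the step count are illustrative choices, adequate only when m is comfortably above 1/2 so that the tails decay quickly.

```python
import math

def type4_kernel(x, m, nu, alpha=1.0, lam=0.0):
    """Unnormalized Pearson type IV density at x (assumed kernel form)."""
    z = (x - lam) / alpha
    return (1.0 + z * z) ** (-m) * math.exp(-nu * math.atan(z))

def type4_normalizer(m, nu, alpha=1.0, lam=0.0, half_width=200.0, steps=200_000):
    """Estimate k such that k * type4_kernel integrates to 1.

    Composite trapezoid rule on [lam - half_width*alpha, lam + half_width*alpha];
    the truncated tails are ignored, so heavy tails need a wider range.
    """
    lo = lam - half_width * alpha
    h = 2.0 * half_width * alpha / steps
    total = 0.5 * (type4_kernel(lo, m, nu, alpha, lam)
                   + type4_kernel(lo + steps * h, m, nu, alpha, lam))
    for i in range(1, steps):
        total += type4_kernel(lo + i * h, m, nu, alpha, lam)
    return 1.0 / (total * h)
```

With ν = 0 and m = 2 the estimate should agree with 1/(α B(3/2, 1/2)) = 2/π, which gives a quick sanity check on the symmetric special case.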

The Pearson type VII distribution

Plot of Pearson type VII densities with λ = 0, σ = 1, and: γ2 = ∞ (red); γ2 = 4 (blue); and γ2 = 0 (black)

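The shape parameter ν of the Pearson type IV distribution controls its skewness. If we fix its value at zero, we obtain a symmetric three-parameter family. This special case is known as the Pearson type VII distribution (cf. Pearson 1916, p. 450). Its density is

$$p(x) = \frac{1}{\alpha \, \mathrm{B}\left( m - \frac{1}{2}, \frac{1}{2} \right)} \left[ 1 + \left( \frac{x - \lambda}{\alpha} \right)^2 \right]^{-m},$$

where B is the Beta function.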

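An alternative parameterization (and slight specialization) of the type VII distribution is obtained by letting

$$\alpha = \sigma \sqrt{2m - 3},$$

which requires m > 3/2. This entails a minor loss of generality but ensures that the variance of the distribution exists and is equal to σ². Now the parameter m only controls the kurtosis of the distribution. If m approaches infinity as λ and σ are held constant, the normal distribution arises as a special case:

$$\begin{aligned} \lim_{m \to \infty} p(x) &= \lim_{m \to \infty} \frac{1}{\sigma \sqrt{2m - 3} \, \mathrm{B}\left( m - \frac{1}{2}, \frac{1}{2} \right)} \left[ 1 + \frac{1}{2m - 3} \left( \frac{x - \lambda}{\sigma} \right)^2 \right]^{-m} \\ &= \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \lambda}{\sigma} \right)^2 \right]. \end{aligned}$$

This is the density of a normal distribution with mean λ and standard deviation σ.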
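The convergence to the normal density can be observed numerically. The following sketch assumes the type VII density 1/(α B(m − 1/2, 1/2)) [1 + ((x − λ)/α)²]^(−m) with α = σ √(2m − 3). The function names and the evaluation points are illustrative, and log-Gamma is used only to avoid overflow for large m.

```python
import math

def type7_pdf(x, m, lam=0.0, sigma=1.0):
    """Type VII density in the (lam, sigma, m) parameterization (requires m > 3/2)."""
    if m <= 1.5:
        raise ValueError("this parameterization requires m > 3/2")
    alpha = sigma * math.sqrt(2.0 * m - 3.0)
    # log B(m - 1/2, 1/2) computed through log-Gamma for numerical stability
    log_beta = math.lgamma(m - 0.5) + math.lgamma(0.5) - math.lgamma(m)
    z = (x - lam) / alpha
    return math.exp(-log_beta - m * math.log1p(z * z)) / alpha

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def max_gap(m, points=(-3.0, -1.0, 0.0, 0.5, 2.0)):
    """Largest absolute difference between the two densities at a few points."""
    return max(abs(type7_pdf(x, m) - normal_pdf(x)) for x in points)
```

Evaluating `max_gap` for increasing m (say 10, 1000, 1e6) should show the gap shrinking towards zero.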

It is convenient to require that m > 5/2 and to let

$$m = \frac{5}{2} + \frac{3}{\gamma_2}.$$

This is another specialization, and it guarantees that the first four moments of the distribution exist. More specifically, the Pearson type VII distribution parameterized in terms of (λ, σ, γ2) has a mean of λ, standard deviation of σ, skewness of zero, and positive excess kurtosis of γ2.
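A minimal numerical check of this parameterization, assuming the relation m = 5/2 + 3/γ2 and the scale α = σ √(2m − 3): the sketch integrates the second and fourth moments with a trapezoid rule and recovers the excess kurtosis. All function names and the integration settings are illustrative assumptions.

```python
import math

def m_from_excess_kurtosis(gamma2):
    """Shape parameter m for a target positive excess kurtosis gamma2."""
    if gamma2 <= 0:
        raise ValueError("gamma2 must be positive")
    return 2.5 + 3.0 / gamma2

def type7_pdf(x, m, lam=0.0, sigma=1.0):
    """Type VII density with alpha = sigma * sqrt(2m - 3)."""
    alpha = sigma * math.sqrt(2.0 * m - 3.0)
    log_beta = math.lgamma(m - 0.5) + math.lgamma(0.5) - math.lgamma(m)
    z = (x - lam) / alpha
    return math.exp(-log_beta - m * math.log1p(z * z)) / alpha

def central_moment(k, m, half_width=100.0, steps=100_000):
    """Trapezoid estimate of E[X**k] for lam = 0, sigma = 1 (tails truncated)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** k * type7_pdf(x, m)
    return total * h

def excess_kurtosis(m):
    """Fourth standardized moment minus 3, computed numerically."""
    return central_moment(4, m) / central_moment(2, m) ** 2 - 3.0
```

For instance, `excess_kurtosis(m_from_excess_kurtosis(1.0))` should come out close to 1, and the second moment close to σ² = 1.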

Student's t-distribution

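The Pearson type VII distribution is equivalent to the non-standardized Student's t-distribution with parameters ν > 0, μ, σ² by applying the following substitutions to its original parameterization:

$$\lambda = \mu, \qquad \alpha = \sqrt{\nu \sigma^2}, \qquad m = \frac{\nu + 1}{2}.$$

Observe that the constraint m > 1/2 is satisfied.

The resulting density is

$$p(x \mid \mu, \sigma^2, \nu) = \frac{1}{\sqrt{\nu \sigma^2} \, \mathrm{B}\left( \frac{\nu}{2}, \frac{1}{2} \right)} \left( 1 + \frac{1}{\nu} \frac{(x - \mu)^2}{\sigma^2} \right)^{-\frac{\nu + 1}{2}},$$

which is easily recognized as the density of a Student's t-distribution.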

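This implies that the Pearson type VII distribution subsumes the standard Student's t-distribution and also the standard Cauchy distribution. In particular, the standard Student's t-distribution arises as a subcase, when μ = 0 and σ² = 1, equivalent to the following substitutions:

$$\lambda = 0, \qquad \alpha = \sqrt{\nu}, \qquad m = \frac{\nu + 1}{2}.$$

The density of this restricted one-parameter family is a standard Student's t:

$$p(x) = \frac{1}{\sqrt{\nu} \, \mathrm{B}\left( \frac{\nu}{2}, \frac{1}{2} \right)} \left( 1 + \frac{x^2}{\nu} \right)^{-\frac{\nu + 1}{2}}.$$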
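The equivalence can be checked numerically. This sketch assumes the type VII density 1/(α B(m − 1/2, 1/2)) [1 + ((x − λ)/α)²]^(−m) and the substitutions λ = μ, α = √(ν σ²), m = (ν + 1)/2, and compares the result with the textbook Student's t density written with Gamma functions. The function names are hypothetical.

```python
import math

def type7_pdf_original(x, lam, alpha, m):
    """Type VII density in its original (lam, alpha, m) parameterization."""
    log_beta = math.lgamma(m - 0.5) + math.lgamma(0.5) - math.lgamma(m)
    z = (x - lam) / alpha
    return math.exp(-log_beta - m * math.log1p(z * z)) / alpha

def student_t_pdf(x, nu, mu=0.0, sigma=1.0):
    """Non-standardized Student's t density with nu degrees of freedom."""
    log_c = (math.lgamma(0.5 * (nu + 1.0)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    z = (x - mu) / sigma
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))

def t_via_type7(x, nu, mu=0.0, sigma=1.0):
    """Student's t density obtained from type VII by the assumed substitutions."""
    return type7_pdf_original(x, lam=mu, alpha=math.sqrt(nu) * sigma,
                              m=0.5 * (nu + 1.0))
```

Setting ν = 1, μ = 0, σ = 1 gives the standard Cauchy density 1/(π (1 + x²)), which is a convenient spot check.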

Case 2, non-negative discriminant

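If the quadratic function (2) has a non-negative discriminant ($b_1^2 - 4 b_2 b_0 \geq 0$), it has real roots a1 and a2 (not necessarily distinct):

$$a_1 = \frac{-b_1 - \sqrt{b_1^2 - 4 b_2 b_0}}{2 b_2}, \qquad a_2 = \frac{-b_1 + \sqrt{b_1^2 - 4 b_2 b_0}}{2 b_2}.$$

In the presence of real roots the quadratic function (2) can be written as

$$f(x) = b_2 (x - a_1)(x - a_2),$$

and the solution to the differential equation (1), taken in the form p′(x)/p(x) = −(a + x)/f(x) with x measured from λ, is therefore

$$p(x) \propto \exp\left( -\frac{1}{b_2} \int \frac{x + a}{(x - a_1)(x - a_2)} \, dx \right).$$

Pearson (1895, p. 362) called this the "logarithmic case", because the integral

$$\int \frac{x + a}{(x - a_1)(x - a_2)} \, dx = \frac{(a_1 + a) \ln(x - a_1) - (a_2 + a) \ln(x - a_2)}{a_1 - a_2} + C$$

involves only the logarithm function and not the arctan function as in the previous case.

Using the substitution

$$\nu = \frac{1}{b_2 (a_1 - a_2)},$$

we obtain the following solution to the differential equation (1):

$$p(x) \propto (x - a_1)^{-\nu (a_1 + a)} (x - a_2)^{\nu (a_2 + a)}.$$

Since this density is only known up to a hidden constant of proportionality, that constant can be changed and the density written as follows:

$$p(x) \propto \left( 1 - \frac{x}{a_1} \right)^{-\nu (a_1 + a)} \left( 1 - \frac{x}{a_2} \right)^{\nu (a_2 + a)}.$$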

The Pearson type I distribution

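The Pearson type I distribution (a generalization of the beta distribution) arises when the roots of the quadratic equation (2) are of opposite sign, that is, $a_1 < 0 < a_2$. Then the solution p is supported on the interval $(a_1, a_2)$. Apply the substitution

$$x = a_1 + y (a_2 - a_1),$$

where $0 < y < 1$, which yields a solution in terms of y that is supported on the interval (0, 1):

$$p(y) \propto \left( \frac{a_1 - a_2}{a_1} \, y \right)^{-\nu (a_1 + a)} \left( \frac{a_2 - a_1}{a_2} \, (1 - y) \right)^{\nu (a_2 + a)}.$$

One may define:

$$m_1 = -\frac{a_1 + a}{b_2 (a_1 - a_2)}, \qquad m_2 = \frac{a_2 + a}{b_2 (a_1 - a_2)}.$$

Regrouping constants and parameters, this simplifies to:

$$p(y) \propto y^{m_1} (1 - y)^{m_2}.$$

Thus $\frac{x - \lambda - a_1}{a_2 - a_1}$ follows a beta distribution $\mathrm{B}(m_1 + 1, m_2 + 1)$ with $\lambda = \mu_1 - (a_2 - a_1) \frac{m_1 + 1}{m_1 + m_2 + 2} - a_1$, where μ1 denotes the mean of x. It turns out that m1, m2 > −1 is necessary and sufficient for p to be a proper probability density function.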
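The type I exponents can be sanity-checked against the differential equation. The sketch below assumes the sign convention p′(x)/p(x) = −(a + x)/(b2 (x − a1)(x − a2)), with x measured from λ, and the exponents m1 = −(a1 + a)/(b2 (a1 − a2)) and m2 = (a2 + a)/(b2 (a1 − a2)). Under a different sign convention for a the formulas change accordingly, and the function names are hypothetical.

```python
def type1_exponents(a, a1, a2, b2):
    """Exponents (m1, m2) of p(y) proportional to y**m1 * (1 - y)**m2 (assumed convention)."""
    if not a1 < 0.0 < a2:
        raise ValueError("type I requires roots of opposite sign: a1 < 0 < a2")
    nu = 1.0 / (b2 * (a1 - a2))
    m1 = -nu * (a1 + a)
    m2 = nu * (a2 + a)
    return m1, m2

def log_density_slope(x, a, a1, a2, b2):
    """d/dx log p(x) for p(x) proportional to (1 - x/a1)**m1 * (1 - x/a2)**m2."""
    m1, m2 = type1_exponents(a, a1, a2, b2)
    return m1 / (x - a1) + m2 / (x - a2)

def ode_rhs(x, a, a1, a2, b2):
    """Right-hand side -(a + x) / f(x) with f(x) = b2 * (x - a1) * (x - a2)."""
    return -(x + a) / (b2 * (x - a1) * (x - a2))
```

For a = 0.5, a1 = −1, a2 = 2 and b2 = −0.25 the exponents are m1 = 2/3 and m2 = 10/3, both above −1, so the rescaled variable y would be beta-distributed with parameters (m1 + 1, m2 + 1), and the two slope functions agree on the interval (a1, a2).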

The Pearson type II distribution

The Pearson type II distribution is a special case of the Pearson type I family restricted to symmetric distributions.

For the Pearson Type II Curve, [4]


The ordinate, y, is the frequency of . The Pearson Type II Curve is used in computing the table of significant correlation coefficients for Spearman's rank correlation coefficient when the number of items in a series is less than 100 (or 30, depending on some sources). After that, the distribution mimics a standard Student's t-distribution. For the table of values, certain values are used as the constants in the previous equation:

The moments of x used are

The Pearson type III distribution


is . The Pearson type III distribution is a gamma distribution or chi-squared distribution.

The Pearson type V distribution

Defining new parameters:

follows an . The Pearson type V distribution is an inverse-gamma distribution.

The Pearson type VI distribution


follows a . The Pearson type VI distribution is a beta prime distribution or F-distribution.

Relation to other distributions

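The Pearson family subsumes the following distributions, among others: the beta distribution (type I), the gamma and chi-squared distributions (type III), the inverse-gamma distribution (type V), the beta prime distribution and the F-distribution (type VI), Student's t-distribution and the Cauchy distribution (type VII, the symmetric subtype of type IV), and the normal distribution (as a limiting case).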

Alternatives to the Pearson system of distributions for the purpose of fitting distributions to data are the quantile-parameterized distributions (QPDs) and the metalog distributions. QPDs and metalogs can provide greater shape and bounds flexibility than the Pearson system. Instead of fitting moments, QPDs are typically fit to empirical CDF or other data with linear least squares.


Pearson distributions are used in financial markets, given their ability to be parametrized in a way that has intuitive meaning for market traders. A number of models in current use capture the stochastic nature of the volatility of rates, stocks, etc., and this family of distributions may prove to be one of the more important.

In the United States, the Log-Pearson III is the default distribution for flood frequency analysis. [5]

More recently, alternatives to the Pearson distributions have been developed that are more flexible and easier to fit to data; see the metalog distributions.


  1. Miller, Jeff; et al. (2006-07-09). "Beta distribution". Earliest Known Uses of Some of the Words of Mathematics. Retrieved 2006-12-09.
  2. Miller, Jeff; et al. (2006-12-07). "Gamma distribution". Earliest Known Uses of Some of the Words of Mathematics. Retrieved 2006-12-09.
  3. Ord J.K. (1972) p. 2
  4. Ramsey, Philip H. (1989-09-01). "Critical Values for Spearman's Rank Order Correlation". Journal of Educational Statistics. 14 (3): 245–253. JSTOR 1165017.
  5. "Guidelines for Determining Flood Flow Frequency" (PDF). USGS Water. March 1982. Retrieved 2019-06-14.


Primary sources

Secondary sources
