Generalized logistic distribution

The term generalized logistic distribution is used as the name for several different families of probability distributions. For example, Johnson et al. [1] list four forms, which are given below.

Type I has also been called the skew-logistic distribution. Type IV subsumes the other types and is obtained when applying the logit transform to beta random variates. Following the same convention as for the log-normal distribution, type IV may be referred to as the logistic-beta distribution, with reference to the standard logistic function, which is the inverse of the logit transform.

For other families of distributions that have also been called generalized logistic distributions, see the shifted log-logistic distribution, which is a generalization of the log-logistic distribution; and the metalog ("meta-logistic") distribution, which is highly shape-and-bounds flexible and can be fit to data with linear least squares.

Definitions

The following definitions are for standardized versions of the families, which can be expanded to the full form as a location-scale family. Each is defined using either the cumulative distribution function (F) or the probability density function (f), and is defined on (−∞, ∞).

Type I

The cumulative distribution function is:

F(x) = (1 + e^{−x})^{−α},   α > 0

The corresponding probability density function is:

f(x) = α e^{−x} (1 + e^{−x})^{−(α+1)}

This type has also been called the "skew-logistic" distribution.

Type II

The cumulative distribution function is:

F(x) = 1 − e^{−αx} (1 + e^{−x})^{−α},   α > 0

The corresponding probability density function is:

f(x) = α e^{−αx} (1 + e^{−x})^{−(α+1)}

Type II is the distribution of −X when X follows the Type I distribution.

Type III

f(x) = (1 / B(α, α)) e^{−αx} (1 + e^{−x})^{−2α},   α > 0

Here B is the beta function. The moment generating function for this type is

M(t) = Γ(α + t) Γ(α − t) / Γ(α)²,   −α < t < α

The corresponding cumulative distribution function is:

F(x) = I_{σ(x)}(α, α)

where I is the regularized incomplete beta function and σ(x) = 1/(1 + e^{−x}) is the standard logistic function.

Type IV

f(x) = (1 / B(α, β)) e^{−βx} (1 + e^{−x})^{−(α+β)} = σ(x)^α σ(−x)^β / B(α, β),   α, β > 0

where B is the beta function and σ(x) = 1/(1 + e^{−x}) is the standard logistic function. The moment generating function for this type is

M(t) = Γ(α + t) Γ(β − t) / (Γ(α) Γ(β)),   −α < t < β

This type is also called the "exponential generalized beta of the second type". [1]

The corresponding cumulative distribution function is:

F(x) = I_{σ(x)}(α, β)

where I is the regularized incomplete beta function.

Relationship between types

Type IV is the most general form of the distribution. The Type III distribution can be obtained from Type IV by fixing β = α. The Type II distribution can be obtained from Type IV by fixing α = 1 (and renaming β to α). The Type I distribution can be obtained from Type IV by fixing β = 1. Fixing α = β = 1 gives the standard logistic distribution.
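These reductions can be checked numerically. The following sketch (function names are illustrative, not from any library) verifies that the Type IV density with β = 1 matches the Type I density, and that α = β = 1 gives the standard logistic density:

```python
import math

def pdf_type4(x, a, b):
    # Type IV standard pdf: e^(-b*x) / (B(a, b) * (1 + e^(-x))^(a + b))
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp(-b * x - (a + b) * math.log1p(math.exp(-x)) - log_B)

def pdf_type1(x, a):
    # Type I pdf: a * e^(-x) * (1 + e^(-x))^(-(a + 1))
    return a * math.exp(-x - (a + 1) * math.log1p(math.exp(-x)))

def pdf_logistic(x):
    # standard logistic pdf
    return math.exp(-x) / (1.0 + math.exp(-x)) ** 2

# Type IV reduces to Type I when b = 1, and to the standard logistic when a = b = 1
for x in (-2.0, 0.0, 1.5):
    assert abs(pdf_type4(x, 2.5, 1.0) - pdf_type1(x, 2.5)) < 1e-12
    assert abs(pdf_type4(x, 1.0, 1.0) - pdf_logistic(x)) < 1e-12
```

The same pattern verifies the Type II and Type III reductions after the corresponding substitutions.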

Type IV (logistic-beta) properties

Figure: Type IV probability density functions (means = 0, variances = 1).

The Type IV generalized logistic, or logistic-beta distribution, with support (−∞, ∞) and shape parameters α, β > 0, has (as shown above) the probability density function (pdf):

f(x; α, β) = (1 / B(α, β)) e^{−βx} (1 + e^{−x})^{−(α+β)} = σ(x)^α σ(−x)^β / B(α, β)

where σ(x) = 1/(1 + e^{−x}) is the standard logistic function. The probability density functions for three different sets of shape parameters are shown in the plot, where the distributions have been scaled and shifted to give zero means and unit variances, in order to facilitate comparison of the shapes.

In what follows, the notation X ∼ B_σ(α, β) is used to denote the Type IV distribution.

Relationship with Gamma Distribution

This distribution can be obtained in terms of the gamma distribution as follows. Let X₁ ∼ Gamma(α, 1) and X₂ ∼ Gamma(β, 1) independently, and let Z = ln(X₁ / X₂). Then Z ∼ B_σ(α, β). [2]
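This composition is easy to check by simulation. A stdlib-only sketch follows; the finite-difference `digamma` is an illustrative stand-in for a proper special-function implementation (e.g. `scipy.special.psi`):

```python
import math, random

def digamma(x, h=1e-5):
    # numerical digamma via a central difference of lgamma (sketch-quality accuracy)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

random.seed(0)
a, b = 3.0, 1.5
# Z = ln(X1 / X2) with X1 ~ Gamma(a, 1), X2 ~ Gamma(b, 1) follows the Type IV law
z = [math.log(random.gammavariate(a, 1.0) / random.gammavariate(b, 1.0))
     for _ in range(200_000)]
emp_mean = sum(z) / len(z)
# the theoretical mean (derived below) is psi(a) - psi(b)
assert abs(emp_mean - (digamma(a) - digamma(b))) < 0.02
```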

Symmetry

If X ∼ B_σ(α, β), then −X ∼ B_σ(β, α).

Mean and variance

By using the logarithmic expectations of the gamma distribution, the mean and variance can be derived as:

E[X] = ψ(α) − ψ(β)
var[X] = ψ′(α) + ψ′(β)

where ψ is the digamma function, while ψ′ is its first derivative, also known as the trigamma function, or the first polygamma function. Since ψ is strictly increasing, the sign of the mean is the same as the sign of α − β. Since ψ′ is strictly decreasing, the shape parameters can also be interpreted as concentration parameters. Indeed, as shown below, the left and right tails respectively become thinner as α or β are increased. The two terms of the variance represent the contributions to the variance of the left and right parts of the distribution.
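The sign and concentration claims can be checked numerically. This sketch approximates ψ and ψ′ by finite differences of `math.lgamma` (stand-ins for proper special-function routines):

```python
import math

def digamma(x, h=1e-5):
    # numerical digamma: central difference of lgamma (illustrative accuracy only)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-3):
    # numerical trigamma: second central difference of lgamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

def mean_var(a, b):
    # mean = psi(a) - psi(b); variance = psi'(a) + psi'(b)
    return digamma(a) - digamma(b), trigamma(a) + trigamma(b)

m0, v0 = mean_var(2.0, 2.0)
assert abs(m0) < 1e-9          # symmetric shape parameters give zero mean
m1, _ = mean_var(3.0, 2.0)
assert m1 > 0                  # sign of the mean follows the sign of a - b
_, v1 = mean_var(4.0, 2.0)
assert v1 < v0                 # increasing a reduces the left-tail contribution
```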

Cumulants and skewness

The cumulant generating function is K(t) = ln M(t), where the moment generating function M(t) is given above. The cumulants, κₙ, are the n-th derivatives of K, evaluated at t = 0:

κₙ = K⁽ⁿ⁾(0) = ψ⁽ⁿ⁻¹⁾(α) + (−1)ⁿ ψ⁽ⁿ⁻¹⁾(β)

where ψ = ψ⁽⁰⁾ and ψ⁽ⁿ⁻¹⁾ are the digamma and polygamma functions. In agreement with the derivation above, the first cumulant, κ₁, is the mean and the second, κ₂, is the variance.

The third cumulant, κ₃, is the third central moment E[(X − E[X])³], which when scaled by the third power of the standard deviation gives the skewness:

skew[X] = (ψ″(α) − ψ″(β)) / (ψ′(α) + ψ′(β))^{3/2}

The sign (and therefore the handedness) of the skewness is the same as the sign of α − β.
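The skewness formula and its sign property can be sketched with finite-difference polygamma approximations (illustrative stand-ins for library special functions):

```python
import math

def trigamma(x, h=1e-3):
    # psi'(x) by a second central difference of lgamma (illustrative accuracy)
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

def polygamma2(x, h=1e-2):
    # psi''(x) by a third central difference of lgamma
    return (math.lgamma(x + 2 * h) - 2 * math.lgamma(x + h)
            + 2 * math.lgamma(x - h) - math.lgamma(x - 2 * h)) / (2 * h ** 3)

def skewness(a, b):
    # skew = (psi''(a) - psi''(b)) / (psi'(a) + psi'(b))^(3/2)
    return (polygamma2(a) - polygamma2(b)) / (trigamma(a) + trigamma(b)) ** 1.5

assert abs(skewness(2.0, 2.0)) < 1e-6                      # symmetric: no skew
assert skewness(3.0, 1.0) > 0 and skewness(1.0, 3.0) < 0   # sign follows a - b
```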

Mode

The mode (pdf maximum) can be derived by finding where the log pdf derivative is zero:

d/dx ln f(x) = −β + (α + β) e^{−x} / (1 + e^{−x}) = 0

This simplifies to e^{x} = α/β, so that: [2]

x_mode = ln(α/β)
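A quick numerical check of the mode: maximize the (unnormalized) log pdf on a grid and compare with ln(α/β). The function name is illustrative:

```python
import math

def log_pdf_unnorm(x, a, b):
    # unnormalized Type IV log pdf: -b*x - (a + b) * ln(1 + e^(-x))
    return -b * x - (a + b) * math.log1p(math.exp(-x))

a, b = 3.0, 1.5
grid = [i / 1000.0 for i in range(-5000, 5001)]
x_star = max(grid, key=lambda x: log_pdf_unnorm(x, a, b))
assert abs(x_star - math.log(a / b)) < 2e-3   # mode at ln(a/b) = ln 2
```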

Tail behaviour

Figure: Tail comparison (log pdf): Type IV (means = 0, variances = 1) vs the standard normal and standard Cauchy distributions.

In each of the left and right tails, one of the sigmoids in the pdf saturates to one, so that the tail is formed by the other sigmoid. For large negative x, the left tail of the pdf is proportional to e^{αx}, while the right tail (large positive x) is proportional to e^{−βx}. This means the tails are independently controlled by α and β. Although the type IV tails are heavier than those of the normal distribution (proportional to e^{−x²/(2σ²)}, for variance σ²), the type IV means and variances remain finite for all α, β > 0. This is in contrast with the Cauchy distribution, for which the mean and variance do not exist. In the log pdf plots shown here, the type IV tails are linear, the normal distribution tails are quadratic and the Cauchy tails are logarithmic.
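The linear tails of the log pdf can be confirmed directly: far into the tails, the slope of the log pdf approaches α on the left and −β on the right. A stdlib sketch:

```python
import math

def log_pdf(x, a, b):
    # Type IV log pdf, including the log-beta normalizer
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return -b * x - (a + b) * math.log1p(math.exp(-x)) - log_B

a, b = 2.0, 3.0
# far in the tails the log pdf is linear: slope a on the left, -b on the right
left_slope = log_pdf(-19.0, a, b) - log_pdf(-20.0, a, b)
right_slope = log_pdf(20.0, a, b) - log_pdf(19.0, a, b)
assert abs(left_slope - a) < 1e-6
assert abs(right_slope + b) < 1e-6
```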

Exponential family properties

B_σ(α, β) forms an exponential family with natural parameters α and β and sufficient statistics ln σ(x) and ln σ(−x). The expected values of the sufficient statistics can be found by differentiation of the log-normalizer, ln B(α, β): [3]

E[ln σ(X)] = ψ(α) − ψ(α + β)
E[ln σ(−X)] = ψ(β) − ψ(α + β)

Given a data set x₁, …, xₙ assumed to have been generated IID from B_σ(α, β), the maximum-likelihood parameter estimate is:

(α̂, β̂) = arg max over α, β of [ α s̄₁ + β s̄₂ − ln B(α, β) ]

where the overlines denote the averages of the sufficient statistics: s̄₁ = (1/n) Σᵢ ln σ(xᵢ) and s̄₂ = (1/n) Σᵢ ln σ(−xᵢ). The maximum-likelihood estimate depends on the data only via these average statistics. Indeed, at the maximum-likelihood estimate the expected values and averages agree:

ψ(α̂) − ψ(α̂ + β̂) = s̄₁ and ψ(β̂) − ψ(α̂ + β̂) = s̄₂

which is also where the partial derivatives of the above maximand vanish.

Relationships with other distributions

Relationships with other distributions include the following: if X ∼ B_σ(α, β), then σ(X) has a beta distribution with parameters α and β (by construction, via the logit transform), and e^X has a beta prime distribution with the same parameters. The gamma composition and the special cases (Types I–III and the standard logistic distribution) are given above.

Large shape parameters

Figure: Type IV vs normal distribution with matched mean and variance. For large values of α, β, the pdfs are very similar, except for very rare values of x.

For large values of the shape parameters, α, β ≫ 1, the distribution becomes more Gaussian, with:

E[X] = ψ(α) − ψ(β) ≈ ln(α/β) and var[X] = ψ′(α) + ψ′(β) ≈ 1/α + 1/β

This is demonstrated in the pdf and log pdf plots here.
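The large-argument asymptotics behind this approximation can be checked numerically; the finite-difference ψ and ψ′ below are illustrative stand-ins for library special functions:

```python
import math

def digamma(x, h=1e-5):
    # numerical digamma (illustrative accuracy)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-3):
    # numerical trigamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

a, b = 200.0, 300.0
mean = digamma(a) - digamma(b)      # psi(a) - psi(b)
var = trigamma(a) + trigamma(b)     # psi'(a) + psi'(b)
# large-argument asymptotics: psi(x) ~ ln x, psi'(x) ~ 1/x
assert abs(mean - math.log(a / b)) < 1e-2
assert abs(var - (1 / a + 1 / b)) < 1e-4
```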

Random variate generation

Since random sampling from the gamma and beta distributions is readily available on many software platforms, the relationships with those distributions given above can be used to generate variates from the type IV distribution.
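For example, using the beta route: the logit of a Beta(α, β) variate follows the Type IV law. A stdlib-only sketch (the finite-difference ψ and ψ′ are illustrative stand-ins for library special functions) checks the sample moments against the formulas above:

```python
import math, random

def digamma(x, h=1e-5):
    # numerical digamma (illustrative accuracy)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-3):
    # numerical trigamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

def rvs_type4(a, b):
    # the logit of a Beta(a, b) variate has the Type IV distribution
    p = random.betavariate(a, b)
    return math.log(p / (1.0 - p))

random.seed(2)
a, b = 2.5, 1.5
xs = [rvs_type4(a, b) for _ in range(200_000)]
m = sum(xs) / len(xs)
v = sum((x - m) ** 2 for x in xs) / len(xs)
assert abs(m - (digamma(a) - digamma(b))) < 0.02       # mean = psi(a) - psi(b)
assert abs(v - (trigamma(a) + trigamma(b))) < 0.05     # var = psi'(a) + psi'(b)
```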

Generalization with location and scale parameters

A flexible, four-parameter family can be obtained by adding location and scale parameters. One way to do this: if Z ∼ B_σ(α, β), let Y = kZ + δ, where k > 0 is the scale parameter and δ is the location parameter. The four-parameter family obtained thus has the desired additional flexibility, but the new parameters may be hard to interpret, because E[Y] ≠ δ and std[Y] ≠ k in general. Moreover, maximum-likelihood estimation with this parametrization is hard. These problems can be addressed as follows.

Recall that the mean and variance of Z ∼ B_σ(α, β) are:

μ_Z = ψ(α) − ψ(β) and σ²_Z = ψ′(α) + ψ′(β)

Now expand the family with location parameter μ and scale parameter σ > 0, via the transformation:

Y = μ + σ (Z − μ_Z) / σ_Z

so that μ and σ are now interpretable as the mean and standard deviation of Y. It may be noted that allowing σ to be either positive or negative does not generalize this family, because of the above-noted symmetry property. We adopt the notation Y ∼ B̄_σ(α, β, μ, σ) for this family.

If the pdf for Z ∼ B_σ(α, β) is f(z), then the pdf for Y ∼ B̄_σ(α, β, μ, σ) is:

f_Y(y) = (σ_Z / σ) f( μ_Z + σ_Z (y − μ) / σ )

where it is understood that μ_Z and σ_Z are computed as detailed above, as functions of α, β. The pdf and log-pdf plots above, where the captions contain (means=0, variances=1), are for B̄_σ(α, β, 0, 1).
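This change of variables can be sketched in code: standardize, evaluate the standard Type IV pdf, and multiply by the Jacobian. The helper names are illustrative, and the density is checked by crude numerical integration:

```python
import math

def digamma(x, h=1e-5):
    # numerical digamma (illustrative accuracy)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-3):
    # numerical trigamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

def pdf_loc_scale(y, a, b, mu, sigma):
    # pdf of Y = mu + sigma * (Z - mu_Z) / sigma_Z, with Z standard Type IV
    mu_z = digamma(a) - digamma(b)
    sd_z = math.sqrt(trigamma(a) + trigamma(b))
    x = mu_z + sd_z * (y - mu) / sigma          # map back to the standard variate
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = -b * x - (a + b) * math.log1p(math.exp(-x)) - log_B
    return math.exp(log_pdf) * sd_z / sigma     # Jacobian of the affine map

# numerical check: the density integrates to ~1
a, b, mu, sigma = 2.0, 3.0, 1.0, 2.0
h = 0.01
total = sum(pdf_loc_scale(-40.0 + i * h, a, b, mu, sigma) for i in range(8001)) * h
assert abs(total - 1.0) < 1e-3
```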

Maximum likelihood parameter estimation

In this section, maximum-likelihood estimation of the distribution parameters, given a dataset x₁, …, xₙ, is discussed in turn for the families B_σ(α, β) and B̄_σ(α, β, μ, σ).

Maximum likelihood for standard Type IV

As noted above, B_σ(α, β) is an exponential family with natural parameters α and β, the maximum-likelihood estimates of which depend only on the averaged sufficient statistics:

s̄₁ = (1/n) Σᵢ ln σ(xᵢ) and s̄₂ = (1/n) Σᵢ ln σ(−xᵢ)

Once these statistics have been accumulated, the maximum-likelihood estimate is given by:

(α̂, β̂) = arg max over α, β of [ α s̄₁ + β s̄₂ − ln B(α, β) ]

By using the parametrization (ln α, ln β), an unconstrained numerical optimization algorithm like BFGS can be used. Optimization iterations are fast, because they are independent of the size of the data-set.
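As a stdlib-only sketch of this estimator (using a small Newton iteration on the stationarity equations in place of BFGS, and finite-difference ψ, ψ′ in place of library special functions; `fit_type4` is an illustrative name):

```python
import math, random

def digamma(x, h=1e-5):
    # numerical digamma (illustrative accuracy)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-3):
    # numerical trigamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2

def fit_type4(s1, s2, a=1.0, b=1.0):
    # Newton iteration on the stationarity equations
    #   psi(a) - psi(a + b) = s1,  psi(b) - psi(a + b) = s2
    for _ in range(100):
        s = a + b
        g1 = digamma(a) - digamma(s) - s1
        g2 = digamma(b) - digamma(s) - s2
        j11 = trigamma(a) - trigamma(s)
        j22 = trigamma(b) - trigamma(s)
        j12 = -trigamma(s)                  # symmetric Jacobian: j21 = j12
        det = j11 * j22 - j12 * j12
        da = (g1 * j22 - g2 * j12) / det
        db = (g2 * j11 - g1 * j12) / det
        a = max(a - da, a / 2)              # keep the parameters positive
        b = max(b - db, b / 2)
        if abs(g1) + abs(g2) < 1e-9:
            break
    return a, b

# simulate data via the gamma composition and accumulate sufficient statistics
random.seed(1)
a_true, b_true, n = 2.0, 1.0, 100_000
s1 = s2 = 0.0
for _ in range(n):
    g1, g2 = random.gammavariate(a_true, 1.0), random.gammavariate(b_true, 1.0)
    p = g1 / (g1 + g2)        # p ~ Beta(a, b); its logit is a Type IV sample
    s1 += math.log(p)         # accumulate ln sigma(x)
    s2 += math.log1p(-p)      # accumulate ln sigma(-x)
s1 /= n
s2 /= n
a_hat, b_hat = fit_type4(s1, s2)
assert abs(a_hat - a_true) < 0.05 and abs(b_hat - b_true) < 0.05
```

Note that the fitting step touches only the two averaged statistics, so its cost does not grow with the data-set.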

An alternative is to use an EM-algorithm based on the composition Z = ln(X₁ / X₂), where X₁ ∼ Gamma(α, 1) and X₂ ∼ Gamma(β, 1). Because of the self-conjugacy of the gamma distribution, the posterior expectations of the latent gamma variables and of their logarithms, which are required for the E-step, can be computed in closed form. The M-step parameter update can be solved analogously to maximum-likelihood estimation for the gamma distribution.

Maximum likelihood for the four-parameter family

The maximum-likelihood problem for B̄_σ(α, β, μ, σ), with the pdf f_Y given above, is:

(α̂, β̂, μ̂, σ̂) = arg max of Σᵢ ln f_Y(xᵢ)

This is no longer an exponential family, so that each optimization iteration has to traverse the whole data-set. Moreover the computation of the partial derivatives (as required for example by BFGS) is considerably more complex than for the above two-parameter case. However, all the component functions are readily available in software packages with automatic differentiation. Again, the positive parameters can be parametrized in terms of their logarithms to obtain an unconstrained numerical optimization problem.

For this problem, numerical optimization may fail unless the initial location and scale parameters are chosen appropriately. However the above-mentioned interpretability of these parameters in the parametrization of can be used to do this. Specifically, the initial values for and can be set to the empirical mean and variance of the data.


References

  1. Johnson, N.L., Kotz, S., Balakrishnan, N. (1995). Continuous Univariate Distributions, Volume 2. Wiley. ISBN 0-471-58494-0. (pages 140–142)
  2. Halliwell, Leigh J. (2018). "The Log-Gamma Distribution and Non-Normal Error". S2CID 173176687.
  3. Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer.