Logistic distribution

Last updated May 15, 2024

In probability theory and statistics, the logistic distribution is a continuous probability distribution. Its cumulative distribution function is the logistic function, which appears in logistic regression and feedforward neural networks. It resembles the normal distribution in shape but has heavier tails (higher kurtosis). The logistic distribution is a special case of the Tukey lambda distribution.

Specification

Probability density function

When the location parameter $μ$ is 0 and the scale parameter $s$ is 1, the probability density function of the logistic distribution is given by

{\begin{aligned}f(x;0,1)&={\frac {e^{-x}}{(1+e^{-x})^{2}}}\\[4pt]&={\frac {1}{(e^{x/2}+e^{-x/2})^{2}}}\\[5pt]&={\frac {1}{4}}\operatorname {sech} ^{2}\left({\frac {x}{2}}\right).\end{aligned}}

Thus in general the density is:

{\begin{aligned}f(x;\mu ,s)&={\frac {e^{-(x-\mu )/s}}{s\left(1+e^{-(x-\mu )/s}\right)^{2}}}\\[4pt]&={\frac {1}{s\left(e^{(x-\mu )/(2s)}+e^{-(x-\mu )/(2s)}\right)^{2}}}\\[4pt]&={\frac {1}{4s}}\operatorname {sech} ^{2}\left({\frac {x-\mu }{2s}}\right).\end{aligned}}

Because this function can be expressed in terms of the square of the hyperbolic secant function "sech", it is sometimes referred to as the sech-square(d) distribution.^[2] (See also: hyperbolic secant distribution).
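The equivalence of the exponential and sech-squared forms of the density can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `logistic_pdf` and `logistic_pdf_sech` are illustrative choices, not part of any library.

```python
import math

def logistic_pdf(x, mu=0.0, s=1.0):
    """Density of Logistic(mu, s) in its exponential form."""
    z = (x - mu) / s
    # The density is symmetric in z, so evaluating at |z| keeps exp() from overflowing.
    e = math.exp(-abs(z))
    return e / (s * (1.0 + e) ** 2)

def logistic_pdf_sech(x, mu=0.0, s=1.0):
    """The same density written as (1 / 4s) * sech^2((x - mu) / 2s)."""
    z = (x - mu) / (2.0 * s)
    return 1.0 / (4.0 * s * math.cosh(z) ** 2)

# Both forms should agree up to floating-point rounding.
for x in (-3.0, 0.0, 1.5):
    print(x, logistic_pdf(x, mu=1.0, s=2.0), logistic_pdf_sech(x, mu=1.0, s=2.0))
```

At $x=\mu$ both forms reduce to $1/(4s)$, the maximum of the density.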

Cumulative distribution function

The logistic distribution receives its name from its cumulative distribution function, which is an instance of the family of logistic functions. The cumulative distribution function of the logistic distribution is also a scaled version of the hyperbolic tangent.

F(x;\mu ,s)={\frac {1}{1+e^{-(x-\mu )/s}}}={\frac {1}{2}}+{\frac {1}{2}}\operatorname {tanh} \left({\frac {x-\mu }{2s}}\right).

In this equation $μ$ is the mean, and $s$ is a scale parameter proportional to the standard deviation, which equals $s\pi /{\sqrt {3}}$.
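As an illustration, the two expressions for the cumulative distribution function can be compared directly. This is a sketch under the assumption that plain floating-point arithmetic is adequate; the function names are illustrative.

```python
import math

def logistic_cdf(x, mu=0.0, s=1.0):
    """F(x; mu, s) = 1 / (1 + exp(-(x - mu) / s)), evaluated without overflow."""
    z = (x - mu) / s
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # for negative z use the algebraically equal form e^z / (1 + e^z)
    return e / (1.0 + e)

def logistic_cdf_tanh(x, mu=0.0, s=1.0):
    """The same function written as 1/2 + 1/2 * tanh((x - mu) / 2s)."""
    return 0.5 + 0.5 * math.tanh((x - mu) / (2.0 * s))

for x in (-4.0, 1.0, 6.0):
    print(x, logistic_cdf(x, mu=1.0, s=2.0), logistic_cdf_tanh(x, mu=1.0, s=2.0))
```

Both functions return 1/2 at $x=\mu$, consistent with $\mu$ being the median as well as the mean.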

Quantile function

The inverse cumulative distribution function (quantile function) of the logistic distribution is a generalization of the logit function. Its derivative is called the quantile density function. They are defined as follows:

Q(p;\mu ,s)=\mu +s\ln \left({\frac {p}{1-p}}\right).

Q'(p;s)={\frac {s}{p(1-p)}}.
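Because the quantile function has a closed form, logistic variates can be generated by inverse transform sampling: draw $p$ uniformly on (0, 1) and return $Q(p;\mu ,s)$. The sketch below assumes Python's standard `random` module as the uniform source; the helper names are illustrative.

```python
import math
import random

def logistic_quantile(p, mu=0.0, s=1.0):
    """Q(p; mu, s) = mu + s * ln(p / (1 - p)) for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return mu + s * math.log(p / (1.0 - p))

def logistic_quantile_density(p, s=1.0):
    """Derivative of the quantile function, Q'(p; s) = s / (p * (1 - p))."""
    return s / (p * (1.0 - p))

def sample_logistic(n, mu=0.0, s=1.0, seed=0):
    """Draw n variates by applying the quantile function to uniform draws."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()  # lies in [0, 1); skip the rare exact zero
        if u > 0.0:
            out.append(logistic_quantile(u, mu, s))
    return out

draws = sample_logistic(20_000, mu=1.0, s=2.0)
print(sum(draws) / len(draws))  # sample mean, expected to be close to mu
```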

Alternative parameterization

An alternative parameterization of the logistic distribution can be derived by expressing the scale parameter, $s$ , in terms of the standard deviation, $\sigma$ , using the substitution $s\,=\,q\,\sigma$ , where $q\,=\,{\sqrt {3}}/{\pi }\,=\,0.551328895\ldots$ . The alternative forms of the functions above follow by direct substitution.
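For reference, substituting $s={\sqrt {3}}\,\sigma /\pi$ into the density, cumulative distribution function and quantile function given earlier yields the following forms. This is a worked substitution rather than a quotation from a source.

```latex
\begin{aligned}
f(x;\mu ,\sigma )&={\frac {\pi }{4{\sqrt {3}}\,\sigma }}\operatorname {sech} ^{2}\left({\frac {\pi (x-\mu )}{2{\sqrt {3}}\,\sigma }}\right),\\
F(x;\mu ,\sigma )&={\frac {1}{1+e^{-\pi (x-\mu )/({\sqrt {3}}\,\sigma )}}},\\
Q(p;\mu ,\sigma )&=\mu +{\frac {{\sqrt {3}}\,\sigma }{\pi }}\ln \left({\frac {p}{1-p}}\right).
\end{aligned}
```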

Applications

The logistic distribution—and the S-shaped pattern of its cumulative distribution function (the logistic function) and quantile function (the logit function)—have been extensively used in many different areas.

Logistic regression

One of the most common applications is in logistic regression, which is used for modeling categorical dependent variables (e.g., yes-no choices or a choice of 3 or 4 possibilities), much as standard linear regression is used for modeling continuous variables (e.g., income or population). Specifically, logistic regression models can be phrased as latent variable models with error variables following a logistic distribution. This phrasing is common in the theory of discrete choice models, where the logistic distribution plays the same role in logistic regression as the normal distribution does in probit regression. Indeed, the logistic and normal distributions have a quite similar shape. However, the logistic distribution has heavier tails, which often increases the robustness of analyses based on it compared with using the normal distribution.
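The latent-variable phrasing can be illustrated by simulation. In the sketch below, a binary outcome is 1 when a linear predictor `eta` plus a standard logistic error exceeds zero; the frequency of 1s should then match the logistic function of `eta`. The threshold at zero, the name `eta`, and the sample size are assumptions made for the illustration.

```python
import math
import random

def logistic_function(eta):
    """Probability that the outcome is 1 under the logistic model."""
    return 1.0 / (1.0 + math.exp(-eta))

def simulate_choice_rate(eta, n=100_000, seed=0):
    """Fraction of draws with eta + eps > 0, where eps ~ Logistic(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # keep u strictly inside (0, 1)
            u = rng.random()
        eps = math.log(u / (1.0 - u))  # standard logistic error via the logit
        if eta + eps > 0.0:
            hits += 1
    return hits / n

for eta in (-1.0, 0.0, 0.8):
    print(eta, simulate_choice_rate(eta), logistic_function(eta))
```

Replacing the logistic error with a standard normal error in the same construction would give the probit model instead.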

Physics

The PDF of this distribution has the same functional form as the derivative of the Fermi function. In the theory of electron properties in semiconductors and metals, this derivative sets the relative weight of the various electron energies in their contributions to electron transport. Those energy levels whose energies are closest to the distribution's "mean" (Fermi level) dominate processes such as electronic conduction, with some smearing induced by temperature.^[3] (p. 34) Note however that the pertinent probability distribution in Fermi–Dirac statistics is actually a simple Bernoulli distribution, with the probability factor given by the Fermi function.

The logistic distribution arises as the limit distribution of a finite-velocity damped random motion described by a telegraph process in which the random times between consecutive velocity changes have independent exponential distributions with linearly increasing parameters.^[4]

Hydrology

Figure: Cumulative logistic distribution fitted to October rainfalls using CumFreq (see also distribution fitting).

In hydrology, the distribution of long-duration river discharge and rainfall (e.g., monthly and yearly totals, which are sums of 30 and 360 daily values, respectively) is often thought to be almost normal according to the central limit theorem.^[5] The cumulative distribution function of the normal distribution, however, has no closed form and must be evaluated numerically. Because the logistic distribution has a closed-form cumulative distribution function and quantile function and is similar in shape to the normal distribution, it can be used instead. The figure shows an example of fitting the logistic distribution to ranked October rainfalls, which are almost normally distributed, together with the 90% confidence belt based on the binomial distribution. The rainfall data are represented by plotting positions as part of the cumulative frequency analysis.
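How closely the logistic distribution tracks the normal can be gauged by comparing the two cumulative distribution functions when both have the same mean and standard deviation. The sketch below uses `math.erf` for the normal CDF and the substitution $s={\sqrt {3}}\,\sigma /\pi$ for the logistic; the evaluation grid is an arbitrary choice.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF, evaluated numerically through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def logistic_cdf_sigma(x, mu=0.0, sigma=1.0):
    """Logistic CDF parameterized by its standard deviation sigma."""
    s = sigma * math.sqrt(3.0) / math.pi
    return 0.5 + 0.5 * math.tanh((x - mu) / (2.0 * s))

# Largest absolute gap between the two CDFs on a grid from -6 to 6.
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(normal_cdf(x) - logistic_cdf_sigma(x)) for x in grid)
print(max_gap)
```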

Chess ratings

The United States Chess Federation and FIDE have switched their formulas for calculating chess ratings from the normal distribution to the logistic distribution; see the article on the Elo rating system (itself based on the normal distribution).
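For illustration, the commonly quoted logistic form of the Elo expected score is $1/(1+10^{(R_{B}-R_{A})/400})$. This formula is not given in the text above and is brought in here as an assumption; under it, the expected score is a logistic cumulative distribution function of the rating difference with scale $s=400/\ln 10$. A minimal sketch:

```python
import math

# Scale of the logistic CDF implied by the base-10, 400-point convention (assumed).
ELO_SCALE = 400.0 / math.log(10.0)

def elo_expected_score(r_a, r_b):
    """Expected score of player A against player B in the logistic Elo form."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def logistic_cdf(x, mu=0.0, s=1.0):
    """Logistic CDF; adequate for the moderate arguments used here."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))

for r_a, r_b in ((1600, 1600), (1800, 1600), (2000, 1600)):
    diff = r_a - r_b
    print(diff, elo_expected_score(r_a, r_b), logistic_cdf(diff, s=ELO_SCALE))
```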

Related distributions

The logistic distribution mimics the hyperbolic secant (sech) distribution.
If $X\sim \mathrm {Logistic} (\mu ,s)$ then $kX+\ell \sim \mathrm {Logistic} (k\mu +\ell ,|k|s)$ .
If $X\sim$ U(0, 1) then $\mu +s(\log X-\log(1-X))\sim \mathrm {Logistic} (\mu ,s)$ .
If $X\sim \mathrm {Gumbel} (\mu _{X},\beta )$ and $Y\sim \mathrm {Gumbel} (\mu _{Y},\beta )$ independently then $X-Y\sim \mathrm {Logistic} (\mu _{X}-\mu _{Y},\beta )\,$ .
If $X$ and $Y\sim \mathrm {Gumbel} (\mu ,\beta )$ then $X+Y\nsim \mathrm {Logistic} (2\mu ,\beta )\,$ (The sum is not a logistic distribution). Note that $E(X+Y)=2\mu +2\beta \gamma \neq 2\mu =E\left(\mathrm {Logistic} (2\mu ,\beta )\right)$ .
If X ~ Logistic(μ, s) then exp(X) ~ LogLogistic $\left(\alpha =e^{\mu },\beta ={\frac {1}{s}}\right)$ , and exp(X) + γ ~ shifted log-logistic $\left(\alpha =e^{\mu },\beta ={\frac {1}{s}},\gamma \right)$ .
If X ~ Exponential(1) then

\mu +s\log(e^{X}-1)\sim \operatorname {Logistic} (\mu ,s).

If X, Y ~ Exponential(1) then

\mu -s\log \left({\frac {X}{Y}}\right)\sim \operatorname {Logistic} (\mu ,s).

The metalog distribution is a generalization of the logistic distribution, in which power series expansions in terms of $p$ are substituted for the logistic parameters $\mu$ and $s$ . The resulting metalog quantile function is highly shape flexible, has a simple closed form, and can be fit to data with linear least squares.
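Several of the relationships listed above can be checked by simulation. The sketch below builds Logistic(1, 2) samples in three ways (difference of two Gumbel variates, a transform of one exponential variate, and the log-ratio of two exponential variates) and compares each empirical CDF with the closed-form CDF at one point. The sample size, seed, evaluation point and helper names are arbitrary choices for the illustration.

```python
import math
import random

def logistic_cdf(x, mu=0.0, s=1.0):
    return 0.5 + 0.5 * math.tanh((x - mu) / (2.0 * s))

def open_uniform(rng):
    """Uniform draw strictly inside (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def gumbel(rng, mu, beta):
    """Inverse-CDF draw from Gumbel(mu, beta)."""
    return mu - beta * math.log(-math.log(open_uniform(rng)))

def exponential(rng):
    """Inverse-CDF draw from Exponential(1); always strictly positive."""
    return -math.log(open_uniform(rng))

def empirical_cdf(samples, x):
    return sum(1 for v in samples if v <= x) / len(samples)

rng = random.Random(0)
n = 50_000
mu, s = 1.0, 2.0

samples = {
    # Gumbel(3, s) - Gumbel(2, s) should be Logistic(3 - 2, s).
    "gumbel difference": [gumbel(rng, 3.0, s) - gumbel(rng, 2.0, s) for _ in range(n)],
    # mu + s * log(e^X - 1); expm1 keeps precision when X is tiny.
    "one exponential": [mu + s * math.log(math.expm1(exponential(rng))) for _ in range(n)],
    # mu - s * log(X / Y) for independent X, Y.
    "two exponentials": [mu - s * math.log(exponential(rng) / exponential(rng)) for _ in range(n)],
}

x0 = 2.5
for name, values in samples.items():
    print(name, empirical_cdf(values, x0), logistic_cdf(x0, mu, s))
```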

Derivations

Higher-order moments

The nth-order central moment can be expressed in terms of the quantile function:

{\begin{aligned}\operatorname {E} [(X-\mu )^{n}]&=\int _{-\infty }^{\infty }(x-\mu )^{n}\,dF(x)\\&=\int _{0}^{1}{\big (}Q(p)-\mu {\big )}^{n}\,dp=s^{n}\int _{0}^{1}\left[\ln \!\left({\frac {p}{1-p}}\right)\right]^{n}\,dp.\end{aligned}}

This integral is well-known^[6] and can be expressed in terms of Bernoulli numbers:

\operatorname {E} [(X-\mu )^{n}]=s^{n}\pi ^{n}(2^{n}-2)\cdot |B_{n}|.
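The Bernoulli-number formula can be evaluated directly. The sketch below computes Bernoulli numbers exactly with a standard recurrence, then checks that the second central moment equals $s^{2}\pi ^{2}/3$ and that the fourth central moment gives an excess kurtosis of 6/5. The function names are illustrative.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n_max):
    """B_0..B_n_max from the recurrence sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1."""
    b = [Fraction(1)]
    for m in range(1, n_max + 1):
        acc = sum(math.comb(m + 1, k) * b[k] for k in range(m))
        b.append(-acc / (m + 1))
    return b

def central_moment(n, s=1.0):
    """E[(X - mu)^n] = s^n * pi^n * (2^n - 2) * |B_n| for X ~ Logistic(mu, s)."""
    b_n = bernoulli_numbers(n)[n]
    return (s * math.pi) ** n * (2 ** n - 2) * abs(float(b_n))

variance = central_moment(2, s=2.0)
excess_kurtosis = central_moment(4, s=2.0) / variance ** 2 - 3.0
print(variance, (2.0 * math.pi) ** 2 / 3.0)
print(excess_kurtosis)
```

Odd-order central moments vanish in this formula, since $2^{1}-2=0$ and $B_{n}=0$ for odd $n\geq 3$, as expected for a symmetric distribution.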

Notes

[1] Norton, Matthew; Khokhlov, Valentyn; Uryasev, Stan (2019). "Calculating CVaR and bPOE for common probability distributions with application to portfolio optimization and density estimation" (PDF). Annals of Operations Research. 299 (1–2). Springer: 1281–1315. doi:10.1007/s10479-019-03373-1. Retrieved 2023-02-27.
[2] Johnson, Kotz & Balakrishnan (1995, p. 116).
[3] Davies, John H. (1998). The Physics of Low-dimensional Semiconductors: An Introduction. Cambridge University Press. ISBN 9780521484916.
[4] Di Crescenzo, A.; Martinucci, B. (2010). "A damped telegraph random process with logistic stationary distribution". J. Appl. Prob. 47: 84–96.
[5] Ritzema, H.P., ed. (1994). Frequency and Regression Analysis. Chapter 6 in: Drainage Principles and Applications, Publication 16, International Institute for Land Reclamation and Improvement (ILRI), Wageningen, The Netherlands. pp. 175–224. ISBN 90-70754-33-9.
[6] OEIS: A001896

References

deCani, John S.; Stine, Robert A. (1986). "A note on deriving the information matrix for a logistic distribution". The American Statistician. 40. American Statistical Association: 220–222. doi:10.2307/2684541.
Balakrishnan, N. (1992). Handbook of the Logistic Distribution. Marcel Dekker, New York. ISBN 0-8247-8587-8.
Johnson, N. L.; Kotz, S.; Balakrishnan, N. (1995). Continuous Univariate Distributions. Vol. 2 (2nd ed.). ISBN 0-471-58494-0.
Modis, Theodore (1992). Predictions: Society's Telltale Signature Reveals the Past and Forecasts the Future. Simon & Schuster, New York. ISBN 0-671-75917-5.

Logistic distribution
Probability density function
Cumulative distribution function
Parameters	$\mu$ , location (real); $s>0$ , scale (real)
Support	$x\in (-\infty ,\infty )$
PDF	${\frac {e^{-(x-\mu )/s}}{s\left(1+e^{-(x-\mu )/s}\right)^{2}}}$
CDF	${\frac {1}{1+e^{-(x-\mu )/s}}}={\frac {1+\tanh {\frac {x-\mu }{2s}}}{2}}$
Quantile	$\mu +s\log \left({\frac {p}{1-p}}\right)$
Mean	$\mu$
Median	$\mu$
Mode	$\mu$
Variance	${\frac {s^{2}\pi ^{2}}{3}}$
Skewness	$0$
Excess kurtosis	$6/5$
Entropy	$\ln s+2$
MGF	$e^{\mu t}\mathrm {B} (1-st,1+st)$ for $t\in (-1/s,1/s)$ , where $\mathrm {B}$ is the Beta function
CF	$e^{it\mu }{\frac {\pi st}{\sinh(\pi st)}}$
Expected shortfall	$\mu +{\frac {sH(p)}{1-p}}$ , where $H(p)=-p\ln(p)-(1-p)\ln(1-p)$ is the binary entropy function^[1]
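The expected shortfall entry can be checked against its definition as the average of the quantile function over the upper tail, $(1-p)^{-1}\int _{p}^{1}Q(u)\,du$. The sketch below compares the closed form with a simple midpoint-rule integration; reading the entry as an upper-tail average, the step count and the function names are assumptions made for the illustration.

```python
import math

def binary_entropy(p):
    """H(p) = -p ln p - (1 - p) ln(1 - p), using natural logarithms."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def expected_shortfall(p, mu=0.0, s=1.0):
    """Closed form: mu + s * H(p) / (1 - p)."""
    return mu + s * binary_entropy(p) / (1.0 - p)

def expected_shortfall_numeric(p, mu=0.0, s=1.0, steps=100_000):
    """Midpoint-rule average of Q(u) = mu + s * ln(u / (1 - u)) over (p, 1)."""
    h = (1.0 - p) / steps
    total = 0.0
    for i in range(steps):
        u = p + (i + 0.5) * h  # midpoints never reach the singular endpoint u = 1
        total += mu + s * math.log(u / (1.0 - u))
    return total * h / (1.0 - p)

for p in (0.5, 0.9, 0.95):
    print(p, expected_shortfall(p, 1.0, 2.0), expected_shortfall_numeric(p, 1.0, 2.0))
```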