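Property | Value
---|---
Parameters | $\alpha > 0$ scale, $\beta > 0$ shape
Support | $x \in [0, \infty)$
Probability density function | $\dfrac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{\left(1+(x/\alpha)^{\beta}\right)^{2}}$
Cumulative distribution function | $\dfrac{1}{1+(x/\alpha)^{-\beta}}$
Mean | $\dfrac{\alpha\pi/\beta}{\sin(\pi/\beta)}$ if $\beta > 1$, else undefined
Median | $\alpha$
Mode | $\alpha\left(\dfrac{\beta-1}{\beta+1}\right)^{1/\beta}$ if $\beta > 1$, 0 otherwise
Variance | See main text
Entropy | $\ln\alpha - \ln\beta + 2$
MGF | $\sum_{n=0}^{\infty}\dfrac{(\alpha t)^{n}}{n!}\,\mathrm{B}\!\left(1+\tfrac{n}{\beta},\,1-\tfrac{n}{\beta}\right)$ [1] where $\mathrm{B}$ is the Beta function. [2]
CF | $\sum_{n=0}^{\infty}\dfrac{(i\alpha t)^{n}}{n!}\,\mathrm{B}\!\left(1+\tfrac{n}{\beta},\,1-\tfrac{n}{\beta}\right)$ [1] where $\mathrm{B}$ is the Beta function. [2]
Expected shortfall | $\dfrac{\alpha}{1-p}\left[\dfrac{\pi}{\beta}\csc\!\left(\dfrac{\pi}{\beta}\right)-\mathrm{B}_{p}\!\left(\dfrac{1}{\beta}+1,\,1-\dfrac{1}{\beta}\right)\right]$ where $\mathrm{B}_{p}$ is the incomplete beta function. [3]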
In probability and statistics, the log-logistic distribution (known as the Fisk distribution in economics) is a continuous probability distribution for a non-negative random variable. It is used in survival analysis as a parametric model for events whose rate increases initially and decreases later, for example the mortality rate from cancer following diagnosis or treatment. It has also been used in hydrology to model stream flow and precipitation, in economics as a simple model of the distribution of wealth or income, and in networking to model the transmission times of data, taking both the network and the software into account.
The log-logistic distribution is the probability distribution of a random variable whose logarithm has a logistic distribution. It is similar in shape to the log-normal distribution but has heavier tails. Unlike the log-normal, its cumulative distribution function can be written in closed form.
There are several different parameterizations of the distribution in use. The one shown here gives reasonably interpretable parameters and a simple form for the cumulative distribution function. [4] [5] The parameter $\alpha > 0$ is a scale parameter and is also the median of the distribution. The parameter $\beta > 0$ is a shape parameter. The distribution is unimodal when $\beta > 1$ and its dispersion decreases as $\beta$ increases.
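The cumulative distribution function is

$$F(x;\alpha,\beta) = \frac{1}{1+(x/\alpha)^{-\beta}} = \frac{(x/\alpha)^{\beta}}{1+(x/\alpha)^{\beta}} = \frac{x^{\beta}}{\alpha^{\beta}+x^{\beta}},$$

where $x > 0$, $\alpha > 0$, $\beta > 0$.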
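The probability density function is

$$f(x;\alpha,\beta) = \frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{\left(1+(x/\alpha)^{\beta}\right)^{2}}.$$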
An alternative parametrization is given by the pair $\mu, s$ in analogy with the logistic distribution:

$$\mu = \ln\alpha, \qquad s = 1/\beta.$$
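The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the CDF $F(x) = 1/(1+(x/\alpha)^{-\beta})$, the density $f(x) = (\beta/\alpha)(x/\alpha)^{\beta-1}/(1+(x/\alpha)^{\beta})^{2}$, and the correspondence $\mu = \ln\alpha$, $s = 1/\beta$ with the logistic distribution. The function names and the parameter values are illustrative choices.

```python
import math

def loglogistic_cdf(x, alpha, beta):
    """CDF F(x) = 1 / (1 + (x/alpha)**(-beta)) for x > 0, and 0 for x <= 0."""
    if x <= 0:
        return 0.0
    return 1.0 / (1.0 + (x / alpha) ** (-beta))

def loglogistic_pdf(x, alpha, beta):
    """Density for x > 0; this sketch returns 0 for x <= 0 and ignores the boundary value at 0."""
    if x <= 0:
        return 0.0
    u = (x / alpha) ** beta
    return (beta / alpha) * (x / alpha) ** (beta - 1) / (1.0 + u) ** 2

def logistic_cdf(y, mu, s):
    """CDF of the logistic distribution with location mu and scale s."""
    return 1.0 / (1.0 + math.exp(-(y - mu) / s))

# Illustrative parameter values (assumptions, not taken from the text).
alpha, beta = 2.0, 3.0
mu, s = math.log(alpha), 1.0 / beta
x = 1.7

# The density should match a central-difference derivative of the CDF.
h = 1e-6
numeric_pdf = (loglogistic_cdf(x + h, alpha, beta)
               - loglogistic_cdf(x - h, alpha, beta)) / (2 * h)

# The log of a log-logistic variable is logistic, so the two CDFs agree.
cdf_direct = loglogistic_cdf(x, alpha, beta)
cdf_via_logistic = logistic_cdf(math.log(x), mu, s)
```

The last two values agree because $F(x) = 1/(1+e^{-\beta(\ln x - \ln\alpha)})$, which is the logistic CDF evaluated at $\ln x$.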
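The $k$th raw moment exists only when $k < \beta$, when it is given by [6] [7]

$$\operatorname{E}(X^{k}) = \alpha^{k}\,\mathrm{B}\!\left(1-\frac{k}{\beta},\,1+\frac{k}{\beta}\right) = \alpha^{k}\,\frac{k\pi/\beta}{\sin(k\pi/\beta)},$$

where B is the beta function. Expressions for the mean, variance, skewness and kurtosis can be derived from this. Writing $b = \pi/\beta$ for convenience, the mean is

$$\operatorname{E}(X) = \frac{\alpha b}{\sin b}, \quad \beta > 1,$$

and the variance is

$$\operatorname{Var}(X) = \alpha^{2}\left(\frac{2b}{\sin 2b} - \frac{b^{2}}{\sin^{2} b}\right), \quad \beta > 2.$$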
Explicit expressions for the skewness and kurtosis are lengthy. [8] As $\beta$ tends to infinity the mean tends to $\alpha$, the variance and skewness tend to zero and the excess kurtosis tends to 6/5 (see also related distributions below).
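As a rough check of the moment formulas, the sketch below evaluates $\operatorname{E}(X^{k}) = \alpha^{k}\,(k\pi/\beta)/\sin(k\pi/\beta)$ for $k < \beta$ and compares it with the beta-function form, written through gamma functions as $\mathrm{B}(1-k/\beta,\,1+k/\beta) = \Gamma(1-k/\beta)\,\Gamma(1+k/\beta)$ (the denominator is $\Gamma(2) = 1$). The function names and parameter values are illustrative assumptions.

```python
import math

def raw_moment(k, alpha, beta):
    """k-th raw moment alpha**k * (k*pi/beta) / sin(k*pi/beta); requires 0 <= k < beta."""
    if not 0 <= k < beta:
        raise ValueError("the k-th raw moment exists only when k < beta")
    if k == 0:
        return 1.0
    t = k * math.pi / beta
    return alpha ** k * t / math.sin(t)

def raw_moment_via_beta(k, alpha, beta):
    """Same moment via B(1 - k/beta, 1 + k/beta) = Gamma(1 - k/beta) * Gamma(1 + k/beta)."""
    return alpha ** k * math.gamma(1 - k / beta) * math.gamma(1 + k / beta)

def mean(alpha, beta):
    """Mean, defined for beta > 1."""
    return raw_moment(1, alpha, beta)

def variance(alpha, beta):
    """Variance E(X^2) - E(X)^2, defined for beta > 2."""
    return raw_moment(2, alpha, beta) - raw_moment(1, alpha, beta) ** 2

# Illustrative values: as beta grows, the mean approaches alpha and the variance shrinks.
alpha = 2.0
for beta in (2.5, 5.0, 50.0, 1000.0):
    print(beta, mean(alpha, beta), variance(alpha, beta))
```

Computing the variance as $\operatorname{E}(X^{2}) - \operatorname{E}(X)^{2}$ reproduces the expression in terms of $b = \pi/\beta$ given above.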
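The quantile function (inverse cumulative distribution function) is:

$$F^{-1}(p;\alpha,\beta) = \alpha\left(\frac{p}{1-p}\right)^{1/\beta}.$$

It follows that the median is $\alpha$, the lower quartile is $3^{-1/\beta}\alpha$ and the upper quartile is $3^{1/\beta}\alpha$.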
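Because the quantile function has the closed form $F^{-1}(p) = \alpha\,(p/(1-p))^{1/\beta}$, random variates can be drawn by inverse transform sampling. The sketch below is one possible way to do this; the function names, seed, and parameter values are illustrative assumptions.

```python
import random

def loglogistic_quantile(p, alpha, beta):
    """Quantile function alpha * (p / (1 - p))**(1/beta) for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return alpha * (p / (1.0 - p)) ** (1.0 / beta)

def sample_loglogistic(n, alpha, beta, rng):
    """Inverse transform sampling: push uniform draws on [0, 1) through the quantile function."""
    return [loglogistic_quantile(rng.random(), alpha, beta) for _ in range(n)]

# Illustrative parameters and seed (assumptions, not taken from the text).
alpha, beta = 2.0, 3.0
rng = random.Random(0)
samples = sorted(sample_loglogistic(20001, alpha, beta, rng))
sample_median = samples[len(samples) // 2]

lower_quartile = loglogistic_quantile(0.25, alpha, beta)  # equals 3**(-1/beta) * alpha
upper_quartile = loglogistic_quantile(0.75, alpha, beta)  # equals 3**(1/beta) * alpha
```

With a sample of this size the sample median should land close to $\alpha$, the theoretical median.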
The log-logistic distribution provides one parametric model for survival analysis. Unlike the more commonly used Weibull distribution, it can have a non-monotonic hazard function: when $\beta > 1$, the hazard function is unimodal (when $\beta \le 1$, the hazard decreases monotonically). The fact that the cumulative distribution function can be written in closed form is particularly useful for analysis of survival data with censoring. [9] The log-logistic distribution can be used as the basis of an accelerated failure time model by allowing $\alpha$ to differ between groups, or more generally by introducing covariates that affect $\alpha$ but not $\beta$, by modelling $\log(\alpha)$ as a linear function of the covariates. [10]
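The survival function is

$$S(t) = 1 - F(t) = \left[1+(t/\alpha)^{\beta}\right]^{-1},$$

and so the hazard function is

$$h(t) = \frac{f(t)}{S(t)} = \frac{(\beta/\alpha)(t/\alpha)^{\beta-1}}{1+(t/\alpha)^{\beta}}.$$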
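The shape of the hazard can be explored numerically. The sketch below assumes $S(t) = [1+(t/\alpha)^{\beta}]^{-1}$ and $h(t) = (\beta/\alpha)(t/\alpha)^{\beta-1}/(1+(t/\alpha)^{\beta})$. The peak location $t^{*} = \alpha(\beta-1)^{1/\beta}$ for $\beta > 1$ is not stated in the text; it is obtained here by setting the derivative of $h$ to zero, so treat it as a derived assumption. Function names and parameter values are illustrative.

```python
def survival(t, alpha, beta):
    """Survival function S(t) = 1 / (1 + (t/alpha)**beta) for t > 0."""
    return 1.0 / (1.0 + (t / alpha) ** beta)

def hazard(t, alpha, beta):
    """Hazard h(t) = f(t) / S(t) = (beta/alpha) (t/alpha)**(beta-1) / (1 + (t/alpha)**beta), t > 0."""
    return (beta / alpha) * (t / alpha) ** (beta - 1) / (1.0 + (t / alpha) ** beta)

def hazard_peak(alpha, beta):
    """Location of the hazard maximum for beta > 1, from dh/dt = 0: (t/alpha)**beta = beta - 1."""
    if beta <= 1:
        raise ValueError("the hazard is monotonically decreasing when beta <= 1")
    return alpha * (beta - 1) ** (1.0 / beta)

# Illustrative parameters (assumptions): unimodal hazard for beta = 3.
alpha, beta = 2.0, 3.0
t_star = hazard_peak(alpha, beta)
print("peak at", t_star, "hazard", hazard(t_star, alpha, beta))

# For beta <= 1 the hazard only decreases; evaluate it on a grid of positive times.
grid = [0.1 * i for i in range(1, 101)]
decreasing_hazard = [hazard(t, alpha, 0.8) for t in grid]
```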
The log-logistic distribution with shape parameter $\beta = 1$ is the marginal distribution of the inter-times in a geometric-distributed counting process. [11]
The log-logistic distribution has been used in hydrology for modelling stream flow rates and precipitation. [4] [5]
Extreme values like maximum one-day rainfall and river discharge per month or per year often follow a log-normal distribution. [12] The cumulative distribution function of the log-normal distribution, however, has no closed form and must be approximated numerically. Because the log-logistic distribution is similar to the log-normal distribution and can be solved analytically, it can be used instead.
Figure: a log-logistic distribution fitted to ranked maximum one-day October rainfalls, with the 90% confidence belt based on the binomial distribution. The rainfall data are represented by the plotting position r/(n+1) as part of the cumulative frequency analysis.
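One simple way to fit the distribution to ranked data uses the fact that the CDF is linear on the logit scale: $\ln\!\big(F/(1-F)\big) = \beta\ln x - \beta\ln\alpha$. The sketch below regresses the logit of the plotting positions $r/(n+1)$ on $\ln x$ by ordinary least squares. It is an illustrative assumption about how such a fit could be done, not necessarily the procedure used for the rainfall example, and the data are synthetic values built from known parameters (hypothetical, chosen only so the recovered fit can be checked).

```python
import math

def fit_by_plotting_positions(values):
    """Least-squares fit of logit(r/(n+1)) = beta*ln(x) - beta*ln(alpha) to ranked data."""
    xs = sorted(values)
    n = len(xs)
    u = [math.log(x) for x in xs]            # ln of the ranked observations
    v = []
    for r in range(1, n + 1):
        p = r / (n + 1)                      # plotting position of rank r
        v.append(math.log(p / (1.0 - p)))    # logit of the plotting position
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    slope = sxy / sxx
    intercept = v_bar - slope * u_bar
    beta_hat = slope                         # slope estimates the shape parameter
    alpha_hat = math.exp(-intercept / slope) # intercept = -beta * ln(alpha)
    return alpha_hat, beta_hat

# Synthetic data: exact log-logistic quantiles at the plotting positions (hypothetical values).
true_alpha, true_beta, n = 40.0, 2.5, 19
data = [true_alpha * ((r / (n + 1)) / (1.0 - r / (n + 1))) ** (1.0 / true_beta)
        for r in range(1, n + 1)]

alpha_hat, beta_hat = fit_by_plotting_positions(data)
```

Because the synthetic data lie exactly on the fitted line, the estimates reproduce the parameters used to build them; real observations would scatter around the line.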
The log-logistic has been used as a simple model of the distribution of wealth or income in economics, where it is known as the Fisk distribution. [13] Its Gini coefficient is $1/\beta$. [14]
Derivation of Gini coefficient |
---|
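The Gini coefficient for a continuous probability distribution takes the form: $G = \frac{1}{\mu}\int_{0}^{\infty} F(1-F)\,dx$, where $F$ is the CDF of the distribution and $\mu$ is the expected value. For the log-logistic distribution, the formula for the Gini coefficient becomes: $G = \frac{\beta\sin(\pi/\beta)}{\alpha\pi}\int_{0}^{\infty}\frac{dx}{[1+(x/\alpha)^{-\beta}][1+(x/\alpha)^{\beta}]}$. Defining the substitution $z = x/\alpha$ leads to the simpler equation: $G = \frac{\beta\sin(\pi/\beta)}{\pi}\int_{0}^{\infty}\frac{dz}{(1+z^{-\beta})(1+z^{\beta})}$. And making the substitution $u = 1/(1+z^{\beta})$ further simplifies the Gini coefficient formula to: $G = \frac{\sin(\pi/\beta)}{\pi}\int_{0}^{1} u^{-1/\beta}(1-u)^{1/\beta}\,du$. The integral component is equivalent to the standard beta function $\mathrm{B}(1-1/\beta,\,1+1/\beta)$. The beta function may also be written as: $\mathrm{B}(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}$, where $\Gamma(\cdot)$ is the gamma function. Using the properties of the gamma function, it can be shown that: $\mathrm{B}(1-1/\beta,\,1+1/\beta) = \frac{1}{\beta}\,\Gamma(1-1/\beta)\,\Gamma(1/\beta)$. From Euler's reflection formula, the expression can be simplified further: $\mathrm{B}(1-1/\beta,\,1+1/\beta) = \frac{1}{\beta}\,\frac{\pi}{\sin(\pi/\beta)}$. Finally, we may conclude that the Gini coefficient for the log-logistic distribution is $G = 1/\beta$. |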
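The result $G = 1/\beta$ can be checked numerically. The sketch below evaluates $G = \frac{1}{\mu}\int_{0}^{\infty} F(1-F)\,dx$ with a midpoint rule after mapping the half-line to the unit interval through $x = \alpha t/(1-t)$. That change of variable, the function names, and the parameter values are choices made for this illustration only; the mean $\mu = \alpha(\pi/\beta)/\sin(\pi/\beta)$ requires $\beta > 1$.

```python
import math

def loglogistic_cdf(x, alpha, beta):
    """CDF F(x) = 1 / (1 + (x/alpha)**(-beta)) for x > 0."""
    return 1.0 / (1.0 + (x / alpha) ** (-beta))

def gini_numeric(alpha, beta, n=20000):
    """Midpoint-rule estimate of (1/mu) * integral of F(1 - F) over x > 0, for beta > 1."""
    mu = alpha * (math.pi / beta) / math.sin(math.pi / beta)  # mean of the distribution
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n                    # midpoint in (0, 1)
        x = alpha * t / (1.0 - t)            # maps (0, 1) onto (0, infinity)
        jacobian = alpha / (1.0 - t) ** 2    # dx/dt for the change of variable
        F = loglogistic_cdf(x, alpha, beta)
        total += F * (1.0 - F) * jacobian
    return (total / n) / mu

# Illustrative parameters (assumptions, not taken from the text).
alpha, beta = 2.0, 3.0
gini = gini_numeric(alpha, beta)
print(gini, 1.0 / beta)
```

The estimate does not depend on $\alpha$, in line with the closed-form result depending on $\beta$ alone.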
The log-logistic has been used as a model for the time that elapses between the moment some data leaves a user application on a computer and the moment the same application receives the response, after the data has travelled through and been processed by other computers, applications, and network segments, most or all of them without hard real-time guarantees (for example, when an application displays data coming from a remote sensor connected to the Internet). It has been shown to be a more accurate probabilistic model for these times than the log-normal distribution or others, as long as abrupt changes of regime in the sequences of those times are properly detected. [15]
Several different distributions are sometimes referred to as the generalized log-logistic distribution, as they contain the log-logistic as a special case. [14] These include the Burr Type XII distribution (also known as the Singh–Maddala distribution) and the Dagum distribution, both of which include a second shape parameter. Both are in turn special cases of the even more general generalized beta distribution of the second kind. Another more straightforward generalization of the log-logistic is the shifted log-logistic distribution.
Another generalized log-logistic distribution is the log-transform of the metalog distribution, in which power series expansions in terms of $p$ are substituted for logistic distribution parameters $\mu$ and $\sigma$. The resulting log-metalog distribution is highly shape flexible, has a simple closed-form PDF and quantile function, can be fit to data with linear least squares, and subsumes the log-logistic distribution as a special case.