Skew normal distribution

Probability density function: [figure: skew normal densities for several values of the shape parameter]
Cumulative distribution function: [figure: skew normal CDFs for several values of the shape parameter]
Parameters: $\xi$ location (real); $\omega$ scale (positive, real); $\alpha$ shape (real)
Support: $x \in (-\infty, +\infty)$
PDF: $\frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right)$
CDF: $\Phi\!\left(\frac{x-\xi}{\omega}\right) - 2\,T\!\left(\frac{x-\xi}{\omega},\,\alpha\right)$, where $T$ is Owen's T function
Mean: $\xi + \omega\delta\sqrt{2/\pi}$, where $\delta = \frac{\alpha}{\sqrt{1+\alpha^2}}$
Mode: no closed form (see the numerical approximation below)
Variance: $\omega^2\left(1 - \frac{2\delta^2}{\pi}\right)$
Skewness: $\gamma_1 = \frac{4-\pi}{2}\,\frac{\left(\delta\sqrt{2/\pi}\right)^3}{\left(1 - 2\delta^2/\pi\right)^{3/2}}$
Excess kurtosis: $2(\pi - 3)\,\frac{\left(\delta\sqrt{2/\pi}\right)^4}{\left(1 - 2\delta^2/\pi\right)^2}$
MGF: $2\exp\!\left(\xi t + \frac{\omega^2 t^2}{2}\right)\Phi(\omega\delta t)$
CF: $\exp\!\left(i\xi t - \frac{\omega^2 t^2}{2}\right)\left(1 + i\,\operatorname{erfi}\!\left(\frac{\omega\delta t}{\sqrt{2}}\right)\right)$
In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness.


Definition

Let $\phi(x)$ denote the standard normal probability density function

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}},$$

with the cumulative distribution function given by

$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right],$$

where "erf" is the error function. Then the probability density function (pdf) of the skew-normal distribution with parameter $\alpha$ is given by

$$f(x) = 2\,\phi(x)\,\Phi(\alpha x).$$

This distribution was first introduced by O'Hagan and Leonard (1976). [1] Alternative forms of this distribution, with the corresponding quantile function, have been given by Ashour and Abdel-Hameed [2] and by Mudholkar and Hutson. [3]

A stochastic process that underpins the distribution was described by Andel, Netuka and Zvara (1984). [4] Both the distribution and its stochastic process underpinnings were consequences of the symmetry argument developed in Chan and Tong (1986), [5] which applies to multivariate cases beyond normality, e.g. the skew multivariate t distribution and others. The distribution is a particular case of a general class of distributions with probability density functions of the form $2f(x)G(\alpha x)$, where $f$ is any pdf symmetric about zero and $G$ is any cdf whose pdf is symmetric about zero. [6]

To add location and scale parameters to this, one makes the usual transform $x \mapsto \frac{x-\xi}{\omega}$. One can verify that the normal distribution is recovered when $\alpha = 0$, and that the absolute value of the skewness increases as the absolute value of $\alpha$ increases. The distribution is right skewed if $\alpha > 0$ and left skewed if $\alpha < 0$. The probability density function with location $\xi$, scale $\omega$, and parameter $\alpha$ becomes

$$f(x) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right).$$

The skewness ($\gamma_1$) of the distribution is limited to slightly less than the interval $(-1, 1)$ (see Estimation).
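As a concrete check, this density can be coded directly and compared against SciPy's built-in implementation. The sketch below assumes NumPy and SciPy are available; `skew_normal_pdf` is an illustrative helper name, while `scipy.stats.skewnorm` is SciPy's skew-normal distribution with shape parameter `a`:

```python
import numpy as np
from scipy.stats import norm, skewnorm

def skew_normal_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Density (2/omega) * phi(z) * Phi(alpha * z), with z = (x - xi)/omega."""
    z = (x - xi) / omega
    return (2.0 / omega) * norm.pdf(z) * norm.cdf(alpha * z)

x = np.linspace(-3.0, 5.0, 9)

# alpha = 0 recovers the ordinary normal density.
assert np.allclose(skew_normal_pdf(x), norm.pdf(x))

# Agreement with SciPy's skewnorm (shape 'a', location 'loc', scale 'scale').
assert np.allclose(skew_normal_pdf(x, xi=1.0, omega=2.0, alpha=4.0),
                   skewnorm.pdf(x, a=4.0, loc=1.0, scale=2.0))
```

The second assertion also confirms the location-scale convention: SciPy's `loc` and `scale` correspond to $\xi$ and $\omega$ here.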

As has been shown, [7] the mode (maximum) of the distribution is unique. For general $\alpha$ there is no analytic expression for the mode, but a quite accurate (numerical) approximation is:

$$m_o(\alpha) \approx \mu_z - \frac{\gamma_1 \sigma_z}{2} - \frac{\operatorname{sgn}(\alpha)}{2}\exp\!\left(-\frac{2\pi}{|\alpha|}\right),$$

where $\delta = \alpha/\sqrt{1+\alpha^2}$, $\mu_z = \sqrt{2/\pi}\,\delta$, $\sigma_z = \sqrt{1-\mu_z^2}$, and $\gamma_1$ is the skewness given above; the mode of the location-scale form is then $\xi + \omega\, m_o(\alpha)$.
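The quality of this approximation can be checked by brute force. The sketch below (assuming SciPy; the helper names are illustrative) compares the closed-form approximation with a numerically located maximum of the density:

```python
import numpy as np
from scipy.stats import skewnorm
from scipy.optimize import minimize_scalar

def approx_mode(xi, omega, alpha):
    """Closed-form approximation to the skew-normal mode (alpha != 0 assumed)."""
    delta = alpha / np.sqrt(1.0 + alpha**2)
    mu_z = np.sqrt(2.0 / np.pi) * delta
    sigma_z = np.sqrt(1.0 - mu_z**2)
    gamma1 = (4.0 - np.pi) / 2.0 * mu_z**3 / sigma_z**3
    m0 = (mu_z - gamma1 * sigma_z / 2.0
          - np.sign(alpha) / 2.0 * np.exp(-2.0 * np.pi / abs(alpha)))
    return xi + omega * m0

def numeric_mode(xi, omega, alpha):
    """Mode located by maximizing the density on a bounded interval."""
    res = minimize_scalar(
        lambda x: -skewnorm.pdf(x, a=alpha, loc=xi, scale=omega),
        bounds=(xi - 5.0 * omega, xi + 5.0 * omega),
        method="bounded",
    )
    return res.x

# The two agree closely for a moderately skewed case.
assert abs(approx_mode(0.0, 1.0, 4.0) - numeric_mode(0.0, 1.0, 4.0)) < 0.05
```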

Estimation

Maximum likelihood estimates for $\xi$, $\omega$, and $\alpha$ can be computed numerically, but no closed-form expression for the estimates is available unless $\alpha = 0$. In contrast, the method of moments has a closed-form expression, since the skewness equation can be inverted with

$$|\hat\delta| = \sqrt{\frac{\pi}{2}\,\frac{|\hat\gamma_1|^{2/3}}{|\hat\gamma_1|^{2/3} + \left(\frac{4-\pi}{2}\right)^{2/3}}},$$

where $\hat\gamma_1$ is the sample skewness and the sign of $\hat\delta$ is the same as the sign of $\hat\gamma_1$. Consequently, $\hat\alpha = \hat\delta/\sqrt{1-\hat\delta^2}$, $\hat\omega = s/\sqrt{1 - 2\hat\delta^2/\pi}$, and $\hat\xi = \bar{x} - \hat\omega\hat\delta\sqrt{2/\pi}$, where $\bar{x}$ and $s$ are the sample mean and standard deviation. As long as the sample skewness is not too large, these formulas provide method of moments estimates $\hat\alpha$, $\hat\omega$, and $\hat\xi$ based on a sample's $\hat\gamma_1$, $\bar{x}$, and $s$.

The maximum (theoretical) skewness is obtained by setting $\delta = 1$ in the skewness equation, giving $\gamma_1 \approx 0.9953$. However it is possible that the sample skewness is larger, and then $\alpha$ cannot be determined from these equations. When using the method of moments in an automatic fashion, for example to give starting values for maximum likelihood iteration, one should therefore let (for example) $|\hat\gamma_1| = \min(0.99, |\hat\gamma_1|)$.
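The moment inversions above can be sketched in a few lines of Python (helper names are illustrative). The snippet computes the theoretical mean, standard deviation, and skewness for a known parameter set and then round-trips them back to the parameters, including the skewness clip discussed above:

```python
import numpy as np

def skewnorm_moments(xi, omega, alpha):
    """Theoretical mean, standard deviation, and skewness of SN(xi, omega, alpha)."""
    delta = alpha / np.sqrt(1.0 + alpha**2)
    mu_z = np.sqrt(2.0 / np.pi) * delta          # mean of the standardized variable
    mean = xi + omega * mu_z
    sd = omega * np.sqrt(1.0 - mu_z**2)          # since var = omega^2 (1 - 2 delta^2/pi)
    g1 = (4.0 - np.pi) / 2.0 * mu_z**3 / (1.0 - mu_z**2) ** 1.5
    return mean, sd, g1

def mom_estimates(mean, sd, g1):
    """Method-of-moments estimates (xi, omega, alpha) from mean, sd, skewness."""
    # Clip |g1| below the attainable maximum (~0.9953) so the inversion is defined.
    g1 = np.sign(g1) * min(abs(g1), 0.99)
    r = abs(g1) ** (2.0 / 3.0)
    delta = np.sign(g1) * np.sqrt(
        (np.pi / 2.0) * r / (r + ((4.0 - np.pi) / 2.0) ** (2.0 / 3.0)))
    alpha = delta / np.sqrt(1.0 - delta**2)
    mu_z = np.sqrt(2.0 / np.pi) * delta
    omega = sd / np.sqrt(1.0 - mu_z**2)
    xi = mean - omega * mu_z
    return xi, omega, alpha

# Round trip: exact moments of SN(1, 2, 4/3) recover the parameters.
assert np.allclose(mom_estimates(*skewnorm_moments(1.0, 2.0, 4.0 / 3.0)),
                   [1.0, 2.0, 4.0 / 3.0])
```

In practice `mom_estimates` would be fed the sample's $\bar{x}$, $s$, and $\hat\gamma_1$; the clip guards the automatic use described above when the sample skewness exceeds the theoretical maximum.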

Concern has been expressed about the impact of skew normal methods on the reliability of inferences based upon them. [8]

The exponentially modified normal distribution is another 3-parameter distribution that generalizes the normal distribution to skewed cases. The skew normal still has a normal-like tail in the direction of the skew, with a shorter tail in the other direction; that is, its density is asymptotically proportional to $e^{-kx^2}$ for some positive $k$. Thus, in terms of the seven states of randomness, it shows "proper mild randomness". In contrast, the exponentially modified normal has an exponential tail in the direction of the skew; its density there is asymptotically proportional to $e^{-kx}$. In the same terms, it shows "borderline mild randomness".
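This tail difference is easy to observe numerically. The sketch below (assuming SciPy, whose `exponnorm` is the exponentially modified normal) takes second differences of the log-density far out in the skewed tail: a Gaussian-type tail ($\log$-density $\sim -kx^2$) gives a constant negative value, while an exponential tail ($\log$-density $\sim -kx$) gives a value near zero:

```python
import numpy as np
from scipy.stats import skewnorm, exponnorm

# Second differences of the log-density over unit steps expose the tail law.
x = np.array([10.0, 11.0, 12.0])
d2_skew = np.diff(skewnorm.logpdf(x, a=3.0), n=2)[0]
d2_emg = np.diff(exponnorm.logpdf(x, K=3.0), n=2)[0]

assert -1.05 < d2_skew < -0.95   # quadratic log-decay (k = 1/2 at unit scale)
assert abs(d2_emg) < 1e-3        # linear log-decay in the skewed direction
```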

Thus, the skew normal is useful for modeling skewed distributions which nevertheless have no more outliers than the normal, while the exponentially modified normal is useful for cases with an increased incidence of outliers in (just) one direction.


References

  1. O'Hagan, A.; Leonard, T. (1976). "Bayes estimation subject to uncertainty about parameter constraints". Biometrika. 63 (1): 201–203. doi:10.1093/biomet/63.1.201.
  2. Ashour, S. K.; Abdel-Hameed, M. A. (2010). "Approximate skew normal distribution". Journal of Advanced Research. 1 (4): 341–350. doi:10.1016/j.jare.2010.06.004.
  3. Mudholkar, G. S.; Hutson, A. D. (2000). "The epsilon–skew–normal distribution for analyzing near-normal data". Journal of Statistical Planning and Inference. 83 (2): 291–309. doi:10.1016/s0378-3758(99)00096-8.
  4. Andel, J.; Netuka, I.; Zvara, K. (1984). "On threshold autoregressive processes". Kybernetika. 20: 89–106.
  5. Chan, K. S.; Tong, H. (1986). "A note on certain integral equations associated with non-linear time series analysis". Probability Theory and Related Fields. 73 (1): 153–158. doi:10.1007/bf01845999.
  6. Azzalini, A. (1985). "A class of distributions which includes the normal ones". Scandinavian Journal of Statistics. 12: 171–178.
  7. Azzalini, A.; Capitanio, A. (2014). The Skew-Normal and Related Families. pp. 32–33. ISBN 978-1-107-02927-9.
  8. Pewsey, A. (2000). "Problems of inference for Azzalini's skew-normal distribution". Journal of Applied Statistics. 27 (7): 859–870.