L-moment

In statistics, L-moments are a sequence of statistics used to summarize the shape of a probability distribution. [1] [2] [3] [4] They are linear combinations of order statistics (L-statistics) analogous to conventional moments, and can be used to calculate quantities analogous to standard deviation, skewness and kurtosis, termed the L-scale, L-skewness and L-kurtosis respectively (the L-mean is identical to the conventional mean). Standardised L-moments are called L-moment ratios and are analogous to standardized moments. Just as for conventional moments, a theoretical distribution has a set of population L-moments. Sample L-moments can be defined for a sample from the population, and can be used as estimators of the population L-moments.

Population L-moments

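For a random variable X, the rth population L-moment is [1]

$$\lambda_r = \frac{1}{r} \sum_{k=0}^{r-1} (-1)^k \binom{r-1}{k} \operatorname{E}[X_{r-k:r}],$$

where $X_{k:n}$ denotes the kth order statistic (kth smallest value) in an independent sample of size n from the distribution of X and $\operatorname{E}$ denotes the expected value operator. In particular, the first four population L-moments are

$$\begin{aligned}
\lambda_1 &= \operatorname{E}[X] \\
\lambda_2 &= \tfrac{1}{2}\left(\operatorname{E}[X_{2:2}] - \operatorname{E}[X_{1:2}]\right) \\
\lambda_3 &= \tfrac{1}{3}\left(\operatorname{E}[X_{3:3}] - 2\operatorname{E}[X_{2:3}] + \operatorname{E}[X_{1:3}]\right) \\
\lambda_4 &= \tfrac{1}{4}\left(\operatorname{E}[X_{4:4}] - 3\operatorname{E}[X_{3:4}] + 3\operatorname{E}[X_{2:4}] - \operatorname{E}[X_{1:4}]\right)
\end{aligned}$$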

Note that the coefficients of the rth L-moment are the same as in the rth term of the binomial transform, as used in the r-order finite difference (finite analog to the derivative).
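As a worked illustration of the definition, the following minimal sketch evaluates the first four population L-moments of the standard uniform distribution exactly. It relies on the standard result that the kth smallest of n independent Uniform(0, 1) draws has mean k/(n + 1); the function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from math import comb


def uniform_order_stat_mean(k, n):
    """E[X_{k:n}] for Uniform(0, 1): the k-th smallest of n draws has mean k/(n+1)."""
    return Fraction(k, n + 1)


def population_l_moment(r, order_stat_mean):
    """lambda_r = (1/r) * sum_{k=0}^{r-1} (-1)^k * C(r-1, k) * E[X_{r-k:r}]."""
    total = sum(
        (-1) ** k * comb(r - 1, k) * order_stat_mean(r - k, r)
        for k in range(r)
    )
    return Fraction(total) / r


# First four population L-moments of Uniform(0, 1), as exact fractions.
uniform_l_moments = [
    population_l_moment(r, uniform_order_stat_mean) for r in range(1, 5)
]
print(uniform_l_moments)
```

The result is 1/2, 1/6, 0 and 0, so the uniform distribution on (0, 1) has L-scale 1/6 and vanishing third and fourth L-moments.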

The first two of these L-moments have conventional names:

λ1 is the "mean", "L-mean", or "L-location",
λ2 is the "L-scale".

The L-scale is equal to half the mean absolute difference. [5]

Sample L-moments

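The sample L-moments can be computed as the population L-moments of the sample, summing over all r-element subsets of the sample and then averaging by dividing by the binomial coefficient:

$$\ell_r = \binom{n}{r}^{-1} \sum_{1 \le i_1 < \cdots < i_r \le n} \frac{1}{r} \sum_{k=0}^{r-1} (-1)^k \binom{r-1}{k}\, x_{(i_{r-k})},$$

where $x_{(i)}$ is the ith smallest value in the sample.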

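Grouping these by order statistic counts the number of ways an element of an n-element sample can be the jth element of an r-element subset, and yields formulas of the form below. Direct estimators for the first four L-moments in a finite sample of n observations are: [6]

$$\begin{aligned}
\ell_1 &= \binom{n}{1}^{-1} \sum_{i=1}^{n} x_{(i)} \\
\ell_2 &= \frac{1}{2} \binom{n}{2}^{-1} \sum_{i=1}^{n} \left[ \binom{i-1}{1} - \binom{n-i}{1} \right] x_{(i)} \\
\ell_3 &= \frac{1}{3} \binom{n}{3}^{-1} \sum_{i=1}^{n} \left[ \binom{i-1}{2} - 2\binom{i-1}{1}\binom{n-i}{1} + \binom{n-i}{2} \right] x_{(i)} \\
\ell_4 &= \frac{1}{4} \binom{n}{4}^{-1} \sum_{i=1}^{n} \left[ \binom{i-1}{3} - 3\binom{i-1}{2}\binom{n-i}{1} + 3\binom{i-1}{1}\binom{n-i}{2} - \binom{n-i}{3} \right] x_{(i)}
\end{aligned}$$

where $x_{(i)}$ is the ith order statistic and $\binom{\cdot}{\cdot}$ is a binomial coefficient. Sample L-moments can also be defined indirectly in terms of probability weighted moments, [1] [7] [8] which leads to a more efficient algorithm for their computation. [6] [9]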
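The direct estimators can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name is hypothetical, and the weights are the binomial-coefficient combinations of the number of observations below (i − 1) and above (n − i) each order statistic.

```python
from math import comb


def sample_l_moments(data):
    """Direct estimators of the first four sample L-moments (needs n >= 4)."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are required")
    l1 = l2 = l3 = l4 = 0.0
    for i, xi in enumerate(x, start=1):
        lo, hi = i - 1, n - i  # observations below / above x_(i)
        l1 += xi
        l2 += (comb(lo, 1) - comb(hi, 1)) * xi
        l3 += (comb(lo, 2) - 2 * comb(lo, 1) * comb(hi, 1) + comb(hi, 2)) * xi
        l4 += (
            comb(lo, 3)
            - 3 * comb(lo, 2) * comb(hi, 1)
            + 3 * comb(lo, 1) * comb(hi, 2)
            - comb(hi, 3)
        ) * xi
    return (
        l1 / comb(n, 1),
        l2 / (2 * comb(n, 2)),
        l3 / (3 * comb(n, 3)),
        l4 / (4 * comb(n, 4)),
    )


print(sample_l_moments([5, 3, 1, 4, 2]))
```

For the sample 1, 2, 3, 4, 5 this gives a mean of 3, an L-scale of 1 (half the mean absolute difference of 2), and third and fourth sample L-moments of 0.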

L-moment ratios

A set of L-moment ratios, or scaled L-moments, is defined by

$$\tau_r = \lambda_r / \lambda_2, \qquad r = 3, 4, \dots$$

The most useful of these are τ3, called the L-skewness, and τ4, the L-kurtosis.

L-moment ratios lie within the interval (−1, 1). Tighter bounds can be found for some specific L-moment ratios; in particular, the L-kurtosis τ4 lies in [−1/4, 1), and

$$\tfrac{1}{4}\left(5\tau_3^2 - 1\right) \le \tau_4 < 1.$$ [1]

A quantity analogous to the coefficient of variation, but based on L-moments, can also be defined: τ = λ2/λ1, which is called the "coefficient of L-variation", or "L-CV". For a non-negative random variable, this lies in the interval (0, 1) [1] and is identical to the Gini coefficient. [10]
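The ratios and the bound on the L-kurtosis can be illustrated with exact arithmetic. The sketch below uses the exponential distribution, for which this article's table of common distributions gives λ1 = 1/λ, λ2 = 1/(2λ), τ3 = 1/3 and τ4 = 1/6; the rate value and the function names are assumptions made only for the example.

```python
from fractions import Fraction


def l_moment_ratios(l1, l2, l3, l4):
    """Return (L-CV, L-skewness, L-kurtosis) from the first four L-moments."""
    if l2 <= 0:
        raise ValueError("the L-scale must be positive")
    l_cv = l2 / l1 if l1 != 0 else None  # L-CV is undefined for a zero mean
    return l_cv, l3 / l2, l4 / l2


def within_bounds(tau3, tau4):
    """Check -1 < tau3 < 1 and (5 * tau3^2 - 1) / 4 <= tau4 < 1."""
    return -1 < tau3 < 1 and (5 * tau3 ** 2 - 1) / 4 <= tau4 < 1


# Exponential distribution with an illustrative rate of 2.
rate = Fraction(2)
l1 = 1 / rate
l2 = 1 / (2 * rate)
l3 = l2 * Fraction(1, 3)  # lambda_3 = tau_3 * lambda_2
l4 = l2 * Fraction(1, 6)  # lambda_4 = tau_4 * lambda_2

l_cv, tau3, tau4 = l_moment_ratios(l1, l2, l3, l4)
print(l_cv, tau3, tau4, within_bounds(tau3, tau4))
```

The L-CV comes out as 1/2 regardless of the rate, which agrees with the Gini coefficient of an exponential distribution.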

L-moments are statistical quantities that are derived from probability weighted moments [11] (PWM), which were defined earlier (1979). [7] PWM are used to efficiently estimate the parameters of distributions expressible in inverse form, such as the Gumbel, [8] the Tukey lambda, and the Wakeby distributions.
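The PWM route to sample L-moments can be sketched as follows. The sketch assumes the commonly used unbiased estimator b_r = n⁻¹ Σ C(i−1, r)/C(n−1, r) · x(i) and the standard linear relations between the first four L-moments and b0 to b3; the function names are hypothetical.

```python
from math import comb


def sample_pwms(data, count=4):
    """Unbiased sample probability weighted moments b_0 .. b_{count-1}."""
    x = sorted(data)
    n = len(x)
    if n < count:
        raise ValueError("need at least `count` observations")
    return [
        sum(comb(i - 1, r) * xi for i, xi in enumerate(x, start=1))
        / (n * comb(n - 1, r))
        for r in range(count)
    ]


def l_moments_from_pwms(b):
    """First four sample L-moments as linear combinations of b_0 .. b_3."""
    b0, b1, b2, b3 = b
    return (
        b0,
        2 * b1 - b0,
        6 * b2 - 6 * b1 + b0,
        20 * b3 - 30 * b2 + 12 * b1 - b0,
    )


print(l_moments_from_pwms(sample_pwms([1, 2, 3, 4, 5])))
```

After sorting, each b_r is a single weighted sum over the sample, which is why this route is usually cheaper than summing over subsets.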

Usage

There are two common ways that L-moments are used, in both cases analogously to the conventional moments:

  1. As summary statistics for data.
  2. To derive estimators for the parameters of probability distributions, applying the method of moments to the L-moments rather than conventional moments.

Both tasks can also be carried out with conventional moments, and parameter estimation is more commonly done using maximum likelihood methods; however, using L-moments provides a number of advantages. Specifically, L-moments are more robust than conventional moments, and existence of higher L-moments only requires that the random variable have finite mean. One disadvantage of L-moment ratios for estimation is their typically smaller sensitivity. For instance, the Laplace distribution has a kurtosis of 6 and weak exponential tails, but a larger 4th L-moment ratio than, for example, the Student's t distribution with 3 degrees of freedom, which has an infinite kurtosis and much heavier tails.
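A minimal sketch of the second use, the method of L-moments, is given below. It inverts the expressions for λ1 and λ2 listed in this article's table of common distributions (normal: λ1 = μ, λ2 = σ/√π; Gumbel: λ1 = μ + γe β, λ2 = β ln 2). The function names are hypothetical, and the sketch is an illustration rather than a tuned estimator.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant


def first_two_sample_l_moments(data):
    """Sample mean and sample L-scale from the direct estimators."""
    x = sorted(data)
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    l1 = sum(x) / n
    # (i - 1) - (n - i) = 2i - n - 1, and 2 * C(n, 2) = n * (n - 1)
    l2 = sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1)) / (n * (n - 1))
    return l1, l2


def fit_normal(l1, l2):
    """Solve lambda_1 = mu and lambda_2 = sigma / sqrt(pi) for (mu, sigma)."""
    return l1, math.sqrt(math.pi) * l2


def fit_gumbel(l1, l2):
    """Solve lambda_1 = mu + gamma_e * beta and lambda_2 = beta * ln 2."""
    beta = l2 / math.log(2)
    mu = l1 - EULER_GAMMA * beta
    return mu, beta


l1, l2 = first_two_sample_l_moments([2.1, 3.4, 1.9, 5.6, 2.8, 4.0])
print(fit_normal(l1, l2), fit_gumbel(l1, l2))
```

In practice the sample L-moments are substituted for the population values, exactly as sample moments are in the conventional method of moments.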

As an example, consider a dataset with a few data points and one outlying data value. The ordinary standard deviation of this dataset is highly influenced by that one point, whereas the L-scale is far less sensitive to it. Consequently, L-moments are far more meaningful than conventional moments when dealing with outliers in data. However, other methods are better suited to achieving even higher robustness than simply replacing moments by L-moments. One example of this limitation arises when L-moments are used as summary statistics in extreme value theory (EVT). This application shows the limited robustness of L-moments: L-statistics are not resistant statistics, as a single extreme value can throw them off, but because they are only linear (not higher-order statistics), they are less affected by extreme values than conventional moments.
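The effect can be seen on a small made-up dataset. The sketch below, with hypothetical names and illustrative data, replaces the largest of ten evenly spaced values by an outlier and compares how much the standard deviation and the sample L-scale are inflated.

```python
import statistics


def sample_l_scale(data):
    """Sample L-scale: half the mean absolute difference over distinct pairs."""
    x = sorted(data)
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    return sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1)) / (n * (n - 1))


clean = list(range(1, 11))              # 1, 2, ..., 10
contaminated = clean[:-1] + [100]       # largest value replaced by an outlier

sd_inflation = statistics.stdev(contaminated) / statistics.stdev(clean)
l_scale_inflation = sample_l_scale(contaminated) / sample_l_scale(clean)
print(f"standard deviation grows by a factor of {sd_inflation:.2f}")
print(f"L-scale grows by a factor of {l_scale_inflation:.2f}")
```

For this data the standard deviation grows roughly tenfold and the L-scale roughly sixfold: the L-scale is less affected, but it is still pulled by the single extreme value, in line with the remark that L-statistics are not resistant.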

Another advantage L-moments have over conventional moments is that their existence only requires the random variable to have finite mean, so the L-moments exist even if the higher conventional moments do not exist (for example, for Student's t distribution with low degrees of freedom). In addition, a finite variance is required for the standard errors of estimates of the L-moments to be finite. [1]

Some appearances of L-moments in the statistical literature include the book by David & Nagaraja (2003, Section 9.9) [12] and a number of papers. [10] [13] [14] [15] [16] [17] A number of favourable comparisons of L-moments with ordinary moments have been reported. [18] [19]

Values for some common distributions

The table below gives expressions for the first two L-moments and numerical values of the first two L-moment ratios of some common continuous probability distributions with constant L-moment ratios. [1] [5] More complex expressions have been derived for some further distributions for which the L-moment ratios vary with one or more of the distributional parameters, including the log-normal, gamma, generalized Pareto, generalized extreme value, and generalized logistic distributions. [1]

| Distribution | Parameters | Mean, λ1 | L-scale, λ2 | L-skewness, τ3 | L-kurtosis, τ4 |
|---|---|---|---|---|---|
| Uniform | a, b | (a + b)/2 | (b − a)/6 | 0 | 0 |
| Logistic | μ, s | μ | s | 0 | 1/6 = 0.1667 |
| Normal | μ, σ² | μ | σ/√π | 0 | 30 θm/π − 9 = 0.1226 |
| Laplace | μ, b | μ | 3b/4 | 0 | 1/(3√2) = 0.2357 |
| Student's t, 2 d.f. | ν = 2 | 0 | π/(2√2) = 1.111 | 0 | 3/8 = 0.375 |
| Student's t, 4 d.f. | ν = 4 | 0 | 15π/64 = 0.7363 | 0 | 111/512 = 0.2168 |
| Exponential | λ | 1/λ | 1/(2λ) | 1/3 = 0.3333 | 1/6 = 0.1667 |
| Gumbel | μ, β | μ + γe β | β ln 2 | 2 log₂(3) − 3 = 0.1699 | 16 − 10 log₂(3) = 0.1504 |

The notation for the parameters of each distribution is the same as that used in the article on that distribution. In the expression for the mean of the Gumbel distribution, γe is the Euler–Mascheroni constant 0.5772 1566 4901 ... . In the L-kurtosis of the normal distribution, θm = arctan √2.
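The closed-form entries of the table can be checked numerically. The short sketch below evaluates several of them with Python's standard library, taking θm = arctan √2 in the normal distribution's L-kurtosis; the dictionary keys are hypothetical labels used only here.

```python
import math

theta_m = math.atan(math.sqrt(2))  # angle appearing in the normal L-kurtosis

table_values = {
    "normal_tau4": 30 * theta_m / math.pi - 9,
    "laplace_tau4": 1 / (3 * math.sqrt(2)),
    "student_t2_lambda2": math.pi / (2 * math.sqrt(2)),
    "student_t4_lambda2": 15 * math.pi / 64,
    "student_t4_tau4": 111 / 512,
    "gumbel_tau3": 2 * math.log2(3) - 3,
    "gumbel_tau4": 16 - 10 * math.log2(3),
}

for name, value in table_values.items():
    print(f"{name}: {value:.4f}")
```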

Extensions

Trimmed L-moments are generalizations of L-moments that give zero weight to extreme observations. They are therefore more robust to the presence of outliers, and unlike L-moments they may be well-defined for distributions for which the mean does not exist, such as the Cauchy distribution. [20]
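The zero weight given to extreme observations can be made concrete with a small sketch. It assumes symmetric trimming of t observations from each end, with the rth trimmed L-moment defined as r⁻¹ Σ (−1)^k C(r−1, k) E[X(r+t−k : r+2t)], and it estimates each expected order statistic by averaging over subsamples. The function names and the data are hypothetical.

```python
from math import comb


def order_stat_mean_estimate(x_sorted, k, m):
    """Estimate E[X_{k:m}] by averaging the k-th smallest value over all
    m-element subsets of the sorted sample."""
    n = len(x_sorted)
    total = sum(
        comb(i - 1, k - 1) * comb(n - i, m - k) * xi
        for i, xi in enumerate(x_sorted, start=1)
    )
    return total / comb(n, m)


def sample_tl_moment(data, r, t=1):
    """Sample trimmed L-moment of order r with t values trimmed at each end.
    With t = 0 this reduces to the ordinary sample L-moment."""
    x = sorted(data)
    m = r + 2 * t
    if len(x) < m:
        raise ValueError("sample too small for this order and trimming")
    return sum(
        (-1) ** k * comb(r - 1, k) * order_stat_mean_estimate(x, r + t - k, m)
        for k in range(r)
    ) / r


data = [1, 2, 3, 4, 50]
print(sample_tl_moment(data, r=1, t=0))  # ordinary mean, pulled up by 50
print(sample_tl_moment(data, r=1, t=1))  # trimmed L-mean, ignores 1 and 50
```

With t = 1 the smallest and largest observations receive weight zero, so the trimmed L-mean of this sample is 3, whereas the ordinary mean is 12.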

References

  1. Hosking, J.R.M. (1990). "L-moments: analysis and estimation of distributions using linear combinations of order statistics". Journal of the Royal Statistical Society, Series B. 52 (1): 105–124. JSTOR 2345653.
  2. Hosking, J.R.M. (1992). "Moments or L moments? An example comparing two measures of distributional shape". The American Statistician. 46 (3): 186–189. doi:10.2307/2685210. JSTOR 2685210.
  3. Hosking, J.R.M. (2006). "On the characterization of distributions by their L-moments". Journal of Statistical Planning and Inference. 136: 193–198. doi:10.1016/j.jspi.2004.06.004.
  4. Asquith, W.H. (2011). Distributional analysis with L-moment statistics using the R environment for statistical computing. Create Space Independent Publishing Platform [print-on-demand]. ISBN 1-463-50841-7.
  5. Jones, M.C. (2002). "Student's simplest distribution". Journal of the Royal Statistical Society, Series D. 51 (1): 41–49. doi:10.1111/1467-9884.00297. JSTOR 3650389.
  6. Wang, Q.J. (1996). "Direct sample estimators of L-moments". Water Resources Research. 32 (12): 3617–3619. doi:10.1029/96WR02675.
  7. Greenwood, J.A.; Landwehr, J.M.; Matalas, N.C.; Wallis, J.R. (1979). "Probability weighted moments: Definition and relation to parameters of several distributions expressed in inverse form" (PDF). Water Resources Research. 15 (5): 1049–1054. doi:10.1029/WR015i005p01049. S2CID 121955257. Archived from the original (PDF) on 2020-02-10.
  8. Landwehr, J.M.; Matalas, N.C.; Wallis, J.R. (1979). "Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles". Water Resources Research. 15 (5): 1055–1064. doi:10.1029/WR015i005p01055.
  9. "L moments". NIST Dataplot. itl.nist.gov (documentation). National Institute of Standards and Technology. 6 January 2006. Retrieved 19 January 2013.
  10. Valbuena, R.; Maltamo, M.; Mehtätalo, L.; Packalen, P. (2017). "Key structural features of Boreal forests may be detected directly using L-moments from airborne lidar data". Remote Sensing of Environment. 194: 437–446. doi:10.1016/j.rse.2016.10.024.
  11. Hosking, J.R.M.; Wallis, J.R. (2005). Regional Frequency Analysis: An Approach Based on L-moments. Cambridge University Press. p. 3. ISBN 978-0521019408. Retrieved 22 January 2013.
  12. David, H.A.; Nagaraja, H.N. (2003). Order Statistics (3rd ed.). Wiley. ISBN 978-0-471-38926-2.
  13. Serfling, R.; Xiao, P. (2007). "A contribution to multivariate L-moments: L-comoment matrices". Journal of Multivariate Analysis. 98 (9): 1765–1781. CiteSeerX 10.1.1.62.4288. doi:10.1016/j.jmva.2007.01.008.
  14. Delicado, P.; Goria, M.N. (2008). "A small sample comparison of maximum likelihood, moments and L-moments methods for the asymmetric exponential power distribution". Computational Statistics & Data Analysis. 52 (3): 1661–1673. doi:10.1016/j.csda.2007.05.021.
  15. Alkasasbeh, M.R.; Raqab, M.Z. (2009). "Estimation of the generalized logistic distribution parameters: comparative study". Statistical Methodology. 6 (3): 262–279. doi:10.1016/j.stamet.2008.10.001.
  16. Jones, M.C. (2004). "On some expressions for variance, covariance, skewness and L-moments". Journal of Statistical Planning and Inference. 126 (1): 97–106. doi:10.1016/j.jspi.2003.09.001.
  17. Jones, M.C. (2009). "Kumaraswamy's distribution: A beta-type distribution with some tractability advantages". Statistical Methodology. 6 (1): 70–81. doi:10.1016/j.stamet.2008.04.001.
  18. Royston, P. (1992). "Which measures of skewness and kurtosis are best?". Statistics in Medicine. 11 (3): 333–343. doi:10.1002/sim.4780110306. PMID 1609174.
  19. Ulrych, T.J.; Velis, D.R.; Woodbury, A.D.; Sacchi, M.D. (2000). "L-moments and C-moments". Stochastic Environmental Research and Risk Assessment. 14 (1): 50–68. doi:10.1007/s004770050004. S2CID 120542594.
  20. Elamir, Elsayed A.H.; Seheult, Allan H. (2003). "Trimmed L-moments". Computational Statistics & Data Analysis. 43 (3): 299–314. doi:10.1016/S0167-9473(02)00250-5.