Extreme value theory

Extreme value theory is used to model the risk of extreme, rare events, such as the 1755 Lisbon earthquake.

Extreme value theory or extreme value analysis (EVA) is a branch of statistics dealing with the extreme deviations from the median of probability distributions. It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any previously observed. Extreme value analysis is widely used in many disciplines, such as structural engineering, finance, economics, earth sciences, traffic prediction, and geological engineering. For example, EVA might be used in the field of hydrology to estimate the probability of an unusually large flooding event, such as the 100-year flood. Similarly, for the design of a breakwater, a coastal engineer would seek to estimate the 50-year wave and design the structure accordingly.


Data analysis

Two main approaches exist for practical extreme value analysis.

The first method relies on deriving block maxima (minima) series as a preliminary step. In many situations it is customary and convenient to extract the annual maxima (minima), generating an annual maxima series (AMS).

The second method relies on extracting, from a continuous record, the peak values reached during any period in which values exceed a certain threshold (or fall below a certain threshold). This method is generally referred to as the peak over threshold (POT) method. [1]

For AMS data, the analysis may partly rely on the results of the Fisher–Tippett–Gnedenko theorem, leading to the generalized extreme value distribution being selected for fitting. [2] [3] However, in practice, various procedures are applied to select between a wider range of distributions. The theorem here relates to the limiting distributions for the minimum or the maximum of a very large collection of independent random variables from the same distribution. Given that the number of relevant random events within a year may be rather limited, it is unsurprising that analyses of observed AMS data often lead to distributions other than the generalized extreme value distribution (GEVD) being selected. [4]
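As an illustration, here is a minimal sketch of fitting a GEV distribution to an annual maxima series with SciPy's genextreme; the data below are simulated purely for demonstration:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(42)
# Hypothetical AMS, e.g. yearly peak river discharges; here simply simulated.
annual_maxima = rng.gumbel(loc=1000.0, scale=150.0, size=60)

# Note: SciPy's shape parameter c corresponds to -xi in the usual GEV notation.
c, loc, scale = genextreme.fit(annual_maxima)

# Estimated 100-year return level: the level exceeded with probability 1/100 in any year.
level_100yr = genextreme.isf(1.0 / 100.0, c, loc=loc, scale=scale)
print(c, loc, scale, level_100yr)
```

In practice the GEVD is only one candidate model, as noted above, and diagnostic checks such as return-level or quantile plots are used to assess the fit.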

For POT data, the analysis may involve fitting two distributions: One for the number of events in a time period considered and a second for the size of the exceedances.

A common assumption for the first is the Poisson distribution, with the generalized Pareto distribution being used for the exceedances. A tail-fitting can be based on the Pickands–Balkema–de Haan theorem. [5] [6]
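A minimal sketch of this two-part POT analysis with SciPy, assuming a hypothetical daily series (the data and the threshold choice are illustrative only):

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(7)
daily_series = rng.lognormal(mean=0.0, sigma=0.8, size=365 * 30)  # 30 "years" of data

threshold = np.quantile(daily_series, 0.98)               # illustrative threshold choice
excesses = daily_series[daily_series > threshold] - threshold

# Poisson part: mean number of exceedances per year.
events_per_year = excesses.size / 30.0

# GPD part: fit the excess sizes, with location fixed at 0 by construction.
c, _, scale = genpareto.fit(excesses, floc=0.0)

# Probability that an exceedance overshoots the threshold by more than 5 units.
print(events_per_year, c, scale, genpareto.sf(5.0, c, loc=0.0, scale=scale))
```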

Novak (2011) reserves the term "POT method" to the case where the threshold is non-random, and distinguishes it from the case where one deals with exceedances of a random threshold. [7]

Applications

Applications of extreme value theory include predicting the probability distribution of extreme events in fields as varied as climate science, ecology, athletic world records, road-traffic safety, wireless communications, epidemiology, and solar-energy engineering.

History

The field of extreme value theory was pioneered by L. Tippett (1902–1985). Tippett was employed by the British Cotton Industry Research Association, where he worked to make cotton thread stronger. In his studies, he realized that the strength of a thread was controlled by the strength of its weakest fibres. With the help of R.A. Fisher, Tippett obtained three asymptotic limits describing the distributions of extremes, assuming independent variables. E.J. Gumbel (1958) [22] codified this theory. These results can be extended to allow for slight correlations between variables, but the classical theory does not extend to strong correlations of the order of the variance. One universality class of particular interest is that of log-correlated fields, where the correlations decay logarithmically with the distance.

Univariate theory

The theory for extreme values of a single variable is governed by the extreme value theorem, also called the Fisher–Tippett–Gnedenko theorem, which describes which of three possible limiting distributions for extreme values applies to a particular statistical variable. The theorem is summarized in this section.

Let $X_1, \dots, X_n$ be a sample of independent and identically distributed random variables with cumulative distribution function $F$, and let $M_n = \max(X_1, \dots, X_n)$ denote the sample maximum.

In theory, the exact distribution of the maximum can be derived:

$$\Pr(M_n \le z) = \Pr(X_1 \le z, \dots, X_n \le z) = \prod_{i=1}^{n} \Pr(X_i \le z) = F(z)^n.$$

The value of the associated indicator function $I_n = \mathbb{1}\{M_n > z\}$ is a Bernoulli process with a success probability $p(z) = 1 - F(z)^n$ that depends on the magnitude $z$ of the extreme event. The number of extreme events within $n$ trials thus follows a binomial distribution, and the number of trials until an event occurs follows a geometric distribution with expected value and standard deviation of the same order $O\!\left(1/p(z)\right)$.
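The product formula and the geometric waiting time can be checked numerically; here is a small sketch assuming i.i.d. standard exponential variables:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, z = 30, 4.0

maxima = rng.exponential(size=(200_000, n)).max(axis=1)   # 200,000 blocks of size n

# Monte Carlo estimate of Pr(M_n <= z) versus the exact value F(z)^n.
print((maxima <= z).mean(), stats.expon.cdf(z) ** n)

# Mean gap between exceedances versus the geometric mean 1/p(z).
p = 1.0 - stats.expon.cdf(z) ** n
idx = np.flatnonzero(maxima > z)
print(np.diff(idx).mean(), 1.0 / p)
```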

In practice, we might not have the distribution function $F$, but the Fisher–Tippett–Gnedenko theorem provides an asymptotic result. If there exist sequences of paired constants $a_n > 0$ and $b_n$ such that

$$\Pr\!\left(\frac{M_n - b_n}{a_n} \le z\right) \to G(z) \quad \text{as } n \to \infty,$$

then

$$G(z) \propto \exp\!\left[-(1 + \xi z)^{-1/\xi}\right],$$

where the parameter $\xi$ depends on how steeply the distribution's tail(s) diminish (tails are described as "ordinary", "thin", or "fat"; in this context the normal distribution is placed in the "thin"-tailed group rather than the "ordinary" one). When normalized, $G$ belongs to one of the following non-degenerate distribution families:

Type 1: Gumbel distribution, for
$$G(z) = \exp\!\left[-\exp\!\left(-\frac{z - b}{a}\right)\right] \quad \text{for } z \in \mathbb{R},$$
when the distribution of $M_n$ has an "ordinary" exponentially diminishing tail.


Type 2: Fréchet distribution, for
$$G(z) = \exp\!\left[-\left(\frac{z - b}{a}\right)^{-\alpha}\right] \quad \text{for } z > b \text{ (and } G(z) = 0 \text{ for } z \le b\text{), with } \alpha > 0,$$
when the distribution of $M_n$ has a heavy tail (including polynomial decay).


Type 3: Weibull distribution, for
$$G(z) = \exp\!\left[-\left(-\frac{z - b}{a}\right)^{\alpha}\right] \quad \text{for } z < b \text{ (and } G(z) = 1 \text{ for } z \ge b\text{), with } \alpha > 0,$$
when the distribution of $M_n$ has a thin tail with finite upper bound.
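As a quick numerical illustration of the theorem (a simulation sketch, not taken from the sources above), block maxima drawn from distributions with different tail behaviour can be fitted with a GEV distribution; the estimated shape parameter then lands near the value predicted for each of the three types:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(1)

def fitted_shape(sampler, n_blocks=4000, block_size=365):
    maxima = sampler(size=(n_blocks, block_size)).max(axis=1)
    c, loc, scale = genextreme.fit(maxima)
    return -c  # convert SciPy's c to the usual GEV shape xi = -c

print(fitted_shape(rng.exponential))                      # ~ 0    -> Gumbel (Type 1)
print(fitted_shape(lambda size: rng.pareto(3.0, size)))   # ~ 1/3  -> Fréchet (Type 2)
print(fitted_shape(rng.uniform))                          # ~ -1   -> Weibull (Type 3)
```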

Multivariate theory

Extreme value theory in more than one variable introduces additional issues that have to be addressed. One problem that arises is that one must specify what constitutes an extreme event. [23] Although this is straightforward in the univariate case, there is no unambiguous way to do this in the multivariate case. The fundamental problem is that although it is possible to order a set of real-valued numbers, there is no natural way to order a set of vectors.

As an example, in the univariate case, given a set of observations $x_1, \dots, x_n$, it is straightforward to find the most extreme event simply by taking the maximum (or minimum) of the observations. However, in the bivariate case, given a set of observations $(x_i, y_i)$, it is not immediately clear how to find the most extreme event. Suppose that one has measured the values $(x_1, y_1)$ at a specific time and the values $(x_2, y_2)$ at a later time. Which of these events would be considered more extreme? There is no universal answer to this question.
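A tiny sketch of why this is ambiguous, using two hypothetical bivariate observations: different scalar summaries rank them differently, and none of the summaries is canonical.

```python
import numpy as np

a = np.array([3.0, 4.0])   # hypothetical observation at one time
b = np.array([5.0, 1.0])   # hypothetical observation at a later time

print(a.max(), b.max())                        # 4.0 vs 5.0  -> b looks more extreme componentwise
print(a.sum(), b.sum())                        # 7.0 vs 6.0  -> a looks more extreme by total size
print(np.linalg.norm(a), np.linalg.norm(b))    # 5.0 vs ~5.1 -> b again, but only marginally
```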

Another issue in the multivariate case is that the limiting model is not as fully prescribed as in the univariate case. In the univariate case, the model (GEV distribution) contains three parameters whose values are not predicted by the theory and must be obtained by fitting the distribution to the data. In the multivariate case, the model not only contains unknown parameters, but also a function whose exact form is not prescribed by the theory. However, this function must obey certain constraints. [24] [25] It is not straightforward to devise estimators that obey such constraints though some have been recently constructed. [26] [27] [28]

As an example of an application, bivariate extreme value theory has been applied to ocean research. [23] [29]

Non-stationary extremes

Statistical modeling for nonstationary time series was developed in the 1990s. [30] Methods for nonstationary multivariate extremes have been introduced more recently. [31] The latter can be used for tracking how the dependence between extreme values changes over time, or over another covariate. [32] [33] [34]
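A minimal sketch of one common non-stationary formulation, a GEV whose location parameter drifts linearly with time, $\mu(t) = \beta_0 + \beta_1 t$, fitted by maximum likelihood. The data, parameterisation, and optimizer choice here are illustrative assumptions, not the specific models of the cited references:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

rng = np.random.default_rng(3)
t = np.arange(60)                                       # e.g. 60 years of annual maxima
maxima = rng.gumbel(loc=100.0 + 0.5 * t, scale=10.0)    # simulated maxima with a trend

def neg_log_lik(theta):
    b0, b1, log_scale, c = theta
    mu = b0 + b1 * t                                    # time-varying location
    return -genextreme.logpdf(maxima, c, loc=mu, scale=np.exp(log_scale)).sum()

start = np.array([maxima.mean(), 0.0, np.log(maxima.std()), 0.1])
fit = minimize(neg_log_lik, start, method="Nelder-Mead")
print(fit.x)   # [b0, b1, log(scale), c]; the trend b1 should come out roughly 0.5
```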

See also

Extreme value distributions


References

  1. Leadbetter, M.R. (1991). "On a basis for 'peaks over threshold' modeling". Statistics and Probability Letters. 12 (4): 357–362. doi:10.1016/0167-7152(91)90107-3.
  2. Fisher & Tippett (1928)
  3. Gnedenko (1943)
  4. Embrechts, Klüppelberg & Mikosch (1997)
  5. Pickands (1975)
  6. Balkema & de Haan (1974)
  7. Novak (2011)
  8. Tippett, Lepore & Cohen (2016)
  9. Batt, Ryan D.; Carpenter, Stephen R.; Ives, Anthony R. (March 2017). "Extreme events in lake ecosystem time series". Limnology and Oceanography Letters. 2 (3): 63. Bibcode:2017LimOL...2...63B. doi: 10.1002/lol2.10037 .
  10. Alvarado, Sandberg & Pickford (1998) , p. 68
  11. Makkonen (2008)
  12. Einmahl, J.H.J.; Smeets, S.G.W.R. (2009). Ultimate 100m world records through extreme-value theory (PDF) (Report). CentER Discussion Paper. Vol. 57. Tilburg University. Archived from the original (PDF) on 2016-03-12. Retrieved 2009-08-12.
  13. Gembris, D.; Taylor, J.; Suter, D. (2002). "Trends and random fluctuations in athletics". Nature. 417 (6888): 506. Bibcode:2002Natur.417..506G. doi: 10.1038/417506a . hdl:2003/25362. PMID   12037557. S2CID   13469470.
  14. Gembris, D.; Taylor, J.; Suter, D. (2007). "Evolution of athletic records: Statistical effects versus real improvements". Journal of Applied Statistics . 34 (5): 529–545. Bibcode:2007JApSt..34..529G. doi:10.1080/02664760701234850. hdl:2003/25404. S2CID   55378036.
  15. Spearing, H.; Tawn, J.; Irons, D.; Paulden, T.; Bennett, G. (2021). "Ranking, and other properties, of elite swimmers using extreme value theory". Journal of the Royal Statistical Society. Series A (Statistics in Society). 184 (1): 368–395. arXiv: 1910.10070 . doi: 10.1111/rssa.12628 . S2CID   204823947.
  16. Songchitruksa, P.; Tarko, A.P. (2006). "The extreme value theory approach to safety estimation". Accident Analysis and Prevention. 38 (4): 811–822. doi:10.1016/j.aap.2006.02.003. PMID   16546103.
  17. Orsini, F.; Gecchele, G.; Gastaldi, M.; Rossi, R. (2019). "Collision prediction in roundabouts: A comparative study of extreme value theory approaches". Transportmetrica. Series A: Transport Science. 15 (2): 556–572. doi:10.1080/23249935.2018.1515271. S2CID   158343873.
  18. Tsinos, C.G.; Foukalas, F.; Khattab, T.; Lai, L. (February 2018). "On channel selection for carrier aggregation systems". IEEE Transactions on Communications . 66 (2): 808–818. doi:10.1109/TCOMM.2017.2757478. S2CID   3405114.
  19. Wong, Felix; Collins, James J. (2 November 2020). "Evidence that coronavirus superspreading is fat-tailed". Proceedings of the National Academy of Sciences of the U.S. 117 (47): 29416–29418. Bibcode:2020PNAS..11729416W. doi: 10.1073/pnas.2018490117 . ISSN   0027-8424. PMC   7703634 . PMID   33139561.
  20. Basnayake, Kanishka; Mazaud, David; Bemelmans, Alexis; Rouach, Nathalie; Korkotian, Eduard; Holcman, David (4 June 2019). "Fast calcium transients in dendritic spines driven by extreme statistics". PLOS Biology . 17 (6): e2006202. doi: 10.1371/journal.pbio.2006202 . ISSN   1545-7885. PMC   6548358 . PMID   31163024.
  21. Younis, Abubaker; Abdeljalil, Anwar; Omer, Ali (1 January 2023). "Determination of panel generation factor using peaks over threshold method and short-term data for an off-grid photovoltaic system in Sudan: A case of Khartoum city". Solar Energy. 249: 242–249. doi:10.1016/j.solener.2022.11.039. ISSN   0038-092X. S2CID   254207549.
  22. Gumbel (2004)
  23. Morton, I.D.; Bowers, J. (December 1996). "Extreme value analysis in a multivariate offshore environment". Applied Ocean Research. 18 (6): 303–317. Bibcode:1996AppOR..18..303M. doi:10.1016/s0141-1187(97)00007-2. ISSN 0141-1187.
  24. Beirlant, Jan; Goegebeur, Yuri; Teugels, Jozef; Segers, Johan (27 August 2004). Statistics of Extremes: Theory and applications. Wiley Series in Probability and Statistics. Chichester, UK: John Wiley & Sons, Ltd. doi:10.1002/0470012382. ISBN   978-0-470-01238-3.
  25. Coles, Stuart (2001). An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. doi:10.1007/978-1-4471-3675-0. ISBN   978-1-84996-874-4. ISSN   0172-7397.
  26. de Carvalho, M.; Davison, A.C. (2014). "Spectral density ratio models for multivariate extremes" (PDF). Journal of the American Statistical Association. 109: 764‒776. doi:10.1016/j.spl.2017.03.030. hdl:20.500.11820/9e2f7cff-d052-452a-b6a2-dc8095c44e0c. S2CID   53338058.
  27. Hanson, T.; de Carvalho, M.; Chen, Yuhui (2017). "Bernstein polynomial angular densities of multivariate extreme value distributions" (PDF). Statistics and Probability Letters. 128: 60–66. doi:10.1016/j.spl.2017.03.030. hdl:20.500.11820/9e2f7cff-d052-452a-b6a2-dc8095c44e0c. S2CID   53338058.
  28. de Carvalho, M. (2013). "A Euclidean likelihood estimator for bivariate tail dependence" (PDF). Communications in Statistics – Theory and Methods. 42 (7): 1176–1192. arXiv: 1204.3524 . doi:10.1080/03610926.2012.709905. S2CID   42652601.
  29. Zachary, S.; Feld, G.; Ward, G.; Wolfram, J. (October 1998). "Multivariate extrapolation in the offshore environment". Applied Ocean Research . 20 (5): 273–295. Bibcode:1998AppOR..20..273Z. doi:10.1016/s0141-1187(98)00027-3. ISSN   0141-1187.
  30. Davison, A.C.; Smith, Richard (1990). "Models for exceedances over high thresholds". Journal of the Royal Statistical Society. Series B (Methodological). 52 (3): 393–425. doi:10.1111/j.2517-6161.1990.tb01796.x.
  31. de Carvalho, M. (2016). "Statistics of extremes: Challenges and opportunities". Handbook of EVT and its Applications to Finance and Insurance (PDF). Hoboken, NJ: John Wiley's Sons. pp. 195–214. ISBN   978-1-118-65019-6.
  32. Castro, D.; de Carvalho, M.; Wadsworth, J. (2018). "Time-Varying Extreme Value Dependence with Application to Leading European Stock Markets" (PDF). Annals of Applied Statistics. 12: 283–309. doi:10.1214/17-AOAS1089. S2CID   33350408.
  33. Mhalla, L.; de Carvalho, M.; Chavez-Demoulin, V. (2019). "Regression type models for extremal dependence" (PDF). Scandinavian Journal of Statistics. 46 (4): 1141–1167. doi:10.1111/sjos.12388. S2CID   53570822.
  34. Mhalla, L.; de Carvalho, M.; Chavez-Demoulin, V. (2018). "Local robust estimation of the Pickands dependence function". Annals of Statistics . 46 (6A): 2806–2843. doi: 10.1214/17-AOS1640 . S2CID   59467614.

Sources

Software