A return period, also known as a recurrence interval or repeat interval, is the average time, or an estimated average time, between events such as earthquakes, floods, [1] landslides, [2] or river discharge flows.
It is a statistical measurement typically based on historic data over an extended period, and is usually used for risk analysis. Examples include deciding whether a project should be allowed to go forward in a zone of a certain risk, or designing structures to withstand events with a certain return period. The following analysis assumes that the probability of the event occurring does not vary over time and is independent of past events.
Recurrence interval
The estimated recurrence interval is

T = \frac{n + 1}{m}

where T is the average recurrence interval, n is the number of years on record, and m is the rank of the observed occurrences when arranged in descending order. For floods, the event may be measured in terms of m3/s or height; for storm surges, in terms of the height of the surge; and similarly for other events. This is Weibull's formula. [4] : 12 [5]
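As a sketch, Weibull's formula T = (n + 1)/m can be applied by ranking a record of annual maxima in descending order; the discharge values below are hypothetical, chosen only to illustrate the calculation:

```python
# Estimate recurrence intervals with Weibull's formula T = (n + 1) / m,
# where n is the number of years on record and m is the descending rank.
# The discharge values are hypothetical annual maxima (m^3/s).
annual_maxima = [412, 310, 980, 550, 745, 290, 630, 860, 505, 398]

n = len(annual_maxima)
ranked = sorted(annual_maxima, reverse=True)  # rank 1 = largest event

for m, q in enumerate(ranked, start=1):
    T = (n + 1) / m  # estimated return period in years
    print(f"rank {m:2d}: {q:4d} m^3/s -> T = {T:.2f} yr")
```

With ten years of record, the largest observed event is assigned an estimated return period of (10 + 1)/1 = 11 years and the smallest (10 + 1)/10 = 1.1 years.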
The theoretical return period between occurrences is the inverse of the average frequency of occurrence. For example, a 10-year flood has a 1/10 = 0.1 or 10% chance of being exceeded in any one year and a 50-year flood has a 1/50 = 0.02 or 2% chance of being exceeded in any one year.
This does not mean that a 100-year flood will happen regularly every 100 years, or only once in 100 years, despite the connotations of the name "return period". In any given 100-year period, a 100-year event may occur once, twice, more, or not at all, and each outcome has a probability that can be computed as below.
Also, the estimated return period below is a statistic: it is computed from a set of data (the observations), as distinct from the theoretical value in an idealized distribution. One does not actually know that an event of a certain or greater magnitude occurs with 1% probability each year, only that it has been observed exactly once in 100 years.
That distinction is significant because there are few observations of rare events: for instance if observations go back 400 years, the most extreme event (a 400-year event by the statistical definition) may later be classed, on longer observation, as a 200-year event (if a comparable event immediately occurs) or a 500-year event (if no comparable event occurs for a further 100 years).
Further, one cannot determine the size of a 1000-year event based on such records alone, but instead must use a statistical model to predict the magnitude of such an (unobserved) event. Even when the historic record is much shorter than 1000 years, if a number of less severe events of a similar nature have been recorded, such a model is likely to provide useful information to help estimate the future return interval.
One would like to be able to interpret the return period in probabilistic models. The most logical interpretation for this is to take the return period as the counting rate in a Poisson distribution since it is the expectation value of the rate of occurrences. An alternative interpretation is to take it as the probability for a yearly Bernoulli trial in the binomial distribution. That is disfavoured because each year does not represent an independent Bernoulli trial but is an arbitrary measure of time. This question is mainly academic as the results obtained will be similar under both the Poisson and binomial interpretations.
The probability mass function of the Poisson distribution is

P(r) = \frac{(\mu t)^r}{r!} e^{-\mu t}

where r is the number of occurrences the probability is calculated for, t is the time period of interest, T is the return period and \mu = 1/T is the counting rate.
The probability of no occurrence can be obtained simply by considering the case r = 0. The formula is

P(r = 0) = e^{-t/T}
Consequently, the probability of exceedance (i.e. the probability of an event "stronger" than the event with return period T occurring at least once within the time period t of interest) is

P(r > 0) = 1 - P(r = 0) = 1 - e^{-t/T}
Note that for any event with return period T, the probability of exceedance within an interval equal to the return period (i.e. t = T) is independent of the return period and equal to 1 - e^{-1} \approx 63.2\%. This means, for example, that there is a 63.2% probability of a flood larger than the 50-year return flood occurring within any period of 50 years.
If the return period of occurrence is 243 years (\mu = 1/243 \approx 0.0041 per year), then the probability of exactly one occurrence in ten years is

P(r = 1) = \frac{(10/243)^1}{1!} e^{-10/243} \approx 0.0395
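Under the Poisson interpretation, these probabilities can be checked numerically; the helper function below is a minimal sketch (not a standard library API):

```python
import math

def poisson_pmf(r, t, T):
    """Probability of exactly r occurrences in t years for return period T."""
    mu = 1.0 / T                 # counting rate (events per year)
    lam = mu * t                 # expected number of events in t years
    return lam ** r * math.exp(-lam) / math.factorial(r)

# Return period of 243 years, window of 10 years:
p_one = poisson_pmf(1, t=10, T=243)            # exactly one occurrence, ~0.0395
p_exceed = 1 - poisson_pmf(0, t=10, T=243)     # at least one occurrence, ~0.0403
print(f"P(exactly one)  = {p_one:.4f}")
print(f"P(at least one) = {p_exceed:.4f}")
```

Setting t = T instead reproduces the 1 - e^{-1} ≈ 63.2% exceedance probability noted above.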
In a given period of n years, the probability of a given number r of events of a return period T is given by the binomial distribution as follows:

P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}

where p = 1/T is the probability of occurrence in any one year.
This is valid only if the probability of more than one occurrence per year is zero. Often that is a close approximation, in which case the probabilities yielded by this formula hold approximately.
If n \to \infty and p \to 0 in such a way that the product np remains finite, the binomial distribution converges to the Poisson distribution:

\lim_{n \to \infty} \binom{n}{r} p^r (1 - p)^{n - r} = \frac{\lambda^r}{r!} e^{-\lambda}

Take

\lambda = np = \frac{n}{T}

where

\lambda is the expected number of occurrences, n is the number of years in the period of interest, and T is the return period.

Given that the return period of an event is 100 years, p = 1/100 = 0.01 and, for a period of n = 10 years, \lambda = 10/100 = 0.1.

So the probability that such an event occurs exactly once in 10 successive years is:

P(X = 1) = \binom{10}{1} (0.01)^1 (0.99)^9 \approx 0.0914
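The closeness of the binomial and Poisson interpretations claimed above can be illustrated with a short sketch (the function names are illustrative, not a fixed API):

```python
import math

def binomial_pmf(r, n, p):
    """P(exactly r events in n independent yearly Bernoulli trials)."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def poisson_pmf(r, lam):
    """P(exactly r events) under a Poisson model with mean lam."""
    return lam ** r * math.exp(-lam) / math.factorial(r)

T, n = 100, 10       # a 100-year event over a 10-year period
p = 1 / T            # annual exceedance probability
lam = n * p          # expected number of events in n years

# The two models give similar probabilities for each count r:
for r in range(3):
    print(f"r={r}: binomial={binomial_pmf(r, n, p):.4f} "
          f"poisson={poisson_pmf(r, lam):.4f}")
```

For r = 1 this gives about 0.0914 (binomial) versus 0.0905 (Poisson), consistent with the remark that the two interpretations produce similar results.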
Return period is useful for risk analysis (such as natural, inherent, or hydrologic risk of failure). [6] When dealing with structure design expectations, the return period is useful in calculating the riskiness of the structure.
The probability of at least one event that exceeds design limits during the expected life of the structure is the complement of the probability that no events occur which exceed design limits.
The equation for assessing this parameter is

\bar{R} = 1 - \left(1 - \frac{1}{T}\right)^n

where

\bar{R} is the risk of failure, i.e. the probability of at least one event exceeding design limits during the expected life of the structure, n is the expected life of the structure in years, and T is the return period of the design event.
See also

Normal distribution
Student's t-distribution
Chebyshev's inequality
Law of large numbers
Ergodic theory
Gumbel distribution
Prediction interval
Laplace distribution
von Mises distribution
Renewal theory
Pivotal quantity
68–95–99.7 rule
Shifted log-logistic distribution
Index of dispersion
Additive smoothing
Poisson distribution
Residual time
Almost sure hypothesis testing
Intensity-duration-frequency curve
Polar factorization of vector fields