In statistics, the **range** of a set of data is the difference between the largest and smallest values, that is, the result of subtracting the **smallest value** from the **largest value**. It can give a rough idea of how spread out a data set is before the data are examined in detail.^{ [1] }


However, in descriptive statistics, this concept of range has a more complex meaning. The range is the size of the smallest interval which contains all the data, and it provides an indication of statistical dispersion. It is measured in the same units as the data. Since it depends on only two of the observations, it is most useful in representing the dispersion of small data sets.^{ [2] }
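As a small illustration of the definition, the following Python sketch computes the range of a data set and shows how a single extreme observation changes it. The function name `sample_range` and the example numbers are made up for this illustration.

```python
def sample_range(data):
    """Return the largest value minus the smallest value of a data set."""
    values = list(data)
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


observations = [4, 9, 1, 7, 12]
print(sample_range(observations))          # 12 - 1 = 11

# Only the two extreme observations matter, so one outlier dominates the result.
print(sample_range(observations + [100]))  # 100 - 1 = 99
```

Because only the minimum and maximum enter the calculation, the result says nothing about how the remaining observations are distributed between them.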

Consider *n* independent and identically distributed continuous random variables *X*_{1}, *X*_{2}, ..., *X*_{n} with cumulative distribution function *G*(*x*) and probability density function *g*(*x*). Let *T* denote the range of a sample of size *n* from a population with distribution function *G*(*x*).

The range has cumulative distribution function^{ [3] }^{ [4] }

$$F(t) = n \int_{-\infty}^{\infty} g(x)\,[G(x+t) - G(x)]^{n-1}\,dx.$$

Gumbel notes that the "beauty of this formula is completely marred by the facts that, in general, we cannot express *G*(*x* + *t*) by *G*(*x*), and that the numerical integration is lengthy and tiresome."^{ [3] }^{:385}
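The numerical integration Gumbel describes is easy to automate today. The sketch below assumes the standard form of the range distribution, F(t) = n ∫ g(x) [G(x + t) − G(x)]^(n−1) dx, and evaluates it with a midpoint rule for a standard normal population. The function names, integration limits and step count are illustrative choices, not part of the cited sources. A small Monte Carlo simulation is included as a sanity check.

```python
import random
from statistics import NormalDist


def range_cdf(t, n, dist=NormalDist(), lo=-8.0, hi=8.0, steps=4000):
    """Midpoint-rule approximation of F(t) = n * int g(x) [G(x+t) - G(x)]^(n-1) dx.

    Assumes essentially all of the probability mass of `dist` lies in [lo, hi].
    """
    if t <= 0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += dist.pdf(x) * (dist.cdf(x + t) - dist.cdf(x)) ** (n - 1)
    return n * total * h


def simulated_range_cdf(t, n, trials=20000, seed=0):
    """Estimate P(range <= t) by drawing standard normal samples of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if max(sample) - min(sample) <= t:
            hits += 1
    return hits / trials


print(range_cdf(2.0, 5))            # numerical integration
print(simulated_range_cdf(2.0, 5))  # simulation, should agree to about two decimals
```

For *n* = 2 the range is |*X*_{1} − *X*_{2}|, whose distribution is known in closed form for normal variables, which gives an independent check on the integration.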

If the distribution of each *X*_{i} is limited to the right (or left) then the asymptotic distribution of the range is equal to the asymptotic distribution of the largest (smallest) value. For more general distributions the asymptotic distribution can be expressed as a Bessel function.^{ [3] }

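The mean range is given by^{ [5] }

$$n \int_0^1 x(G)\,\left[G^{n-1} - (1-G)^{n-1}\right]\,dG$$

where *x*(*G*) is the inverse function. In the case where each of the *X*_{i} has a standard normal distribution, the mean range is given by^{ [6] }

$$\int_{-\infty}^{\infty} \left(1 - (1-\Phi(x))^n - \Phi(x)^n\right)\,dx,$$

where Φ denotes the standard normal cumulative distribution function.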
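The normal-case mean range can be evaluated numerically. The sketch below assumes the integral form E[T] = ∫ (1 − (1 − Φ(x))^n − Φ(x)^n) dx, with Φ the standard normal CDF, and applies a midpoint rule. The function name, limits and step count are illustrative assumptions.

```python
import math
from statistics import NormalDist


def normal_mean_range(n, lo=-10.0, hi=10.0, steps=20000):
    """Approximate the expected range of n standard normal variables.

    Uses E[T] = integral of 1 - (1 - Phi(x))^n - Phi(x)^n over the real line,
    truncated to [lo, hi], where the integrand is negligible outside.
    """
    phi = NormalDist().cdf
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        p = phi(lo + (k + 0.5) * h)
        total += 1.0 - (1.0 - p) ** n - p ** n
    return total * h


for n in (2, 5, 10):
    print(n, round(normal_mean_range(n), 3))

# For n = 2 the exact value is 2 / sqrt(pi).
print(2 / math.sqrt(math.pi))
```

The *n* = 2 case has the closed form 2/√π, so it serves as a check that the truncation limits and step size are adequate.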

For *n* independent but not identically distributed continuous random variables *X*_{1}, *X*_{2}, ..., *X*_{n} with cumulative distribution functions *G*_{1}(*x*), *G*_{2}(*x*), ..., *G*_{n}(*x*) and probability density functions *g*_{1}(*x*), *g*_{2}(*x*), ..., *g*_{n}(*x*), the range has cumulative distribution function^{ [4] }

$$F(t) = \sum_{i=1}^{n} \int_{-\infty}^{\infty} g_i(x) \prod_{j=1,\, j \neq i}^{n} [G_j(x+t) - G_j(x)]\,dx.$$
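The non-IID case can be sketched in the same way. The code below assumes the form F(t) = Σ_i ∫ g_i(x) Π_{j≠i} [G_j(x + t) − G_j(x)] dx, in which each variable in turn plays the role of the minimum, and uses normal distributions with different parameters as an example. The function name and integration settings are hypothetical.

```python
import math
from statistics import NormalDist


def range_cdf_non_iid(t, dists, lo=-12.0, hi=12.0, steps=6000):
    """Midpoint-rule approximation of the range CDF for independent,
    differently distributed variables.

    `dists` is a list of objects exposing pdf(x) and cdf(x), for example
    statistics.NormalDist instances. Assumes their mass lies in [lo, hi].
    """
    if t <= 0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        for i, d_i in enumerate(dists):
            term = d_i.pdf(x)  # variable i is the minimum, located at x
            for j, d_j in enumerate(dists):
                if j != i:
                    term *= d_j.cdf(x + t) - d_j.cdf(x)  # others fall in (x, x + t]
            total += term
    return total * h


dists = [NormalDist(0.0, 1.0), NormalDist(1.0, 2.0)]
print(range_cdf_non_iid(1.5, dists))

# Check for n = 2: the range is |X1 - X2|, and X1 - X2 is N(-1, sqrt(5)).
diff = NormalDist(-1.0, math.sqrt(5.0))
print(diff.cdf(1.5) - diff.cdf(-1.5))
```

When all entries of `dists` are identical, each of the *n* terms in the sum is the same and the expression reduces to the IID formula.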

For *n* independent and identically distributed discrete random variables *X*_{1}, *X*_{2}, ..., *X*_{n} with cumulative distribution function *G*(*x*) and probability mass function *g*(*x*) the range of the *X*_{i} is the range of a sample of size *n* from a population with distribution function *G*(*x*). We can assume without loss of generality that the support of each *X*_{i} is {1,2,3,...,*N*} where *N* is a positive integer or infinity.^{ [7] }^{ [8] }

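The range has probability mass function^{ [7] }^{ [9] }^{ [10] }

$$f(t) = \begin{cases} \displaystyle\sum_{x=1}^{N} [g(x)]^n & t = 0 \\[6pt] \displaystyle\sum_{x=1}^{N-t} \Big( [G(x+t) - G(x-1)]^n - [G(x+t) - G(x)]^n - [G(x+t-1) - G(x-1)]^n + [G(x+t-1) - G(x)]^n \Big) & t = 1, 2, \ldots, N-1. \end{cases}$$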
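The discrete formula can be checked by brute force on a small example. The sketch below assumes the inclusion-exclusion form of the probability mass function: f(0) = Σ g(x)^n, and for t ≥ 1 a sum over x of [G(x+t) − G(x−1)]^n − [G(x+t) − G(x)]^n − [G(x+t−1) − G(x−1)]^n + [G(x+t−1) − G(x)]^n. It compares this with direct enumeration of all samples, using exact fractions. Function names and the example distribution are illustrative.

```python
from fractions import Fraction
from itertools import product


def discrete_range_pmf(pmf, n):
    """PMF of the range for n IID draws from support {1, ..., N}.

    `pmf[x - 1]` is g(x). Returns a list f with f[t] = P(range = t).
    """
    N = len(pmf)
    G = [Fraction(0)]  # G[x] = P(X <= x), with G[0] = 0
    for p in pmf:
        G.append(G[-1] + p)
    f = [sum(p ** n for p in pmf)]  # t = 0: every draw equals the same value
    for t in range(1, N):
        total = Fraction(0)
        for x in range(1, N - t + 1):  # x is the sample minimum, x + t the maximum
            total += ((G[x + t] - G[x - 1]) ** n - (G[x + t] - G[x]) ** n
                      - (G[x + t - 1] - G[x - 1]) ** n + (G[x + t - 1] - G[x]) ** n)
        f.append(total)
    return f


def brute_force_range_pmf(pmf, n):
    """Same PMF obtained by enumerating all N**n possible samples."""
    N = len(pmf)
    f = [Fraction(0)] * N
    for sample in product(range(1, N + 1), repeat=n):
        prob = Fraction(1)
        for value in sample:
            prob *= pmf[value - 1]
        f[max(sample) - min(sample)] += prob
    return f


g = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
print(discrete_range_pmf(g, 3))
print(discrete_range_pmf(g, 3) == brute_force_range_pmf(g, 3))
```

Enumeration costs N^n operations, so it is only practical as a check on very small cases, whereas the formula needs on the order of N^2 terms.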

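If we suppose that *g*(*x*) = 1/*N* for all *x*, the discrete uniform distribution, then we find^{ [9] }^{ [11] }

$$f(t) = \begin{cases} \dfrac{1}{N^{n-1}} & t = 0 \\[8pt] \displaystyle\sum_{x=1}^{N-t} \left( \left[\frac{t+1}{N}\right]^n - 2\left[\frac{t}{N}\right]^n + \left[\frac{t-1}{N}\right]^n \right) & t = 1, 2, \ldots, N-1. \end{cases}$$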
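For the discrete uniform case the summand does not depend on *x*, so the sum over *x* collapses to a factor of *N* − *t*. The sketch below assumes the resulting closed form, f(0) = 1/N^(n−1) and f(t) = (N − t)[(t+1)^n − 2t^n + (t−1)^n]/N^n for t ≥ 1, and evaluates it for two fair six-sided dice. The function name is hypothetical.

```python
from fractions import Fraction


def uniform_range_pmf(N, n):
    """PMF of the range of n IID draws from the discrete uniform on {1, ..., N}."""
    pmf = [Fraction(1, N ** (n - 1))]
    for t in range(1, N):
        numerator = (N - t) * ((t + 1) ** n - 2 * t ** n + (t - 1) ** n)
        pmf.append(Fraction(numerator, N ** n))
    return pmf


two_dice = uniform_range_pmf(6, 2)
for t, p in enumerate(two_dice):
    print(t, p)  # 0 1/6, 1 5/18, 2 2/9, 3 1/6, 4 1/9, 5 1/18

print(sum(two_dice))  # 1
```

For two dice this matches a direct count: 6 of the 36 outcomes are doubles, and a difference of *t* ≥ 1 occurs in 2(6 − *t*) outcomes.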

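The probability of having a specific range value, *t*, can be determined by adding the probabilities of having two samples differing by *t*, and every other sample having a value between the two extremes. For continuous variables "probability" is here shorthand for probability density. The probability of one sample having a value of *x* is $n\,g(x)$. The probability of another having a value *t* greater than *x* is:

$$(n-1)\,g(x+t).$$

The probability of all other values lying between these two extremes is:

$$\left(\int_x^{x+t} g(u)\,du\right)^{n-2} = [G(x+t) - G(x)]^{n-2}.$$

Combining the three together yields:

$$f(t) = n(n-1) \int_{-\infty}^{\infty} g(x)\,g(x+t)\,[G(x+t) - G(x)]^{n-2}\,dx.$$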
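As a worked check of this argument, assume the combined expression is the density $f(t) = n(n-1)\int g(x)\,g(x+t)\,[G(x+t)-G(x)]^{n-2}\,dx$ and take the continuous uniform distribution on [0, 1] as an example (this example is not in the cited sources). There $g = 1$ on [0, 1], the integrand is nonzero only for $0 \le x \le 1 - t$, and $G(x+t) - G(x) = t$ on that interval, so

$$f(t) = n(n-1)\int_0^{1-t} t^{\,n-2}\,dx = n(n-1)\,t^{\,n-2}(1-t), \qquad 0 \le t \le 1.$$

This integrates to one and has mean $(n-1)/(n+1)$:

$$\int_0^1 n(n-1)\,t^{\,n-2}(1-t)\,dt = n(n-1)\left(\frac{1}{n-1} - \frac{1}{n}\right) = 1, \qquad \int_0^1 t\,f(t)\,dt = n(n-1)\left(\frac{1}{n} - \frac{1}{n+1}\right) = \frac{n-1}{n+1}.$$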

The range is a simple function of the sample maximum and minimum and these are specific examples of order statistics. In particular, the range is a linear function of order statistics, which brings it into the scope of L-estimation.


- [1] George Woodbury (2001). *An Introduction to Statistics*. Cengage Learning. p. 74. ISBN 0534377556.
- [2] Carin Viljoen (2000). *Elementary Statistics: Vol 2*. Pearson South Africa. pp. 7–27. ISBN 186891075X.
- [3] E. J. Gumbel (1947). "The Distribution of the Range". *The Annals of Mathematical Statistics*. **18** (3): 384–412. doi:10.1214/aoms/1177730387. JSTOR 2235736.
- [4] Tsimashenka, I.; Knottenbelt, W.; Harrison, P. (2012). "Controlling Variability in Split-Merge Systems". *Analytical and Stochastic Modeling Techniques and Applications*. Lecture Notes in Computer Science. **7314**. p. 165. doi:10.1007/978-3-642-30782-9_12. ISBN 978-3-642-30781-2.
- [5] H. O. Hartley; H. A. David (1954). "Universal Bounds for Mean Range and Extreme Observation". *The Annals of Mathematical Statistics*. **25** (1): 85–99. doi:10.1214/aoms/1177728848. JSTOR 2236514.
- [6] L. H. C. Tippett (1925). "On the Extreme Individuals and the Range of Samples Taken from a Normal Population". *Biometrika*. **17** (3/4): 364–387. doi:10.1093/biomet/17.3-4.364. JSTOR 2332087.
- [7] Evans, D. L.; Leemis, L. M.; Drew, J. H. (2006). "The Distribution of Order Statistics for Discrete Random Variables with Applications to Bootstrapping". *INFORMS Journal on Computing*. **18**: 19. doi:10.1287/ijoc.1040.0105.
- [8] Irving W. Burr (1955). "Calculation of Exact Sampling Distribution of Ranges from a Discrete Population". *The Annals of Mathematical Statistics*. **26** (3): 530–532. doi:10.1214/aoms/1177728500. JSTOR 2236482.
- [9] Abdel-Aty, S. H. (1954). "Ordered variables in discontinuous distributions". *Statistica Neerlandica*. **8** (2): 61–82. doi:10.1111/j.1467-9574.1954.tb00442.x.
- [10] Siotani, M. (1956). "Order statistics for discrete case with a numerical application to the binomial distribution". *Annals of the Institute of Statistical Mathematics*. **8**: 95–96. doi:10.1007/BF02863574.
- [11] Paul R. Rider (1951). "The Distribution of the Range in Samples from a Discrete Rectangular Population". *Journal of the American Statistical Association*. **46** (255): 375–378. doi:10.1080/01621459.1951.10500796. JSTOR 2280515.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.
