# Average absolute deviation

Last updated

The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any reference value related to the given data set. AAD includes the mean absolute deviation and the median absolute deviation (both abbreviated as MAD).

## Measures of dispersion

Several measures of statistical dispersion are defined in terms of the absolute deviation. The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion, as there are several measures that can be used to measure absolute deviations, and there are several measures of central tendency that can be used as well. Thus, to uniquely identify the absolute deviation it is necessary to specify both the measure of deviation and the measure of central tendency. Unfortunately, the statistical literature has not yet adopted a standard notation, as both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD" in the literature, which may lead to confusion, since in general, they may have values considerably different from each other.

## Mean absolute deviation around a central point

The mean absolute deviation of a set {x1, x2, ..., xn} is

${\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}|x_{i}-m(X)|.}$

The choice of measure of central tendency, ${\displaystyle m(X)}$, has a marked effect on the value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}:

Measure of central tendency ${\displaystyle m(X)}$Mean absolute deviation
Arithmetic Mean = 5${\displaystyle {\frac {|2-5|+|2-5|+|3-5|+|4-5|+|14-5|}{5}}=3.6}$
Median = 3${\displaystyle {\frac {|2-3|+|2-3|+|3-3|+|4-3|+|14-3|}{5}}=2.8}$
Mode = 2${\displaystyle {\frac {|2-2|+|2-2|+|3-2|+|4-2|+|14-2|}{5}}=3.0}$

The mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean. In fact, the mean absolute deviation from the median is always less than or equal to the mean absolute deviation from any other fixed number.

The mean absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality.

Proof

Jensen's inequality is ${\displaystyle \varphi \left(\mathbb {E} [Y]\right)\leq \mathbb {E} \left[\varphi (Y)\right]}$, where φ is a convex function, this implies for ${\displaystyle Y=\vert X-\mu \vert }$ that:

${\displaystyle \mathbb {E} \left(|X-\mu \right|)^{2}\leq \mathbb {E} \left(|X-\mu |^{2}\right)}$
${\displaystyle \mathbb {E} \left(|X-\mu \right|)^{2}\leq \operatorname {Var} (X)}$

Since both sides are positive, and the square root is a monotonically increasing function in the positive domain:

${\displaystyle \mathbb {E} \left(|X-\mu \right|)\leq {\sqrt {\operatorname {Var} (X)}}}$

For a general case of this statement, see Hölder's inequality.

For the normal distribution, the ratio of mean absolute deviation to standard deviation is ${\textstyle {\sqrt {2/\pi }}=0.79788456\ldots }$. Thus if X is a normally distributed random variable with expected value 0 then, see Geary (1935): [1]

${\displaystyle w={\frac {E|X|}{\sqrt {E(X^{2})}}}={\sqrt {\frac {2}{\pi }}}.}$

In other words, for a normal distribution, mean absolute deviation is about 0.8 times the standard deviation. However, in-sample measurements deliver values of the ratio of mean average deviation / standard deviation for a given Gaussian sample n with the following bounds: ${\displaystyle w_{n}\in [0,1]}$, with a bias for small n. [2]

### Mean absolute deviation around the mean

The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean. "Average absolute deviation" can refer to either this usage, or to the general form with respect to a specified central point (see above).

MAD has been proposed to be used in place of standard deviation since it corresponds better to real life. [3] Because the MAD is a simpler measure of variability than the standard deviation, it can be useful in school teaching. [4] [5]

This method's forecast accuracy is very closely related to the mean squared error (MSE) method which is just the average squared error of the forecasts. Although these methods are very closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring) [6] and easier to understand. [7]

### Mean absolute deviation around the median

The median is the point about which the mean deviation is minimized. The MAD median offers a direct measure of the scale of a random variable around its median

${\displaystyle D_{\text{med}}=E|X-{\text{median}}|}$

This is the maximum likelihood estimator of the scale parameter ${\displaystyle b}$ of the Laplace distribution. For the normal distribution we have ${\textstyle D_{\text{mean}}=\sigma {\sqrt {2/\pi }}\approx 0.797884\sigma }$. Since the median minimizes the average absolute distance, we have ${\displaystyle D_{\text{med}}\leq D_{\text{mean}}}$.

By using the general dispersion function, Habib (2011) defined MAD about median as

${\displaystyle D_{\text{med}}=E|X-{\text{median}}|=2\operatorname {Cov} (X,I_{O})}$

where the indicator function is

${\displaystyle \mathbf {I} _{O}:={\begin{cases}1&{\text{if }}x>{\text{median}},\\0&{\text{otherwise}}.\end{cases}}}$

This representation allows for obtaining MAD median correlation coefficients.[ citation needed ]

## Median absolute deviation around a central point

While in principle the mean or any other central point could be taken as the central point for the median absolute deviation, most often the median value is taken instead.

### Median absolute deviation around the median

The median absolute deviation (also MAD) is the median of the absolute deviation from the median . It is a robust estimator of dispersion.

For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}) with a median of 1, in this case unaffected by the value of the outlier 14, so the median absolute deviation is 1.

For a symmetric distribution, the median absolute deviation is equal to half the interquartile range.

## Maximum absolute deviation

The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using the formula for the average absolute deviation as above with ${\displaystyle m(X)=\max(X)}$, where ${\displaystyle \max(X)}$ is the sample maximum.

## Minimization

The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion: The median is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows:

• L2 norm statistics: the mean minimizes the mean squared error
• L1 norm statistics: the median minimizes average absolute deviation,
• L norm statistics: the mid-range minimizes the maximum absolute deviation
• trimmed L norm statistics: for example, the midhinge (average of first and third quartiles) which minimizes the median absolute deviation of the whole distribution, also minimizes the maximum absolute deviation of the distribution after the top and bottom 25% have been trimmed off.

## Estimation

The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population. In order for the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation. However, it does not. For the population 1,2,3 both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. The average of all the sample absolute deviations about the mean of size 3 that can be drawn from the population is 44/81, while the average of all the sample absolute deviations about the median is 4/9. Therefore, the absolute deviation is a biased estimator.

However, this argument is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness (see entry on biased estimator). The relevant form of unbiasedness here is median unbiasedness.

## Related Research Articles

In statistics, a central tendency is a central or typical value for a probability distribution.

In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule, the quantity of interest and its result are distinguished. For example, the sample mean is a commonly used estimator of the population mean.

In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic feature of the median in describing data compared to the mean is that it is not skewed by a small proportion of extremely large or small values, and therefore provides a better representation of a "typical" value. Median income, for example, may be a better way to suggest what a "typical" income is, because income distribution can be very skewed. The median is of central importance in robust statistics, as it is the most resistant statistic, having a breakdown point of 50%: so long as no more than half the data are contaminated, the median is not an arbitrarily large or small result.

In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.

In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined.

In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. Variance has a central role in statistics, where some ideas that use it include descriptive statistics, statistical inference, hypothesis testing, goodness of fit, and Monte Carlo sampling. Variance is an important tool in the sciences, where statistical analysis of data is common. The variance is the square of the standard deviation, the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by , , , , or .

In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive is because of randomness or because the estimator does not account for information that could produce a more accurate estimate. In machine learning, specifically empirical risk minimization, MSE may refer to the empirical risk, as an estimate of the true MSE.

In statistics, an empirical distribution function is the distribution function associated with the empirical measure of a sample. This cumulative distribution function is a step function that jumps up by 1/n at each of the n data points. Its value at any specified value of the measured variable is the fraction of observations of the measured variable that are less than or equal to the specified value.

In statistics, the mid-range or mid-extreme is a measure of central tendency of a sample (statistics) defined as the arithmetic mean of the maximum and minimum values of the data set:

Robust statistics is statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard-deviations; under this model, non-robust methods like a t-test work poorly.

The mean absolute difference (univariate) is a measure of statistical dispersion equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean absolute difference, which is the mean absolute difference divided by the arithmetic mean, and equal to twice the Gini coefficient. The mean absolute difference is also known as the absolute mean difference and the Gini mean difference (GMD). The mean absolute difference is sometimes denoted by Δ or as MD.

In statistics, mean absolute error (MAE) is a measure of errors between paired observations expressing the same phenomenon. Examples of Y versus X include comparisons of predicted versus observed, subsequent time versus initial time, and one technique of measurement versus an alternative technique of measurement. MAE is calculated as the sum of absolute errors divided by the sample size:

In statistics, the bias of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called unbiased. In statistics, "bias" is an objective property of an estimator. Bias can also be measured with respect to the median, rather than the mean, in which case one distinguishes median-unbiased from the usual mean-unbiasedness property. Bias is a distinct concept from consistency: consistent estimators converge in probability to the true value of the parameter, but may be biased or unbiased; see bias versus consistency for more.

In mathematics and statistics, deviation is a measure of difference between the observed value of a variable and some other value, often that variable's mean. The sign of the deviation reports the direction of that difference. The magnitude of the value indicates the size of the difference.

In statistics, the median absolute deviation (MAD) is a robust measure of the variability of a univariate sample of quantitative data. It can also refer to the population parameter that is estimated by the MAD calculated from a sample.

In statistics, Bessel's correction is the use of n − 1 instead of n in the formula for the sample variance and sample standard deviation, where n is the number of observations in a sample. This method corrects the bias in the estimation of the population variance. It also partially corrects the bias in the estimation of the population standard deviation. However, the correction often increases the mean squared error in these estimations. This technique is named after Friedrich Bessel.

In statistics, an L-estimator is an estimator which is a linear combination of order statistics of the measurements. This can be as little as a single point, as in the median, or as many as all points, as in the mean.

In statistics, a trimmed estimator is an estimator derived from another estimator by excluding some of the extreme values, a process called truncation. This is generally done to obtain a more robust statistic, and the extreme values are considered outliers. Trimmed estimators also often have higher efficiency for mixture distributions and heavy-tailed distributions than the corresponding untrimmed estimator, at the cost of lower efficiency for other distributions, such as the normal distribution.

In statistics, robust measures of scale are methods that quantify the statistical dispersion in a sample of numerical data while resisting outliers. The most common such robust statistics are the interquartile range (IQR) and the median absolute deviation (MAD). These are contrasted with conventional or non-robust measures of scale, such as sample variance or standard deviation, which are greatly influenced by outliers.

In the comparison of various statistical procedures, efficiency is a measure of quality of an estimator, of an experimental design, or of a hypothesis testing procedure. Essentially, a more efficient estimator, experiment, or test needs fewer observations than a less efficient one to achieve a given error performance. An efficient estimator is characterized by a small variance or mean square error, indicating that there is a small deviance between the estimated value and the "true" value.

## References

1. Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality. Biometrika, 27(3/4), 310–332.
2. See also Geary's 1936 and 1946 papers: Geary, R. C. (1936). Moments of the ratio of the mean deviation to the standard deviation for normal samples. Biometrika, 28(3/4), 295–307 and Geary, R. C. (1947). Testing for normality. Biometrika, 34(3/4), 209–242.
3. Taleb, Nassim Nicholas (2014). "What scientific idea is ready for retirement?". Edge. Archived from the original on 2014-01-16. Retrieved 2014-01-16.{{cite web}}: CS1 maint: bot: original URL status unknown (link)
4. Kader, Gary (March 1999). "Means and MADS". Mathematics Teaching in the Middle School. 4 (6): 398–403. Archived from the original on 2013-05-18. Retrieved 20 February 2013.
5. Franklin, Christine, Gary Kader, Denise Mewborn, Jerry Moreno, Roxy Peck, Mike Perry, and Richard Scheaffer (2007). Guidelines for Assessment and Instruction in Statistics Education (PDF). American Statistical Association. ISBN   978-0-9791747-1-1. Archived (PDF) from the original on 2013-03-07. Retrieved 2013-02-20.
6. Nahmias, Steven; Olsen, Tava Lennon (2015), Production and Operations Analysis (7th ed.), Waveland Press, p. 62, ISBN   9781478628248, MAD is often the preferred method of measuring the forecast error because it does not require squaring.
7. Stadtler, Hartmut; Kilger, Christoph; Meyr, Herbert, eds. (2014), Supply Chain Management and Advanced Planning: Concepts, Models, Software, and Case Studies, Springer Texts in Business and Economics (5th ed.), Springer, p. 143, ISBN   9783642553097, the meaning of the MAD is easier to interpret.