Mean absolute scaled error

Last updated December 22, 2024

In statistics, the mean absolute scaled error (MASE) is a measure of the accuracy of forecasts. It is the mean absolute error of the forecast values, divided by the mean absolute error of the in-sample one-step naive forecast. It was proposed in 2005 by statistician Rob J. Hyndman and Professor of Decision Sciences Anne B. Koehler, who described it as a "generally applicable measurement of forecast accuracy without the problems seen in the other measurements."^[1] The mean absolute scaled error has favorable properties when compared to other methods for calculating forecast errors, such as the root-mean-square deviation, and is therefore recommended for determining the comparative accuracy of forecasts.^[2]

Rationale

The mean absolute scaled error has the following desirable properties:^[3]

Scale invariance: The mean absolute scaled error is independent of the scale of the data, so it can be used to compare forecasts across data sets with different scales.
Predictable behavior as $y_{t}\rightarrow 0$: Percentage forecast accuracy measures such as the mean absolute percentage error (MAPE) rely on division by $y_{t}$, skewing the distribution of the MAPE for values of $y_{t}$ near or equal to 0. This is especially problematic for data sets whose scales do not have a meaningful 0, such as temperature in Celsius or Fahrenheit, and for intermittent demand data sets, where $y_{t}=0$ occurs frequently.
Symmetry: The mean absolute scaled error penalizes positive and negative forecast errors equally, and penalizes errors in large forecasts and small forecasts equally. In contrast, the MAPE and median absolute percentage error (MdAPE) fail both of these criteria, while the "symmetric" sMAPE and sMdAPE^[4] fail the second criterion.
Interpretability: The mean absolute scaled error can be easily interpreted, as values greater than one indicate that in-sample one-step forecasts from the naïve method perform better than the forecast values under consideration.
Asymptotic normality of the MASE: The Diebold-Mariano test for one-step forecasts is used to test the statistical significance of the difference between two sets of forecasts.^[5]^[6]^[7] To perform hypothesis testing with the Diebold-Mariano test statistic, it is desirable for $DM\sim N(0,1)$ , where $DM$ is the value of the test statistic. The DM statistic for the MASE has been empirically shown to approximate this distribution, while the mean relative absolute error (MRAE), MAPE and sMAPE do not.^[2]
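The scale-invariance and symmetry properties listed above can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `mase` and the sample numbers are illustrative assumptions, and the scaling term is the in-sample one-step naive forecast error described in the introduction.

```python
import math

def mase(train, actual, forecast):
    """Illustrative MASE: forecast MAE divided by in-sample naive-forecast MAE."""
    naive_mae = sum(abs(train[t] - train[t - 1]) for t in range(1, len(train))) / (len(train) - 1)
    forecast_mae = sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / naive_mae

# Hypothetical data, chosen only for illustration.
train = [10.0, 12.0, 11.0, 13.0]
actual = [14.0, 12.0]
forecast = [13.0, 13.0]
base = mase(train, actual, forecast)

# Scale invariance: multiplying every value by 1000 leaves the MASE unchanged.
rescaled = mase([1000 * v for v in train],
                [1000 * v for v in actual],
                [1000 * v for v in forecast])

# Symmetry: flipping the sign of every forecast error leaves the MASE unchanged.
mirrored_forecast = [2 * y - f for y, f in zip(actual, forecast)]
mirrored = mase(train, actual, mirrored_forecast)

print(math.isclose(base, rescaled), base == mirrored)
```

Both comparisons hold because the same scale factor appears in the numerator and the denominator, and because only absolute errors enter the calculation.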

Non-seasonal time series

For a non-seasonal time series,^[8] the mean absolute scaled error is estimated by

$\mathrm {MASE} =\mathrm {mean} \left({\frac {\left|e_{j}\right|}{{\frac {1}{T-1}}\sum _{t=2}^{T}\left|Y_{t}-Y_{t-1}\right|}}\right)={\frac {{\frac {1}{J}}\sum _{j}\left|e_{j}\right|}{{\frac {1}{T-1}}\sum _{t=2}^{T}\left|Y_{t}-Y_{t-1}\right|}}$ ^[3]

where $e_{j}$ in the numerator is the forecast error for a given period (with $J$ the number of forecasts), defined as the actual value ($Y_{j}$) minus the forecast value ($F_{j}$) for that period: $e_{j}=Y_{j}-F_{j}$. The denominator is the mean absolute error of the one-step "naive forecast method" on the training set (here defined as $t=1,\ldots ,T$),^[8] which uses the actual value from the prior period as the forecast: $F_{t}=Y_{t-1}$.^[9]
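The non-seasonal formula can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions: the function name `mase_non_seasonal`, its argument names, and the sample series are hypothetical and do not come from any particular library.

```python
import math

def mase_non_seasonal(train, actual, forecast):
    """MASE scaled by the in-sample one-step naive forecast F_t = Y_{t-1}."""
    if len(train) < 2:
        raise ValueError("the training set needs at least two observations")
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and of equal length")

    # Denominator: (1 / (T - 1)) * sum over t = 2..T of |Y_t - Y_{t-1}|
    scale = sum(abs(train[t] - train[t - 1]) for t in range(1, len(train))) / (len(train) - 1)
    if scale == 0:
        raise ValueError("MASE is undefined when all training values are equal")

    # Numerator: (1 / J) * sum over j of |e_j|, with e_j = Y_j - F_j
    mean_abs_error = sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)
    return mean_abs_error / scale

# Hypothetical data: in-sample naive errors are 2, 1, 2, 1 (mean 1.5),
# forecast errors are 1, -0.5, 1 (mean absolute error 2.5 / 3).
train = [3.0, 5.0, 4.0, 6.0, 7.0]
actual = [8.0, 7.0, 9.0]
forecast = [7.0, 7.5, 8.0]

result = mase_non_seasonal(train, actual, forecast)
print(round(result, 3))
```

Here the result is below one, meaning the forecasts under consideration have a smaller mean absolute error than the in-sample naive forecasts.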

Seasonal time series

For a seasonal time series, the mean absolute scaled error is estimated in a manner similar to the method for non-seasonal time series:

$\mathrm {MASE} =\mathrm {mean} \left({\frac {\left|e_{j}\right|}{{\frac {1}{T-m}}\sum _{t=m+1}^{T}\left|Y_{t}-Y_{t-m}\right|}}\right)={\frac {{\frac {1}{J}}\sum _{j}\left|e_{j}\right|}{{\frac {1}{T-m}}\sum _{t=m+1}^{T}\left|Y_{t}-Y_{t-m}\right|}}$ ^[8]

The main difference from the method for non-seasonal time series is that the denominator is the mean absolute error of the one-step "seasonal naive forecast method" on the training set,^[8] which uses the actual value from the prior season as the forecast: $F_{t}=Y_{t-m}$,^[9] where $m$ is the seasonal period.
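A sketch of the seasonal variant follows, assuming a hypothetical function `mase_seasonal` with the seasonal period passed as `m`; the quarterly-style sample data are invented for illustration. Setting `m = 1` reduces the scaling term to the non-seasonal case.

```python
import math

def mase_seasonal(train, actual, forecast, m):
    """MASE scaled by the in-sample seasonal naive forecast F_t = Y_{t-m}."""
    if m < 1 or len(train) <= m:
        raise ValueError("need a positive period m and more than m training observations")
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and of equal length")

    # Denominator: (1 / (T - m)) * sum over t = m+1..T of |Y_t - Y_{t-m}|
    scale = sum(abs(train[t] - train[t - m]) for t in range(m, len(train))) / (len(train) - m)
    if scale == 0:
        raise ValueError("MASE is undefined when the seasonal naive errors are all zero")

    # Numerator: mean absolute forecast error over the J forecast periods
    mean_abs_error = sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)
    return mean_abs_error / scale

# Hypothetical series with period m = 4: seasonal naive errors are 2, 2, 3, 4 (mean 2.75).
train = [10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 33.0, 44.0]
actual = [14.0, 24.0]
forecast = [12.0, 22.0]

seasonal_result = mase_seasonal(train, actual, forecast, m=4)
print(round(seasonal_result, 3))
```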

This scale-free error metric "can be used to compare forecast methods on a single series and also to compare forecast accuracy between series. This metric is well suited to intermittent-demand series (a data set containing a large amount of zeros) because it never gives infinite or undefined values"^[1] except in the irrelevant case where all historical data are equal.^[3]
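The behavior on intermittent-demand data can be illustrated with a short sketch. The helper names (`naive_scale`, `mase`, `mape`) and the demand figures are assumptions made for this example; the MAPE is included only as a contrast, since it divides by each actual value.

```python
import math

def naive_scale(history):
    """In-sample mean absolute error of the one-step naive forecast F_t = Y_{t-1}."""
    return sum(abs(history[t] - history[t - 1]) for t in range(1, len(history))) / (len(history) - 1)

def mase(history, actual, forecast):
    scale = naive_scale(history)
    if scale == 0:
        raise ValueError("MASE is undefined when all historical values are equal")
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual) / scale

def mape(actual, forecast):
    # Divides by each actual value, so a zero actual raises ZeroDivisionError.
    return 100 * sum(abs(y - f) / abs(y) for y, f in zip(actual, forecast)) / len(actual)

# Hypothetical intermittent demand: many periods with zero sales.
history = [0, 3, 0, 0, 2, 0, 1, 0]
actual = [0, 2, 0]
forecast = [1, 1, 1]

try:
    mape(actual, forecast)
    mape_defined = True
except ZeroDivisionError:
    mape_defined = False

mase_value = mase(history, actual, forecast)

# The one exceptional case: a completely constant history gives a zero scale.
try:
    mase([2, 2, 2, 2], actual, forecast)
    constant_history_raises = False
except ValueError:
    constant_history_raises = True

print(mape_defined, round(mase_value, 3), constant_history_raises)
```

Raising an error for a constant history is a design choice in this sketch; an implementation could equally return a not-a-number value.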

When comparing forecasting methods, the method with the lowest MASE is the preferred method.

Non-time series data

For non-time series data, the mean of the data ( ${\bar {Y}}$ ) can be used as the "base" forecast.^[10]

$\mathrm {MASE} =\mathrm {mean} \left({\frac {\left|e_{j}\right|}{{\frac {1}{J}}\sum _{j=1}^{J}\left|Y_{j}-{\bar {Y}}\right|}}\right)={\frac {{\frac {1}{J}}\sum _{j}\left|e_{j}\right|}{{\frac {1}{J}}\sum _{j}\left|Y_{j}-{\bar {Y}}\right|}}$

In this case the MASE is the mean absolute error divided by the mean absolute deviation.
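For non-time series data, the same ratio can be sketched as follows. The function name `mase_non_time_series` and the sample values are hypothetical; as in the formula, the mean absolute deviation is taken over the same $J$ observations that are being predicted.

```python
def mase_non_time_series(actual, predicted):
    """MASE for non-time series data: MAE divided by mean absolute deviation."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")

    count = len(actual)
    mean_actual = sum(actual) / count

    # Denominator: mean absolute deviation of the observations about their mean,
    # i.e. the error of always predicting the mean (the "base" forecast).
    mean_abs_deviation = sum(abs(y - mean_actual) for y in actual) / count
    if mean_abs_deviation == 0:
        raise ValueError("MASE is undefined when all observations are equal")

    # Numerator: mean absolute error of the predictions.
    mean_abs_error = sum(abs(y - p) for y, p in zip(actual, predicted)) / count
    return mean_abs_error / mean_abs_deviation

# Hypothetical data: mean is 5, absolute deviations are 3, 1, 1, 3 (mean 2),
# absolute prediction errors are 1, 0, 1, 1 (mean 0.75).
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 9.0]

ratio = mase_non_time_series(actual, predicted)
print(ratio)
```

A value below one indicates that the predictions have a smaller mean absolute error than always predicting the mean.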

References

[1] Hyndman, R. J. (2006). "Another look at measures of forecast accuracy". FORESIGHT, Issue 4, June 2006, p. 46.
[2] Franses, Philip Hans (2016-01-01). "A note on the Mean Absolute Scaled Error". International Journal of Forecasting. 32 (1): 20–22. doi:10.1016/j.ijforecast.2015.03.008. hdl:1765/78815.
[3] Hyndman, R. J.; Koehler, A. B. (2006). "Another look at measures of forecast accuracy". International Journal of Forecasting. 22 (4): 679–688. doi:10.1016/j.ijforecast.2006.03.001.
[4] Makridakis, Spyros (1993-12-01). "Accuracy measures: theoretical and practical concerns". International Journal of Forecasting. 9 (4): 527–529. doi:10.1016/0169-2070(93)90079-3. S2CID 153403127.
[5] Diebold, Francis X.; Mariano, Roberto S. (1995). "Comparing predictive accuracy". Journal of Business and Economic Statistics. 13 (3): 253–263. doi:10.1080/07350015.1995.10524599.
[6] Diebold, Francis X.; Mariano, Roberto S. (2002). "Comparing predictive accuracy". Journal of Business and Economic Statistics. 20 (1): 134–144. doi:10.1198/073500102753410444. S2CID 12090811.
[7] Diebold, Francis X. (2015). "Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold–Mariano tests". Journal of Business and Economic Statistics. 33 (1): 1. doi:10.1080/07350015.2014.983236.
[8] "2.5 Evaluating forecast accuracy | OTexts". www.otexts.org. Retrieved 2016-05-15.
[9] Hyndman, Rob; et al. (2008). Forecasting with Exponential Smoothing: The State Space Approach. Berlin: Springer-Verlag. ISBN 978-3-540-71916-8.
[10] Hyndman, Rob. "Alternative to MAPE when the data is not a time series". Cross Validated. Retrieved 2022-10-11.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

