The symmetric mean absolute percentage error (SMAPE or sMAPE) is an accuracy measure based on percentage (or relative) errors. It is usually defined as follows:
$$\text{SMAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \frac{|F_t - A_t|}{(|A_t| + |F_t|)/2}$$
where $A_t$ are the actual values and $F_t$ are the forecasted values. Note that if $A_t = F_t = 0$, then the $t$-th term is undefined ($0/0$), and is usually ignored in the summation.
Explaining this equation in words, the absolute difference between $A_t$ and $F_t$ is divided by half the sum of the absolute values of the actual value $A_t$ and the forecast value $F_t$. The value of this calculation is summed over every fitted point $t$ and divided by the number of fitted points $n$.
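The verbal description above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `smape` is hypothetical, the result is scaled to percent, and it assumes that a skipped $0/0$ term contributes zero while $n$ still counts every fitted point (other conventions exist).

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent, on the 0-200 scale.

    Assumption: terms where actual and forecast are both zero (0/0)
    are skipped, but n still counts all fitted points.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one fitted point is required")

    total = 0.0
    for a, f in zip(actual, forecast):
        # Half the sum of the absolute values of actual and forecast.
        denom = (abs(a) + abs(f)) / 2
        if denom == 0:
            # Both values are zero: the term is undefined, so ignore it.
            continue
        total += abs(f - a) / denom

    # Average over the n fitted points and express as a percentage.
    return 100.0 * total / len(actual)


print(smape([100, 100], [110, 90]))
```

With these inputs the two terms are 10/105 and 10/95, so the printed value is the average of the two per-point percentages.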
The earliest reference to a similar formula appears to be Armstrong (1985, p. 348), where it is called "adjusted MAPE" and is defined without the absolute values in the denominator. It was later discussed, modified, and re-proposed by Flores (1986).
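Armstrong's original definition is as follows:
$$\text{adjusted MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \frac{|F_t - A_t|}{(A_t + F_t)/2}$$
The problem is that it can be negative if $A_t + F_t < 0$. Therefore, the currently accepted version of SMAPE uses absolute values in the denominator.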
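To make the sign problem concrete, the sketch below compares a single term of the two definitions. The function names are hypothetical, and the scaling to percent is an assumption made so that the two results are directly comparable.

```python
def adjusted_mape_term(a, f):
    """One term of the Armstrong-style definition, in percent.

    No absolute values are taken in the denominator, so the result is
    negative whenever a + f < 0 (and undefined when a + f == 0).
    """
    return 100.0 * abs(f - a) / ((a + f) / 2)


def smape_term(a, f):
    """One term of the currently accepted definition, in percent."""
    return 100.0 * abs(f - a) / ((abs(a) + abs(f)) / 2)


# For positive data the two definitions agree.
print(adjusted_mape_term(100, 110), smape_term(100, 110))

# For negative data the Armstrong-style term changes sign.
print(adjusted_mape_term(-100, -110), smape_term(-100, -110))
```

Taking absolute values in the denominator keeps every term non-negative, which is what an error measure requires.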
The idea behind SMAPE is that over- and under-forecasts are treated in a relative way, rather than in an absolute way as with the mean absolute percentage error (MAPE). For example, applying the formula above to some actual and forecasted values:
| Actual ($A_t$) | Forecast ($F_t$) | MAPE | SMAPE |
|---|---|---|---|
| 100 | 110 | 10% | 9.52% |
| 100 | 90 | 10% | 10.53% |
we see that MAPE considers an over and underestimation of 10% as equivalent, whereas SMAPE considers the underestimation to be slightly "worse" than the overestimation.
Extending this to larger forecast errors:
| Actual ($A_t$) | Forecast ($F_t$) | MAPE | SMAPE |
|---|---|---|---|
| 100 | 200 | 100% | 66.67% |
| 100 | 50 | 50% | 66.67% |
Here, double overestimation and half underestimation are treated equivalently by SMAPE, whereas MAPE considers the overestimation to be "twice as bad" as the underestimation.
Extending to an even more extreme case:
| Actual ($A_t$) | Forecast ($F_t$) | MAPE | SMAPE |
|---|---|---|---|
| 100 | 1,000 | 900% | 163.64% |
| 100 | 10 | 90% | 163.64% |
Here it becomes clear that MAPE is unbounded from above, and can provide extremely large penalties for overestimations – but cannot do the same for extreme underestimations. SMAPE, on the other hand, is bounded between 0% and 200%, and penalises these larger over and underestimations in a more "symmetric" manner.
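The comparisons in the tables above can be recomputed with a short script. This is an illustrative sketch: the helper names are hypothetical, and each row is treated as a single-point series ($n = 1$), with the first value as the actual and the second as the forecast.

```python
def mape_term(a, f):
    """Absolute percentage error for one point, relative to the actual."""
    return 100.0 * abs(f - a) / abs(a)


def smape_term(a, f):
    """Symmetric absolute percentage error for one point (0-200 scale)."""
    return 100.0 * abs(f - a) / ((abs(a) + abs(f)) / 2)


# (actual, forecast) pairs taken from the three tables.
pairs = [(100, 110), (100, 90), (100, 200), (100, 50), (100, 1000), (100, 10)]

rows = [
    (a, f, round(mape_term(a, f), 2), round(smape_term(a, f), 2))
    for a, f in pairs
]

for a, f, m, s in rows:
    print(f"{a:>6} {f:>6} {m:>8.2f}% {s:>8.2f}%")
```

Because the SMAPE term depends only on the ratio between forecast and actual, a forecast that is ten times too high and one that is ten times too low receive the same penalty, while their MAPE values differ by a factor of ten.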
Therefore, the choice between MAPE and SMAPE depends on the problem at hand and on whether a relative metric is more appropriate. This may be the case when large forecasting errors are expected; for smaller errors, MAPE is more frequently chosen because of its simplicity and ease of interpretation.
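As a "percentage error", SMAPE values between 0% and 100% can be considered easier to interpret, and an alternative formula that omits the factor of one half in the denominator is sometimes used in practice:
$$\text{SMAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \frac{|F_t - A_t|}{|A_t| + |F_t|}$$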
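A minimal sketch of this 0% to 100% variant follows, assuming that it differs from the usual definition only by dropping the factor of one half in the denominator. The function name `smape_0_100` is hypothetical, and $0/0$ terms are skipped while $n$ still counts every point.

```python
def smape_0_100(actual, forecast):
    """SMAPE variant bounded between 0% and 100%.

    Assumption: the denominator is |a| + |f| rather than half that sum,
    so every term lies between 0 and 1.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        if denom == 0:
            # Undefined 0/0 term: ignored in the summation.
            continue
        total += abs(f - a) / denom

    return 100.0 * total / len(actual)


# The upper bound of 100% is reached when actual and forecast
# have opposite signs.
print(smape_0_100([100], [-100]))
```

Under this assumption each value is exactly half of the corresponding value on the 0% to 200% scale.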
There is also a third version of SMAPE, which allows the direction of the bias in the data to be measured by generating positive and negative errors at the line-item level. Furthermore, it is better protected against outliers and the bias effect. The formula is:
$$\text{SMAPE} = \frac{\sum_{t=1}^{n} |F_t - A_t|}{\sum_{t=1}^{n} (A_t + F_t)}$$
Provided the data are strictly positive, an alternative measure of relative accuracy can be obtained based on the log of the accuracy ratio: $\log(F_t / A_t)$. This measure is easier to analyze statistically and has valuable symmetry and unbiasedness properties. When used in constructing forecasting models, the resulting prediction corresponds to the geometric mean (Tofallis, 2015).
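The symmetry of the log accuracy ratio, and its link to the geometric mean, can be illustrated with a small sketch. The function names are hypothetical, and using the sum of squared log ratios as the fitting criterion is an assumption made for this illustration.

```python
import math


def log_accuracy_ratio(a, f):
    """log(F/A) for one strictly positive actual/forecast pair."""
    if a <= 0 or f <= 0:
        raise ValueError("data must be strictly positive")
    return math.log(f / a)


def squared_log_loss(actual, constant_forecast):
    """Sum of squared log accuracy ratios for a single constant forecast."""
    return sum(log_accuracy_ratio(a, constant_forecast) ** 2 for a in actual)


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Doubling and halving give errors of equal size and opposite sign.
print(log_accuracy_ratio(100, 200), log_accuracy_ratio(100, 50))

# Among constant forecasts, the geometric mean of the actual values
# gives a smaller squared log loss than nearby alternatives.
actual = [10, 100, 1000]
g = geometric_mean(actual)
print(g, squared_log_loss(actual, g), squared_log_loss(actual, 1.1 * g))
```

The second comparison only checks neighbouring values; it is a numerical illustration of the geometric-mean property rather than a proof.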