Makridakis Competitions

The Makridakis Competitions (also known as the M Competitions or M-Competitions) are a series of open competitions to evaluate and compare the accuracy of different time series forecasting methods. They are organized by teams led by forecasting researcher Spyros Makridakis and were first held in 1982. [1] [2] [3] [4]

Competitions

Summary

| No. | Informal name for competition | Year of publication of results | Number of time series used | Number of methods tested | Other features |
|---|---|---|---|---|---|
| 1 | M Competition [1] [5] | 1982 | 1001 (a subsample of 111 was used for methods where running all 1001 was too difficult) | 15 (plus 9 variations) | Not real-time |
| 2 | M2 Competition [1] [6] | 1993 | 29 (23 from collaborating companies, 6 from macroeconomic indicators) | 16 (including 5 human forecasters and 11 automatic trend-based methods), plus 2 combined forecasts and 1 overall average | Real-time; many collaborating organizations; competition announced in advance |
| 3 | M3 Competition [1] | 2000 | 3003 | 24 | |
| 4 | M4 Competition | 2020 [7] | 100,000 | All major ML and statistical methods were tested | First winner: Slawek Smyl, Uber Technologies |
| 5 | M5 Competition | Initial results 2021, final 2022 | Around 42,000 hierarchical time series provided by Walmart | All major forecasting methods, including machine learning, deep learning, and statistical ones | First winner, Accuracy challenge: YeonJun In; first winners, Uncertainty challenge: Russ Wolfinger and David Lander |
| 6 | M6 Competition | Initial results 2022, final 2024 | Real-time financial forecasting competition consisting of 50 S&P 500 US stocks and 50 international ETFs | All major forecasting methods, including machine learning, deep learning, and statistical ones | |

First competition in 1982

The first Makridakis Competition, held in 1982, and known in the forecasting literature as the M-Competition, used 1001 time series and 15 forecasting methods (with another nine variations of those methods included). [1] [5] According to a later paper by the authors, the following were the main conclusions of the M-Competition: [1]

  1. Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
  2. The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
  3. Combinations of methods outperform, on average, the individual methods being combined, and do very well in comparison with other methods.
  4. The accuracy of the various methods depends on the length of the forecasting horizon involved.
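The third finding, that combining forecasts tends to beat the individual methods being combined, can be illustrated with a minimal, self-contained sketch. The toy series, the two component forecasters, and the equal weights below are chosen purely for illustration; the competition evaluated many more methods over thousands of series.

```python
def naive_forecast(history):
    # forecast the next value as the last observed value
    return history[-1]

def mean_forecast(history):
    # forecast the next value as the historical mean
    return sum(history) / len(history)

def combined_forecast(history):
    # equal-weight combination of the two individual forecasts
    return (naive_forecast(history) + mean_forecast(history)) / 2

def one_step_mae(series, forecaster, start=4):
    # mean absolute error of rolling one-step-ahead forecasts
    errors = [abs(series[t] - forecaster(series[:t])) for t in range(start, len(series))]
    return sum(errors) / len(errors)

series = [10, 12, 11, 13, 12, 14, 13, 15, 14, 16]
for name, f in [("naive", naive_forecast), ("mean", mean_forecast), ("combined", combined_forecast)]:
    print(name, round(one_step_mae(series, f), 3))
```

On this small trending series the equal-weight combination attains a lower one-step MAE than either component, echoing the finding; real evaluations average such comparisons over many series, horizons, and accuracy measures.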

The findings of the study have been verified and replicated through the use of new methods by other researchers. [8] [9] [10]

According to Rob J. Hyndman, "... anyone could submit forecasts, making this the first true forecasting competition as far as I am aware." [7]

Newbold (1983) was critical of the M-competition, and argued against the general idea of using a single competition to attempt to settle the complex issue. [11]

Before the first M-Competition, Makridakis and Hibon [12] published an article in the Journal of the Royal Statistical Society (JRSS) showing that simple methods perform well in comparison with more complex and statistically sophisticated ones. Statisticians at the time criticized the results, claiming they could not be correct. This criticism motivated the subsequent M, M2 and M3 Competitions, which proved the thesis of the Makridakis and Hibon study.[citation needed]

Second competition, published in 1993

The second competition, called the M-2 Competition or M2-Competition, was conducted on a larger scale. A call to participate was published in the International Journal of Forecasting , announcements were made in the International Symposium of Forecasting, and a written invitation was sent to all known experts on the various time series methods. The M2-Competition was organized in collaboration with four companies and included six macroeconomic series, and was conducted on a real-time basis. Data was from the United States. [1] The results of the competition were published in a 1993 paper. [6] The results were claimed to be statistically identical to those of the M-Competition. [1]

The M2-Competition used far fewer time series than the original M-Competition: whereas the original had used 1001 time series, the M2-Competition used only 29, including 23 from the four collaborating companies and 6 macroeconomic series. [6] Data from the companies were obfuscated with a constant multiplier in order to preserve proprietary privacy. [6] The purpose of the M2-Competition was to simulate real-world forecasting conditions more closely. [6]

The organization of the competition is described in detail in the published results. [6]

In addition to the published results, many of the participants wrote short articles describing their experience of participating in the competition and their reflections on what it demonstrated. Chris Chatfield praised the design of the competition but said that, despite the organizers' best efforts, forecasters still lacked the inside access to the companies that they would have had in real-world forecasting. [13] Fildes and Makridakis (1995) argue that despite the evidence produced by these competitions, the implications continued to be ignored by theoretical statisticians. [14]

Third competition, published in 2000

The third competition, called the M-3 Competition or M3-Competition, was intended to both replicate and extend the features of the M-competition and M2-Competition, through the inclusion of more methods and researchers (particularly researchers in the area of neural networks) and more time series. [1] A total of 3003 time series was used. The paper documenting the results of the competition was published in the International Journal of Forecasting [1] in 2000 and the raw data was also made available on the International Institute of Forecasters website. [4] According to the authors, the conclusions from the M3-Competition were similar to those from the earlier competitions. [1]

The time series included yearly, quarterly, monthly, daily, and other time series. In order to ensure that enough data was available to develop an accurate forecasting model, minimum thresholds were set for the number of observations: 14 for yearly series, 16 for quarterly series, 48 for monthly series, and 60 for other series. [1]

Time series were in the following domains: micro, industry, macro, finance, demographic, and other. [1] [4] Below is the number of time series based on the time interval and the domain: [1] [4]

| Time interval between successive observations | Micro | Industry | Macro | Finance | Demographic | Other | Total |
|---|---|---|---|---|---|---|---|
| Yearly | 146 | 102 | 83 | 58 | 245 | 11 | 645 |
| Quarterly | 204 | 83 | 336 | 76 | 57 | 0 | 756 |
| Monthly | 474 | 334 | 312 | 145 | 111 | 52 | 1428 |
| Other | 4 | 0 | 0 | 29 | 0 | 141 | 174 |
| Total | 828 | 519 | 731 | 308 | 413 | 204 | 3003 |

The five measures used to evaluate the accuracy of different forecasts were: symmetric mean absolute percentage error (symmetric MAPE), average ranking, median symmetric absolute percentage error (median symmetric APE), percentage better, and median relative absolute error (median RAE). [1]
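Two of these measures are straightforward to compute. The sketch below implements the symmetric MAPE and the median symmetric APE in the form commonly associated with the M competitions, scaling each error by the average of the actual and forecast values; exact definitional details, such as the handling of values near zero, have varied across studies.

```python
def sym_ape(actual, forecast):
    # symmetric absolute percentage error of a single forecast, in percent
    return 200 * abs(actual - forecast) / (abs(actual) + abs(forecast))

def smape(actuals, forecasts):
    # symmetric MAPE: mean of the per-point symmetric APEs
    terms = [sym_ape(a, f) for a, f in zip(actuals, forecasts)]
    return sum(terms) / len(terms)

def median_sym_ape(actuals, forecasts):
    # median of the per-point symmetric APEs
    terms = sorted(sym_ape(a, f) for a, f in zip(actuals, forecasts))
    n = len(terms)
    return terms[n // 2] if n % 2 else (terms[n // 2 - 1] + terms[n // 2]) / 2

print(round(smape([100, 200], [110, 180]), 2))  # → 10.03
```

Unlike ordinary MAPE, the symmetric variant penalizes over- and under-forecasts of the same absolute size more evenly, which is why it was preferred in M3 and M4.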

A number of other papers have been published with different analyses of the data set from the M3-Competition. [2] [3] According to Rob J. Hyndman, Editor-in-Chief of the International Journal of Forecasting (IJF), "The M3 data have continued to be used since 2000 for testing new time series forecasting methods. In fact, unless a proposed forecasting method is competitive against the original M3 participating methods, it is difficult to get published in the IJF."

Fourth competition (2018)

The fourth competition, M4, was announced in November 2017. [15] The competition started on January 1, 2018 and ended on May 31, 2018. Initial results were published in the International Journal of Forecasting on June 21, 2018. [16]

The M4 extended and replicated the results of the previous three competitions, using an extended and diverse set of time series to identify the most accurate forecasting method(s) for different types of predictions. It aimed to answer how forecasting accuracy can be improved and to identify the most appropriate methods for each case. To obtain precise and compelling answers, the M4 Competition used 100,000 real-life series and incorporated all major forecasting methods, including those based on Artificial Intelligence (machine learning, ML) as well as traditional statistical ones.

In his blog, Rob J. Hyndman said about M4: "The "M" competitions organized by Spyros Makridakis have had an enormous influence on the field of forecasting. They focused attention on what models produced good forecasts, rather than on the mathematical properties of those models. For that, Spyros deserves congratulations for changing the landscape of forecasting research through this series of competitions." [17]

Below is the number of time series based on the time interval and the domain:

| Time interval between successive observations | Micro | Industry | Macro | Finance | Demographic | Other | Total |
|---|---|---|---|---|---|---|---|
| Yearly | 6538 | 3716 | 3903 | 6519 | 1088 | 1236 | 23000 |
| Quarterly | 6020 | 4637 | 5315 | 5305 | 1858 | 865 | 24000 |
| Monthly | 10975 | 10017 | 10016 | 10987 | 5728 | 277 | 48000 |
| Weekly | 112 | 6 | 41 | 164 | 24 | 12 | 359 |
| Daily | 1476 | 422 | 127 | 1559 | 10 | 633 | 4227 |
| Hourly | 0 | 0 | 0 | 0 | 0 | 414 | 414 |
| Total | 25121 | 18798 | 19402 | 24534 | 8708 | 3437 | 100000 |

In order to ensure that enough data are available to develop an accurate forecasting model, minimum thresholds were set for the number of observations: 13 for yearly, 16 for quarterly, 42 for monthly, 80 for weekly, 93 for daily and 700 for hourly series.

One of its major objectives was to compare the accuracy of ML methods versus that of statistical ones and empirically verify the claims of the superior performance of ML methods.

Below is a short description of the M4 Competition and its major findings and conclusion:

The M4 Competition ended on May 31, 2018, and in addition to point forecasts it required participants to specify Prediction Intervals (PIs). M4 was an open competition whose most important objective, the same as that of the previous three M Competitions, was "to learn to improve forecasting accuracy and advance the field as much as possible".
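Scoring prediction intervals begins with empirical coverage: a well-calibrated 95% PI should contain roughly 95% of the actual values, and coverage far below the nominal level signals that a method underestimates uncertainty. A minimal sketch, using toy numbers purely for illustration:

```python
def pi_coverage(actuals, lower, upper):
    # fraction of actual values falling inside their prediction intervals
    inside = sum(1 for a, lo, hi in zip(actuals, lower, upper) if lo <= a <= hi)
    return inside / len(actuals)

actuals = [102, 98, 105, 110, 95]
lower   = [90, 95, 100, 100, 97]
upper   = [110, 105, 104, 120, 103]
print(pi_coverage(actuals, lower, upper))  # → 0.6
```

Here only 60% of the actuals fall inside the intervals, so a forecaster claiming these as 95% PIs would be judged to have considerably underestimated uncertainty; M4's interval scoring also rewarded intervals for being narrow, not just well-calibrated.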

The five major findings and the conclusion of M4, as outlined by the organizers:

  1. The combination of methods was the king of the M4: of the 17 most accurate methods, 12 were "combinations" of mostly statistical approaches.
  2. The biggest surprise, however, was a "hybrid" approach utilizing both statistical and ML features. This method, submitted by Slawek Smyl, a data scientist at Uber Technologies, produced both the most accurate forecasts and the most precise PIs. According to sMAPE, it was close to 10% more accurate (a huge improvement) than the combination (Comb) benchmark of the competition. By comparison, in the M3 Competition (Makridakis & Hibon, 2000) the best method was 4% more accurate than the same combination benchmark.
  3. The second most accurate method was a combination of seven statistical methods and one ML method, with the weights for the averaging calculated by an ML algorithm trained to minimize forecasting error through holdout tests. This method was jointly submitted by Spain's University of A Coruña and Australia's Monash University.
  4. The first and second most accurate methods also achieved remarkable success in correctly specifying the 95% PIs. According to the organizers, they were the first known methods to do so without considerably underestimating uncertainty.
  5. The six pure ML methods submitted to the M4 performed poorly: none was more accurate than Comb, and only one was more accurate than Naïve2. These results agree with those of a study published in PLoS One (Makridakis et al., 2018). [18]

The conclusion from these findings is that the accuracy of individual statistical or ML methods is relatively low, and that hybrid approaches and combinations of methods are the way forward for improving forecasting accuracy and making forecasting more valuable.
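The Naïve2 benchmark referred to in the findings is a seasonally adjusted naive forecast: deseasonalize the series, repeat its last deseasonalized value, then reseasonalize. The sketch below uses crude multiplicative seasonal indices purely for illustration; the competitions computed the adjustment via classical multiplicative decomposition, applied only when a seasonality test fired.

```python
def seasonal_indices(series, period):
    # crude multiplicative indices: mean of each season over the overall mean
    overall = sum(series) / len(series)
    return [(sum(series[s::period]) / len(series[s::period])) / overall
            for s in range(period)]

def naive2(series, period, horizon):
    # deseasonalize, take the naive (last-value) forecast, then reseasonalize
    idx = seasonal_indices(series, period)
    deseasonalized = [y / idx[t % period] for t, y in enumerate(series)]
    last = deseasonalized[-1]
    n = len(series)
    return [last * idx[(n + h) % period] for h in range(horizon)]

quarterly = [10, 20, 30, 40, 12, 22, 32, 42]
print(naive2(quarterly, 4, 4))
```

On this toy quarterly series the forecasts repeat the last deseasonalized level shaped by the quarterly pattern; the forecast exactly one seasonal cycle ahead equals the last observation, as expected for a seasonal naive-style benchmark.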

Fifth competition (2020)

M5 commenced on March 3, 2020, and the results were declared on July 1, 2020. It used real-life data from Walmart and was conducted on the Kaggle platform, offering substantial prizes totaling US$100,000 to the winners. The data consisted of around 42,000 hierarchical daily time series, starting at the level of SKUs and ending with the total demand of some large geographical area. In addition to the sales data, there was also information about prices, advertising/promotional activity, and inventory levels, as well as the day of the week each data point refers to.

There were several major prizes for the first-, second- and third-place winners in the accuracy and uncertainty categories.

There were also student and company prizes. There was no limit to the number of prizes that could be won by a single participant or team.

The focus of the M5 was mainly on practitioners rather than academics. The competition attracted considerable interest, with close to 6,000 participants and teams.

Findings and Conclusions

This was the first of the "M" competitions to feature primarily machine learning methods at the top of its leaderboard. All of the top-performing methods were "pure ML approaches and better than all statistical benchmarks and their combinations." [19] The LightGBM model, as well as deep neural networks, featured prominently in top submissions. Consistent with the M4 Competition, the three best performers each employed ensembles, or combinations, of separately trained and tuned models, where each model had a different training procedure and training dataset.

Offshoots

NN3-Competition

Although the organizers of the M3-Competition did contact researchers in the area of artificial neural networks (ANN) to seek their participation in the competition, only one researcher participated, and that researcher's forecasts fared poorly. The reluctance of most ANN researchers to participate at the time was due to the computationally intensive nature of ANN-based forecasting and the large number of time series used for the competition. [1] In 2005, Crone, Nikolopoulos and Hibon organized the NN3-Competition, using 111 of the time series from the M3-Competition (not the same data, because it was shifted in time, but the same sources). The NN3-Competition found that the best ANN-based forecasts performed comparably with the best known forecasting methods, but were far more computationally intensive. It was also noted that many ANN-based techniques fared considerably worse than simple forecasting methods, despite greater theoretical potential for good performance. [20]

Reception

Nassim Nicholas Taleb, in his book The Black Swan , references the Makridakis Competitions as follows: "The most interesting test of how academic methods fare in the real world was provided by Spyros Makridakis, who spent part of his career managing competitions between forecasters who practice a "scientific method" called econometrics—an approach that combines economic theory with statistical measurements. Simply put, he made people forecast in real life and then he judged their accuracy. This led to a series of "M-Competitions" he ran, with assistance from Michele Hibon, of which M3 was the third and most recent one, completed in 1999. Makridakis and Hibon reached the sad conclusion that "statistically sophisticated and complex methods do not necessarily provide more accurate forecasts than simpler ones."" [21]

In the book Everything is Obvious, Duncan Watts cites the work of Makridakis and Hibon as showing that "simple models are about as good as complex models in forecasting economic time series." [22]

References

  1. Makridakis, Spyros; Hibon, Michèle (October 2000). "The M3-Competition: results, conclusions and implications". International Journal of Forecasting. 16 (4): 451–476. doi:10.1016/S0169-2070(00)00057-1. S2CID 14583743.
  2. Koning, Alex J.; Franses, Philip Hans; Hibon, Michèle; Stekler, H.O. (July 2005). "The M3 competition: Statistical tests of the results". International Journal of Forecasting. 21 (3): 397–409. doi:10.1016/j.ijforecast.2004.10.003.
  3. Hyndman, Rob J.; Koehler, Anne B. (October 2006). "Another look at measures of forecast accuracy". International Journal of Forecasting. 22 (4): 679–688. doi:10.1016/j.ijforecast.2006.03.001. S2CID 15947215.
  4. "M3-competition (full data)". International Institute of Forecasters. 12 February 2012. Retrieved April 19, 2014.
  5. Makridakis, S.; Andersen, A.; Carbone, R.; Fildes, R.; Hibon, M.; Lewandowski, R.; Newton, J.; Parzen, E.; Winkler, R. (April 1982). "The accuracy of extrapolation (time series) methods: Results of a forecasting competition". Journal of Forecasting. 1 (2): 111–153. doi:10.1002/for.3980010202. S2CID 154413915.
  6. Makridakis, Spyros; Chatfield, Chris; Hibon, Michèle; Lawrence, Michael; Mills, Terence; Ord, Keith; Simmons, LeRoy F. (April 1993). "The M2-competition: A real-time judgmentally based forecasting study". International Journal of Forecasting. 9 (1): 5–22. doi:10.1016/0169-2070(93)90044-N.
  7. Makridakis, Spyros; Spiliotis, Evangelos; Assimakopoulos, Vassilios (January 2020). "The M4 Competition: 100,000 time series and 61 forecasting methods". International Journal of Forecasting. 36 (1): 54–74. doi:10.1016/j.ijforecast.2019.04.014.
  8. Geurts, M. D.; Kelly, J. P. (1986). "Forecasting demand for special services". International Journal of Forecasting. 2: 261–272. doi:10.1016/0169-2070(86)90046-4.
  9. Clemen, Robert T. (1989). "Combining forecasts: A review and annotated bibliography". International Journal of Forecasting. 5 (4): 559–583. doi:10.1016/0169-2070(89)90012-5.
  10. Fildes, R.; Hibon, Michele; Makridakis, Spyros; Meade, N. (1998). "Generalising about univariate forecasting methods: further empirical evidence". International Journal of Forecasting. 14 (3): 339–358. doi:10.1016/s0169-2070(98)00009-0. S2CID 154465504.
  11. Newbold, Paul (1983). "The competition to end all competitions". Journal of Forecasting. 2: 276–279.
  12. Makridakis, Spyros; Hibon, Michèle (1979). "Accuracy of Forecasting: An Empirical Investigation". Journal of the Royal Statistical Society. Series A (General). 142 (2): 97–145. doi:10.2307/2345077. JSTOR 2345077. S2CID 173769248.
  13. Chatfield, Chris (April 1993). "A personal view of the M2-competition". International Journal of Forecasting. 9 (1): 23–24. doi:10.1016/0169-2070(93)90045-O.
  14. Fildes, R.; Makridakis, Spyros (1995). "The impact of empirical accuracy studies on time series analysis and forecasting". International Statistical Review. 63 (3): 289–308. doi:10.2307/1403481. JSTOR 1403481.
  15. "Announcing the Makridakis M4 Forecasting Competition". University of Nicosia. Archived from the original on 2017-12-01. Retrieved 2017-11-30.
  16. Makridakis, Spyros; Spiliotis, Evangelos; Assimakopoulos, Vassilios (October 2018). "The M4 Competition: Results, findings, conclusion and way forward". International Journal of Forecasting. 34 (4): 802–808. doi:10.1016/j.ijforecast.2018.06.001. S2CID 158696437.
  17. "M4 Forecasting Competition". Rob J Hyndman. 19 November 2017.
  18. Makridakis, Spyros; Spiliotis, Evangelos; Assimakopoulos, Vassilios (2018-03-27). "Statistical and Machine Learning forecasting methods: Concerns and ways forward". PLoS One. 13 (3): e0194889. Bibcode:2018PLoSO..1394889M. doi:10.1371/journal.pone.0194889. ISSN 1932-6203. PMC 5870978. PMID 29584784.
  19. Makridakis, Spyros; Spiliotis, Evangelos; Assimakopoulos, Vassilios (October 2022). "M5 accuracy competition: Results, findings, and conclusions". International Journal of Forecasting. 38 (4): 1346–1364. doi:10.1016/j.ijforecast.2021.11.013. ISSN 0169-2070.
  20. Crone, Sven F.; Nikolopoulos, Konstantinos; Hibon, Michele (June 2005). "Automatic Modelling and Forecasting with Artificial Neural Networks – A forecasting competition evaluation". Retrieved April 23, 2014.
  21. Taleb, Nassim Nicholas (2005). Fooled by Randomness. Random House Trade Paperbacks. ISBN 978-0-8129-7521-5. Page 154; available for online viewing at the Internet Archive.
  22. Watts, Duncan (2011). Everything is Obvious. Crown. ISBN 978-0307951793. Page 315.