All models are wrong

All models are wrong is a common aphorism and anapodoton in statistics; it is often expanded as "All models are wrong, but some are useful". The aphorism acknowledges that statistical models always fall short of the complexities of reality but can nonetheless be useful. It originally referred only to statistical models, but it is now sometimes applied to scientific models in general. [1]


The aphorism is generally attributed to George E. P. Box, a British statistician, although the underlying concept predates Box's writings.

Quotations of George Box

George Box

The first record of Box saying "all models are wrong" is in a 1976 paper published in the Journal of the American Statistical Association. [2] The 1976 paper contains the aphorism twice. The two sections of the paper that contain the aphorism state:

2.3  Parsimony
Since all models are wrong the scientist cannot obtain a "correct" one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.
2.4  Worrying Selectively

Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about safety from mice when there are tigers abroad.

Box repeated the aphorism in a paper that was published in the proceedings of a 1978 statistics workshop. [3] The paper contains a section entitled "All models are wrong but some are useful". The section states (pp. 202–203):

Now it would be very remarkable if any system existing in the real world could be exactly represented by any simple model. However, cunningly chosen parsimonious models often do provide remarkably useful approximations. For example, the law PV = nRT relating pressure P, volume V and temperature T of an "ideal" gas via a constant R is not exactly true for any real gas, but it frequently provides a useful approximation and furthermore its structure is informative since it springs from a physical view of the behavior of gas molecules. For such a model there is no need to ask the question "Is the model true?". If "truth" is to be the "whole truth" the answer must be "No". The only question of interest is "Is the model illuminating and useful?".
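Box's PV = nRT example can be made concrete with a short numerical sketch. The comparison below uses the van der Waals equation as a "less wrong" reference model; that choice, and the textbook van der Waals constants for CO2, are our illustrative assumptions, not Box's. Near atmospheric pressure the ideal-gas law agrees with the van der Waals prediction to well under 1%; for the same gas compressed into one litre, the discrepancy exceeds 10%.

```python
# Compare the ideal-gas law PV = nRT with the van der Waals equation
# for 1 mol of CO2.  The point is Box's: the "wrong" ideal model is a
# very good approximation in the regime where it is typically used.
R = 8.314                # gas constant, J/(mol K)
a, b = 0.364, 4.27e-5    # van der Waals constants for CO2 (SI units)
n, T = 1.0, 300.0        # 1 mol of gas at 300 K

def p_ideal(V):
    """Pressure (Pa) predicted by the ideal-gas law."""
    return n * R * T / V

def p_vdw(V):
    """Pressure (Pa) predicted by the van der Waals equation."""
    return n * R * T / (V - n * b) - a * n**2 / V**2

for V in (22.4e-3, 1.0e-3):   # ~1 atm, then the same gas in 1 litre
    rel_err = abs(p_ideal(V) - p_vdw(V)) / p_vdw(V)
    print(f"V = {V * 1000:5.1f} L  relative error of ideal law: {rel_err:.1%}")
```

At 22.4 L the two models differ by about half a percent; at 1 L they differ by more than 10%, illustrating that whether the ideal model is "good enough" depends on the application.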

Box repeated the aphorism twice more in his 1987 book, Empirical Model-Building and Response Surfaces (which was co-authored with Norman Draper). [4] The first repetition is on p. 74: "Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful." The second repetition is on p. 424, which is excerpted below.

... all models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind....

A second edition of the book was published in 2007, under the title Response Surfaces, Mixtures, and Ridge Analyses. The second edition also repeats the aphorism twice, in contexts identical with those of the first edition (on p. 63 and p. 414). [5]

Box repeated the aphorism two more times in his 1997 book, Statistical Control: By Monitoring and Feedback Adjustment (which was co-authored with Alberto Luceño). [6] The first repetition is on p. 6, which is excerpted below.

It has been said that "all models are wrong but some models are useful." In other words, any model is at best a useful fiction—there never was, or ever will be, an exactly normal distribution or an exact linear relationship. Nevertheless, enormous progress has been made by entertaining such fictions and using them as approximations.
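Box's point that an "exact linear relationship" never holds, yet linear fictions remain useful, can be illustrated with a minimal sketch. The choice of y = sin x as the "true" curved relationship is ours, purely for illustration:

```python
import numpy as np

# Fit a straight line to a relationship that is not exactly linear.
# The linear model is wrong -- the residuals are never exactly zero --
# but on this interval it is a useful fiction: the worst-case error is
# under 0.01 while the data range up to sin(0.5) ~ 0.48.
x = np.linspace(0.0, 0.5, 101)
y = np.sin(x)

slope, intercept = np.polyfit(x, y, 1)      # least-squares straight line
residuals = y - (slope * x + intercept)

print(f"max |residual| = {np.abs(residuals).max():.4f}")
```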

The second repetition is on p. 9: "So since all models are wrong, it is very important to know what to worry about; or, to put it in another way, what models are likely to produce procedures that work in practice (where exact assumptions are never true)".

A second edition of the book was published in 2009, under the title Statistical Control By Monitoring and Adjustment (co-authored with Alberto Luceño and María del Carmen Paniagua-Quiñones). The second edition also repeats the aphorism two times. [7] The first repetition is on p. 61, which is excerpted below.

All models are approximations. Assumptions, whether implied or clearly stated, are never exactly true. All models are wrong, but some models are useful. So the question you need to ask is not "Is the model true?" (it never is) but "Is the model good enough for this particular application?"

The second repetition is on p. 63; its context is essentially the same as that of the second repetition in the first edition.

Box's widely cited book Statistics for Experimenters (co-authored with William Hunter) does not include the aphorism in its first edition (published in 1978). [8] The second edition (published in 2005; co-authored with William Hunter and J. Stuart Hunter) includes the aphorism three times: on p. 208, p. 384, and p. 440. [9] On p. 440, the relevant sentence is this: "The most that can be expected from any model is that it can supply a useful approximation to reality: All models are wrong; some models are useful".

In addition to stating the aphorism verbatim, Box sometimes stated the essence of the aphorism with different words. One example is from 1978, while Box was President of the American Statistical Association. At the annual meeting of the Association, Box delivered his Presidential Address, wherein he stated this: "Models, of course, are never true, but fortunately it is only necessary that they be useful". [10]

Discussions

There have been varied discussions about the aphorism. A selection from those discussions is presented below.

In 1983, the statisticians Peter McCullagh and John Nelder published their much-cited book on generalized linear models. The book includes a brief discussion of the aphorism (though without citing Box). [11] A second edition of the book, published in 1989, contains a very similar discussion of the aphorism. [12] The discussion from the first edition is as follows.

Modelling in science remains, partly at least, an art. Some principles do exist, however, to guide the modeller. The first is that all models are wrong; some, though, are better than others and we can search for the better ones. At the same time we must recognize that eternal truth is not within our grasp.

In 1995, the statistician David Cox commented as follows. [13]

... it does not seem helpful just to say that all models are wrong. The very word model implies simplification and idealization. The idea that complex physical, biological or sociological systems can be exactly described by a few formulae is patently absurd. The construction of idealized representations that capture important stable aspects of such systems is, however, a vital part of general scientific analysis and statistical models, especially substantive ones, do not seem essentially different from other kinds of model.

In 1996, an Applied Statistician's Creed was proposed by M. R. Nester. [14] The creed includes the aphorism in its core part.

In 2002, K. P. Burnham and D. R. Anderson published their much-cited book on statistical model selection. The book states the following. [15]

A model is a simplification or approximation of reality and hence will not reflect all of reality. ... Box noted that "all models are wrong, but some are useful." While a model can never be "truth," a model might be ranked from very useful, to useful, to somewhat useful to, finally, essentially useless.

The statistician J. Michael Steele has commented on the aphorism as follows. [16]

... there are wonderful models — like city maps....

If I say that a map is wrong, it means that a building is misnamed, or the direction of a one-way street is mislabeled. I never expected my map to recreate all of physical reality, and I only feel ripped off if my map does not correctly answer the questions that it claims to answer.

My maps of Philadelphia are useful. Moreover, except for a few that are out-of-date, they are not wrong.

So, you say, "Yes, a map can be thought of as a model, but surely it would be more precise to say that a map is a 'visually enhanced database.' Such databases can be correct. These are not the kinds of models that Box had in mind."

I agree. ...

In 2008, the statistician Andrew Gelman responded to that, saying in particular the following. [17]

I take his general point, which is that a street map could be exactly correct, to the resolution of the map.

... The saying, "all models are wrong," is helpful because it is not completely obvious....

This is a simple point, and I can see how Steele can be irritated by people making a big point about it. But, the trouble is, many people don't realize that all models are wrong.

In 2013, the philosopher of science Peter Truran published an essay related to the aphorism. [18] The essay notes, in particular, the following.

... seemingly incompatible models may be used to make predictions about the same phenomenon. ... For each model we may believe that its predictive power is an indication of its being at least approximately true. But if both models are successful in making predictions, and yet mutually inconsistent, how can they both be true? Let us consider a simple illustration. Two observers are looking at a physical object. One may report seeing a circular disc, and the other may report seeing a rectangle. Both will be correct, but one will be looking at the object (a cylindrical can) from above and the other will be observing from the side. The two models represent different aspects of the same reality.

Truran's essay further notes that Newton's theory of gravitation has been supplanted by Einstein's theory of relativity and yet Newton's theory remains generally "empirically adequate". Indeed, Newton's theory generally has excellent predictive power. Yet Newton's theory is not an approximation of Einstein's theory. For illustration, consider an apple falling down from a tree. Under Newton's theory, the apple falls because Earth exerts a force on the apple—what is called "the force of gravity". Under Einstein's theory, Earth does not exert any force on the apple. [19] Hence, Newton's theory might be regarded as being, in some sense, completely wrong but extremely useful. (The usefulness of Newton's theory comes partly from being vastly simpler, both mathematically and computationally, than Einstein's theory.)
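The sense in which Newton's theory is "completely wrong but extremely useful" can be given rough numerical form with a back-of-the-envelope sketch. The 3 m drop height is an arbitrary illustrative choice; the dimensionless general-relativistic correction at Earth's surface is of order GM/(Rc²), which is under one part in a billion, so the Newtonian prediction for a falling apple is accurate to roughly nine digits even though its explanatory picture is entirely different.

```python
import math

# Newtonian prediction for an apple falling 3 m, and the rough size of
# the general-relativistic correction GM/(R c^2) at Earth's surface.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24         # mass of Earth, kg
R_earth = 6.371e6    # radius of Earth, m
c = 2.998e8          # speed of light, m/s

g = G * M / R_earth**2            # Newtonian surface gravity, ~9.82 m/s^2
h = 3.0                           # drop height, m
t = math.sqrt(2 * h / g)          # Newtonian fall time, ~0.78 s

correction = G * M / (R_earth * c**2)   # relative GR correction, ~7e-10

print(f"fall time ~= {t:.3f} s, GR correction ~ {correction:.1e}")
```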

In 2014, the statistician David Hand made the following statement. [20]

In general, when building statistical models, we must not forget that the aim is to understand something about the real world. Or predict, choose an action, make a decision, summarize evidence, and so on, but always about the real world, not an abstract mathematical world: our models are not the reality—a point well made by George Box in his oft-cited remark that "all models are wrong, but some are useful".

In 2016, P. J. Bickel and K. A. Doksum published the second volume of their book on mathematical statistics. The volume includes the quote from Box's Presidential Address, given above. It states that the quote is the best formulation of the "guiding principle of modern statistics". [21]

Historical antecedents

Although the aphorism seems to have originated with George Box, the underlying concept goes back decades, perhaps centuries. Some examples are given below.

In 1960, Georg Rasch said the following.

... no models are [true], not even the Newtonian laws. When you construct a model you leave out all the details which you, with the knowledge at your disposal, consider inessential.... Models should not be true, but it is important that they are applicable, and whether they are applicable for any given purpose must of course be investigated. This also means that a model is never accepted finally, only on trial.

Rasch, G. (1960), Probabilistic Models for Some Intelligence and Attainment Tests, Copenhagen: Danmarks Paedagogiske Institut, pp. 37–38; republished in 1980 by University of Chicago Press

In 1947, the mathematician John von Neumann said that "truth ... is much too complicated to allow anything but approximations". [22]

In 1942, the French philosopher-poet Paul Valéry said the following. [23]

In 1939, the founder of statistical process control, Walter Shewhart, said the following. [25]

... no model can ever be theoretically attainable that will completely and uniquely characterize the indefinitely expansible concept of a state of statistical control. What is perhaps even more important, on the basis of a finite portion of the sequence [X1, X2, X3, ...]—and we can never have more than a finite portion—we can not reasonably hope to construct a model that will represent exactly any specific characteristic of a particular state of control even though such a state actually exists. Here the situation is much like that in physical science where we find a model of a molecule; any model is always an incomplete though useful picture of the conceived physical thing called a molecule.

Shewhart, W. A. (1939), Statistical Method From the Viewpoint of Quality Control, U.S. Department of Agriculture, p. 19

In 1923, a related idea was articulated by the artist Pablo Picasso.

We all know that art is not truth. Art is a lie that makes us realize truth, at least the truth that is given us to understand. The artist must know the manner whereby to convince others of the truthfulness of his lies.

Picasso, Pablo (1923), "Picasso speaks", The Arts, 3: 315–326; [26] reprinted in Barr, Alfred H. Jr. (1939), Picasso: Forty Years of his Art (PDF), Museum of Modern Art, pp. 9–12

Notes

  1. Skogen, M. D.; Ji, R.; Akimova, A.; Daewel, U.; and eleven others (2021), "Disclosing the truth: Are models better than observations?", Marine Ecology Progress Series, 680: 7–13, Bibcode:2021MEPS..680....7S, doi:10.3354/meps13574, S2CID 229617529.
  2. Box, George E. P. (1976), "Science and statistics", Journal of the American Statistical Association, 71 (356): 791–799, doi:10.1080/01621459.1976.10480949.
  3. Box, G. E. P. (1979), "Robustness in the strategy of scientific model building", in Launer, R. L.; Wilkinson, G. N. (eds.), Robustness in Statistics, Academic Press, pp. 201–236, doi:10.1016/B978-0-12-438150-6.50018-2, ISBN 978-1-4832-6336-6.
  4. Box, G. E. P.; Draper, N. R. (1987), Empirical Model-Building and Response Surfaces, John Wiley & Sons.
  5. Box, G. E. P.; Draper, N. R. (2007), Response Surfaces, Mixtures, and Ridge Analyses, John Wiley & Sons.
  6. Box, G. E. P.; Luceño, A. (1997), Statistical Control: By Monitoring and Feedback Adjustment, John Wiley & Sons.
  7. Box, G. E. P.; Luceño, A.; Paniagua-Quiñones, M. del Carmen (2009), Statistical Control By Monitoring and Adjustment, John Wiley & Sons.
  8. Box, G. E. P.; Hunter, W. G. (1978), Statistics for Experimenters, John Wiley & Sons.
  9. Box, G. E. P.; Hunter, J. S.; Hunter, W. G. (2005), Statistics for Experimenters (2nd ed.), John Wiley & Sons.
  10. Box, G. E. P. (1979), "Some problems of statistics and everyday life", Journal of the American Statistical Association, 74 (365): 1–4, doi:10.2307/2286713, JSTOR 2286713.
  11. McCullagh, P.; Nelder, J. A. (1983), Generalized Linear Models, Chapman & Hall, §1.1.4.
  12. McCullagh, P.; Nelder, J. A. (1989), Generalized Linear Models (2nd ed.), Chapman & Hall, §1.1.4.
  13. Cox, D. R. (1995), "Comment on 'Model uncertainty, data mining and statistical inference'", Journal of the Royal Statistical Society, Series A, 158: 455–456.
  14. Nester, M. R. (1996), "An applied statistician's creed", Journal of the Royal Statistical Society, Series C, 45 (4): 401–410, doi:10.2307/2986064, JSTOR 2986064.
  15. Burnham, K. P.; Anderson, D. R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.), Springer-Verlag, §1.2.5. [As of February 2022, combined editions of this book have over 60,000 citations on Google Scholar.]
  16. Steele, J. M., "Models: Masterpieces and Lame Excuses".
  17. Gelman, A. (12 June 2008), "Some thoughts on the saying, 'All models are wrong, but some are useful'".
  18. Truran, P. (2013), "Models: Useful but Not True", Practical Applications of the Philosophy of Science, SpringerBriefs in Philosophy, Springer, pp. 61–67, doi:10.1007/978-3-319-00452-5_10, ISBN 978-3-319-00451-8.
  19. Under Einstein's theory of relativity, the primary reason the apple falls down is that Earth warps time, so that clocks near the base of the tree run more slowly than clocks high up in the tree; there is also a secondary reason, which is that Earth warps space. The empirical evidence for Einstein's theory is extremely strong—e.g. GPS relies on Einstein's theory, and it would not work if it relied on Newton's theory (Ashby 2002).
  20. Hand, D. J. (2014), "Wonderful examples, but let's not close our eyes", Statistical Science, 29: 98–100, arXiv:1405.4986, doi:10.1214/13-STS446.
  21. Bickel, P. J.; Doksum, K. A. (2016), Mathematical Statistics, vol. II, Chapman & Hall, p. 2.
  22. von Neumann, J. (1947), "The mathematician", in Haywood, R. B. (ed.), Works of the Mind, University of Chicago Press, pp. 180–196; republished in 1995 in Bródy, F.; Vámos, T. (eds.), The Neumann Compendium, World Scientific, pp. 618–626.
  23. The relatedness of Valéry's quotation with the aphorism "all models are wrong" has been noted by various authors, e.g. Vankat (2013, §1.7).
  24. Some authors have given different English translations, e.g. Valéry (1970, p. 466), Wolfson & Murphy (1998), and Vankat (2013, §1.7). The translation presented here was given by Google Translate; it has only one word different from the translation of Wolfson & Murphy: "What" instead of "Whatever" (both occurrences).
  25. The relatedness of Shewhart's quotation with the aphorism "all models are wrong" is noted by Fricker & Woodall (2016).
  26. The quotation was originally given in Spanish (during an interview by Marius de Zayas); the cited publication is in English.

