Regression fallacy

The regression (or regressive) fallacy is an informal fallacy. It assumes that something has returned to normal because of corrective actions taken while it was abnormal. This fails to account for natural fluctuations. It is often a special case of the post hoc fallacy.

Explanation

Quantities such as golf scores, the Earth's temperature, and chronic back pain fluctuate naturally and usually regress toward the mean. The logical flaw is to predict that exceptional results will continue as if they were average (see Representativeness heuristic). People are most likely to take action when a quantity is at its most extreme. When results then become more normal, they believe that their action caused the change, when in fact the action played no causal role.

This use of the word "regression" was coined by Sir Francis Galton in a study from 1885 called "Regression Toward Mediocrity in Hereditary Stature". He showed that the heights of children of very short or very tall parents tend to move toward the average. In fact, in any situation where two variables are less than perfectly correlated, an exceptional score on one variable may not be matched by an equally exceptional score on the other variable. The imperfect correlation between parents and children (height is not entirely heritable) means that the distribution of heights of their children will be centered somewhere between the average of the parents and the average of the population as a whole. Thus, any single child can be more extreme than the parents, but the odds are against it.
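The pull toward the average can be written down explicitly under a simplifying assumption. The sketch below assumes (this is not stated in the text above) that parent height X and child height Y are jointly normal with a common mean \mu, a common standard deviation, and correlation \rho. The symbols and the numbers in the example are illustrative only and are not Galton's data.

```latex
% Assumption: (X, Y) bivariate normal, equal means \mu, equal variances,
% correlation \rho with 0 < \rho < 1.
\[
  \operatorname{E}[\,Y \mid X = x\,] \;=\; \mu + \rho\,(x - \mu)
\]
% Hypothetical example: \rho = 0.5 and parents 10 cm above the mean
% give an expected child height only 5 cm above the mean:
\[
  \operatorname{E}[\,Y \mid X = \mu + 10\,] \;=\; \mu + 0.5 \times 10 \;=\; \mu + 5
\]
```

Because \rho is less than 1, the expected deviation of the child from the mean is smaller than the deviation of the parents, which is the centering "somewhere between" described above. Individual children still scatter around this expected value, so some exceed their parents.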

Examples

When his pain got worse, he went to a doctor, after which the pain subsided a little. Therefore, he benefited from the doctor's treatment.

The pain subsiding a little after it has gotten worse is more easily explained by regression toward the mean. Assuming the pain relief was caused by the doctor is fallacious.

The student did exceptionally poorly last semester, so I punished him. He did much better this semester. Clearly, punishment is effective in improving students' grades.

Often exceptional performances are followed by more normal performances, so the change in performance might better be explained by regression toward the mean. Incidentally, some experiments have shown that people may develop a systematic bias for punishment and against reward because of reasoning analogous to this example of the regression fallacy. [1]
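The apparent effectiveness of punishment, and the apparent harm of reward, can be reproduced in a small simulation in which feedback has no effect at all. This is a minimal sketch under stated assumptions: each score is a stable skill level plus independent noise, and the function name, thresholds, and distribution parameters are hypothetical choices made for illustration, not taken from the cited experiments.

```python
import random


def simulate_feedback(n_students=10_000, seed=0):
    """Return the mean score change after 'punishment' and after 'reward'.

    Assumption: feedback has NO causal effect. Both scores are drawn as
    skill + independent noise, so any change is pure regression.
    """
    rng = random.Random(seed)
    change_after_punish = []
    change_after_reward = []
    for _ in range(n_students):
        skill = rng.gauss(70, 5)            # stable ability (hypothetical scale)
        first = skill + rng.gauss(0, 10)    # first-semester score
        second = skill + rng.gauss(0, 10)   # second-semester score, same process
        if first < 60:                      # exceptionally poor: "punished"
            change_after_punish.append(second - first)
        elif first > 80:                    # exceptionally good: "rewarded"
            change_after_reward.append(second - first)
    mean_punish = sum(change_after_punish) / len(change_after_punish)
    mean_reward = sum(change_after_reward) / len(change_after_reward)
    return mean_punish, mean_reward


punish_change, reward_change = simulate_feedback()
print(f"mean change after punishment: {punish_change:+.1f}")
print(f"mean change after reward:     {reward_change:+.1f}")
```

In this sketch the punished group improves on average and the rewarded group declines on average, even though the simulated feedback does nothing. An observer who ignores regression would wrongly conclude that punishment helps and reward hurts.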

The frequency of accidents on a road fell after a speed camera was installed. Therefore, the speed camera has improved road safety.

Speed cameras are often installed after a road has seen an exceptionally high number of accidents, and this number usually falls immediately afterward through regression toward the mean. Many speed camera proponents attribute this fall in accidents to the speed camera, without considering the overall trend.
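The selection effect behind this example can be sketched in code. The simulation below assumes, purely for illustration, that every site has the same underlying accident risk in both years and that no camera has any effect; the function name and all parameter values are hypothetical.

```python
import random


def simulate_camera_sites(n_sites=1000, days=365, daily_risk=0.02,
                          top_k=50, seed=1):
    """Compare accident counts before and after at the 'worst' sites.

    Assumption: all sites share one daily accident risk in both years,
    so differences between sites and years are random fluctuation only.
    """
    rng = random.Random(seed)

    def yearly_count():
        # Number of days in the year on which an accident occurs.
        return sum(rng.random() < daily_risk for _ in range(days))

    before = [yearly_count() for _ in range(n_sites)]
    after = [yearly_count() for _ in range(n_sites)]

    # "Install cameras" at the sites with the most accidents in year one.
    worst = sorted(range(n_sites), key=lambda i: before[i], reverse=True)[:top_k]
    mean_before = sum(before[i] for i in worst) / top_k
    mean_after = sum(after[i] for i in worst) / top_k
    return mean_before, mean_after


mean_before, mean_after = simulate_camera_sites()
print(f"selected sites, year before: {mean_before:.1f} accidents on average")
print(f"selected sites, year after:  {mean_after:.1f} accidents on average")
```

The selected sites show a clear drop in the second year although nothing about them changed. Separating a real camera effect from this drop requires a comparison, for example with similar high-accident sites that did not receive a camera or with the long-run trend at the same site.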

Some authors use the Sports Illustrated cover jinx as an example of a regression effect: extremely good performances are likely to be followed by less extreme ones, and athletes are chosen to appear on the cover of Sports Illustrated only after extreme performances. Attributing this to a "jinx" rather than regression, as some athletes reportedly believe, is an example of committing the regression fallacy. [2]

Misapplication

On the other hand, dismissing valid explanations can lead to a worse situation. For example:

After the Western Allies invaded Normandy, creating a second major front, German control of Europe waned. Clearly, the combination of the Western Allies and the USSR drove the Germans back.

Fallacious evaluation: "Given that the counterattacks against Germany occurred only after Germany had brought the greatest amount of territory under its control, regression toward the mean can explain the retreat of German forces from occupied territories as a purely random fluctuation that would have happened without any intervention on the part of the USSR or the Western Allies." However, this was not the case. Political power and the occupation of territories are not primarily determined by random events, so the concept of regression toward the mean is inapplicable (on the large scale).

In essence, misapplication of regression toward the mean can reduce all events to a just-so story, without cause or effect. (Such misapplication takes as a premise that all events are random, as they must be for the concept of regression toward the mean to be validly applied.)

Notes

  1. Schaffner, 1985; Gilovich, 1991 pp. 27–28
  2. Gilovich, 1991 pp. 26–27; Plous, 1993 p. 118


References