Cromwell's rule

Cromwell's rule, named by statistician Dennis Lindley, [1] states that the use of prior probabilities of 1 ("the event will definitely occur") or 0 ("the event will definitely not occur") should be avoided, except when applied to statements that are logically true or false, such as 2 + 2 equaling 4.

The reference is to Oliver Cromwell, who wrote to the General Assembly of the Church of Scotland on 3 August 1650, shortly before the Battle of Dunbar, including a phrase that has become well known and frequently quoted: [2]

I beseech you, in the bowels of Christ, think it possible that you may be mistaken.

As Lindley puts it, assigning a probability should "leave a little probability for the moon being made of green cheese; it can be as small as 1 in a million, but have it there since otherwise an army of astronauts returning with samples of the said cheese will leave you unmoved". [3] Similarly, in assessing the likelihood that tossing a coin will result in either a head or a tail facing upwards, there is a possibility, albeit remote, that the coin will land on its edge and remain in that position.

If the prior probability assigned to a hypothesis is 0 or 1, then, by Bayes' theorem, the posterior probability (probability of the hypothesis, given the evidence) is forced to be 0 or 1 as well; no evidence, no matter how strong, could have any influence.
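The step from Bayes' theorem to this conclusion can be written out explicitly. The following is a sketch for a single hypothesis H and evidence E, assuming the evidence has non-zero probability under the prior so that the denominator is defined; the symbols H and E are introduced here for illustration.

```latex
\begin{align*}
\Pr(H \mid E) &= \frac{\Pr(E \mid H)\,\Pr(H)}
                      {\Pr(E \mid H)\,\Pr(H) + \Pr(E \mid \neg H)\,\Pr(\neg H)} \\[4pt]
% Prior of 0: the numerator vanishes, whatever the likelihoods are.
\Pr(H) = 0 \;&\Rightarrow\; \Pr(H \mid E)
   = \frac{0}{0 + \Pr(E \mid \neg H)} = 0 \\[4pt]
% Prior of 1: \Pr(\neg H) = 0, so the second term of the denominator vanishes.
\Pr(H) = 1 \;&\Rightarrow\; \Pr(H \mid E)
   = \frac{\Pr(E \mid H)}{\Pr(E \mid H) + 0} = 1
\end{align*}
```

In both cases the likelihoods cancel or drop out, so the strength of the evidence never enters the result.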

A strengthened version of Cromwell's rule, applying also to statements of arithmetic and logic, alters the first rule of probability, or the convexity rule, 0 ≤ Pr(A) ≤ 1, to 0 < Pr(A) < 1.

Bayesian divergence (pessimistic)

An example of Bayesian divergence of opinion is based on Appendix A of Sharon Bertsch McGrayne's 2011 book. [4] Tim and Susan disagree as to whether a stranger who has two fair coins and one unfair coin (one with heads on both sides) has tossed one of the two fair coins or the unfair one; the stranger has tossed one of his coins three times and it has come up heads each time.

Tim assumes that the stranger picked the coin randomly – i.e., assumes a prior probability distribution in which each coin had a 1/3 chance of being the one picked. Applying Bayesian inference, Tim then calculates an 80% probability that the result of three consecutive heads was achieved by using the unfair coin, because each of the fair coins had a 1/8 chance of giving three straight heads, while the unfair coin had an 8/8 chance; out of 24 equally likely possibilities for what could happen, 8 out of the 10 that agree with the observations came from the unfair coin. If more flips are conducted, each further head increases the probability that the coin is the unfair one. If no tail ever appears, this probability converges to 1. But if a tail ever occurs, the probability that the coin is unfair immediately goes to 0 and stays at 0 permanently.
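Tim's calculation can be reproduced with a short sequential update. This is a minimal sketch using exact fractions; the function name `posterior_unfair` and the encoding of tosses as a string of `H` and `T` are assumptions made for illustration, not taken from McGrayne's book.

```python
from fractions import Fraction

def posterior_unfair(prior_unfair, tosses):
    """Probability that the tossed coin is the two-headed one,
    updated toss by toss with Bayes' theorem.

    prior_unfair: prior probability of the two-headed coin.
    tosses: string of 'H' and 'T' outcomes (hypothetical encoding).
    """
    p = Fraction(prior_unfair)
    for toss in tosses:
        # The two-headed coin always shows heads and never tails.
        like_unfair = Fraction(1) if toss == "H" else Fraction(0)
        # A fair coin shows either face with probability 1/2.
        like_fair = Fraction(1, 2)
        evidence = like_unfair * p + like_fair * (1 - p)
        p = like_unfair * p / evidence
    return p

print(posterior_unfair(Fraction(1, 3), "HHH"))     # three heads
print(posterior_unfair(Fraction(1, 3), "HHHT"))    # a tail appears
print(posterior_unfair(Fraction(1, 3), "HHHTHH"))  # more heads after the tail
```

With a prior of 1/3 and three heads the function returns 4/5, matching the 80% figure. Once a tail has been seen the value is 0 and later heads cannot raise it. A prior of exactly 1 followed by a tail would make the evidence term zero and raise `ZeroDivisionError`, which is the code-level counterpart of holding a certainty that the data contradict.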

Susan assumes the stranger chose a fair coin (so the prior probability that the tossed coin is the unfair coin is 0). Consequently, Susan calculates that the probability that three (or any number of) consecutive heads were tossed with the unfair coin must be 0; if still more heads are thrown, Susan does not change her probability. Tim's and Susan's probabilities do not converge as more and more heads are thrown.
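The lack of convergence can be seen by comparing the two posteriors after a run of n heads. The sketch below uses the closed form of the update for an all-heads sequence; the function name `posterior_after_heads` is hypothetical.

```python
from fractions import Fraction

def posterior_after_heads(prior_unfair, n_heads):
    """Posterior probability of the two-headed coin after n_heads
    consecutive heads and no tails.

    The two-headed coin produces the run with probability 1,
    a fair coin with probability (1/2) ** n_heads.
    """
    prior = Fraction(prior_unfair)
    fair_likelihood = Fraction(1, 2) ** n_heads
    return prior / (prior + (1 - prior) * fair_likelihood)

for n in (3, 10, 20):
    tim = posterior_after_heads(Fraction(1, 3), n)  # Tim: prior of 1/3
    susan = posterior_after_heads(0, n)             # Susan: prior of 0
    print(n, float(tim), float(susan))
```

Tim's value climbs toward 1 as n grows, while Susan's stays at exactly 0 for every n, so the gap between them approaches 1 rather than shrinking.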

Bayesian convergence (optimistic)

An example of Bayesian convergence of opinion appears in Nate Silver's 2012 book The Signal and the Noise: Why so many predictions fail – but some don't. [5] After stating that "Absolutely nothing useful is realized when one person who holds that there is a 0 (zero) percent probability of something argues against another person who holds that the probability is 100 percent", Silver describes a simulation of a stock market that is 60% likely to rise, in which three investors start out with initial guesses of 10%, 50% and 90% that the stock market is in a bull market. By the end of the simulation (shown in a graph), "all of the investors conclude they are in a bull market with almost (although not exactly of course) 100 percent certainty".
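A simulation in this spirit can be sketched as follows. The details of Silver's own simulation are not given here, so the model is an assumption: under the "bull" hypothesis the market rises on a given day with probability 0.6, under the alternative "bear" hypothesis with probability 0.4, and the observed days follow a fixed pattern of three rises and two falls, repeated, as a deterministic stand-in for a market that rises 60% of the time. All function and parameter names are hypothetical.

```python
def update(p_bull, market_rose, p_rise_bull=0.6, p_rise_bear=0.4):
    """One Bayesian update of the belief that the market is a bull market.

    p_rise_bull and p_rise_bear are assumed per-day probabilities of a
    rise under each hypothesis (the bear value is an illustrative choice).
    """
    like_bull = p_rise_bull if market_rose else 1 - p_rise_bull
    like_bear = p_rise_bear if market_rose else 1 - p_rise_bear
    evidence = like_bull * p_bull + like_bear * (1 - p_bull)
    return like_bull * p_bull / evidence

def final_beliefs(priors, days):
    """Apply the same sequence of daily observations to every investor."""
    beliefs = list(priors)
    for rose in days:
        beliefs = [update(p, rose) for p in beliefs]
    return beliefs

# 100 days, 60 of them rises: three up days, then two down days, repeated.
days = [True, True, True, False, False] * 20

beliefs = final_beliefs([0.1, 0.5, 0.9], days)  # the three investors
extremes = final_beliefs([0.0, 1.0], days)      # priors that violate Cromwell's rule
print(beliefs)
print(extremes)
```

Under these assumptions the three investors all finish above 99% but below 100%, whereas the two extreme priors end exactly where they started, at 0 and 1.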

Related Research Articles

Bayesian probability is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief.

The gambler's fallacy, also known as the Monte Carlo fallacy or the fallacy of the maturity of chances, is the belief that, if an event has occurred less frequently than expected, it is more likely to happen again in the future. The fallacy is commonly associated with gambling, where it may be believed, for example, that the next dice roll is more than usually likely to be six because there have recently been fewer than the expected number of sixes.

Probability: Branch of mathematics concerning chance and uncertainty

Probability is the branch of mathematics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an event is to occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes are equally probable; the probability of "heads" equals the probability of "tails"; and since no other outcomes are possible, each of "heads" and "tails" has probability 1/2.

The word probability has been used in a variety of ways since it was first applied to the mathematical study of games of chance. Does probability measure the real, physical, tendency of something to occur, or is it a measure of how strongly one believes it will occur, or does it draw on both these elements? In answering such questions, mathematicians interpret the probability values of probability theory.

Probability axioms: Foundations of probability theory

The standard probability axioms are the foundations of probability theory introduced by Russian mathematician Andrey Kolmogorov in 1933. These axioms remain central and have direct contributions to mathematics, the physical sciences, and real-world probability cases.

Probability space: Mathematical concept

In probability theory, a probability space or a probability triple is a mathematical construct that provides a formal model of a random process or "experiment". For example, one can define a probability space which models the throwing of a die.

Bayes' theorem gives a mathematical rule for inverting conditional probabilities, allowing us to find the probability of a cause given its effect. For example, if the risk of developing health problems is known to increase with age, Bayes' theorem allows the risk to an individual of a known age to be assessed more accurately by conditioning it on their age, rather than assuming that the individual is typical of the population as a whole. By Bayes' theorem, both the prevalence of a disease in a given population and the error rate of a test for that disease have to be taken into account to evaluate the meaning of a positive test result correctly and avoid the base-rate fallacy.

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Fundamentally, Bayesian inference uses prior knowledge, in the form of a prior distribution in order to estimate posterior probabilities. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".

Thomas Bayes: British statistician (c. 1701 – 1761)

Thomas Bayes was an English statistician, philosopher and Presbyterian minister who is known for formulating a specific case of the theorem that bears his name: Bayes' theorem. Bayes never published what would become his most famous accomplishment; his notes were edited and published posthumously by Richard Price.

The principle of indifference is a rule for assigning epistemic probabilities. The principle of indifference states that in the absence of any relevant evidence, agents should distribute their credence equally among all the possible outcomes under consideration.

In probability theory, an event is said to happen almost surely if it happens with probability 1. In other words, the set of outcomes on which the event does not occur has probability 0, even though the set might not be empty. The concept is analogous to the concept of "almost everywhere" in measure theory. In probability experiments on a finite sample space with a non-zero probability for each outcome, there is no difference between almost surely and surely; however, this distinction becomes important when the sample space is an infinite set, because an infinite set can have non-empty subsets of probability 0.

Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation, which views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.

In null-hypothesis significance testing, the p-value is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct. A very small p-value means that such an extreme observed outcome would be very unlikely under the null hypothesis. Even though reporting p-values of statistical tests is common practice in academic publications of many quantitative fields, misinterpretation and misuse of p-values is widespread and has been a major topic in mathematics and metascience. In 2016, the American Statistical Association (ASA) made a formal statement that "p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone" and that "a p-value, or statistical significance, does not measure the size of an effect or the importance of a result" or "evidence regarding a model or hypothesis". That said, a 2019 task force by ASA has issued a statement on statistical significance and replicability, concluding with: "p-values and significance tests, when properly applied and interpreted, increase the rigor of the conclusions drawn from data".

In statistics, the question of checking whether a coin is fair is one whose importance lies, firstly, in providing a simple problem on which to illustrate basic ideas of statistical inference and, secondly, in providing a simple problem that can be used to compare various competing methods of statistical inference, including decision theory. The practical problem of checking whether a coin is fair might be considered as easily solved by performing a sufficiently large number of trials, but statistics and probability theory can provide guidance on two types of question; specifically those of how many trials to undertake and of the accuracy of an estimate of the probability of turning up heads, derived from a given sample of trials.

The sign test is a statistical test for consistent differences between pairs of observations, such as the weight of subjects before and after treatment. Given pairs of observations for each subject, the sign test determines if one member of the pair tends to be greater than the other member of the pair.

Credibility theory is a branch of actuarial mathematics concerned with determining risk premiums. To achieve this, it uses mathematical models in an effort to forecast the (expected) number of insurance claims based on past observations. Technically speaking, the problem is to find the best linear approximation to the mean of the Bayesian predictive density, which is why credibility theory has many results in common with linear filtering as well as Bayesian statistics more broadly.

The transferable belief model (TBM) is an elaboration on the Dempster–Shafer theory (DST), which is a mathematical model used to evaluate the probability that a given proposition is true from other propositions that are assigned probabilities. It was developed by Philippe Smets who proposed his approach as a response to Zadeh’s example against Dempster's rule of combination. In contrast to the original DST the TBM propagates the open-world assumption that relaxes the assumption that all possible outcomes are known. Under the open world assumption Dempster's rule of combination is adapted such that there is no normalization. The underlying idea is that the probability mass pertaining to the empty set is taken to indicate an unexpected outcome, e.g. the belief in a hypothesis outside the frame of discernment. This adaptation violates the probabilistic character of the original DST and also Bayesian inference. Therefore, the authors substituted notation such as probability masses and probability update with terms such as degrees of belief and transfer giving rise to the name of the method: The transferable belief model.

The ludic fallacy, proposed by Nassim Nicholas Taleb in his book The Black Swan (2007), is "the misuse of games to model real-life situations". Taleb explains the fallacy as "basing studies of chance on the narrow world of games and dice". The adjective ludic originates from the Latin noun ludus, meaning "play, game, sport, pastime".

Probability has a dual aspect: on the one hand the likelihood of hypotheses given the evidence for them, and on the other hand the behavior of stochastic processes such as the throwing of dice or coins. The study of the former is historically older in, for example, the law of evidence, while the mathematical treatment of dice began with the work of Cardano, Pascal, Fermat and Christiaan Huygens between the 16th and 17th century.

Bayesian epistemology is a formal approach to various topics in epistemology that has its roots in Thomas Bayes' work in the field of probability theory. One advantage of its formal method in contrast to traditional epistemology is that its concepts and theorems can be defined with a high degree of precision. It is based on the idea that beliefs can be interpreted as subjective probabilities. As such, they are subject to the laws of probability theory, which act as the norms of rationality. These norms can be divided into static constraints, governing the rationality of beliefs at any moment, and dynamic constraints, governing how rational agents should change their beliefs upon receiving new evidence. The most characteristic Bayesian expression of these principles is found in the form of Dutch books, which illustrate irrationality in agents through a series of bets that lead to a loss for the agent no matter which of the probabilistic events occurs. Bayesians have applied these fundamental principles to various epistemological topics but Bayesianism does not cover all topics of traditional epistemology. The problem of confirmation in the philosophy of science, for example, can be approached through the Bayesian principle of conditionalization by holding that a piece of evidence confirms a theory if it raises the likelihood that this theory is true. Various proposals have been made to define the concept of coherence in terms of probability, usually in the sense that two propositions cohere if the probability of their conjunction is higher than if they were neutrally related to each other. The Bayesian approach has also been fruitful in the field of social epistemology, for example, concerning the problem of testimony or the problem of group belief. Bayesianism still faces various theoretical objections that have not been fully solved.

References

  1. Jackman, Simon (2009). Bayesian Analysis for the Social Sciences. Wiley. ISBN 978-0-470-01154-6 (ebook ISBN 978-0-470-68663-8).
  2. Cromwell, Oliver (1650). Letter 129.
  3. Lindley, Dennis (1991). Making Decisions (2 ed.). Wiley. p. 104. ISBN 0-471-90808-8.
  4. McGrayne, Sharon Bertsch (2011). The Theory That Would Not Die: How Bayes' Rule Cracked The Enigma Code, Hunted Down Russian Submarines, & Emerged Triumphant from Two Centuries of Controversy. New Haven: Yale University Press. pp. 263–265. ISBN 9780300169690. OCLC 670481486.
  5. Silver, Nate (2012). The Signal and the Noise: Why so many predictions fail – but some don't. New York: Penguin. pp. 258–261. ISBN 978-1-59-420411-1.