An Essay Towards Solving a Problem in the Doctrine of Chances


"An Essay Towards Solving a Problem in the Doctrine of Chances" is a work on the mathematical theory of probability by Thomas Bayes, published in 1763, [1] two years after its author's death, and containing multiple amendments and additions due to his friend Richard Price. The title comes from the contemporary use of the phrase "doctrine of chances" to mean the theory of probability, which had been introduced via the title of a book by Abraham de Moivre. Contemporary reprints of the essay carry a more specific and significant title: A Method of Calculating the Exact Probability of All Conclusions Founded on Induction. [2]

Contents

The essay includes theorems of conditional probability which form the basis of what is now called Bayes's Theorem, together with a detailed treatment of the problem of setting a prior probability.

Bayes supposed a sequence of independent experiments, each having as its outcome either success or failure, the probability of success being some number p between 0 and 1. But then he supposed p to be an uncertain quantity, whose probability of being in any interval between 0 and 1 is the length of the interval. In modern terms, p would be considered a random variable uniformly distributed between 0 and 1. Conditionally on the value of p, the trials resulting in success or failure are independent, but unconditionally (or "marginally") they are not. That is because if a large number of successes are observed, then p is more likely to be large, so that success on the next trial is more probable. The question Bayes addressed was: what is the conditional probability distribution of p, given the numbers of successes and failures so far observed? The answer is that its probability density function is

$$f(p) = \frac{(n+1)!}{k!\,(n-k)!}\, p^k (1-p)^{n-k} \quad \text{for } 0 \le p \le 1$$

(and ƒ(p) = 0 for p < 0 or p > 1), where k is the number of successes so far observed and n is the number of trials so far observed. This is what today is called the Beta distribution with parameters k + 1 and n − k + 1.
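The Beta form of this density can be checked with a short calculation. The following is a sketch in modern notation, not Bayes's own argument: the binomial likelihood, the symbol f(p | k, n) and the dummy variable q are conventions assumed here for illustration.

```latex
\begin{align*}
f(p \mid k, n)
  &= \frac{\binom{n}{k}\, p^{k} (1-p)^{n-k} \cdot 1}
          {\int_0^1 \binom{n}{k}\, q^{k} (1-q)^{n-k} \, dq}
  && \text{(likelihood times uniform prior, normalized)} \\
\int_0^1 q^{k} (1-q)^{n-k} \, dq
  &= \frac{k!\,(n-k)!}{(n+1)!}
  && \text{(Beta integral } B(k+1,\, n-k+1)\text{)} \\
f(p \mid k, n)
  &= \frac{(n+1)!}{k!\,(n-k)!}\; p^{k} (1-p)^{n-k},
  && 0 \le p \le 1 .
\end{align*}
```

The binomial coefficient cancels between numerator and denominator, so only the Beta integral determines the normalizing constant.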

Outline

Bayes's preliminary results in conditional probability (especially Propositions 3, 4 and 5) imply the truth of the theorem that is named for him. He states: "If there be two subsequent events, the probability of the second b/N and the probability of both together P/N, and it being first discovered that the second event has also happened, from hence I guess that the first event has also happened, the probability I am right is P/b." Symbolically, writing A for the second event and B for the first, this implies (see Stigler 1982):

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}, \quad \text{if } P(A) \neq 0,$$

which leads to Bayes's Theorem for conditional probabilities:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}, \quad \text{if } P(B) \neq 0.$$

However, it does not appear that Bayes emphasized or focused on this finding. Rather, he focused on finding the solution to a much broader inferential problem:

"Given the number of times in which an unknown event has happened and failed [... Find] the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named." [1]

The essay includes an example of a man trying to guess the ratio of "blanks" and "prizes" at a lottery. So far the man has watched the lottery draw ten blanks and one prize. Given these data, Bayes showed in detail how to compute the probability that the ratio of blanks to prizes is between 9:1 and 11:1 (the probability is low, about 7.7%). He went on to describe that computation after the man has watched the lottery draw twenty blanks and two prizes, forty blanks and four prizes, and so on. Finally, after 10,000 blanks and 1,000 prizes have been drawn, the probability reaches about 97%. [1]
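The first figure in the lottery example can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions. It takes a uniform prior on p, the chance of drawing a prize, so that one prize in eleven draws gives a Beta(2, 11) posterior. It reads a blanks-to-prizes ratio between 9:1 and 11:1 as p lying between 1/12 and 1/10. The helper names `beta_cdf` and `prob_between` are illustrative, and the binomial-tail identity used for the Beta cumulative distribution is a modern shortcut, not a reconstruction of the computation in the essay.

```python
from math import comb


def beta_cdf(x, k, n):
    """CDF at x of Beta(k + 1, n - k + 1) for integer k successes in n trials.

    Assumption: uses the identity that this CDF equals the probability that
    a Binomial(n + 1, x) variable is at least k + 1.
    """
    m = n + 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(k + 1, m + 1))


def prob_between(lo, hi, k, n):
    """Posterior probability that lo < p < hi under a uniform prior."""
    return beta_cdf(hi, k, n) - beta_cdf(lo, k, n)


# Ten blanks and one prize: k = 1 prize in n = 11 draws.
prizes, blanks = 1, 10
p_small = prob_between(1 / 12, 1 / 10, prizes, prizes + blanks)
print(f"{p_small:.3f}")  # about 0.077
```

The same function accepts larger counts, such as twenty blanks and two prizes, by changing `prizes` and `blanks`.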

Bayes's main result (Proposition 9) is the following in modern terms:

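Assume a uniform prior distribution of the binomial parameter p. After observing k successes and n − k failures in n trials,

$$P(a < p < b \mid k, n) = \frac{\int_a^b \binom{n}{k}\, p^k (1-p)^{n-k}\, dp}{\int_0^1 \binom{n}{k}\, p^k (1-p)^{n-k}\, dp}.$$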

It is unclear whether Bayes was a "Bayesian" in the modern sense, that is, whether he was interested in Bayesian inference or merely in probability. Proposition 9 seems "Bayesian" in its presentation as a probability about the parameter p. However, Bayes stated his question in a manner that suggests a frequentist viewpoint: he supposed that a ball is thrown at random onto a square table (this table is often misrepresented as a billiard table, and the ball as a billiard ball, but Bayes never describes them as such), and considered further balls that fall to the left or right of the first ball with probabilities p and 1 − p. The algebra is of course identical no matter which view is taken.
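The table experiment lends itself to a direct simulation, which illustrates why the two readings give the same numbers. The sketch below is an illustration under assumptions, not a procedure from the essay. The first ball's position is drawn uniformly and stands for p. Each further ball lands to its left with probability p. Only the simulated tables that match the observed count are kept. The function name `simulate_table`, the number of tables and the fixed seed are hypothetical choices, and the counts (11 further balls, 1 to the left) are picked to mirror the ten-blanks-and-one-prize lottery example.

```python
import random


def simulate_table(n_balls, k_observed, lo, hi, n_tables=200_000, seed=0):
    """Estimate P(lo < p < hi | k_observed of n_balls fell to the left).

    Returns the estimate and the number of simulated tables that matched
    the observed count.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    kept = inside = 0
    for _ in range(n_tables):
        p = rng.random()  # position of the first ball across the table
        k = sum(rng.random() < p for _ in range(n_balls))  # balls to its left
        if k == k_observed:
            kept += 1
            if lo < p < hi:
                inside += 1
    return inside / kept, kept


estimate, kept = simulate_table(n_balls=11, k_observed=1, lo=1 / 12, hi=1 / 10)
print(f"kept {kept} tables, estimated probability {estimate:.3f}")
```

Apart from sampling noise, the estimate should land close to the 7.7% figure from the lottery example, even though the simulation only counts relative frequencies among repeated throws.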

Richard Price and the existence of God

Richard Price discovered Bayes's essay and its now-famous theorem in Bayes's papers after Bayes's death. He believed that Bayes's Theorem helped prove the existence of God ("the Deity") and wrote the following in his introduction to the essay:

"The purpose I mean is, to show what reason we have for believing that there are in the constitution of things fixt laws according to which things happen, and that, therefore, the frame of the world must be the effect of the wisdom and power of an intelligent cause; and thus to confirm the argument taken from final causes for the existence of the Deity. It will be easy to see that the converse problem solved in this essay is more directly applicable to this purpose; for it shews us, with distinctness and precision, in every case of any particular order or recurrency of events, what reason there is to think that such recurrency or order is derived from stable causes or regulations in nature, and not from any irregularities of chance." (Philosophical Transactions of the Royal Society of London, 1763) [1]

In modern terms this is an instance of the teleological argument.

References

  1. Bayes, Mr; Price, Mr (1763). "An Essay towards Solving a Problem in the Doctrine of Chances. By the Late Rev. Mr. Bayes, F. R. S. Communicated by Mr. Price, in a Letter to John Canton, A. M. F. R. S" (PDF). Philosophical Transactions of the Royal Society of London. 53: 370–418. doi:10.1098/rstl.1763.0053. Archived from the original (PDF) on 2011-04-10. Retrieved 2011-09-25.
  2. Stigler, Stephen M (2013). "The True Title of Bayes's Essay". Statistical Science. 28 (3): 283–288. arXiv:1310.0173. doi:10.1214/13-STS438.