WikiMili The Free Encyclopedia

Probability |
---|


**Probability** is a measure quantifying the likelihood that events will occur.^{ [1] } See glossary of probability and statistics. Probability is quantified as a number between 0 and 1, where, roughly speaking,^{ [note 1] } 0 indicates impossibility and 1 indicates certainty.^{ [2] }^{ [3] } The higher the probability of an event, the more likely it is that the event will occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes ("heads" and "tails") are equally probable; the probability of "heads" equals the probability of "tails"; and since no other outcomes are possible, the probability of each of "heads" and "tails" is 1/2 (which could also be written as 0.5 or 50%).

In mathematical analysis, a **measure** on a set is a systematic way to assign a number to each suitable subset of that set, intuitively interpreted as its size. In this sense, a measure is a generalization of the concepts of length, area, and volume. A particularly important example is the Lebesgue measure on a Euclidean space, which assigns the conventional length, area, and volume of Euclidean geometry to suitable subsets of the *n*-dimensional Euclidean space **R**^{n}. For instance, the Lebesgue measure of the interval [0, 1] in the real numbers is its length in the everyday sense of the word, specifically, 1.

In probability theory, an **event** is a set of outcomes of an experiment to which a probability is assigned. A single outcome may be an element of many different events, and different events in an experiment are usually not equally likely, since they may include very different groups of outcomes. An event defines a complementary event, namely the complementary set, and together these define a Bernoulli trial: did the event occur or not?


- Interpretations
- Etymology
- History
- Theory
- Applications
- Mathematical treatment
- Independent events
- Mutually exclusive events
- Not mutually exclusive events
- Conditional probability
- Inverse probability
- Summary of probabilities
- Relation to randomness and probability in quantum mechanics
- See also
- Notes
- References
- Bibliography
- External links

These concepts have been given an axiomatic mathematical formalization in probability theory, which is used widely in such areas of study as mathematics, statistics, finance, gambling, science (in particular physics), artificial intelligence/machine learning, computer science, game theory, and philosophy to, for example, draw inferences about the expected frequency of events. Probability theory is also used to describe the underlying mechanics and regularities of complex systems.^{ [4] }

The **Kolmogorov axioms** are a fundamental part of Andrey Kolmogorov's probability theory. In it, the probability *P* of some event *E*, denoted $P(E)$, is usually defined so as to satisfy these axioms. The axioms are described below.

**Probability theory** is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set of axioms. Typically these axioms formalise probability in terms of a probability space, which assigns a measure taking values between 0 and 1, termed the probability measure, to a set of outcomes called the sample space. Any specified subset of these outcomes is called an event.

**Mathematics** includes the study of such topics as quantity, structure (algebra), space (geometry), and change. It has no generally accepted definition.

When dealing with experiments that are random and well-defined in a purely theoretical setting (like tossing a fair coin), probabilities can be numerically described by the number of desired outcomes divided by the total number of all outcomes. For example, tossing a fair coin twice will yield "head-head", "head-tail", "tail-head", and "tail-tail" outcomes. The probability of getting an outcome of "head-head" is 1 out of 4 outcomes, or, in numerical terms, 1/4, 0.25 or 25%. However, when it comes to practical application, there are two major competing categories of probability interpretations, whose adherents possess different views about the fundamental nature of probability:

An **experiment** is a procedure carried out to support, refute, or validate a hypothesis. Experiments provide insight into cause-and-effect by demonstrating what outcome occurs when a particular factor is manipulated. Experiments vary greatly in goal and scale, but always rely on repeatable procedure and logical analysis of the results. There also exist natural experimental studies.

In mathematics, an expression is called **well-defined** or *unambiguous* if its definition assigns it a unique interpretation or value. Otherwise, the expression is said to be *not well-defined*, *ill-defined* or *ambiguous*. A function is well-defined if it gives the same result when the representation of the input is changed without changing the value of the input. For instance, if *f* takes real numbers as input, and if *f*(0.5) does not equal *f*(1/2) then *f* is not well-defined. The term *well-defined* can also be used to indicate that a logical expression is unambiguous or uncontradictory.

- Objectivists assign numbers to describe some objective or physical state of affairs. The most popular version of objective probability is frequentist probability, which claims that the probability of a random event denotes the *relative frequency of occurrence* of an experiment's outcome, when repeating the experiment. This interpretation considers probability to be the relative frequency "in the long run" of outcomes.^{ [5] } A modification of this is propensity probability, which interprets probability as the tendency of some experiment to yield a certain outcome, even if it is performed only once.
- Subjectivists assign numbers per subjective probability, i.e., as a degree of belief.^{ [6] } The degree of belief has been interpreted as, "the price at which you would buy or sell a bet that pays 1 unit of utility if E, 0 if not E."^{ [7] } The most popular version of subjective probability is Bayesian probability, which includes expert knowledge as well as experimental data to produce probabilities. The expert knowledge is represented by some (subjective) prior probability distribution. These data are incorporated in a likelihood function. The product of the prior and the likelihood, normalized, results in a posterior probability distribution that incorporates all the information known to date.^{ [8] } By Aumann's agreement theorem, Bayesian agents whose prior beliefs are similar will end up with similar posterior beliefs. However, sufficiently different priors can lead to different conclusions regardless of how much information the agents share.^{ [9] }
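
The classical count for the two-toss example and the frequentist "long run" reading can be compared in a short Python sketch. It is only an illustration: the function name `relative_frequency`, the fixed seed, and the trial counts are assumptions made here, not part of the article.

```python
import random
from fractions import Fraction
from itertools import product

# Classical count: the four equally likely outcomes of tossing a fair coin twice.
outcomes = list(product("HT", repeat=2))
p_exact = Fraction(sum(1 for o in outcomes if o == ("H", "H")), len(outcomes))


def relative_frequency(trials, seed=0):
    """Fraction of simulated double tosses that come up head-head."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = 0
    for _ in range(trials):
        first_is_heads = rng.random() < 0.5
        second_is_heads = rng.random() < 0.5
        if first_is_heads and second_is_heads:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("exact probability:", p_exact)
    for n in (100, 10_000, 100_000):
        print(n, "trials ->", relative_frequency(n))
```

With more trials the simulated relative frequency should settle near the exact value of 1/4, which is the sense in which the frequentist interpretation speaks of the long run.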

The word *probability* derives from the Latin *probabilitas*, which can also mean "probity", a measure of the authority of a witness in a legal case in Europe, and often correlated with the witness's nobility. In a sense, this differs much from the modern meaning of *probability*, which, in contrast, is a measure of the weight of empirical evidence, and is arrived at from inductive reasoning and statistical inference.^{ [10] }

**Etymology** is the study of the history of words. By extension, the phrase "the etymology of [some word]" means the origin of the particular word. For place names, there is a specific term, toponymy.

**Authority** is the right to exercise power, which can be formalized by a state and exercised by way of judges, appointed executives of government, or the ecclesiastical or priestly appointed representatives of a God or other deities. Authority, in the sense of "authorization", can also mean the right to complete an action or execute an order.

A **witness** is someone who has knowledge about a matter. In law a witness is someone who, either voluntarily or under compulsion, provides testimonial evidence, either oral or written, of what he or she knows or claims to know.

The scientific study of probability is a modern development of mathematics. Gambling shows that there has been an interest in quantifying the ideas of probability for millennia, but exact mathematical descriptions arose much later. There are reasons for the slow development of the mathematics of probability. Whereas games of chance provided the impetus for the mathematical study of probability, fundamental issues^{[ clarification needed ]} are still obscured by the superstitions of gamblers.^{ [11] }

**Gambling** is the wagering of money or something of value on an event with an uncertain outcome, with the primary intent of winning money or material goods. Gambling thus requires three elements to be present: consideration, risk (chance), and a prize. The outcome of the wager is often immediate, such as a single roll of dice, a spin of a roulette wheel, or a horse crossing the finish line, but longer time frames are also common, allowing wagers on the outcome of a future sports contest or even an entire sports season.

According to Richard Jeffrey, "Before the middle of the seventeenth century, the term 'probable' (Latin *probabilis*) meant *approvable*, and was applied in that sense, unequivocally, to opinion and to action. A probable action or opinion was one such as sensible people would undertake or hold, in the circumstances."^{ [12] } However, in legal contexts especially, 'probable' could also apply to propositions for which there was good evidence.^{ [13] }

**Richard Carl Jeffrey** was an American philosopher, logician, and probability theorist. He is best known for developing and championing the philosophy of radical probabilism and the associated heuristic of probability kinematics, also known as Jeffrey conditioning.

The earliest known forms of probability and statistics were developed by Middle Eastern mathematicians studying cryptography between the 8th and 13th centuries. Al-Khalil (717–786) wrote the *Book of Cryptographic Messages* which contains the first use of permutations and combinations to list all possible Arabic words with and without vowels. Al-Kindi (801–873) made the earliest known use of statistical inference in his work on cryptanalysis and frequency analysis. An important contribution of Ibn Adlan (1187–1268) was on sample size for use of frequency analysis.^{ [14] }

The sixteenth-century Italian polymath Gerolamo Cardano demonstrated the efficacy of defining odds as the ratio of favourable to unfavourable outcomes (which implies that the probability of an event is given by the ratio of favourable outcomes to the total number of possible outcomes^{ [15] }). Aside from the elementary work by Cardano, the doctrine of probabilities dates to the correspondence of Pierre de Fermat and Blaise Pascal (1654). Christiaan Huygens (1657) gave the earliest known scientific treatment of the subject.^{ [16] } Jakob Bernoulli's *Ars Conjectandi* (posthumous, 1713) and Abraham de Moivre's *Doctrine of Chances* (1718) treated the subject as a branch of mathematics.^{ [17] } See Ian Hacking's *The Emergence of Probability*^{ [10] } and James Franklin's *The Science of Conjecture*^{ [18] } for histories of the early development of the very concept of mathematical probability.

The theory of errors may be traced back to Roger Cotes's *Opera Miscellanea* (posthumous, 1722), but a memoir prepared by Thomas Simpson in 1755 (printed 1756) first applied the theory to the discussion of errors of observation.^{ [19] } The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally probable, and that certain assignable limits define the range of all errors. Simpson also discusses continuous errors and describes a probability curve.

The first two laws of error that were proposed both originated with Pierre-Simon Laplace. The first law was published in 1774 and stated that the frequency of an error could be expressed as an exponential function of the numerical magnitude of the error, disregarding sign. The second law of error was proposed in 1778 by Laplace and stated that the frequency of the error is an exponential function of the square of the error.^{ [20] } The second law of error is called the normal distribution or the Gauss law. "It is difficult historically to attribute that law to Gauss, who in spite of his well-known precocity had probably not made this discovery before he was two years old."^{ [20] }

Daniel Bernoulli (1778) introduced the principle of the maximum product of the probabilities of a system of concurrent errors.

Adrien-Marie Legendre (1805) developed the method of least squares, and introduced it in his *Nouvelles méthodes pour la détermination des orbites des comètes* (*New Methods for Determining the Orbits of Comets*).^{ [21] } In ignorance of Legendre's contribution, an Irish-American writer, Robert Adrain, editor of "The Analyst" (1808), first deduced the law of facility of error,

$$\phi(x) = c e^{-h^2 x^2},$$

where $h$ is a constant depending on precision of observation, and $c$ is a scale factor ensuring that the area under the curve equals 1. He gave two proofs, the second being essentially the same as John Herschel's (1850).^{[ citation needed ]} Gauss gave the first proof that seems to have been known in Europe (the third after Adrain's) in 1809. Further proofs were given by Laplace (1810, 1812), Gauss (1823), James Ivory (1825, 1826), Hagen (1837), Friedrich Bessel (1838), W.F. Donkin (1844, 1856), and Morgan Crofton (1870). Other contributors were Ellis (1844), De Morgan (1864), Glaisher (1872), and Giovanni Schiaparelli (1875). Peters's (1856) formula^{[ clarification needed ]} for *r*, the probable error of a single observation, is well known.
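
The requirement that the area under the error curve equals 1 can be checked numerically. The sketch below assumes the scale factor is $c = h/\sqrt{\pi}$, a value that follows from the standard Gaussian integral and is not stated in the text above; the function names and the trapezoidal integration settings are likewise illustrative choices.

```python
import math


def facility_of_error(x, h):
    """phi(x) = c * exp(-h^2 x^2), assuming c = h / sqrt(pi)."""
    c = h / math.sqrt(math.pi)
    return c * math.exp(-((h * x) ** 2))


def area_under_curve(h, half_width=10.0, steps=20_000):
    """Trapezoidal estimate of the integral of phi over [-half_width/h, half_width/h]."""
    a, b = -half_width / h, half_width / h
    dx = (b - a) / steps
    total = 0.5 * (facility_of_error(a, h) + facility_of_error(b, h))
    total += sum(facility_of_error(a + i * dx, h) for i in range(1, steps))
    return total * dx


if __name__ == "__main__":
    for h in (0.5, 1.0, 3.0):
        print("h =", h, "area =", area_under_curve(h))
```

A larger precision constant gives a narrower, taller curve, but under the assumed choice of the scale factor the estimated area should stay close to 1 in each case.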

In the nineteenth century authors on the general theory included Laplace, Sylvestre Lacroix (1816), Littrow (1833), Adolphe Quetelet (1853), Richard Dedekind (1860), Helmert (1872), Hermann Laurent (1873), Liagre, Didion, and Karl Pearson. Augustus De Morgan and George Boole improved the exposition of the theory.

Andrey Markov introduced^{ [22] } the notion of Markov chains (1906), which played an important role in the theory of stochastic processes and its applications. The modern theory of probability based on measure theory was developed by Andrey Kolmogorov (1931).^{ [23] }

On the geometric side (see integral geometry) contributors to *The Educational Times* were influential (Miller, Crofton, McColl, Wolstenholme, Watson, and Artemas Martin).^{[ citation needed ]}

Like other theories, the theory of probability is a representation of its concepts in formal terms—that is, in terms that can be considered separately from their meaning. These formal terms are manipulated by the rules of mathematics and logic, and any results are interpreted or translated back into the problem domain.

There have been at least two successful attempts to formalize probability, namely the Kolmogorov formulation and the Cox formulation. In Kolmogorov's formulation (see probability space), sets are interpreted as events and probability itself as a measure on a class of sets. In Cox's theorem, probability is taken as a primitive (that is, not further analyzed) and the emphasis is on constructing a consistent assignment of probability values to propositions. In both cases, the laws of probability are the same, except for technical details.

There are other methods for quantifying uncertainty, such as the Dempster–Shafer theory or possibility theory, but those are essentially different and not compatible with the laws of probability as usually understood.

Probability theory is applied in everyday life in risk assessment and modeling. The insurance industry and markets use actuarial science to determine pricing and make trading decisions. Governments apply probabilistic methods in environmental regulation, entitlement analysis (Reliability theory of aging and longevity), and financial regulation.

A good example of the use of probability theory in equity trading is the effect of the perceived probability of any widespread Middle East conflict on oil prices, which have ripple effects in the economy as a whole. An assessment by a commodity trader that a war is more likely can send that commodity's prices up or down, and signals other traders of that opinion. Accordingly, the probabilities are neither assessed independently nor necessarily very rationally. The theory of behavioral finance emerged to describe the effect of such groupthink on pricing, on policy, and on peace and conflict.^{ [24] }

In addition to financial assessment, probability can be used to analyze trends in biology (e.g. disease spread) as well as ecology (e.g. biological Punnett squares). As with finance, risk assessment can be used as a statistical tool to calculate the likelihood of undesirable events occurring and can assist with implementing protocols to avoid encountering such circumstances. Probability is used to design games of chance so that casinos can make a guaranteed profit, yet provide payouts to players that are frequent enough to encourage continued play.^{ [25] }

The discovery of rigorous methods to assess and combine probability assessments has changed society.^{ [26] }^{[ citation needed ]}

Another significant application of probability theory in everyday life is reliability. Many consumer products, such as automobiles and consumer electronics, use reliability theory in product design to reduce the probability of failure. Failure probability may influence a manufacturer's decisions on a product's warranty.^{ [27] }

The cache language model and other statistical language models that are used in natural language processing are also examples of applications of probability theory.

Consider an experiment that can produce a number of results. The collection of all possible results is called the sample space of the experiment. The power set of the sample space is formed by considering all different collections of possible results. For example, rolling a die can produce six possible results. One collection of possible results gives an odd number on the die. Thus, the subset {1,3,5} is an element of the power set of the sample space of die rolls. These collections are called "events". In this case, {1,3,5} is the event that the die falls on some odd number. If the results that actually occur fall in a given event, the event is said to have occurred.

A probability is a way of assigning every event a value between zero and one, with the requirement that the event made up of all possible results (in our example, the event {1,2,3,4,5,6}) is assigned a value of one. To qualify as a probability, the assignment of values must satisfy the requirement that if you look at a collection of mutually exclusive events (events with no common results, e.g., the events {1,6}, {3}, and {2,4} are all mutually exclusive), the probability that at least one of the events will occur is given by the sum of the probabilities of all the individual events.^{ [28] }
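
The die example can be written out as a minimal Python sketch, assuming a fair die so that every result is equally likely. The names `SAMPLE_SPACE`, `power_set`, and `prob` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Sample space of a single roll of a six-sided die.
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def power_set(space):
    """All subsets of the sample space; each subset is an event."""
    items = sorted(space)
    return [
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    ]


def prob(event):
    """Probability of an event for a fair die: favourable results / all results."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


events = power_set(SAMPLE_SPACE)
odd = frozenset({1, 3, 5})

# Three mutually exclusive events taken from the text, and their union.
disjoint = [frozenset({1, 6}), frozenset({3}), frozenset({2, 4})]
union = frozenset().union(*disjoint)

if __name__ == "__main__":
    print("number of events:", len(events))
    print("P(odd) =", prob(odd))
    print("P(union) =", prob(union))
    print("sum of parts =", sum(prob(e) for e in disjoint))
```

Exact fractions are used instead of floating-point numbers so that the additivity requirement can be compared with equality rather than with a tolerance.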

The probability of an event *A* is written as $P(A)$, $p(A)$, or $\text{Pr}(A)$.^{ [29] } This mathematical definition of probability can extend to infinite sample spaces, and even uncountable sample spaces, using the concept of a measure.

The *opposite* or *complement* of an event *A* is the event [not *A*] (that is, the event of *A* not occurring), often denoted as $\overline{A}$, $A^\complement$, $\neg A$, or ${\sim}A$; its probability is given by $P(\text{not } A) = 1 - P(A)$.^{ [30] } As an example, the chance of not rolling a six on a six-sided die is $1 - (\text{chance of rolling a six}) = 1 - \tfrac{1}{6} = \tfrac{5}{6}$. See Complementary event for a more complete treatment.

If two events *A* and *B* occur on a single performance of an experiment, this is called the intersection or joint probability of *A* and *B*, denoted as $P(A \cap B)$.

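If two events, *A* and *B*, are independent then the joint probability is

$$P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B);$$

for example, if two coins are flipped the chance of both being heads is $\tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{4}$.^{ [31] }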
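
The product rule for independent events, $P(A \cap B) = P(A)\,P(B)$, can be checked by enumerating the two-coin sample space. This is a small sketch under the assumption of fair coins; `SPACE` and `prob` are illustrative names, and the second pair of events is an added contrast showing that the rule fails for events that are not independent.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins: four equally likely ordered pairs.
SPACE = list(product("HT", repeat=2))


def prob(predicate):
    """Probability that a randomly chosen outcome satisfies the predicate."""
    return Fraction(sum(1 for o in SPACE if predicate(o)), len(SPACE))


# Independent events: the result of the first coin and of the second coin.
p_first_heads = prob(lambda o: o[0] == "H")
p_second_heads = prob(lambda o: o[1] == "H")
p_both_heads = prob(lambda o: o[0] == "H" and o[1] == "H")

# Dependent events for contrast: "first coin is heads" and "at least one head".
p_at_least_one = prob(lambda o: "H" in o)
p_first_and_at_least_one = prob(lambda o: o[0] == "H" and "H" in o)

if __name__ == "__main__":
    print(p_both_heads, "==", p_first_heads * p_second_heads)
    print(p_first_and_at_least_one, "!=", p_first_heads * p_at_least_one)
```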

If either event *A* or event *B* but never both occurs on a single performance of an experiment, then they are called mutually exclusive events.

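If two events are mutually exclusive then the probability of **both** occurring is denoted as $P(A \cap B)$, and

$$P(A \text{ and } B) = P(A \cap B) = 0.$$

If two events are mutually exclusive then the probability of **either** occurring is denoted as $P(A \cup B)$, and

$$P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B) - 0 = P(A) + P(B).$$

For example, the chance of rolling a 1 or 2 on a six-sided die is $P(1 \text{ or } 2) = P(1) + P(2) = \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{1}{3}$.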

If the events are not mutually exclusive then

$$P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \text{ and } B).$$

For example, when drawing a single card at random from a regular deck of cards, the chance of getting a heart or a face card (J,Q,K) (or one that is both) is $\tfrac{13}{52} + \tfrac{12}{52} - \tfrac{3}{52} = \tfrac{11}{26}$, because of the 52 cards of a deck 13 are hearts, 12 are face cards, and 3 are both: here the possibilities included in the "3 that are both" are included in each of the "13 hearts" and the "12 face cards" but should only be counted once.
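
The card example can be verified by building the deck explicitly and comparing the addition rule, $P(A) + P(B) - P(A \cap B)$, with a direct count of the union. The sketch assumes a standard 52-card deck with each card equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = set(product(RANKS, SUITS))  # 52 (rank, suit) pairs

hearts = {card for card in DECK if card[1] == "hearts"}
faces = {card for card in DECK if card[0] in {"J", "Q", "K"}}


def prob(event):
    """Probability of drawing a card that belongs to the event."""
    return Fraction(len(event), len(DECK))


# Addition rule for events that are not mutually exclusive.
by_formula = prob(hearts) + prob(faces) - prob(hearts & faces)
# Direct count of cards that are a heart, a face card, or both.
by_counting = prob(hearts | faces)

if __name__ == "__main__":
    print("by formula:", by_formula)
    print("by counting:", by_counting)
```

Subtracting the intersection removes the three face cards of hearts that would otherwise be counted twice.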

*Conditional probability* is the probability of some event *A*, given the occurrence of some other event *B*. Conditional probability is written $P(A \mid B)$, and is read "the probability of *A*, given *B*". It is defined by^{ [32] }

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

If $P(B) = 0$ then $P(A \mid B)$ is formally undefined by this expression. However, it is possible to define a conditional probability for some zero-probability events using a σ-algebra of such events (such as those arising from a continuous random variable).^{[ citation needed ]}

For example, in a bag of 2 red balls and 2 blue balls (4 balls in total), the probability of taking a red ball is $1/2$; however, when taking a second ball, the probability of it being either a red ball or a blue ball depends on the ball previously taken. For instance, if a red ball was taken first, the probability of picking a red ball again would be $1/3$, since only 1 red and 2 blue balls would remain.
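
The bag example can be reproduced by listing every ordered pair of draws without replacement and applying the ratio definition of conditional probability, $P(A \mid B) = P(A \cap B) / P(B)$. This is a sketch with hypothetical ball labels and helper names, assuming each ordered pair of draws is equally likely.

```python
from fractions import Fraction
from itertools import permutations

# Two red and two blue balls; the labels are only there to tell the balls apart.
BALLS = ["R1", "R2", "B1", "B2"]
DRAWS = list(permutations(BALLS, 2))  # 12 ordered draws without replacement


def prob(event):
    """Probability that an ordered pair of draws satisfies the event."""
    return Fraction(sum(1 for d in DRAWS if event(d)), len(DRAWS))


def conditional(a, b):
    """P(a | b) = P(a and b) / P(b); undefined when P(b) is zero."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda d: a(d) and b(d)) / p_b


def first_red(draw):
    return draw[0].startswith("R")


def first_blue(draw):
    return draw[0].startswith("B")


def second_red(draw):
    return draw[1].startswith("R")


if __name__ == "__main__":
    print("P(first red) =", prob(first_red))
    print("P(second red | first red) =", conditional(second_red, first_red))
    print("P(second red | first blue) =", conditional(second_red, first_blue))
```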

In probability theory and applications, **Bayes' rule** relates the odds of event $A_1$ to event $A_2$, before (prior to) and after (posterior to) conditioning on another event $B$. The odds on $A_1$ to event $A_2$ is simply the ratio of the probabilities of the two events. When arbitrarily many events $A$ are of interest, not just two, the rule can be rephrased as **posterior is proportional to prior times likelihood**, $P(A \mid B) \propto P(A)\,P(B \mid A)$, where the proportionality symbol means that the left hand side is proportional to (i.e., equals a constant times) the right hand side as $A$ varies, for fixed or given $B$ (Lee, 2012; Bertsch McGrayne, 2012). In this form it goes back to Laplace (1774) and to Cournot (1843); see Fienberg (2005). See Inverse probability and Bayes' rule.
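
The rule "posterior is proportional to prior times likelihood" can be illustrated on a finite set of hypotheses. The scenario below is invented for illustration: a coin is either fair or biased towards heads with probability 3/4, both hypotheses start with equal prior weight, and a single head is observed. The function name `bayes_update` is hypothetical.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Multiply prior by likelihood for each hypothesis, then normalize."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


# Hypothetical setup: equal prior weight on a fair coin and a biased coin.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
# Probability of observing heads under each hypothesis (assumed values).
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

posterior = bayes_update(prior, likelihood_heads)

# Odds form: posterior odds = prior odds * likelihood ratio.
prior_odds = prior["biased"] / prior["fair"]
likelihood_ratio = likelihood_heads["biased"] / likelihood_heads["fair"]
posterior_odds = posterior["biased"] / posterior["fair"]

if __name__ == "__main__":
    print("posterior:", posterior)
    print("posterior odds (biased : fair):", posterior_odds)
```

The normalizing constant cancels when two hypotheses are compared, which is why the odds form needs only the prior odds and the likelihood ratio.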

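Event | Probability |
---|---|
A | $P(A) \in [0, 1]$ |
not A | $P(A^\complement) = 1 - P(A)$ |
A or B | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$; $= P(A) + P(B)$ if A and B are mutually exclusive |
A and B | $P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$; $= P(A)\,P(B)$ if A and B are independent |
A given B | $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{P(B \mid A)\,P(A)}{P(B)}$ |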

In a deterministic universe, based on Newtonian concepts, there would be no probability if all conditions were known (Laplace's demon), but there are situations in which sensitivity to initial conditions exceeds our ability to measure them, i.e. know them. In the case of a roulette wheel, if the force of the hand and the period of that force are known, the number on which the ball will stop would be a certainty (though as a practical matter, this would likely be true only of a roulette wheel that had not been exactly levelled, as Thomas A. Bass' Newtonian Casino revealed). This also assumes knowledge of inertia and friction of the wheel, weight, smoothness and roundness of the ball, variations in hand speed during the turning and so forth. A probabilistic description can thus be more useful than Newtonian mechanics for analyzing the pattern of outcomes of repeated rolls of a roulette wheel. Physicists face the same situation in kinetic theory of gases, where the system, while deterministic *in principle*, is so complex (with the number of molecules typically of the order of magnitude of the Avogadro constant 6.02×10^{23}) that only a statistical description of its properties is feasible.

Probability theory is required to describe quantum phenomena.^{ [33] } A revolutionary discovery of early 20th century physics was the random character of all physical processes that occur at sub-atomic scales and are governed by the laws of quantum mechanics. The objective wave function evolves deterministically but, according to the Copenhagen interpretation, it deals with probabilities of observing, the outcome being explained by a wave function collapse when an observation is made. However, the loss of determinism for the sake of instrumentalism did not meet with universal approval. Albert Einstein famously remarked in a letter to Max Born: "I am convinced that God does not play dice".^{ [34] } Like Einstein, Erwin Schrödinger, who discovered the wave function, believed quantum mechanics is a statistical approximation of an underlying deterministic reality.^{ [35] } In some modern interpretations of the statistical mechanics of measurement, quantum decoherence is invoked to account for the appearance of subjectively probabilistic experimental outcomes.

- Chance (disambiguation)
- Class membership probabilities
- Contingency
- Equiprobability
- Heuristics in judgment and decision-making
- Probability theory
- Randomness
- Statistics
- Estimators
- Estimation Theory
- Probability density function

- In Law

- ↑ Strictly speaking, a probability of 0 indicates that an event *almost* never takes place, whereas a probability of 1 indicates that an event *almost* certainly takes place. This is an important distinction when the sample space is infinite. For example, for the continuous uniform distribution on the real interval [5, 10], there are an infinite number of possible outcomes, and the probability of any given outcome being observed (for instance, exactly 7) is 0. This means that when we make an observation, it will *almost surely not* be exactly 7. However, it does **not** mean that exactly 7 is *impossible*. Ultimately some specific outcome (with probability 0) will be observed, and one possibility for that specific outcome is exactly 7.

**Frequentist probability** or **frequentism** is an interpretation of probability; it defines an event's probability as the limit of its relative frequency in many trials. This interpretation supports the statistical needs of experimental scientists and pollsters; probabilities can be found by a repeatable objective process. It does not support all needs; gamblers typically require estimates of the odds without experiments.

In statistics, the **likelihood principle** is the proposition that, given a statistical model, all the evidence in a sample relevant to model parameters is contained in the likelihood function.

The word probability has been used in a variety of ways since it was first applied to the mathematical study of games of chance. Does probability measure the real, physical tendency of something to occur or is it a measure of how strongly one believes it will occur, or does it draw on both these elements? In answering such questions, mathematicians interpret the probability values of probability theory.

In probability theory and statistics, a **probability distribution** is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment. In more technical terms, the probability distribution is a description of a random phenomenon in terms of the probabilities of events. For instance, if the random variable X is used to denote the outcome of a coin toss, then the probability distribution of X would take the value 0.5 for *X* = heads, and 0.5 for *X* = tails. Examples of random phenomena can include the results of an experiment or survey.

A **statistical model** is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data. A statistical model represents, often in considerably idealized form, the data-generating process.

A **statistical hypothesis**, sometimes called **confirmatory data analysis**, is a hypothesis that is testable on the basis of observing a process that is modeled via a set of random variables. A **statistical hypothesis test** is a method of statistical inference. Commonly, two statistical data sets are compared, or a data set obtained by sampling is compared against a synthetic data set from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis that proposes no relationship between two data sets. The comparison is deemed *statistically significant* if the relationship between the data sets would be an unlikely realization of the null hypothesis according to a threshold probability—the significance level. Hypothesis tests are used when determining what outcomes of a study would lead to a rejection of the null hypothesis for a pre-specified level of significance.

In statistics, the **likelihood function** expresses the probability of a sample of data given a set of parameter values. The likelihood is typically formulated as a function of the parameters. However, it does *not* express a probability over the parameter space; it is equal to the joint probability distribution of a random sample, which is a random variable taking values in the sample space, not the parameter space.

In probability theory and statistics, **Bayes’ theorem** describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For example, if cancer is related to age, then, using Bayes’ theorem, a person's age can be used to more accurately assess the probability that they have cancer than can be done without knowledge of the person’s age.

In the theory of probability and statistics, a **Bernoulli trial** is a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted. It is named after Jacob Bernoulli, a 17th-century Swiss mathematician, who analyzed them in his *Ars Conjectandi* (1713).

**Bayesian statistics** is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a *degree of belief* in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation that views probability as the limit of the relative frequency of an event after many trials.

**Mathematical statistics** is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory.

In probability theory and statistics, the **Laplace distribution** is a continuous probability distribution named after Pierre-Simon Laplace. It is also sometimes called the *double exponential distribution*, because it can be thought of as two exponential distributions spliced together back-to-back, although the term is also sometimes used to refer to the Gumbel distribution. The difference between two independent identically distributed exponential random variables is governed by a Laplace distribution, as is a Brownian motion evaluated at an exponentially distributed random time. Increments of Laplace motion or a variance gamma process evaluated over the time scale also have a Laplace distribution.

In probability theory and statistics, the **log-Laplace distribution** is the probability distribution of a random variable whose logarithm has a Laplace distribution. If *X* has a Laplace distribution with parameters *μ* and *b*, then *Y* = *e*^{X} has a log-Laplace distribution. The distributional properties can be derived from the Laplace distribution.

The **history of statistics** in the modern sense is usually traced to the term *statistics*, found in Germany in 1749, although the interpretation of the word has changed over time. The development of statistics is intimately connected, on the one hand, with the development of sovereign states, particularly European states following the Peace of Westphalia (1648), and, on the other hand, with the development of probability theory, which put statistics on a firm theoretical basis.

**Statistical proof** is the rational demonstration of degree of certainty for a proposition, hypothesis or theory that is used to convince others subsequent to a statistical test of the supporting evidence and the types of inferences that can be drawn from the test scores. Statistical methods are used to increase the understanding of the facts and the proof demonstrates the validity and logic of inference with explicit reference to a hypothesis, the experimental data, the facts, the test, and the odds. Proof has two essential aims: the first is to convince and the second is to explain the proposition through peer and public review.

In probability theory, **conditional probability** is a measure of the probability of an event occurring given that another event has occurred. If the event of interest is *A* and the event *B* is known or assumed to have occurred, "the conditional probability of *A* given *B*", or "the probability of *A* under the condition *B*", is usually written as P(*A* | *B*), or sometimes *P*_{B}(*A*) or P(*A* / *B*). For example, the probability that any given person has a cough on any given day may be only 5%. But if we know or assume that the person has a cold, then they are much more likely to be coughing. The conditional probability of coughing, given that the person is unwell, might be 75%; then P(Cough) = 5% and P(Cough | Sick) = 75%.
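
The paragraph above does not spell out how a conditional probability is computed. Assuming the standard definition P(*A* | *B*) = P(*A* ∩ *B*) / P(*B*) for P(*B*) > 0, the sketch below evaluates it exactly by enumerating the 36 equally likely outcomes of two fair dice. The events (sum equals 8, first die even) and the helper names `prob` and `conditional` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs from two fair six-sided dice, equally likely.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def conditional(event_a, event_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda o: event_a(o) and event_b(o)) / p_b


def sum_is_8(o):
    return o[0] + o[1] == 8


def first_is_even(o):
    return o[0] % 2 == 0


print(prob(sum_is_8))                        # 5/36
print(conditional(sum_is_8, first_is_even))  # 1/6
```

Knowing that the first die is even raises the probability of a sum of 8 from 5/36 to 1/6, in the same way that knowing a person has a cold raises the probability that they are coughing.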

Probability has a dual aspect: on the one hand the likelihood of hypotheses given the evidence for them, and on the other hand the behavior of stochastic processes such as the throwing of dice or coins. The study of the former is historically older in, for example, the law of evidence, while the mathematical treatment of dice began with the work of Cardano, Pascal and Fermat in the 16th and 17th centuries.

*An Essay towards solving a Problem in the Doctrine of Chances* is a work on the mathematical theory of probability by the Reverend Thomas Bayes, published in 1763, two years after its author's death, and containing multiple amendments and additions due to his friend Richard Price. The title comes from the contemporary use of the phrase "doctrine of chances" to mean the theory of probability, which had been introduced via the title of a book by Abraham de Moivre. Contemporary reprints of the Essay carry a more specific and significant title:

- "Probability". *Webster's Revised Unabridged Dictionary*. G & C Merriam, 1913.
- "Kendall's Advanced Theory of Statistics, Volume 1: Distribution Theory", Alan Stuart and Keith Ord, 6th Ed, (2009), ISBN 978-0-534-24312-8.
- William Feller, "An Introduction to Probability Theory and Its Applications", (Vol 1), 3rd Ed, (1968), Wiley, ISBN 0-471-25708-7.
- Probability Theory. The Britannica website.
- Hacking, Ian (1965). *The Logic of Statistical Inference*. Cambridge University Press. ISBN 978-0-521-05165-1.
- Finetti, Bruno de (1970). "Logical foundations and measurement of subjective probability". *Acta Psychologica*. **34**: 129–145. doi:10.1016/0001-6918(70)90012-0.
- Hájek, Alan (21 October 2002). Edward N. Zalta (ed.). "Interpretations of Probability". *The Stanford Encyclopedia of Philosophy* (Winter 2012 ed.). Retrieved 22 April 2013.
- Hogg, Robert V.; Craig, Allen; McKean, Joseph W. (2004). *Introduction to Mathematical Statistics* (6th ed.). Upper Saddle River: Pearson. ISBN 978-0-13-008507-8.
- Jaynes, E.T. (2003). "Section 5.3 Converging and diverging views". In Bretthorst, G. Larry (ed.). *Probability Theory: The Logic of Science* (1 ed.). Cambridge University Press. ISBN 978-0-521-59271-0.
- Hacking, I. (2006). *The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference*. Cambridge University Press. ISBN 978-0-521-68557-3.
- Freund, John (1973). *Introduction to Probability*. Dickenson. ISBN 978-0-8221-0078-2 (p. 1).
- Jeffrey, R.C. (1992). *Probability and the Art of Judgment*. Cambridge University Press. pp. 54–55. ISBN 0-521-39459-7.
- Franklin, J. (2001). *The Science of Conjecture: Evidence and Probability Before Pascal*. Johns Hopkins University Press. (pp. 22, 113, 127).
- Broemeling, Lyle D. (1 November 2011). "An Account of Early Statistical Inference in Arab Cryptology". *The American Statistician*. **65** (4): 255–257. doi:10.1198/tas.2011.10191.
- Gorroochurn, P. (2012). "Some laws and problems in classical probability and how Cardano anticipated them". *Chance* magazine.
- Abrams, William. *A Brief History of Probability*. Second Moment. Retrieved 23 May 2008.
- Ivancevic, Vladimir G.; Ivancevic, Tijana T. (2008). *Quantum leap: from Dirac and Feynman, across the universe, to human body and mind*. Singapore; Hackensack, NJ: World Scientific. p. 16. ISBN 978-981-281-927-7.
- Franklin, James (2001). *The Science of Conjecture: Evidence and Probability Before Pascal*. Johns Hopkins University Press. ISBN 978-0-8018-6569-5.
- Shoesmith, Eddie (November 1985). "Thomas Simpson and the arithmetic mean". *Historia Mathematica*. **12** (4): 352–355. doi:10.1016/0315-0860(85)90044-8.
- Wilson, E.B. (1923). "First and second laws of error". *Journal of the American Statistical Association*, 18, 143.
- Seneta, Eugene William. ""Adrien-Marie Legendre" (version 9)". *StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies*. Archived from the original on 3 February 2016. Retrieved 27 January 2016.
- Weber, Richard. "Markov Chains" (PDF). *Statistical Laboratory*. University of Cambridge.
- Vitanyi, Paul M.B. (1988). "Andrei Nikolaevich Kolmogorov". *CWI Quarterly* (1): 3–18. Retrieved 27 January 2016.
- Singh, Laurie (2010). "Whither Efficient Markets? Efficient Market Theory and Behavioral Finance". The Finance Professionals' Post, 2010.
- Gao, J.Z.; Fong, D.; Liu, X. (April 2011). "Mathematical analyses of casino rebate systems for VIP gambling". *International Gambling Studies*. **11** (1): 93–106. doi:10.1080/14459795.2011.552575.
- "Data: Data Analysis, Probability and Statistics, and Graphing". *archon.educ.kent.edu*. Retrieved 28 May 2017.
- Gorman, Michael F. (2010). "Management Insights". *Management Science*. **56**: iv–vii. doi:10.1287/mnsc.1090.1132.
- Ross, Sheldon M. (2010). *A First course in Probability* (8th ed.). Pearson Prentice Hall. pp. 26–27. ISBN 9780136033134.
- Olofsson (2005), p. 8.
- Olofsson (2005), p. 9.
- Olofsson (2005), p. 35.
- Olofsson (2005), p. 29.
- Burgin, Mark (2010). "Interpretations of Negative Probabilities": 1. arXiv:1008.1287v1.
- *Jedenfalls bin ich überzeugt, daß der Alte nicht würfelt.* ("In any case, I am convinced that the Old One does not play dice.") Letter to Max Born, 4 December 1926, in: Einstein/Born Briefwechsel 1916–1955.
- Moore, W.J. (1992). *Schrödinger: Life and Thought*. Cambridge University Press. p. 479. ISBN 978-0-521-43767-7.

- Kallenberg, O. (2005). *Probabilistic Symmetries and Invariance Principles*. Springer-Verlag, New York. 510 pp. ISBN 0-387-25115-4.
- Kallenberg, O. (2002). *Foundations of Modern Probability*, 2nd ed. Springer Series in Statistics. 650 pp. ISBN 0-387-95313-2.
- Olofsson, Peter (2005). *Probability, Statistics, and Stochastic Processes*. Wiley-Interscience. 504 pp. ISBN 0-471-67969-0.


- Virtual Laboratories in Probability and Statistics (Univ. of Ala.-Huntsville)
- Probability on *In Our Time* at the BBC
- Probability and Statistics EBook
- Edwin Thompson Jaynes. *Probability Theory: The Logic of Science*. Preprint: Washington University, (1996). HTML index with links to PostScript files and PDF (first three chapters)
- People from the History of Probability and Statistics (Univ. of Southampton)
- Probability and Statistics on the Earliest Uses Pages (Univ. of Southampton)
- Earliest Uses of Symbols in Probability and Statistics on Earliest Uses of Various Mathematical Symbols
- A tutorial on probability and Bayes' theorem devised for first-year Oxford University students
- PDF file of *An Anthology of Chance Operations* (1963) at UbuWeb
- Introduction to Probability, eBook by Charles Grinstead and Laurie Snell. Source *(GNU Free Documentation License)*
- Bruno de Finetti, *Probabilità e induzione* (in English and Italian), Bologna, CLUEB, 1993. ISBN 88-8091-176-7 (digital version)
- Richard P. Feynman's Lecture on probability.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.
