Bernoulli trial

Last updated
Graphs of probability P of not observing independent events each of probability p after n Bernoulli trials vs np for various p. Three examples are shown:
Blue curve: Throwing a 6-sided die 6 times gives a 33.5% chance that 6 (or any other given number) never turns up; it can be observed that as n increases, the probability of a 1/n-chance event never appearing after n tries rapidly converges to 0.
Grey curve: To get 50-50 chance of throwing a Yahtzee (5 cubic dice all showing the same number) requires 0.69 x 1296 ~ 898 throws.
Green curve: Drawing a card from a deck of playing cards without jokers 100 (1.92 x 52) times with replacement gives 85.7% chance of drawing the ace of spades at least once. Bernoulli trial progression.svg
Graphs of probability P of not observing independent events each of probability p after n Bernoulli trials vs np for various p. Three examples are shown:
Blue curve: Throwing a 6-sided die 6 times gives a 33.5% chance that 6 (or any other given number) never turns up; it can be observed that as n increases, the probability of a 1/n-chance event never appearing after n tries rapidly converges to 0.
Grey curve: To get 50-50 chance of throwing a Yahtzee (5 cubic dice all showing the same number) requires 0.69 × 1296 ~ 898 throws.
Green curve: Drawing a card from a deck of playing cards without jokers 100 (1.92 × 52) times with replacement gives 85.7% chance of drawing the ace of spades at least once.

In the theory of probability and statistics, a Bernoulli trial (or binomial trial) is a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted. [1] It is named after Jacob Bernoulli, a 17th-century Swiss mathematician, who analyzed them in his Ars Conjectandi (1713). [2]

Contents

The mathematical formalisation of the Bernoulli trial is known as the Bernoulli process. This article offers an elementary introduction to the concept, whereas the article on the Bernoulli process offers a more advanced treatment.

Since a Bernoulli trial has only two possible outcomes, it can be framed as some "yes or no" question. For example:

Therefore, success and failure are merely labels for the two outcomes, and should not be construed literally. The term "success" in this sense consists in the result meeting specified conditions; it is not a value judgement. More generally, given any probability space, for any event (set of outcomes), one can define a Bernoulli trial, corresponding to whether the event occurred or not (event or complementary event). Examples of Bernoulli trials include:

Definition

Independent repeated trials of an experiment with exactly two possible outcomes are called Bernoulli trials. Call one of the outcomes "success" and the other outcome "failure". Let be the probability of success in a Bernoulli trial, and be the probability of failure. Then the probability of success and the probability of failure sum to one, since these are complementary events: "success" and "failure" are mutually exclusive and exhaustive. Thus, one has the following relations:

Alternatively, these can be stated in terms of odds: given probability of success and of failure, the odds for are and the odds against are These can also be expressed as numbers, by dividing, yielding the odds for, , and the odds against, :

These are multiplicative inverses, so they multiply to 1, with the following relations:

In the case that a Bernoulli trial is representing an event from finitely many equally likely outcomes, where of the outcomes are success and of the outcomes are failure, the odds for are and the odds against are This yields the following formulas for probability and odds:

Here the odds are computed by dividing the number of outcomes, not the probabilities, but the proportion is the same, since these ratios only differ by multiplying both terms by the same constant factor.

Random variables describing Bernoulli trials are often encoded using the convention that 1 = "success", 0 = "failure".

Closely related to a Bernoulli trial is a binomial experiment, which consists of a fixed number of statistically independent Bernoulli trials, each with a probability of success , and counts the number of successes. A random variable corresponding to a binomial experiment is denoted by , and is said to have a binomial distribution . The probability of exactly successes in the experiment is given by:

where is a binomial coefficient.

Bernoulli trials may also lead to negative binomial distributions (which count the number of successes in a series of repeated Bernoulli trials until a specified number of failures are seen), as well as various other distributions.

When multiple Bernoulli trials are performed, each with its own probability of success, these are sometimes referred to as Poisson trials. [3]

Example: tossing coins

Consider the simple experiment where a fair coin is tossed four times. Find the probability that exactly two of the tosses result in heads.

Solution

A representation of the possible outcomes of flipping a fair coin four times in terms of the number of heads. As can be seen, the probability of getting exactly two heads in four flips is 6/16 = 3/8, which matches the calculations. Coin flip outcomes.png
A representation of the possible outcomes of flipping a fair coin four times in terms of the number of heads. As can be seen, the probability of getting exactly two heads in four flips is 6/16 = 3/8, which matches the calculations.

For this experiment, let a heads be defined as a success and a tails as a failure. Because the coin is assumed to be fair, the probability of success is . Thus, the probability of failure, , is given by

.

Using the equation above, the probability of exactly two tosses out of four total tosses resulting in a heads is given by:

See also

Related Research Articles

<span class="mw-page-title-main">Binomial distribution</span> Probability distribution

In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own Boolean-valued outcome: success or failure. A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment, and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

<span class="mw-page-title-main">Probability distribution</span> Mathematical function for the probability a given outcome occurs in an experiment

In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events.

<span class="mw-page-title-main">Negative binomial distribution</span> Probability distribution

In probability theory and statistics, the negative binomial distribution is a discrete probability distribution that models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of successes occurs. For example, we can define rolling a 6 on a dice as a success, and rolling any other number as a failure, and ask how many failure rolls will occur before we see the third success. In such a case, the probability distribution of the number of failures that appear will be a negative binomial distribution.

<span class="mw-page-title-main">Geometric distribution</span> Probability distribution

In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions:

<span class="mw-page-title-main">Bernoulli process</span> Random process of binary (boolean) random variables

In probability and statistics, a Bernoulli process is a finite or infinite sequence of binary random variables, so it is a discrete-time stochastic process that takes only two values, canonically 0 and 1. The component Bernoulli variablesXi are identically distributed and independent. Prosaically, a Bernoulli process is a repeated coin flipping, possibly with an unfair coin. Every variable Xi in the sequence is associated with a Bernoulli trial or experiment. They all have the same Bernoulli distribution. Much of what can be said about the Bernoulli process can also be generalized to more than two outcomes ; this generalization is known as the Bernoulli scheme.

<span class="mw-page-title-main">Probability mass function</span> Discrete-variable probability distribution

In probability and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete probability density function. The probability mass function is often the primary means of defining a discrete probability distribution, and such functions exist for either scalar or multivariate random variables whose domain is discrete.

In probability theory, odds provide a measure of the likelihood of a particular outcome. When specific events are equally likely, odds are calculated as the ratio of the number of events that produce that outcome to the number that do not. Odds are commonly used in gambling and statistics.

<span class="mw-page-title-main">Hypergeometric distribution</span> Discrete probability distribution

In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of successes in draws, without replacement, from a finite population of size that contains exactly objects with that feature, wherein each draw is either a success or a failure. In contrast, the binomial distribution describes the probability of successes in draws with replacement.

<span class="mw-page-title-main">Bernoulli distribution</span> Probability distribution modeling a coin toss which need not be fair

In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability and the value 0 with probability . Less formally, it can be thought of as a model for the set of possible outcomes of any single experiment that asks a yes–no question. Such questions lead to outcomes that are boolean-valued: a single bit whose value is success/yes/true/one with probability p and failure/no/false/zero with probability q. It can be used to represent a coin toss where 1 and 0 would represent "heads" and "tails", respectively, and p would be the probability of the coin landing on heads. In particular, unfair coins would have

In information theory, the information content, self-information, surprisal, or Shannon information is a basic quantity derived from the probability of a particular event occurring from a random variable. It can be thought of as an alternative way of expressing probability, much like odds or log-odds, but which has particular mathematical advantages in the setting of information theory.

<span class="mw-page-title-main">Joint probability distribution</span> Type of probability distribution

Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered for any given number of random variables. The joint distribution encodes the marginal distributions, i.e. the distributions of each of the individual random variables and the conditional probability distributions, which deal with how the outputs of one random variable are distributed when given information on the outputs of the other random variable(s).

In probability theory, the multinomial distribution is a generalization of the binomial distribution. For example, it models the probability of counts for each side of a k-sided die rolled n times. For n independent trials each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.

Lottery mathematics is used to calculate probabilities of winning or losing a lottery game. It is based primarily on combinatorics, particularly the twelvefold way and combinations without replacement.

<span class="mw-page-title-main">Kelly criterion</span> Formula for bet sizing that maximizes the expected logarithmic value

In probability theory, the Kelly criterion is a formula for sizing a bet. The Kelly bet size is found by maximizing the expected value of the logarithm of wealth, which is equivalent to maximizing the expected geometric growth rate. Assuming that the expected returns are known, the Kelly criterion leads to higher wealth than any other strategy in the long run. J. L. Kelly Jr, a researcher at Bell Labs, described the criterion in 1956.

In statistics, a binomial proportion confidence interval is a confidence interval for the probability of success calculated from the outcome of a series of success–failure experiments. In other words, a binomial proportion confidence interval is an interval estimate of a success probability p when only the number of experiments n and the number of successes nS are known.

In statistics, binomial regression is a regression analysis technique in which the response has a binomial distribution: it is the number of successes in a series of independent Bernoulli trials, where each trial has probability of success . In binomial regression, the probability of a success is related to explanatory variables: the corresponding concept in ordinary regression is to relate the mean value of the unobserved response to explanatory variables.

The Lexis ratio is used in statistics as a measure which seeks to evaluate differences between the statistical properties of random mechanisms where the outcome is two-valued — for example "success" or "failure", "win" or "lose". The idea is that the probability of success might vary between different sets of trials in different situations. This ratio is not much used currently having been largely replaced by the use of the chi-squared test in testing for the homogeneity of samples.

<span class="mw-page-title-main">De Moivre–Laplace theorem</span> Convergence in distribution of binomial to normal distribution

In probability theory, the de Moivre–Laplace theorem, which is a special case of the central limit theorem, states that the normal distribution may be used as an approximation to the binomial distribution under certain conditions. In particular, the theorem shows that the probability mass function of the random number of "successes" observed in a series of independent Bernoulli trials, each having probability of success, converges to the probability density function of the normal distribution with mean and standard deviation , as grows large, assuming is not or .

<span class="mw-page-title-main">Poisson distribution</span> Discrete probability distribution

In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time if these events occur with a known constant mean rate and independently of the time since the last event. It can also be used for the number of events in other types of intervals than time, and in dimension greater than 1.

In probability theory and statistics, the Poisson binomial distribution is the discrete probability distribution of a sum of independent Bernoulli trials that are not necessarily identically distributed. The concept is named after Siméon Denis Poisson.

References

  1. Papoulis, A. (1984). "Bernoulli Trials". Probability, Random Variables, and Stochastic Processes (2nd ed.). New York: McGraw-Hill. pp. 57–63.
  2. James Victor Uspensky: Introduction to Mathematical Probability, McGraw-Hill, New York 1937, page 45
  3. Rajeev Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, New York (NY), 1995, p.67-68