Lottery mathematics

Lottery mathematics is used to calculate probabilities of winning or losing a lottery game. It is based primarily on combinatorics, particularly the twelvefold way and combinations without replacement.

Choosing 6 from 49

In a typical 6/49 game, each player chooses six distinct numbers from a range of 1–49. If the six numbers on a ticket match the numbers drawn by the lottery, the ticket holder is a jackpot winner—regardless of the order of the numbers. The probability of this happening is 1 in 13,983,816.

The chance of winning can be demonstrated as follows: The first number drawn has a 1 in 49 chance of matching. When the draw comes to the second number, there are now only 48 balls left in the bag, because the balls are drawn without replacement. So there is now a 1 in 48 chance of predicting this number.

Thus for each of the 49 ways of choosing the first number there are 48 different ways of choosing the second. This means that the probability of correctly predicting 2 numbers drawn from 49 in the correct order is 1 in 49 × 48. On drawing the third number there are only 47 ways of choosing it; but we could have arrived at this point in any of 49 × 48 ways, so the chance of correctly predicting 3 numbers drawn from 49, again in the correct order, is 1 in 49 × 48 × 47. This continues until the sixth number has been drawn, giving the final calculation 49 × 48 × 47 × 46 × 45 × 44, which can also be written as 49!/(49 − 6)! = 49!/43!, or FACT(49)/FACT(43), or simply PERM(49,6).

49!/43! = 608281864034267560872252163321295376887552831379210240000000000 / 60415263063373835637355132068513997507264512000000000 = 10,068,347,520

This works out to 10,068,347,520 ordered sequences, which is much bigger than the ~14 million stated above because it still distinguishes the order in which the numbers are drawn: PERM(49,6) = 49 nPr 6 = 10,068,347,520.
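
The ordered count above can be checked directly with Python's standard library, which provides the permutation function used here; this is a minimal sketch, not part of the original article:

```python
import math

# Number of ordered ways to draw 6 balls from 49 without replacement:
# 49 * 48 * 47 * 46 * 45 * 44 = 49! / 43!
ordered = math.perm(49, 6)  # same as math.factorial(49) // math.factorial(43)
print(ordered)              # 10068347520
```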

However, the order of the 6 numbers is not significant for the payout. That is, if a ticket has the numbers 1, 2, 3, 4, 5, and 6, it wins as long as all the numbers 1 through 6 are drawn, no matter what order they come out in. Accordingly, given any combination of 6 numbers, there are 6 × 5 × 4 × 3 × 2 × 1 = 6! = 720 orders in which they can be drawn. Dividing 10,068,347,520 by 720 gives 13,983,816, also written as 49!/(6! × 43!), or COMBIN(49,6), or 49 nCr 6, or more generally as

C(n, k) = n!/(k! × (n − k)!),

where n is the number of alternatives and k is the number of choices. Further information is available at binomial coefficient and multinomial coefficient.

This function is called the combination function, COMBIN(n, k). For the rest of this article, we will use the notation C(n, k). "Combination" means the group of numbers selected, irrespective of the order in which they are drawn. A combination of numbers is usually presented in ascending order. Any 7th drawn number (the reserve or bonus) is presented at the end.

An alternative method of calculating the odds is to note that the probability of the first ball corresponding to one of the six chosen is 6/49; the probability of the second ball corresponding to one of the remaining five chosen is 5/48; and so on. This yields a final formula of

6/49 × 5/48 × 4/47 × 3/46 × 2/45 × 1/44 = 720/10,068,347,520 = 1/13,983,816
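
Both routes, the sequential product and the combination count, give the same probability, which can be verified exactly with rational arithmetic; this is an illustrative sketch, not part of the original article:

```python
from fractions import Fraction
import math

# Sequential odds: each drawn ball must match one of the player's
# remaining chosen numbers: 6/49 * 5/48 * 4/47 * 3/46 * 2/45 * 1/44
p = Fraction(1)
for remaining, pool in zip(range(6, 0, -1), range(49, 43, -1)):
    p *= Fraction(remaining, pool)

print(p)                                   # 1/13983816
print(p == Fraction(1, math.comb(49, 6)))  # True
```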

A 7th ball is often drawn as a reserve or bonus ball; historically it served only as a second chance, rewarding tickets that matched 5 of the 6 numbers played plus the bonus (a "5 + 1" result).

Odds of getting other possibilities in choosing 6 from 49

One must divide the number of combinations producing the given result by the total number of possible combinations (for example, C(49, 6) = 13,983,816). The numerator is the number of ways to select the winning numbers multiplied by the number of ways to select the losing numbers.

For a score of n (for example, if 3 of the player's choices match three of the 6 balls drawn, then n = 3), there are C(6, n) ways of selecting n winning numbers from the 6 winning numbers. This means that there are 6 − n losing numbers, which are chosen from the 43 losing numbers in C(43, 6 − n) ways. The total number of combinations giving that result is, as stated above, the first number multiplied by the second. The probability is therefore C(6, n) × C(43, 6 − n) / C(49, 6).

This can be written in a general form for all lotteries as:

C(K, B) × C(N − K, K − B) / C(N, K),

where N is the number of balls in the lottery, K is the number of balls on a single ticket, and B is the number of matching balls for a winning ticket.

The generalisation of this formula is called the hypergeometric distribution.

This gives the following results:

| Score | Calculation | Exact probability | Approximate decimal probability | Approximate 1/probability |
|---|---|---|---|---|
| 0 | C(6,0) × C(43,6) / C(49,6) | 435,461/998,844 | 0.436 | 2.2938 |
| 1 | C(6,1) × C(43,5) / C(49,6) | 68,757/166,474 | 0.413 | 2.4212 |
| 2 | C(6,2) × C(43,4) / C(49,6) | 44,075/332,948 | 0.132 | 7.5541 |
| 3 | C(6,3) × C(43,3) / C(49,6) | 8,815/499,422 | 0.0177 | 56.66 |
| 4 | C(6,4) × C(43,2) / C(49,6) | 645/665,896 | 0.000969 | 1,032.4 |
| 5 | C(6,5) × C(43,1) / C(49,6) | 43/2,330,636 | 0.0000184 | 54,200.8 |
| 6 | C(6,6) × C(43,0) / C(49,6) | 1/13,983,816 | 0.0000000715 | 13,983,816 |
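
The table above can be reproduced with a short script applying the hypergeometric formula; a minimal sketch (the function name is illustrative, not from the original article):

```python
from fractions import Fraction
from math import comb

def score_probability(n, N=49, K=6):
    """Probability of matching exactly n of the K drawn balls
    (hypergeometric distribution)."""
    return Fraction(comb(K, n) * comb(N - K, K - n), comb(N, K))

for n in range(7):
    p = score_probability(n)
    print(n, p, float(p), float(1 / p))
```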

When a 7th number is drawn as a bonus number, there are 49!/(6! × 1! × 42!) = COMBIN(49,6) × COMBIN(43,1) = 601,304,088 different possible drawing results.

| Score | Calculation | Exact probability | Approximate decimal probability | Approximate 1/probability |
|---|---|---|---|---|
| 5 + 0 | C(6,5) × C(42,1) / C(49,6) | 252/13,983,816 | 0.0000180208 | 55,491.33 |
| 5 + 1 | C(6,5) × C(1,1) / C(49,6) | 6/13,983,816 | 0.000000429 | 2,330,636 |

You would expect to score 3 of 6 or better about once in every 54 drawings (the combined probability is 260,624/13,983,816 ≈ 1/53.66). Note that it takes a 3-if-6 wheel of 163 combinations to guarantee at least one 3/6 score.

The value of 1/p changes when several distinct combinations are played together; playing multiple combinations mostly improves the chance of winning something, not the jackpot odds of any single line.

Ensuring a jackpot win

There is only one known way to ensure winning the jackpot: buy at least one lottery ticket for every possible number combination. For example, one has to buy 13,983,816 different tickets to ensure winning the jackpot in a 6/49 game.

Lottery organizations have laws, rules and safeguards in place to prevent gamblers from executing such an operation. Further, just winning the jackpot by buying every possible combination does not guarantee that one will break even or make a profit.

If p is the probability of winning, c_t the cost of a ticket, c_o the cost of obtaining a ticket (e.g. including the logistics), and c_f the one-time costs of the operation (such as setting up and conducting it), then the jackpot J should contain at least

J ≥ (c_t + c_o)/p + c_f

to have a chance to at least break even, since 1/p tickets must be bought.
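
The break-even threshold can be sketched as a small function; the figures in the example (ticket price, handling cost, setup cost) are purely illustrative assumptions, not from the article:

```python
from fractions import Fraction

def break_even_jackpot(p, ticket_cost, obtain_cost, fixed_cost):
    """Minimum jackpot needed to break even when buying all 1/p combinations.

    p: probability of a single ticket winning the jackpot (exact Fraction).
    """
    return (ticket_cost + obtain_cost) / p + fixed_cost

# Hypothetical 6/49 example: $1 tickets, $0.10 handling each, $50,000 setup
j = break_even_jackpot(Fraction(1, 13983816), Fraction(1), Fraction(1, 10), 50000)
print(float(j))  # 15432197.6
```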

The above theoretical break-even point is slightly offset by the sum S of the minor wins also included in all the lottery tickets:

J ≥ (c_t + c_o)/p + c_f − S

Still, even if the above relation is satisfied, it does not guarantee breaking even: each prize is shared among its winners, so the payout depends on the number of winning tickets n_i for each of the prizes P_i.

In probably the only known successful operations,[1] the threshold to execute an operation was, for reasons that are not documented, set at three times the cost of the tickets alone, i.e.

J ≥ 3 × c_t/p

This does not, however, eliminate all risk of making no profit. The success of the operations still depended on a degree of luck. In addition, in one operation the logistics failed and not all combinations could be obtained, adding the risk of not winning the jackpot at all.

Powerballs and bonus balls

Many lotteries have a Powerball (or "bonus ball"). If the powerball is drawn from a pool of numbers different from the main lottery, the odds are multiplied by the number of powerballs. For example, in the 6 from 49 lottery, given 10 powerball numbers, the odds of getting a score of 3 and the powerball would be 1 in 56.66 × 10, or 1 in 566.6 (the probability is divided by 10, to give an exact value of 8,815/4,994,220). Another example of such a game is Mega Millions, albeit with different jackpot odds.

Where more than 1 powerball is drawn from a separate pool of balls to the main lottery (for example, in the EuroMillions game), the odds of the different possible powerball matching scores are calculated using the method shown in the "other scores" section above (in other words, the powerballs are like a mini-lottery in their own right), and then multiplied by the odds of achieving the required main-lottery score.

If the powerball is drawn from the same pool of numbers as the main lottery, then, for a given target score, the number of winning combinations includes the powerball. For games based on the Canadian lottery (such as the lottery of the United Kingdom), after the 6 main balls are drawn, an extra ball is drawn from the same pool of balls, and this becomes the powerball (or "bonus ball"). An extra prize is given for matching 5 balls and the bonus ball. As described in the "other scores" section above, the number of ways one can obtain a score of 5 from a single ticket is C(6,5) × C(43,1) = 258. Since the number of remaining balls is 43, and the ticket has 1 unmatched number remaining, 1/43 of these 258 combinations will match the next ball drawn (the powerball), leaving 258/43 = 6 ways of achieving it. Therefore, the odds of getting a score of 5 and the powerball are 6/13,983,816 = 1 in 2,330,636.

Of the 258 combinations that match 5 of the main 6 balls, in 42/43 of them the remaining number will not match the powerball, giving odds of 252/13,983,816 = 1 in 55,491.33 for obtaining a score of 5 without matching the powerball.

Using the same principle, the odds of getting a score of 2 and the powerball are the odds for a score of 2 multiplied by the probability 4/43 that one of the remaining four ticket numbers matches the bonus ball. Since C(6,2) × C(43,4) = 1,851,150, the probability of obtaining the score of 2 and the bonus ball is 1,851,150 × (4/43) / 13,983,816 = 172,200/13,983,816, approximate decimal odds of 1 in 81.2.

The general formula for matching B balls in an N choose K lottery with one bonus ball from the pool of N balls is:

C(K, B) × C(N − K, K − B) / C(N, K) × (K − B)/(N − K)

The general formula for matching B balls in an N choose K lottery with zero bonus balls from the pool of N balls is:

C(K, B) × C(N − K, K − B) / C(N, K) × (N − K − (K − B))/(N − K)

The general formula for matching B balls in an N choose K lottery with one bonus ball from a separate pool of P balls is:

C(K, B) × C(N − K, K − B) / C(N, K) × 1/P

The general formula for matching B balls in an N choose K lottery with no bonus ball from a separate pool of P balls is:

C(K, B) × C(N − K, K − B) / C(N, K) × (P − 1)/P
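
These general formulas can be implemented and checked against the worked examples above; a minimal sketch with illustrative function names:

```python
from fractions import Fraction
from math import comb

def match_probability(N, K, B):
    """Probability of matching exactly B of the K main balls in an
    N-choose-K lottery (hypergeometric)."""
    return Fraction(comb(K, B) * comb(N - K, K - B), comb(N, K))

def with_bonus_same_pool(N, K, B):
    # Bonus ball drawn from the same pool: K - B unmatched ticket numbers
    # remain among the N - K undrawn balls.
    return match_probability(N, K, B) * Fraction(K - B, N - K)

def with_bonus_separate_pool(N, K, B, P):
    # Bonus ball drawn from its own pool of P balls.
    return match_probability(N, K, B) * Fraction(1, P)

# UK-style 6/49: score 5 plus the bonus ball -> 1 in 2,330,636
print(1 / with_bonus_same_pool(49, 6, 5))  # 2330636
```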

Minimum number of tickets for a match

It is a hard (and often open) problem to calculate the minimum number of tickets one needs to purchase to guarantee that at least one of these tickets matches at least 2 numbers. In the 5-from-90 lotto, the minimum number of tickets that can guarantee a ticket with at least 2 matches is 100. [2]

Information theoretic results

As a discrete probability space, the probability of any particular lottery outcome is atomic, meaning it is greater than zero. Therefore, the probability of any event is the sum of the probabilities of the outcomes it contains. This makes it easy to calculate quantities of interest from information theory. For example, the information content of any event E is easy to calculate, by the formula

I(E) = −log₂ P(E)

In particular, the information content of outcome x of a discrete random variable X is

I_X(x) = −log₂ p_X(x)

For example, winning in the example § Choosing 6 from 49 above is a Bernoulli-distributed random variable X with a 1/13,983,816 chance of winning ("success"). We write X ~ Bern(p) with p = 1/13,983,816 and q = 1 − p. The information content of winning is

I(win) = −log₂(1/13,983,816) ≈ 23.74

shannons or bits of information. (See units of information for further explanation of terminology.) The information content of losing is

I(lose) = −log₂(1 − 1/13,983,816) ≈ 1.03 × 10⁻⁷ shannons.
The information entropy of a lottery probability distribution is also easy to calculate as the expected value of the information content.

Oftentimes the random variable of interest in the lottery is a Bernoulli trial. In this case, the Bernoulli (binary) entropy function may be used. Using X to represent winning the 6-of-49 lottery, the Shannon entropy of 6-of-49 above is

H(X) = −p log₂ p − q log₂ q ≈ 1.80 × 10⁻⁶ shannons.
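
These information-theoretic quantities follow from the probabilities already computed; a short sketch using only the standard library:

```python
import math

p_win = 1 / 13983816        # jackpot probability in 6/49
p_lose = 1 - p_win

i_win = -math.log2(p_win)   # information content of winning, in shannons
i_lose = -math.log2(p_lose) # information content of losing

# Shannon entropy of the win/lose Bernoulli variable
entropy = p_win * i_win + p_lose * i_lose

print(round(i_win, 3))      # 23.737
print(entropy)              # ~1.80e-06 shannons
```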


References

  1. The man who won the lottery 14 times
  2. Z. Füredi, G. J. Székely, and Z. Zubor (1996). "On the lottery problem". Journal of Combinatorial Designs. 4 (1): 5–10. doi:10.1002/(sici)1520-6610(1996)4:1<5::aid-jcd2>3.3.co;2-w.