Banach's matchbox problem

Last updated March 09, 2024

Banach's matchbox problem is a classic problem in probability attributed to Stefan Banach. According to Feller,^[1] the problem was inspired by a humorous reference to Banach's smoking habit that Hugo Steinhaus made in a speech honouring him; Banach neither posed the problem nor provided an answer.

Suppose a mathematician carries two matchboxes at all times: one in his left pocket and one in his right. Each time he needs a match, he is equally likely to take it from either pocket. Suppose he reaches into his pocket and discovers for the first time that the box picked is empty. If it is assumed that each of the matchboxes originally contained $N$ matches, what is the probability that there are exactly $k$ matches in the other box?
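The setup can also be explored empirically before deriving anything. The following is a minimal simulation sketch, not part of the original treatment; the function names `simulate_remaining` and `empirical_distribution` and the fixed seed are illustrative assumptions.

```python
import random

def simulate_remaining(n, rng):
    """Simulate one run; return the matches left in the other box
    at the moment a box is first found empty."""
    boxes = [n, n]  # matches in the left and right boxes
    while True:
        pick = rng.randrange(2)  # each pocket is equally likely
        if boxes[pick] == 0:
            # The chosen box is discovered empty: report the other one.
            return boxes[1 - pick]
        boxes[pick] -= 1

def empirical_distribution(n, trials, seed=0):
    """Estimate P[K = k] for k = 0..n from repeated simulated runs."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        counts[simulate_remaining(n, rng)] += 1
    return [c / trials for c in counts]
```

Calling `empirical_distribution(40, 100_000)` gives relative frequencies that can be compared with the exact probabilities for $N=40$.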

Solution

Without loss of generality, focus on the left box and suppose for the moment that the matchbox in his right pocket has an unlimited number of matches. Let $M$ be the number of matches removed from the right box before the left one is found to be empty. When the left box is found to be empty, the man has chosen that pocket $N+1$ times. Then $M$ is the number of successes before $N+1$ failures in Bernoulli trials with $p=1/2$, which has the negative binomial distribution, and thus

$$P[M=m]=\binom{N+m}{m}\left(\frac{1}{2}\right)^{N+1+m}.$$
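As a quick sanity check on this formula, the sketch below evaluates it with exact rational arithmetic. The helper name `neg_binomial_pmf` and the choice $N=4$ are illustrative assumptions. The terms with $m \leq N$ are exactly the outcomes in which the right box would not have run out in the two-box setting, and they add up to one half, as the symmetry between the two pockets suggests.

```python
from fractions import Fraction
from math import comb

def neg_binomial_pmf(n, m):
    """P[M = m]: m right-pocket draws before the (n+1)-th left-pocket choice."""
    return comb(n + m, m) * Fraction(1, 2) ** (n + 1 + m)

n = 4
# Probability that fewer than n + 1 matches were taken from the right pocket.
mass_up_to_n = sum(neg_binomial_pmf(n, m) for m in range(n + 1))
print(mass_up_to_n)  # prints 1/2
```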

Returning to the original problem, the left box is the one found empty exactly when $M<N+1$, that is, when the right box has not run out first. By symmetry, each box is equally likely to be the one found empty, so $P[M<N+1]=1/2$. Conditioning on this event, the number $K$ of matches remaining in the other box satisfies

$$P[K=k]=P[M=N-k\mid M<N+1]=2P[M=N-k]=\binom{2N-k}{N-k}\left(\frac{1}{2}\right)^{2N-k}.$$
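The resulting distribution is easy to tabulate. The sketch below is illustrative only: the name `matchbox_pmf` and the small value $N=3$ are assumptions chosen so the output can be read at a glance. It uses exact fractions to confirm that the probabilities for $k=0,\dots,N$ sum to one.

```python
from fractions import Fraction
from math import comb

def matchbox_pmf(n, k):
    """P[K = k]: probability that k matches remain in the other box."""
    return comb(2 * n - k, n - k) * Fraction(1, 2) ** (2 * n - k)

n = 3
pmf = [matchbox_pmf(n, k) for k in range(n + 1)]
for k, p in enumerate(pmf):
    print(k, p)
print(sum(pmf))  # prints 1
```

For $N=3$ the probabilities are $5/16$, $5/16$, $1/4$ and $1/8$ for $k=0,1,2,3$.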

The expectation of the distribution is approximately $2{\sqrt {N/\pi }}-1$. (This is shown using Stirling's approximation.^[2]) So starting with boxes with $N=40$ matches, the expected number of matches in the second box is about $6$.
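The quality of the approximation can be checked by summing $k\,P[K=k]$ directly. This is a rough sketch under the formulas above; the function names `expected_remaining` and `stirling_estimate` are hypothetical.

```python
from math import comb, pi, sqrt

def expected_remaining(n):
    """Exact E[K], summing k * P[K = k] over k = 0..n."""
    return sum(k * comb(2 * n - k, n - k) / 2 ** (2 * n - k) for k in range(n + 1))

def stirling_estimate(n):
    """The approximation 2 * sqrt(n / pi) - 1 quoted in the text."""
    return 2 * sqrt(n / pi) - 1

n = 40
print(round(expected_remaining(n), 2), round(stirling_estimate(n), 2))
```

For $N=40$ both values land slightly above $6$, and they differ by less than $0.1$.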

References

[1] Feller, William, An Introduction to Probability Theory and Its Applications, Third Edition, Wiley, 1968, Chapter VI, section 8.
[2] Feller, page 238.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
