Combination

In mathematics, a combination is a selection of items from a set that has distinct members, such that the order of selection does not matter (unlike permutations). For example, given three fruits, say an apple, an orange and a pear, there are three combinations of two that can be drawn from this set: an apple and a pear; an apple and an orange; or a pear and an orange. More formally, a k-combination of a set S is a subset of k distinct elements of S. Two combinations are therefore identical if and only if they have the same members; the arrangement of the members does not matter. If the set has n elements, the number of k-combinations, denoted by $C(n,k)$ or $C_k^n$, is equal to the binomial coefficient

$$\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1},$$

which can be written using factorials as $\frac{n!}{k!\,(n-k)!}$ whenever $k \le n$, and which is zero when $k > n$. This formula can be derived from the fact that each k-combination of a set S of n members has $k!$ permutations, so the number $P_k^n$ of k-permutations satisfies $P_k^n = C_k^n \times k!$, or $C_k^n = P_k^n / k!$. [1] The set of all k-combinations of a set S is often denoted by $\binom{S}{k}$.

A combination is a selection of n things taken k at a time without repetition. To refer to combinations in which repetition is allowed, the terms k-combination with repetition, k-multiset, [2] or k-selection, [3] are often used. [4] If, in the above example, it were possible to have two of any one kind of fruit there would be 3 more 2-selections: one with two apples, one with two oranges, and one with two pears.

Although the set of three fruits was small enough to write a complete list of combinations, this becomes impractical as the size of the set increases. For example, a poker hand can be described as a 5-combination (k = 5) of cards from a 52-card deck (n = 52). The 5 cards of the hand are all distinct, and the order of cards in the hand does not matter. There are 2,598,960 such combinations, and the chance of drawing any one particular hand at random is 1 / 2,598,960.

Number of k-combinations

Figure: 3-element subsets of a 5-element set.

The number of k-combinations from a given set S of n elements is often denoted in elementary combinatorics texts by $C(n,k)$, or by a variation such as $C_k^n$, ${}_nC_k$, ${}^nC_k$, $C_{n,k}$ or even $C_n^k$ [5] (the last form is standard in French, Romanian, Russian, and Chinese texts). [6] [7] The same number, however, occurs in many other mathematical contexts, where it is denoted by $\binom{n}{k}$ (often read as "n choose k"); notably it occurs as a coefficient in the binomial formula, hence its name binomial coefficient. One can define $\binom{n}{k}$ for all natural numbers k at once by the relation

$$(1 + X)^n = \sum_{k \ge 0} \binom{n}{k} X^k,$$

from which it is clear that

$$\binom{n}{0} = \binom{n}{n} = 1,$$

and further

$$\binom{n}{k} = 0$$

for $k > n$.

To see that these coefficients count k-combinations from S, one can first consider a collection of n distinct variables $X_s$ labeled by the elements s of S, and expand the product over all elements of S:

$$\prod_{s \in S} (1 + X_s);$$

it has $2^n$ distinct terms corresponding to all the subsets of S, each subset giving the product of the corresponding variables $X_s$. Now setting all of the $X_s$ equal to the unlabeled variable X, so that the product becomes $(1 + X)^n$, the term for each k-combination from S becomes $X^k$, so that the coefficient of that power in the result equals the number of such k-combinations.

Binomial coefficients can be computed explicitly in various ways. To get all of them for the expansions up to $(1 + X)^n$, one can use (in addition to the basic cases already given) the recursion relation

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$

for 0 < k < n, which follows from $(1 + X)^n = (1 + X)^{n-1}(1 + X)$; this leads to the construction of Pascal's triangle.
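The construction of Pascal's triangle can be sketched in a few lines of Python. This is only an illustration: the function name `pascal_rows` is a hypothetical choice, and the sketch assumes the recursion "each interior entry is the sum of the two entries above it" together with the basic cases that both ends of every row equal 1.

```python
def pascal_rows(n):
    """Return rows 0..n of Pascal's triangle as lists of integers."""
    rows = [[1]]  # row 0: C(0, 0) = 1
    for m in range(1, n + 1):
        prev = rows[-1]
        # Both ends of a row are 1; each interior entry C(m, k) is the
        # sum C(m-1, k-1) + C(m-1, k) of the two entries above it.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, m)] + [1]
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in pascal_rows(5):
        print(row)
```

Building the whole triangle is worthwhile when many coefficients up to row n are needed; for a single coefficient, a direct formula is usually cheaper.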

For determining an individual binomial coefficient, it is more practical to use the formula

$$\binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}.$$

The numerator gives the number of k-permutations of n, i.e., of sequences of k distinct elements of S, while the denominator gives the number of such k-permutations that give the same k-combination when the order is ignored.

When k exceeds n/2, the above formula contains factors common to the numerator and the denominator, and canceling them out gives the relation

$$\binom{n}{k} = \binom{n}{n-k}$$

for 0 ≤ k ≤ n. This expresses a symmetry that is evident from the binomial formula, and can also be understood in terms of k-combinations by taking the complement of such a combination, which is an (n − k)-combination.

Finally there is a formula which exhibits this symmetry directly, and has the merit of being easy to remember:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!},$$

where n! denotes the factorial of n. It is obtained from the multiplicative formula above by multiplying denominator and numerator by (n − k)!, so it is certainly computationally less efficient than that formula.

The last formula can be understood directly, by considering the n! permutations of all the elements of S. Each such permutation gives a k-combination by selecting its first k elements. There are many duplicate selections: any combined permutation of the first k elements among each other, and of the final (n − k) elements among each other, produces the same combination; this explains the division in the formula.

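From the above formulas follow relations between adjacent numbers in Pascal's triangle in all three directions:

$$\binom{n}{k} = \begin{cases} \binom{n}{k-1} \dfrac{n-k+1}{k} & \text{if } k > 0, \\[4pt] \binom{n-1}{k} \dfrac{n}{n-k} & \text{if } k < n, \\[4pt] \binom{n-1}{k-1} \dfrac{n}{k} & \text{if } n, k > 0. \end{cases}$$

Together with the basic cases $\binom{n}{0} = 1 = \binom{n}{n}$, these allow successive computation of respectively all numbers of combinations from the same set (a row in Pascal's triangle), of k-combinations of sets of growing sizes, and of combinations with a complement of fixed size n − k.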

Example of counting combinations

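As a specific example, one can compute the number of five-card hands possible from a standard fifty-two card deck as: [8]

$$\binom{52}{5} = \frac{52 \times 51 \times 50 \times 49 \times 48}{5 \times 4 \times 3 \times 2 \times 1} = \frac{311{,}875{,}200}{120} = 2{,}598{,}960.$$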

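Alternatively one may use the formula in terms of factorials and cancel the factors in the numerator against parts of the factors in the denominator, after which only multiplication of the remaining factors is required:

$$\binom{52}{5} = \frac{52!}{5!\,47!} = \frac{52 \times 51 \times 50 \times 49 \times 48 \times 47!}{5 \times 4 \times 3 \times 2 \times 1 \times 47!} = \frac{(26 \times 2) \times (17 \times 3) \times (10 \times 5) \times 49 \times (12 \times 4)}{5 \times 4 \times 3 \times 2} = 26 \times 17 \times 10 \times 49 \times 12 = 2{,}598{,}960.$$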

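Another alternative computation, equivalent to the first, is based on writing

$$\binom{n}{k} = \frac{(n-0)}{1} \times \frac{(n-1)}{2} \times \frac{(n-2)}{3} \times \cdots \times \frac{(n-(k-1))}{k},$$

which gives

$$\binom{52}{5} = \frac{52}{1} \times \frac{51}{2} \times \frac{50}{3} \times \frac{49}{4} \times \frac{48}{5} = 2{,}598{,}960.$$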

When evaluated in the following order, 52 ÷ 1 × 51 ÷ 2 × 50 ÷ 3 × 49 ÷ 4 × 48 ÷ 5, this can be computed using only integer arithmetic. The reason is that when each division occurs, the intermediate result that is produced is itself a binomial coefficient, so no remainders ever occur.
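The integer-only evaluation order described above can be sketched as follows. The function name `binomial` is an illustrative choice, and the use of the symmetry C(n, k) = C(n, n − k) to shorten the loop is an added convenience, not part of the worked example.

```python
def binomial(n, k):
    """Compute C(n, k) using only integer multiplication and exact division."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # symmetry keeps the loop short
    result = 1
    for i in range(1, k + 1):
        # Before this step result == C(n, i - 1); afterwards result == C(n, i),
        # so the floor division never discards a remainder.
        result = result * (n - i + 1) // i
    return result


if __name__ == "__main__":
    print(binomial(52, 5))  # number of five-card hands
```

Because every intermediate value is itself a binomial coefficient, the intermediate results stay no larger than the largest coefficient in the chain, unlike the factorial formula.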

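Using the symmetric formula in terms of factorials without performing simplifications gives a rather extensive calculation:

$$\binom{52}{5} = \frac{n!}{k!\,(n-k)!} = \frac{52!}{5!\,(52-5)!} = \frac{52!}{5!\,47!} = 2{,}598{,}960,$$

where 52! is a 68-digit number and 47! is a 60-digit number.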

Enumerating k-combinations

One can enumerate all k-combinations of a given set S of n elements in some fixed order, which establishes a bijection between an interval of integers and the set of those k-combinations. Assuming S is itself ordered, for instance S = { 1, 2, ..., n }, there are two natural possibilities for ordering its k-combinations: by comparing their smallest elements first (as in the illustrations above) or by comparing their largest elements first. The latter option has the advantage that adding a new largest element to S will not change the initial part of the enumeration, but just add the new k-combinations of the larger set after the previous ones. Repeating this process, the enumeration can be extended indefinitely with k-combinations of ever larger sets. If moreover the intervals of the integers are taken to start at 0, then the k-combination at a given place i in the enumeration can be computed easily from i, and the bijection so obtained is known as the combinatorial number system. In computational mathematics, the two directions of this bijection are also known as "ranking" and "unranking". [9] [10]
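A minimal sketch of ranking and unranking is given below, assuming zero-based elements 0, 1, 2, ... and the ordering that compares largest elements first. The function names `rank` and `unrank` are illustrative. The sketch relies on the standard fact that, in this ordering, the position of a combination with sorted elements c_1 < c_2 < ... < c_k is C(c_1, 1) + C(c_2, 2) + ... + C(c_k, k).

```python
from math import comb


def rank(combination):
    """Position of a k-combination of non-negative integers in the enumeration."""
    return sum(comb(c, j) for j, c in enumerate(sorted(combination), start=1))


def unrank(i, k):
    """Return the k-combination at position i (positions start at 0)."""
    elements = []
    for j in range(k, 0, -1):
        # Greedily find the largest c with C(c, j) <= i.
        c = j - 1  # C(j - 1, j) == 0, so this choice is always admissible
        while comb(c + 1, j) <= i:
            c += 1
        elements.append(c)
        i -= comb(c, j)
    return tuple(reversed(elements))


if __name__ == "__main__":
    for position in range(6):
        print(position, unrank(position, 2))
```

The linear search for c keeps the sketch short; for large positions a binary search over c would be a reasonable refinement.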

There are many ways to enumerate k-combinations. One way is to track k index numbers of the elements selected, starting with {0 .. k−1} (zero-based) or {1 .. k} (one-based) as the first allowed k-combination. Then, repeatedly move to the next allowed k-combination by incrementing the smallest index number for which this would not create two equal index numbers, at the same time resetting all smaller index numbers to their initial values.
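The index-tracking procedure just described can be sketched as a Python generator. This is an illustration under stated assumptions: indices are zero-based, the name `combinations_by_index` is hypothetical, and the enumeration stops when the only index that could be incremented would run past n − 1.

```python
def combinations_by_index(n, k):
    """Yield all k-combinations of range(n) as increasing index tuples."""
    if k < 0 or k > n:
        return
    c = list(range(k))  # first allowed combination: 0 .. k-1
    while True:
        yield tuple(c)
        # Find the smallest position whose increment does not collide
        # with the next larger index.
        j = 0
        while j < k - 1 and c[j] + 1 == c[j + 1]:
            j += 1
        # Stop when there is nothing to increment (k == 0) or when the
        # largest index would leave the range 0 .. n-1.
        if k == 0 or c[j] + 1 == n:
            return
        c[j] += 1
        c[:j] = range(j)  # reset all smaller indices to their initial values


if __name__ == "__main__":
    for combo in combinations_by_index(4, 2):
        print(combo)
```

This produces the combinations ordered by their largest elements first, which is the ordering used by the combinatorial number system.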

Number of combinations with repetition

A k-combination with repetitions, or k-multicombination, or multisubset of size k from a set S of size n is given by a sequence of k not necessarily distinct elements of S, where order is not taken into account: two sequences define the same multiset if one can be obtained from the other by permuting the terms. In other words, it is a sample of k elements from a set of n elements allowing for duplicates (i.e., with replacement) but disregarding different orderings (e.g. {2,1,2} = {1,2,2}). Associate an index to each element of S and think of the elements of S as types of objects; then we can let $x_i$ denote the number of elements of type i in a multisubset. The number of multisubsets of size k is then the number of nonnegative integer (so allowing zero) solutions of the Diophantine equation: [11]

$$x_1 + x_2 + \cdots + x_n = k.$$

If S has n elements, the number of such k-multisubsets is denoted by

$$\left(\!\!\binom{n}{k}\!\!\right),$$

a notation that is analogous to the binomial coefficient which counts k-subsets. This expression, n multichoose k, [12] can also be given in terms of binomial coefficients:

$$\left(\!\!\binom{n}{k}\!\!\right) = \binom{n+k-1}{k}.$$

This relationship can be easily proved using a representation known as stars and bars. [13]

Proof

A solution of the above Diophantine equation can be represented by $x_1$ stars, a separator (a bar), then $x_2$ more stars, another separator, and so on. The total number of stars in this representation is k and the number of bars is n − 1 (since a separation into n parts needs n − 1 separators). Thus, a string of k + n − 1 (or n + k − 1) symbols (stars and bars) corresponds to a solution if there are k stars in the string. Any solution can be represented by choosing k out of k + n − 1 positions to place stars and filling the remaining positions with bars. For example, the solution $x_1 = 3, x_2 = 2, x_3 = 0, x_4 = 5$ of the equation $x_1 + x_2 + x_3 + x_4 = 10$ (n = 4 and k = 10) can be represented by [14]

★★★|★★||★★★★★

The number of such strings is the number of ways to place 10 stars in 13 positions, $\binom{13}{10} = \binom{13}{3} = 286$, which is the number of 10-multisubsets of a set with 4 elements.

Figure: Bijection between 3-subsets of a 7-set (left) and 3-multisets with elements from a 5-set (right). This illustrates that $\binom{7}{3} = \left(\!\!\binom{5}{3}\!\!\right)$.

As with binomial coefficients, there are several relationships between these multichoose expressions. For example, for $n \ge 1, k \ge 0$,

$$\left(\!\!\binom{n}{k}\!\!\right) = \left(\!\!\binom{k+1}{n-1}\!\!\right).$$

This identity follows from interchanging the stars and bars in the above representation. [15]
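The multichoose count and the stars and bars encoding can be checked with a short Python sketch. The names `multichoose` and `stars_and_bars` are illustrative, an ASCII `*` stands in for a star, and the brute-force comparison uses the standard library's `itertools.combinations_with_replacement`.

```python
from itertools import combinations_with_replacement
from math import comb


def multichoose(n, k):
    """Number of k-multisubsets of an n-element set, via C(n + k - 1, k)."""
    if n == 0:
        return 1 if k == 0 else 0  # only the empty multiset exists
    return comb(n + k - 1, k)


def stars_and_bars(counts):
    """Encode a solution [x_1, ..., x_n] as stars separated by n - 1 bars."""
    return "|".join("*" * x for x in counts)


if __name__ == "__main__":
    print(multichoose(4, 10))
    print(stars_and_bars([3, 2, 0, 5]))
    # Brute-force check for a small case: list the multisets directly.
    brute_force = len(list(combinations_with_replacement(range(4), 3)))
    print(brute_force == multichoose(4, 3))
```

Swapping the roles of stars and bars in an encoded string turns a k-multisubset of an n-set into an (n − 1)-multisubset of a (k + 1)-set, which is the identity stated above.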

Example of counting multisubsets

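For example, if you have four types of donuts (n = 4) on a menu to choose from and you want three donuts (k = 3), the number of ways to choose the donuts with repetition can be calculated as

$$\left(\!\!\binom{4}{3}\!\!\right) = \binom{4+3-1}{3} = \binom{6}{3} = \frac{6 \times 5 \times 4}{3 \times 2 \times 1} = 20.$$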

This result can be verified by listing all the 3-multisubsets of the set S = {1,2,3,4}. This is displayed in the following table. [16] The second column lists the donuts you actually chose, the third column shows the nonnegative integer solutions $[x_1, x_2, x_3, x_4]$ of the equation $x_1 + x_2 + x_3 + x_4 = 3$, and the last column gives the stars and bars representation of the solutions. [17]

No.   3-multiset   Eq. solution   Stars and bars
1     {1,1,1}      [3,0,0,0]      ★★★|||
2     {1,1,2}      [2,1,0,0]      ★★|★||
3     {1,1,3}      [2,0,1,0]      ★★||★|
4     {1,1,4}      [2,0,0,1]      ★★|||★
5     {1,2,2}      [1,2,0,0]      ★|★★||
6     {1,2,3}      [1,1,1,0]      ★|★|★|
7     {1,2,4}      [1,1,0,1]      ★|★||★
8     {1,3,3}      [1,0,2,0]      ★||★★|
9     {1,3,4}      [1,0,1,1]      ★||★|★
10    {1,4,4}      [1,0,0,2]      ★|||★★
11    {2,2,2}      [0,3,0,0]      |★★★||
12    {2,2,3}      [0,2,1,0]      |★★|★|
13    {2,2,4}      [0,2,0,1]      |★★||★
14    {2,3,3}      [0,1,2,0]      |★|★★|
15    {2,3,4}      [0,1,1,1]      |★|★|★
16    {2,4,4}      [0,1,0,2]      |★||★★
17    {3,3,3}      [0,0,3,0]      ||★★★|
18    {3,3,4}      [0,0,2,1]      ||★★|★
19    {3,4,4}      [0,0,1,2]      ||★|★★
20    {4,4,4}      [0,0,0,3]      |||★★★

Number of k-combinations for all k

The number of k-combinations for all k is the number of subsets of a set of n elements. There are several ways to see that this number is $2^n$. In terms of combinations, $\sum_{0 \le k \le n} \binom{n}{k} = 2^n$, which is the sum of the nth row (counting from 0) of the binomial coefficients in Pascal's triangle. These combinations (subsets) are enumerated by the 1 digits of the set of base 2 numbers counting from 0 to $2^n - 1$, where each digit position is an item from the set of n.

Given 3 cards numbered 1 to 3, there are 8 distinct combinations (subsets), including the empty set:

$$|\{\{\}; \{1\}; \{2\}; \{1,2\}; \{3\}; \{1,3\}; \{2,3\}; \{1,2,3\}\}| = 2^3 = 8$$

Representing these subsets (in the same order) as base 2 numerals:

0: 000
1: 001
2: 010
3: 011
4: 100
5: 101
6: 110
7: 111
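The correspondence between subsets and base 2 numerals can be sketched in Python as below. The function name `subsets_by_bitmask` is illustrative, and the sketch assumes that the least significant binary digit corresponds to the first item of the list.

```python
def subsets_by_bitmask(items):
    """Yield every subset of items, one per integer from 0 to 2**n - 1."""
    n = len(items)
    for mask in range(2 ** n):
        # Item i belongs to the subset exactly when bit i of mask is 1.
        yield [items[i] for i in range(n) if (mask >> i) & 1]


if __name__ == "__main__":
    for number, subset in enumerate(subsets_by_bitmask([1, 2, 3])):
        print(format(number, "03b"), subset)
```

For the three cards numbered 1 to 3 this visits the eight subsets in the same order as the list above.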

Probability: sampling a random combination

There are various algorithms to pick out a random combination from a given set or list. Rejection sampling is extremely slow for large sample sizes. One way to select a k-combination efficiently from a population of size n is to iterate across each element of the population, and at each step pick that element with a dynamically changing probability of $\frac{k - \#\text{samples chosen}}{n - \#\text{samples visited}}$ (see Reservoir sampling). Another is to pick a random non-negative integer less than $\binom{n}{k}$ and convert it into a combination using the combinatorial number system.
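The first approach, a single pass with a selection probability that changes at every step, can be sketched as follows. The function name `random_combination` and the optional `rng` parameter are illustrative assumptions; the sketch requires the population size to be known in advance, and it picks each element with probability (k − number chosen so far) / (n − number visited so far).

```python
import random


def random_combination(population, k, rng=random):
    """Pick k items from population in one pass, keeping their original order."""
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    chosen = []
    for visited, item in enumerate(population):
        # Pick this item with probability (k - len(chosen)) / (n - visited).
        # The probability reaches 1 when every remaining item is needed
        # and 0 once k items have been chosen.
        if rng.random() * (n - visited) < k - len(chosen):
            chosen.append(item)
    return chosen


if __name__ == "__main__":
    print(random_combination(list(range(1, 53)), 5))
```

Passing a seeded `random.Random` instance as `rng` makes runs reproducible. In practice Python's standard `random.sample` serves the same purpose.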

Number of ways to put objects into bins

A combination can also be thought of as a selection of two sets of items: those that go into the chosen bin and those that go into the unchosen bin. This can be generalized to any number of bins with the constraint that every item must go to exactly one bin. The number of ways to put objects into bins is given by the multinomial coefficient

$$\binom{n}{k_1, k_2, \ldots, k_m} = \frac{n!}{k_1!\, k_2! \cdots k_m!},$$

where n is the number of items, m is the number of bins, and $k_i$ is the number of items that go into bin i.

One way to see why this equation holds is to first number the objects arbitrarily from 1 to n and put the objects with numbers $1, 2, \ldots, k_1$ into the first bin in order, the objects with numbers $k_1 + 1, k_1 + 2, \ldots, k_1 + k_2$ into the second bin in order, and so on. There are $n!$ distinct numberings, but many of them are equivalent, because only the set of items in a bin matters, not their order in it. Every combined permutation of each bin's contents produces an equivalent way of putting items into bins. As a result, every equivalence class consists of $k_1!\, k_2! \cdots k_m!$ distinct numberings, and the number of equivalence classes is $\frac{n!}{k_1!\, k_2! \cdots k_m!}$.

The binomial coefficient is the special case where k items go into the chosen bin and the remaining $n - k$ items go into the unchosen bin:

$$\binom{n}{k} = \binom{n}{k, n-k} = \frac{n!}{k!\,(n-k)!}.$$
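A multinomial coefficient can be computed without large factorials by filling the bins one at a time. The sketch below swaps the factorial formula for an equivalent product of binomial coefficients: choose the contents of the first bin from all n items, then the second bin from what remains, and so on. The function name `multinomial` is illustrative.

```python
from math import comb


def multinomial(*bin_sizes):
    """Number of ways to distribute sum(bin_sizes) distinct items into bins
    of the given sizes."""
    result = 1
    remaining = sum(bin_sizes)
    for size in bin_sizes:
        # Choose which of the remaining items go into this bin.
        result *= comb(remaining, size)
        remaining -= size
    return result


if __name__ == "__main__":
    print(multinomial(5, 47))   # two bins: the same as 52 choose 5
    print(multinomial(2, 2, 2))
```

With exactly two bins the result reduces to the ordinary binomial coefficient, matching the special case above.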

Notes

  1. Reichl, Linda E. (2016). "2.2. Counting Microscopic States". A Modern Course in Statistical Physics. WILEY-VCH. p. 30. ISBN 978-3-527-69048-0.
  2. Mazur 2010, p. 10.
  3. Ryser 1963, p. 7; also referred to as an unordered selection.
  4. When the term combination is used to refer to either situation (as in Brualdi 2010), care must be taken to clarify whether sets or multisets are being discussed.
  5. Uspensky 1937, p. 18.
  6. High School Textbook for full-time student (Required) Mathematics Book II B (in Chinese) (2nd ed.). China: People's Education Press. June 2006. pp. 107–116. ISBN 978-7-107-19616-4.
  7. 人教版高中数学选修2-3 [Mathematics textbook, volume 2-3, for senior high school, People's Education Press]. People's Education Press. p. 21. Archived from the original on 7 April 2023.
  8. Mazur 2010, p. 21.
  9. Lucia Moura. "Generating Elementary Combinatorial Objects" (PDF). Site.uottawa.ca. Archived (PDF) from the original on 9 October 2022. Retrieved 10 April 2017.
  10. "SAGE : Subsets" (PDF). Sagemath.org. Retrieved 10 April 2017.
  11. Brualdi 2010, p. 52.
  12. Benjamin & Quinn 2003, p. 70.
  13. In the article Stars and bars (combinatorics) the roles of n and k are reversed.
  14. Benjamin & Quinn 2003, pp. 71–72.
  15. Benjamin & Quinn 2003, p. 72 (identity 145).
  16. Benjamin & Quinn 2003, p. 71.
  17. Mazur 2010, p. 10, where the stars and bars are written as binary numbers, with stars = 0 and bars = 1.

References