Multinomial theorem

Last updated November 04, 2023

In mathematics, the multinomial theorem describes how to expand a power of a sum in terms of powers of the terms in that sum. It is the generalization of the binomial theorem from binomials to multinomials.

Theorem
Example
Alternate expression
Proof
Multinomial coefficients
Sum of all multinomial coefficients
Number of multinomial coefficients
Valuation of multinomial coefficients
Asymptotics
Interpretations
Ways to put objects into bins
Number of ways to select according to a distribution
Number of unique permutations of words
Generalized Pascal's triangle
See also
References

Theorem

For any positive integer $m$ and any non-negative integer $n$, the multinomial formula describes how a sum with $m$ terms expands when raised to an arbitrary power $n$:

(x_{1}+x_{2}+\cdots +x_{m})^{n}=\sum _{k_{1}+k_{2}+\cdots +k_{m}=n;\ k_{1},k_{2},\cdots ,k_{m}\geq 0}{n \choose k_{1},k_{2},\ldots ,k_{m}}\prod _{t=1}^{m}x_{t}^{k_{t}}\,,

where

{n \choose k_{1},k_{2},\ldots ,k_{m}}={\frac {n!}{k_{1}!\,k_{2}!\cdots k_{m}!}}

is a multinomial coefficient. The sum is taken over all combinations of non-negative integer indices $k_1$ through $k_m$ such that the sum of all $k_i$ is $n$. That is, for each term in the expansion, the exponents of the $x_i$ must add up to $n$. Also, as with the binomial theorem, quantities of the form $x^0$ that appear are taken to equal 1 (even when $x$ equals zero).

In the case $m = 2$, this statement reduces to that of the binomial theorem.
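The statement of the theorem can be checked numerically. The following is a minimal sketch, not a reference implementation: the helper names `multinomial`, `compositions` and `expand_power` are illustrative choices, and the sample values are arbitrary integers. It evaluates the right-hand side term by term and compares it with the directly computed power.

```python
from math import factorial, prod

def multinomial(ks):
    """Return n! / (k_1! k_2! ... k_m!) where n = sum(ks)."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)  # each intermediate quotient is an integer
    return result

def compositions(n, m):
    """Yield every tuple of m non-negative integers that sums to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest

def expand_power(xs, n):
    """Evaluate (x_1 + ... + x_m)^n using the multinomial sum."""
    return sum(
        multinomial(ks) * prod(x ** k for x, k in zip(xs, ks))
        for ks in compositions(n, len(xs))
    )

xs = (2, -3, 5, 7)  # arbitrary sample values
n = 6
assert expand_power(xs, n) == sum(xs) ** n
```

Python evaluates `0 ** 0` as 1, which matches the convention for $x^0$ stated above.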

Example

The third power of the trinomial $a + b + c$ is given by

(a+b+c)^{3}=a^{3}+b^{3}+c^{3}+3a^{2}b+3a^{2}c+3b^{2}a+3b^{2}c+3c^{2}a+3c^{2}b+6abc.

This can be computed by hand using the distributive property of multiplication over addition, but it can also be done (perhaps more easily) with the multinomial theorem. It is possible to "read off" the multinomial coefficients from the terms by using the multinomial coefficient formula. For example:

a^{2}b^{0}c^{1}

has the coefficient

{3 \choose 2,0,1}={\frac {3!}{2!\cdot 0!\cdot 1!}}={\frac {6}{2\cdot 1\cdot 1}}=3.

a^{1}b^{1}c^{1}

has the coefficient

{3 \choose 1,1,1}={\frac {3!}{1!\cdot 1!\cdot 1!}}={\frac {6}{1\cdot 1\cdot 1}}=6.
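The coefficients of this example can also be tabulated mechanically. The sketch below assumes nothing beyond the coefficient formula; the dictionary `terms` and the helper `multinomial` are illustrative names. It lists every exponent triple $(i, j, k)$ with $i + j + k = 3$ together with its coefficient.

```python
from math import factorial

def multinomial(ks):
    """Return n! / (k_1! ... k_m!) where n = sum(ks)."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)
    return result

n = 3
terms = {}
for i in range(n + 1):
    for j in range(n + 1 - i):
        k = n - i - j  # the third exponent is forced by i + j + k = n
        terms[(i, j, k)] = multinomial((i, j, k))

# Print one line per monomial a^i b^j c^k with its coefficient.
for exponents, coefficient in sorted(terms.items(), reverse=True):
    print(coefficient, "* a^%d b^%d c^%d" % exponents)
```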

Alternate expression

The statement of the theorem can be written concisely using multiindices:

(x_{1}+\cdots +x_{m})^{n}=\sum _{|\alpha |=n}{n \choose \alpha }x^{\alpha }

where

\alpha =(\alpha _{1},\alpha _{2},\dots ,\alpha _{m})

and

x^{\alpha }=x_{1}^{\alpha _{1}}x_{2}^{\alpha _{2}}\cdots x_{m}^{\alpha _{m}}

Proof

This proof of the multinomial theorem uses the binomial theorem and induction on $m$.

First, for $m = 1$, both sides equal $x_1^n$ since there is only one term $k_1 = n$ in the sum. For the induction step, suppose the multinomial theorem holds for $m$. Then

{\begin{aligned}&(x_{1}+x_{2}+\cdots +x_{m}+x_{m+1})^{n}=(x_{1}+x_{2}+\cdots +(x_{m}+x_{m+1}))^{n}\\[6pt]={}&\sum _{k_{1}+k_{2}+\cdots +k_{m-1}+K=n}{n \choose k_{1},k_{2},\ldots ,k_{m-1},K}x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{m-1}^{k_{m-1}}(x_{m}+x_{m+1})^{K}\end{aligned}}

by the induction hypothesis. Applying the binomial theorem to the last factor,

=\sum _{k_{1}+k_{2}+\cdots +k_{m-1}+K=n}{n \choose k_{1},k_{2},\ldots ,k_{m-1},K}x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{m-1}^{k_{m-1}}\sum _{k_{m}+k_{m+1}=K}{K \choose k_{m},k_{m+1}}x_{m}^{k_{m}}x_{m+1}^{k_{m+1}}

=\sum _{k_{1}+k_{2}+\cdots +k_{m-1}+k_{m}+k_{m+1}=n}{n \choose k_{1},k_{2},\ldots ,k_{m-1},k_{m},k_{m+1}}x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{m-1}^{k_{m-1}}x_{m}^{k_{m}}x_{m+1}^{k_{m+1}}

which completes the induction. The last step follows because

{n \choose k_{1},k_{2},\ldots ,k_{m-1},K}{K \choose k_{m},k_{m+1}}={n \choose k_{1},k_{2},\ldots ,k_{m-1},k_{m},k_{m+1}},

as can easily be seen by writing the three coefficients using factorials as follows:

{\frac {n!}{k_{1}!k_{2}!\cdots k_{m-1}!K!}}{\frac {K!}{k_{m}!k_{m+1}!}}={\frac {n!}{k_{1}!k_{2}!\cdots k_{m+1}!}}.

Multinomial coefficients

The numbers

{n \choose k_{1},k_{2},\ldots ,k_{m}}

appearing in the theorem are the multinomial coefficients. They can be expressed in numerous ways, including as a product of binomial coefficients or of factorials:

{n \choose k_{1},k_{2},\ldots ,k_{m}}={\frac {n!}{k_{1}!\,k_{2}!\cdots k_{m}!}}={k_{1} \choose k_{1}}{k_{1}+k_{2} \choose k_{2}}\cdots {k_{1}+k_{2}+\cdots +k_{m} \choose k_{m}}
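As a quick consistency check of the two expressions, the following sketch computes the coefficient both ways. The function names are illustrative; `math.comb` requires Python 3.8 or later.

```python
from math import comb, factorial

def multinomial_factorials(ks):
    """n! / (k_1! ... k_m!) computed directly from factorials."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)
    return result

def multinomial_binomials(ks):
    """Product of binomials C(k_1, k_1) C(k_1+k_2, k_2) ... C(k_1+...+k_m, k_m)."""
    result = 1
    running_total = 0
    for k in ks:
        running_total += k
        result *= comb(running_total, k)
    return result

ks = (1, 4, 4, 2)
assert multinomial_factorials(ks) == multinomial_binomials(ks)
```

The binomial form avoids building the full $n!$, which can matter when only a few large coefficients are needed.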

Sum of all multinomial coefficients

The substitution of $x_i = 1$ for all $i$ into the multinomial theorem

\sum _{k_{1}+k_{2}+\cdots +k_{m}=n}{n \choose k_{1},k_{2},\ldots ,k_{m}}x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{m}^{k_{m}}=(x_{1}+x_{2}+\cdots +x_{m})^{n}

gives immediately that

\sum _{k_{1}+k_{2}+\cdots +k_{m}=n}{n \choose k_{1},k_{2},\ldots ,k_{m}}=m^{n}.
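This identity is easy to confirm for small cases. The sketch below is illustrative only; `compositions` and `coefficient_sum` are names chosen here, not part of any library.

```python
from math import factorial

def multinomial(ks):
    """Return n! / (k_1! ... k_m!) where n = sum(ks)."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)
    return result

def compositions(n, m):
    """Yield every tuple of m non-negative integers that sums to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest

def coefficient_sum(n, m):
    """Sum of all multinomial coefficients for m terms and power n."""
    return sum(multinomial(ks) for ks in compositions(n, m))

assert coefficient_sum(4, 3) == 3 ** 4
```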

Number of multinomial coefficients

The number of terms in a multinomial sum, $\#_{n,m}$, is equal to the number of monomials of degree $n$ on the variables $x_1, \dots, x_m$:

\#_{n,m}={n+m-1 \choose m-1}.

The count can be performed easily using the method of stars and bars.
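The stars-and-bars argument can be made concrete. In the sketch below (an illustration under the usual encoding, with the hypothetical helper name `compositions_by_bars`), each choice of $m - 1$ bar positions among $n + m - 1$ slots is decoded into one exponent tuple, so the number of tuples equals the binomial coefficient above.

```python
from itertools import combinations
from math import comb

def compositions_by_bars(n, m):
    """Yield exponent tuples (k_1, ..., k_m) summing to n via stars and bars."""
    slots = n + m - 1
    for bars in combinations(range(slots), m - 1):
        # Sentinel bars on both ends; the stars between consecutive bars form one part.
        edges = (-1,) + bars + (slots,)
        yield tuple(edges[i + 1] - edges[i] - 1 for i in range(m))

n, m = 3, 3
tuples = list(compositions_by_bars(n, m))
assert all(sum(ks) == n for ks in tuples)
assert len(set(tuples)) == comb(n + m - 1, m - 1)
```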

Valuation of multinomial coefficients

The largest power of a prime $p$ that divides a multinomial coefficient may be computed using a generalization of Kummer's theorem.
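The source does not spell out the generalization, so the following sketch rests on an assumption: it uses Legendre's formula, which gives the exponent of $p$ in $n!$ as $(n - s_p(n))/(p - 1)$ with $s_p$ the base-$p$ digit sum. Applied to each factorial, the valuation of the multinomial coefficient becomes $(s_p(k_1) + \cdots + s_p(k_m) - s_p(n))/(p - 1)$. The function names are illustrative, and the digit-sum result is compared with a direct computation.

```python
from math import factorial

def digit_sum(n, p):
    """Sum of the digits of n written in base p."""
    total = 0
    while n:
        n, remainder = divmod(n, p)
        total += remainder
    return total

def valuation_by_digits(ks, p):
    """Exponent of the prime p in the multinomial coefficient, via digit sums."""
    n = sum(ks)
    return (sum(digit_sum(k, p) for k in ks) - digit_sum(n, p)) // (p - 1)

def valuation_direct(ks, p):
    """Same exponent, found by dividing the coefficient by p repeatedly."""
    value = factorial(sum(ks))
    for k in ks:
        value //= factorial(k)
    exponent = 0
    while value % p == 0:
        value //= p
        exponent += 1
    return exponent

ks = (1, 4, 4, 2)
for p in (2, 3, 5, 7, 11, 13):
    assert valuation_by_digits(ks, p) == valuation_direct(ks, p)
```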

Asymptotics

By Stirling's approximation, or equivalently the log-gamma function's asymptotic expansion,

\log {\binom {kn}{n,n,\cdots ,n}}=kn\log(k)+{\frac {1}{2}}\left(\log(k)-(k-1)\log(2\pi n)\right)-{\frac {k^{2}-1}{12kn}}+{\frac {k^{4}-1}{360k^{3}n^{3}}}-{\frac {k^{6}-1}{1260k^{5}n^{5}}}+O\left({\frac {1}{n^{6}}}\right)

so for example,

{\binom {2n}{n}}\sim {\frac {2^{2n}}{\sqrt {n\pi }}}
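The expansion can be compared with an exact evaluation through the log-gamma function. This is a rough numerical sketch; the function names and the sample values of $k$ and $n$ are arbitrary choices, and the comparison is limited by floating-point precision.

```python
from math import lgamma, log, pi

def log_multinomial_exact(k, n):
    """log of (kn)! / (n!)^k computed with the log-gamma function."""
    return lgamma(k * n + 1) - k * lgamma(n + 1)

def log_multinomial_series(k, n):
    """The asymptotic expansion above, truncated after the 1/n^5 term."""
    return (
        k * n * log(k)
        + 0.5 * (log(k) - (k - 1) * log(2 * pi * n))
        - (k**2 - 1) / (12 * k * n)
        + (k**4 - 1) / (360 * k**3 * n**3)
        - (k**6 - 1) / (1260 * k**5 * n**5)
    )

k, n = 3, 50
error = abs(log_multinomial_exact(k, n) - log_multinomial_series(k, n))
print("absolute error in the logarithm:", error)

# Central binomial case: log C(2n, n) minus log(2^(2n) / sqrt(n*pi)) tends to 0.
n = 1000
gap = log_multinomial_exact(2, n) - (2 * n * log(2) - 0.5 * log(pi * n))
print("log-ratio for the central binomial coefficient:", gap)
```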

Interpretations

Ways to put objects into bins

The multinomial coefficients have a direct combinatorial interpretation, as the number of ways of depositing $n$ distinct objects into $m$ distinct bins, with $k_1$ objects in the first bin, $k_2$ objects in the second bin, and so on.[1]
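For small cases this interpretation can be verified by brute force. The sketch below uses the illustrative name `count_assignments`; it tries every assignment of objects to bins and keeps those with the required occupancy. It is exponential in $n$ and intended only as a check.

```python
from itertools import product
from math import factorial

def multinomial(ks):
    """Return n! / (k_1! ... k_m!) where n = sum(ks)."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)
    return result

def count_assignments(ks):
    """Count ways to place sum(ks) distinct objects so that bin b holds ks[b] of them."""
    n, m = sum(ks), len(ks)
    target = tuple(ks)
    return sum(
        1
        for assignment in product(range(m), repeat=n)  # assignment[i] = bin of object i
        if tuple(assignment.count(b) for b in range(m)) == target
    )

ks = (2, 2, 1)
assert count_assignments(ks) == multinomial(ks)
```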

Number of ways to select according to a distribution

In statistical mechanics and combinatorics, if one has a number distribution of labels, then the multinomial coefficients naturally arise from the binomial coefficients. Given a number distribution $\{n_i\}$ on a set of $N$ total items, $n_i$ represents the number of items to be given the label $i$. (In statistical mechanics $i$ is the label of the energy state.)

The number of arrangements is found by

Choosing $n_1$ of the total $N$ to be labeled 1. This can be done ${\tbinom {N}{n_{1}}}$ ways.
From the remaining $N - n_1$ items choose $n_2$ to label 2. This can be done ${\tbinom {N-n_{1}}{n_{2}}}$ ways.
From the remaining $N - n_1 - n_2$ items choose $n_3$ to label 3. Again, this can be done ${\tbinom {N-n_{1}-n_{2}}{n_{3}}}$ ways.

Multiplying the number of choices at each step results in:

{N \choose n_{1}}{N-n_{1} \choose n_{2}}{N-n_{1}-n_{2} \choose n_{3}}\cdots ={\frac {N!}{(N-n_{1})!n_{1}!}}\cdot {\frac {(N-n_{1})!}{(N-n_{1}-n_{2})!n_{2}!}}\cdot {\frac {(N-n_{1}-n_{2})!}{(N-n_{1}-n_{2}-n_{3})!n_{3}!}}\cdots .

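Each denominator factor of the form $(N - n_1 - \cdots)!$ cancels against the numerator of the next fraction, so the product reduces to $N!/(n_1!\,n_2!\,n_3!\cdots)$, which is the multinomial coefficient formula given above.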
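As a small worked instance (the values $N = 6$ and $(n_1, n_2, n_3) = (3, 2, 1)$ are chosen here purely for illustration), the step-by-step product and the closed form agree:

```latex
{6 \choose 3}{3 \choose 2}{1 \choose 1}
  = \frac{6!}{3!\,3!}\cdot\frac{3!}{1!\,2!}\cdot\frac{1!}{0!\,1!}
  = 20\cdot 3\cdot 1 = 60
  = \frac{6!}{3!\,2!\,1!} = \frac{720}{12}.
```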

Number of unique permutations of words

The multinomial coefficient

{\binom {n}{k_{1},\ldots ,k_{m}}}

is also the number of distinct ways to permute a multiset of $n$ elements, where $k_i$ is the multiplicity of the $i$th element. For example, the number of distinct permutations of the letters of the word MISSISSIPPI, which has 1 M, 4 Is, 4 Ss, and 2 Ps, is

{11 \choose 1,4,4,2}={\frac {11!}{1!\,4!\,4!\,2!}}=34650.
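The same count can be computed from letter multiplicities. In this sketch the function name `distinct_permutations` is illustrative; the brute-force comparison uses the shorter word BANANA (an example chosen here) because enumerating all 11! orderings of MISSISSIPPI would be slow.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_permutations(word):
    """Number of distinct rearrangements of the letters of word."""
    count = factorial(len(word))
    for multiplicity in Counter(word).values():
        count //= factorial(multiplicity)
    return count

assert distinct_permutations("MISSISSIPPI") == 34650

# Brute-force cross-check on a word small enough to enumerate.
assert distinct_permutations("BANANA") == len(set(permutations("BANANA")))
```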

Generalized Pascal's triangle

One can use the multinomial theorem to generalize Pascal's triangle or Pascal's pyramid to Pascal's simplex. This provides a quick way to generate a lookup table for multinomial coefficients.
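One way to build such a table is layer by layer, in the same spirit as Pascal's rule: each coefficient in layer $n$ is the sum of the coefficients in layer $n - 1$ whose exponent tuple differs by one in a single position. The sketch below is an illustration of that recurrence, with the names `next_layer` and `simplex_layer` chosen here; it represents a layer as a dictionary from exponent tuples to coefficients.

```python
def next_layer(layer, m):
    """Build layer n+1 of Pascal's simplex from layer n (multiply by x_1 + ... + x_m)."""
    new_layer = {}
    for ks, coefficient in layer.items():
        for i in range(m):
            # Raising the exponent of x_i by one contributes to exactly one new monomial.
            bumped = ks[:i] + (ks[i] + 1,) + ks[i + 1:]
            new_layer[bumped] = new_layer.get(bumped, 0) + coefficient
    return new_layer

def simplex_layer(n, m):
    """Return {exponent tuple: multinomial coefficient} for power n and m terms."""
    layer = {(0,) * m: 1}
    for _ in range(n):
        layer = next_layer(layer, m)
    return layer

table = simplex_layer(3, 3)
assert table[(1, 1, 1)] == 6
assert table[(2, 0, 1)] == 3
```

Only additions are needed, so no factorials are computed while filling the table.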

References

[1] National Institute of Standards and Technology (May 11, 2010). "NIST Digital Library of Mathematical Functions". Section 26.4. Retrieved August 30, 2010.
