Muirhead's inequality

Last updated March 20, 2024

In mathematics, Muirhead's inequality, named after Robert Franklin Muirhead, also known as the "bunching" method, generalizes the inequality of arithmetic and geometric means.

Preliminary definitions

a-mean

For any real vector

a=(a_{1},\dots ,a_{n})

define the "a-mean" [a] of positive real numbers x₁, ..., x_n by

[a]={\frac {1}{n!}}\sum _{\sigma }x_{\sigma _{1}}^{a_{1}}\cdots x_{\sigma _{n}}^{a_{n}},

where the sum extends over all permutations σ of { 1, ..., n }.
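The definition can be evaluated directly by brute force. The following is a minimal sketch, not an optimized implementation; the function name `a_mean` is a hypothetical helper introduced here for illustration, and the approach is only practical for small n because it enumerates all n! permutations.

```python
from itertools import permutations
from math import factorial, prod


def a_mean(a, x):
    """Return the a-mean [a] of the positive reals x for the exponent vector a."""
    if len(a) != len(x):
        raise ValueError("a and x must have the same length")
    n = len(x)
    # Sum x_{sigma_1}^{a_1} * ... * x_{sigma_n}^{a_n} over all n! orderings of x.
    total = sum(
        prod(xs ** ai for xs, ai in zip(perm, a))
        for perm in permutations(x)
    )
    return total / factorial(n)
```

For instance, `a_mean((2, 0), (1, 3))` averages 1² and 3², and `a_mean((1, 1), (1, 3))` averages the product 1·3 taken twice.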

When the elements of a are nonnegative integers, the a-mean can be equivalently defined via the monomial symmetric polynomial $m_{a}(x_{1},\dots ,x_{n})$ as

[a]={\frac {k_{1}!\cdots k_{\ell }!}{n!}}m_{a}(x_{1},\dots ,x_{n}),

where ℓ is the number of distinct elements in a, and k₁, ..., k_ℓ are their multiplicities.
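As a small worked check of this formula (an illustrative case chosen here, not taken from the source), take n = 3 and a = (1, 1, 0). The distinct elements are 1 and 0, with multiplicities 2 and 1:

```latex
\begin{aligned}
m_{(1,1,0)}(x_{1},x_{2},x_{3}) &= x_{1}x_{2}+x_{1}x_{3}+x_{2}x_{3},\\
[a] &= \frac{2!\,1!}{3!}\,m_{(1,1,0)}(x_{1},x_{2},x_{3}) = \frac{x_{1}x_{2}+x_{1}x_{3}+x_{2}x_{3}}{3}.
\end{aligned}
```

This agrees with the permutation definition: among the 3! = 6 terms, each of the three monomials appears twice, and dividing by 6 gives the same result.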

Notice that the a-mean as defined above only has the usual properties of a mean (e.g., that the mean of equal numbers is equal to them) if $a_{1}+\cdots +a_{n}=1$ . In the general case, one can consider instead $[a]^{1/(a_{1}+\cdots +a_{n})}$ , which is called a Muirhead mean.^[1]

Examples

For a = (1, 0, ..., 0), the a-mean is just the ordinary arithmetic mean of x₁, ..., x_n.
For a = (1/n, ..., 1/n), the a-mean is the geometric mean of x₁, ..., x_n.
For n = 2 and a = (t, 1 − t) with 0 ≤ t ≤ 1, the a-mean is the Heinz mean of x₁ and x₂.
The Muirhead mean for a = (−1, 0, ..., 0) is the harmonic mean.
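The examples above can be checked numerically. The sketch below assumes hypothetical helper names `a_mean` and `muirhead_mean` and compares the results against the reference implementations in Python's standard `statistics` module; the sample point is arbitrary.

```python
from itertools import permutations
from math import factorial, isclose, prod
from statistics import geometric_mean, harmonic_mean, mean


def a_mean(a, x):
    """Brute-force a-mean: average over all orderings of x."""
    total = sum(
        prod(xs ** ai for xs, ai in zip(perm, a))
        for perm in permutations(x)
    )
    return total / factorial(len(x))


def muirhead_mean(a, x):
    """Muirhead mean [a]^(1/(a_1 + ... + a_n)); requires a nonzero exponent sum."""
    return a_mean(a, x) ** (1 / sum(a))


x = (1.0, 2.0, 4.0)  # arbitrary positive sample point
n = len(x)

arith = a_mean((1, 0, 0), x)          # a = (1, 0, ..., 0)
geom = a_mean((1 / n,) * n, x)        # a = (1/n, ..., 1/n)
harm = muirhead_mean((-1, 0, 0), x)   # Muirhead mean for a = (-1, 0, ..., 0)

print(isclose(arith, mean(x)), isclose(geom, geometric_mean(x)), isclose(harm, harmonic_mean(x)))
```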

Doubly stochastic matrices

An n × n matrix P is doubly stochastic precisely if both P and its transpose P^T are stochastic matrices. A stochastic matrix is a square matrix of nonnegative real entries in which the sum of the entries in each column is 1. Thus, a doubly stochastic matrix is a square matrix of nonnegative real entries in which the sum of the entries in each row and the sum of the entries in each column is 1.
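The definition translates into a direct check. This is a minimal sketch in which the function name `is_doubly_stochastic` and the tolerance parameter `tol` are assumptions introduced for illustration; matrices are plain lists of rows.

```python
from math import isclose


def is_doubly_stochastic(P, tol=1e-9):
    """Return True if P is square, nonnegative, and every row and column sums to 1."""
    n = len(P)
    if any(len(row) != n for row in P):
        return False  # not square
    if any(entry < 0 for row in P for entry in row):
        return False  # negative entry
    rows_ok = all(isclose(sum(row), 1.0, abs_tol=tol) for row in P)
    cols_ok = all(
        isclose(sum(P[i][j] for i in range(n)), 1.0, abs_tol=tol)
        for j in range(n)
    )
    return rows_ok and cols_ok
```

A matrix such as `[[0.5, 0.5], [1.0, 0.0]]` has rows summing to 1 but columns summing to 1.5 and 0.5, so it fails the check.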

Statement

Muirhead's inequality states that [a] ≤ [b] for all x such that x_i > 0 for every i ∈ { 1, ..., n } if and only if there is some doubly stochastic matrix P for which a = Pb.

Furthermore, in that case we have [a] = [b] if and only if a = b or all x_i are equal.

The condition that a = Pb for some doubly stochastic matrix P can be expressed in several equivalent ways; one of them is given below.

The proof makes use of the fact that every doubly stochastic matrix is a weighted average of permutation matrices (the Birkhoff–von Neumann theorem).
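To illustrate the statement numerically, the sketch below builds a doubly stochastic matrix as an equal-weight average of two permutation matrices, sets a = Pb, and spot-checks [a] ≤ [b] at random positive points. The matrix, the vector b, the sampling range, and the helper names are all choices made here for illustration; a random check is evidence, not a proof.

```python
import random
from itertools import permutations
from math import factorial, prod


def a_mean(a, x):
    """Brute-force a-mean: average over all orderings of x."""
    total = sum(
        prod(xs ** ai for xs, ai in zip(perm, a))
        for perm in permutations(x)
    )
    return total / factorial(len(x))


def apply_matrix(P, b):
    """Matrix-vector product P b, with P given as a list of rows."""
    return [sum(p * bj for p, bj in zip(row, b)) for row in P]


# P = 1/2 * identity + 1/2 * (cyclic permutation matrix), hence doubly stochastic.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]
b = [3.0, 1.0, 0.0]
a = apply_matrix(P, b)

rng = random.Random(0)  # fixed seed for reproducibility
samples = ([rng.uniform(0.1, 10.0) for _ in b] for _ in range(1000))
# Small relative slack guards against floating-point rounding.
holds = all(a_mean(a, x) <= a_mean(b, x) * (1 + 1e-9) for x in samples)
print(a, holds)
```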

Another equivalent condition

Because of the symmetry of the sum, no generality is lost by sorting the exponents into decreasing order:

a_{1}\geq a_{2}\geq \cdots \geq a_{n}

b_{1}\geq b_{2}\geq \cdots \geq b_{n}.

Then the existence of a doubly stochastic matrix P such that a = Pb is equivalent to the following system of inequalities:

{\begin{aligned}a_{1}&\leq b_{1}\\a_{1}+a_{2}&\leq b_{1}+b_{2}\\a_{1}+a_{2}+a_{3}&\leq b_{1}+b_{2}+b_{3}\\&\,\,\,\vdots \\a_{1}+\cdots +a_{n-1}&\leq b_{1}+\cdots +b_{n-1}\\a_{1}+\cdots +a_{n}&=b_{1}+\cdots +b_{n}.\end{aligned}}

(The last one is an equality; the others are weak inequalities.)

The sequence $b_{1},\ldots ,b_{n}$ is said to majorize the sequence $a_{1},\ldots ,a_{n}$ .
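The system of inequalities above is straightforward to test with prefix sums. The following is a sketch under the assumption that exponents are given as plain numeric sequences; the function name `majorizes` and the tolerance `tol` are hypothetical.

```python
from itertools import accumulate
from math import isclose


def majorizes(b, a, tol=1e-9):
    """Return True if the sequence b majorizes the sequence a."""
    if len(a) != len(b):
        return False
    # Sort both sequences into decreasing order, then compare partial sums.
    partial_a = list(accumulate(sorted(a, reverse=True)))
    partial_b = list(accumulate(sorted(b, reverse=True)))
    if not partial_a:
        return True  # two empty sequences trivially majorize each other
    # All proper partial sums must satisfy a_1 + ... + a_k <= b_1 + ... + b_k.
    partial_ok = all(
        s <= t + tol for s, t in zip(partial_a[:-1], partial_b[:-1])
    )
    # The full sums must be equal.
    total_ok = isclose(partial_a[-1], partial_b[-1], abs_tol=tol)
    return partial_ok and total_ok
```

For example, `majorizes((3, 0, 0), (1, 1, 1))` holds, while the reverse direction fails at the first partial sum.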

Symmetric sum notation

It is convenient to use a special notation for these sums. Once an inequality has been reduced to this form, testing it only requires checking whether one exponent sequence ( $\alpha _{1},\ldots ,\alpha _{n}$ ) majorizes the other.

\sum _{\text{sym}}x_{1}^{\alpha _{1}}\cdots x_{n}^{\alpha _{n}}

Expanding this notation means writing out every permutation of the variables, which gives an expression of n! monomials. For instance:

{\begin{aligned}\sum _{\text{sym}}x^{3}y^{2}z^{0}&=x^{3}y^{2}z^{0}+x^{3}z^{2}y^{0}+y^{3}x^{2}z^{0}+y^{3}z^{2}x^{0}+z^{3}x^{2}y^{0}+z^{3}y^{2}x^{0}\\&=x^{3}y^{2}+x^{3}z^{2}+y^{3}x^{2}+y^{3}z^{2}+z^{3}x^{2}+z^{3}y^{2}\end{aligned}}
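The expansion can be generated mechanically. In the sketch below (helper names are hypothetical), each monomial is represented by its tuple of exponents, one per variable, and repeated monomials are collected with their multiplicity as the coefficient.

```python
from collections import Counter
from itertools import permutations


def symmetric_sum_terms(exponents):
    """Map each exponent tuple (one exponent per variable) to its coefficient."""
    # Permuting the exponents over the variables yields the same n! monomials
    # as permuting the variables over the exponents.
    return Counter(permutations(exponents))


def format_symmetric_sum(exponents, names):
    """Render the expanded symmetric sum as a string, e.g. '2*x^3 + ...'."""
    parts = []
    for exps, coeff in sorted(symmetric_sum_terms(exponents).items(), reverse=True):
        factors = [
            (v if e == 1 else f"{v}^{e}")
            for v, e in zip(names, exps)
            if e != 0
        ]
        mono = "*".join(factors) or "1"
        parts.append(mono if coeff == 1 else f"{coeff}*{mono}")
    return " + ".join(parts)


print(format_symmetric_sum((3, 2, 0), "xyz"))
print(format_symmetric_sum((3, 0, 0), "xyz"))
```

With distinct exponents such as (3, 2, 0), all six monomials are different; with repeated exponents such as (3, 0, 0), monomials coincide and pick up a coefficient.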

Examples

Arithmetic-geometric mean inequality

Let

a_{G}=\left({\frac {1}{n}},\ldots ,{\frac {1}{n}}\right)

and

a_{A}=(1,0,0,\ldots ,0).

We have

{\begin{aligned}a_{A1}=1&>a_{G1}={\frac {1}{n}},\\a_{A1}+a_{A2}=1&>a_{G1}+a_{G2}={\frac {2}{n}},\\&\,\,\,\vdots \\a_{A1}+\cdots +a_{An}&=a_{G1}+\cdots +a_{Gn}=1.\end{aligned}}

Hence a_A majorizes a_G, and Muirhead's inequality gives

[a_A] ≥ [a_G],

which is

{\frac {1}{n!}}(x_{1}^{1}\cdot x_{2}^{0}\cdots x_{n}^{0}+\cdots +x_{1}^{0}\cdots x_{n}^{1})(n-1)!\geq {\frac {1}{n!}}(x_{1}\cdot \cdots \cdot x_{n})^{1/n}n!

yielding the inequality.
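Spelling out the simplification: on the left, each x_i occurs as the base of the exponent 1 in exactly (n − 1)! permutations, and on the right, all n! terms equal the same product. Cancelling the factorials gives the familiar form:

```latex
\frac{(n-1)!}{n!}\,(x_{1}+\cdots +x_{n})\geq (x_{1}\cdots x_{n})^{1/n}
\quad \Longleftrightarrow \quad
\frac{x_{1}+\cdots +x_{n}}{n}\geq {\sqrt[{n}]{x_{1}\cdots x_{n}}}.
```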

Other examples

We seek to prove that x² + y² ≥ 2xy by using bunching (Muirhead's inequality). We rewrite it in symmetric-sum notation:

\sum _{\mathrm {sym} }x^{2}y^{0}\geq \sum _{\mathrm {sym} }x^{1}y^{1}.

The sequence (2, 0) majorizes the sequence (1, 1), thus the inequality holds by bunching.

Similarly, we can prove the inequality

x^{3}+y^{3}+z^{3}\geq 3xyz

by writing it using the symmetric-sum notation as

\sum _{\mathrm {sym} }x^{3}y^{0}z^{0}\geq \sum _{\mathrm {sym} }x^{1}y^{1}z^{1},

which is the same as

2x^{3}+2y^{3}+2z^{3}\geq 6xyz.

Since the sequence (3, 0, 0) majorizes the sequence (1, 1, 1), the inequality holds by bunching.
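Both examples can be sanity-checked by evaluating the symmetric sums at a sample point. This is a sketch only: the helper name `sym_sum` and the chosen values are illustrative, and a single evaluation confirms the expansion rather than proving the inequality.

```python
from itertools import permutations
from math import prod


def sym_sum(exponents, values):
    """Evaluate the symmetric sum of prod(v_i ** e_i) over all orderings of values."""
    return sum(
        prod(v ** e for v, e in zip(perm, exponents))
        for perm in permutations(values)
    )


x, y, z = 1, 2, 3  # arbitrary positive sample point

lhs3 = sym_sum((3, 0, 0), (x, y, z))  # expands to 2*(x**3 + y**3 + z**3)
rhs3 = sym_sum((1, 1, 1), (x, y, z))  # expands to 6*x*y*z

lhs2 = sym_sum((2, 0), (x, y))        # expands to x**2 + y**2
rhs2 = sym_sum((1, 1), (x, y))        # expands to 2*x*y

print(lhs3 >= rhs3, lhs2 >= rhs2)
```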

Notes

[1] Bullen, P. S. Handbook of means and their inequalities. Kluwer Academic Publishers Group, Dordrecht, 2003. ISBN 1-4020-1522-4

References

Combinatorial Theory by John N. Guidi, based on lectures given by Gian-Carlo Rota in 1998, MIT Copy Technology Center, 2002.
Kiran Kedlaya, A < B (A less than B), a guide to solving inequalities
Muirhead's theorem at PlanetMath.
Hardy, G.H.; Littlewood, J.E.; Pólya, G. (1952), Inequalities, Cambridge Mathematical Library (2. ed.), Cambridge: Cambridge University Press, ISBN 0-521-05206-8, MR 0046395, Zbl 0047.05302, Section 2.18, Theorem 45.

