Karamata's inequality

Last updated December 24, 2021

In mathematics, Karamata's inequality,^[1] named after Jovan Karamata,^[2] also known as the majorization inequality, is a theorem in elementary algebra for convex and concave real-valued functions, defined on an interval of the real line. It generalizes the discrete form of Jensen's inequality, and generalizes in turn to the concept of Schur-convex functions.

Statement of the inequality

Let $I$ be an interval of the real line and let $f$ denote a real-valued, convex function defined on $I$ . If $x 1, \dots, x n$ and $y 1, \dots, y n$ are numbers in $I$ such that $(x 1, \dots, x n)$ majorizes $(y 1, \dots, y n)$ , then

f(x_{1})+\cdots +f(x_{n})\geq f(y_{1})+\cdots +f(y_{n}).

(1)

Here majorization means that $x 1, \dots, x n$ and $y 1, \dots, y n$ satisfies

x_{1}\geq x_{2}\geq \cdots \geq x_{n}

and

y_{1}\geq y_{2}\geq \cdots \geq y_{n},

(2)

and we have the inequalities

x_{1}+\cdots +x_{i}\geq y_{1}+\cdots +y_{i}

for all

i \in {1, \dots, n - 1}

.

(3)

and the equality

x_{1}+\cdots +x_{n}=y_{1}+\cdots +y_{n}

(4)

If $f$ is a strictly convex function, then the inequality ( 1 ) holds with equality if and only if we have $x i = y i$ for all $i \in {1, \dots, n}$ .

Remarks

If the convex function $f$ is non-decreasing, then the proof of ( 1 ) below and the discussion of equality in case of strict convexity shows that the equality ( 4 ) can be relaxed to
$x_{1}+\cdots +x_{n}\geq y_{1}+\cdots +y_{n}.$

(5)
The inequality ( 1 ) is reversed if $f$ is concave, since in this case the function $- f$ is convex.

Example

The finite form of Jensen's inequality is a special case of this result. Consider the real numbers $x 1, \dots, x n \in I$ and let

a:={\frac {x_{1}+x_{2}+\cdots +x_{n}}{n}}

denote their arithmetic mean. Then $(x 1, \dots, x n)$ majorizes the $n$ -tuple $(a, a, \dots, a)$ , since the arithmetic mean of the $i$ largest numbers of $(x 1, \dots, x n)$ is at least as large as the arithmetic mean $a$ of all the $n$ numbers, for every $i \in {1, \dots, n - 1}$ . By Karamata's inequality ( 1 ) for the convex function $f$ ,

f(x_{1})+f(x_{2})+\cdots +f(x_{n})\geq f(a)+f(a)+\cdots +f(a)=nf(a).

Dividing by $n$ gives Jensen's inequality. The sign is reversed if $f$ is concave.

Proof of the inequality

We may assume that the numbers are in decreasing order as specified in ( 2 ).

If $x i = y i$ for all $i \in {1, \dots, n}$ , then the inequality ( 1 ) holds with equality, hence we may assume in the following that $x i \neq y i$ for at least one $i$ .

If $x i = y i$ for an $i \in {1, \dots, n - 1}$ , then the inequality ( 1 ) and the majorization properties ( 3 ) and ( 4 ) are not affected if we remove $x i$ and $y i$ . Hence we may assume that $x i \neq y i$ for all $i \in {1, \dots, n - 1}$ .

It is a property of convex functions that for two numbers $x \neq y$ in the interval $I$ the slope

{\frac {f(x)-f(y)}{x-y}}

of the secant line through the points $(x, f (x))$ and $(y, f (y))$ of the graph of $f$ is a monotonically non-decreasing function in $x$ for $y$ fixed (and vice versa). This implies that

c_{i+1}:={\frac {f(x_{i+1})-f(y_{i+1})}{x_{i+1}-y_{i+1}}}\leq {\frac {f(x_{i})-f(y_{i})}{x_{i}-y_{i}}}=:c_{i}

(6)

for all $i \in {1, \dots, n - 1}$ . Define $A 0 = B 0 = 0$ and

A_{i}=x_{1}+\cdots +x_{i},\qquad B_{i}=y_{1}+\cdots +y_{i}

for all $i \in {1, \dots, n}$ . By the majorization property ( 3 ), $A i \geq B i$ for all $i \in {1, \dots, n - 1}$ and by ( 4 ), $A n = B n$ . Hence,

{\begin{aligned}\sum _{i=1}^{n}{\bigl (}f(x_{i})-f(y_{i}){\bigr )}&=\sum _{i=1}^{n}c_{i}(x_{i}-y_{i})\\&=\sum _{i=1}^{n}c_{i}{\bigl (}\underbrace {A_{i}-A_{i-1}} _{=\,x_{i}}{}-(\underbrace {B_{i}-B_{i-1}} _{=\,y_{i}}){\bigr )}\\&=\sum _{i=1}^{n}c_{i}(A_{i}-B_{i})-\sum _{i=1}^{n}c_{i}(A_{i-1}-B_{i-1})\\&=c_{n}(\underbrace {A_{n}-B_{n}} _{=\,0})+\sum _{i=1}^{n-1}(\underbrace {c_{i}-c_{i+1}} _{\geq \,0})(\underbrace {A_{i}-B_{i}} _{\geq \,0})-c_{1}(\underbrace {A_{0}-B_{0}} _{=\,0})\\&\geq 0,\end{aligned}}

(7)

which proves Karamata's inequality ( 1 ).

To discuss the case of equality in ( 1 ), note that $x 1 > y 1$ by ( 3 ) and our assumption $x i \neq y i$ for all $i \in {1, \dots, n - 1}$ . Let $i$ be the smallest index such that $(x i, y i) \neq (x i +1, y i +1)$ , which exists due to ( 4 ). Then $A i > B i$ . If $f$ is strictly convex, then there is strict inequality in ( 6 ), meaning that $c i +1 < c i$ . Hence there is a strictly positive term in the sum on the right hand side of ( 7 ) and equality in ( 1 ) cannot hold.

If the convex function $f$ is non-decreasing, then $c n \geq 0$ . The relaxed condition ( 5 ) means that $A n \geq B n$ , which is enough to conclude that $c n (A n - B n) \geq 0$ in the last step of ( 7 ).

If the function $f$ is strictly convex and non-decreasing, then $c n > 0$ . It only remains to discuss the case $A n > B n$ . However, then there is a strictly positive term on the right hand side of ( 7 ) and equality in ( 1 ) cannot hold.

Related Research Articles

In mathematics, generalized means are a family of functions for aggregating sets of numbers. These include as special cases the Pythagorean means.

In mathematics, the limit inferior and limit superior of a sequence can be thought of as limiting bounds on the sequence. They can be thought of in a similar fashion for a function. For a set, they are the infimum and supremum of the set's limit points, respectively. In general, when there are multiple objects around which a sequence, function, or set accumulates, the inferior and superior limits extract the smallest and largest of them; the type of object and the measure of size is context-dependent, but the notion of extreme limits is invariant. Limit inferior is also called infimum limit, limit infimum, liminf, inferior limit, lower limit, or inner limit; limit superior is also known as supremum limit, limit supremum, limsup, superior limit, upper limit, or outer limit.

In probability theory, there exist several different notions of convergence of random variables. The convergence of sequences of random variables to some limit random variable is an important concept in probability theory, and its applications to statistics and stochastic processes. The same concepts are known in more general mathematics as stochastic convergence and they formalize the idea that a sequence of essentially random or unpredictable events can sometimes be expected to settle down into a behavior that is essentially unchanging when items far enough into the sequence are studied. The different possible notions of convergence relate to how such a behavior can be characterized: two readily understood behaviors are that the sequence eventually takes a constant value, and that values in the sequence continue to change but can be described by an unchanging probability distribution.

Triangle inequality property of geometry, also used to generalize the notion of "distance" in metric spaces

In mathematics, the triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than or equal to the length of the remaining side. This statement permits the inclusion of degenerate triangles, but some authors, especially those writing about elementary geometry, will exclude this possibility, thus leaving out the possibility of equality. If $x$ , $y$ , and $z$ are the lengths of the sides of the triangle, with no side being greater than $z$ , then the triangle inequality states that

In mathematics, an inequality is a relation which makes a non-equal comparison between two numbers or other mathematical expressions. It is used most often to compare two numbers on the number line by their size. There are several different notations used to represent different kinds of inequalities:

In mathematical analysis, the Minkowski inequality establishes that the L^p spaces are normed vector spaces. Let S be a measure space, let 1 ≤ p < ∞ and let f and g be elements of L^p(S). Then f + g is in L^p(S), and we have the triangle inequality

Convex function Real function with secant line between points above the graph itself

In mathematics, a real-valued function is called convex if the line segment between any two points on the graph of the function does not lie below the graph between the two points. Equivalently, a function is convex if its epigraph is a convex set. A twice-differentiable function of a single variable is convex if and only if its second derivative is nonnegative on its entire domain. Well-known examples of convex functions of a single variable include the quadratic function $and the exponential function . In simple terms, a convex function refers to a function whose graph is shaped like a cup, while a concave function's graph is shaped like a cap .$

In mathematics, a concave function is the negative of a convex function. A concave function is also synonymously called concave downwards, concave down, convex upwards, convex cap, or upper convex.

Jensens inequality Theorem of convex functions

In mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906. Given its generality, the inequality appears in many forms depending on the context, some of which are presented below. In its simplest form the inequality states that the convex transformation of a mean is less than or equal to the mean applied after convex transformation; it is a simple corollary that the opposite is true of concave transformations.

In mathematics, the inequality of arithmetic and geometric means, or more briefly the AM–GM inequality, states that the arithmetic mean of a list of non-negative real numbers is greater than or equal to the geometric mean of the same list; and further, that the two means are equal if and only if every number in the list is the same.

In mathematics, the rearrangement inequality states that

In mathematics, Muirhead's inequality, named after Robert Franklin Muirhead, also known as the "bunching" method, generalizes the inequality of arithmetic and geometric means.

In mathematics, a function f is logarithmically convex or superconvex if $, the composition of the logarithm with f, is itself a convex function.$

In mathematics, there are two different results that share the common name of the Ky Fan inequality. One is an inequality involving the geometric mean and arithmetic mean of two sets of real numbers of the unit interval. The result was published on page 5 of the book Inequalities by Edwin F. Beckenbach and Richard E. Bellman (1961), who refer to an unpublished result of Ky Fan. They mention the result in connection with the inequality of arithmetic and geometric means and Augustin Louis Cauchy's proof of this inequality by forward-backward-induction; a method which can also be used to prove the Ky Fan inequality.

In mathematics, majorization is a preorder on vectors of real numbers. For a vector $, we denote by the vector with the same components, but sorted in descending order. Given, we say that weakly majorizes from below written as iff$

Pythagorean means Classical averages studied in ancient Greece

In mathematics, the three classical Pythagorean means are the arithmetic mean (AM), the geometric mean (GM), and the harmonic mean (HM). These means were studied with proportions by Pythagoreans and later generations of Greek mathematicians because of their importance in geometry and music.

In mathematics, a Schur-convex function, also known as S-convex, isotonic function and order-preserving function is a function $that for all such that is majorized by, one has that . Named after Issai Schur, Schur-convex functions are used in the study of majorization. Every function that is convex and symmetric is also Schur-convex. The opposite implication is not true, but all Schur-convex functions are symmetric.$

In functional analysis, compact operators are linear operators on Banach spaces that map bounded sets to relatively compact sets. In the case of a Hilbert space H, the compact operators are the closure of the finite rank operators in the uniform operator topology. In general, operators on infinite-dimensional spaces feature properties that do not appear in the finite-dimensional case, i.e. for matrices. The compact operators are notable in that they share as much similarity with matrices as one can expect from a general operator. In particular, the spectral properties of compact operators resemble those of square matrices.

In mathematics, there are many kinds of inequalities involving matrices and linear operators on Hilbert spaces. This article covers some important operator inequalities connected with traces of matrices.

In mathematics, Young's inequality for products is a mathematical inequality about the product of two numbers. The inequality is named after William Henry Young and should not be confused with Young's convolution inequality.

References

↑ Kadelburg, Zoran; Đukić, Dušan; Lukić, Milivoje; Matić, Ivan (2005), "Inequalities of Karamata, Schur and Muirhead, and some applications" (PDF), The Teaching of Mathematics, 8 (1): 31–45, ISSN 1451-4966
↑ Karamata, Jovan (1932), "Sur une inégalité relative aux fonctions convexes" (PDF), Publ. Math. Univ. Belgrade (in French), 1: 145–148, Zbl 0005.20101

External links

An explanation of Karamata's inequality and majorization theory can be found here.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Kadelburg, Zoran; Đukić, Dušan; Lukić, Milivoje; Matić, Ivan (2005), "Inequalities of Karamata, Schur and Muirhead, and some applications" (PDF), The Teaching of Mathematics, 8 (1): 31–45, ISSN 1451-4966

[2] Karamata, Jovan (1932), "Sur une inégalité relative aux fonctions convexes" (PDF), Publ. Math. Univ. Belgrade (in French), 1: 145–148, Zbl 0005.20101

[1]

[2]