Rearrangement inequality

Last updated December 03, 2023

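In mathematics, the rearrangement inequality^[1] states that for every choice of real numbers

x_{1}\leq \cdots \leq x_{n}\quad {\text{and}}\quad y_{1}\leq \cdots \leq y_{n}

and every permutation $\sigma$ of the numbers $1,2,\ldots ,n$ we have

x_{1}y_{n}+\cdots +x_{n}y_{1}\leq x_{1}y_{\sigma (1)}+\cdots +x_{n}y_{\sigma (n)}\leq x_{1}y_{1}+\cdots +x_{n}y_{n}.

(1)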

Informally, this means that in these types of sums, the largest sum is achieved by pairing large $x$ values with large $y$ values, and the smallest sum is achieved by pairing small values with large values. When the $x_{1},\ldots ,x_{n}$ are distinct, so that $x_{1}<\cdots <x_{n},$ the cases of equality can be described precisely:

The upper bound in ( 1 ) is attained only for permutations $\sigma$ that keep the order of $y_{1},\ldots ,y_{n},$ that is, $y_{\sigma (1)}\leq \cdots \leq y_{\sigma (n)},$ or equivalently $(y_{1},\ldots ,y_{n})=(y_{\sigma (1)},\ldots ,y_{\sigma (n)}).$ Such a $\sigma$ can permute the indices of $y$-values that are equal; in the case $y_{1}=\cdots =y_{n}$ every permutation keeps the order of $y_{1},\ldots ,y_{n}.$ If $y_{1}<\cdots <y_{n},$ then the only such $\sigma$ is the identity.
Correspondingly, the lower bound in ( 1 ) is attained only for permutations $\sigma$ that reverse the order of $y_{1},\ldots ,y_{n},$ meaning that $y_{\sigma (1)}\geq \cdots \geq y_{\sigma (n)}.$ If $y_{1}<\cdots <y_{n},$ then the permutation given by $\sigma (i)=n-i+1$ for all $i=1,\ldots ,n$ is the only one that does this.

Note that the rearrangement inequality ( 1 ) makes no assumptions on the signs of the real numbers, unlike inequalities such as the arithmetic-geometric mean inequality.
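
The statement can be checked numerically on small inputs. The following Python sketch enumerates every permutation and compares the resulting sums with the two bounds in ( 1 ). The function name `rearrangement_bounds` and the sample data are illustrative assumptions, not taken from the cited source.

```python
from itertools import permutations


def rearrangement_bounds(xs, ys):
    """Return (lower, sums, upper) for sorted copies of xs and ys.

    lower pairs the sorted xs with the reversed sorted ys, upper pairs
    them in the same order, and sums lists the middle term of (1) for
    every permutation of the ys.
    """
    xs, ys = sorted(xs), sorted(ys)
    lower = sum(x * y for x, y in zip(xs, reversed(ys)))
    upper = sum(x * y for x, y in zip(xs, ys))
    sums = [sum(x * y for x, y in zip(xs, p)) for p in permutations(ys)]
    return lower, sums, upper


# Negative entries are allowed: (1) makes no sign assumptions.
lower, sums, upper = rearrangement_bounds([-3, 1, 4], [-2, 0, 5])
assert all(lower <= s <= upper for s in sums)
print(lower, min(sums), max(sums), upper)
```

Brute-force enumeration takes $n!$ steps, so this is only a sanity check for small $n$ and not a substitute for the proofs below.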

Applications

Many important inequalities can be proved by the rearrangement inequality, such as the arithmetic mean–geometric mean inequality, the Cauchy–Schwarz inequality, and Chebyshev's sum inequality.

As a simple example, consider real numbers $x_{1}\leq \cdots \leq x_{n}.$ Applying ( 1 ) with $y_{i}:=x_{i}$ for all $i=1,\ldots ,n$ shows that

x_{1}x_{n}+\cdots +x_{n}x_{1}\leq x_{1}x_{\sigma (1)}+\cdots +x_{n}x_{\sigma (n)}\leq x_{1}^{2}+\cdots +x_{n}^{2}

for every permutation $\sigma$ of $1,\ldots ,n.$
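
For instance, with the illustrative choice $x_{1}=1,x_{2}=2,x_{3}=3$ and the permutation $\sigma (1)=2,\sigma (2)=3,\sigma (3)=1$ (both picked here only as an example), the chain of inequalities reads

1\cdot 3+2\cdot 2+3\cdot 1=10\leq 1\cdot 2+2\cdot 3+3\cdot 1=11\leq 1^{2}+2^{2}+3^{2}=14.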

Intuition

The rearrangement inequality can be regarded as intuitive in the following way. Imagine there is a heap of $10 bills, a heap of $20 bills and one more heap of $100 bills. You are allowed to take 7 bills from a heap of your choice and then the heap disappears. In the second round you are allowed to take 5 bills from another heap and the heap disappears. In the last round you may take 3 bills from the last heap. In what order do you want to choose the heaps to maximize your profit? Obviously, the best you can do is to gain $7\cdot 100+5\cdot 20+3\cdot 10$ dollars. This is exactly what the upper bound of the rearrangement inequality ( 1 ) says for the sequences $3<5<7$ and $10<20<100.$ In this sense, it can be considered as an example of a greedy algorithm.
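
The heap example is small enough to enumerate completely. The sketch below lists the payout for each of the six possible orders of choosing the heaps. The variable names are illustrative assumptions.

```python
from itertools import permutations

bills_taken = [7, 5, 3]        # bills taken in rounds 1, 2, 3
denominations = [10, 20, 100]  # value of one bill in each heap

# Total payout for every order in which the heaps can be chosen.
payouts = {
    order: sum(n * d for n, d in zip(bills_taken, order))
    for order in permutations(denominations)
}

best_order = max(payouts, key=payouts.get)
worst_order = min(payouts, key=payouts.get)
print(best_order, payouts[best_order])    # (100, 20, 10) 830
print(worst_order, payouts[worst_order])  # (10, 20, 100) 470
```

The best order takes the most bills from the most valuable heap, and the worst order does the opposite, matching the upper and lower bounds in ( 1 ).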

Geometric interpretation

Assume that $0<x_{1}<\cdots <x_{n}$ and $0<y_{1}<\cdots <y_{n}.$ Consider a rectangle of width $x_{1}+\cdots +x_{n}$ and height $y_{1}+\cdots +y_{n},$ subdivided into $n$ columns of widths $x_{1},\ldots ,x_{n}$ and the same number of rows of heights $y_{1},\ldots ,y_{n},$ so there are $n^{2}$ small rectangles. You are supposed to take $n$ of these, one from each column and one from each row. The rearrangement inequality ( 1 ) says that the total area of your selection is largest when you take the rectangles on the diagonal, where the $i$-th column meets the $i$-th row, and smallest when you take those on the antidiagonal.

Proofs

Proof by contradiction

The lower bound and the corresponding discussion of equality follow by applying the results for the upper bound to

-y_{n}\leq \cdots \leq -y_{1}.

Therefore, it suffices to prove the upper bound in ( 1 ) and discuss when equality holds. Since there are only finitely many permutations of $1,\ldots ,n,$ there exists at least one $\sigma$ for which the middle term in ( 1 )

x_{1}y_{\sigma (1)}+\cdots +x_{n}y_{\sigma (n)}

is maximal. In case there are several permutations with this property, let $\sigma$ denote one with the highest number of integers $i$ from $\{1,\ldots ,n\}$ satisfying $y_{i}=y_{\sigma (i)}.$

We will now prove by contradiction that $\sigma$ has to keep the order of $y_{1},\ldots ,y_{n}$ (then we are done with the upper bound in ( 1 ), because the identity has that property). Assume the contrary. Then there exists a $j\in \{1,\ldots ,n-1\}$ such that $y_{i}=y_{\sigma (i)}$ for all $i\in \{1,\ldots ,j-1\}$ and $y_{j}\neq y_{\sigma (j)}.$ Hence $y_{j}<y_{\sigma (j)}$ and there has to exist a $k\in \{j+1,\ldots ,n\}$ with $y_{j}=y_{\sigma (k)}$ to fill the gap. Therefore,

x_{j}\leq x_{k}\qquad {\text{and}}\qquad y_{\sigma (k)}<y_{\sigma (j)},

(2)

which implies that

0\leq (x_{k}-x_{j})(y_{\sigma (j)}-y_{\sigma (k)}).

(3)

Expanding this product and rearranging gives

x_{j}y_{\sigma (j)}+x_{k}y_{\sigma (k)}\leq x_{j}y_{\sigma (k)}+x_{k}y_{\sigma (j)}\,,

(4)

which is equivalent to ( 3 ). Hence the permutation

\tau (i):={\begin{cases}\sigma (i)&{\text{for }}i\in \{1,\ldots ,n\}\setminus \{j,k\},\\\sigma (k)&{\text{for }}i=j,\\\sigma (j)&{\text{for }}i=k,\end{cases}}

which arises from $\sigma$ by exchanging the values $\sigma (j)$ and $\sigma (k),$ has at least one additional point which keeps the order compared to $\sigma ,$ namely at $j$ satisfying $y_{j}=y_{\tau (j)},$ and also attains the maximum in ( 1 ) due to ( 4 ). This contradicts the choice of $\sigma .$

If $x_{1}<\cdots <x_{n},$ then we have strict inequalities in ( 2 ), ( 3 ), and ( 4 ), hence the maximum can only be attained by permutations keeping the order of $y_{1}\leq \cdots \leq y_{n},$ and every other permutation $\sigma$ cannot be optimal.
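
The exchange step behind ( 4 ) can also be read as a procedure: swapping any out-of-order pair of $y$-values never decreases the sum, so repeated swaps lead to the sorted arrangement. The following Python sketch is an iterative variant of that idea, not the proof itself. The function name `sort_by_exchanges` and the sample data are illustrative assumptions.

```python
def sort_by_exchanges(xs, ys):
    """Sort ys by swapping out-of-order pairs, recording the sum after each swap.

    xs must be in nondecreasing order. Each swap moves the smaller of two
    y-values to the position with the smaller x-value, which by (4) cannot
    decrease the sum x[0]*y[0] + ... + x[n-1]*y[n-1].
    """
    ys = list(ys)
    total = sum(x * y for x, y in zip(xs, ys))
    history = [total]
    for j in range(len(ys)):
        for k in range(j + 1, len(ys)):
            if ys[j] > ys[k]:  # pair (j, k) is out of order
                ys[j], ys[k] = ys[k], ys[j]
                new_total = sum(x * y for x, y in zip(xs, ys))
                assert new_total >= total  # inequality (4)
                total = new_total
                history.append(total)
    return ys, history


final_ys, history = sort_by_exchanges([1, 2, 3, 4], [7, 1, 5, 3])
print(final_ys)  # [1, 3, 5, 7]
print(history)   # [36, 42, 44, 48, 50]
```

Every entry of `history` is at least as large as the one before it, and the final arrangement is the sorted one that attains the upper bound in ( 1 ).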

Proof via induction

As above, it suffices to treat the upper bound in ( 1 ). For a proof by mathematical induction, we start with $n=2.$ Observe that

x_{1}\leq x_{2}\quad {\text{ and }}\quad y_{1}\leq y_{2}

implies that

0\leq (x_{2}-x_{1})(y_{2}-y_{1}),

(5)

which is equivalent to

x_{1}y_{2}+x_{2}y_{1}\leq x_{1}y_{1}+x_{2}y_{2},

(6)

hence the upper bound in ( 1 ) is true for $n=2.$ If $x_{1}<x_{2},$ then we get strict inequality in ( 5 ) and ( 6 ) if and only if $y_{1}<y_{2}.$ Hence only the identity, which is the only permutation here keeping the order of $y_{1}<y_{2},$ gives the maximum.

As an induction hypothesis assume that the upper bound in the rearrangement inequality ( 1 ) is true for $n-1$ with $n\geq 3$ and that in the case $x_{1}<\cdots <x_{n-1}$ there is equality only when the permutation $\sigma$ of $1,\ldots ,n-1$ keeps the order of $y_{1},\ldots ,y_{n-1}.$

Consider now $x_{1}\leq \cdots \leq x_{n}$ and $y_{1}\leq \cdots \leq y_{n}.$ Take a $\sigma$ from the finite number of permutations of $1,\ldots ,n$ such that the rearrangement in the middle of ( 1 ) gives the maximal result. There are two cases:

If $\sigma (n)=n,$ then $y_{n}=y_{\sigma (n)}$ and, using the induction hypothesis, the upper bound in ( 1 ) is true with equality and $\sigma$ keeps the order of $y_{1},\ldots ,y_{n-1},y_{n}$ in the case $x_{1}<\cdots <x_{n}.$
If $k:=\sigma (n)<n,$ then there is a $j\in \{1,\dots ,n-1\}$ with $\sigma (j)=n.$ Define the permutation $\tau (i)={\begin{cases}\sigma (i)&{\text{for }}i\in \{1,\ldots ,n\}\setminus \{j,n\},\\k&{\text{for }}i=j,\\n&{\text{for }}i=n,\end{cases}}$ which arises from $\sigma$ by exchanging the values $\sigma (j)$ and $\sigma (n).$ There are now two subcases:

If $x_{j}=x_{n}$ or $y_{k}=y_{n},$ then this exchange of values of $\sigma$ has no effect on the middle term in ( 1 ) because $\tau$ gives the same sum, and we can proceed by applying the first case to $\tau .$ Note that in the case $x_{1}<\cdots <x_{n},$ the permutation $\tau$ keeps the order of $y_{1},\ldots ,y_{n}$ if and only if $\sigma$ does.
If $x_{j}<x_{n}$ and $y_{k}<y_{n},$ then $0<(x_{n}-x_{j})(y_{n}-y_{k}),$ which is equivalent to $x_{j}y_{n}+x_{n}y_{k}<x_{j}y_{k}+x_{n}y_{n}$ and shows that $\sigma$ is not optimal, because $\tau$ gives a strictly larger sum; hence this case cannot happen due to the choice of $\sigma .$

Generalizations

Three or more sequences

A straightforward generalization takes into account more sequences. Assume we have finite ordered sequences of nonnegative real numbers

0\leq x_{1}\leq \cdots \leq x_{n}\quad {\text{and}}\quad 0\leq y_{1}\leq \cdots \leq y_{n}\quad {\text{and}}\quad 0\leq z_{1}\leq \cdots \leq z_{n}

and a permutation $y_{\sigma (1)},\ldots ,y_{\sigma (n)}$ of $y_{1},\dots ,y_{n}$ and another permutation $z_{\tau (1)},\dots ,z_{\tau (n)}$ of $z_{1},\dots ,z_{n}.$ Then

x_{1}y_{\sigma (1)}z_{\tau (1)}+\cdots +x_{n}y_{\sigma (n)}z_{\tau (n)}\leq x_{1}y_{1}z_{1}+\cdots +x_{n}y_{n}z_{n}.

Note that, unlike the standard rearrangement inequality ( 1 ), this statement requires the numbers to be nonnegative. A similar statement is true for any number of sequences with all numbers nonnegative.
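
Both the three-sequence inequality and the need for nonnegativity can be checked by brute force on small inputs. In the Python sketch below, the function names and the sample data are illustrative assumptions. The second data set uses negative numbers to show how the inequality can fail.

```python
from itertools import permutations


def max_triple_sum(xs, ys, zs):
    """Largest value of sum(x_i * y_sigma(i) * z_tau(i)) over all sigma, tau."""
    return max(
        sum(x * y * z for x, y, z in zip(xs, p, q))
        for p in permutations(ys)
        for q in permutations(zs)
    )


def ordered_triple_sum(xs, ys, zs):
    """Sum obtained by pairing the three sorted sequences index by index."""
    return sum(x * y * z for x, y, z in zip(sorted(xs), sorted(ys), sorted(zs)))


# Nonnegative data: the ordered pairing attains the maximum.
nonneg = ([0, 1, 3], [2, 2, 5], [1, 4, 6])
assert max_triple_sum(*nonneg) == ordered_triple_sum(*nonneg)

# Negative data: the ordered pairing gives -9, but another pairing gives -6.
neg = ([-2, -1], [-2, -1], [-2, -1])
assert ordered_triple_sum(*neg) == -9
assert max_triple_sum(*neg) == -6
```

In the negative example, the ordered pairing multiplies three negative numbers in each term, so every term is negative and as large in absolute value as possible. This is why the sign assumption cannot be dropped.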

Functions instead of factors

Another generalization of the rearrangement inequality states that for all real numbers $x_{1}\leq \cdots \leq x_{n}$ and every choice of continuously differentiable functions $f_{i}:[x_{1},x_{n}]\to \mathbb {R}$ for $i=1,2,\ldots ,n$ such that their derivatives $f'_{1},\ldots ,f'_{n}$ satisfy

f'_{1}(x)\leq f'_{2}(x)\leq \cdots \leq f'_{n}(x)\quad {\text{ for all }}x\in [x_{1},x_{n}],

the inequality

\sum _{i=1}^{n}f_{n-i+1}(x_{i})\leq \sum _{i=1}^{n}f_{\sigma (i)}(x_{i})\leq \sum _{i=1}^{n}f_{i}(x_{i})

holds for every permutation $f_{\sigma (1)},\ldots ,f_{\sigma (n)}$ of $f_{1},\ldots ,f_{n}.$ ^[2] Taking real numbers $y_{1}\leq \cdots \leq y_{n}$ and the linear functions $f_{i}(x):=xy_{i}$ for real $x$ and $i=1,\ldots ,n,$ the standard rearrangement inequality ( 1 ) is recovered.
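
A numerical check with functions that are not linear may make the generalization more concrete. The Python sketch below uses the family $f_{i}(x)=\log(1+e^{x+c_{i}})$ with $c_{1}\leq \cdots \leq c_{n}.$ This family, the sample points, and all names are assumptions made only for illustration. The derivative $f'_{i}(x)=1/(1+e^{-(x+c_{i})})$ is nondecreasing in $c_{i},$ so the hypothesis on the derivatives holds.

```python
import math
from itertools import permutations


def softplus(t):
    """log(1 + exp(t)); its derivative is the increasing logistic function."""
    return math.log1p(math.exp(t))


xs = [-1.0, 0.5, 2.0]      # x_1 <= x_2 <= x_3
shifts = [-2.0, 0.0, 1.5]  # c_1 <= c_2 <= c_3, so f'_1 <= f'_2 <= f'_3

# f_i(x) = softplus(x + c_i) has derivative 1 / (1 + exp(-(x + c_i))),
# which is nondecreasing in c_i for every x.
fs = [lambda x, c=c: softplus(x + c) for c in shifts]


def total(order):
    """Sum of f_{order[i]}(x_i) over i."""
    return sum(fs[k](x) for k, x in zip(order, xs))


n = len(xs)
lower = total(range(n - 1, -1, -1))  # f_{n-i+1}(x_i): reversed order
upper = total(range(n))              # f_i(x_i): same order
eps = 1e-12                          # tolerance for floating-point rounding
assert all(lower - eps <= total(p) <= upper + eps for p in permutations(range(n)))
```

The small tolerance `eps` only guards against rounding in the floating-point sums; it plays no role in the inequality itself.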

References

[1] Hardy, G.H.; Littlewood, J.E.; Pólya, G. (1952), Inequalities, Cambridge Mathematical Library (2nd ed.), Cambridge: Cambridge University Press, ISBN 0-521-05206-8, MR 0046395, Zbl 0047.05302, Section 10.2, Theorem 368
[2] Holstermann, Jan (2017), "A Generalization of the Rearrangement Inequality", Mathematical Reflections, no. 5
