Trace inequalities

In mathematics, there are many kinds of inequalities involving matrices and linear operators on Hilbert spaces. This article covers some important operator inequalities connected with traces of matrices. [1] [2] [3] [4]

Basic definitions

Let H_n denote the space of Hermitian n×n matrices, H_n^+ denote the set consisting of positive semi-definite n×n Hermitian matrices, and H_n^{++} denote the set of positive definite Hermitian matrices. For operators on an infinite-dimensional Hilbert space we require that they be trace class and self-adjoint, in which case similar definitions apply, but we discuss only matrices, for simplicity.

For any real-valued function f on an interval I ⊂ ℝ, one may define a matrix function f(A) for any operator A ∈ H_n with eigenvalues λ_j in I by defining it on the eigenvalues and corresponding projectors P_j as

$$f(A) = \sum_j f(\lambda_j)\, P_j\,,$$

given the spectral decomposition

$$A = \sum_j \lambda_j P_j\,.$$
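
As an illustration (not part of the original article), the spectral definition can be checked numerically with NumPy/SciPy; the helper name apply_matrix_function and the choice f = exp are ours:

```python
import numpy as np
from scipy.linalg import expm

def apply_matrix_function(f, A):
    """Apply a real function f to a Hermitian matrix A via the
    spectral decomposition A = sum_j lambda_j P_j."""
    eigvals, eigvecs = np.linalg.eigh(A)   # Hermitian: real eigenvalues
    return eigvecs @ np.diag(f(eigvals)) @ eigvecs.conj().T

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (X + X.conj().T) / 2                   # random Hermitian test matrix

# The spectral definition agrees with the matrix exponential
assert np.allclose(apply_matrix_function(np.exp, A), expm(A))
```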

Operator monotone

A function f: I → ℝ defined on an interval I ⊂ ℝ is said to be operator monotone if for all n and all A, B ∈ H_n with eigenvalues in I, the following holds:

$$A \geq B \implies f(A) \geq f(B)\,,$$

where the inequality A ≥ B means that the operator A − B ≥ 0 is positive semi-definite. One may check that f(A) = A² is, in fact, not operator monotone!
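
A concrete 2×2 counterexample (the matrices are our choice, for illustration) can be verified numerically:

```python
import numpy as np

# A >= B >= 0, but A^2 - B^2 is not positive semi-definite,
# so f(A) = A^2 is not operator monotone.
A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])

assert np.all(np.linalg.eigvalsh(A - B) >= -1e-12)  # A >= B
assert np.all(np.linalg.eigvalsh(B) >= -1e-12)      # B >= 0
print(np.linalg.eigvalsh(A @ A - B @ B))            # one negative eigenvalue
```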

Operator convex

A function f: I → ℝ is said to be operator convex if for all n, all A, B ∈ H_n with eigenvalues in I, and all 0 ≤ λ ≤ 1, the following holds:

$$f(\lambda A + (1-\lambda)B) \leq \lambda f(A) + (1-\lambda) f(B)\,.$$

Note that the operator λA + (1 − λ)B has eigenvalues in I, since A and B have eigenvalues in I.

A function f is operator concave if −f is operator convex; that is, the inequality above is reversed for f.

Joint convexity

A function g: I × J → ℝ, defined on intervals I, J ⊂ ℝ, is said to be jointly convex if for all n, all A₁, A₂ ∈ H_n with eigenvalues in I, all B₁, B₂ ∈ H_n with eigenvalues in J, and any 0 ≤ λ ≤ 1, the following holds:

$$g(\lambda A_1 + (1-\lambda)A_2,\; \lambda B_1 + (1-\lambda)B_2) \leq \lambda\, g(A_1, B_1) + (1-\lambda)\, g(A_2, B_2)\,.$$

A function g is jointly concave if −g is jointly convex, i.e. the inequality above for g is reversed.

Trace function

Given a function f: ℝ → ℝ, the associated trace function on H_n is given by

$$A \mapsto \operatorname{Tr} f(A) = \sum_j f(\lambda_j)\,,$$

where A has eigenvalues λ_j and Tr stands for the trace of the operator.

Convexity and monotonicity of the trace function

Let f: ℝ → ℝ be continuous, and let n be any integer. Then, if t ↦ f(t) is monotone increasing, so is A ↦ Tr f(A) on H_n.

Likewise, if t ↦ f(t) is convex, so is A ↦ Tr f(A) on H_n, and it is strictly convex if f is strictly convex.

See [1], for example, for a proof and discussion.
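
As an illustrative numerical sketch (matrix size, seed, and the convex choice f(t) = eᵗ are ours), one can spot-check convexity of A ↦ Tr f(A) on random Hermitian pairs:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

def random_hermitian(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (X + X.conj().T) / 2

# Spot-check: Tr exp(t*A + (1-t)*B) <= t*Tr exp(A) + (1-t)*Tr exp(B)
for _ in range(100):
    A, B = random_hermitian(4), random_hermitian(4)
    t = rng.uniform()
    lhs = np.trace(expm(t * A + (1 - t) * B)).real
    rhs = t * np.trace(expm(A)).real + (1 - t) * np.trace(expm(B)).real
    assert lhs <= rhs + 1e-9
```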

Löwner–Heinz theorem

For −1 ≤ p ≤ 0, the function f(t) = −t^p is operator monotone and operator concave.

For 0 ≤ p ≤ 1, the function f(t) = t^p is operator monotone and operator concave.

For 1 ≤ p ≤ 2, the function f(t) = t^p is operator convex. Furthermore,

f(t) = log(t) is operator concave and operator monotone, while
f(t) = t log(t) is operator convex.

The original proof of this theorem is due to K. Löwner, who gave a necessary and sufficient condition for f to be operator monotone. [5] An elementary proof of the theorem is discussed in [1] and a more general version of it in [6].

Klein's inequality

For all Hermitian n×n matrices A and B and all differentiable convex functions f: ℝ → ℝ with derivative f′, or for all positive-definite Hermitian n×n matrices A and B and all differentiable convex functions f: (0,∞) → ℝ, the following inequality holds:

$$\operatorname{Tr}\bigl[f(A) - f(B) - (A - B)f'(B)\bigr] \geq 0\,.$$

In either case, if f is strictly convex, equality holds if and only if A = B. A popular choice in applications is f(t) = t log t; see below.
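
As a sanity check (our construction, with f(t) = t log t and hence f′(t) = log t + 1), Klein's inequality can be tested on random positive-definite matrices:

```python
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(2)

def random_pd(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n)   # Hermitian, positive definite

# Klein's inequality with f(t) = t log t, f'(t) = log t + 1:
# Tr[ A log A - B log B - (A - B)(log B + I) ] >= 0
for _ in range(100):
    A, B = random_pd(4), random_pd(4)
    val = np.trace(A @ logm(A) - B @ logm(B)
                   - (A - B) @ (logm(B) + np.eye(4))).real
    assert val >= -1e-8
```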

Proof

Let C = A − B so that, for 0 < t < 1,

$$B + tC = (1 - t)B + tA\,.$$

Define

$$\varphi(t) = \operatorname{Tr}[f(B + tC)]\,.$$

By convexity and monotonicity of trace functions, φ is convex, and so for all 0 < t < 1,

$$\varphi(1) - \varphi(0) \geq \frac{\varphi(t) - \varphi(0)}{t}\,,$$

and, in fact, the right hand side is monotone decreasing in t. Taking the limit t → 0 yields Klein's inequality, since φ(1) − φ(0) = Tr[f(A) − f(B)] and the limit of the right hand side is φ′(0) = Tr[(A − B)f′(B)].

Note that if f is strictly convex and C ≠ 0, then φ is strictly convex. The final assertion follows from this and the fact that (φ(t) − φ(0))/t is monotone decreasing in t.

Golden–Thompson inequality

In 1965, S. Golden [7] and C. J. Thompson [8] independently discovered the following.

For any matrices A, B ∈ H_n,

$$\operatorname{Tr}\, e^{A+B} \leq \operatorname{Tr}\bigl(e^A e^B\bigr)\,.$$
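
The two-matrix inequality is easy to test numerically; a minimal sketch (matrix size and seed are ours):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)

def random_hermitian(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (X + X.conj().T) / 2

# Golden–Thompson: Tr exp(A + B) <= Tr(exp(A) exp(B))
for _ in range(100):
    A, B = random_hermitian(4), random_hermitian(4)
    lhs = np.trace(expm(A + B)).real
    rhs = np.trace(expm(A) @ expm(B)).real
    assert lhs <= rhs * (1 + 1e-9)
```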

This inequality can be generalized for three operators: [9] for non-negative operators A, B, C ∈ H_n^+,

$$\operatorname{Tr}\, e^{\ln A - \ln B + \ln C} \leq \operatorname{Tr} \int_0^\infty A\, (B + t)^{-1}\, C\, (B + t)^{-1}\, dt\,.$$

Peierls–Bogoliubov inequality

Let R, F ∈ H_n be such that Tr e^R = 1. Defining g = Tr F e^R, we have

$$\operatorname{Tr}\, e^F e^R \geq \operatorname{Tr}\, e^{F+R} \geq e^g\,.$$

The proof of this inequality follows from the Golden–Thompson inequality combined with Klein's inequality: take f(x) = eˣ, A = R + F, and B = R + gI. [10]
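
Both inequalities can be spot-checked numerically (the normalization trick and seed are ours):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
n = 4

X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R = (X + X.conj().T) / 2
R -= np.log(np.trace(expm(R)).real) * np.eye(n)   # now Tr exp(R) = 1

Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = (Y + Y.conj().T) / 2

g = np.trace(F @ expm(R)).real
assert np.trace(expm(F) @ expm(R)).real >= np.trace(expm(F + R)).real - 1e-9
assert np.trace(expm(F + R)).real >= np.exp(g) - 1e-9
```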

Gibbs variational principle

Let H be a self-adjoint operator such that e^{−H} is trace class. Then for any γ ≥ 0 with Tr γ = 1,

$$\operatorname{Tr}\, \gamma H + \operatorname{Tr}\, \gamma \ln \gamma \geq -\ln \operatorname{Tr}\, e^{-H}\,,$$

with equality if and only if

$$\gamma = \frac{e^{-H}}{\operatorname{Tr}\, e^{-H}}\,.$$
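
As an illustration (our construction), the Gibbs state attains the bound while random density matrices satisfy the inequality:

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(5)
n = 4

X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (X + X.conj().T) / 2                   # self-adjoint "Hamiltonian"
bound = -np.log(np.trace(expm(-H)).real)   # -ln Tr exp(-H)

def free_energy(gamma):
    return np.trace(gamma @ H + gamma @ logm(gamma)).real

gibbs = expm(-H) / np.trace(expm(-H)).real
assert abs(free_energy(gibbs) - bound) < 1e-8      # equality for Gibbs state

for _ in range(100):
    Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    gamma = Y @ Y.conj().T
    gamma /= np.trace(gamma).real                  # random density matrix
    assert free_energy(gamma) >= bound - 1e-8
```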

Lieb's concavity theorem

The following theorem was proved by E. H. Lieb in [9]. It proves and generalizes a conjecture of E. P. Wigner, M. M. Yanase, and F. J. Dyson. [11] Six years later other proofs were given by T. Ando [12] and B. Simon, [3] and several more have been given since then.

For all m×n matrices K, and all q and r such that 0 ≤ q ≤ 1 and 0 ≤ r ≤ 1, with q + r ≤ 1, the real-valued map on H_m^+ × H_n^+ given by

$$F(A, B, K) = \operatorname{Tr}\bigl(K^* A^q K B^r\bigr)$$

is jointly concave in (A, B).

Here K^* stands for the adjoint operator of K.
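
A midpoint joint-concavity spot-check of this map (dimensions, the exponents q = r = 1/2, and seed are our choices), using SciPy's fractional matrix power:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

rng = np.random.default_rng(6)

def random_pd(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n)

m = n = 3
q = r = 0.5                                  # q + r <= 1
K = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

def F(A, B):
    return np.trace(K.conj().T @ mpow(A, q) @ K @ mpow(B, r)).real

# Midpoint joint concavity: F((A1+A2)/2, (B1+B2)/2) >= (F(A1,B1)+F(A2,B2))/2
for _ in range(50):
    A1, A2, B1, B2 = (random_pd(m) for _ in range(4))
    mid = F((A1 + A2) / 2, (B1 + B2) / 2)
    assert mid + 1e-8 >= (F(A1, B1) + F(A2, B2)) / 2
```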

Lieb's theorem

For a fixed Hermitian matrix L ∈ H_n, the function

$$f(A) = \operatorname{Tr}\, \exp\{L + \ln A\}$$

is concave on H_n^{++}.

The theorem and proof are due to E. H. Lieb, [9] Thm 6, where he obtains this theorem as a corollary of Lieb's concavity theorem. The most direct proof is due to H. Epstein; [13] see M. B. Ruskai's papers [14] [15] for a review of this argument.

Ando's convexity theorem

T. Ando's proof [12] of Lieb's concavity theorem led to the following significant complement to it:

For all m×n matrices K, and all 1 ≤ q ≤ 2 and 0 ≤ r ≤ 1 with q − r ≥ 1, the real-valued map on H_m^{++} × H_n^{++} given by

$$(A, B) \mapsto \operatorname{Tr}\bigl(K^* A^q K B^{-r}\bigr)$$

is convex.

Joint convexity of relative entropy

For two operators A, B ∈ H_n^{++}, define the following map:

$$R(A \parallel B) := \operatorname{Tr}(A \log A) - \operatorname{Tr}(A \log B)\,.$$

For density matrices ρ and σ, the map R(ρ∥σ) is Umegaki's quantum relative entropy.

Note that the non-negativity of R(ρ∥σ) follows from Klein's inequality with f(t) = t log t.
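
A short numerical sketch (our construction) computing R(ρ∥σ) for random density matrices and checking non-negativity:

```python
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(7)

def random_density(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    rho = X @ X.conj().T
    return rho / np.trace(rho).real          # positive definite, trace one

def rel_entropy(rho, sigma):
    """Umegaki relative entropy R(rho || sigma) = Tr rho (log rho - log sigma)."""
    return np.trace(rho @ (logm(rho) - logm(sigma))).real

for _ in range(100):
    rho, sigma = random_density(4), random_density(4)
    assert rel_entropy(rho, sigma) >= -1e-8  # non-negativity (Klein)
```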

Statement

The map R(A∥B): H_n^+ × H_n^+ → ℝ is jointly convex.

Proof

For all 0 < p < 1, the map (A, B) ↦ Tr(B^{1−p} A^p) is jointly concave, by Lieb's concavity theorem, and thus

$$(A, B) \mapsto \frac{1}{p-1}\bigl(\operatorname{Tr}(B^{1-p} A^p) - \operatorname{Tr} A\bigr)$$

is convex. But

$$\lim_{p \to 1} \frac{1}{p-1}\bigl(\operatorname{Tr}(B^{1-p} A^p) - \operatorname{Tr} A\bigr) = R(A \parallel B)\,,$$

and convexity is preserved in the limit.

The proof is due to G. Lindblad. [16]

Jensen's operator and trace inequalities

The operator version of Jensen's inequality is due to C. Davis. [17]

A continuous, real function f on an interval I satisfies Jensen's Operator Inequality if the following holds:

$$f\Bigl(\sum_k A_k^* X_k A_k\Bigr) \leq \sum_k A_k^* f(X_k) A_k\,,$$

for operators {A_k}_k with ∑_k A_k^* A_k = 1 and for self-adjoint operators {X_k}_k with spectrum on I.

See [17] [18] for the proof of the following two theorems.

Jensen's trace inequality

Let f be a continuous function defined on an interval I and let m and n be natural numbers. If f is convex, we then have the inequality

$$\operatorname{Tr}\Bigl(f\Bigl(\sum_{k=1}^n A_k^* X_k A_k\Bigr)\Bigr) \leq \operatorname{Tr}\Bigl(\sum_{k=1}^n A_k^* f(X_k) A_k\Bigr)\,,$$

for all (X₁, ..., Xₙ) self-adjoint m×m matrices with spectra contained in I and all (A₁, ..., Aₙ) of m×m matrices with

$$\sum_{k=1}^n A_k^* A_k = 1\,.$$

Conversely, if the above inequality is satisfied for some n and m, where n > 1, then f is convex.
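
The convex case can be spot-checked numerically (our construction: the A_k are normalized so that ∑_k A_k^* A_k = 1, and f(t) = t²):

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

rng = np.random.default_rng(8)
m, nops = 3, 4

def random_hermitian(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (X + X.conj().T) / 2

for _ in range(50):
    X_list = [random_hermitian(m) for _ in range(nops)]
    B_list = [rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
              for _ in range(nops)]
    S = sum(B.conj().T @ B for B in B_list)
    A_list = [B @ mpow(S, -0.5) for B in B_list]  # now sum A_k^* A_k = 1

    Y = sum(A.conj().T @ X @ A for A, X in zip(A_list, X_list))
    lhs = np.trace(Y @ Y).real                    # Tr f(...) with f(t) = t^2
    rhs = np.trace(sum(A.conj().T @ X @ X @ A
                       for A, X in zip(A_list, X_list))).real
    assert lhs <= rhs + 1e-8
```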

Jensen's operator inequality

For a continuous function f defined on an interval I, the following conditions are equivalent:

1. f is operator convex.

2. For each natural number n, we have the inequality

$$f\Bigl(\sum_{k=1}^n A_k^* X_k A_k\Bigr) \leq \sum_{k=1}^n A_k^* f(X_k) A_k\,,$$

for all (X₁, ..., Xₙ) bounded, self-adjoint operators on an arbitrary Hilbert space H with spectra contained in I and all (A₁, ..., Aₙ) on H with ∑_k A_k^* A_k = 1.

3. f(V^* X V) ≤ V^* f(X) V for each isometry V on an infinite-dimensional Hilbert space H and every self-adjoint operator X with spectrum in I.

4. P f(P X P + λ(1 − P)) P ≤ P f(X) P for each projection P on an infinite-dimensional Hilbert space H, every self-adjoint operator X with spectrum in I, and every λ in I.

Araki–Lieb–Thirring inequality

E. H. Lieb and W. E. Thirring proved the following inequality in [19] in 1976: For any A ≥ 0, B ≥ 0 and r ≥ 1,

$$\operatorname{Tr}\,(B A B)^r \leq \operatorname{Tr}\,(B^r A^r B^r)\,.$$

In 1990 [20] H. Araki generalized the above inequality to the following one: For any A ≥ 0, B ≥ 0 and q ≥ 0,

$$\operatorname{Tr}\,(B A B)^{rq} \leq \operatorname{Tr}\,(B^r A^r B^r)^q, \quad \text{for } r \geq 1,$$

and

$$\operatorname{Tr}\,(B^r A^r B^r)^q \leq \operatorname{Tr}\,(B A B)^{rq}, \quad \text{for } 0 \leq r \leq 1.$$
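
A quick numerical check of the r ≥ 1 case of the Lieb–Thirring inequality (matrix size, the exponent r = 2.5, and seed are ours):

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

rng = np.random.default_rng(9)

def random_pd(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n)

r = 2.5
for _ in range(100):
    A, B = random_pd(4), random_pd(4)
    lhs = np.trace(mpow(B @ A @ B, r)).real      # Tr (BAB)^r
    rhs = np.trace(mpow(B, r) @ mpow(A, r) @ mpow(B, r)).real
    assert lhs <= rhs * (1 + 1e-9)
```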

The Lieb–Thirring inequality was further generalized in [21].

Effros's theorem and its extension

E. Effros in [22] proved the following theorem.

If f(x) is an operator convex function, and L and R are commuting bounded linear operators, i.e. the commutator [L, R] = LR − RL = 0, the perspective

$$g(L, R) := f(L R^{-1})\, R$$

is jointly convex, i.e. if L = λL₁ + (1 − λ)L₂ and R = λR₁ + (1 − λ)R₂ with [Lᵢ, Rᵢ] = 0 (i = 1, 2) and 0 ≤ λ ≤ 1, then

$$g(L, R) \leq \lambda\, g(L_1, R_1) + (1 - \lambda)\, g(L_2, R_2)\,.$$

Ebadian et al. later extended the inequality to the case where L and R do not commute. [23]

Von Neumann's trace inequality

Von Neumann's trace inequality, named after its originator John von Neumann, states that for any n × n complex matrices A and B with singular values α₁ ≥ α₂ ≥ ⋯ ≥ αₙ and β₁ ≥ β₂ ≥ ⋯ ≥ βₙ respectively, [24]

$$|\operatorname{Tr}(A B)| \leq \sum_{i=1}^n \alpha_i \beta_i\,.$$

The equality is achieved when A and B are simultaneously unitarily diagonalizable (see trace).
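
A direct numerical check (size and seed ours); NumPy returns singular values sorted in descending order:

```python
import numpy as np

rng = np.random.default_rng(10)

for _ in range(100):
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    alpha = np.linalg.svd(A, compute_uv=False)   # descending singular values
    beta = np.linalg.svd(B, compute_uv=False)
    assert abs(np.trace(A @ B)) <= alpha @ beta + 1e-9
```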

References

  1. E. Carlen, Trace Inequalities and Quantum Entropy: An Introductory Course, Contemp. Math. 529 (2010), 73–140. doi:10.1090/conm/529/10428
  2. R. Bhatia, Matrix Analysis, Springer, (1997).
  3. B. Simon, Trace Ideals and their Applications, Cambridge Univ. Press, (1979); Second edition, Amer. Math. Soc., Providence, RI, (2005).
  4. M. Ohya, D. Petz, Quantum Entropy and Its Use, Springer, (1993).
  5. K. Löwner, "Über monotone Matrixfunktionen", Math. Z. 38, 177–216, (1934).
  6. W.F. Donoghue, Jr., Monotone Matrix Functions and Analytic Continuation, Springer, (1974).
  7. S. Golden, Lower Bounds for Helmholtz Functions, Phys. Rev. 137, B 1127–1128 (1965)
  8. C.J. Thompson, Inequality with Applications in Statistical Mechanics, J. Math. Phys. 6, 1812–1813, (1965).
  9. E. H. Lieb, Convex Trace Functions and the Wigner–Yanase–Dyson Conjecture, Advances in Math. 11, 267–288 (1973).
  10. D. Ruelle, Statistical Mechanics: Rigorous Results, World Scient. (1969).
  11. E. P. Wigner, M. M. Yanase, On the Positive Semi-Definite Nature of a Certain Matrix Expression, Can. J. Math. 16, 397–406, (1964).
  12. T. Ando, Convexity of Certain Maps on Positive Definite Matrices and Applications to Hadamard Products, Lin. Alg. Appl. 26, 203–241 (1979).
  13. H. Epstein, Remarks on Two Theorems of E. Lieb, Comm. Math. Phys., 31:317–325, (1973).
  14. M. B. Ruskai, Inequalities for Quantum Entropy: A Review With Conditions for Equality, J. Math. Phys., 43(9):4358–4375, (2002).
  15. M. B. Ruskai, Another Short and Elementary Proof of Strong Subadditivity of Quantum Entropy, Reports Math. Phys. 60, 1–12 (2007).
  16. G. Lindblad, Expectations and Entropy Inequalities, Commun. Math. Phys. 39, 111–119 (1974).
  17. C. Davis, A Schwarz inequality for convex operator functions, Proc. Amer. Math. Soc. 8, 42–44, (1957).
  18. F. Hansen, G. K. Pedersen, Jensen's Operator Inequality, Bull. London Math. Soc. 35 (4): 553–564, (2003).
  19. E. H. Lieb, W. E. Thirring, Inequalities for the Moments of the Eigenvalues of the Schrödinger Hamiltonian and Their Relation to Sobolev Inequalities, in Studies in Mathematical Physics, edited E. Lieb, B. Simon, and A. Wightman, Princeton University Press, 269–303 (1976).
  20. H. Araki, On an Inequality of Lieb and Thirring, Lett. Math. Phys. 19, 167–170 (1990).
  21. Z. Allen-Zhu, Y. Lee, L. Orecchia, Using Optimization to Obtain a Width-Independent, Parallel, Simpler, and Faster Positive SDP Solver, in ACM-SIAM Symposium on Discrete Algorithms, 1824–1831 (2016).
  22. E. Effros, A Matrix Convexity Approach to Some Celebrated Quantum Inequalities, Proc. Natl. Acad. Sci. USA, 106, n.4, 1006–1008 (2009).
  23. A. Ebadian, I. Nikoufar, and M. Eshaghi Gordji, "Perspectives of matrix convex functions", Proc. Natl. Acad. Sci. USA, 108(18), 7313–7314 (2011).
  24. Mirsky, L. (December 1975). "A trace inequality of John von Neumann". Monatshefte für Mathematik. 79 (4): 303–306. doi:10.1007/BF01647331.