Operator norm

In mathematics, the operator norm measures the "size" of certain linear operators by assigning each a real number called its operator norm. Formally, it is a norm defined on the space of bounded linear operators between two given normed vector spaces.

Introduction and definition

Given two normed vector spaces ${\displaystyle V}$ and ${\displaystyle W}$ (over the same base field, either the real numbers ${\displaystyle \mathbb {R} }$ or the complex numbers ${\displaystyle \mathbb {C} }$), a linear map ${\displaystyle A:V\to W}$ is continuous if and only if there exists a real number ${\displaystyle c}$ such that [1]

${\displaystyle \|Av\|\leq c\|v\|\quad {\mbox{ for all }}v\in V.}$

The norm on the left is the one in ${\displaystyle W}$ and the norm on the right is the one in ${\displaystyle V}$. Intuitively, the continuous operator ${\displaystyle A}$ never increases the length of any vector by more than a factor of ${\displaystyle c.}$ Thus the image of a bounded set under a continuous operator is also bounded. Because of this property, the continuous linear operators are also known as bounded operators. In order to "measure the size" of ${\displaystyle A,}$ it then seems natural to take the infimum of the numbers ${\displaystyle c}$ such that the above inequality holds for all ${\displaystyle v\in V.}$ This number represents the maximum scalar factor by which ${\displaystyle A}$ "lengthens" vectors. In other words, we measure the "size" of ${\displaystyle A}$ by how much it "lengthens" vectors in the "biggest" case. So we define the operator norm of ${\displaystyle A}$ as

${\displaystyle \|A\|_{op}=\inf\{c\geq 0:\|Av\|\leq c\|v\|{\mbox{ for all }}v\in V\}.}$

The infimum is attained as the set of all such ${\displaystyle c}$ is closed, nonempty, and bounded from below. [2]

It is important to bear in mind that this operator norm depends on the choice of norms for the normed vector spaces ${\displaystyle V}$ and ${\displaystyle W.}$

Examples

Every real ${\displaystyle m}$-by-${\displaystyle n}$ matrix corresponds to a linear map from ${\displaystyle \mathbb {R} ^{n}}$ to ${\displaystyle \mathbb {R} ^{m}.}$ Each pair of the plethora of (vector) norms applicable to real vector spaces induces an operator norm for all ${\displaystyle m}$-by-${\displaystyle n}$ matrices of real numbers; these induced norms form a subset of matrix norms.

If we specifically choose the Euclidean norm on both ${\displaystyle \mathbb {R} ^{n}}$ and ${\displaystyle \mathbb {R} ^{m},}$ then the matrix norm given to a matrix ${\displaystyle A}$ is the square root of the largest eigenvalue of the matrix ${\displaystyle A^{*}A}$ (where ${\displaystyle A^{*}}$ denotes the conjugate transpose of ${\displaystyle A}$). [3] This is equivalent to assigning the largest singular value of ${\displaystyle A.}$
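The two characterizations above can be compared numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` is an arbitrary illustration and not taken from any source.

```python
import numpy as np

# Arbitrary illustrative 3-by-2 real matrix (an assumption of this sketch).
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])

# Square root of the largest eigenvalue of A^* A.
# eigvalsh handles Hermitian matrices and returns eigenvalues in ascending order.
gram = A.conj().T @ A
sqrt_top_eig = np.sqrt(np.linalg.eigvalsh(gram)[-1])

# Largest singular value of A (singular values are returned in descending order).
top_singular = np.linalg.svd(A, compute_uv=False)[0]

# NumPy's built-in induced 2-norm, for comparison.
builtin = np.linalg.norm(A, 2)

print(sqrt_top_eig, top_singular, builtin)
```

All three printed values should agree up to rounding error.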

Passing to a typical infinite-dimensional example, consider the sequence space ${\displaystyle \ell ^{2},}$ which is an Lp space, defined by

${\displaystyle \ell ^{2}=\left\{\left(a_{n}\right)_{n\geq 1}:\;a_{n}\in \mathbb {C} ,\;\sum _{n}|a_{n}|^{2}<\infty \right\}.}$

This can be viewed as an infinite-dimensional analogue of the Euclidean space ${\displaystyle \mathbb {C} ^{n}.}$ Now consider a bounded sequence ${\displaystyle s_{\bullet }=\left(s_{n}\right)_{n=1}^{\infty }.}$ The sequence ${\displaystyle s_{\bullet }}$ is an element of the space ${\displaystyle \ell ^{\infty },}$ with a norm given by

${\displaystyle \left\|s_{\bullet }\right\|_{\infty }=\sup _{n}\left|s_{n}\right|.}$

Define an operator ${\displaystyle T_{s}}$ by pointwise multiplication:

${\displaystyle \left(a_{n}\right)_{n=1}^{\infty }\;{\stackrel {T_{s}}{\mapsto }}\;\ \left(s_{n}\cdot a_{n}\right)_{n=1}^{\infty }.}$

The operator ${\displaystyle T_{s}}$ is bounded with operator norm

${\displaystyle \left\|T_{s}\right\|_{op}=\left\|s_{\bullet }\right\|_{\infty }.}$

This discussion extends directly to the case where ${\displaystyle \ell ^{2}}$ is replaced by a general ${\displaystyle L^{p}}$ space with ${\displaystyle p>1}$ and ${\displaystyle \ell ^{\infty }}$ replaced by ${\displaystyle L^{\infty }.}$
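The identity for the multiplication operator can be illustrated on a finite truncation. The sketch below is only an approximation of the infinite-dimensional setting: the truncation length `N`, the sequence `s`, and the helper names `apply_T` and `norm2` are all assumptions made for illustration.

```python
import math
import random

def apply_T(s, a):
    """Pointwise multiplication (T_s a)_n = s_n * a_n on a finite truncation."""
    return [sn * an for sn, an in zip(s, a)]

def norm2(a):
    """Euclidean norm of a finite sequence."""
    return math.sqrt(sum(abs(x) ** 2 for x in a))

N = 50                                      # truncation length (assumption)
s = [1 - 1 / n for n in range(1, N + 1)]    # a bounded sequence, here s_n = 1 - 1/n
sup_s = max(abs(x) for x in s)              # sup-norm of the truncated sequence

# A standard basis vector e_k satisfies ||T_s e_k|| = |s_k|,
# so the best basis vector already reaches sup_s.
basis_ratios = []
for k in range(N):
    e_k = [0.0] * N
    e_k[k] = 1.0
    basis_ratios.append(norm2(apply_T(s, e_k)) / norm2(e_k))

# Random vectors never stretch by more than sup_s (up to rounding).
rng = random.Random(0)
random_ratios = []
for _ in range(200):
    a = [rng.uniform(-1, 1) for _ in range(N)]
    random_ratios.append(norm2(apply_T(s, a)) / norm2(a))

print(sup_s, max(basis_ratios), max(random_ratios))
```

On the full sequence space the supremum of this particular `s` equals 1 and is not attained by any single index, which is why the definition uses a supremum rather than a maximum.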

Equivalent definitions

Let ${\displaystyle A:V\to W}$ be a linear operator between normed spaces. The first four definitions are always equivalent, and if in addition ${\displaystyle V\neq \{0\}}$ then they are all equivalent:

${\displaystyle {\begin{alignedat}{4}\|A\|_{op}&=\inf &&\{c\geq 0~&&:~\|Av\|\leq c\|v\|~&&~{\mbox{ for all }}~&&v\in V\}\\&=\sup &&\{\|Av\|~&&:~\|v\|\leq 1~&&~{\mbox{ and }}~&&v\in V\}\\&=\sup &&\{\|Av\|~&&:~\|v\|<1~&&~{\mbox{ and }}~&&v\in V\}\\&=\sup &&\{\|Av\|~&&:~\|v\|\in \{0,1\}~&&~{\mbox{ and }}~&&v\in V\}\\&=\sup &&\{\|Av\|~&&:~\|v\|=1~&&~{\mbox{ and }}~&&v\in V\}\;\;\;{\text{ this equality holds if and only if }}V\neq \{0\}\\&=\sup &&{\bigg \{}{\frac {\|Av\|}{\|v\|}}~&&:~v\neq 0~&&~{\mbox{ and }}~&&v\in V{\bigg \}}\;\;\;{\text{ this equality holds if and only if }}V\neq \{0\}.\\\end{alignedat}}}$

If ${\displaystyle V=\{0\}}$ then the sets in the last two rows will be empty, and consequently their suprema over the set ${\displaystyle [-\infty ,\infty ]}$ will equal ${\displaystyle -\infty }$ instead of the correct value of ${\displaystyle 0.}$ If the supremum is taken over the set ${\displaystyle [0,\infty ]}$ instead, then the supremum of the empty set is ${\displaystyle 0}$ and the formulas hold for any ${\displaystyle V.}$ If ${\displaystyle A:V\to W}$ is bounded then [4]

${\displaystyle \|A\|_{op}=\sup \left\{\left|w^{*}(Av)\right|:\|v\|\leq 1,\left\|w^{*}\right\|\leq 1{\text{ where }}v\in V,w^{*}\in W^{*}\right\}}$

and [4]

${\displaystyle \|A\|_{op}=\left\|{}^{t}A\right\|_{op}}$

where ${\displaystyle {}^{t}A:W^{*}\to V^{*}}$ is the transpose of ${\displaystyle A:V\to W,}$ which is the linear operator defined by ${\displaystyle w^{*}\,\mapsto \,w^{*}\circ A.}$
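Two of the equivalent formulas, the supremum over the unit sphere and the supremum of the ratio, can be compared by sampling. This is a rough sketch, assuming NumPy and Euclidean norms on both sides; the matrix and the sample size are arbitrary choices. Sampling only produces lower bounds on the supremum, so the exact value is taken from `np.linalg.norm(A, 2)`.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])                 # arbitrary illustrative matrix

exact = np.linalg.norm(A, 2)               # operator norm for Euclidean norms

# Columns of V are random nonzero vectors v.
V = rng.standard_normal((2, 10_000))
ratios = np.linalg.norm(A @ V, axis=0) / np.linalg.norm(V, axis=0)   # ||Av|| / ||v||

# Rescale every column to unit length and evaluate ||Au|| on the unit sphere.
U = V / np.linalg.norm(V, axis=0)
on_sphere = np.linalg.norm(A @ U, axis=0)

# In this finite-dimensional Euclidean setting the transpose operator
# is represented by the transposed matrix.
transpose_norm = np.linalg.norm(A.T, 2)

print(exact, ratios.max(), on_sphere.max(), transpose_norm)
```

By homogeneity of the norm, `ratios` and `on_sphere` agree entry by entry, and neither exceeds `exact`.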

Properties

The operator norm is indeed a norm on the space of all bounded operators between ${\displaystyle V}$ and ${\displaystyle W.}$ This means

${\displaystyle \|A\|_{op}\geq 0{\mbox{ and }}\|A\|_{op}=0{\mbox{ if and only if }}A=0,}$
${\displaystyle \|aA\|_{op}=|a|\|A\|_{op}{\mbox{ for every scalar }}a,}$
${\displaystyle \|A+B\|_{op}\leq \|A\|_{op}+\|B\|_{op}.}$

The following inequality is an immediate consequence of the definition:

${\displaystyle \|Av\|\leq \|A\|_{op}\|v\|\ {\mbox{ for every }}\ v\in V.}$

The operator norm is also compatible with the composition, or multiplication, of operators: if ${\displaystyle V}$, ${\displaystyle W}$ and ${\displaystyle X}$ are three normed spaces over the same base field, and ${\displaystyle A:V\to W}$ and ${\displaystyle B:W\to X}$ are two bounded operators, then it is a sub-multiplicative norm, that is:

${\displaystyle \|BA\|_{op}\leq \|B\|_{op}\|A\|_{op}.}$

For bounded operators on ${\displaystyle V}$, this implies that operator multiplication is jointly continuous.

It follows from the definition that if a sequence of operators converges in operator norm, it converges uniformly on bounded sets.
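The inequalities in this section can be spot-checked on small matrices. The sketch below assumes the maximum norm on all three spaces, for which the operator norm is the maximum absolute row sum; the matrices, the vector, and the helper names are illustrative assumptions.

```python
def matmul(B, A):
    """Matrix product B A for matrices stored as lists of rows."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm_inf(v):
    """Maximum norm of a vector."""
    return max(abs(x) for x in v)

def op_norm_inf(A):
    """Operator norm induced by the maximum norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0], [0.5, 3.0]]
B = [[2.0, 1.0], [-1.0, 4.0]]
v = [1.0, -1.0]

A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Triangle inequality, the bound ||Av|| <= ||A|| ||v||, and sub-multiplicativity.
print(op_norm_inf(A_plus_B), "<=", op_norm_inf(A) + op_norm_inf(B))
print(norm_inf(matvec(A, v)), "<=", op_norm_inf(A) * norm_inf(v))
print(op_norm_inf(matmul(B, A)), "<=", op_norm_inf(B) * op_norm_inf(A))
```

A single example does not prove the inequalities, but it shows that they need not be equalities.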

Table of common operator norms

Some common operator norms are easy to calculate, and others are NP-hard. Except for the NP-hard norms, all these norms can be calculated in ${\displaystyle N^{2}}$ operations (for an ${\displaystyle N\times N}$ matrix), with the exception of the ${\displaystyle \ell _{2}-\ell _{2}}$ norm (which requires ${\displaystyle N^{3}}$ operations for the exact answer, or fewer if you approximate it with the power method or Lanczos iterations).
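As an illustration of the power method mentioned above, here is a minimal sketch for real matrices stored as lists of rows. The function name, the fixed iteration count, and the all-ones starting vector are assumptions of this sketch; a practical implementation would use a convergence test and a randomized start.

```python
import math

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def spectral_norm_power(A, iterations=200):
    """Estimate the l2 -> l2 operator norm by power iteration on A^T A."""
    At = transpose(A)
    # Starting vector; assumed not orthogonal to the top right singular vector.
    v = [1.0] * len(A[0])
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))          # one multiplication by A^T A
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            # A v = 0: this happens for A = 0; for other matrices
            # a different starting vector would be needed.
            return 0.0
        v = [x / length for x in w]
    Av = matvec(A, v)
    return math.sqrt(sum(x * x for x in Av))  # ||A v|| with ||v|| = 1

A = [[1.0, 1.0],
     [0.0, 1.0]]
estimate = spectral_norm_power(A)
print(estimate)
```

For this particular matrix the eigenvalues of the product of the transpose with the matrix are (3 ± √5)/2, so the estimate should approach (1 + √5)/2.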

Computability of Operator Norms [5]

| Domain (rows) / Co-domain (columns) | ${\displaystyle \ell _{1}}$ | ${\displaystyle \ell _{2}}$ | ${\displaystyle \ell _{\infty }}$ |
| --- | --- | --- | --- |
| ${\displaystyle \ell _{1}}$ | Maximum ${\displaystyle \ell _{1}}$ norm of a column | Maximum ${\displaystyle \ell _{2}}$ norm of a column | Maximum ${\displaystyle \ell _{\infty }}$ norm of a column |
| ${\displaystyle \ell _{2}}$ | NP-hard | Maximum singular value | Maximum ${\displaystyle \ell _{2}}$ norm of a row |
| ${\displaystyle \ell _{\infty }}$ | NP-hard | NP-hard | Maximum ${\displaystyle \ell _{1}}$ norm of a row |
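The closed-form entries of the table translate directly into code. The sketch below covers the five entries given by row or column norms; the maximum singular value is omitted because it needs an eigenvalue or SVD routine. The dictionary keys `(domain, codomain)` and the function name are conventions chosen for this example only.

```python
import math

def norm1(v):
    return sum(abs(x) for x in v)

def norm2(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def norm_inf(v):
    return max(abs(x) for x in v)

def tractable_norms(A):
    """Operator norms of A (a list of rows), keyed by (domain, codomain)."""
    cols = [list(col) for col in zip(*A)]
    return {
        ("l1", "l1"): max(norm1(c) for c in cols),        # max l1 norm of a column
        ("l1", "l2"): max(norm2(c) for c in cols),        # max l2 norm of a column
        ("l1", "linf"): max(norm_inf(c) for c in cols),   # max l-infinity norm of a column
        ("l2", "linf"): max(norm2(r) for r in A),         # max l2 norm of a row
        ("linf", "linf"): max(norm1(r) for r in A),       # max l1 norm of a row
    }

A = [[3.0, -4.0],
     [0.0, 1.0]]
norms = tractable_norms(A)
for key, value in norms.items():
    print(key, value)
```

Each entry costs one pass over the matrix entries, consistent with the quadratic operation count quoted above for the tractable norms.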

The norm of the adjoint or transpose can be computed as follows. For any ${\displaystyle p,q,}$ we have ${\displaystyle \|A\|_{p\rightarrow q}=\|A^{*}\|_{q'\rightarrow p'}}$ where ${\displaystyle p',q'}$ are Hölder conjugate to ${\displaystyle p,q,}$ that is, ${\displaystyle 1/p+1/p'=1}$ and ${\displaystyle 1/q+1/q'=1.}$
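The duality relation can be checked by brute force on a small real matrix. Taking p = ∞ and q = 1 gives q' = ∞ and p' = 1, so the relation says that the ∞ → 1 norm of a matrix equals the ∞ → 1 norm of its transpose. The sketch below relies on the fact that a convex function attains its maximum over the cube at a vertex, so it enumerates all sign vectors; this is exponential in the number of columns and only meant for tiny examples. Function names and the matrix are illustrative assumptions.

```python
from itertools import product

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def norm_inf_to_1(A):
    """Brute-force ||A||_{inf -> 1} for a small real matrix.

    v -> ||A v||_1 is convex, so its maximum over the cube ||v||_inf <= 1
    is attained at one of the 2^n sign vectors.
    """
    n = len(A[0])
    return max(sum(abs(x) for x in matvec(A, signs))
               for signs in product((-1.0, 1.0), repeat=n))

def norm_1_to_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(x) for x in col) for col in zip(*A))

def norm_inf_to_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

A = [[1.0, -2.0, 0.5],
     [3.0, 1.0, -1.0]]
At = transpose(A)

print(norm_inf_to_1(A), norm_inf_to_1(At))      # p = inf, q = 1
print(norm_1_to_1(A), norm_inf_to_inf(At))      # p = 1,   q = 1
```

Both printed pairs should match, even though the matrix is neither square nor symmetric.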

Operators on a Hilbert space

Suppose ${\displaystyle H}$ is a real or complex Hilbert space. If ${\displaystyle A:H\to H}$ is a bounded linear operator, then we have

${\displaystyle \|A\|_{op}=\left\|A^{*}\right\|_{op}}$

and

${\displaystyle \left\|A^{*}A\right\|_{op}=\|A\|_{op}^{2},}$

where ${\displaystyle A^{*}}$ denotes the adjoint operator of ${\displaystyle A}$ (which in Euclidean spaces with the standard inner product corresponds to the conjugate transpose of the matrix ${\displaystyle A}$).
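In the finite-dimensional case both identities can be verified numerically. A minimal sketch, assuming NumPy and a randomly generated complex matrix (the seed and the size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A_star = A.conj().T                      # adjoint = conjugate transpose

norm_A = np.linalg.norm(A, 2)            # largest singular value of A
norm_A_star = np.linalg.norm(A_star, 2)
norm_gram = np.linalg.norm(A_star @ A, 2)

print(norm_A, norm_A_star)               # equal up to rounding
print(norm_gram, norm_A ** 2)            # equal up to rounding
```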

In general, the spectral radius of ${\displaystyle A}$ is bounded above by the operator norm of ${\displaystyle A}$:

${\displaystyle \rho (A)\leq \|A\|_{op}.}$

To see why equality may not always hold, consider the Jordan canonical form of a matrix in the finite-dimensional case. Because there are non-zero entries on the superdiagonal, equality may be violated. Quasinilpotent operators form one class of such examples. A nonzero quasinilpotent operator ${\displaystyle A}$ has spectrum ${\displaystyle \{0\}.}$ So ${\displaystyle \rho (A)=0}$ while ${\displaystyle \|A\|_{op}>0.}$
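The smallest example of strict inequality is a 2-by-2 nilpotent Jordan block. The following sketch, assuming NumPy, computes both quantities for it:

```python
import numpy as np

# Nilpotent Jordan block: J @ J is the zero matrix, so the only eigenvalue is 0.
J = np.array([[0.0, 1.0],
              [0.0, 0.0]])

spectral_radius = np.max(np.abs(np.linalg.eigvals(J)))
operator_norm = np.linalg.norm(J, 2)     # largest singular value

print(spectral_radius, operator_norm)
```

Here the spectral radius is 0 while the operator norm is 1, because the block maps the second standard basis vector to the first without shrinking it.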

However, when a matrix ${\displaystyle N}$ is normal, its Jordan canonical form is diagonal (up to unitary equivalence); this is the spectral theorem. In that case it is easy to see that

${\displaystyle \rho (N)=\|N\|_{op}.}$

This formula can sometimes be used to compute the operator norm of a given bounded operator ${\displaystyle A}$: define the Hermitian operator ${\displaystyle B=A^{*}A,}$ determine its spectral radius, and take the square root to obtain the operator norm of ${\displaystyle A.}$
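Both statements can be illustrated with small matrices. In this sketch, which assumes NumPy, `N` is a normal matrix chosen for illustration and `A` is a non-normal one; `spectral_radius` is a helper defined here, not a library function.

```python
import numpy as np

def spectral_radius(M):
    """Largest absolute value of an eigenvalue of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

# A normal matrix (it commutes with its conjugate transpose).
N = np.array([[0.0, -2.0],
              [2.0, 0.0]])
is_normal = np.allclose(N @ N.conj().T, N.conj().T @ N)
print(is_normal, spectral_radius(N), np.linalg.norm(N, 2))

# A non-normal matrix: compute its norm through the Hermitian matrix B = A^* A.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = A.conj().T @ A
norm_via_B = np.sqrt(spectral_radius(B))
print(spectral_radius(A), norm_via_B, np.linalg.norm(A, 2))
```

For `N` the spectral radius and the operator norm coincide. For `A` the spectral radius is strictly smaller than the norm, yet the recipe through `B` still recovers the norm.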

The space of bounded operators on ${\displaystyle H,}$ with the topology induced by operator norm, is not separable. For example, consider the Lp space ${\displaystyle L^{2}[0,1],}$ which is a Hilbert space. For ${\displaystyle 0<t\leq 1,}$ let ${\displaystyle \Omega _{t}}$ be the characteristic function of ${\displaystyle [0,t],}$ and ${\displaystyle P_{t}}$ be the multiplication operator given by ${\displaystyle \Omega _{t},}$ that is,

${\displaystyle P_{t}(f)=f\cdot \Omega _{t}.}$

Then each ${\displaystyle P_{t}}$ is a bounded operator with operator norm 1 and

${\displaystyle \left\|P_{t}-P_{s}\right\|_{op}=1\quad {\mbox{ for all }}\quad t\neq s.}$

But ${\displaystyle \{P_{t}:0<t\leq 1\}}$ is an uncountable set. This implies the space of bounded operators on ${\displaystyle L^{2}([0,1])}$ is not separable, in operator norm. One can compare this with the fact that the sequence space ${\displaystyle \ell ^{\infty }}$ is not separable.
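A crude discretization gives some intuition for why distinct operators of this family stay at distance 1. In the sketch below the unit interval is split into `n` cells and each operator is replaced by a diagonal matrix of zeros and ones; the grid size, the sampled values of `t`, and the helper names are assumptions of this illustration. It only distinguishes values of `t` that are separated by at least one grid midpoint.

```python
def indicator_diagonal(t, n):
    """Diagonal of the discretized P_t: 1 where the cell midpoint lies in [0, t]."""
    return [1.0 if (k + 0.5) / n <= t else 0.0 for k in range(n)]

def diagonal_operator_norm(d):
    """Euclidean operator norm of a diagonal matrix: the largest absolute entry."""
    return max(abs(x) for x in d)

n = 1000
ts = [0.1 * k for k in range(1, 11)]     # ten sample values of t in (0, 1]

distances = {}
for s in ts:
    for t in ts:
        if s != t:
            diff = [a - b for a, b in zip(indicator_diagonal(s, n),
                                          indicator_diagonal(t, n))]
            distances[(s, t)] = diagonal_operator_norm(diff)

print(set(distances.values()))
```

Every pairwise distance equals 1: the difference of two such operators is again multiplication by a zero-one function that is not identically zero.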

The associative algebra of all bounded operators on a Hilbert space, together with the operator norm and the adjoint operation, yields a C*-algebra.

Notes

1. Kreyszig, Erwin (1978), Introductory functional analysis with applications, John Wiley & Sons, p. 97, ISBN 9971-51-381-1
2. See e.g. Lemma 6.2 of Aliprantis & Border (2007).
3. Weisstein, Eric W. "Operator Norm". mathworld.wolfram.com. Retrieved 2020-03-14.
4. Rudin 1991, pp. 92-115.
5. Section 4.3.1, Joel Tropp's PhD thesis.
