Classical capacity

Last updated

In quantum information theory, the classical capacity of a quantum channel is the maximum rate at which classical data can be sent over it error-free in the limit of many uses of the channel. Holevo, Schumacher, and Westmoreland proved the following least upper bound on the classical capacity of any quantum channel :

Contents

where is a classical-quantum state of the following form:

is a probability distribution, and each is a density operator that can be input to the channel .

Achievability using sequential decoding

We briefly review the HSW coding theorem (the statement of the achievability of the Holevo information rate for communicating classical data over a quantum channel). We first review the minimal amount of quantum mechanics needed for the theorem. We then cover quantum typicality, and finally we prove the theorem using a recent sequential decoding technique.

Review of quantum mechanics

In order to prove the HSW coding theorem, we really just need a few basic things from quantum mechanics. First, a quantum state is a unit trace, positive operator known as a density operator. Usually, we denote it by , , , etc. The simplest model for a quantum channel is known as a classical-quantum channel:

The meaning of the above notation is that inputting the classical letter at the transmitting end leads to a quantum state at the receiving end. It is the task of the receiver to perform a measurement to determine the input of the sender. If it is true that the states are perfectly distinguishable from one another (i.e., if they have orthogonal supports such that for ), then the channel is a noiseless channel. We are interested in situations for which this is not the case. If it is true that the states all commute with one another, then this is effectively identical to the situation for a classical channel, so we are also not interested in these situations. So, the situation in which we are interested is that in which the states have overlapping support and are non-commutative.

The most general way to describe a quantum measurement is with a positive operator-valued measure (POVM). We usually denote the elements of a POVM as . These operators should satisfy positivity and completeness in order to form a valid POVM:

The probabilistic interpretation of quantum mechanics states that if someone measures a quantum state using a measurement device corresponding to the POVM , then the probability for obtaining outcome is equal to

and the post-measurement state is

if the person measuring obtains outcome . These rules are sufficient for us to consider classical communication schemes over cq channels.

Quantum typicality

The reader can find a good review of this topic in the article about the typical subspace.

Gentle operator lemma

The following lemma is important for our proofs. It demonstrates that a measurement that succeeds with high probability on average does not disturb the state too much on average:

Lemma: [Winter] Given an ensemble with expected density operator , suppose that an operator such that succeeds with high probability on the state :

Then the subnormalized state is close in expected trace distance to the original state :

(Note that is the nuclear norm of the operator so that Tr.)

The following inequality is useful for us as well. It holds for any operators , , such that :

 

 

 

 

(1)

The quantum information-theoretic interpretation of the above inequality is that the probability of obtaining outcome from a quantum measurement acting on the state is upper bounded by the probability of obtaining outcome on the state summed with the distinguishability of the two states and .

Non-commutative union bound

Lemma: [Sen's bound] The following bound holds for a subnormalized state such that and with , ... , being projectors:

We can think of Sen's bound as a "non-commutative union bound" because it is analogous to the following union bound from probability theory:

where are events. The analogous bound for projector logic would be

if we think of as a projector onto the intersection of subspaces. Though, the above bound only holds if the projectors , ..., are commuting (choosing , , and gives a counterexample). If the projectors are non-commuting, then Sen's bound is the next best thing and suffices for our purposes here.

HSW theorem with the non-commutative union bound

We now prove the HSW theorem with Sen's non-commutative union bound. We divide up the proof into a few parts: codebook generation, POVM construction, and error analysis.

Codebook Generation. We first describe how Alice and Bob agree on a random choice of code. They have the channel and a distribution . They choose classical sequences according to the IID\ distribution . After selecting them, they label them with indices as . This leads to the following quantum codewords:

The quantum codebook is then . The average state of the codebook is then

 

 

 

 

(2)

where .

POVM Construction . Sens' bound from the above lemma suggests a method for Bob to decode a state that Alice transmits. Bob should first ask "Is the received state in the average typical subspace?" He can do this operationally by performing a typical subspace measurement corresponding to . Next, he asks in sequential order, "Is the received codeword in the conditionally typical subspace?" This is in some sense equivalent to the question, "Is the received codeword the transmitted codeword?" He can ask these questions operationally by performing the measurements corresponding to the conditionally typical projectors .

Why should this sequential decoding scheme work well? The reason is that the transmitted codeword lies in the typical subspace on average:

where the inequality follows from (\ref{eq:1st-typ-prop}). Also, the projectors are "good detectors" for the states (on average) because the following condition holds from conditional quantum typicality:

Error Analysis. The probability of detecting the codeword correctly under our sequential decoding scheme is equal to

where we make the abbreviation . (Observe that we project into the average typical subspace just once.) Thus, the probability of an incorrect detection for the codeword is given by

and the average error probability of this scheme is equal to

Instead of analyzing the average error probability, we analyze the expectation of the average error probability, where the expectation is with respect to the random choice of code:

 

 

 

 

(3)

Our first step is to apply Sen's bound to the above quantity. But before doing so, we should rewrite the above expression just slightly, by observing that

Substituting into ( 3 ) (and forgetting about the small term for now) gives an upper bound of

We then apply Sen's bound to this expression with and the sequential projectors as , , ..., . This gives the upper bound Due to concavity of the square root, we can bound this expression from above by

where the second bound follows by summing over all of the codewords not equal to the codeword (this sum can only be larger).

We now focus exclusively on showing that the term inside the square root can be made small. Consider the first term:

where the first inequality follows from ( 1 ) and the second inequality follows from the gentle operator lemma and the properties of unconditional and conditional typicality. Consider now the second term and the following chain of inequalities:

The first equality follows because the codewords and are independent since they are different. The second equality follows from ( 2 ). The first inequality follows from (\ref{eq:3rd-typ-prop}). Continuing, we have

The first inequality follows from and exchanging the trace with the expectation. The second inequality follows from (\ref{eq:2nd-cond-typ}). The next two are straightforward.

Putting everything together, we get our final bound on the expectation of the average error probability:

Thus, as long as we choose , there exists a code with vanishing error probability.

See also

Related Research Articles

In mathematics, a self-adjoint operator on an infinite-dimensional complex vector space V with inner product is a linear map A that is its own adjoint. If V is finite-dimensional with a given orthonormal basis, this is equivalent to the condition that the matrix of A is a Hermitian matrix, i.e., equal to its conjugate transpose A. By the finite-dimensional spectral theorem, V has an orthonormal basis such that the matrix of A relative to this basis is a diagonal matrix with entries in the real numbers. In this article, we consider generalizations of this concept to operators on Hilbert spaces of arbitrary dimension.

Spherical harmonics Special mathematical functions defined on the surface of a sphere

In mathematics and physical science, spherical harmonics are special functions defined on the surface of a sphere. They are often employed in solving partial differential equations in many scientific fields.

In vector calculus, Green's theorem relates a line integral around a simple closed curve C to a double integral over the plane region D bounded by C. It is the two-dimensional special case of Stokes' theorem.

In mathematics, the Poisson summation formula is an equation that relates the Fourier series coefficients of the periodic summation of a function to values of the function's continuous Fourier transform. Consequently, the periodic summation of a function is completely defined by discrete samples of the original function's Fourier transform. And conversely, the periodic summation of a function's Fourier transform is completely defined by discrete samples of the original function. The Poisson summation formula was discovered by Siméon Denis Poisson and is sometimes called Poisson resummation.

In quantum mechanics, the canonical commutation relation is the fundamental relation between canonical conjugate quantities. For example,

In mathematics, the von Mangoldt function is an arithmetic function named after German mathematician Hans von Mangoldt. It is an example of an important arithmetic function that is neither multiplicative nor additive.

In mathematics, the explicit formulae for L-functions are relations between sums over the complex number zeroes of an L-function and sums over prime powers, introduced by Riemann (1859) for the Riemann zeta function. Such explicit formulae have been applied also to questions on bounding the discriminant of an algebraic number field, and the conductor of a number field.

In mathematics, a Fredholm kernel is a certain type of a kernel on a Banach space, associated with nuclear operators on the Banach space. They are an abstraction of the idea of the Fredholm integral equation and the Fredholm operator, and are one of the objects of study in Fredholm theory. Fredholm kernels are named in honour of Erik Ivar Fredholm. Much of the abstract theory of Fredholm kernels was developed by Alexander Grothendieck and published in 1955.

Charge density Electric charge per unit length, area or volume

In electromagnetism, charge density is the amount of electric charge per unit length, surface area, or volume. Volume charge density is the quantity of charge per unit volume, measured in the SI system in coulombs per cubic meter (C⋅m−3), at any point in a volume. Surface charge density (σ) is the quantity of charge per unit area, measured in coulombs per square meter (C⋅m−2), at any point on a surface charge distribution on a two dimensional surface. Linear charge density (λ) is the quantity of charge per unit length, measured in coulombs per meter (C⋅m−1), at any point on a line charge distribution. Charge density can be either positive or negative, since electric charge can be either positive or negative.

In mathematics, the Weyl character formula in representation theory describes the characters of irreducible representations of compact Lie groups in terms of their highest weights. It was proved by Hermann Weyl. There is a closely related formula for the character of an irreducible representation of a semisimple Lie algebra. In Weyl's approach to the representation theory of connected compact Lie groups, the proof of the character formula is a key step in proving that every dominant integral element actually arises as the highest weight of some irreducible representation. Important consequences of the character formula are the Weyl dimension formula and the Kostant multiplicity formula.

Chebyshev function

In mathematics, the Chebyshev function is either of two related functions. The first Chebyshev functionϑ(x) or θ(x) is given by

Newman–Penrose formalism Notation in general relativity

The Newman–Penrose (NP) formalism is a set of notation developed by Ezra T. Newman and Roger Penrose for general relativity (GR). Their notation is an effort to treat general relativity in terms of spinor notation, which introduces complex forms of the usual variables used in GR. The NP formalism is itself a special case of the tetrad formalism, where the tensors of the theory are projected onto a complete vector basis at each point in spacetime. Usually this vector basis is chosen to reflect some symmetry of the spacetime, leading to simplified expressions for physical observables. In the case of the NP formalism, the vector basis chosen is a null tetrad: a set of four null vectors—two real, and a complex-conjugate pair. The two real members asymptotically point radially inward and radially outward, and the formalism is well adapted to treatment of the propagation of radiation in curved spacetime. The Weyl scalars, derived from the Weyl tensor, are often used. In particular, it can be shown that one of these scalars— in the appropriate frame—encodes the outgoing gravitational radiation of an asymptotically flat system.

A ratio distribution is a probability distribution constructed as the distribution of the ratio of random variables having two other known distributions. Given two random variables X and Y, the distribution of the random variable Z that is formed as the ratio Z = X/Y is a ratio distribution.

In mathematics, Reidemeister torsion is a topological invariant of manifolds introduced by Kurt Reidemeister for 3-manifolds and generalized to higher dimensions by Wolfgang Franz (1935) and Georges de Rham (1936). Analytic torsion is an invariant of Riemannian manifolds defined by Daniel B. Ray and Isadore M. Singer as an analytic analogue of Reidemeister torsion. Jeff Cheeger and Werner Müller (1978) proved Ray and Singer's conjecture that Reidemeister torsion and analytic torsion are the same for compact Riemannian manifolds.

In optics, the Fraunhofer diffraction equation is used to model the diffraction of waves when the diffraction pattern is viewed at a long distance from the diffracting object, and also when it is viewed at the focal plane of an imaging lens.

For certain applications in linear algebra, it is useful to know properties of the probability distribution of the largest eigenvalue of a finite sum of random matrices. Suppose is a finite sequence of random matrices. Analogous to the well-known Chernoff bound for sums of scalars, a bound on the following is sought for a given parameter t:

In quantum information theory, the idea of a typical subspace plays an important role in the proofs of many coding theorems. Its role is analogous to that of the typical set in classical information theory.

Stochastic portfolio theory (SPT) is a mathematical theory for analyzing stock market structure and portfolio behavior introduced by E. Robert Fernholz in 2002. It is descriptive as opposed to normative, and is consistent with the observed behavior of actual markets. Normative assumptions, which serve as a basis for earlier theories like modern portfolio theory (MPT) and the capital asset pricing model (CAPM), are absent from SPT.

Lagrangian field theory is a formalism in classical field theory. It is the field-theoretic analogue of Lagrangian mechanics. Lagrangian mechanics is used to analyze the motion of a system of discrete particles each with a finite number of degrees of freedom. Lagrangian field theory applies to continua and fields, which have an infinite number of degrees of freedom.

Causal fermion systems

The theory of causal fermion systems is an approach to describe fundamental physics. It provides a unification of the weak, the strong and the electromagnetic forces with gravity at the level of classical field theory. Moreover, it gives quantum mechanics as a limiting case and has revealed close connections to quantum field theory. Therefore, it is a candidate for a unified physical theory. Instead of introducing physical objects on a preexisting spacetime manifold, the general concept is to derive spacetime as well as all the objects therein as secondary objects from the structures of an underlying causal fermion system. This concept also makes it possible to generalize notions of differential geometry to the non-smooth setting. In particular, one can describe situations when spacetime no longer has a manifold structure on the microscopic scale. As a result, the theory of causal fermion systems is a proposal for quantum geometry and an approach to quantum gravity.

References