Semidefinite programming

Last updated January 18, 2024

Semidefinite programming (SDP) is a subfield of mathematical programming concerned with the optimization of a linear objective function (a user-specified function that the user wants to minimize or maximize) over the intersection of the cone of positive semidefinite matrices with an affine space, i.e., a spectrahedron.^[1]

Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in operations research and combinatorial optimization can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDPs are used in the context of linear matrix inequalities. SDPs are in fact a special case of cone programming and can be efficiently solved by interior point methods. All linear programs and (convex) quadratic programs can be expressed as SDPs, and via hierarchies of SDPs the solutions of polynomial optimization problems can be approximated. Semidefinite programming has been used in the optimization of complex systems. In recent years, some quantum query complexity problems have been formulated in terms of semidefinite programs.

Motivation and definition

Initial motivation

A linear programming problem is one in which we wish to maximize or minimize a linear objective function of real variables over a polytope. In semidefinite programming, we instead use real-valued vectors and are allowed to take the dot product of vectors; nonnegativity constraints on real variables in LP (linear programming) are replaced by semidefiniteness constraints on matrix variables in SDP (semidefinite programming). Specifically, a general semidefinite programming problem can be defined as any mathematical programming problem of the form

{\begin{array}{rl}{\displaystyle \min _{x^{1},\ldots ,x^{n}\in \mathbb {R} ^{n}}}&{\displaystyle \sum _{i,j\in [n]}c_{i,j}(x^{i}\cdot x^{j})}\\{\text{subject to}}&{\displaystyle \sum _{i,j\in [n]}a_{i,j,k}(x^{i}\cdot x^{j})\leq b_{k}}{\text{  for all }}k\\\end{array}}

where the $c_{i,j},a_{i,j,k}$ , and the $b_{k}$ are real numbers and $x^{i}\cdot x^{j}$ is the dot product of $x^{i}$ and $x^{j}$ .

Equivalent formulations

An $n\times n$ matrix $M$ is said to be positive semidefinite if it is the Gram matrix of some vectors (i.e. if there exist vectors $x^{1},\ldots ,x^{n}$ such that $m_{i,j}=x^{i}\cdot x^{j}$ for all $i,j$ ). If this is the case, we denote this as $M\succeq 0$ . Note that there are several other equivalent definitions of being positive semidefinite, for example, positive semidefinite matrices are self-adjoint matrices that have only non-negative eigenvalues.

Denote by $\mathbb {S} ^{n}$ the space of all $n\times n$ real symmetric matrices. The space is equipped with the inner product (where ${\rm {trace}}$ denotes the trace):

${\displaystyle \langle A,B\rangle$

We can rewrite the mathematical program given in the previous section equivalently as

{\begin{array}{rl}{\displaystyle \min _{X\in \mathbb {S} ^{n}}}&\langle C,X\rangle \\{\text{subject to}}&\langle A_{k},X\rangle \leq b_{k},\quad k=1,\ldots ,m\\&X\succeq 0.\end{array}}

where entry $i,j$ in $C$ is given by ${\frac {c_{i,j}+c_{j,i}}{2}}$ from the previous section and $A_{k}$ is a symmetric $n\times n$ matrix having $i,j$ th entry ${\frac {a_{i,j,k}+a_{j,i,k}}{2}}$ from the previous section. Thus, the matrices $C$ and $A_{k}$ are symmetric and the above inner products are well-defined.

Note that if we add slack variables appropriately, this SDP can be converted to an equational form:

{\begin{array}{rl}{\displaystyle \min _{X\in \mathbb {S} ^{n}}}&\langle C,X\rangle \\{\text{subject to}}&\langle A_{k},X\rangle =b_{k},\quad k=1,\ldots ,m\\&X\succeq 0.\end{array}}

For convenience, an SDP may be specified in a slightly different, but equivalent form. For example, linear expressions involving nonnegative scalar variables may be added to the program specification. This remains an SDP because each variable can be incorporated into the matrix $X$ as a diagonal entry ( $X_{ii}$ for some $i$ ). To ensure that $X_{ii}\geq 0$ , constraints $X_{ij}=0$ can be added for all $j\neq i$ . As another example, note that for any positive semidefinite matrix $X$ , there exists a set of vectors $\{v_{i}\}$ such that the $i$ , $j$ entry of $X$ is $X_{ij}=(v_{i},v_{j})$ the scalar product of $v_{i}$ and $v_{j}$ . Therefore, SDPs are often formulated in terms of linear expressions on scalar products of vectors. Given the solution to the SDP in the standard form, the vectors $\{v_{i}\}$ can be recovered in $O(n^{3})$ time (e.g., by using an incomplete Cholesky decomposition of X).

Relations to other optimization problems

The space of semidefinite matrices is a convex cone. Therefore, SDP is a special case of conic optimization, which is a special case of convex optimization.

When the matrix C is diagonal, the inner products <C,X> is equivalent to a vector product of the diagonal of C and the diagonal of X. Analogously, when the matrices A_k are diagonal, the corresponding inner products are equivalent to vector products. In these vector products, only the diagonal elements of X are used, so we can add constraints equating the non-diagonal elements of X to 0. The condition $X\succeq 0$ is then equivalent to the condition that all diagonal elements of X are non-negative. Then, the resulting SDP becomes a linear program in which the variables are the diagonal elements of X.

Duality theory

Definitions

Analogously to linear programming, given a general SDP of the form

{\begin{array}{rl}{\displaystyle \min _{X\in \mathbb {S} ^{n}}}&\langle C,X\rangle \\{\text{subject to}}&\langle A_{i},X\rangle =b_{i},\quad i=1,\ldots ,m\\&X\succeq 0\end{array}}

(the primal problem or P-SDP), we define the dual semidefinite program (D-SDP) as

{\begin{array}{rl}{\displaystyle \max _{y\in \mathbb {R} ^{m}}}&b^{T}y\\{\text{subject to}}&{\displaystyle \sum _{i=1}^{m}}y_{i}A_{i}\preceq C\end{array}}

where for any two matrices $P$ and $Q$ , $P\succeq Q$ means $P-Q\succeq 0$ .

Weak duality

The weak duality theorem states that the value of the primal SDP is at least the value of the dual SDP. Therefore, any feasible solution to the dual SDP lower-bounds the primal SDP value, and conversely, any feasible solution to the primal SDP upper-bounds the dual SDP value. This is because

\langle C,X\rangle -b^{T}y=\langle C,X\rangle -\sum _{i=1}^{m}y_{i}b_{i}=\langle C,X\rangle -\sum _{i=1}^{m}y_{i}\langle A_{i},X\rangle =\langle C-\sum _{i=1}^{m}y_{i}A_{i},X\rangle \geq 0,

where the last inequality is because both matrices are positive semidefinite, and the result of this function is sometimes referred to as duality gap.

Strong duality

When the value of the primal and dual SDPs are equal, the SDP is said to satisfy the strong duality property. Unlike linear programs, where every dual linear program has optimal objective equal to the primal objective, not every SDP satisfies strong duality; in general, the value of the dual SDP may lie strictly below the value of the primal, and the P-SDP and D-SDP satisfy the following properties:

(i) Suppose the primal problem (P-SDP) is bounded below and strictly feasible (i.e., there exists $X_{0}\in \mathbb {S} ^{n},X_{0}\succ 0$ such that $\langle A_{i},X_{0}\rangle =b_{i}$ , $i=1,\ldots ,m$ ). Then there is an optimal solution $y^{*}$ to (D-SDP) and

\langle C,X^{*}\rangle =b^{T}y^{*}.

(ii) Suppose the dual problem (D-SDP) is bounded above and strictly feasible (i.e., $\sum _{i=1}^{m}(y_{0})_{i}A_{i}\prec C$ for some $y_{0}\in \mathbb {R} ^{m}$ ). Then there is an optimal solution $X^{*}$ to (P-SDP) and the equality from (i) holds.

A sufficient condition for strong duality to hold for a SDP problem (and in general, for any convex optimization problem) is the Slater's condition. It is also possible to attain strong duality for SDPs without additional regularity conditions by using an extended dual problem proposed by Ramana.^[2]^[3]

Examples

Example 1

Consider three random variables $A$ , $B$ , and $C$ . A given set of correlation coefficients $\rho _{AB},\ \rho _{AC},\rho _{BC}$ are possible if and only if

{\begin{pmatrix}1&\rho _{AB}&\rho _{AC}\\\rho _{AB}&1&\rho _{BC}\\\rho _{AC}&\rho _{BC}&1\end{pmatrix}}\succeq 0.

This matrix is called the correlation matrix. Suppose that we know from some prior knowledge (empirical results of an experiment, for example) that $-0.2\leq \rho _{AB}\leq -0.1$ and $0.4\leq \rho _{BC}\leq 0.5$ . The problem of determining the smallest and largest values that $\rho _{AC}\$ can take is given by:

{\begin{array}{rl}{\displaystyle \min /\max }&x_{13}\\{\text{subject to}}&-0.2\leq x_{12}\leq -0.1\\&0.4\leq x_{23}\leq 0.5\\&{\begin{pmatrix}1&x_{12}&x_{13}\\x_{12}&1&x_{23}\\x_{13}&x_{23}&1\end{pmatrix}}\succeq 0\end{array}}

We set $\rho _{AB}=x_{12},\ \rho _{AC}=x_{13},\ \rho _{BC}=x_{23}$ to obtain the answer. This can be formulated by an SDP. We handle the inequality constraints by augmenting the variable matrix and introducing slack variables, for example

$\mathrm {tr} \left(\left({\begin{array}{cccccc}0&1&0&0&0&0\\0&0&0&0&0&0\\0&0&0&0&0&0\\0&0&0&1&0&0\\0&0&0&0&0&0\\0&0&0&0&0&0\end{array}}\right)\cdot \left({\begin{array}{cccccc}1&x_{12}&x_{13}&0&0&0\\x_{12}&1&x_{23}&0&0&0\\x_{13}&x_{23}&1&0&0&0\\0&0&0&s_{1}&0&0\\0&0&0&0&s_{2}&0\\0&0&0&0&0&s_{3}\end{array}}\right)\right)=x_{12}+s_{1}=-0.1$

Solving this SDP gives the minimum and maximum values of $\rho _{AC}=x_{13}\$ as $-0.978$ and $0.872$ respectively.

Example 2

Consider the problem

minimize

{\frac {(c^{T}x)^{2}}{d^{T}x}}

subject to

Ax+b\geq 0

where we assume that $d^{T}x>0$ whenever $Ax+b\geq 0$ .

Introducing an auxiliary variable $t$ the problem can be reformulated:

minimize

t

subject to

Ax+b\geq 0,\,{\frac {(c^{T}x)^{2}}{d^{T}x}}\leq t

In this formulation, the objective is a linear function of the variables $x,t$ .

The first restriction can be written as

{\textbf {diag}}(Ax+b)\geq 0

where the matrix ${\textbf {diag}}(Ax+b)$ is the square matrix with values in the diagonal equal to the elements of the vector $Ax+b$ .

The second restriction can be written as

td^{T}x-(c^{T}x)^{2}\geq 0

Defining $D$ as follows

D=\left[{\begin{array}{cc}t&c^{T}x\\c^{T}x&d^{T}x\end{array}}\right]

We can use the theory of Schur Complements to see that

D\succeq 0

(Boyd and Vandenberghe, 1996)

The semidefinite program associated with this problem is

minimize

t

subject to

\left[{\begin{array}{ccc}{\textbf {diag}}(Ax+b)&0&0\\0&t&c^{T}x\\0&c^{T}x&d^{T}x\end{array}}\right]\succeq 0

Example 3 (Goemans–Williamson max cut approximation algorithm)

Semidefinite programs are important tools for developing approximation algorithms for NP-hard maximization problems. The first approximation algorithm based on an SDP is due to Michel Goemans and David P. Williamson (JACM, 1995).^[1]^: Chap.1 They studied the max cut problem: Given a graph G = (V, E), output a partition of the vertices V so as to maximize the number of edges crossing from one side to the other. This problem can be expressed as an integer quadratic program:

Maximize

\sum _{(i,j)\in E}{\frac {1-v_{i}v_{j}}{2}},

such that each

v_{i}\in \{1,-1\}

.

Unless P = NP, we cannot solve this maximization problem efficiently. However, Goemans and Williamson observed a general three-step procedure for attacking this sort of problem:

Relax the integer quadratic program into an SDP.
Solve the SDP (to within an arbitrarily small additive error $\epsilon$ ).
Round the SDP solution to obtain an approximate solution to the original integer quadratic program.

For max cut, the most natural relaxation is

\max \sum _{(i,j)\in E}{\frac {1-\langle v_{i},v_{j}\rangle }{2}},

such that

\lVert v_{i}\rVert ^{2}=1

, where the maximization is over vectors

\{v_{i}\}

instead of integer scalars.

This is an SDP because the objective function and constraints are all linear functions of vector inner products. Solving the SDP gives a set of unit vectors in $\mathbf {R^{n}}$ ; since the vectors are not required to be collinear, the value of this relaxed program can only be higher than the value of the original quadratic integer program. Finally, a rounding procedure is needed to obtain a partition. Goemans and Williamson simply choose a uniformly random hyperplane through the origin and divide the vertices according to which side of the hyperplane the corresponding vectors lie. Straightforward analysis shows that this procedure achieves an expected approximation ratio (performance guarantee) of 0.87856 - ε. (The expected value of the cut is the sum over edges of the probability that the edge is cut, which is proportional to the angle $\cos ^{-1}\langle v_{i},v_{j}\rangle$ between the vectors at the endpoints of the edge over $\pi$ . Comparing this probability to $(1-\langle v_{i},v_{j}\rangle )/{2}$ , in expectation the ratio is always at least 0.87856.) Assuming the unique games conjecture, it can be shown that this approximation ratio is essentially optimal.

Since the original paper of Goemans and Williamson, SDPs have been applied to develop numerous approximation algorithms. Recently, Prasad Raghavendra has developed a general framework for constraint satisfaction problems based on the unique games conjecture.^[4]

Other applications

Semidefinite programming has been applied to find approximate solutions to combinatorial optimization problems, such as the solution of the max cut problem with an approximation ratio of 0.87856. SDPs are also used in geometry to determine tensegrity graphs, and arise in control theory as LMIs, and in inverse elliptic coefficient problems as convex, non-linear, semidefiniteness constraints.^[5] It is also widely used in physics to constrain conformal field theories with the conformal bootstrap.^[6]

Run-time complexity

The semidefinite feasibility problem (SDF) is the following decision problem: given an SDP, decide whether it has at least one feasible solution. The exact run-time complexity of this problem is unknown (as of 1997). However, Ramana proved the following:^[2]

In the Turing machine model, SDF is in NP iff it is in co-NP. Therefore, SDF is not NP-complete unless NP=coNP.
In the Blum–Shub–Smale machine model, SDF is in the intersection of NP and co-NP.

Approximation algorithms

There are several types of algorithms for approximately solving SDPs. These algorithms output the value of the SDP up to an additive error $\epsilon$ in time that is polynomial in the program description size and $\log(1/\epsilon )$ .

Ellipsoid method

The ellipsoid method is a general method for convex programming, and can be used in particular to solve SDPs. In the context of SDPs, the ellipsoid method provides the following guarantee.^[1]^{: Thm.2.6.1}Consider an SDP in the following equational form:

${\begin{array}{rl}{\displaystyle \max _{X\in \mathbb {S} ^{n}}}&\langle C,X\rangle \\{\text{subject to}}&\langle A_{k},X\rangle =b_{k},\quad k=1,\ldots ,m\\&X\succeq 0.\end{array}}$

Let L be the affine subspace of matrices in Sⁿ satisfying the m equational constraints; so the SDP can be written as: $\max _{X\in L}\langle C,X\rangle {\text{ subject to }}X\succeq 0$ . Suppose all coefficients in the SDP are rational numbers. Let R be an explicitly given upper bound on the maximum Frobenius norm of a feasible solution, and ε>0 a constant. A matrix X in Sⁿ is called ε-deep if every matrix Y in L with Frobenius distance at most ε from X satisfies the feasibility condition $Y\succeq 0$ . Denote $v_{deep}:=\sup\{\langle C,X\rangle :X{\text{ is }}\epsilon {\text{-deep}}\}$ . The ellipsoid returns one of the following outputs:

A matrix X* in L (that is, satisfying all linear equality constraints exactly), such that the Frobenius distance between X* and some feasible solution is at most ε (that is, approximately satisfying the inequality constraint $X\succeq 0$ ), and $\langle C,X^{*}\rangle \geq v_{deep}-\epsilon$ (that is, approximately optimal objective value).
A certificate that the problem has no ε-deep solutions (that is, the problem is approximately infeasible).

The run-time is polynomial in the binary encodings of the inputs and in log(R/ε), in the Turing machine model.

Note that, in general, R may be doubly-exponential in n. In that case, the run-time guarantee of the ellipsoid method is exponential in n. But in most applications, R is not so huge. In these cases, the ellipsoid method is the only known method that guarantees polynomial runtime in the Turing machine model.^[1]^: 23 But in practice, its performance is not so good.

Hazan's algorithm

Elad Hazan ^[7] has developed a simple algorithm for solving SDPs with the additional constraint that the trace of the variables matrix must be 1. Specifically, his algorithm solves problems of the form:

${\begin{array}{rl}{\displaystyle \max _{X\in \mathbb {S} ^{n}}}&\langle C,X\rangle \\{\text{subject to}}&\langle A_{k},X\rangle \leq b_{k},\quad k=1,\ldots ,m\\&\operatorname {trace} (X)=1,\\&X\succeq 0.\end{array}}$

This can be seen as a maximizing a linear objective over the spectraplex.

Let OPT be the optimal value of this problem, and ε>0 a constant. Hazan's algorithm returns one of the following:^[1]^{: Thm.5.1.1}

A matrix in the spectraplex such that $OPT-\langle C,X\rangle \leq 2\epsilon \cdot \|C\|{\text{ and }}\langle A_{i},X\rangle -bi\leq \epsilon \cdot \|A_{i}\|$ (that is, the solution is approximately optimal and satisfies the linear inequalities approximately);
A certificate that there is no feasible solution.

The run-time in the Real RAM model is, with high probability, in $O\left({\frac {N\cdot \log {n}\cdot \log {m}\cdot log(1/\epsilon )}{\epsilon ^{3}}}\right)$ , where N is input size (number of nonzero entries in the matrices), n is the number of variables, and m is the number of inequality constraints. Note that the runtime is polynomial in 1/ε rather than log(1/ε). However, in practice it runs quite fast.

Hazan's algorithm uses a sub-routine for solving the feasibility problem - deciding whether a given SDP is feasible:

${\begin{array}{rl}{\displaystyle {\text{find }}{X\in \mathbb {S} ^{n}}}&\\{\text{subject to}}&\langle A_{k},X\rangle \leq b_{k},\quad k=1,\ldots ,m\\&\operatorname {trace} (X)=1,\\&X\succeq 0.\end{array}}$

There is an algorithm that, for any given ε>0, returns one of the following:^[1]^{: Thm.5.3.2}

An ε-approximately-feasible solution: a solution X in the spectraplex, for which $\langle A_{i},X\rangle -bi\leq \epsilon$ for all i in 1,...,m.
A certificate that no feasible solution exists.

The run-time in the Real RAM model is, with high probability, in $O\left({\frac {N\cdot \log {n}\cdot \log {m}\cdot }{\epsilon ^{3}}}\right)$ . The algorithm is applied in a binary search style to find an approximately-optimal solution.

Interior point methods

Most codes are based on interior point methods (CSDP, MOSEK, SeDuMi, SDPT3, DSDP, SDPA). These are robust and efficient for general linear SDP problems, but restricted by the fact that the algorithms are second-order methods and need to store and factorize a large (and often dense) matrix. Theoretically, the state-of-the-art high-accuracy SDP algorithms^[8]^[9] are based on this approach.

First-order methods

First-order methods for conic optimization avoid computing, storing and factorizing a large Hessian matrix and scale to much larger problems than interior point methods, at some cost in accuracy. A first-order method is implemented in the Splitting Cone Solver (SCS).^[10] Another first-order method is the alternating direction method of multipliers (ADMM).^[11] This method requires in every step projection on the cone of semidefinite matrices.

Bundle method

The code ConicBundle formulates the SDP problem as a nonsmooth optimization problem and solves it by the Spectral Bundle method of nonsmooth optimization. This approach is very efficient for a special class of linear SDP problems.

Other solving methods

Algorithms based on Augmented Lagrangian method (PENSDP) are similar in behavior to the interior point methods and can be specialized to some very large scale problems. Other algorithms use low-rank information and reformulation of the SDP as a nonlinear programming problem (SDPLR, ManiSDP).^[12]

Approximate methods

Algorithms that solve SDPs approximately have been proposed as well. The main goal of such methods is to achieve lower complexity in applications where approximate solutions are sufficient and complexity must be minimal. A prominent method that has been used for data detection in multiple-input multiple-output (MIMO) wireless systems is Triangular Approximate SEmidefinite Relaxation (TASER),^[13] which operates on the Cholesky decomposition factors of the semidefinite matrix instead of the semidefinite matrix. This method calculates approximate solutions for a max-cut-like problem that are often comparable to solutions from exact solvers but in only 10-20 algorithm iterations.

Preprocessing algorithms

Facial reduction algorithms are algorithms used to preprocess SDPs problems by inspecting the constraints of the problem. These can be used to

Detect lack of strict feasibility;
Delete redundant rows and columns;
Reduce the size of the variable matrix.^[14]

Related Research Articles

Quantum decoherence is the loss of quantum coherence, the process in which a system's behaviour changes from that which can be explained by quantum mechanics to that which can be explained by classical mechanics. In quantum mechanics, particles such as electrons are described by a wave function, a mathematical representation of the quantum state of a system; a probabilistic interpretation of the wave function is used to explain various quantum effects. As long as there exists a definite phase relation between different states, the system is said to be coherent. A definite phase relationship is necessary to perform quantum computing on quantum information encoded in quantum states. Coherence is preserved under the laws of quantum physics.

In mathematics, the Hodge star operator or Hodge star is a linear map defined on the exterior algebra of a finite-dimensional oriented vector space endowed with a nondegenerate symmetric bilinear form. Applying the operator to an element of the algebra produces the Hodge dual of the element. This map was introduced by W. V. D. Hodge.

In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems with provable guarantees on the distance of the returned solution to the optimal one. Approximation algorithms naturally arise in the field of theoretical computer science as a consequence of the widely believed P ≠ NP conjecture. Under this conjecture, a wide class of optimization problems cannot be solved exactly in polynomial time. The field of approximation algorithms, therefore, tries to understand how closely it is possible to approximate optimal solutions to such problems in polynomial time. In an overwhelming majority of the cases, the guarantee of such algorithms is a multiplicative one expressed as an approximation ratio or approximation factor i.e., the optimal solution is always guaranteed to be within a (predetermined) multiplicative factor of the returned solution. However, there are also many approximation algorithms that provide an additive guarantee on the quality of the returned solution. A notable example of an approximation algorithm that provides both is the classic approximation algorithm of Lenstra, Shmoys and Tardos for scheduling on unrelated parallel machines.

In linear algebra and functional analysis, the partial trace is a generalization of the trace. Whereas the trace is a scalar valued function on operators, the partial trace is an operator-valued function. The partial trace has applications in quantum information and decoherence which is relevant for quantum measurement and thereby to the decoherent approaches to interpretations of quantum mechanics, including consistent histories and the relative state interpretation.

<span class="mw-page-title-main">Quantum tomography</span> Reconstruction of quantum states based on measurements

Quantum tomography or quantum state tomography is the process by which a quantum state is reconstructed using measurements on an ensemble of identical quantum states. The source of these states may be any device or system which prepares quantum states either consistently into quantum pure states or otherwise into general mixed states. To be able to uniquely identify the state, the measurements must be tomographically complete. That is, the measured operators must form an operator basis on the Hilbert space of the system, providing all the information about the state. Such a set of observations is sometimes called a quorum. The term tomography was first used in the quantum physics literature in a 1993 paper introducing experimental optical homodyne tomography.

The Kaczmarz method or Kaczmarz's algorithm is an iterative algorithm for solving linear equation systems $. It was first discovered by the Polish mathematician Stefan Kaczmarz, and was rediscovered in the field of image reconstruction from projections by Richard Gordon, Robert Bender, and Gabor Herman in 1970, where it is called the Algebraic Reconstruction Technique (ART). ART includes the positivity constraint, making it nonlinear.$

The Peres–Horodecki criterion is a necessary condition, for the joint density matrix $of two quantum mechanical systems and, to be separable. It is also called the PPT criterion, for positive partial transpose . In the 2\times2 and 2\times3 dimensional cases the condition is also sufficient. It is used to decide the separability of mixed states, where the Schmidt decomposition does not apply. The theorem was discovered in 1996 by Asher Peres and the Horodecki family$

In physics, the von Neumann entropy, named after John von Neumann, is an extension of the concept of Gibbs entropy from classical statistical mechanics to quantum statistical mechanics. For a quantum-mechanical system described by a density matrix $ρ$ , the von Neumann entropy is

In linear algebra, the Schmidt decomposition refers to a particular way of expressing a vector in the tensor product of two inner product spaces. It has numerous applications in quantum information theory, for example in entanglement characterization and in state purification, and plasticity.

In quantum mechanics, notably in quantum information theory, fidelity is a measure of the "closeness" of two quantum states. It expresses the probability that one state will pass a test to identify as the other. The fidelity is not a metric on the space of density matrices, but it can be used to define the Bures metric on this space.

In mathematics, the Grothendieck inequality states that there is a universal constant $with the following property. If M ij is an n \times n matrix with$

A sum-of-squares optimization program is an optimization problem with a linear cost function and a particular type of constraint on the decision variables. These constraints are of the form that when the decision variables are used as coefficients in certain polynomials, those polynomials should have the polynomial SOS property. When fixing the maximum degree of the polynomials involved, sum-of-squares optimization is also known as the Lasserre hierarchy of relaxations in semidefinite programming.

<span class="mw-page-title-main">Matrix completion</span>

Matrix completion is the task of filling in the missing entries of a partially observed matrix, which is equivalent to performing data imputation in statistics. A wide range of datasets are naturally organized in matrix form. One example is the movie-ratings matrix, as appears in the Netflix problem: Given a ratings matrix in which each entry $represents the rating of movie by customer, if customer has watched movie and is otherwise missing, we would like to predict the remaining entries in order to make good recommendations to customers on what to watch next. Another example is the document-term matrix: The frequencies of words used in a collection of documents can be represented as a matrix, where each entry corresponds to the number of times the associated term appears in the indicated document.$

Quantum refereed game in quantum information processing is a class of games in the general theory of quantum games. It is played between two players, Alice and Bob, and arbitrated by a referee. The referee outputs the pay-off for the players after interacting with them for a fixed number of rounds, while exchanging quantum information.

The min-entropy, in information theory, is the smallest of the Rényi family of entropies, corresponding to the most conservative way of measuring the unpredictability of a set of outcomes, as the negative logarithm of the probability of the most likely outcome. The various Rényi entropies are all equal for a uniform distribution, but measure the unpredictability of a nonuniform distribution in different ways. The min-entropy is never greater than the ordinary or Shannon entropy and that in turn is never greater than the Hartley or max-entropy, defined as the logarithm of the number of outcomes with nonzero probability.

The Harrow–Hassidim–Lloyd algorithm or HHL algorithm is a quantum algorithm for numerically solving a system of linear equations, designed by Aram Harrow, Avinatan Hassidim, and Seth Lloyd. The algorithm estimates the result of a scalar measurement on the solution vector to a given linear system of equations.

Regularized least squares (RLS) is a family of methods for solving the least-squares problem while using regularization to further constrain the resulting solution.

Quantum optimization algorithms are quantum algorithms that are used to solve optimization problems. Mathematical optimization deals with finding the best solution to a problem from a set of possible solutions. Mostly, the optimization problem is formulated as a minimization problem, where one tries to minimize an error which depends on the solution: the optimal solution has the minimal error. Different optimization techniques are applied in various fields such as mechanics, economics and engineering, and as the complexity and amount of data involved rise, more efficient ways of solving optimization problems are needed. Quantum computing may allow problems which are not practically feasible on classical computers to be solved, or suggest a considerable speed up with respect to the best known classical algorithm.

Batch normalization is a method used to make training of artificial neural networks faster and more stable through normalization of the layers' inputs by re-centering and re-scaling. It was proposed by Sergey Ioffe and Christian Szegedy in 2015.

In statistics, machine learning and algorithms, a tensor sketch is a type of dimensionality reduction that is particularly efficient when applied to vectors that have tensor structure. Such a sketch can be used to speed up explicit kernel methods, bilinear pooling in neural networks and is a cornerstone in many numerical linear algebra algorithms.

References

1 2 3 4 5 6 Gärtner, Bernd; Matoušek, Jiří (2012), Gärtner, Bernd; Matousek, Jiri (eds.), "Semidefinite Programming", Approximation Algorithms and Semidefinite Programming, Berlin, Heidelberg: Springer, pp. 15–25, doi:10.1007/978-3-642-22015-9_2, ISBN 978-3-642-22015-9 , retrieved 2023-12-31
1 2 Ramana, Motakuri V. (1997). "An exact duality theory for semidefinite programming and its complexity implications". Mathematical Programming. 77 (1): 129–162. doi:10.1007/BF02614433. ISSN 0025-5610. S2CID 12886462.
↑ Vandenberghe, Lieven; Boyd, Stephen (1996). "Semidefinite Programming". SIAM Review. 38 (1): 49–95. doi:10.1137/1038003. ISSN 0036-1445.
↑ Raghavendra, Prasad (2008). "Optimal algorithms and inapproximability results for every CSP?". Proceedings of the fortieth annual ACM symposium on Theory of computing. pp. 245–254. doi:10.1145/1374376.1374414. ISBN 9781605580470. S2CID 15075197.
↑ Harrach, Bastian (2021), "Solving an inverse elliptic coefficient problem by convex non-linear semidefinite programming", Optimization Letters, 16 (5): 1599–1609, arXiv: 2105.11440 , doi:10.1007/s11590-021-01802-4, S2CID 235166806
↑ Simmons-Duffin, David (2015-02-06). "A Semidefinite Program Solver for the Conformal Bootstrap". Journal of High Energy Physics. 2015 (6): 174. arXiv: 1502.02033 . Bibcode:2015JHEP...06..174S. doi:10.1007/JHEP06(2015)174. S2CID 256009551.
↑ Hazan, Elad (2008). Laber, Eduardo Sany; Bornstein, Claudson; Nogueira, Loana Tito; Faria, Luerbio (eds.). "Sparse Approximate Solutions to Semidefinite Programs". LATIN 2008: Theoretical Informatics. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer: 306–316. doi:10.1007/978-3-540-78773-0_27. ISBN 978-3-540-78773-0.
↑ Jiang, Haotian; Kathuria, Tarun; Lee, Yin Tat; Padmanabhan, Swati; Song, Zhao (November 2020). "A Faster Interior Point Method for Semidefinite Programming". 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS). Durham, NC, USA: IEEE. pp. 910–918. arXiv: 2009.10217 . doi:10.1109/FOCS46700.2020.00089. ISBN 978-1-7281-9621-3. S2CID 221836388.
↑ Huang, Baihe; Jiang, Shunhua; Song, Zhao; Tao, Runzhou; Zhang, Ruizhe (2021-11-18). "Solving SDP Faster: A Robust IPM Framework and Efficient Implementation". arXiv: 2101.08208 [math.OC].
↑ Brendan O'Donoghue, Eric Chu, Neal Parikh, Stephen Boyd, "Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding", Journal of Optimization Theory and Applications, 2016, pp 1042--1068, https://web.stanford.edu/~boyd/papers/pdf/scs.pdf.
↑ Wen, Zaiwen, Donald Goldfarb, and Wotao Yin. "Alternating direction augmented Lagrangian methods for semidefinite programming." Mathematical Programming Computation 2.3-4 (2010): 203-230.
↑ Burer, Samuel; Monteiro, Renato D. C. (2003), "A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization", Mathematical Programming, 95 (2): 329–357, CiteSeerX 10.1.1.682.1520 , doi:10.1007/s10107-002-0352-8, ISSN 1436-4646, S2CID 7691228
↑ Castañeda, O.; Goldstein, T.; Studer, C. (December 2016). "Data Detection in Large Multi-Antenna Wireless Systems via Approximate Semidefinite Relaxation". IEEE Transactions on Circuits and Systems I: Regular Papers. 63 (12): 2334–2346. arXiv: 1609.01797 . doi: 10.1109/TCSI.2016.2607198 . hdl:20.500.11850/448631. ISSN 1558-0806.
↑ Zhu, Yuzixuan; Pataki, Gábor; Tran-Dinh, Quoc (2019), "Sieve-SDP: a simple facial reduction algorithm to preprocess semidefinite programs", Mathematical Programming Computation, 11 (3): 503–586, arXiv: 1710.08954 , doi:10.1007/s12532-019-00164-4, ISSN 1867-2949, S2CID 53645581

Lieven Vandenberghe, Stephen Boyd, "Semidefinite Programming", SIAM Review 38, March 1996, pp. 49–95. pdf
Monique Laurent, Franz Rendl, "Semidefinite Programming and Integer Programming", Report PNA-R0210, CWI, Amsterdam, April 2002. optimization-online
E. de Klerk, "Aspects of Semidefinite Programming: Interior Point Algorithms and Selected Applications", Kluwer Academic Publishers, March 2002, ISBN 1-4020-0547-4.
Robert M. Freund, "Introduction to Semidefinite Programming (SDP), SDP-Introduction

External links

Links to introductions and events in the field
Lecture notes from László Lovász on Semidefinite Programming

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[:0-1] 1 2 3 4 5 6 Gärtner, Bernd; Matoušek, Jiří (2012), Gärtner, Bernd; Matousek, Jiri (eds.), "Semidefinite Programming", Approximation Algorithms and Semidefinite Programming, Berlin, Heidelberg: Springer, pp. 15–25, doi:10.1007/978-3-642-22015-9_2, ISBN 978-3-642-22015-9 , retrieved 2023-12-31

[:1-2] 1 2 Ramana, Motakuri V. (1997). "An exact duality theory for semidefinite programming and its complexity implications". Mathematical Programming. 77 (1): 129–162. doi:10.1007/BF02614433. ISSN 0025-5610. S2CID 12886462.

[3] Vandenberghe, Lieven; Boyd, Stephen (1996). "Semidefinite Programming". SIAM Review. 38 (1): 49–95. doi:10.1137/1038003. ISSN 0036-1445.

[4] Raghavendra, Prasad (2008). "Optimal algorithms and inapproximability results for every CSP?". Proceedings of the fortieth annual ACM symposium on Theory of computing. pp. 245–254. doi:10.1145/1374376.1374414. ISBN 9781605580470. S2CID 15075197.

[5] Harrach, Bastian (2021), "Solving an inverse elliptic coefficient problem by convex non-linear semidefinite programming", Optimization Letters, 16 (5): 1599–1609, arXiv: 2105.11440 , doi:10.1007/s11590-021-01802-4, S2CID 235166806

[6] Simmons-Duffin, David (2015-02-06). "A Semidefinite Program Solver for the Conformal Bootstrap". Journal of High Energy Physics. 2015 (6): 174. arXiv: 1502.02033 . Bibcode:2015JHEP...06..174S. doi:10.1007/JHEP06(2015)174. S2CID 256009551.

[7] Hazan, Elad (2008). Laber, Eduardo Sany; Bornstein, Claudson; Nogueira, Loana Tito; Faria, Luerbio (eds.). "Sparse Approximate Solutions to Semidefinite Programs". LATIN 2008: Theoretical Informatics. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer: 306–316. doi:10.1007/978-3-540-78773-0_27. ISBN 978-3-540-78773-0.

[8] Jiang, Haotian; Kathuria, Tarun; Lee, Yin Tat; Padmanabhan, Swati; Song, Zhao (November 2020). "A Faster Interior Point Method for Semidefinite Programming". 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS). Durham, NC, USA: IEEE. pp. 910–918. arXiv: 2009.10217 . doi:10.1109/FOCS46700.2020.00089. ISBN 978-1-7281-9621-3. S2CID 221836388.

[9] Huang, Baihe; Jiang, Shunhua; Song, Zhao; Tao, Runzhou; Zhang, Ruizhe (2021-11-18). "Solving SDP Faster: A Robust IPM Framework and Efficient Implementation". arXiv: 2101.08208 [math.OC].

[10] Brendan O'Donoghue, Eric Chu, Neal Parikh, Stephen Boyd, "Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding", Journal of Optimization Theory and Applications, 2016, pp 1042--1068, https://web.stanford.edu/~boyd/papers/pdf/scs.pdf.

[11] Wen, Zaiwen, Donald Goldfarb, and Wotao Yin. "Alternating direction augmented Lagrangian methods for semidefinite programming." Mathematical Programming Computation 2.3-4 (2010): 203-230.

[12] Burer, Samuel; Monteiro, Renato D. C. (2003), "A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization", Mathematical Programming, 95 (2): 329–357, CiteSeerX 10.1.1.682.1520 , doi:10.1007/s10107-002-0352-8, ISSN 1436-4646, S2CID 7691228

[13] Castañeda, O.; Goldstein, T.; Studer, C. (December 2016). "Data Detection in Large Multi-Antenna Wireless Systems via Approximate Semidefinite Relaxation". IEEE Transactions on Circuits and Systems I: Regular Papers. 63 (12): 2334–2346. arXiv: 1609.01797 . doi: 10.1109/TCSI.2016.2607198 . hdl:20.500.11850/448631. ISSN 1558-0806.

[14] Zhu, Yuzixuan; Pataki, Gábor; Tran-Dinh, Quoc (2019), "Sieve-SDP: a simple facial reduction algorithm to preprocess semidefinite programs", Mathematical Programming Computation, 11 (3): 503–586, arXiv: 1710.08954 , doi:10.1007/s12532-019-00164-4, ISSN 1867-2949, S2CID 53645581

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

Semidefinite programming

Contents

Motivation and definition

Initial motivation

Equivalent formulations

Relations to other optimization problems

Duality theory

Definitions

Weak duality

Strong duality

Examples

Example 1

Example 2

Example 3 (Goemans–Williamson max cut approximation algorithm)

Other applications

Run-time complexity

Approximation algorithms

Ellipsoid method

Hazan's algorithm

Interior point methods

First-order methods

Bundle method

Other solving methods

Approximate methods

Preprocessing algorithms

See also

Related Research Articles

References

External links