Semidefinite embedding

Maximum Variance Unfolding (MVU), also known as Semidefinite Embedding (SDE), is an algorithm in computer science that uses semidefinite programming to perform non-linear dimensionality reduction of high-dimensional vectorial input data. [1] [2] [3]

It is motivated by the observation that kernel Principal Component Analysis (kPCA) does not reduce the data dimensionality, [4] as it leverages the kernel trick to non-linearly map the original data into an inner-product space.

Algorithm

MVU creates a mapping from the high-dimensional input vectors to a low-dimensional Euclidean vector space in the following steps: [5]

  1. A neighbourhood graph is created. Each input is connected with its k-nearest input vectors (according to the Euclidean distance metric), and all k-nearest neighbours are connected with each other. If the data is sampled well enough, the resulting graph is a discrete approximation of the underlying manifold.
  2. The neighbourhood graph is "unfolded" with the help of semidefinite programming. Instead of learning the output vectors directly, the semidefinite program aims to find an inner-product matrix that maximizes the pairwise distances between any two inputs that are not connected in the neighbourhood graph, while preserving the nearest-neighbour distances.
  3. The low-dimensional embedding is finally obtained by application of multidimensional scaling on the learned inner product matrix.
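Step 1 above, constructing the neighbourhood graph, can be sketched as follows. This is a minimal illustration using NumPy, not the authors' reference implementation; the helper name `knn_graph` is our own.

```python
import numpy as np

def knn_graph(X, k):
    """Build a symmetric k-nearest-neighbour adjacency matrix.

    X : (N, D) array of input vectors.
    Returns an (N, N) boolean matrix A with A[i, j] = True iff j is
    among the k nearest neighbours of i (or vice versa), symmetrized
    so the isometry constraints apply in both directions.
    """
    N = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.zeros((N, N), dtype=bool)
    for i in range(N):
        # argsort puts i itself first (distance 0); take the next k.
        nearest = np.argsort(d2[i])[1:k + 1]
        A[i, nearest] = True
    return A | A.T  # symmetrize

# Example: 5 points on a line; connect each to its 2 nearest neighbours.
X = np.arange(5, dtype=float).reshape(-1, 1)
A = knn_graph(X, 2)
```

A fully connected neighbourhood graph is needed for the bound derived below; in practice k is chosen large enough that the graph is connected.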

The steps of applying semidefinite programming followed by a linear dimensionality reduction step to recover a low-dimensional embedding into a Euclidean space were first proposed by Linial, London, and Rabinovich. [6]

Optimization formulation

Let $x_1, \dots, x_N \in \mathbb{R}^D$ be the original input vectors and $y_1, \dots, y_N \in \mathbb{R}^d$ be the embedding. If $x_i, x_j$ are two neighbors, then the local isometry constraint that needs to be satisfied is: [7] [8] [9]

$$|x_i - x_j|^2 = |y_i - y_j|^2$$

Let $G$ and $K$ be the Gram matrices of $x_1, \dots, x_N$ and $y_1, \dots, y_N$ (i.e.: $G_{ij} = x_i \cdot x_j$ and $K_{ij} = y_i \cdot y_j$). We can express the above constraint for every pair of neighbor points $x_i, x_j$ in terms of $G$ and $K$: [10] [11]

$$G_{ii} + G_{jj} - G_{ij} - G_{ji} = K_{ii} + K_{jj} - K_{ij} - K_{ji}$$
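As a numerical sanity check (our own illustration, not taken from the cited papers): squared distances between points can be read off a Gram matrix, which is why the isometry constraint can be imposed on $K$ alone, without reference to the vectors $y_i$.

```python
import numpy as np

# Squared distances are recoverable from a Gram matrix:
#   |x_i - x_j|^2 = G_ii + G_jj - 2 G_ij.
# This is the identity behind expressing the isometry constraint
# in terms of K rather than the embedding vectors themselves.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))    # 6 arbitrary points in R^3
G = X @ X.T                        # Gram matrix, G_ij = <x_i, x_j>

i, j = 1, 4
lhs = np.sum((X[i] - X[j]) ** 2)
rhs = G[i, i] + G[j, j] - 2 * G[i, j]
```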

In addition, we also want to constrain the embedding to be centered at the origin: [12] [13] [14]

$$0 = \left|\sum_i y_i\right|^2 = \sum_{i,j} y_i \cdot y_j = \sum_{i,j} K_{ij}$$

As described above, while the distances between neighboring points are preserved, the algorithm aims to maximize the pairwise distance between every pair of points. The objective function to be maximized is: [15] [16] [17]

$$T(Y) = \frac{1}{2N} \sum_{i,j} |y_i - y_j|^2$$

Intuitively, maximizing the function above is equivalent to pulling the points as far away from each other as possible and therefore "unfolding" the manifold. The local isometry constraint prevents the objective function from diverging (going to infinity). [18]

Let $\tau = \max_{i,j} \{\eta_{ij} |x_i - x_j|^2\}$, where $\eta_{ij} = 1$ if points $i$ and $j$ are neighbors and $\eta_{ij} = 0$ otherwise.

Since the graph is connected and has $N$ points, the distance between any two points satisfies $|y_i - y_j|^2 \le N\tau$. We can then bound the objective function as follows: [19] [20]

$$T(Y) = \frac{1}{2N} \sum_{i,j} |y_i - y_j|^2 \le \frac{1}{2N} \sum_{i,j} N\tau = \frac{N^2 \tau}{2}$$

The objective function can be rewritten purely in terms of the Gram matrix: [21] [22] [23]

$$T(Y) = \frac{1}{2N} \sum_{i,j} |y_i - y_j|^2 = \frac{1}{2N} \sum_{i,j} (K_{ii} + K_{jj} - K_{ij} - K_{ji}) = \frac{1}{2N} \left( 2N\,\mathrm{Tr}(K) - 2 \sum_{i,j} K_{ij} \right) = \mathrm{Tr}(K)$$

where the last equality uses the centering constraint $\sum_{i,j} K_{ij} = 0$.
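The identity $T(Y) = \mathrm{Tr}(K)$ for centered embeddings can be checked numerically; the following is our own sketch, assuming only NumPy.

```python
import numpy as np

# For centered points (sum_i y_i = 0), the pairwise-distance objective
#   T(Y) = (1/2N) * sum_ij |y_i - y_j|^2
# equals Tr(K), where K = Y Y^T is the Gram matrix of the embedding.
rng = np.random.default_rng(1)
Y = rng.standard_normal((8, 2))
Y -= Y.mean(axis=0)                # enforce the centering constraint
K = Y @ Y.T

N = Y.shape[0]
d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
T = d2.sum() / (2 * N)
```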

Finally, the optimization can be formulated as: [24] [25] [26]

Maximize $\mathrm{Tr}(K)$ subject to $K \succeq 0$, $\sum_{i,j} K_{ij} = 0$, and $G_{ii} + G_{jj} - G_{ij} - G_{ji} = K_{ii} + K_{jj} - K_{ij} - K_{ji}$ for all $i, j$ with $\eta_{ij} = 1$.

After the Gram matrix $K$ is learned by semidefinite programming, the output $Y$ can be obtained via Cholesky decomposition.

In particular, the Gram matrix can be written as $K_{ij} = \sum_{\alpha=1}^{N} \lambda_\alpha V_{\alpha i} V_{\alpha j}$, where $V_{\alpha i}$ is the $i$-th element of the eigenvector $V_\alpha$ of the eigenvalue $\lambda_\alpha$. [27] [28]

It follows that the $\alpha$-th element of the output $y_i$ is $\sqrt{\lambda_\alpha}\, V_{\alpha i}$. [29] [30]
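The eigendecomposition step above can be sketched as follows; this is a minimal NumPy illustration under the assumption that $K$ is (numerically) positive semidefinite, and the helper name `embedding_from_gram` is our own.

```python
import numpy as np

def embedding_from_gram(K, d):
    """Recover a d-dimensional embedding Y from a PSD Gram matrix K,
    so that Y @ Y.T approximates K (exactly, if rank(K) <= d).
    Column alpha of Y is sqrt(lambda_alpha) * V_alpha.
    """
    # eigh returns eigenvalues in ascending order; take the top d.
    lam, V = np.linalg.eigh(K)
    lam, V = lam[::-1][:d], V[:, ::-1][:, :d]
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative round-off
    return V * np.sqrt(lam)

# Example: build a rank-2 Gram matrix and recover a 2-D embedding.
rng = np.random.default_rng(2)
Y0 = rng.standard_normal((7, 2))
K = Y0 @ Y0.T
Y = embedding_from_gram(K, 2)
```

The recovered `Y` reproduces `K` exactly here because `K` has rank 2; in MVU, taking the top $d$ eigenvalues of the learned Gram matrix gives the low-dimensional embedding.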

Notes

  1. Weinberger, Sha and Saul 2004a
  2. Weinberger and Saul 2004b
  3. Weinberger and Saul 2006
  4. Lawrence 2012, page 1612
  5. Weinberger, Sha and Saul 2004a, page 7
  6. Linial, London and Rabinovich 1995
  7. Weinberger, Sha and Saul 2004a, page 3, equation 8
  8. Weinberger and Saul 2004b, page 3, equation 2
  9. Weinberger and Saul 2006, page 4, equation 2
  10. Weinberger, Sha and Saul 2004a, page 3, equation 9
  11. Weinberger and Saul 2004b, page 3, equation 3
  12. Weinberger, Sha and Saul 2004a, page 3, equation 6
  13. Weinberger and Saul 2004b, page 3, equation 5
  14. Weinberger and Saul 2006, page 5, equation 8
  15. Weinberger, Sha and Saul 2004a, page 4, equation 10
  16. Weinberger and Saul 2004b, page 4, equation 6
  17. Weinberger and Saul 2006, page 5, equation 4
  18. Weinberger and Saul 2004b, page 4, equation 7
  19. Weinberger and Saul 2004b, page 4, equation 8
  20. Weinberger and Saul 2006, page 5, equation 6
  21. Weinberger, Sha and Saul 2004a, page 4, equation 11
  22. Weinberger and Saul 2004b, page 4, equation 9
  23. Weinberger and Saul 2006, page 6, equations 10 to 13
  24. Weinberger, Sha and Saul 2004a, page 4, section 3.3
  25. Weinberger and Saul 2004b, page 4, equation 9
  26. Weinberger and Saul 2006, page 6, equations 10 to 13
  27. Weinberger and Saul 2004b, page 4, equation 10
  28. Weinberger and Saul 2006, page 7, equation 14
  29. Weinberger and Saul 2004b, page 4, equation 11
  30. Weinberger and Saul 2006, page 7, equation 15

