Diffusion maps is a dimensionality reduction or feature extraction algorithm introduced by Coifman and Lafon [1] [2] [3] [4] which computes a family of embeddings of a data set into Euclidean space (often low-dimensional) whose coordinates can be computed from the eigenvectors and eigenvalues of a diffusion operator on the data. The Euclidean distance between points in the embedded space is equal to the "diffusion distance" between probability distributions centered at those points. Unlike linear dimensionality reduction methods such as principal component analysis (PCA), diffusion maps belong to the family of nonlinear dimensionality reduction methods, which focus on discovering the underlying manifold from which the data have been sampled. By integrating local similarities at different scales, diffusion maps give a global description of the data set. Compared with other methods, the diffusion map algorithm is robust to noise perturbation and computationally inexpensive.
Following [3] and [5], diffusion maps can be defined in four steps.
Diffusion maps exploit the relationship between heat diffusion and random walk Markov chains. The basic observation is that if we take a random walk on the data, walking to a nearby data point is more likely than walking to another that is far away. Let $(X, \mathcal{A}, \mu)$ be a measure space, where $X$ is the data set and $\mu$ represents the distribution of the points on $X$.
Based on this, the connectivity between two data points, $x$ and $y$, can be defined as the probability of walking from $x$ to $y$ in one step of the random walk. Usually, this probability is specified in terms of a kernel function of the two points, $k(x, y)$. For example, the popular Gaussian kernel:

$$k(x, y) = \exp\left(-\frac{\|x - y\|^2}{\epsilon}\right)$$
More generally, the kernel function has the following properties:

$k(x, y) = k(y, x)$ ($k$ is symmetric),

$k(x, y) \geq 0$ ($k$ is positivity preserving).
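As a concrete illustration, here is a minimal NumPy sketch of the Gaussian kernel above; the data matrix `X` (one point per row), the function name, and the bandwidth `epsilon` are illustrative choices, not fixed by the text:

```python
import numpy as np

def gaussian_kernel(X, epsilon):
    """Kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / epsilon).
    K is symmetric and positivity preserving by construction."""
    # Pairwise squared Euclidean distances between rows of X.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / epsilon)
```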
The kernel constitutes the prior definition of the local geometry of the data set. Since a given kernel will capture a specific feature of the data set, its choice should be guided by the application that one has in mind. This is a major difference from methods such as principal component analysis, where correlations between all data points are taken into account at once.
Given $k$, we can then construct a reversible discrete-time Markov chain on $X$ (a process known as the normalized graph Laplacian construction): set

$$d(x) = \int_X k(x, y) \, d\mu(y)$$

and define:

$$p(x, y) = \frac{k(x, y)}{d(x)}.$$
Although the new normalized kernel does not inherit the symmetric property, it does inherit the positivity-preserving property and gains a conservation property:

$$\int_X p(x, y) \, d\mu(y) = 1.$$
From $p(x, y)$ we can construct a transition matrix of a Markov chain ($M$) on $X$. In other words, $M_{i,j} = p(x_i, x_j)$ represents the one-step transition probability from $x_i$ to $x_j$, and $M^t$ gives the $t$-step transition matrix.
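In the discrete setting, this construction amounts to one line of row normalization. A sketch, assuming the kernel matrix `K` comes from a function like `gaussian_kernel` above:

```python
def transition_matrix(K):
    """Row-normalize the kernel matrix: M[i, j] = K[i, j] / d(x_i),
    where d(x_i) = sum_j K[i, j]. Each row of M sums to one, which is
    the discrete form of the conservation property."""
    d = K.sum(axis=1)
    return K / d[:, None]
```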
We define the diffusion matrix $L$ (it is also a version of the graph Laplacian matrix):

$$L_{i,j} = k(x_i, x_j).$$
We then define the new kernel

$$L^{(\alpha)}_{i,j} = k^{(\alpha)}(x_i, x_j) = \frac{L_{i,j}}{\left(d(x_i) \, d(x_j)\right)^{\alpha}},$$

or equivalently,

$$L^{(\alpha)} = D^{-\alpha} L D^{-\alpha},$$

where $D$ is a diagonal matrix and $D_{i,i} = \sum_j L_{i,j}$.
We apply the graph Laplacian normalization to this new kernel:

$$M = \left(D^{(\alpha)}\right)^{-1} L^{(\alpha)},$$

where $D^{(\alpha)}$ is a diagonal matrix and $D^{(\alpha)}_{i,i} = \sum_j L^{(\alpha)}_{i,j}$.
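A sketch of these two normalization steps in NumPy; the function name and the argument `alpha` are illustrative:

```python
import numpy as np

def alpha_normalize(L, alpha):
    """Form L^(alpha) = D^-alpha L D^-alpha, then the row-stochastic
    M = (D^(alpha))^-1 L^(alpha)."""
    d = L.sum(axis=1)                      # D[i, i] = sum_j L[i, j]
    L_alpha = L / np.outer(d, d) ** alpha  # density normalization
    d_alpha = L_alpha.sum(axis=1)          # D^(alpha) diagonal
    return L_alpha / d_alpha[:, None]      # graph Laplacian step
```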
One of the main ideas of the diffusion framework is that running the chain forward in time (taking larger and larger powers of $M$) reveals the geometric structure of $X$ at larger and larger scales (the diffusion process). Specifically, the notion of a cluster in the data set is quantified as a region in which the probability of escaping this region is low (within a certain time $t$). Therefore, $t$ not only serves as a time parameter, but also has the dual role of a scale parameter.
The eigendecomposition of the matrix $M^t$ yields

$$M^t_{i,j} = \sum_l \lambda_l^t \, \psi_l(x_i) \, \phi_l(x_j),$$

where $\{\lambda_l\}$ is the sequence of eigenvalues of $M$, and $\{\phi_l\}$ and $\{\psi_l\}$ are the biorthogonal left and right eigenvectors respectively. Due to the spectrum decay of the eigenvalues, only a few terms are necessary to achieve a given relative accuracy in this sum.
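For a finite data set this is an ordinary matrix eigendecomposition. A sketch, truncating to the $k$ largest-magnitude eigenvalues as justified by the spectral decay (the helper name and the use of `numpy.linalg.eig` are illustrative):

```python
import numpy as np

def leading_eigenpairs(M, k):
    """Return the k largest-magnitude eigenvalues of M with the matching
    right eigenvectors psi_l (as columns). M is not symmetric in general."""
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-np.abs(eigvals))[:k]
    # M is conjugate to a symmetric matrix, so its spectrum is real;
    # drop the spurious imaginary parts left by floating-point error.
    return eigvals[order].real, eigvecs[:, order].real
```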
The reason to introduce the normalization step involving $\alpha$ is to tune the influence of the data point density on the infinitesimal transition of the diffusion. In some applications, the sampling of the data is generally not related to the geometry of the manifold we are interested in describing. In this case, we can set $\alpha = 1$ and the diffusion operator approximates the Laplace–Beltrami operator. We then recover the Riemannian geometry of the data set regardless of the distribution of the points. To describe the long-term behavior of the point distribution of a system of stochastic differential equations, we can use $\alpha = 0.5$ and the resulting Markov chain approximates the Fokker–Planck diffusion. With $\alpha = 0$, it reduces to the classical graph Laplacian normalization.
The diffusion distance at time $t$ between two points measures the similarity of the two points in the observation space through their connectivity. It is given by

$$D_t(x_i, x_j)^2 = \sum_y \frac{\left(p_t(x_i, y) - p_t(x_j, y)\right)^2}{\phi_0(y)},$$

where $\phi_0(y)$ is the stationary distribution of the Markov chain, given by the first left eigenvector of $M$. Explicitly:

$$\phi_0(y) = \frac{d(y)}{\sum_{z \in X} d(z)}.$$
Intuitively, $D_t(x_i, x_j)$ is small if there is a large number of short paths connecting $x_i$ and $x_j$. There are several interesting features associated with the diffusion distance, based on our previous observation that $t$ also serves as a scale parameter: points are closer at a given scale if they are highly connected in the graph, which emphasizes the notion of a cluster; the distance is robust to noise, since it depends on all possible paths of length $t$ between the points; and, from a machine learning point of view, it takes into account all of the evidence linking $x_i$ to $x_j$, making it appropriate for the design of inference algorithms.
The diffusion distance can be calculated using the eigenvectors by

$$D_t(x_i, x_j)^2 = \sum_l \lambda_l^{2t} \left(\psi_l(x_i) - \psi_l(x_j)\right)^2.$$
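Translated directly into code, the spectral formula reads as below. This sketch assumes `eigvals` and `eigvecs` hold the eigenpairs $\lambda_l$ and $\psi_l$ with the biorthogonal normalization under which the identity holds (see [6]); the $l = 0$ term is skipped because $\psi_0$ is constant and cancels in the difference:

```python
import numpy as np

def diffusion_distance(eigvals, eigvecs, i, j, t):
    """D_t(x_i, x_j) from the spectral formula
    D_t^2 = sum_{l>=1} lambda_l^(2t) * (psi_l(x_i) - psi_l(x_j))^2."""
    diff = eigvecs[i, 1:] - eigvecs[j, 1:]
    return np.sqrt(np.sum(eigvals[1:] ** (2 * t) * diff ** 2))
```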
So the eigenvectors can be used as a new set of coordinates for the data. The diffusion map is defined as:

$$\Psi_t(x) = \left(\lambda_1^t \psi_1(x), \lambda_2^t \psi_2(x), \ldots, \lambda_k^t \psi_k(x)\right).$$

Because of the spectrum decay, it is sufficient to use only the first $k$ eigenvectors and eigenvalues. Thus we get the diffusion map from the original data to a $k$-dimensional space which is embedded in the original space.
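Given the truncated eigenpairs, the map itself is a componentwise scaling. A sketch (the function name is illustrative; the constant eigenvector $\psi_0$, with eigenvalue 1, carries no information and is dropped):

```python
def diffusion_map(eigvals, eigvecs, t, k):
    """Rows are the embedded points
    Psi_t(x_i) = (lambda_1^t psi_1(x_i), ..., lambda_k^t psi_k(x_i))."""
    return eigvals[1 : k + 1] ** t * eigvecs[:, 1 : k + 1]
```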
In [6] it is proved that

$$D_t(x_i, x_j)^2 = \left\|\Psi_t(x_i) - \Psi_t(x_j)\right\|^2,$$
so the Euclidean distance in the diffusion coordinates approximates the diffusion distance.
The basic algorithm framework of diffusion maps is as follows:
Step 1. Given the similarity matrix $L$.
Step 2. Normalize the matrix according to parameter $\alpha$: $L^{(\alpha)} = D^{-\alpha} L D^{-\alpha}$.
Step 3. Form the normalized matrix $M = (D^{(\alpha)})^{-1} L^{(\alpha)}$.
Step 4. Compute the $k$ largest eigenvalues of $M^t$ and the corresponding eigenvectors.
Step 5. Use the diffusion map $\Psi_t$ to get the embedding.
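Putting the five steps together, here is a self-contained sketch under the same assumptions as the snippets above (Gaussian kernel, NumPy, illustrative names and defaults). It eigendecomposes the symmetric conjugate of $M$, a common numerical device that is not prescribed by the text, and recovers the right eigenvectors from it:

```python
import numpy as np

def diffusion_map_embedding(X, epsilon=1.0, alpha=0.5, t=1, k=2):
    # Step 1: similarity matrix L from the Gaussian kernel.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq / epsilon)
    # Step 2: L^(alpha) = D^-alpha L D^-alpha.
    d = L.sum(axis=1)
    L_a = L / np.outer(d, d) ** alpha
    # Step 3: M = (D^(alpha))^-1 L^(alpha). Instead of forming M directly,
    # eigendecompose its symmetric conjugate S = D^1/2 M D^-1/2, which has
    # the same eigenvalues; right eigenvectors of M are psi = D^-1/2 v.
    d_a = L_a.sum(axis=1)
    S = L_a / np.sqrt(np.outer(d_a, d_a))
    # Step 4: k+1 largest eigenpairs (the extra one is the trivial pair
    # lambda_0 = 1 with constant psi_0, discarded below).
    w, v = np.linalg.eigh(S)
    idx = np.argsort(-w)[: k + 1]
    w, v = w[idx], v[:, idx]
    psi = v / np.sqrt(d_a)[:, None]
    # Step 5: embedding Psi_t(x_i) = (lambda_1^t psi_1(x_i), ...).
    return w[1:] ** t * psi[:, 1:]
```

On points sampled from a closed curve in the plane, for example, the first two coordinates of this embedding typically recover the curve's intrinsic circular parametrization.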
In [6], Nadler et al. showed how to design a kernel that reproduces the diffusion induced by a Fokker–Planck equation. They also explained that, when the data approximate a manifold, one can recover the geometry of this manifold by computing an approximation of the Laplace–Beltrami operator. This computation is completely insensitive to the distribution of the points and therefore provides a separation of the statistics and the geometry of the data. Since diffusion maps give a global description of the data set, they can measure the distances between pairs of sample points in the manifold in which the data is embedded. Applications based on diffusion maps include face recognition, [7] spectral clustering, low-dimensional representation of images, image segmentation, [8] 3D model segmentation, [9] speaker verification [10] and identification, [11] sampling on manifolds, anomaly detection, [12] [13] image inpainting, [14] revealing the organization of brain resting-state networks, [15] and so on.
Furthermore, the diffusion maps framework has been productively extended to complex networks, [16] revealing a functional organisation of networks which differs from the purely topological or structural one.