Self-similarity matrix


In data analysis, the self-similarity matrix is a graphical representation of similar sequences in a data series.


Similarity can be measured in different ways, for example by spatial distance (distance matrix), correlation, or comparison of local histograms or spectral properties (e.g. IXEGRAM[1]). The technique is also applied to search for a given pattern in a long data series, as in gene matching.[citation needed] A similarity plot can be the starting point for dot plots or recurrence plots.

Definition

To construct a self-similarity matrix, one first transforms a data series into an ordered sequence of feature vectors $V = (v_1, v_2, \ldots, v_n)$, where each vector $v_i$ describes the relevant features of the data series in a given local interval. The self-similarity matrix is then formed by computing the similarity of all pairs of feature vectors:

$$S(j, k) = s(v_j, v_k), \qquad j, k \in \{1, \ldots, n\},$$

where $s(v_j, v_k)$ is a function measuring the similarity of the two vectors, for instance the inner product $s(v_j, v_k) = \langle v_j, v_k \rangle$. Similar segments of feature vectors then show up as paths of high similarity along diagonals of the matrix.[2] Similarity plots are used for action recognition that is invariant to the point of view[3] and for audio segmentation using spectral clustering of the self-similarity matrix.[4]
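The following Python sketch is a minimal illustration, not taken from the cited works: it builds a self-similarity matrix using cosine similarity (the inner product of L2-normalised vectors) as $s$, and the toy feature sequence is an assumption made for the example.

```python
import numpy as np

def self_similarity_matrix(features):
    """S[j, k] = s(v_j, v_k) for a sequence of feature vectors.

    `features` is an (n, d) array whose rows are the feature vectors
    v_1, ..., v_n; here s is cosine similarity (inner product of
    L2-normalised vectors).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    v = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return v @ v.T                           # S[j, k] = <v_j, v_k>

# Toy example: a short "data series" of 2-D feature vectors.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 2))
S = self_similarity_matrix(features)
print(S.shape)               # (8, 8)
print(np.allclose(S, S.T))   # True: the matrix is symmetric
```

With cosine similarity the diagonal is identically 1, and runs of similar feature vectors appear as stripes of high values parallel to the main diagonal.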

Example

Figure: Similarity plots, a variant of recurrence plots, obtained for different views of human actions are shown to produce similar patterns (Junejo et al., ECCV 2008).


Related Research Articles

<span class="mw-page-title-main">Tensor</span> Algebraic object with geometric applications

In mathematics, a tensor is an algebraic object that describes a multilinear relationship between sets of algebraic objects related to a vector space. Tensors may map between different objects such as vectors, scalars, and even other tensors. There are many types of tensors, including scalars and vectors, dual vectors, multilinear maps between vector spaces, and even some operations such as the dot product. Tensors are defined independent of any basis, although they are often referred to by their components in a basis related to a particular coordinate system; those components form an array, which can be thought of as a high-dimensional matrix.

<span class="mw-page-title-main">Support vector machine</span> Set of methods for supervised statistical learning

In machine learning, support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vladimir Vapnik and colleagues, SVMs are one of the most robust prediction methods, being based on the statistical learning framework, or VC theory, proposed by Vapnik and Chervonenkis (1974). Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. An SVM maps training examples to points in space so as to maximise the width of the gap between the two categories. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall.
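As a hedged illustration of the maximum-margin idea, the sketch below fits a linear SVM to a tiny invented two-class dataset using scikit-learn; the data and parameters are assumptions for the example, not from any cited work.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters, one per class (invented toy data).
X = np.array([[0.0, 0.0], [0.5, 0.4], [0.3, 0.1],
              [3.0, 3.0], [3.2, 2.7], [2.8, 3.3]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear")       # maximum-margin linear classifier
clf.fit(X, y)
print(clf.predict([[0.2, 0.2], [3.1, 2.9]]))  # -> [0 1]
print(clf.support_vectors_)      # the points that pin down the margin
```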

<span class="mw-page-title-main">Principal component analysis</span> Method of data analysis

Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data. Formally, PCA is a statistical technique for reducing the dimensionality of a dataset. This is accomplished by linearly transforming the data into a new coordinate system where the variation in the data can be described with fewer dimensions than the initial data. Many studies use the first two principal components in order to plot the data in two dimensions and to visually identify clusters of closely related data points. Principal component analysis has applications in many fields such as population genetics, microbiome studies, and atmospheric science.
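A minimal sketch of PCA via the singular value decomposition, assuming the observations are rows of a NumPy array; the synthetic dataset is invented for illustration.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the first k principal components."""
    Xc = X - X.mean(axis=0)             # centre each feature
    # Right singular vectors of the centred data are the principal axes.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)     # fraction of variance per component
    return Xc @ Vt[:k].T, explained[:k]

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 5))
scores, var_ratio = pca(X, 2)
print(scores.shape, var_ratio)          # (100, 2) scores and variance shares
```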

<span class="mw-page-title-main">Eigenface</span> Set of eigenvectors used in the computer vision problem of human face recognition

An eigenface is the name given to a set of eigenvectors when used in the computer vision problem of human face recognition. The approach of using eigenfaces for recognition was developed by Sirovich and Kirby and used by Matthew Turk and Alex Pentland in face classification. The eigenvectors are derived from the covariance matrix of the probability distribution over the high-dimensional vector space of face images. The eigenfaces themselves form a basis set of all images used to construct the covariance matrix. This produces dimension reduction by allowing the smaller set of basis images to represent the original training images. Classification can be achieved by comparing how faces are represented by the basis set.
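A hedged sketch of the eigenface construction, with a stack of random arrays standing in for real face images (an assumption for the example): the eigenfaces are the leading right singular vectors of the centred image matrix, which are the eigenvectors of its covariance matrix.

```python
import numpy as np

# Hypothetical stack of 100 "face images", each 32x32, flattened to vectors.
rng = np.random.default_rng(7)
faces = rng.random((100, 32 * 32))

mean_face = faces.mean(axis=0)
A = faces - mean_face                   # centre the data
# Right singular vectors of A = eigenvectors of the covariance matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:20]                    # keep the top 20 eigenfaces

# Represent (and compare) faces by coordinates in the eigenface basis.
weights = A @ eigenfaces.T              # (100, 20) low-dimensional codes
print(weights.shape)
```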

In descriptive statistics and chaos theory, a recurrence plot (RP) is a plot showing, for each moment $i$ in time, the times $j$ at which the state of a dynamical system returns to the previous state at $i$, i.e., when the phase space trajectory visits roughly the same area in the phase space as at time $j$. In other words, it is a plot of $\vec{x}(i) \approx \vec{x}(j)$, showing $i$ on the horizontal and $j$ on the vertical axis.
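A small sketch of a thresholded recurrence plot, setting $R_{i,j} = 1$ when $\lVert x(i) - x(j) \rVert < \varepsilon$; the sine series and the tolerance $\varepsilon$ are assumptions for the example.

```python
import numpy as np

def recurrence_plot(x, eps):
    """R[i, j] = 1 when states x[i] and x[j] lie within eps of each other."""
    # x is an (n, d) array of phase-space states (d = 1 for a scalar series).
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return (d < eps).astype(int)

t = np.linspace(0, 8 * np.pi, 200)
x = np.sin(t)[:, None]        # a periodic scalar series
R = recurrence_plot(x, eps=0.1)
print(R.shape)                # (200, 200); diagonal lines mark recurrences
```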

Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text. A matrix containing word counts per document is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents.
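A minimal LSA sketch with an invented 4-term, 4-document count matrix: truncated SVD projects documents into a low-rank concept space, where they are compared by cosine similarity.

```python
import numpy as np

# Invented term-document count matrix: rows = terms, columns = documents.
A = np.array([[2, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 2, 0, 1],
              [0, 0, 1, 2]], dtype=float)

k = 2                                        # number of latent "concepts"
U, s, Vt = np.linalg.svd(A, full_matrices=False)
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T    # documents in concept space

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Values near 1 indicate similar documents, near 0 dissimilar ones.
print(cosine(doc_vectors[0], doc_vectors[1]))
```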

The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.

A language model is a probabilistic model of a natural language that can generate probabilities of a series of words, based on text corpora in one or multiple languages it was trained on. Given that languages can be used to express an infinite variety of valid sentences, language modeling faces the problem of assigning non-zero probabilities to linguistically valid sequences that may never be encountered in the training data.

<span class="mw-page-title-main">Kernel method</span> Class of algorithms for pattern analysis

In machine learning, kernel machines are a class of algorithms for pattern analysis, whose best known member is the support-vector machine (SVM). These methods involve using linear classifiers to solve nonlinear problems. The general task of pattern analysis is to find and study general types of relations in datasets. For many algorithms that solve these tasks, the data in raw representation have to be explicitly transformed into feature vector representations via a user-specified feature map; in contrast, kernel methods require only a user-specified kernel, i.e., a similarity function over all pairs of data points computed using inner products. By the representer theorem, even when the feature map is infinite-dimensional, only the finite-dimensional kernel matrix over the training data is needed. Without approximation or parallel processing, kernel machines become slow to train on datasets beyond a few thousand examples, since the kernel matrix grows quadratically with the number of points.
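As an illustration, the following sketch computes a Gaussian (RBF) kernel matrix, the only object a kernel method needs in place of explicit feature vectors; the data and the bandwidth gamma are assumptions for the example.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0))   # clamp tiny negative values

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 3))
K = rbf_kernel(X, gamma=0.5)
# Symmetric with unit diagonal, as required of a similarity over pairs.
print(np.allclose(K, K.T), np.allclose(np.diag(K), 1.0))
```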

<span class="mw-page-title-main">Non-negative matrix factorization</span> Algorithms for matrix decomposition

Non-negative matrix factorization, also non-negative matrix approximation is a group of algorithms in multivariate analysis and linear algebra where a matrix V is factorized into (usually) two matrices W and H, with the property that all three matrices have no negative elements. This non-negativity makes the resulting matrices easier to inspect. Also, in applications such as processing of audio spectrograms or muscular activity, non-negativity is inherent to the data being considered. Since the problem is not exactly solvable in general, it is commonly approximated numerically.
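One common way to compute the factorization numerically is Lee and Seung's multiplicative update rule, sketched below for the Frobenius-norm objective; the toy matrix and iteration count are assumptions for the example.

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9):
    """Approximate V ~ W @ H with non-negative factors (multiplicative updates)."""
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # updates keep both factors
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # non-negative throughout
    return W, H

V = np.abs(np.random.default_rng(3).standard_normal((6, 5)))
W, H = nmf(V, k=2)
print(np.linalg.norm(V - W @ H))   # reconstruction error of the rank-2 model
```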

In image processing, ridge detection is the attempt, via software, to locate ridges in an image, defined as curves whose points are local maxima of the function, akin to geographical ridges.

In mathematics, the structure tensor, also referred to as the second-moment matrix, is a matrix derived from the gradient of a function. It describes the distribution of the gradient in a specified neighborhood around a point and makes the information invariant with respect to the observing coordinates. The structure tensor is often used in image processing and computer vision.
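A minimal 2-D sketch, assuming SciPy is available for the window averaging: the per-pixel structure tensor collects smoothed outer products of the image gradient; the test image and window size are invented for the example.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def structure_tensor(img, size=3):
    """Per-pixel 2x2 structure tensor, averaged over a size x size window."""
    Iy, Ix = np.gradient(img.astype(float))   # intensity gradients
    Jxx = uniform_filter(Ix * Ix, size)       # <Ix^2> over the neighbourhood
    Jxy = uniform_filter(Ix * Iy, size)       # <Ix Iy>
    Jyy = uniform_filter(Iy * Iy, size)       # <Iy^2>
    return Jxx, Jxy, Jyy

img = np.zeros((32, 32))
img[:, 16:] = 1.0                             # a vertical step edge
Jxx, Jxy, Jyy = structure_tensor(img)
# Near the edge the tensor is dominated by Jxx (gradient along x).
print(Jxx[16, 14:18])
print(Jyy[16, 14:18])
```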

In computer science, locality-sensitive hashing (LSH) is an algorithmic technique that hashes similar input items into the same "buckets" with high probability. Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.
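A hedged sketch of one classic LSH family, random-hyperplane hashing for cosine similarity: nearby vectors receive bit signatures that differ in few positions. The vectors and bit count are assumptions for the example.

```python
import numpy as np

def lsh_signature(X, n_bits=16, seed=0):
    """Random-hyperplane LSH: one sign bit per random hyperplane."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    return (X @ planes > 0).astype(int)

rng = np.random.default_rng(4)
a = rng.standard_normal(10)
b = a + 0.01 * rng.standard_normal(10)   # near-duplicate of a
c = rng.standard_normal(10)              # unrelated vector
sig = lsh_signature(np.vstack([a, b, c]))
# Typically few bit flips between a and b, many between a and c.
print(np.sum(sig[0] != sig[1]), np.sum(sig[0] != sig[2]))
```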

<span class="mw-page-title-main">Spectral clustering</span> Clustering methods

In multivariate statistics, spectral clustering techniques make use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions. The similarity matrix is provided as an input and consists of a quantitative assessment of the relative similarity of each pair of points in the dataset.
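A minimal sketch of the spectral embedding step, assuming a symmetric non-negative similarity matrix as input (the block-structured toy matrix is invented for the example); clustering the rows of the embedding, e.g. with k-means, completes the method.

```python
import numpy as np

def spectral_embedding(S, k):
    """Embed points via the top eigenvectors of the normalised similarity."""
    d = S.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = D_inv_sqrt @ S @ D_inv_sqrt    # symmetrically normalised affinity
    vals, vecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    return vecs[:, -k:]                    # k leading eigenvectors as columns

# Block-structured similarity matrix: two obvious groups.
S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])
print(spectral_embedding(S, k=2))   # rows of the two groups separate clearly
```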

The image segmentation problem is concerned with partitioning an image into multiple regions according to some homogeneity criterion. This article is primarily concerned with graph theoretic approaches to image segmentation applying graph partitioning via minimum cut or maximum cut. Segmentation-based object categorization can be viewed as a specific case of spectral clustering applied to image segmentation.

The block Wiedemann algorithm for computing kernel vectors of a matrix over a finite field is a generalization by Don Coppersmith of an algorithm due to Doug Wiedemann.

In multilinear algebra, the higher-order singular value decomposition (HOSVD) of a tensor is a specific orthogonal Tucker decomposition. It may be regarded as one type of generalization of the matrix singular value decomposition. It has applications in computer vision, computer graphics, machine learning, scientific computing, and signal processing. Some aspects can be traced as far back as F. L. Hitchcock in 1928, but it was L. R. Tucker who developed the general Tucker decomposition for third-order tensors in the 1960s. It was further advocated by L. De Lathauwer et al., whose multilinear SVD work employs the power method, and by Vasilescu and Terzopoulos, who developed M-mode SVD, a parallel algorithm that employs the matrix SVD.

A heat kernel signature (HKS) is a feature descriptor for use in deformable shape analysis and belongs to the group of spectral shape analysis methods. For each point in the shape, HKS defines its feature vector representing the point's local and global geometric properties. Applications include segmentation, classification, structure discovery, shape matching and shape retrieval.

<span class="mw-page-title-main">Diffusion map</span>

Diffusion maps is a dimensionality reduction or feature extraction algorithm introduced by Coifman and Lafon which computes a family of embeddings of a data set into Euclidean space whose coordinates can be computed from the eigenvectors and eigenvalues of a diffusion operator on the data. The Euclidean distance between points in the embedded space is equal to the "diffusion distance" between probability distributions centered at those points. Different from linear dimensionality reduction methods such as principal component analysis (PCA), diffusion maps are part of the family of nonlinear dimensionality reduction methods which focus on discovering the underlying manifold that the data has been sampled from. By integrating local similarities at different scales, diffusion maps give a global description of the data-set. Compared with other methods, the diffusion map algorithm is robust to noise perturbation and computationally inexpensive.
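A compact sketch under simplifying assumptions (a plain Gaussian kernel, no density normalisation, dense eigendecomposition): the embedding coordinates come from the leading non-trivial eigenvectors of the row-normalised kernel, scaled by their eigenvalues.

```python
import numpy as np

def diffusion_map(X, k=2, eps=1.0):
    """Embed data with leading eigenvectors of a diffusion (Markov) operator."""
    d2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    K = np.exp(-d2 / eps)                   # Gaussian kernel similarities
    P = K / K.sum(axis=1, keepdims=True)    # row-normalise: transition matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    idx = order[1:k + 1]                    # skip the trivial constant vector
    return vecs[:, idx].real * vals[idx].real

rng = np.random.default_rng(5)
X = rng.standard_normal((30, 3))
print(diffusion_map(X).shape)               # (30, 2) diffusion coordinates
```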

In mathematics and statistics, random projection is a technique used to reduce the dimensionality of a set of points which lie in Euclidean space. Random projection methods are valued for their simplicity and low computational cost compared to other methods. Experimental results indicate that random projection preserves pairwise distances well, although systematic empirical comparisons remain relatively sparse. Random projections have been applied to many natural language tasks under the name random indexing.
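A minimal sketch of a Johnson–Lindenstrauss-style Gaussian random projection; the dimensions and data are assumptions for the example.

```python
import numpy as np

# Project 20 points from 1000 dimensions down to 50.
rng = np.random.default_rng(6)
X = rng.standard_normal((20, 1000))
R = rng.standard_normal((1000, 50)) / np.sqrt(50)   # scaled Gaussian matrix
Y = X @ R

# Pairwise distances are approximately preserved after projection.
print(np.linalg.norm(X[0] - X[1]))
print(np.linalg.norm(Y[0] - Y[1]))
```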

References

  1. Casey, M. A.; Westner, A. (July 2000). "Separation of mixed audio sources by independent subspace analysis" (PDF). Proc. Int. Comput. Music Conf. Retrieved 2013-11-19.
  2. Müller, Meinard; Clausen, Michael (2007). "Transposition-invariant self-similarity matrices" (PDF). Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR 2007): 47–50. Retrieved 2013-11-19.
  3. Junejo, I. N.; Dexter, E.; Laptev, I.; Pérez, Patrick (2008). "Cross-View Action Recognition from Temporal Self-Similarities". Proc. European Conference on Computer Vision (ECCV), Marseille, France. Lecture Notes in Computer Science. Vol. 5303. pp. 293–306. CiteSeerX 10.1.1.405.1518. doi:10.1007/978-3-540-88688-4_22. ISBN 978-3-540-88685-3.
  4. Dubnov, Shlomo; Apel, Ted (2004). "Audio segmentation by singular value clustering". Proceedings of the International Computer Music Conference (ICMC 2004). CiteSeerX 10.1.1.324.4298.
