Matching pursuit (MP) is a sparse approximation algorithm which finds the "best matching" projections of multidimensional data onto the span of an over-complete (i.e., redundant) dictionary $D$. The basic idea is to approximately represent a signal $f$ from Hilbert space $H$ as a weighted sum of finitely many functions (called atoms) taken from $D$. An approximation with $N$ atoms has the form

$$f(t) \approx \hat f_N(t) := \sum_{n=1}^{N} a_n g_{\gamma_n}(t),$$
where $g_{\gamma_n}$ is the $\gamma_n$-th column of the matrix $D$ and $a_n$ is the scalar weighting factor (amplitude) for the atom $g_{\gamma_n}$. Normally, not every atom in $D$ will be used in this sum. Instead, matching pursuit chooses the atoms one at a time in order to maximally (greedily) reduce the approximation error. This is achieved by finding the atom that has the highest inner product with the signal (assuming the atoms are normalized), subtracting from the signal an approximation that uses only that one atom, and repeating the process until the signal is satisfactorily decomposed, i.e., the norm of the residual is small, where the residual after calculating $\gamma_N$ and $a_N$ is denoted by

$$R_{N+1} = f - \hat f_N.$$
If $R_n$ converges quickly to zero, then only a few atoms are needed to get a good approximation to $f$. Such sparse representations are desirable for signal coding and compression. More precisely, the sparsity problem that matching pursuit is intended to approximately solve is

$$\min_x \|f - Dx\|_2^2 \quad \text{subject to} \quad \|x\|_0 \le N,$$
where $\|x\|_0$ is the $L_0$ pseudo-norm (i.e., the number of nonzero elements of $x$). In the previous notation, the nonzero entries of $x$ are $x_{\gamma_n} = a_n$. Solving the sparsity problem exactly is NP-hard, which is why approximation methods like MP are used.
For comparison, consider the Fourier transform representation of a signal: it can be described using the terms given above, where the dictionary is built from sinusoidal basis functions (the smallest possible complete dictionary). The main disadvantage of Fourier analysis in signal processing is that it extracts only the global features of a signal and does not adapt to the signal being analysed. By taking an extremely redundant dictionary instead, we can search it for the atoms (functions) that best match a signal $f$.
If $D$ contains a large number of vectors, searching for the sparsest representation of $f$ is computationally intractable for practical applications. In 1993, Mallat and Zhang [1] proposed a greedy solution that they named "matching pursuit". For any signal $f$ and any dictionary $D$, the algorithm iteratively generates a sorted list of atom indices and weighting scalars, which form a sub-optimal solution to the problem of sparse signal representation.
Algorithm Matching Pursuit
  Input: signal $f(t)$, dictionary $D$ with normalized columns $g_i$.
  Output: list of coefficients $(a_n)_{n=1}^{N}$ and indices of the corresponding atoms $(\gamma_n)_{n=1}^{N}$.
  Initialization:
    $R_1 \leftarrow f(t)$;
    $n \leftarrow 1$;
  Repeat:
    find $g_{\gamma_n} \in D$ with maximum inner product $|\langle R_n, g_{\gamma_n} \rangle|$;
    $a_n \leftarrow \langle R_n, g_{\gamma_n} \rangle$;
    $R_{n+1} \leftarrow R_n - a_n g_{\gamma_n}$;
    $n \leftarrow n + 1$;
  Until stop condition (for example: $\|R_n\| < \text{threshold}$)
  return
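For concreteness, here is a minimal NumPy sketch of this loop; the function name, the stopping threshold, and the random example dictionary are illustrative choices, not part of the algorithm statement:

```python
import numpy as np

def matching_pursuit(f, D, tol=1e-6, max_iter=100):
    """Greedy matching pursuit: approximate f as D @ x with x sparse.
    Assumes the columns (atoms) of D are normalized to unit norm."""
    residual = f.astype(float).copy()
    x = np.zeros(D.shape[1])
    for _ in range(max_iter):
        correlations = D.T @ residual           # inner products with every atom
        best = int(np.argmax(np.abs(correlations)))
        a = correlations[best]                  # amplitude of the chosen atom
        x[best] += a                            # accumulate its coefficient
        residual = residual - a * D[:, best]    # subtract the atom's contribution
        if np.linalg.norm(residual) < tol:      # stop condition
            break
    return x, residual

# Illustrative usage: recover a 2-sparse combination of random unit-norm atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
f = 3.0 * D[:, 10] - 1.5 * D[:, 200]
x, r = matching_pursuit(f, D, max_iter=50)
print(np.flatnonzero(np.abs(x) > 0.1), np.linalg.norm(r))
```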
In signal processing, the concept of matching pursuit is related to statistical projection pursuit, in which "interesting" projections are found; ones that deviate more from a normal distribution are considered to be more interesting.
Matching pursuit has been applied to signal, image [2] and video coding, [3] [4] shape representation and recognition, [5] 3D object coding, [6] and interdisciplinary applications such as structural health monitoring. [7] It has been shown to perform better than DCT-based coding at low bit rates, in terms of both coding efficiency and image quality. [8] The main problem with matching pursuit is the computational complexity of the encoder: in the basic version of the algorithm, the large dictionary must be searched at each iteration. Improvements include the use of approximate dictionary representations and suboptimal ways of choosing the best match at each iteration (atom extraction). [9] The matching pursuit algorithm is also used in MP/SOFT, a method of simulating quantum dynamics. [10]
MP is also used in dictionary learning. [11] [12] In such algorithms, atoms are learned from a database (typically natural scenes such as ordinary images) rather than chosen from generic dictionaries.
A more recent application of MP is its use in linear computation coding [13] to speed up the computation of matrix-vector products.
A popular extension of Matching Pursuit (MP) is its orthogonal version: Orthogonal Matching Pursuit [14] [15] (OMP). The main difference from MP is that after every step, all the coefficients extracted so far are updated, by computing the orthogonal projection of the signal onto the subspace spanned by the set of atoms selected so far. This can lead to results better than standard MP, but requires more computation. OMP was shown to have stability and performance guarantees under certain restricted isometry conditions. [16] The incremental multi-parameter algorithm (IMP), published three years before MP, works in the same way as OMP. [17]
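The difference can be seen in a short sketch; the function name and the fixed iteration count are this sketch's own choices (real implementations usually add a residual-based stopping rule as well):

```python
import numpy as np

def orthogonal_matching_pursuit(f, D, n_atoms):
    """OMP sketch: after each atom is selected, all coefficients are re-fit by
    orthogonally projecting f onto the span of the atoms chosen so far."""
    residual = f.astype(float)
    support = []                                   # indices of selected atoms
    coeffs = np.zeros(0)
    for _ in range(n_atoms):
        best = int(np.argmax(np.abs(D.T @ residual)))
        if best not in support:
            support.append(best)
        # Least-squares fit over the selected atoms (the orthogonal projection).
        coeffs, *_ = np.linalg.lstsq(D[:, support], f, rcond=None)
        residual = f - D[:, support] @ coeffs
    x = np.zeros(D.shape[1])
    x[support] = coeffs
    return x, residual
```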
Extensions such as Multichannel MP [18] and Multichannel OMP [19] allow one to process multicomponent signals. An obvious extension of Matching Pursuit is over multiple positions and scales, by augmenting the dictionary to be that of a wavelet basis. This can be done efficiently using the convolution operator without changing the core algorithm. [20]
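As a rough illustration of that convolution trick (the helper name and NumPy usage are this sketch's own assumptions), the inner products of the residual with every shift of a short atom can be computed in a single pass, so shifted copies of the atom never need to be stored explicitly in the dictionary:

```python
import numpy as np

def best_shift(residual, atom):
    """Score every shift of a short 1D atom against the residual at once.
    np.correlate(residual, atom, mode="valid")[k] equals the inner product of
    the atom placed at offset k with the residual."""
    scores = np.correlate(residual, atom, mode="valid")
    k = int(np.argmax(np.abs(scores)))
    return k, scores[k]                # best position and its amplitude
```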
Matching pursuit is related to the field of compressed sensing and has been extended by researchers in that community. Notable extensions are Orthogonal Matching Pursuit (OMP), [21] Stagewise OMP (StOMP), [22] compressive sampling matching pursuit (CoSaMP), [23] Generalized OMP (gOMP), [24] and Multipath Matching Pursuit (MMP). [25]
In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.
CORDIC (Volder's algorithm), also known as the digit-by-digit method and encompassing circular, linear, hyperbolic, and generalized hyperbolic variants, is a simple and efficient algorithm for calculating trigonometric functions, hyperbolic functions, square roots, multiplications, divisions, and exponentials and logarithms with arbitrary base, typically converging with one digit per iteration. CORDIC is therefore an example of a digit-by-digit algorithm. CORDIC and closely related methods known as pseudo-multiplication and pseudo-division, or factor combining, are commonly used when no hardware multiplier is available, as the only operations they require are addition, subtraction, bit shifts, and lookup tables. As such, they all belong to the class of shift-and-add algorithms. In computer science, CORDIC is often used to implement floating-point arithmetic when the target platform lacks a hardware multiplier for cost or space reasons.
Feature selection is the process of selecting a subset of relevant features for use in model construction. Stylometry and DNA microarray analysis are two cases where feature selection is used. It should be distinguished from feature extraction.
In image processing, a Gabor filter, named after Dennis Gabor, who first proposed it as a 1D filter, is a linear filter used for texture analysis; it was later generalized to 2D by Gösta Granlund, who added a reference direction. The filter essentially analyzes whether there is any specific frequency content in the image in specific directions in a localized region around the point or region of analysis. Frequency and orientation representations of Gabor filters are claimed by many contemporary vision scientists to be similar to those of the human visual system. They have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, a 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave.
In graph theory, a dominating set for a graph G is a subset D of its vertices such that every vertex of G is either in D or has a neighbor in D. The domination number γ(G) is the number of vertices in a smallest dominating set for G.
Non-negative matrix factorization, also non-negative matrix approximation is a group of algorithms in multivariate analysis and linear algebra where a matrix V is factorized into (usually) two matrices W and H, with the property that all three matrices have no negative elements. This non-negativity makes the resulting matrices easier to inspect. Also, in applications such as processing of audio spectrograms or muscular activity, non-negativity is inherent to the data being considered. Since the problem is not exactly solvable in general, it is commonly approximated numerically.
In mathematics, a Fourier–Bessel series is a particular kind of generalized Fourier series based on Bessel functions.
MUSIC (multiple signal classification) is an algorithm used for frequency estimation and radio direction finding.
Compressed sensing is a signal processing technique for efficiently acquiring and reconstructing a signal by finding solutions to underdetermined linear systems. This is based on the principle that, through optimization, the sparsity of a signal can be exploited to recover it from far fewer samples than required by the Nyquist–Shannon sampling theorem. There are two conditions under which recovery is possible. The first one is sparsity, which requires the signal to be sparse in some domain. The second one is incoherence, which is applied through the restricted isometry property and is sufficient for sparse signals. Compressed sensing has applications in, for example, MRI, where the incoherence condition is typically satisfied.
In statistics, the Q-function is the tail distribution function of the standard normal distribution. In other words, $Q(x)$ is the probability that a normal (Gaussian) random variable will obtain a value larger than $x$ standard deviations. Equivalently, $Q(x)$ is the probability that a standard normal random variable takes a value larger than $x$.
Sparse approximation theory deals with sparse solutions for systems of linear equations. Techniques for finding these solutions and exploiting them in applications have found wide use in image processing, signal processing, machine learning, medical imaging, and more.
In computer science, lattice problems are a class of optimization problems related to mathematical objects called lattices. The conjectured intractability of such problems is central to the construction of secure lattice-based cryptosystems: lattice problems are an example of NP-hard problems which have been shown to be average-case hard, providing a test case for the security of cryptographic algorithms. In addition, some lattice problems which are worst-case hard can be used as a basis for extremely secure cryptographic schemes. The use of worst-case hardness in such schemes makes them among the very few schemes that are very likely secure even against quantum computers. For applications in such cryptosystems, lattices over vector spaces or free modules are generally considered.
In linear algebra, the coherence or mutual coherence of a matrix A is defined as the maximum absolute value of the cross-correlations between the columns of A.
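For illustration, the mutual coherence can be computed directly from the Gram matrix of the normalized columns; this is a small sketch, and the function name is chosen here for convenience:

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct, normalized columns of A."""
    A = A / np.linalg.norm(A, axis=0)   # normalize each column
    G = np.abs(A.T @ A)                 # absolute cross-correlations
    np.fill_diagonal(G, 0.0)            # exclude each column with itself
    return float(G.max())
```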
In applied mathematics and statistics, basis pursuit denoising (BPDN) refers to a mathematical optimization problem of the form

$$\min_x \; \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1,$$

where $y$ is the vector of observations, $A$ is the measurement (transform) matrix, and $\lambda$ is a parameter that balances sparsity against reconstruction error.
In data mining and machine learning, the k q-flats algorithm is an iterative method which aims to partition m observations into k clusters where each cluster is close to a q-flat, where q is a given integer.
In applied mathematics, k-SVD is a dictionary learning algorithm for creating a dictionary for sparse representations, via a singular value decomposition approach. k-SVD is a generalization of the k-means clustering method, and it works by iteratively alternating between sparse coding the input data based on the current dictionary, and updating the atoms in the dictionary to better fit the data. It is structurally related to the expectation–maximization (EM) algorithm. k-SVD can be found widely in use in applications such as image processing, audio processing, biology, and document analysis.
Proximal gradient methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization penalty may not be differentiable. One such example is $\ell_1$ regularization of the form

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \langle w, x_i \rangle\right)^2 + \lambda \|w\|_1.$$
Sparse dictionary learning is a representation learning method which aims at finding a sparse representation of the input data in the form of a linear combination of basic elements, as well as those basic elements themselves. These elements are called atoms and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may be an over-complete spanning set. This problem setup also allows the dimensionality of the signals being represented to be higher than that of the signals being observed. The above two properties lead to having seemingly redundant atoms that allow multiple representations of the same signal, but also provide an improvement in sparsity and flexibility of the representation.
L1-norm principal component analysis (L1-PCA) is a general method for multivariate data analysis. L1-PCA is often preferred over standard L2-norm principal component analysis (PCA) when the analyzed data may contain outliers.
The convolutional sparse coding paradigm is an extension of the global sparse coding model, in which a redundant dictionary is modeled as a concatenation of circulant matrices. While the global sparsity constraint describes a signal $x$ as a linear combination of a few atoms in the redundant dictionary $D$, usually expressed as $x = D\Gamma$ for a sparse vector $\Gamma$, the alternative dictionary structure adopted by the convolutional sparse coding model allows the sparsity prior to be applied locally instead of globally: independent patches of $x$ are generated by "local" dictionaries operating over stripes of $\Gamma$.