In-crowd algorithm

The in-crowd algorithm is a numerical method for solving basis pursuit denoising quickly; it is reported to be faster than any other algorithm for large, sparse problems. [1] It is an active-set method that iteratively solves subproblems of the global basis pursuit denoising problem

$$\min_x\ \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1,$$

where $y$ is the observed signal, $x$ is the sparse signal to be recovered, $Ax$ is the expected signal under $x$, and $\lambda$ is the regularization parameter trading off signal fidelity against simplicity. Simplicity is measured here by the sparsity of the solution $x$, i.e. its $\ell_1$-norm. Active-set strategies are very efficient in this context, since only a few coefficients are expected to be non-zero; if these can be identified, solving the problem restricted to them yields the solution. Here, features are greedily selected according to the absolute value of their gradient at the current estimate.
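
Concretely, writing $A_I$ for the columns of $A$ indexed by an active set $I$ (notation ours), the subproblem solved on the active set is

$$\min_{x_I}\ \tfrac{1}{2}\|y - A_I x_I\|_2^2 + \lambda\|x_I\|_1,$$

which involves only $|I|$ variables and is therefore cheap to solve when $I$ is small.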

Other active-set methods for basis pursuit denoising include BLITZ, [2] in which the active set is selected using the duality gap of the problem, and the feature-sign search, [3] in which features are included based on an estimate of their sign.

Algorithm

The algorithm consists of the following steps (a code sketch follows the list):

  1. Declare $x$ to be 0, so the unexplained residual is $r = y$
  2. Declare the active set $I$ to be the empty set, and $I^c$ to be its complement (the inactive set)
  3. Calculate the usefulness $u_j = |\langle r, A_j \rangle|$ for each component $j$ in $I^c$
  4. If no $u_j$ on $I^c$ exceeds $\lambda$, terminate
  5. Otherwise, add up to $L$ components to $I$ based on their usefulness
  6. Solve basis pursuit denoising exactly on $I$, and throw out any component of $I$ whose value attains exactly 0. This subproblem is dense, so quadratic programming techniques work very well for it.
  7. Update $r = y - Ax$; n.b. $r$ can be computed within the subproblem, since all elements of $x$ outside of $I$ are 0
  8. Go to step 3.
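
The following is a minimal NumPy sketch of these steps, not the authors' reference implementation: the dense subproblem of step 6 is solved here by plain coordinate descent standing in for a quadratic-programming routine, and the function name in_crowd, the default L=25, and the parameter names are illustrative choices.

```python
import numpy as np

def in_crowd(A, y, lam, L=25, n_outer=100, n_inner=200):
    """Sketch: minimize 0.5*||y - A x||_2^2 + lam*||x||_1.
    Assumes the columns of A are non-zero (illustrative only)."""
    n = A.shape[1]
    x = np.zeros(n)                     # step 1: x = 0, so the residual r = y
    r = y.astype(float).copy()
    active = np.zeros(n, dtype=bool)    # step 2: active set I starts empty
    for _ in range(n_outer):
        u = np.abs(A.T @ r)             # step 3: usefulness |<r, A_j>| ...
        u[active] = 0.0                 # ... evaluated on the inactive set
        if u.max() <= lam:              # step 4: nothing exceeds lambda
            break
        best = np.argsort(u)[::-1][:L]  # step 5: add up to L most useful
        active[best[u[best] > lam]] = True
        idx = np.flatnonzero(active)
        # step 6: dense subproblem on I; coordinate descent stands in for
        # the quadratic-programming solver suggested in the text
        As, xs = A[:, idx], x[idx]
        norms = (As ** 2).sum(axis=0)
        for _ in range(n_inner):
            for k in range(idx.size):
                xs[k] = 0.0
                rho = As[:, k] @ (y - As @ xs)  # partial residual correlation
                xs[k] = np.sign(rho) * max(abs(rho) - lam, 0.0) / norms[k]
        x[:] = 0.0
        x[idx] = xs
        active = x != 0.0               # drop components that hit exactly 0
        r = y - A @ x                   # step 7: update the residual
    return x                            # step 8: loop back to step 3
```

On a random sparse test problem, a call such as x_hat = in_crowd(A, y, lam) with a suitable $\lambda$ typically recovers the support within a few outer iterations.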

Since every time the in-crowd algorithm performs a global search it adds up to $L$ components to the active set, it can be a factor of $L$ faster than the best alternative algorithms when this search is computationally expensive. A theorem [1] guarantees that the global optimum is reached despite the many-at-a-time nature of the in-crowd algorithm.

Notes

  1. Gill PR, Wang A, Molnar A. The In-Crowd Algorithm for Fast Basis Pursuit Denoising. IEEE Transactions on Signal Processing. 2011;59(10):4595–4605. Demo MATLAB code is available.
  2. Johnson T, Guestrin C. Blitz: a principled meta-algorithm for scaling sparse optimization. In: Proceedings of the International Conference on Machine Learning (ICML). 2015. pp. 1171–1179.
  3. Lee H, Battle A, Raina R, Ng AY. Efficient sparse coding algorithms. In: Advances in Neural Information Processing Systems. 2007. pp. 801–808.


Related Research Articles

Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize a multivariate quadratic function subject to linear constraints on the variables. Quadratic programming is a type of nonlinear programming.
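
In one standard form (conventions vary; notation ours), a quadratic program is

$$\min_x\ \tfrac{1}{2} x^{\mathsf T} Q x + c^{\mathsf T} x \quad \text{subject to} \quad Ax \le b,$$

where $Q$ is a symmetric matrix; the dense basis pursuit denoising subproblem of step 6 above can be cast in this form by splitting $x$ into its positive and negative parts.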

Regularization (mathematics)

In mathematics, statistics, finance, and computer science, particularly in machine learning and inverse problems, regularization is a process that converts the answer to a problem to a simpler one. It is often used to obtain results for ill-posed problems or to prevent overfitting.

The Lanczos algorithm is an iterative method devised by Cornelius Lanczos that is an adaptation of power methods to find the $m$ "most useful" eigenvalues and eigenvectors of an $n \times n$ Hermitian matrix, where $m$ is often but not necessarily much smaller than $n$. Although computationally efficient in principle, the method as initially formulated was not useful, due to its numerical instability.

In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solving methods. Preconditioning is typically related to reducing a condition number of the problem. The preconditioned problem is then usually solved by an iterative method.
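
For instance, left preconditioning replaces a linear system $Ax = b$ with $M^{-1}Ax = M^{-1}b$, where the preconditioner $M$ is chosen so that systems with $M$ are cheap to solve and $M^{-1}A$ has a smaller condition number than $A$.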

Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathematics. It is a subfield of numerical analysis, and a type of linear algebra. Computers use floating-point arithmetic and cannot exactly represent irrational numbers, so when a computer algorithm is applied to a matrix of data, it can sometimes increase the difference between a number stored in the computer and the true number that it is an approximation of. Numerical linear algebra uses properties of vectors and matrices to develop computer algorithms that minimize the error introduced by the computer, and is also concerned with ensuring that the algorithm is as efficient as possible.

Compressed sensing is a signal processing technique for efficiently acquiring and reconstructing a signal, by finding solutions to underdetermined linear systems. This is based on the principle that, through optimization, the sparsity of a signal can be exploited to recover it from far fewer samples than required by the Nyquist–Shannon sampling theorem. There are two conditions under which recovery is possible. The first one is sparsity, which requires the signal to be sparse in some domain. The second one is incoherence, which is applied through the restricted isometry property, which is sufficient for sparse signals.

In linear algebra, eigendecomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonalizable matrices can be factorized in this way. When the matrix being factorized is a normal or real symmetric matrix, the decomposition is called "spectral decomposition", derived from the spectral theorem.
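
In symbols, a diagonalizable matrix factors as $A = Q \Lambda Q^{-1}$, where the columns of $Q$ are eigenvectors of $A$ and $\Lambda$ is the diagonal matrix of its eigenvalues; for a real symmetric $A$, $Q$ may be taken orthogonal, giving the spectral decomposition $A = Q \Lambda Q^{\mathsf T}$.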

In mathematics, the quadratic eigenvalue problem (QEP) is to find scalar eigenvalues $\lambda$, left eigenvectors $y$, and right eigenvectors $x$ such that

$$Q(\lambda)\,x = 0 \quad \text{and} \quad y^{\ast} Q(\lambda) = 0,$$

where $Q(\lambda) = \lambda^2 A_2 + \lambda A_1 + A_0$ for given matrix coefficients $A_2$, $A_1$, $A_0$.

Sparse approximation theory deals with sparse solutions for systems of linear equations. Techniques for finding these solutions and exploiting them in applications have found wide use in image processing, signal processing, machine learning, medical imaging, and more.

The Bregman method is an iterative algorithm to solve certain convex optimization problems involving regularization. The original version is due to Lev M. Bregman, who published it in 1967.

In statistics and machine learning, lasso is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model. It was originally introduced in geophysics, and later by Robert Tibshirani, who coined the term.
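
In its Lagrangian form (notation ours), lasso solves

$$\min_\beta\ \|y - X\beta\|_2^2 + \lambda\|\beta\|_1,$$

which is, up to scaling conventions, the same $\ell_1$-penalized least-squares problem as the basis pursuit denoising objective above.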

Total variation denoising

In signal processing, particularly image processing, total variation denoising, also known as total variation regularization or total variation filtering, is a noise removal process (filter). It is based on the principle that signals with excessive and possibly spurious detail have high total variation, that is, the integral of the absolute image gradient is high. According to this principle, reducing the total variation of the signal—subject to it being a close match to the original signal—removes unwanted detail whilst preserving important details such as edges. The concept was pioneered by L. I. Rudin, S. Osher, and E. Fatemi in 1992 and so is today known as the ROF model.
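
In the continuous setting, the ROF model can be written (the placement of the weight varies between authors) as

$$\min_u\ \int |\nabla u|\,\mathrm{d}x + \frac{\lambda}{2}\int (u - f)^2\,\mathrm{d}x,$$

where $f$ is the noisy input and $u$ the denoised estimate.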

In applied mathematics and statistics, basis pursuit denoising (BPDN) refers to a mathematical optimization problem of the form

$$\min_x\ \tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1,$$

where $\lambda$ is a parameter that controls the trade-off between sparsity and reconstruction fidelity.

Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems and add a penalty term to the objective; the difference is that the augmented Lagrangian method adds yet another term, designed to mimic a Lagrange multiplier. The augmented Lagrangian is related to, but not identical with the method of Lagrange multipliers.
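
For an equality-constrained problem $\min_x f(x)$ subject to $c(x) = 0$, the augmented Lagrangian takes the form (sign conventions vary)

$$\mathcal{L}_\mu(x, \lambda) = f(x) + \lambda^{\mathsf T} c(x) + \frac{\mu}{2}\|c(x)\|_2^2,$$

i.e. a penalty term $\tfrac{\mu}{2}\|c(x)\|_2^2$ plus the multiplier term $\lambda^{\mathsf T} c(x)$ that mimics a Lagrange multiplier.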

Step detection

In statistics and signal processing, step detection is the process of finding abrupt changes in the mean level of a time series or signal. It is usually considered as a special case of the statistical method known as change detection or change point detection. Often, the step is small and the time series is corrupted by some kind of noise, and this makes the problem challenging because the step may be hidden by the noise. Therefore, statistical and/or signal processing algorithms are often required.

The quantum algorithm for linear systems of equations, also called HHL algorithm, designed by Aram Harrow, Avinatan Hassidim, and Seth Lloyd, is a quantum algorithm published in 2008 for solving linear systems. The algorithm estimates the result of a scalar measurement on the solution vector to a given linear system of equations.

In the field of statistical learning theory, matrix regularization generalizes notions of vector regularization to cases where the object to be learned is a matrix. The purpose of regularization is to enforce conditions, for example sparsity or smoothness, that can produce stable predictive functions. For example, in the more common vector framework, Tikhonov regularization optimizes over $x$ to minimize $\|Ax - y\|_2^2 + \lambda\|x\|_2^2$.

Sparse dictionary learning

Sparse coding is a representation learning method which aims at finding a sparse representation of the input data in the form of a linear combination of basic elements, as well as those basic elements themselves. These elements are called atoms and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may be an over-complete spanning set. This problem setup also allows the dimensionality of the signals being represented to be higher than that of the signals being observed. The above two properties lead to having seemingly redundant atoms that allow multiple representations of the same signal, but also provide an improvement in sparsity and flexibility of the representation.

Structured sparsity regularization is a class of methods, and an area of research in statistical learning theory, that extend and generalize sparsity regularization learning methods. Both sparsity and structured sparsity regularization methods seek to exploit the assumption that the output variable to be learned can be described by a reduced number of variables in the input space $X$. Sparsity regularization methods focus on selecting the input variables that best describe the output. Structured sparsity regularization methods generalize and extend sparsity regularization methods, by allowing for optimal selection over structures like groups or networks of input variables in $X$.

Quantum optimization algorithms are quantum algorithms that are used to solve optimization problems. Mathematical optimization deals with finding the best solution to a problem from a set of possible solutions. Mostly, the optimization problem is formulated as a minimization problem, where one tries to minimize an error which depends on the solution: the optimal solution has the minimal error. Different optimization techniques are applied in various fields such as mechanics, economics and engineering, and as the complexity and amount of data involved rise, more efficient ways of solving optimization problems are needed. The power of quantum computing may allow problems which are not practically feasible on classical computers to be solved, or suggest a considerable speed up with respect to the best known classical algorithm.