Sparse approximation

Sparse approximation (also known as sparse representation) theory deals with sparse solutions for systems of linear equations. Techniques for finding these solutions and exploiting them in applications have found wide use in image processing, signal processing, machine learning, medical imaging, and more.

Sparse decomposition

Noiseless observations

Consider a linear system of equations $x = D\alpha$, where $D$ is an underdetermined $n \times p$ matrix $(n < p)$ and $x \in \mathbb{R}^{n}$, $\alpha \in \mathbb{R}^{p}$. The matrix $D$ (typically assumed to be full-rank) is referred to as the dictionary, and $x$ is a signal of interest. The core sparse representation problem is defined as the quest for the sparsest possible representation $\alpha$ satisfying $x = D\alpha$. Due to the underdetermined nature of $D$, this linear system admits in general infinitely many possible solutions, and among these we seek the one with the fewest non-zeros. Put formally, we solve

$$\min_{\alpha \in \mathbb{R}^{p}} \|\alpha\|_0 \ \text{ subject to } \ x = D\alpha,$$

where $\|\alpha\|_0 = \#\{i : \alpha_i \neq 0,\ i = 1, \dots, p\}$ is the $\ell_0$ pseudo-norm, which counts the number of non-zero components of $\alpha$. This problem is known to be NP-hard, with a reduction to NP-complete subset selection problems in combinatorial optimization.

Sparsity of $\alpha$ implies that only a few ($k \ll n < p$) components in it are non-zero. The underlying motivation for such a sparse decomposition is the desire to provide the simplest possible explanation of $x$ as a linear combination of as few as possible columns from $D$, also referred to as atoms. As such, the signal $x$ can be viewed as a molecule composed of a few fundamental elements taken from $D$.

While the above posed problem is indeed NP-hard, its solution can often be found using approximation algorithms. One such option is a convex relaxation of the problem, obtained by using the $\ell_1$-norm instead of $\ell_0$, where $\|\alpha\|_1$ simply sums the absolute values of the entries in $\alpha$. This is known as the basis pursuit (BP) algorithm, which can be handled using any linear programming solver. An alternative approximation method is a greedy technique, such as matching pursuit (MP), which finds the location of the non-zeros one at a time.
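
To make the greedy option concrete, here is a minimal matching pursuit sketch in Python/NumPy; the function name and the toy data are illustrative choices, not part of any standard library.

```python
import numpy as np

def matching_pursuit(D, x, k):
    """Greedy sketch: repeatedly pick the atom most correlated with the residual."""
    residual = x.copy()
    alpha = np.zeros(D.shape[1])
    for _ in range(k):
        correlations = D.T @ residual             # inner products with all atoms
        j = int(np.argmax(np.abs(correlations)))  # index of the best-matching atom
        coef = correlations[j] / (D[:, j] @ D[:, j])
        alpha[j] += coef                          # MP may revisit the same atom
        residual = residual - coef * D[:, j]      # remove its contribution
    return alpha

# Toy usage: a random normalized dictionary and a 3-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
alpha_true = np.zeros(256)
alpha_true[[5, 100, 200]] = [1.0, -2.0, 0.5]
x = D @ alpha_true
print(np.nonzero(matching_pursuit(D, x, 3))[0])   # ideally recovers {5, 100, 200}
```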

Surprisingly, under mild conditions on $D$ (using the spark, the mutual coherence or the restricted isometry property) and the level of sparsity in the solution, $\|\alpha\|_0$, the sparse representation problem can be shown to have a unique solution, and BP and MP are guaranteed to find it perfectly. [1] [2] [3]
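
For intuition about these conditions, the mutual coherence is the easiest to compute. The snippet below is a small illustrative check (the helper name is ours); the printed quantity $\tfrac{1}{2}(1 + 1/\mu(D))$ is the classical sparsity threshold appearing in such uniqueness and recovery results.

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute inner product between distinct normalized columns of D."""
    Dn = D / np.linalg.norm(D, axis=0)   # normalize the atoms
    G = np.abs(Dn.T @ Dn)                # correlations between all pairs of atoms
    np.fill_diagonal(G, 0.0)             # ignore each atom's correlation with itself
    return float(G.max())

rng = np.random.default_rng(1)
D = rng.standard_normal((64, 128))
mu = mutual_coherence(D)
# Representations with fewer than (1 + 1/mu)/2 non-zeros are unique (classical bound).
print(mu, (1 + 1 / mu) / 2)
```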


Noisy observations

Often the observed signal $x$ is noisy. By relaxing the equality constraint and imposing an $\ell_2$-norm on the data-fitting term, the sparse decomposition problem becomes

$$\min_{\alpha \in \mathbb{R}^{p}} \|\alpha\|_0 \ \text{ subject to } \ \|x - D\alpha\|_2^2 \le \epsilon^2,$$

or put in a Lagrangian form,

$$\min_{\alpha \in \mathbb{R}^{p}} \lambda \|\alpha\|_0 + \frac{1}{2} \|x - D\alpha\|_2^2,$$

where $\lambda$ replaces the error threshold $\epsilon$.

Just as in the noiseless case, these two problems are NP-hard in general, but can be approximated using pursuit algorithms. More specifically, changing the $\ell_0$ to an $\ell_1$-norm, we obtain

$$\min_{\alpha \in \mathbb{R}^{p}} \lambda \|\alpha\|_1 + \frac{1}{2} \|x - D\alpha\|_2^2,$$

which is known as basis pursuit denoising. Similarly, matching pursuit can be used for approximating the solution of the above problems, finding the locations of the non-zeros one at a time until the error threshold is met. Here as well, theoretical guarantees suggest that BP and MP lead to nearly optimal solutions depending on the properties of $D$ and the cardinality of the solution, $\|\alpha\|_0$. [4] [5] [6] Another interesting theoretical result refers to the case in which $D$ is a unitary matrix. Under this assumption, the problems posed above (with either $\ell_0$ or $\ell_1$) admit closed-form solutions in the form of non-linear shrinkage. [4]
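
As an illustration of how the $\ell_1$-penalized problem above is typically solved in practice, here is a minimal iterative soft-thresholding (ISTA) sketch in NumPy; the step size, iteration count and function names are illustrative choices rather than a prescribed implementation. For a unitary $D$, a single soft-thresholding of $D^\top x$ already gives the closed-form shrinkage solution mentioned above.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise shrinkage: the proximal operator of the l1-norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam, n_iter=500):
    """Sketch of ISTA for min_alpha lam*||alpha||_1 + 0.5*||x - D@alpha||_2^2."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2       # 1/L, L = Lipschitz constant of the gradient
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - x)             # gradient of the quadratic data-fitting term
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

# If D were unitary (D.T @ D = I), soft_threshold(D.T @ x, lam) would already be optimal.
```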

Variations

There are several variations to the basic sparse approximation problem.

Structured sparsity: In the original version of the problem, any of the atoms in the dictionary can be picked. In the structured (block) sparsity model, instead of picking atoms individually, groups of them are to be picked. These groups can be overlapping and of varying size. The objective is to find a sparse representation of $x$ while enforcing this block structure on its support. [7]
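
For a concrete feel of how a block structure changes the computation, the sketch below applies a group-wise (rather than element-wise) shrinkage, the proximal step used by many block-sparse solvers; the grouping and the threshold value are illustrative assumptions.

```python
import numpy as np

def block_soft_threshold(alpha, groups, t):
    """Shrink each block of coefficients jointly; weak blocks are zeroed out entirely."""
    out = alpha.copy()
    for idx in groups:                            # each group lists the indices of one block
        norm = np.linalg.norm(out[idx])
        scale = max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
        out[idx] = scale * out[idx]
    return out

# Example: 12 coefficients split into three non-overlapping blocks of four.
groups = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
alpha = np.concatenate([np.full(4, 0.1), np.full(4, 2.0), np.zeros(4)])
print(block_soft_threshold(alpha, groups, 0.5))   # the weak first block vanishes
```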

Collaborative (joint) sparse coding: The original version of the problem is defined for a single signal $x$. In the collaborative (joint) sparse coding model, a set of signals is available, each believed to emerge from (nearly) the same set of atoms from $D$. In this case, the pursuit task aims to recover a set of sparse representations that best describe the data while forcing them to share the same (or close-by) support. [8]
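
One simple way to enforce a shared support is a simultaneous variant of orthogonal matching pursuit, sketched below: at every step, the atom whose correlations aggregated over all signals are largest joins the common support. This is a paraphrased illustration with our own names and aggregation rule, not a verbatim reproduction of the cited algorithm.

```python
import numpy as np

def simultaneous_omp(D, X, k):
    """Greedy sketch: choose k atoms shared by all columns (signals) of X."""
    support, residual = [], X.copy()
    for _ in range(k):
        scores = np.abs(D.T @ residual).sum(axis=1)   # total correlation of each atom
        support.append(int(np.argmax(scores)))
        Ds = D[:, support]
        coeffs, *_ = np.linalg.lstsq(Ds, X, rcond=None)
        residual = X - Ds @ coeffs                     # refit every signal on the joint support
    A = np.zeros((D.shape[1], X.shape[1]))
    A[support, :] = coeffs
    return A                                           # sparse codes sharing one support
```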

Other structures: More broadly, the sparse approximation problem can be cast while forcing a specific desired structure on the pattern of non-zero locations in $\alpha$. Two cases of interest that have been extensively studied are tree-based structure, and more generally, a Boltzmann distributed support. [9]

Algorithms

As already mentioned above, there are various approximation (also referred to as pursuit) algorithms that have been developed for addressing the sparse representation problem. We mention below a few of these main methods.

Matching pursuit is a greedy iterative algorithm for approximately solving the above problems. It works by gradually finding the locations of the non-zeros in $\alpha$ one at a time. The core idea is to find in each step the column (atom) in $D$ that best correlates with the current residual (initialized to $x$), and then to update this residual to take the new atom and its coefficient into account. Matching pursuit might pick the same atom multiple times.

Orthogonal matching pursuit is very similar to matching pursuit, with one major difference: in each step, all the non-zero coefficients are updated by a least squares fit (a minimal sketch is given at the end of this section). As a consequence, the residual is orthogonal to the already chosen atoms, and thus an atom cannot be picked more than once.

Stage-wise greedy methods: Improved variations of the above are algorithms that operate greedily while adding two critical features: (i) the ability to add groups of non-zeros at a time (instead of one non-zero per round); and (ii) a pruning step in each round in which several atoms are discarded from the support. Representatives of this approach are the subspace pursuit algorithm and CoSaMP. [10]

Basis pursuit solves a convex relaxed version of the problem by replacing the $\ell_0$ with an $\ell_1$-norm. Note that this only defines a new objective, while leaving open the question of which algorithm to use for obtaining the desired solution. Commonly considered algorithms for this task are IRLS, LARS, and iterative soft-shrinkage methods. [11]

Other methods for solving sparse decomposition problems include the homotopy method, coordinate descent, iterative hard-thresholding, first-order proximal methods (related to the above-mentioned iterative soft-shrinkage algorithms), and the Dantzig selector.
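
As a companion to the matching pursuit sketch given earlier, here is a minimal orthogonal matching pursuit in NumPy, whose only difference is the least-squares refit of all selected coefficients at each step; the names and the stopping rule (a fixed number of atoms) are illustrative choices.

```python
import numpy as np

def orthogonal_matching_pursuit(D, x, k):
    """Greedy sketch: select k atoms, refitting all coefficients by least squares."""
    support, residual = [], x.copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with the residual
        support.append(j)
        Ds = D[:, support]
        coeffs, *_ = np.linalg.lstsq(Ds, x, rcond=None)
        residual = x - Ds @ coeffs                   # now orthogonal to all chosen atoms
    alpha = np.zeros(D.shape[1])
    alpha[support] = coeffs
    return alpha
```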

Applications

Sparse approximation ideas and algorithms have been extensively used in signal processing, image processing, machine learning, medical imaging, array processing, data mining, and more. In most of these applications, the unknown signal of interest is modeled as a sparse combination of a few atoms from a given dictionary, and this is used as the regularization of the problem. These problems are typically accompanied by a dictionary learning mechanism that aims to fit $D$ so as to best match the model to the given data. The use of sparsity-inspired models has led to state-of-the-art results in a wide set of applications. [12] [13] [14] Recent work suggests that there is a tight connection between sparse representation modeling and deep learning. [15]
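
To illustrate what such a dictionary learning mechanism can look like, here is a sketch of a simple alternating scheme in the spirit of the method of optimal directions (MOD): sparse-code the training signals with the current dictionary, then update the dictionary by a least-squares fit. It reuses the `orthogonal_matching_pursuit` sketch from the Algorithms section, and all names, sizes and iteration counts are illustrative assumptions, not the specific algorithms of the cited works.

```python
import numpy as np
# Assumes the orthogonal_matching_pursuit sketch from the Algorithms section is in scope.

def learn_dictionary(X, p, k, n_iter=20, seed=0):
    """MOD-style sketch: alternate OMP sparse coding and a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], p))
    D /= np.linalg.norm(D, axis=0)                    # random, normalized initial atoms
    for _ in range(n_iter):
        A = np.column_stack([orthogonal_matching_pursuit(D, x, k) for x in X.T])
        D = X @ np.linalg.pinv(A)                     # best dictionary for the fixed codes
        D /= np.linalg.norm(D, axis=0) + 1e-12        # re-normalize the atoms
    return D, A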


References

  1. Donoho, D.L. and Elad, M. (2003). "Optimally sparse representation in general (nonorthogonal) dictionaries via L1 minimization". Proceedings of the National Academy of Sciences. 100 (5): 2197–2202. doi:10.1073/pnas.0437847100. PMC 153464. PMID 16576749.
  2. Tropp, J.A. (2004). "Greed is good: Algorithmic results for sparse approximation". IEEE Transactions on Information Theory. 50 (10): 2231–2242. doi:10.1109/TIT.2004.834793.
  3. Donoho, D.L. (2006). "For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution". Communications on Pure and Applied Mathematics. 56 (6): 797–829. doi:10.1002/cpa.20132.
  4. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. doi:10.1007/978-1-4419-7011-4. ISBN 978-1441970107.
  5. Donoho, D.L., Elad, M. and Temlyakov, V. (2006). "Stable recovery of sparse overcomplete representations in the presence of noise". IEEE Transactions on Information Theory. 52 (1): 6–18. doi:10.1109/TIT.2005.860430.
  6. Tropp, J.A. (2006). "Just relax: Convex programming methods for identifying sparse signals in noise". IEEE Transactions on Information Theory. 52 (3): 1030–1051. doi:10.1109/TIT.2005.864420.
  7. Eldar, Y.C., Kuppinger, P. and Bolcskei, H. (2010). "Block-sparse signals: Uncertainty relations and efficient recovery". IEEE Transactions on Signal Processing. 58 (6): 3042–3054. arXiv:0906.3173. doi:10.1109/TSP.2010.2044837.
  8. Tropp, J.A., Gilbert, A.C. and Strauss, M.J. (2006). "Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit". Signal Processing. 86 (3): 572–588. doi:10.1016/j.sigpro.2005.05.030.
  9. Peleg, T., Eldar, Y.C. and Elad, M. (2012). "Exploiting Statistical Dependencies in Sparse Representations for Signal Recovery". IEEE Transactions on Signal Processing. 60 (5): 2286–2303. arXiv:1010.5734. doi:10.1109/TSP.2012.2188520.
  10. Needell, D. and Tropp, J.A. (2009). "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples". Applied and Computational Harmonic Analysis. 26 (3): 301–321. arXiv:0803.2392. doi:10.1016/j.acha.2008.07.002.
  11. Zibulevsky, M. and Elad, M. (2010). "L1-L2 optimization in signal and image processing". IEEE Signal Processing Magazine. 27 (3): 76–88. doi:10.1109/MSP.2010.936023.
  12. Baraniuk, R.G., Candes, E., Elad, M. and Ma, Y. (2010). "Applications of sparse representation and compressive sensing". Proceedings of the IEEE. 98 (6): 906–909. doi:10.1109/JPROC.2010.2047424.
  13. Elad, M., Figueiredo, M.A.T. and Ma, Y. (2010). "On the role of sparse and redundant representations in image processing". Proceedings of the IEEE. 98 (6): 972–982. doi:10.1109/JPROC.2009.2037655.
  14. Plumbley, M.D., Blumensath, T., Daudet, L., Gribonval, R. and Davies, M.E. (2010). "Sparse representations in audio and music: From coding to source separation". Proceedings of the IEEE. 98 (6): 995–1005. doi:10.1109/JPROC.2009.2030345.
  15. Papyan, V., Romano, Y. and Elad, M. (2017). "Convolutional Neural Networks Analyzed via Convolutional Sparse Coding". Journal of Machine Learning Research. 18 (83): 1–52. arXiv:1607.08194.