Least-squares function approximation

In mathematics, least squares function approximation applies the principle of least squares to function approximation, by means of a weighted sum of other functions. The best approximation is defined as the one that minimizes the difference between the original function and the approximation; in a least-squares approach, the quality of the approximation is measured by the integral of the squared difference between the two.

Functional analysis

A generalization to approximation of a data set is the approximation of a function by a sum of other functions, usually an orthogonal set: [1]

$$ f_n(x) = \sum_{j=1}^{n} a_j \varphi_j(x), $$

with the set of functions $\{\varphi_j(x)\}$ an orthonormal set over the interval of interest, say $[a, b]$: see also Fejér's theorem. The coefficients $\{a_j\}$ are selected to make the magnitude of the difference $\lVert f - f_n \rVert^2$ as small as possible. For example, the magnitude, or norm, of a function $g(x)$ over the interval $[a, b]$ can be defined by: [2]

$$ \lVert g \rVert = \left( \int_a^b g^*(x)\, g(x) \, dx \right)^{1/2}, $$

where the ‘*’ denotes complex conjugate in the case of complex functions. The extension of Pythagoras' theorem in this manner leads to function spaces and the notion of Lebesgue measure, an idea of “space” more general than the original basis of Euclidean geometry. The $\{\varphi_j(x)\}$ satisfy orthonormality relations: [3]

$$ \int_a^b \varphi_i^*(x)\, \varphi_j(x) \, dx = \delta_{ij}, $$

where $\delta_{ij}$ is the Kronecker delta. Substituting the function $f_n$ into these equations then leads to the n-dimensional Pythagorean theorem: [4]

$$ \lVert f_n \rVert^2 = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2. $$

The coefficients $\{a_j\}$ making $\lVert f - f_n \rVert^2$ as small as possible are found to be: [1]

$$ a_j = \int_a^b \varphi_j^*(x)\, f(x) \, dx. $$

The generalization of the n-dimensional Pythagorean theorem to infinite-dimensional real inner product spaces is known as Parseval's identity or Parseval's equation. [5] Particular examples of such a representation of a function are the Fourier series and the generalized Fourier series.
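The expansion above can be checked numerically. The following is a minimal sketch using NumPy: it takes normalized Legendre polynomials as the orthonormal basis on $[-1, 1]$, computes the least-squares coefficients by quadrature, and verifies Bessel's inequality. The choice of $f(x) = e^x$, the grid size, and the truncation order $n = 6$ are illustrative assumptions, not part of the theory.

```python
import numpy as np
from numpy.polynomial import legendre

# Orthonormal basis on [-1, 1]: normalized Legendre polynomials
# phi_j(x) = sqrt((2j + 1) / 2) * P_j(x).
def phi(j, x):
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0
    return np.sqrt((2 * j + 1) / 2) * legendre.legval(x, coeffs)

def integrate(y, x):
    # Composite trapezoidal rule for the integral of y over the grid x.
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) / 2)

f = np.exp                        # function to approximate on [-1, 1]
x = np.linspace(-1.0, 1.0, 4001)  # quadrature grid
n = 6                             # truncation order

# Least-squares coefficients a_j = integral of phi_j(x) f(x) dx.
a = [integrate(phi(j, x) * f(x), x) for j in range(n)]

# Partial sum f_n(x) = sum_j a_j phi_j(x).
f_n = sum(a[j] * phi(j, x) for j in range(n))

# Bessel's inequality: sum |a_j|^2 <= ||f||^2, with near-equality
# (Parseval) as n grows, since the Legendre basis is complete.
norm_f_sq = integrate(f(x) ** 2, x)
sum_a_sq = sum(aj ** 2 for aj in a)
max_err = float(np.max(np.abs(f_n - f(x))))
```

With only six basis functions the partial sum already matches $e^x$ to a few parts in $10^4$ on the whole interval, and $\sum_j |a_j|^2$ nearly exhausts $\lVert f \rVert^2$, illustrating Parseval's identity.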

Further discussion

Using linear algebra

It follows that one can find a "best" approximation of another function by minimizing the area between two functions, a continuous function $f$ on $[a, b]$ and a function $g \in W$, where $W$ is a subspace of $C[a, b]$:

$$ \text{Area} = \int_a^b \left| f(x) - g(x) \right| dx, $$

all within the subspace $W$. Due to the frequent difficulty of evaluating integrands involving absolute value, one can instead define

$$ \int_a^b \left[ f(x) - g(x) \right]^2 dx $$

as an adequate criterion for obtaining the least squares approximation, function $g$, of $f$ with respect to the inner product space $W$.

As such, $\lVert f - g \rVert^2$ or, equivalently, $\lVert f - g \rVert$, can thus be written in vector form:

$$ \int_a^b \left[ f(x) - g(x) \right]^2 dx = \langle f - g, f - g \rangle = \lVert f - g \rVert^2. $$

In other words, the least squares approximation of $f$ is the function $g \in W$ closest to $f$ in terms of the inner product $\langle f, g \rangle$. Furthermore, this can be applied with a theorem:

Let $f$ be continuous on $[a, b]$, and let $W$ be a finite-dimensional subspace of $C[a, b]$. The least squares approximating function of $f$ with respect to $W$ is given by

$$ g = \langle f, w_1 \rangle w_1 + \langle f, w_2 \rangle w_2 + \cdots + \langle f, w_n \rangle w_n, $$

where $B = \{ w_1, w_2, \dots, w_n \}$ is an orthonormal basis for $W$.
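The theorem can be put to work directly. The sketch below, using NumPy, builds an orthonormal basis for $W$ = polynomials of degree less than 3 on $[0, 1]$ by Gram–Schmidt under a discretized $L^2$ inner product, then projects $f$ onto $W$. The interval, the degree bound, and the test function $\sin(\pi x)$ are illustrative assumptions.

```python
import numpy as np

a, b, m = 0.0, 1.0, 2001
x = np.linspace(a, b, m)

# Trapezoidal weights so that inner(u, v) approximates the
# integral of u(x) v(x) over [a, b].
wts = np.full(m, (b - a) / (m - 1))
wts[0] /= 2
wts[-1] /= 2

def inner(u, v):
    return float(np.sum(u * v * wts))

# Gram-Schmidt on the monomials 1, x, x^2 gives an orthonormal
# basis {w_1, w_2, w_3} for W.
ortho = []
for k in range(3):
    v = x ** k
    for w in ortho:
        v = v - inner(v, w) * w
    ortho.append(v / np.sqrt(inner(v, v)))

# Least squares approximating function: g = sum_i <f, w_i> w_i.
f = np.sin(np.pi * x)
g = sum(inner(f, w) * w for w in ortho)

# The residual f - g is orthogonal to W, confirming that g is the
# projection of f onto W and hence the closest element of W to f.
residuals = [inner(f - g, w) for w in ortho]
max_err = float(np.max(np.abs(f - g)))
```

The resulting $g$ is the quadratic closest to $\sin(\pi x)$ in the $L^2$ sense; the residual's inner product with every basis vector vanishes to numerical precision, which is exactly the orthogonality that characterizes the least squares approximation.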


References

  1. Cornelius Lanczos (1988). Applied Analysis (Reprint of 1956 Prentice–Hall ed.). Dover Publications. pp. 212–213. ISBN 0-486-65656-X.
  2. Gerald B. Folland (2009). "Equation 3.14". Fourier Analysis and Its Applications (Reprint of Wadsworth and Brooks/Cole 1992 ed.). American Mathematical Society Bookstore. p. 69. ISBN 0-8218-4790-2.
  3. Folland, Gerald B. (2009). Fourier Analysis and Its Applications. American Mathematical Society. p. 69. ISBN 0-8218-4790-2.
  4. David J. Saville, Graham R. Wood (1991). "§2.5 Sum of squares". Statistical Methods: The Geometric Approach (3rd ed.). Springer. p. 30. ISBN 0-387-97517-9.
  5. Gerald B. Folland (2009). "Equation 3.22". Fourier Analysis and Its Applications. p. 77. ISBN 0-8218-4790-2.