Wiener filter

Last updated

In signal processing, the Wiener filter is a filter used to produce an estimate of a desired or target random process by linear time-invariant (LTI) filtering of an observed noisy process, assuming known stationary signal and noise spectra, and additive noise. The Wiener filter minimizes the mean square error between the estimated random process and the desired process.

Contents

Description

The goal of the Wiener filter is to compute a statistical estimate of an unknown signal using a related signal as an input and filtering that known signal to produce the estimate as an output. For example, the known signal might consist of an unknown signal of interest that has been corrupted by additive noise. The Wiener filter can be used to filter out the noise from the corrupted signal to provide an estimate of the underlying signal of interest. The Wiener filter is based on a statistical approach, and a more statistical account of the theory is given in the minimum mean square error (MMSE) estimator article.

Typical deterministic filters are designed for a desired frequency response. However, the design of the Wiener filter takes a different approach. One is assumed to have knowledge of the spectral properties of the original signal and the noise, and one seeks the linear time-invariant filter whose output would come as close to the original signal as possible. Wiener filters are characterized by the following: [1]

  1. Assumption: signal and (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation
  2. Requirement: the filter must be physically realizable/causal (this requirement can be dropped, resulting in a non-causal solution)
  3. Performance criterion: minimum mean-square error (MMSE)

This filter is frequently used in the process of deconvolution; for this application, see Wiener deconvolution.

Wiener filter solutions

Let be an unknown signal which must be estimated from a measurement signal . Where alpha is a tunable parameter. is known as prediction, is known as filtering, and is known as smoothing (see Wiener filtering chapter of [1] for more details).

The Wiener filter problem has solutions for three possible cases: one where a noncausal filter is acceptable (requiring an infinite amount of both past and future data), the case where a causal filter is desired (using an infinite amount of past data), and the finite impulse response (FIR) case where only input data is used (i.e. the result or output is not fed back into the filter as in the IIR case). The first case is simple to solve but is not suited for real-time applications. Wiener's main accomplishment was solving the case where the causality requirement is in effect; Norman Levinson gave the FIR solution in an appendix of Wiener's book.

Noncausal solution

where are spectral densities. Provided that is optimal, then the minimum mean-square error equation reduces to

and the solution is the inverse two-sided Laplace transform of .

Causal solution

where

This general formula is complicated and deserves a more detailed explanation. To write down the solution in a specific case, one should follow these steps: [2]

  1. Start with the spectrum in rational form and factor it into causal and anti-causal components: where contains all the zeros and poles in the left half plane (LHP) and contains the zeroes and poles in the right half plane (RHP). This is called the Wiener–Hopf factorization.
  2. Divide by and write out the result as a partial fraction expansion.
  3. Select only those terms in this expansion having poles in the LHP. Call these terms .
  4. Divide by . The result is the desired filter transfer function .

Finite impulse response Wiener filter for discrete series

Block diagram view of the FIR Wiener filter for discrete series. An input signal w[n] is convolved with the Wiener filter g[n] and the result is compared to a reference signal s[n] to obtain the filtering error e[n]. Wiener block.svg
Block diagram view of the FIR Wiener filter for discrete series. An input signal w[n] is convolved with the Wiener filter g[n] and the result is compared to a reference signal s[n] to obtain the filtering error e[n].

The causal finite impulse response (FIR) Wiener filter, instead of using some given data matrix X and output vector Y, finds optimal tap weights by using the statistics of the input and output signals. It populates the input matrix X with estimates of the auto-correlation of the input signal (T) and populates the output vector Y with estimates of the cross-correlation between the output and input signals (V).

In order to derive the coefficients of the Wiener filter, consider the signal w[n] being fed to a Wiener filter of order (number of past taps) N and with coefficients . The output of the filter is denoted x[n] which is given by the expression

The residual error is denoted e[n] and is defined as e[n] = x[n]  s[n] (see the corresponding block diagram). The Wiener filter is designed so as to minimize the mean square error (MMSE criteria) which can be stated concisely as follows:

where denotes the expectation operator. In the general case, the coefficients may be complex and may be derived for the case where w[n] and s[n] are complex as well. With a complex signal, the matrix to be solved is a Hermitian Toeplitz matrix, rather than symmetric Toeplitz matrix. For simplicity, the following considers only the case where all these quantities are real. The mean square error (MSE) may be rewritten as:

To find the vector which minimizes the expression above, calculate its derivative with respect to each

Assuming that w[n] and s[n] are each stationary and jointly stationary, the sequences and known respectively as the autocorrelation of w[n] and the cross-correlation between w[n] and s[n] can be defined as follows:

The derivative of the MSE may therefore be rewritten as:

Note that for real , the autocorrelation is symmetric:

Letting the derivative be equal to zero results in:

which can be rewritten (using the above symmetric property) in matrix form

These equations are known as the Wiener–Hopf equations. The matrix T appearing in the equation is a symmetric Toeplitz matrix. Under suitable conditions on , these matrices are known to be positive definite and therefore non-singular yielding a unique solution to the determination of the Wiener filter coefficient vector, . Furthermore, there exists an efficient algorithm to solve such Wiener–Hopf equations known as the Levinson-Durbin algorithm so an explicit inversion of T is not required.

In some articles, the cross correlation function is defined in the opposite way:

Then, the matrix will contain ; this is just a difference in notation. Whichever notation is used, note that for real :

Relationship to the least squares filter

The realization of the causal Wiener filter looks a lot like the solution to the least squares estimate, except in the signal processing domain. The least squares solution, for input matrix and output vector is

The FIR Wiener filter is related to the least mean squares filter, but minimizing the error criterion of the latter does not rely on cross-correlations or auto-correlations. Its solution converges to the Wiener filter solution.

Complex signals

For complex signals, the derivation of the complex Wiener filter is performed by minimizing =. This involves computing partial derivatives with respect to both the real and imaginary parts of , and requiring them both to be zero.

The resulting Wiener-Hopf equations are:

which can be rewritten in matrix form:

Note here that:

The Wiener coefficient vector is then computed as:

Applications

The Wiener filter has a variety of applications in signal processing, image processing, [3] control systems, and digital communications. These applications generally fall into one of four main categories:

Astronaut-noise.png
Noisy image of an astronaut
Astronaut-denoised.png
The image after a Wiener filter is applied (full-view recommended)

For example, the Wiener filter can be used in image processing to remove noise from a picture. For example, using the Mathematica function: WienerFilter[image,2] on the first image on the right, produces the filtered image below it.

It is commonly used to denoise audio signals, especially speech, as a preprocessor before speech recognition.

History

The filter was proposed by Norbert Wiener during the 1940s and published in 1949. [4] [5] The discrete-time equivalent of Wiener's work was derived independently by Andrey Kolmogorov and published in 1941. [6] Hence the theory is often called the Wiener–Kolmogorov filtering theory (cf. Kriging). The Wiener filter was the first statistically designed filter to be proposed and subsequently gave rise to many others including the Kalman filter.

See also

Related Research Articles

<span class="mw-page-title-main">Row and column spaces</span> Vector spaces associated to a matrix

In linear algebra, the column space of a matrix A is the span of its column vectors. The column space of a matrix is the image or range of the corresponding matrix transformation.

<span class="mw-page-title-main">Matrix multiplication</span> Mathematical operation in linear algebra

In mathematics, particularly in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of rows of the first and the number of columns of the second matrix. The product of matrices A and B is denoted as AB.

In linear algebra, a Toeplitz matrix or diagonal-constant matrix, named after Otto Toeplitz, is a matrix in which each descending diagonal from left to right is constant. For instance, the following matrix is a Toeplitz matrix:

In the mathematical field of differential geometry, a metric tensor is an additional structure on a manifold M that allows defining distances and angles, just as the inner product on a Euclidean space allows defining distances and angles there. More precisely, a metric tensor at a point p of M is a bilinear form defined on the tangent space at p, and a metric field on M consists of a metric tensor at each point p of M that varies smoothly with p.

In linear algebra, an n-by-n square matrix A is called invertible if there exists an n-by-n square matrix B such that

Levinson recursion or Levinson–Durbin recursion is a procedure in linear algebra to recursively calculate the solution to an equation involving a Toeplitz matrix. The algorithm runs in Θ(n2) time, which is a strong improvement over Gauss–Jordan elimination, which runs in Θ(n3).

In linear algebra, a square matrix  is called diagonalizable or non-defective if it is similar to a diagonal matrix. That is, if there exists an invertible matrix  and a diagonal matrix such that . This is equivalent to . This property exists for any linear map: for a finite-dimensional vector space , a linear map  is called diagonalizable if there exists an ordered basis of  consisting of eigenvectors of . These definitions are equivalent: if  has a matrix representation as above, then the column vectors of  form a basis consisting of eigenvectors of , and the diagonal entries of  are the corresponding eigenvalues of ; with respect to this eigenvector basis,  is represented by .

The cross-correlation matrix of two random vectors is a matrix containing as elements the cross-correlations of all pairs of elements of the random vectors. The cross-correlation matrix is used in various digital signal processing algorithms.

In linear algebra, a QR decomposition, also known as a QR factorization or QU factorization, is a decomposition of a matrix A into a product A = QR of an orthonormal matrix Q and an upper triangular matrix R. QR decomposition is often used to solve the linear least squares (LLS) problem and is the basis for a particular eigenvalue algorithm, the QR algorithm.

<span class="mw-page-title-main">Block matrix</span> Matrix defined using smaller matrices called blocks

In mathematics, a block matrix or a partitioned matrix is a matrix that is interpreted as having been broken into sections called blocks or submatrices.

<span class="mw-page-title-main">Kriging</span> Method of interpolation

In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on Gaussian process governed by prior covariances. Under suitable assumptions of the prior, kriging gives the best linear unbiased prediction (BLUP) at unsampled locations. Interpolating methods based on other criteria such as smoothness may not yield the BLUP. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Wiener–Kolmogorov prediction, after Norbert Wiener and Andrey Kolmogorov.

In mathematics, the matrix exponential is a matrix function on square matrices analogous to the ordinary exponential function. It is used to solve systems of linear differential equations. In the theory of Lie groups, the matrix exponential gives the exponential map between a matrix Lie algebra and the corresponding Lie group.

In linear algebra, linear transformations can be represented by matrices. If is a linear transformation mapping to and is a column vector with entries, then

In mathematics, the Kronecker product, sometimes denoted by ⊗, is an operation on two matrices of arbitrary size resulting in a block matrix. It is a specialization of the tensor product from vectors to matrices and gives the matrix of the tensor product linear map with respect to a standard choice of basis. The Kronecker product is to be distinguished from the usual matrix multiplication, which is an entirely different operation. The Kronecker product is also sometimes called matrix direct product.

In linear algebra, a circulant matrix is a square matrix in which all rows are composed of the same elements and each row is rotated one element to the right relative to the preceding row. It is a particular kind of Toeplitz matrix.

In mathematics, the kernel of a linear map, also known as the null space or nullspace, is the part of the domain which the map maps to the zero vector. That is, given a linear map L : VW between two vector spaces V and W, the kernel of L is the vector space of all elements v of V such that L(v) = 0, where 0 denotes the zero vector in W, or more symbolically:

In mathematics, matrix calculus is a specialized notation for doing multivariable calculus, especially over spaces of matrices. It collects the various partial derivatives of a single function with respect to many variables, and/or of a multivariate function with respect to a single variable, into vectors and matrices that can be treated as single entities. This greatly simplifies operations such as finding the maximum or minimum of a multivariate function and solving systems of differential equations. The notation used here is commonly used in statistics and engineering, while the tensor index notation is preferred in physics.

Least mean squares (LMS) algorithms are a class of adaptive filter used to mimic a desired filter by finding the filter coefficients that relate to producing the least mean square of the error signal. It is a stochastic gradient descent method in that the filter is only adapted based on the error at the current time. It was invented in 1960 by Stanford University professor Bernard Widrow and his first Ph.D. student, Ted Hoff, based on their research in single-layer neural networks (ADALINE). Specifically, they used gradient descent to train ADALINE to recognize patterns, and called the algorithm "delta rule". They then applied the rule to filters, resulting in the LMS algorithm.

In the mathematical discipline of matrix theory, a Jordan matrix, named after Camille Jordan, is a block diagonal matrix over a ring R, where each block along the diagonal, called a Jordan block, has the following form:

<span class="mw-page-title-main">Generalized pencil-of-function method</span>

Generalized pencil-of-function method (GPOF), also known as matrix pencil method, is a signal processing technique for estimating a signal or extracting information with complex exponentials. Being similar to Prony and original pencil-of-function methods, it is generally preferred to those for its robustness and computational efficiency.

References

  1. 1 2 Brown, Robert Grover; Hwang, Patrick Y.C. (1996). Introduction to Random Signals and Applied Kalman Filtering (3 ed.). New York: John Wiley & Sons. ISBN   978-0-471-12839-7.
  2. Welch, Lloyd R. "Wiener–Hopf Theory" (PDF). Archived from the original (PDF) on 2006-09-20. Retrieved 2006-11-25.
  3. Boulfelfel, D.; Rangayyan, R. M.; Hahn, L. J.; Kloiber, R. (1994). "Three-dimensional restoration of single photon emission computed tomography images". IEEE Transactions on Nuclear Science. 41 (5): 1746–1754. Bibcode:1994ITNS...41.1746B. doi:10.1109/23.317385. S2CID   33708058.
  4. Wiener N: The interpolation, extrapolation and smoothing of stationary time series', Report of the Services 19, Research Project DIC-6037 MIT, February 1942
  5. Wiener, Norbert (1949). Extrapolation, Interpolation, and Smoothing of Stationary Time Series. New York: Wiley. ISBN   978-0-262-73005-1.
  6. Kolmogorov A.N: 'Stationary sequences in Hilbert space', (In Russian) Bull. Moscow Univ. 1941 vol.2 no.6 1-40. English translation in Kailath T. (ed.) Linear least squares estimation Dowden, Hutchinson & Ross 1977 ISBN   0-87933-098-8

Further reading