Estimation of signal parameters via rotational invariance techniques

Figure: Example of separation into subarrays (2D ESPRIT)

In estimation theory, estimation of signal parameters via rotational invariance techniques (ESPRIT) is a technique to determine the parameters of a mixture of sinusoids in background noise. This technique was first proposed for frequency estimation.[1] However, with the introduction of phased-array systems in everyday technology, it is also used for angle-of-arrival estimation.[2]


General description

System model

The model under investigation is the following (1-D version):

$$y_m[t] = \sum_{k=1}^{K} a_{m,k}\, x_k[t] + n_m[t], \qquad m = 1, \ldots, M$$

This model describes a system that is fed with $K$ input signals $x_k[t]$, with $k = 1, \ldots, K$, and produces $M$ output signals $y_m[t]$, with $m = 1, \ldots, M$. The system's output is sampled at discrete time instances $t$. All input signals are weighted and summed up; there is a separate weight $a_{m,k}$ for each pair of input signal $k$ and output signal $m$. The quantity $n_m[t]$ denotes the noise added by the system.

The one-dimensional form of ESPRIT can be applied if the weights have the following form:

$$a_{m,k} = e^{j(m-1)\omega_k}$$

That is, the weights are complex exponential functions, and their phases are integer multiples of some radial frequency $\omega_k$. Note that this frequency only depends on the index $k$ of the system's input.

The goal of ESPRIT is to estimate the radial frequencies $\omega_k$ given the outputs $y_m[t]$ and the number of input signals $K$.

Since the radial frequencies are the actual objectives, we will change the notation from $a_{m,k}$ to $a_m(\omega_k)$.

Let us now change to a vector notation by putting the weights $a_m(\omega_k)$ into a column vector $\mathbf{a}(\omega_k)$:

$$\mathbf{a}(\omega_k) = \begin{bmatrix} 1 & e^{j\omega_k} & e^{j2\omega_k} & \cdots & e^{j(M-1)\omega_k} \end{bmatrix}^{\mathsf T}$$

Now, the system model can be rewritten using $\mathbf{a}(\omega_k)$ and the output vector $\mathbf{y}[t]$ as follows:

$$\mathbf{y}[t] = \sum_{k=1}^{K} \mathbf{a}(\omega_k)\, x_k[t] + \mathbf{n}[t]$$
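For illustration, a minimal NumPy sketch of this signal model is given below. The array size M, the number of sources K, the number of snapshots T, the frequencies, and the noise level are made-up example values, not part of the original description.

    import numpy as np

    def steering_vector(omega, M):
        # Weight vector a(omega) = [1, e^{j*omega}, ..., e^{j*(M-1)*omega}]^T
        return np.exp(1j * omega * np.arange(M))

    # Example values (assumptions for illustration only).
    M, K, T = 8, 2, 200                      # sensors, sources, snapshots
    omegas = np.array([0.5, 1.3])            # "true" radial frequencies
    rng = np.random.default_rng(0)

    A = np.column_stack([steering_vector(w, M) for w in omegas])                # M x K
    X = rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))          # input signals
    N = 0.1 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))  # noise
    Y = A @ X + N                            # measurements y[t], collected column-wise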

Dividing into virtual sub-arrays

Figure: Maximum overlapping of two sub-arrays (N denotes the number of sensors in the array, m is the number of sensors in each sub-array, and $J_1$ and $J_2$ are selection matrices)

The basis of ESPRIT is that the weight vector $\mathbf{a}(\omega_k)$ has the property that adjacent entries are related as follows:

$$a_{m+1}(\omega_k) = e^{j\omega_k}\, a_m(\omega_k)$$

In order to write down this property for the whole vector $\mathbf{a}(\omega_k)$, we define two selection matrices $\mathbf{J}_1$ and $\mathbf{J}_2$:

$$\mathbf{J}_1 = \begin{bmatrix} \mathbf{I}_{M-1} & \mathbf{0} \end{bmatrix}, \qquad \mathbf{J}_2 = \begin{bmatrix} \mathbf{0} & \mathbf{I}_{M-1} \end{bmatrix}$$

Here, $\mathbf{I}_{M-1}$ is an identity matrix of size $(M-1) \times (M-1)$ and $\mathbf{0}$ is a column vector of zeros. The vector $\mathbf{J}_1\mathbf{a}(\omega_k)$ contains all elements of $\mathbf{a}(\omega_k)$ except the last one. The vector $\mathbf{J}_2\mathbf{a}(\omega_k)$ contains all elements of $\mathbf{a}(\omega_k)$ except the first one. Therefore, we can write:

$$\mathbf{J}_2\,\mathbf{a}(\omega_k) = e^{j\omega_k}\, \mathbf{J}_1\,\mathbf{a}(\omega_k)$$

In general, we have $K$ sinusoids with radial frequencies $\omega_1, \ldots, \omega_K$. Therefore, we put the corresponding weight vectors into a Vandermonde matrix $\mathbf{A}$:

$$\mathbf{A} = \begin{bmatrix} \mathbf{a}(\omega_1) & \mathbf{a}(\omega_2) & \cdots & \mathbf{a}(\omega_K) \end{bmatrix}$$

Moreover, we define a $K \times K$ matrix $\mathbf{H}$ which has the complex exponentials $e^{j\omega_k}$ on its main diagonal and zeros in all other places:

$$\mathbf{H} = \operatorname{diag}\!\left(e^{j\omega_1}, e^{j\omega_2}, \ldots, e^{j\omega_K}\right)$$

Now, we can write down the property for the whole matrix $\mathbf{A}$:

$$\mathbf{J}_2\mathbf{A} = \mathbf{J}_1\mathbf{A}\,\mathbf{H}$$

Note: $\mathbf{H}$ is multiplied from the right such that it scales each column of $\mathbf{J}_1\mathbf{A}$ by the corresponding complex exponential $e^{j\omega_k}$.

In the next sections, we will use the following matrices:

$$\mathbf{A}_1 := \mathbf{J}_1\mathbf{A}, \qquad \mathbf{A}_2 := \mathbf{J}_2\mathbf{A}$$

Here, $\mathbf{A}_1$ contains the first $M-1$ rows of $\mathbf{A}$, while $\mathbf{A}_2$ contains the last $M-1$ rows of $\mathbf{A}$.

Hence, the basic property becomes:

$$\mathbf{A}_2 = \mathbf{A}_1\mathbf{H}$$

Notice that $\mathbf{H}$ applies a rotation to the matrix $\mathbf{A}_1$ in the complex plane. ESPRIT exploits similar rotations from the covariance matrix of the measured data.
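The rotational-invariance property can be checked numerically. The following sketch (using the same made-up example frequencies as above) builds the selection matrices, the Vandermonde matrix and $\mathbf{H}$, and verifies $\mathbf{J}_2\mathbf{A} = \mathbf{J}_1\mathbf{A}\mathbf{H}$ up to floating-point error.

    import numpy as np

    M, K = 8, 2
    omegas = np.array([0.5, 1.3])                      # example frequencies (assumed)
    A = np.exp(1j * np.outer(np.arange(M), omegas))    # Vandermonde matrix, M x K

    # Selection matrices: J1 drops the last row, J2 drops the first row.
    J1 = np.hstack([np.eye(M - 1), np.zeros((M - 1, 1))])
    J2 = np.hstack([np.zeros((M - 1, 1)), np.eye(M - 1)])

    H = np.diag(np.exp(1j * omegas))

    # Basic property: A2 = A1 H, i.e. J2 A = J1 A H.
    assert np.allclose(J2 @ A, J1 @ A @ H)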

Signal subspace

The relation $\mathbf{A}_2 = \mathbf{A}_1\mathbf{H}$ is the first major observation required for ESPRIT. The second major observation concerns the signal subspace that can be computed from the output signals $\mathbf{y}[t]$.

We will now look at multiple time instances $t = 1, 2, \ldots, T$. For each time instance, we measure an output vector $\mathbf{y}[t]$. Let $\mathbf{Y}$ denote the matrix of size $M \times T$ comprising all of these measurements:

$$\mathbf{Y} = \begin{bmatrix} \mathbf{y}[1] & \mathbf{y}[2] & \cdots & \mathbf{y}[T] \end{bmatrix}$$

Likewise, let us put all input signals into a $K \times T$ matrix $\mathbf{X}$:

$$\mathbf{X} = \begin{bmatrix} x_1[1] & x_1[2] & \cdots & x_1[T] \\ \vdots & \vdots & & \vdots \\ x_K[1] & x_K[2] & \cdots & x_K[T] \end{bmatrix}$$

We do the same for the noise components:

$$\mathbf{N} = \begin{bmatrix} \mathbf{n}[1] & \mathbf{n}[2] & \cdots & \mathbf{n}[T] \end{bmatrix}$$

The system model can now be written as

$$\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{N}$$

The singular value decomposition (SVD) of $\mathbf{Y}$ is given as

$$\mathbf{Y} = \mathbf{U}\,\mathbf{E}\,\mathbf{V}^{*}$$

where $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices of sizes $M \times M$ and $T \times T$, respectively, and $\mathbf{E}$ is a rectangular diagonal matrix of size $M \times T$ that holds the singular values in descending order, starting with the largest (top left). The operator $*$ denotes the complex-conjugate transpose (Hermitian transpose).

Let us assume that $T \geq M$, which means that the number of time instances at which we conduct a measurement, $T$, is at least as large as the number of output signals, $M$.

Notice that in the system model we have $K$ input signals. We presume that the $K$ largest singular values stem from these input signals, while all other singular values are presumed to stem from noise. That is, if there were no noise, there would be only $K$ non-zero singular values. We will now decompose each SVD matrix into submatrices, where some submatrices correspond to the input signals and some correspond to the noise:

$$\mathbf{U} =: \begin{bmatrix} \mathbf{U}_{\mathrm S} & \mathbf{U}_{\mathrm N} \end{bmatrix}, \qquad \mathbf{E} =: \begin{bmatrix} \mathbf{E}_{\mathrm S} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{E}_{\mathrm N} & \mathbf{0} \end{bmatrix}, \qquad \mathbf{V} =: \begin{bmatrix} \mathbf{V}_{\mathrm S} & \mathbf{V}_{\mathrm N} & \mathbf{V}_{0} \end{bmatrix}$$

Here, $\mathbf{U}_{\mathrm S}$ and $\mathbf{V}_{\mathrm S}$ contain the first $K$ columns of $\mathbf{U}$ and $\mathbf{V}$, respectively, and $\mathbf{E}_{\mathrm S}$ is a diagonal matrix comprising the $K$ largest singular values. The SVD can be equivalently written as follows:

$$\mathbf{Y} = \mathbf{U}_{\mathrm S}\,\mathbf{E}_{\mathrm S}\,\mathbf{V}_{\mathrm S}^{*} + \mathbf{U}_{\mathrm N}\,\mathbf{E}_{\mathrm N}\,\mathbf{V}_{\mathrm N}^{*}$$

$\mathbf{U}_{\mathrm S}$, $\mathbf{E}_{\mathrm S}$, and $\mathbf{V}_{\mathrm S}$ represent the contribution of the input signals $x_k[t]$ to $\mathbf{Y}$. Therefore, we will call $\mathbf{U}_{\mathrm S}$ the signal subspace. In contrast, $\mathbf{U}_{\mathrm N}$, $\mathbf{E}_{\mathrm N}$, and $\mathbf{V}_{\mathrm N}$ represent the contribution of the noise $n_m[t]$ to $\mathbf{Y}$. Hence, by using the system model, we can write:

$$\mathbf{A}\mathbf{X} = \mathbf{U}_{\mathrm S}\,\mathbf{E}_{\mathrm S}\,\mathbf{V}_{\mathrm S}^{*}$$

and

$$\mathbf{N} = \mathbf{U}_{\mathrm N}\,\mathbf{E}_{\mathrm N}\,\mathbf{V}_{\mathrm N}^{*}$$

By modifying the second-last equation, we get:

$$\mathbf{U}_{\mathrm S} = \mathbf{A}\,\mathbf{X}\,\mathbf{V}_{\mathrm S}\,\mathbf{E}_{\mathrm S}^{-1} =: \mathbf{A}\,\mathbf{F}$$

That is, the signal subspace $\mathbf{U}_{\mathrm S}$ is the product of the matrix $\mathbf{A}$ and some other matrix $\mathbf{F}$. In the following, it is only important that there exists such an invertible matrix $\mathbf{F}$; its actual content will not be important.

Note:

The signal subspace is usually not computed from the measurement matrix $\mathbf{Y}$ itself. Instead, one may use the auto-correlation matrix

$$\mathbf{R}_{\mathbf{Y}\mathbf{Y}} := \frac{1}{T}\,\mathbf{Y}\,\mathbf{Y}^{*}$$

Hence, $\mathbf{R}_{\mathbf{Y}\mathbf{Y}}$ can be decomposed into signal subspace and noise subspace: the eigenvectors belonging to the $K$ largest eigenvalues span the signal subspace $\mathbf{U}_{\mathrm S}$, while the remaining eigenvectors span the noise subspace.
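A sketch of this subspace step with made-up example data is given below: the signal subspace is taken as the eigenvectors of the sample auto-correlation matrix belonging to the K largest eigenvalues.

    import numpy as np

    # Synthetic measurements Y (M x T), generated as in the signal-model sketch above.
    M, K, T = 8, 2, 200
    omegas = np.array([0.5, 1.3])
    rng = np.random.default_rng(1)
    A = np.exp(1j * np.outer(np.arange(M), omegas))
    Y = (A @ (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T)))
         + 0.1 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))))

    # Sample auto-correlation matrix and its eigendecomposition.
    R = Y @ Y.conj().T / T
    eigvals, eigvecs = np.linalg.eigh(R)               # eigenvalues in ascending order

    # Signal subspace: eigenvectors belonging to the K largest eigenvalues.
    U_s = eigvecs[:, np.argsort(eigvals)[::-1][:K]]    # M x K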

Putting things together

These are the two basic properties that are known now:

$$\mathbf{A}_2 = \mathbf{A}_1\mathbf{H} \qquad \text{and} \qquad \mathbf{U}_{\mathrm S} = \mathbf{A}\,\mathbf{F}$$

Let us start with the equation on the right and multiply it by the selection matrices:

$$\mathbf{J}_1\mathbf{U}_{\mathrm S} = \mathbf{J}_1\mathbf{A}\,\mathbf{F} = \mathbf{A}_1\mathbf{F} \qquad \text{and} \qquad \mathbf{J}_2\mathbf{U}_{\mathrm S} = \mathbf{J}_2\mathbf{A}\,\mathbf{F} = \mathbf{A}_2\mathbf{F} = \mathbf{A}_1\mathbf{H}\,\mathbf{F}$$

Define these abbreviations for the truncated signal subspaces:

$$\mathbf{S}_1 := \mathbf{J}_1\mathbf{U}_{\mathrm S} = \mathbf{A}_1\mathbf{F}, \qquad \mathbf{S}_2 := \mathbf{J}_2\mathbf{U}_{\mathrm S} = \mathbf{A}_1\mathbf{H}\,\mathbf{F}$$

Moreover, define this matrix:

$$\mathbf{P} := \mathbf{F}^{-1}\,\mathbf{H}\,\mathbf{F}$$

Note that this definition has the form of an eigenvalue decomposition of $\mathbf{P}$, where the eigenvalues are stored in the matrix $\mathbf{H}$. As defined in an earlier section, $\mathbf{H}$ stores the complex exponentials $e^{j\omega_k}$ on its main diagonal. Their phases are the sought-after radial frequencies $\omega_1, \ldots, \omega_K$.

Using these abbreviations, the following form is obtained:

$$\mathbf{S}_2 = \mathbf{A}_1\mathbf{H}\,\mathbf{F} = \mathbf{A}_1\mathbf{F}\,\mathbf{F}^{-1}\mathbf{H}\,\mathbf{F} = \mathbf{S}_1\mathbf{P}$$

The idea is now that, if we could compute $\mathbf{P}$ from this equation, we would be able to find the eigenvalues of $\mathbf{P}$, which in turn would give us the radial frequencies. However, $\mathbf{S}_1$ is generally not invertible, so $\mathbf{P}$ cannot simply be isolated by inversion. Instead, a least squares solution can be used:

$$\mathbf{P} = \left(\mathbf{S}_1^{*}\mathbf{S}_1\right)^{-1}\mathbf{S}_1^{*}\,\mathbf{S}_2$$
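A minimal sketch of this least squares step follows. To keep the snippet self-contained, a noise-free signal subspace is constructed directly from the Vandermonde matrix of the example frequencies (for noise-free data, an orthonormal basis of the columns of A spans the same subspace as U_s); this construction is an illustrative assumption, not part of the algorithm itself.

    import numpy as np

    # Noise-free stand-in for the signal subspace U_s (M x K).
    M, K = 8, 2
    omegas = np.array([0.5, 1.3])                      # example frequencies (assumed)
    A = np.exp(1j * np.outer(np.arange(M), omegas))
    U_s, _ = np.linalg.qr(A)                           # orthonormal basis of span(A)

    # Truncated signal subspaces: S1 drops the last row, S2 drops the first row.
    S1, S2 = U_s[:-1, :], U_s[1:, :]

    # Least squares solution of S1 P = S2.
    P, *_ = np.linalg.lstsq(S1, S2, rcond=None)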

Estimation of radial frequencies

The eigenvalues $\lambda_1, \ldots, \lambda_K$ of $\mathbf{P}$ are complex numbers and can be written as

$$\lambda_k = \alpha_k\, e^{j\omega_k}, \qquad \alpha_k = |\lambda_k|$$

The estimated radial frequencies $\omega_k$ are the phases (angles) of the eigenvalues $\lambda_k$.
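A short sketch of this final step, rebuilding the noise-free example from the previous sketch so that it runs on its own:

    import numpy as np

    # Rebuild the noise-free example from the least squares sketch above.
    M, omegas = 8, np.array([0.5, 1.3])
    U_s, _ = np.linalg.qr(np.exp(1j * np.outer(np.arange(M), omegas)))
    P, *_ = np.linalg.lstsq(U_s[:-1, :], U_s[1:, :], rcond=None)

    # Estimated radial frequencies: phases of the eigenvalues of P.
    estimated_omegas = np.angle(np.linalg.eigvals(P))
    # sorted(estimated_omegas) is approximately [0.5, 1.3] for this example.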

Algorithm summary

  1. Collect measurements $\mathbf{y}[1], \ldots, \mathbf{y}[T]$.
  2. If not already known: estimate the number of input signals $K$.
  3. Compute the auto-correlation matrix $\mathbf{R}_{\mathbf{Y}\mathbf{Y}} = \frac{1}{T}\mathbf{Y}\mathbf{Y}^{*}$.
  4. Compute the singular value decomposition (SVD) of $\mathbf{R}_{\mathbf{Y}\mathbf{Y}}$ and extract the signal subspace $\mathbf{U}_{\mathrm S}$ as the $K$ columns belonging to the largest singular values.
  5. Compute the matrices $\mathbf{S}_1$ and $\mathbf{S}_2$,
    $\mathbf{S}_1 = \mathbf{J}_1\mathbf{U}_{\mathrm S}$ and $\mathbf{S}_2 = \mathbf{J}_2\mathbf{U}_{\mathrm S}$, where $\mathbf{J}_1 = \begin{bmatrix}\mathbf{I}_{M-1} & \mathbf{0}\end{bmatrix}$ and $\mathbf{J}_2 = \begin{bmatrix}\mathbf{0} & \mathbf{I}_{M-1}\end{bmatrix}$.
  6. Solve the equation
    $\mathbf{S}_2 = \mathbf{S}_1\mathbf{P}$
    for $\mathbf{P}$. An example would be the least squares solution
    $\mathbf{P} = \left(\mathbf{S}_1^{*}\mathbf{S}_1\right)^{-1}\mathbf{S}_1^{*}\mathbf{S}_2$
    (here, * denotes the Hermitian (conjugate) transpose). An alternative would be the total least squares solution.
  7. Compute the eigenvalues $\lambda_1, \ldots, \lambda_K$ of $\mathbf{P}$.
  8. The phases of the eigenvalues $\lambda_k$ are the sought-after radial frequencies $\omega_k$, that is, $\hat{\omega}_k = \arg\lambda_k$.

Notes

Choice of selection matrices

In the derivation above, the selection matrices $\mathbf{J}_1$ and $\mathbf{J}_2$ were used. For simplicity, they were defined as $\mathbf{J}_1 = \begin{bmatrix}\mathbf{I}_{M-1} & \mathbf{0}\end{bmatrix}$ and $\mathbf{J}_2 = \begin{bmatrix}\mathbf{0} & \mathbf{I}_{M-1}\end{bmatrix}$. However, at no point during the derivation was it required that $\mathbf{J}_1$ and $\mathbf{J}_2$ be defined like this. Indeed, any appropriate matrices may be used as long as the rotational invariance

$$\mathbf{J}_2\,\mathbf{a}(\omega_k) = e^{j\omega_k}\, \mathbf{J}_1\,\mathbf{a}(\omega_k)$$

(or some generalization of it) is maintained. Accordingly, $\mathbf{J}_1\mathbf{A}$ and $\mathbf{J}_2\mathbf{A}$ may contain any rows of $\mathbf{A}$, as illustrated in the sketch below.
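As one illustration of this freedom (a sketch using the same made-up example frequencies as above), the two sub-arrays below are displaced by two sensors instead of one; the invariance then involves the squared rotation $\mathbf{H}^{2}$, i.e. phases $2\omega_k$.

    import numpy as np

    M, K = 8, 2
    omegas = np.array([0.5, 1.3])                      # example frequencies (assumed)
    A = np.exp(1j * np.outer(np.arange(M), omegas))    # Vandermonde matrix

    # Alternative selection matrices: sub-arrays displaced by delta = 2 rows.
    delta = 2
    J1 = np.eye(M)[:M - delta, :]                      # selects rows 1 .. M-delta
    J2 = np.eye(M)[delta:, :]                          # selects rows delta+1 .. M

    H = np.diag(np.exp(1j * omegas))

    # Generalized invariance: J2 A = J1 A H^delta (phases are delta * omega_k).
    assert np.allclose(J2 @ A, J1 @ A @ np.linalg.matrix_power(H, delta))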

Generalized rotational invariance

The rotational invariance used in the derivation may be generalized. So far, the matrix $\mathbf{H}$ has been defined to be a diagonal matrix that stores the sought-after complex exponentials on its main diagonal. However, $\mathbf{H}$ may also exhibit some other, possibly non-diagonal, structure.[3] For instance, it may be an upper triangular matrix. In this case, the matrix $\mathbf{H}$ constitutes a triangularization of $\mathbf{P}$.

Algorithm example

Pseudocode for an implementation of the ESPRIT algorithm for direction-of-arrival (DOA) estimation is given below (elspacing denotes the element spacing of the array in wavelengths).

function esprit(y, model_order, number_of_sources, elspacing):
    m = model_order
    n = number_of_sources

    create the covariance matrix R from the noisy measurements y; R is of size m-by-m

    compute the SVD of R:
        [U, E, V] = svd(R)

    obtain the orthonormal eigenvectors corresponding to the sources:
        S = U(:, 1:n)

    split the orthonormal eigenvectors in two:
        S1 = S(1:m-1, :)
        S2 = S(2:m, :)

    compute P via least squares (MATLAB's backslash operator):
        P = S1 \ S2

    find the angles of the eigenvalues of P and convert them to directions of arrival:
        w = angle(eig(P)) / (2*pi*elspacing)
        doa = asind(w)        % arcsin, in degrees

    return doa



References

  1. Paulraj, A.; Roy, R.; Kailath, T. (1985). "Estimation of Signal Parameters via Rotational Invariance Techniques - ESPRIT". Nineteenth Asilomar Conference on Circuits, Systems and Computers. pp. 83–89. doi:10.1109/ACSSC.1985.671426. ISBN 978-0-8186-0729-5. S2CID 2293566.
  2. Vasylyshyn, Volodymyr (2009). "The direction of arrival estimation using ESPRIT with sparse arrays". Proc. 2009 European Radar Conference (EuRAD), 30 September – 2 October 2009. pp. 246–249.
  3. Hu, Anzhong; Lv, Tiejun; Gao, Hui; Zhang, Zhang; Yang, Shaoshi (2014). "An ESPRIT-Based Approach for 2-D Localization of Incoherently Distributed Sources in Massive MIMO Systems". IEEE Journal of Selected Topics in Signal Processing. 8 (5): 996–1011. arXiv:1403.5352. Bibcode:2014ISTSP...8..996H. doi:10.1109/JSTSP.2014.2313409. ISSN 1932-4553. S2CID 11664051.

Further reading