Householder transformation

In linear algebra, a Householder transformation (also known as a Householder reflection or elementary reflector) is a linear transformation that describes a reflection about a plane or hyperplane containing the origin. The Householder transformation was introduced in a 1958 paper by Alston Scott Householder. [1]

Its analogue over general inner product spaces is the Householder operator.

Definition

Transformation

The reflection hyperplane can be defined by its normal vector, a unit vector $v$ (a vector with length $1$) that is orthogonal to the hyperplane. The reflection of a point $x$ about this hyperplane is the linear transformation:

$$x - 2\langle x, v \rangle v = x - 2v(v^* x),$$

where $v$ is given as a column unit vector with conjugate transpose $v^*$.

Householder matrix

The matrix constructed from this transformation can be expressed in terms of an outer product as:

$$P = I - 2vv^*.$$

$P$ is known as the Householder matrix, where $I$ is the identity matrix.
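
For a concrete feel, here is a minimal numerical sketch (our own illustration, assuming NumPy; the helper name `householder_matrix` is ours, not a library function) that builds $P$ from a normal vector and reflects a point:

```python
import numpy as np

def householder_matrix(v):
    """Householder matrix P = I - 2 v v* for a column vector v."""
    v = np.asarray(v, dtype=complex).reshape(-1, 1)
    v = v / np.linalg.norm(v)          # normalize, so P is a true reflection
    n = v.shape[0]
    return np.eye(n) - 2.0 * (v @ v.conj().T)

# Reflect a point about the hyperplane orthogonal to v = e1 (the y-z plane).
v = [1.0, 0.0, 0.0]
x = np.array([3.0, 4.0, 5.0])
P = householder_matrix(v)
print(np.real_if_close(P @ x))         # [-3.  4.  5.] -- the component along v flips
```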

Properties

The Householder matrix has the following properties:

  • it is Hermitian: $P = P^*$,
  • it is unitary: $P^{-1} = P^*$,
  • hence it is involutory: $P^2 = I$.
  • A Householder matrix has eigenvalues $\pm 1$. To see this, notice that if $u$ is orthogonal to the vector $v$ which was used to create the reflector, then $Pu = u$, i.e., $1$ is an eigenvalue of multiplicity $n - 1$, since there are $n - 1$ independent vectors orthogonal to $v$. Also, notice $Pv = -v$, and so $-1$ is an eigenvalue with multiplicity $1$.
  • The determinant of a Householder reflector is $-1$, since the determinant of a matrix is the product of its eigenvalues, in this case one of which is $-1$ with the remainder being $1$ (as in the previous point); see also the numerical check after this list.
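
These properties are straightforward to verify numerically; the short check below (again a sketch of ours, assuming NumPy) confirms them for a random complex reflector:

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v = (v / np.linalg.norm(v)).reshape(-1, 1)   # random complex unit vector
P = np.eye(5) - 2 * (v @ v.conj().T)

assert np.allclose(P, P.conj().T)                # Hermitian: P = P*
assert np.allclose(P @ P.conj().T, np.eye(5))    # unitary: P P* = I
assert np.allclose(P @ P, np.eye(5))             # involutory: P^2 = I
eig = np.sort(np.linalg.eigvalsh(P))             # eigenvalues -1, 1, ..., 1
assert np.allclose(eig, [-1, 1, 1, 1, 1])
assert np.isclose(np.linalg.det(P), -1)          # determinant is -1
```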

Applications

Geometric optics

In geometric optics, specular reflection can be expressed in terms of the Householder matrix (see Specular reflection § Vector formulation).

Numerical linear algebra

Householder transformations are widely used in numerical linear algebra, for example, to annihilate the entries below the main diagonal of a matrix, [2] to perform QR decompositions, and in the first step of the QR algorithm. They are also widely used for reducing a matrix to Hessenberg form. For symmetric or Hermitian matrices, the symmetry can be preserved, resulting in tridiagonalization. [3]

QR decomposition

Householder reflections can be used to calculate QR decompositions by reflecting first one column of a matrix onto a multiple of a standard basis vector, calculating the transformation matrix, multiplying it with the original matrix and then recursing down the minors of that product.

To accomplish this, a Hermitian unitary matrix $Q$ is sought which takes a complex vector $x$ into a complex multiple of a complex vector $e$. For the QR decomposition, $e$ will be a unit coordinate vector, say for the $k$th coordinate. A complex matrix $Q$ having the form

$$Q = I - uu^*, \qquad u^* u = 2,$$

has the desired property of being Hermitian unitary. Here $^*$ denotes the conjugate transpose. Since the only two vectors involved are $x$ and $e$, the vector $u$ must have the form $u = ax + be$, where $a$ and $b$ are complex coefficients to be determined. Since an overall phase factor for $u$ does not matter, the coefficient $a$ can be chosen to be positive real. Now

$$Qx = x\left(1 - a\,(u^* x)\right) - e\left(b\,(u^* x)\right).$$

For the coefficient of the vector $x$ to be zero, the two terms in $u^* x$ must have the same phase within a multiple of 180 degrees, so we must have $\arg(b) = \arg(e^* x)$ within a multiple of 180 degrees; there are two solutions according to whether an even or odd multiple of 180 degrees is chosen. Then, for the coefficient of $x$ to vanish, two linear equations in the moduli of $a$ and $b$ must be solved, the phases of $a$ and $b$ having already been determined. Suppose $e$ is the $k$th unit coordinate vector, so that $e^* e = 1$ and $x_k = e^* x$, and let $|x| = \sqrt{x^* x}$. Then $a$ and $b$ may be expressed simply either as

$$a = \frac{1}{\sqrt{|x|\,(|x| + |x_k|)}}, \qquad b = a\,|x|\,e^{i \arg(x_k)},$$

or as

$$a = \frac{1}{\sqrt{|x|\,(|x| - |x_k|)}}, \qquad b = -a\,|x|\,e^{i \arg(x_k)}.$$

The multiplier of $e$ in $Qx$ is $-b/a$ for both solutions. The first solution is better because its denominator cannot be near zero compared to $|x|$.
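
This construction translates directly into a working routine. The following is a minimal Householder-QR sketch (our illustration only, assuming NumPy; production code would call `numpy.linalg.qr` or LAPACK instead), using the first, numerically stable sign choice:

```python
import numpy as np

def householder_qr(A):
    """QR decomposition via Householder reflections (complex-safe sketch)."""
    A = np.array(A, dtype=complex)
    m, n = A.shape
    Q = np.eye(m, dtype=complex)
    R = A.copy()
    for k in range(min(m, n)):
        x = R[k:, k]
        normx = np.linalg.norm(x)
        if normx == 0:
            continue                      # column already annihilated
        # Stable choice: reflect x onto -exp(i*arg(x_k)) * |x| * e_1,
        # so the quantity |x| + |x_k| (never near zero) enters the formulas.
        phase = x[0] / abs(x[0]) if x[0] != 0 else 1.0
        u = x.copy()
        u[0] += phase * normx
        u /= np.linalg.norm(u)            # unit u, so P = I - 2 u u*
        R[k:, :] -= 2.0 * np.outer(u, u.conj() @ R[k:, :])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ u, u.conj())
    return Q, R

A = np.array([[4., 1., -2.], [1., 2., 0.], [-2., 0., 3.]])
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A) and np.allclose(np.tril(R, -1), 0)
```

The sign choice in the code mirrors the discussion above: reflecting onto the *negative* multiple of the basis vector means the update to the pivot entry is an addition of like-signed quantities, avoiding catastrophic cancellation.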

Tridiagonalization

This procedure is presented in Numerical Analysis by Burden and Faires. It uses a slightly altered sign function with $\operatorname{sgn}(0) = 1$. [4] In the first step, to form the Householder matrix in each step we need to determine $\alpha$ and $r$, which are:

$$\alpha = -\operatorname{sgn}(a_{21}) \sqrt{\sum_{j=2}^{n} a_{j1}^2},$$
$$r = \sqrt{\tfrac{1}{2}\left(\alpha^2 - a_{21}\alpha\right)}.$$

From $\alpha$ and $r$, construct vector $v$:

$$v^{(1)} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix},$$

where $v_1 = 0$, $v_2 = \dfrac{a_{21} - \alpha}{2r}$, and

$$v_k = \frac{a_{k1}}{2r}$$

for each $k = 3, 4, \ldots, n$.

Then compute:

$$P^1 = I - 2 v^{(1)} \left(v^{(1)}\right)^{\mathsf T},$$
$$A^{(2)} = P^1 A P^1.$$

Having found $P^1$ and computed $A^{(2)}$, the process is repeated for $k = 2, 3, \ldots, n - 2$ as follows:

$$\alpha = -\operatorname{sgn}\left(a^{(k)}_{k+1,k}\right) \sqrt{\sum_{j=k+1}^{n} \left(a^{(k)}_{jk}\right)^2},$$
$$r = \sqrt{\tfrac{1}{2}\left(\alpha^2 - a^{(k)}_{k+1,k}\,\alpha\right)},$$
$$v^{(k)}_1 = v^{(k)}_2 = \cdots = v^{(k)}_k = 0,$$
$$v^{(k)}_{k+1} = \frac{a^{(k)}_{k+1,k} - \alpha}{2r},$$
$$v^{(k)}_j = \frac{a^{(k)}_{jk}}{2r} \quad \text{for } j = k+2,\, k+3,\, \ldots,\, n,$$
$$P^k = I - 2 v^{(k)} \left(v^{(k)}\right)^{\mathsf T},$$
$$A^{(k+1)} = P^k A^{(k)} P^k.$$

Continuing in this manner, the tridiagonal and symmetric matrix is formed.
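
A direct transcription of these steps might look as follows (a sketch of ours, assuming NumPy and a real symmetric input; it forms each $P^k$ explicitly for clarity, whereas practical implementations apply the reflections implicitly):

```python
import numpy as np

def sgn(x):
    """Burden-Faires sign convention: sgn(0) = 1."""
    return 1.0 if x >= 0 else -1.0

def householder_tridiagonalize(A):
    """Reduce a real symmetric matrix to tridiagonal form, step by step."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(n - 2):
        alpha = -sgn(A[k + 1, k]) * np.sqrt(np.sum(A[k + 1:, k] ** 2))
        if alpha == 0.0:
            continue                          # column already in desired form
        r = np.sqrt(0.5 * (alpha ** 2 - A[k + 1, k] * alpha))
        v = np.zeros(n)                       # v_1 = ... = v_k = 0
        v[k + 1] = (A[k + 1, k] - alpha) / (2.0 * r)
        v[k + 2:] = A[k + 2:, k] / (2.0 * r)
        P = np.eye(n) - 2.0 * np.outer(v, v)  # P^k = I - 2 v v^T
        A = P @ A @ P                         # A^(k+1) = P^k A^(k) P^k
    return A
```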

Examples

In this example, also from Burden and Faires, [4] the given matrix

$$A = \begin{bmatrix} 4 & 1 & -2 & 2 \\ 1 & 2 & 0 & 1 \\ -2 & 0 & 3 & -2 \\ 2 & 1 & -2 & -1 \end{bmatrix}$$

is transformed to the similar tridiagonal matrix $A^{(3)}$ by using the Householder method.

Following those steps in the Householder method, we have:

$$\alpha = -\operatorname{sgn}(a_{21}) \sqrt{\sum_{j=2}^{4} a_{j1}^2} = -\sqrt{1^2 + (-2)^2 + 2^2} = -3,$$
$$r = \sqrt{\tfrac{1}{2}\left(\alpha^2 - a_{21}\alpha\right)} = \sqrt{\tfrac{1}{2}(9 + 3)} = \sqrt{6},$$
$$v^{(1)} = \begin{bmatrix} 0 \\ \tfrac{2}{\sqrt{6}} \\ -\tfrac{1}{\sqrt{6}} \\ \tfrac{1}{\sqrt{6}} \end{bmatrix}.$$

The first Householder matrix:

$$P^1 = I - 2 v^{(1)} \left(v^{(1)}\right)^{\mathsf T} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -\tfrac{1}{3} & \tfrac{2}{3} & -\tfrac{2}{3} \\ 0 & \tfrac{2}{3} & \tfrac{2}{3} & \tfrac{1}{3} \\ 0 & -\tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix},$$

used to form

$$A^{(2)} = P^1 A P^1 = \begin{bmatrix} 4 & -3 & 0 & 0 \\ -3 & \tfrac{10}{3} & 1 & \tfrac{4}{3} \\ 0 & 1 & \tfrac{5}{3} & -\tfrac{4}{3} \\ 0 & \tfrac{4}{3} & -\tfrac{4}{3} & -1 \end{bmatrix}.$$

Repeating the process with $A^{(2)}$ gives

$$P^2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -\tfrac{3}{5} & -\tfrac{4}{5} \\ 0 & 0 & -\tfrac{4}{5} & \tfrac{3}{5} \end{bmatrix}, \qquad A^{(3)} = P^2 A^{(2)} P^2 = \begin{bmatrix} 4 & -3 & 0 & 0 \\ -3 & \tfrac{10}{3} & -\tfrac{5}{3} & 0 \\ 0 & -\tfrac{5}{3} & -\tfrac{33}{25} & \tfrac{68}{75} \\ 0 & 0 & \tfrac{68}{75} & \tfrac{149}{75} \end{bmatrix}.$$

As we can see, the final result is a tridiagonal symmetric matrix which is similar to the original one. The process is finished after two steps.
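
The arithmetic of this example is easy to double-check numerically; the snippet below (ours, assuming NumPy) rebuilds $P^1$ and $P^2$ from the entries above and confirms that $P^2 P^1 A P^1 P^2$ is indeed tridiagonal and symmetric:

```python
import numpy as np

A = np.array([[4., 1., -2., 2.],
              [1., 2., 0., 1.],
              [-2., 0., 3., -2.],
              [2., 1., -2., -1.]])
P1 = np.array([[1., 0., 0., 0.],
               [0., -1/3, 2/3, -2/3],
               [0., 2/3, 2/3, 1/3],
               [0., -2/3, 1/3, 2/3]])
P2 = np.array([[1., 0., 0., 0.],
               [0., 1., 0., 0.],
               [0., 0., -3/5, -4/5],
               [0., 0., -4/5, 3/5]])
A3 = P2 @ P1 @ A @ P1 @ P2
assert np.allclose(A3, A3.T)           # similarity by symmetric P preserves symmetry
assert np.allclose(np.triu(A3, 2), 0)  # zero beyond the first superdiagonal
print(np.round(A3, 4))                 # matches the A^(3) given above
```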

Computational and theoretical relationship to other unitary transformations

The Householder transformation is a reflection about a hyperplane with unit normal vector $v$, as stated earlier. An $N$-by-$N$ unitary transformation $U$ satisfies $UU^* = I$. Taking the determinant ($N$-th power of the geometric mean) and trace (proportional to the arithmetic mean) of a unitary matrix reveals that its eigenvalues $\lambda_i$ have unit modulus. This can be seen directly and swiftly:

$$\frac{\operatorname{Trace}(UU^*)}{N} = \frac{\sum_{j=1}^{N} |\lambda_j|^2}{N} = 1, \qquad \operatorname{det}(UU^*) = \prod_{j=1}^{N} |\lambda_j|^2 = 1.$$

Since arithmetic and geometric means are equal if the variables are constant (see inequality of arithmetic and geometric means), we establish the claim of unit modulus.

For the case of real valued unitary matrices we obtain orthogonal matrices, $UU^{\mathsf T} = I$. It follows rather readily (see orthogonal matrix) that any orthogonal matrix can be decomposed into a product of 2-by-2 rotations, called Givens rotations, and Householder reflections. This is appealing intuitively, since multiplication of a vector by an orthogonal matrix preserves the length of that vector, and rotations and reflections exhaust the set of (real valued) geometric operations that render a vector's length invariant.

The Householder transformation was shown to have a one-to-one relationship with the canonical coset decomposition of unitary matrices defined in group theory, which can be used to parametrize unitary operators in a very efficient manner. [5]

Finally we note that a single Householder transform, unlike a solitary Givens transform, can act on all columns of a matrix, and as such exhibits the lowest computational cost for QR decomposition and tridiagonalization. The penalty for this "computational optimality" is, of course, that Householder operations cannot be as deeply or efficiently parallelized. As such, Householder transformations are preferred for dense matrices on sequential machines, while Givens rotations are preferred for sparse matrices and/or parallel machines.

Notes

  1. Householder, A. S. (1958). "Unitary Triangularization of a Nonsymmetric Matrix" (PDF). Journal of the ACM. 5 (4): 339–342. doi:10.1145/320941.320947. MR 0111128. S2CID 9858625.
  2. Taboga, Marco. "Householder matrix, Lectures on matrix algebra".
  3. Schabauer, Hannes; Pacher, Christoph; Sunderland, Andrew G.; Gansterer, Wilfried N. (2010-05-01). "Toward a parallel solver for generalized complex symmetric eigenvalue problems". Procedia Computer Science. 1 (1): 437–445. doi:10.1016/j.procs.2010.04.047.
  4. Burden, Richard; Faires, Douglas; Burden, Annette (2016). Numerical Analysis (10th ed.). Thomson Brooks/Cole. ISBN 9781305253667.
  5. Cabrera, Renan; Strohecker, Traci; Rabitz, Herschel (2010). "The canonical coset decomposition of unitary matrices through Householder transformations". Journal of Mathematical Physics. 51 (8): 082101. arXiv:1008.2477. Bibcode:2010JMP....51h2101C. doi:10.1063/1.3466798. S2CID 119641896.
