In mathematics, the square root of a matrix extends the notion of square root from numbers to matrices. A matrix B is said to be a square root of A if the matrix product BB is equal to A. [1]
Some authors use the name square root or the notation A^(1/2) only for the specific case when A is positive semidefinite, to denote the unique matrix B that is positive semidefinite and such that BB = B^T B = A (for real-valued matrices, where B^T is the transpose of B).
Less frequently, the name square root may be used for any factorization of a positive semidefinite matrix A as B^T B = A, as in the Cholesky factorization, even if BB ≠ A. This distinct meaning is discussed in Positive definite matrix § Decomposition.
In general, a matrix can have several square roots. In particular, if A = B^2, then A = (−B)^2 as well.
The 2×2 identity matrix has infinitely many square roots. They are given by

    ±I   and   [[a, b], [c, −a]],

where (a, b, c) are any numbers (real or complex) such that a^2 + bc = 1. In particular, if (a, b, c) is any Pythagorean triple—that is, any set of positive integers such that a^2 + b^2 = c^2—then (1/c)[[a, b], [b, −a]] is a square root matrix of I which is symmetric and has rational entries. [2] Thus, for the triple (3, 4, 5),

    [[1, 0], [0, 1]] = ((1/5)[[3, 4], [4, −3]])^2.
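As a quick numerical check (a minimal sketch, not from the source, assuming NumPy is available; the triple (3, 4, 5) is just one choice), the matrix built from a Pythagorean triple does square to the identity:

```python
# Verify that a Pythagorean-triple matrix is a square root of the 2x2 identity.
import numpy as np

a, b, c = 3, 4, 5                      # a^2 + b^2 = c^2
B = np.array([[a, b],
              [b, -a]]) / c            # symmetric, with rational entries
print(B @ B)                           # [[1. 0.] [0. 1.]] up to rounding
```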
The negative of the 2×2 identity matrix, −I, has a square root, for example:

    −I = [[0, −1], [1, 0]]^2,

which can be used to represent the imaginary unit i and hence all complex numbers using 2×2 real matrices; see Matrix representation of complex numbers.
Just as with the real numbers, a real matrix may fail to have a real square root, but have a square root with complex-valued entries. Some matrices have no square root at all. An example is the matrix

    [[0, 1], [0, 0]].
While the square root of a nonnegative integer is either again an integer or an irrational number, an integer matrix can, in contrast, have a square root whose entries are rational yet non-integral, as in the examples above.
A symmetric real n × n matrix A is called positive semidefinite if x^T A x ≥ 0 for all x ∈ R^n (here x^T denotes the transpose, changing a column vector x into a row vector). A square real matrix A is positive semidefinite if and only if A = B^T B for some matrix B. There can be many different such matrices B. A positive semidefinite matrix A can also have many matrices B such that A = BB. However, A always has precisely one square root B that is positive semidefinite and symmetric. In particular, since B is required to be symmetric, B = B^T, so the two conditions A = BB and A = B^T B are equivalent.
For complex-valued matrices, the conjugate transpose B* is used instead, and positive semidefinite matrices are Hermitian, meaning A* = A.
Theorem [3] — Let A be a positive semidefinite and symmetric matrix (note that A can be positive semidefinite but not symmetric). Then there is exactly one positive semidefinite and symmetric matrix B such that A = BB. Note that there can be more than one non-symmetric and positive semidefinite matrix B such that B^T B = A.
This unique matrix is called the principal, non-negative, or positive square root (the latter in the case of positive definite matrices).
The principal square root of a real positive semidefinite matrix is real. [3] The principal square root of a positive definite matrix is positive definite; more generally, the rank of the principal square root of A is the same as the rank of A. [3]
The operation of taking the principal square root is continuous on the set of positive semidefinite matrices. [4] These properties are consequences of the holomorphic functional calculus applied to matrices. [5] [6] The existence and uniqueness of the principal square root can be deduced directly from the Jordan normal form (see below).
An n×n matrix with n distinct nonzero eigenvalues has 2^n square roots. Such a matrix, A, has an eigendecomposition VDV^(−1), where V is the matrix whose columns are eigenvectors of A and D is the diagonal matrix whose diagonal elements are the corresponding n eigenvalues λ_i. Thus the square roots of A are given by VD^(1/2)V^(−1), where D^(1/2) is any square root matrix of D, which, for distinct eigenvalues, must be diagonal with diagonal elements equal to square roots of the diagonal elements of D; since there are two possible choices for a square root of each diagonal element of D, there are 2^n choices for the matrix D^(1/2).
This also leads to a proof of the above observation, that a positive-definite matrix has precisely one positive-definite square root: a positive definite matrix has only positive eigenvalues, and each of these eigenvalues has only one positive square root; and since the eigenvalues of the square root matrix are the diagonal elements of D^(1/2), for the square root matrix to be itself positive definite necessitates the use of only the unique positive square roots of the original eigenvalues.
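As an illustration (a minimal sketch, not from the source, assuming NumPy; the example matrix is arbitrary), the 2^n square roots of a matrix with distinct nonzero eigenvalues can be enumerated by trying every choice of signs in D^(1/2):

```python
# Enumerate all 2^n square roots of a matrix with distinct nonzero eigenvalues.
import itertools
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 4.0]])             # distinct nonzero eigenvalues 1 and 4
eigvals, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)

square_roots = []
for signs in itertools.product([1.0, -1.0], repeat=len(eigvals)):
    D_half = np.diag(np.array(signs) * np.sqrt(eigvals.astype(complex)))
    square_roots.append(V @ D_half @ Vinv)

for R in square_roots:
    assert np.allclose(R @ R, A)       # each of the 2^n = 4 candidates squares to A
print(len(square_roots))               # 4
```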
If a matrix A is idempotent, meaning A^2 = A, then by definition one of its square roots is A itself.
If D is a diagonal n × n matrix D = diag(λ_1, …, λ_n), then some of its square roots are the diagonal matrices diag(μ_1, …, μ_n), where μ_i^2 = λ_i. If the diagonal elements of D are real and non-negative then it is positive semidefinite, and if the square roots are taken with non-negative sign, the resulting matrix is the principal root of D. A diagonal matrix may have additional non-diagonal roots if some entries on the diagonal are equal, as exemplified by the identity matrix above.
If U is an upper triangular matrix (meaning its entries are u_{ij} = 0 for i > j) and at most one of its diagonal entries is zero, then one upper triangular solution B of the equation B^2 = U can be found as follows. Since the equation u_{ii} = b_{ii}^2 should be satisfied, let b_{ii} be the principal square root of the complex number u_{ii}. By the assumption that at most one diagonal entry is zero, this guarantees that b_{ii} + b_{jj} ≠ 0 for all i, j (because the principal square roots of complex numbers all lie on one half of the complex plane). From the equation

    u_{ij} = Σ_{k=i..j} b_{ik} b_{kj},

we deduce that b_{ij} can be computed recursively for j − i increasing from 1 to n − 1 as:

    b_{ij} = ( u_{ij} − Σ_{k=i+1..j−1} b_{ik} b_{kj} ) / ( b_{ii} + b_{jj} ).
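A direct translation of this recursion (a minimal sketch, not from the source, assuming NumPy; the function name and test matrix are chosen for the example) looks as follows:

```python
# Upper triangular square root via the recursion over the superdiagonals.
import numpy as np

def sqrt_upper_triangular(U):
    n = U.shape[0]
    B = np.zeros_like(U, dtype=complex)
    for i in range(n):
        B[i, i] = np.sqrt(complex(U[i, i]))        # principal square root of u_ii
    for d in range(1, n):                          # j - i = 1, ..., n - 1
        for i in range(n - d):
            j = i + d
            s = U[i, j] - B[i, i + 1:j] @ B[i + 1:j, j]
            B[i, j] = s / (B[i, i] + B[j, j])
    return B

U = np.array([[4.0, 1.0, 2.0],
              [0.0, 9.0, 3.0],
              [0.0, 0.0, 16.0]])
B = sqrt_upper_triangular(U)
print(np.allclose(B @ B, U))                       # True
```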
If U is upper triangular but has multiple zeroes on the diagonal, then a square root might not exist, as exemplified by the matrix [[0, 1], [0, 0]]. Note that the diagonal entries of a triangular matrix are precisely its eigenvalues (see Triangular matrix § Properties).
An n × n matrix A is diagonalizable if there is a matrix V and a diagonal matrix D such that A = VDV^(−1). This happens if and only if A has n eigenvectors which constitute a basis for C^n. In this case, V can be chosen to be the matrix with the n eigenvectors as columns, and thus a square root of A is

    R = V S V^(−1),

where S is any square root of D. Indeed,

    R^2 = V S V^(−1) V S V^(−1) = V S^2 V^(−1) = V D V^(−1) = A.
For example, the matrix

    A = [[33, 24], [48, 57]]

can be diagonalized as VDV^(−1), where

    V = [[1, 1], [2, −1]]   and   D = [[81, 0], [0, 9]].

D has principal square root

    D^(1/2) = [[9, 0], [0, 3]],

giving the square root

    A^(1/2) = V D^(1/2) V^(−1) = [[5, 2], [4, 7]].
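This example can be checked numerically (a minimal sketch, not from the source, assuming NumPy):

```python
# Recompute V D^(1/2) V^(-1) for the worked example above.
import numpy as np

V = np.array([[1.0, 1.0],
              [2.0, -1.0]])
D_half = np.diag([9.0, 3.0])                       # principal square root of D
root = V @ D_half @ np.linalg.inv(V)
print(root)                                        # [[5. 2.] [4. 7.]]
print(np.allclose(root @ root, [[33.0, 24.0],
                                [48.0, 57.0]]))    # True
```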
When A is symmetric, the diagonalizing matrix V can be made an orthogonal matrix by suitably choosing the eigenvectors (see spectral theorem). Then the inverse of V is simply the transpose, so that

    A^(1/2) = V S V^T.
Every complex-valued square matrix A, regardless of diagonalizability, has a Schur decomposition given by A = QUQ*, where U is upper triangular and Q is unitary (meaning Q*Q = I). The eigenvalues of A are exactly the diagonal entries of U; if at most one of them is zero, then the following is a square root [7]

    A^(1/2) = Q U^(1/2) Q*,

where a square root U^(1/2) of the upper triangular matrix U can be found as described above.
If A is positive definite, then the eigenvalues are all positive reals, so the chosen diagonal of U^(1/2) also consists of positive reals. Hence the eigenvalues of Q U^(1/2) Q* are positive reals, which means the resulting matrix is the principal root of A.
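Putting the two steps together (a minimal sketch, not from the source, assuming NumPy and SciPy; the helper repeats the triangular recursion sketched earlier, and the test matrix is arbitrary):

```python
# Square root of a general matrix via the Schur decomposition A = Q U Q*.
import numpy as np
from scipy.linalg import schur

def sqrt_upper_triangular(U):
    # Same recursion as in the earlier sketch.
    n = U.shape[0]
    B = np.zeros_like(U, dtype=complex)
    for i in range(n):
        B[i, i] = np.sqrt(complex(U[i, i]))
    for d in range(1, n):
        for i in range(n - d):
            j = i + d
            s = U[i, j] - B[i, i + 1:j] @ B[i + 1:j, j]
            B[i, j] = s / (B[i, i] + B[j, j])
    return B

def sqrtm_via_schur(A):
    U, Q = schur(A, output='complex')              # A = Q @ U @ Q.conj().T
    return Q @ sqrt_upper_triangular(U) @ Q.conj().T

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
S = sqrtm_via_schur(A)
print(np.allclose(S @ S, A))                       # True
```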
Just as with the Schur decomposition, every square matrix A can be decomposed as A = P J P^(−1), where P is invertible and J is in Jordan normal form.
To see that any complex matrix with positive eigenvalues has a square root of the same form, it suffices to check this for a Jordan block. Any such block has the form λ(I + N) with λ > 0 and N nilpotent. If (1 + z)^(1/2) = 1 + a_1 z + a_2 z^2 + ⋯ is the binomial expansion for the square root (valid in |z| < 1), then as a formal power series its square equals 1 + z. Substituting N for z, only finitely many terms will be non-zero and S = √λ (I + a_1 N + a_2 N^2 + ⋯) gives a square root of the Jordan block with eigenvalue √λ.
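Because N is nilpotent, the series terminates, and the construction can be checked directly (a minimal sketch, not from the source, assuming NumPy and SciPy; the block size and eigenvalue are arbitrary):

```python
# Square root of a Jordan block lambda*(I + N) via the terminating binomial series.
import numpy as np
from scipy.special import binom

def sqrt_jordan_block(lam, n):
    J = lam * np.eye(n) + np.eye(n, k=1)           # Jordan block with eigenvalue lam
    N = np.eye(n, k=1) / lam                       # J = lam * (I + N), N nilpotent
    S = np.zeros((n, n))
    N_power = np.eye(n)
    for k in range(n):                             # N^n = 0, so the series stops here
        S += binom(0.5, k) * N_power               # binom(1/2, k) = a_k
        N_power = N_power @ N
    return np.sqrt(lam) * S, J

S, J = sqrt_jordan_block(4.0, 3)
print(np.allclose(S @ S, J))                       # True
```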
It suffices to check uniqueness for a Jordan block with λ = 1. The square root constructed above has the form S = I + L, where L is a polynomial in N without constant term. Any other square root T with positive eigenvalues has the form T = I + M with M nilpotent, commuting with N and hence with L. But then 0 = S^2 − T^2 = 2(L − M)(I + (L + M)/2). Since L and M commute, the matrix L + M is nilpotent and I + (L + M)/2 is invertible with inverse given by a Neumann series. Hence L = M.
If A is a matrix with positive eigenvalues and minimal polynomial p(t), then the Jordan decomposition into generalized eigenspaces of A can be deduced from the partial fraction expansion of p(t)^(−1). The corresponding projections onto the generalized eigenspaces are given by real polynomials in A. On each eigenspace, A has the form λ(I + N) as above. The power series expression for the square root on the eigenspace shows that the principal square root of A has the form q(A), where q(t) is a polynomial with real coefficients.
Recall the formal power series

    (1 − z)^(1/2) = Σ_{n≥0} C(1/2, n) (−z)^n,

where C(1/2, n) denotes the generalized binomial coefficient; the series converges provided |z| ≤ 1, since its coefficients are summable. Plugging z = I − A into this expression yields

    A^(1/2) := Σ_{n≥0} C(1/2, n) (A − I)^n,

provided that lim sup_{n→∞} ‖(I − A)^n‖^(1/n) ≤ 1. By virtue of the Gelfand formula, that condition is equivalent to the requirement that the spectrum of A be contained within the closed disk of radius 1 centered at 1. This method of defining or computing A^(1/2) is especially useful in the case where A is positive semi-definite with ‖A‖ ≤ 1 (which can always be arranged by rescaling): in that case ‖I − A‖ ≤ 1, hence ‖(I − A)^n‖ ≤ ‖I − A‖^n ≤ 1, so the expression defines a square root of A, which moreover turns out to be the unique positive semi-definite root. This method remains valid for defining square roots of operators on infinite-dimensional Banach or Hilbert spaces, and of certain elements of Banach algebras and C*-algebras.
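For a positive semidefinite matrix whose spectrum lies in [0, 1], the truncated series already gives a usable approximation (a minimal sketch, not from the source, assuming NumPy and SciPy; the matrix and truncation length are arbitrary):

```python
# Approximate A^(1/2) by truncating the series sum_k C(1/2, k) (A - I)^k.
import numpy as np
from scipy.special import binom

A = np.array([[0.7, 0.2],
              [0.2, 0.4]])             # symmetric positive definite, spectrum in [0, 1]
S = np.zeros_like(A)
term = np.eye(2)
for k in range(200):                   # truncation; convergence is geometric here
    S += binom(0.5, k) * term
    term = term @ (A - np.eye(2))
print(np.allclose(S @ S, A))           # True (up to truncation error)
```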
Another way to find the square root of an n × n matrix A is the Denman–Beavers square root iteration. [8]
Let Y_0 = A and Z_0 = I, where I is the n × n identity matrix. The iteration is defined by

    Y_{k+1} = (1/2)(Y_k + Z_k^(−1)),
    Z_{k+1} = (1/2)(Z_k + Y_k^(−1)).
As this uses a pair of sequences of matrix inverses whose later elements change comparatively little, only the first elements have a high computational cost, since the remainder can be computed from earlier elements with only a few passes of a variant of Newton's method for computing inverses,

    X_{j+1} = 2 X_j − X_j B X_j.

With this, for later values of k one would set X_0 = Z_{k−1}^(−1) and B = Z_k, and then use X_j as the approximation to Z_k^(−1) for some small j (perhaps just 1), and similarly for Y_k^(−1).
Convergence is not guaranteed, even for matrices that do have square roots, but if the process converges, the matrix Y_k converges quadratically to a square root A^(1/2), while Z_k converges to its inverse, A^(−1/2).
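A minimal sketch of the iteration (not from the source, assuming NumPy; the iteration count and test matrix are arbitrary):

```python
# Denman-Beavers iteration: Y_k -> A^(1/2), Z_k -> A^(-1/2) when it converges.
import numpy as np

def denman_beavers(A, iterations=20):
    Y = np.array(A, dtype=float)
    Z = np.eye(A.shape[0])
    for _ in range(iterations):
        # Simultaneous update: the right-hand side uses the previous Y and Z.
        Y, Z = 0.5 * (Y + np.linalg.inv(Z)), 0.5 * (Z + np.linalg.inv(Y))
    return Y, Z

A = np.array([[33.0, 24.0],
              [48.0, 57.0]])
Y, Z = denman_beavers(A)
print(np.allclose(Y @ Y, A))           # True: Y approximates the principal root [[5, 2], [4, 7]]
print(np.allclose(Y @ Z, np.eye(2)))   # True: Z approximates A^(-1/2)
```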
Yet another iterative method is obtained by taking the well-known formula of the Babylonian method for computing the square root of a real number and applying it to matrices. Let X_0 = I, where I is the identity matrix. The iteration is defined by

    X_{k+1} = (1/2)(X_k + A X_k^(−1)).
Again, convergence is not guaranteed, but if the process converges, the matrix X_k converges quadratically to a square root A^(1/2). Compared to Denman–Beavers iteration, an advantage of the Babylonian method is that only one matrix inverse need be computed per iteration step. On the other hand, as Denman–Beavers iteration uses a pair of sequences of matrix inverses whose later elements change comparatively little, only the first elements have a high computational cost, since the remainder can be computed from earlier elements with only a few passes of a variant of Newton's method for computing inverses (see Denman–Beavers iteration above); of course, the same approach can be used to obtain the single sequence of inverses needed for the Babylonian method. However, unlike Denman–Beavers iteration, the Babylonian method is numerically unstable and more likely to fail to converge. [1]
The Babylonian method follows from Newton's method for the equation X^2 − A = 0, using the fact that A X_k = X_k A for all k. [9]
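A minimal sketch of the Babylonian iteration (not from the source, assuming NumPy; the iteration count and test matrix are arbitrary):

```python
# Babylonian (Newton) iteration X_{k+1} = (X_k + A X_k^{-1}) / 2, starting from X_0 = I.
import numpy as np

def babylonian_sqrt(A, iterations=30):
    X = np.eye(A.shape[0])
    for _ in range(iterations):
        X = 0.5 * (X + A @ np.linalg.inv(X))
    return X

A = np.array([[33.0, 24.0],
              [48.0, 57.0]])
X = babylonian_sqrt(A)
print(np.allclose(X @ X, A))           # True when the iteration converges
```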
In linear algebra and operator theory, given a bounded positive semidefinite operator (a non-negative operator) T on a complex Hilbert space, B is a square root of T if T = B* B, where B* denotes the Hermitian adjoint of B. According to the spectral theorem, the continuous functional calculus can be applied to obtain an operator T^(1/2) such that T^(1/2) is itself positive and (T^(1/2))^2 = T. The operator T^(1/2) is the unique non-negative square root of T.
A bounded non-negative operator on a complex Hilbert space is self-adjoint by definition. So T = (T^(1/2))* T^(1/2). Conversely, it is trivially true that every operator of the form B* B is non-negative. Therefore, an operator T is non-negative if and only if T = B* B for some B (equivalently, T = CC* for some C).
The Cholesky factorization provides another particular example of square root, which should not be confused with the unique non-negative square root.
If T is a non-negative operator on a finite-dimensional Hilbert space, then all square roots of T are related by unitary transformations. More precisely, if T = A*A = B*B, then there exists a unitary U such that A = UB.
Indeed, take B = T^(1/2) to be the unique non-negative square root of T. If T is strictly positive, then B is invertible, and so U = AB^(−1) is unitary:

    U*U = (AB^(−1))*(AB^(−1)) = B^(−1)A*AB^(−1) = B^(−1)TB^(−1) = B^(−1)BBB^(−1) = I.
If T is non-negative without being strictly positive, then the inverse of B cannot be defined, but the Moore–Penrose pseudoinverse B^+ can be. In that case, the operator B^+ A is a partial isometry, that is, a unitary operator from the range of T to itself. This can then be extended to a unitary operator U on the whole space by setting it equal to the identity on the kernel of T. More generally, this is true on an infinite-dimensional Hilbert space if, in addition, T has closed range. In general, if A, B are closed and densely defined operators on a Hilbert space H, and A* A = B* B, then A = UB, where U is a partial isometry.
Square roots, and the unitary freedom of square roots, have applications throughout functional analysis and linear algebra.
If A is an invertible operator on a finite-dimensional Hilbert space, then there is a unique unitary operator U and positive operator P such that

    A = UP;

this is the polar decomposition of A. The positive operator P is the unique positive square root of the positive operator A*A, and U is defined by U = AP^(−1).
If A is not invertible, then it still has a polar decomposition in which P is defined in the same way (and is unique). The unitary operator U is not unique. Rather, it is possible to determine a "natural" unitary operator as follows: AP^+ is a unitary operator from the range of A to itself, which can be extended by the identity on the kernel of A*. The resulting unitary operator U then yields the polar decomposition of A.
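The finite-dimensional construction translates directly into code (a minimal sketch, not from the source, assuming NumPy and SciPy; the matrix is an arbitrary invertible example):

```python
# Polar decomposition A = U P with P = (A* A)^(1/2) and U = A P^(-1).
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])                         # invertible
P = sqrtm(A.conj().T @ A)                          # unique positive square root of A* A
U = A @ np.linalg.inv(P)
print(np.allclose(U @ P, A))                       # A = U P
print(np.allclose(U.conj().T @ U, np.eye(2)))      # U is unitary
```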
By Choi's result, a linear map

    Φ : C^(n×n) → C^(m×m)

is completely positive if and only if it is of the form

    Φ(A) = Σ_{i=1..k} V_i A V_i*,

where k ≤ nm. Let {E_pq} ⊂ C^(n×n) be the n^2 elementary matrix units. The positive matrix

    M_Φ = (Φ(E_pq))_{p,q} = Σ_{p,q} E_pq ⊗ Φ(E_pq) ∈ C^(nm×nm)
is called the Choi matrix of Φ. The Kraus operators correspond to the (not necessarily square) square roots of M_Φ: for any square root B of M_Φ, that is, any B with BB* = M_Φ, one can obtain a family of Kraus operators V_i by undoing the vec operation on each column b_i of B. Thus all sets of Kraus operators are related by partial isometries.
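As an illustration (a minimal sketch, not from the source, assuming NumPy; the channel, its Kraus operators K, and the reshape convention matching the Choi matrix Σ_{p,q} E_pq ⊗ Φ(E_pq) above are all choices made for this example):

```python
# Recover Kraus operators from a square root B of the Choi matrix (B B* = M_Phi).
import numpy as np

n = m = 2
K = [np.array([[1.0, 0.0], [0.0, np.sqrt(0.5)]]),   # example: amplitude damping channel
     np.array([[0.0, np.sqrt(0.5)], [0.0, 0.0]])]

def phi(A):
    return sum(V @ A @ V.conj().T for V in K)

# Choi matrix M = sum_{p,q} E_pq (x) phi(E_pq)
M = np.zeros((n * m, n * m), dtype=complex)
for p in range(n):
    for q in range(n):
        E = np.zeros((n, n))
        E[p, q] = 1.0
        M += np.kron(E, phi(E))

# Hermitian square root of M via its eigendecomposition (any B with B B* = M would do).
w, W = np.linalg.eigh(M)
B = W @ np.diag(np.sqrt(np.clip(w, 0, None))) @ W.conj().T

# Undo the vec operation on each column of B to obtain Kraus operators.
kraus = [B[:, i].reshape(n, m).T for i in range(B.shape[1])]
rho = np.array([[0.3, 0.1j], [-0.1j, 0.7]])
print(np.allclose(sum(V @ rho @ V.conj().T for V in kraus), phi(rho)))   # True
```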
In quantum physics, a density matrix for an n-level quantum system is an n × n complex matrix ρ that is positive semidefinite with trace 1. If ρ can be expressed as

    ρ = Σ_i p_i v_i v_i*,

where p_i > 0 and Σ_i p_i = 1, the set

    {p_i, v_i}

is said to be an ensemble that describes the mixed state ρ. Notice {v_i} is not required to be orthogonal. Different ensembles describing the state ρ are related by unitary operators, via the square roots of ρ. For instance, suppose

    ρ = Σ_j a_j a_j*.

The trace 1 condition means

    Σ_j a_j* a_j = Σ_j ‖a_j‖^2 = 1.

Let

    p_i = ‖a_i‖^2,

and let v_i be the normalized a_i. We see that

    {p_i, v_i}

gives the mixed state ρ.
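Two different square roots of the same ρ thus give two different ensembles (a minimal sketch, not from the source, assuming NumPy and SciPy; the density matrix is an arbitrary example):

```python
# Ensembles for rho from two square roots B with B B* = rho:
# the principal square root and the Cholesky factor.
import numpy as np
from scipy.linalg import cholesky, sqrtm

rho = np.array([[0.6, 0.2],
                [0.2, 0.4]])                       # positive definite, trace 1

def ensemble(B):
    """Columns b_i of B give weights p_i = ||b_i||^2 and states v_i = b_i / ||b_i||."""
    cols = [B[:, i] for i in range(B.shape[1])]
    return [(np.vdot(b, b).real, b / np.linalg.norm(b)) for b in cols]

for B in (sqrtm(rho), cholesky(rho, lower=True)):
    ens = ensemble(B)
    recombined = sum(p * np.outer(v, v.conj()) for p, v in ens)
    print(np.allclose(recombined, rho), round(sum(p for p, _ in ens), 12))   # True 1.0
```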