Smith normal form


In mathematics, the Smith normal form (sometimes abbreviated SNF [1] ) is a normal form that can be defined for any matrix (not necessarily square) with entries in a principal ideal domain (PID). The Smith normal form of a matrix is diagonal, and can be obtained from the original matrix by multiplying on the left and right by invertible square matrices. In particular, the integers are a PID, so one can always calculate the Smith normal form of an integer matrix. The Smith normal form is very useful for working with finitely generated modules over a PID, and in particular for deducing the structure of a quotient of a free module. It is named after the Irish mathematician Henry John Stephen Smith.


Definition

Let A be a nonzero m×n matrix over a principal ideal domain R. There exist invertible $m \times m$ and $n \times n$ matrices S, T (with coefficients in R) such that the product S A T is

$$\begin{pmatrix}
\alpha_1 & 0 & 0 & \cdots & 0 \\
0 & \alpha_2 & 0 & \cdots & 0 \\
0 & 0 & \ddots & & \vdots \\
\vdots & \vdots & & \alpha_r & \\
0 & 0 & \cdots & & 0
\end{pmatrix}$$

and the diagonal elements $\alpha_i$ satisfy $\alpha_i \mid \alpha_{i+1}$ for all $1 \le i < r$. This is the Smith normal form of the matrix A. The elements $\alpha_i$ are unique up to multiplication by a unit and are called the elementary divisors, invariants, or invariant factors. They can be computed (up to multiplication by a unit) as

$$\alpha_i = \frac{d_i(A)}{d_{i-1}(A)},$$

where $d_i(A)$ (called the $i$-th determinant divisor) equals the greatest common divisor of the determinants of all $i \times i$ minors of the matrix $A$ and $d_0(A) := 1$.

Example: For a $2 \times 2$ matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the Smith normal form is $\begin{pmatrix} \alpha_1 & 0 \\ 0 & \alpha_2 \end{pmatrix}$ with $\alpha_1 = d_1(A) = \gcd(a, b, c, d)$ and $\alpha_1 \alpha_2 = d_2(A) = ad - bc$ (each up to multiplication by a unit).
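
Over the integers, this characterization can be checked directly from the definition of the determinant divisors. The following is a minimal Python sketch (the helper names det, determinant_divisors and invariant_factors are ours); it is exponential in the matrix size and meant only as an illustration of the formula:

```python
from itertools import combinations
from functools import reduce
from math import gcd

def det(m):
    """Determinant of a square integer matrix by cofactor expansion (exact)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def determinant_divisors(a):
    """d_i = gcd of the determinants of all i x i minors of the integer matrix a."""
    rows, cols = len(a), len(a[0])
    ds = []
    for i in range(1, min(rows, cols) + 1):
        minors = [det([[a[r][c] for c in cs] for r in rs])
                  for rs in combinations(range(rows), i)
                  for cs in combinations(range(cols), i)]
        ds.append(reduce(gcd, (abs(x) for x in minors)))
    return ds

def invariant_factors(a):
    """alpha_i = d_i(a) / d_{i-1}(a) with d_0 = 1, listed while d_i is nonzero."""
    ds = [1] + determinant_divisors(a)
    return [ds[i] // ds[i - 1] for i in range(1, len(ds)) if ds[i] != 0]

# e.g. invariant_factors([[2, 4], [6, 10]]) == [2, 2]
```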

Algorithm

The first goal is to find invertible square matrices $S$ and $T$ such that the product $S A T$ is diagonal. This is the hardest part of the algorithm. Once diagonality is achieved, it becomes relatively easy to put the matrix into Smith normal form. Phrased more abstractly, the goal is to show that, thinking of $A$ as a map from $R^n$ (the free $R$-module of rank $n$) to $R^m$ (the free $R$-module of rank $m$), there are isomorphisms $S \colon R^m \to R^m$ and $T \colon R^n \to R^n$ such that $S \cdot A \cdot T$ has the simple form of a diagonal matrix. The matrices $S$ and $T$ can be found by starting out with identity matrices of the appropriate size, and modifying $S$ each time a row operation is performed on the working matrix in the algorithm by the corresponding inverse column operation (for example, if row $i$ is added to row $j$ of the working matrix, then column $j$ should be subtracted from column $i$ of $S$ to retain the product invariant), and similarly modifying $T$ for each column operation performed. Since row operations are left-multiplications and column operations are right-multiplications, this preserves the invariant $A = S' \cdot A' \cdot T'$, where $A', S', T'$ denote current values and $A$ denotes the original matrix; eventually $A'$ becomes diagonal, and the required $S$ and $T$ are obtained by inverting the accumulated $S'$ and $T'$. Only invertible row and column operations are performed, which ensures that $S'$ and $T'$ (and hence $S$ and $T$) remain invertible matrices.
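
A minimal Python sketch of this bookkeeping over the integers (illustrative values; the helpers add_row, add_col and mat_mul are ours):

```python
import copy

def add_row(mat, src, dst, factor=1):
    """Row operation: row dst += factor * row src (invertible over Z)."""
    mat[dst] = [x + factor * y for x, y in zip(mat[dst], mat[src])]

def add_col(mat, src, dst, factor=1):
    """Column operation: column dst += factor * column src (invertible over Z)."""
    for row in mat:
        row[dst] += factor * row[src]

def mat_mul(p, q):
    """Plain matrix product, used only to check the invariant."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

A = [[2, 4], [6, 10]]                  # original matrix (illustrative values)
Ap = copy.deepcopy(A)                  # working matrix A'
Sp = [[1, 0], [0, 1]]                  # S' starts as the identity
Tp = [[1, 0], [0, 1]]                  # T' starts as the identity

add_row(Ap, src=0, dst=1)              # row 0 is added to row 1 of A' ...
add_col(Sp, src=1, dst=0, factor=-1)   # ... so column 1 is subtracted from column 0 of S'

assert mat_mul(mat_mul(Sp, Ap), Tp) == A   # the invariant A == S'.A'.T' still holds
```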

For $a \in R \setminus \{0\}$, write $\delta(a)$ for the number of prime factors of $a$ (these exist and are unique since any PID is also a unique factorization domain). In particular, $R$ is also a Bézout domain, so it is a gcd domain and the gcd of any two elements satisfies a Bézout's identity.
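
Over the integers, for example, such Bézout coefficients can be produced by the extended Euclidean algorithm; a minimal Python sketch (the helper name extended_gcd is ours, and this convention returns a non-negative gcd):

```python
def extended_gcd(a, b):
    """Return (g, sigma, tau) with sigma*a + tau*b == g == gcd(a, b) >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, s, t = extended_gcd(b, a % b)
    # g == s*b + t*(a % b) and a % b == a - (a // b)*b, so regroup the terms:
    return (g, t, s - (a // b) * t)

# e.g. extended_gcd(12, 42) == (6, -3, 1), since (-3)*12 + 1*42 == 6
```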

To put a matrix into Smith normal form, one can repeatedly apply the following, where $t$ loops from 1 to $\min(m, n)$.

Step I: Choosing a pivot

Choose $j_t$ to be the smallest column index of $A$ with a non-zero entry, starting the search at column index $j_{t-1} + 1$ if $t > 1$.

We wish to have $a_{t,j_t} \neq 0$; if this is the case this step is complete, otherwise there is by assumption some $k$ with $a_{k,j_t} \neq 0$, and we can exchange rows $t$ and $k$, thereby obtaining $a_{t,j_t} \neq 0$.

Our chosen pivot is now at position $(t, j_t)$.

Step II: Improving the pivot

If there is an entry at position $(k, j_t)$ such that $a_{t,j_t} \nmid a_{k,j_t}$, then, letting $\beta = \gcd\left(a_{t,j_t}, a_{k,j_t}\right)$, we know by the Bézout property that there exist σ, τ in R such that

$$a_{t,j_t} \cdot \sigma + a_{k,j_t} \cdot \tau = \beta.$$

By left-multiplication with an appropriate invertible matrix L, it can be achieved that row t of the matrix product is the sum of σ times the original row t and τ times the original row k, that row k of the product is another linear combination of those original rows, and that all other rows are unchanged. Explicitly, if σ and τ satisfy the above equation, then for $\alpha = a_{t,j_t}/\beta$ and $\gamma = a_{k,j_t}/\beta$ (which divisions are possible by the definition of β) one has

$$\sigma \cdot \alpha + \tau \cdot \gamma = 1,$$

so that the matrix

$$L_0 = \begin{pmatrix} \sigma & \tau \\ -\gamma & \alpha \end{pmatrix}$$

is invertible, with inverse

$$\begin{pmatrix} \alpha & -\tau \\ \gamma & \sigma \end{pmatrix}.$$

Now L can be obtained by fitting $L_0$ into rows and columns t and k of the identity matrix. By construction the matrix obtained after left-multiplying by L has entry β at position $(t, j_t)$ (and due to our choice of α and γ it also has an entry 0 at position $(k, j_t)$, which is useful though not essential for the algorithm). This new entry β divides the entry $a_{t,j_t}$ that was there before, and so in particular $\delta(\beta) < \delta\left(a_{t,j_t}\right)$; therefore repeating these steps must eventually terminate. One ends up with a matrix having an entry at position $(t, j_t)$ that divides all entries in column $j_t$.
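
To make Step II concrete over the integers, here is a minimal Python sketch (the function name improve_pivot is ours; it reuses the extended_gcd helper sketched above) that applies the block $L_0$ to rows t and k directly:

```python
def improve_pivot(a, t, k, j):
    """Left-multiply rows t and k of the integer matrix a by the block
    L0 = [[sigma, tau], [-gamma, alpha]] described above, so that the entry
    at (t, j) becomes beta = gcd(a[t][j], a[k][j]) and the entry at (k, j)
    becomes 0.  Assumes the two entries are not both zero."""
    beta, sigma, tau = extended_gcd(a[t][j], a[k][j])  # sigma*a[t][j] + tau*a[k][j] == beta
    alpha, gamma = a[t][j] // beta, a[k][j] // beta    # hence sigma*alpha + tau*gamma == 1
    row_t = [sigma * x + tau * y for x, y in zip(a[t], a[k])]
    row_k = [-gamma * x + alpha * y for x, y in zip(a[t], a[k])]
    a[t], a[k] = row_t, row_k

a = [[12, 5],
     [42, 7]]
improve_pivot(a, t=0, k=1, j=0)
# a is now [[6, -8], [0, -21]]: the pivot a[0][0] == gcd(12, 42) == 6 and a[1][0] == 0
```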

Step III: Eliminating entries

Finally, adding appropriate multiples of row t to the other rows, it can be achieved that all entries in column $j_t$ except for that at position $(t, j_t)$ are zero. This can be achieved by left-multiplication with an appropriate matrix. However, to make the matrix fully diagonal we need to eliminate nonzero entries on the row of position $(t, j_t)$ as well. This can be achieved by repeating the steps in Step II for columns instead of rows, and using multiplication on the right by the transpose of the obtained matrix L. In general this will result in the zero entries from the prior application of Step III becoming nonzero again.

However, notice that each application of Step II for either rows or columns must continue to reduce the value of $\delta\left(a_{t,j_t}\right)$, and so the process must eventually stop after some number of iterations, leading to a matrix where the entry at position $(t, j_t)$ is the only non-zero entry in both its row and column.

At this point, only the block of A to the lower right of $(t, j_t)$ needs to be diagonalized, and conceptually the algorithm can be applied recursively, treating this block as a separate matrix. In other words, we can increment t by one and go back to Step I.

Final step

Applying the steps described above to the remaining non-zero columns of the resulting matrix (if any), we get an $m \times n$ matrix with column indices $j_1 < \ldots < j_r$, where $r \leq \min(m, n)$. The matrix entries $a_{i, j_i}$ are non-zero, and every other entry is zero.

Now we can move the null columns of this matrix to the right, so that the nonzero entries are on positions $(i, i)$ for $1 \leq i \leq r$. For short, set $\alpha_i$ for the element at position $(i, i)$.

The condition of divisibility of diagonal entries might not be satisfied. For any index $i < r$ for which $\alpha_i \nmid \alpha_{i+1}$, one can repair this shortcoming by operations on rows and columns $i$ and $i+1$ only: first add column $i+1$ to column $i$ to get an entry $\alpha_{i+1}$ in column $i$ without disturbing the entry $\alpha_i$ at position $(i, i)$, and then apply a row operation to make the entry at position $(i, i)$ equal to $\beta = \gcd(\alpha_i, \alpha_{i+1})$ as in Step II; finally proceed as in Step III to make the matrix diagonal again. Since the new entry at position $(i+1, i+1)$ is a linear combination of the original $\alpha_i, \alpha_{i+1}$, it is divisible by β.

The value $\delta(\alpha_1) + \cdots + \delta(\alpha_r)$ does not change by the above operation (it is δ of the determinant of the upper $r \times r$ submatrix), whence that operation does diminish (by moving prime factors to the right) the value of

$$\sum_{j=1}^{r} (r - j)\,\delta(\alpha_j).$$

So after finitely many applications of this operation no further application is possible, which means that we have obtained $\alpha_1 \mid \alpha_2 \mid \cdots \mid \alpha_r$ as desired.
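
For instance, over $\mathbb{Z}$ the diagonal matrix $\operatorname{diag}(2, 3)$ violates the divisibility condition. Adding the second column to the first gives $\begin{pmatrix} 2 & 0 \\ 3 & 3 \end{pmatrix}$; improving the pivot to $\gcd(2, 3) = 1$ as in Step II and clearing the first row and column as in Step III then yields $\operatorname{diag}(1, 6)$, which satisfies $1 \mid 6$ and is the Smith normal form.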

Since all row and column manipulations involved in the process are invertible, this shows that there exist invertible $m \times m$ and $n \times n$ matrices S, T so that the product S A T satisfies the definition of a Smith normal form. In particular, this shows that the Smith normal form exists, which was assumed without proof in the definition.
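
For matrices over the integers, the whole procedure can be condensed into a short program. The following Python sketch is our own illustration (not a library routine); it carries out Step II in Euclidean form, by repeated division with remainder, instead of building the Bézout matrix L explicitly, and it returns only the normal form, without tracking S and T:

```python
def smith_normal_form(a):
    """Smith normal form of a nonzero integer matrix given as a list of rows.

    Diagonalizes a copy of the matrix with invertible row and column
    operations and enforces the divisibility chain on the diagonal."""
    a = [row[:] for row in a]
    m, n = len(a), len(a[0])

    def swap_rows(i, j):
        a[i], a[j] = a[j], a[i]

    def swap_cols(i, j):
        for row in a:
            row[i], row[j] = row[j], row[i]

    def add_row(src, dst, q):                 # row dst += q * row src
        a[dst] = [x + q * y for x, y in zip(a[dst], a[src])]

    def add_col(src, dst, q):                 # column dst += q * column src
        for row in a:
            row[dst] += q * row[src]

    for t in range(min(m, n)):
        if all(a[i][j] == 0 for i in range(t, m) for j in range(t, n)):
            break                             # the remaining block is zero
        while True:
            # Step I: move a nonzero entry of smallest absolute value to (t, t).
            _, i, j = min((abs(a[r][c]), r, c)
                          for r in range(t, m) for c in range(t, n) if a[r][c])
            swap_rows(t, i)
            swap_cols(t, j)
            # Step II, Euclidean form: if the pivot does not divide an entry in
            # its column or row, division with remainder leaves a smaller
            # nonzero entry, so the next pass picks a smaller pivot.
            dirty = False
            for i in range(t + 1, m):
                if a[i][t] % a[t][t]:
                    add_row(t, i, -(a[i][t] // a[t][t]))
                    dirty = True
            for j in range(t + 1, n):
                if a[t][j] % a[t][t]:
                    add_col(t, j, -(a[t][j] // a[t][t]))
                    dirty = True
            if dirty:
                continue
            # Step III: the pivot divides its row and column, so clear them.
            for i in range(t + 1, m):
                add_row(t, i, -(a[i][t] // a[t][t]))
            for j in range(t + 1, n):
                add_col(t, j, -(a[t][j] // a[t][t]))
            # Divisibility fix-up: if some entry of the remaining block is not a
            # multiple of the pivot, pull its row into row t and start over.
            bad = next(((i, j) for i in range(t + 1, m) for j in range(t + 1, n)
                        if a[i][j] % a[t][t]), None)
            if bad is None:
                break
            add_row(bad[0], t, 1)
        if a[t][t] < 0:                       # normalize the pivot's sign
            a[t] = [-x for x in a[t]]
    return a

# e.g. smith_normal_form([[2, 4], [6, 10]]) == [[2, 0], [0, 2]]
```

The termination argument is the one given above: whenever the pivot fails to divide an entry in its row or column, a division step produces a strictly smaller nonzero entry in the working block.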

Applications

The Smith normal form is useful for computing the homology of a chain complex when the chain modules of the chain complex are finitely generated. For instance, in topology, it can be used to compute the homology of a finite simplicial complex or CW complex over the integers, because the boundary maps in such a complex are just integer matrices. It can also be used to determine the invariant factors that occur in the structure theorem for finitely generated modules over a principal ideal domain, which includes the fundamental theorem of finitely generated abelian groups.
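
As a small illustration of this use (a sketch with our own helper name homology_summands, reusing invariant_factors from the sketch in the Definition section): the invariant factors of the boundary matrix $\partial_{n+1}$ give the torsion of $H_n$, and the ranks give the Betti number.

```python
# reuses invariant_factors from the sketch in the Definition section
def homology_summands(d_n, d_np1, dim_Cn):
    """Describe H_n = ker(d_n) / im(d_{n+1}) of a chain complex of free
    Z-modules: d_n is the integer matrix of C_n -> C_{n-1}, d_np1 that of
    C_{n+1} -> C_n; pass None for a zero map.  Returns the Betti number and
    the list of torsion coefficients, read off from the invariant factors."""
    def rank(mat):
        return len(invariant_factors(mat)) if mat is not None else 0
    factors = invariant_factors(d_np1) if d_np1 is not None else []
    betti = dim_Cn - rank(d_n) - len(factors)
    return betti, [f for f in factors if f > 1]

# The real projective plane RP^2 has a CW structure with one cell in each of
# dimensions 0, 1, 2, where the 2-cell is attached by a degree-2 map, so the
# boundary matrices are d_2 = [[2]] and d_1 = [[0]]:
# homology_summands([[0]], [[2]], 1)  ->  (0, [2]),  i.e.  H_1(RP^2) = Z/2Z.
```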

The Smith normal form is also used in control theory to compute transmission and blocking zeros of a transfer function matrix. [2]

Example

As an example, we will find the Smith normal form of the following matrix over the integers.

The following matrices are the intermediate steps as the algorithm is applied to the above matrix.

So the Smith normal form is

$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 156 \end{pmatrix}$$

and the invariant factors are 2, 2 and 156.
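
A result of this kind can be reproduced with the sketches above. The matrix used below is an illustrative one chosen to have invariant factors 2, 2 and 156 as well; it is not claimed to be the matrix of the example:

```python
# reusing invariant_factors and smith_normal_form from the sketches above
M = [[2, 4, 4], [-6, 6, 12], [10, 4, 16]]
print(invariant_factors(M))     # -> [2, 2, 156]
print(smith_normal_form(M))     # -> [[2, 0, 0], [0, 2, 0], [0, 0, 156]]
```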

Similarity

The Smith normal form can be used to determine whether or not matrices with entries in a common field $K$ are similar. Specifically, two matrices A and B are similar if and only if the characteristic matrices $xI - A$ and $xI - B$ have the same Smith normal form (working in the PID $K[x]$).

For example, with

A and B are similar because the Smith normal forms of their characteristic matrices match, but they are not similar to C because the Smith normal forms of the characteristic matrices do not match.
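
As a small separate illustration over $K = \mathbb{Q}$: the matrices $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ are similar, since both characteristic matrices have determinant divisors $d_1 = 1$ and $d_2 = (x-1)^2$ and hence the common Smith normal form $\operatorname{diag}\bigl(1, (x-1)^2\bigr)$ over $\mathbb{Q}[x]$; neither is similar to the $2 \times 2$ identity matrix, whose characteristic matrix has Smith normal form $\operatorname{diag}(x-1, x-1)$.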


Notes

  1. Stanley, Richard P. (2016). "Smith normal form in combinatorics". Journal of Combinatorial Theory, Series A. 144: 476–495. arXiv:1602.00166. doi:10.1016/j.jcta.2016.06.013. S2CID 14400632.
  2. Maciejowski, Jan M. (1989). Multivariable Feedback Design. Wokingham, England: Addison-Wesley. ISBN 0201182432. OCLC 19456124.

