Frobenius normal form

Last updated January 11, 2025

In linear algebra, the Frobenius normal form or rational canonical form of a square matrix A with entries in a field F is a canonical form for matrices obtained by conjugation by invertible matrices over F. The form reflects a minimal decomposition of the vector space into subspaces that are cyclic for A (i.e., spanned by some vector and its repeated images under A). Since only one normal form can be reached from a given matrix (whence the "canonical"), a matrix B is similar to A if and only if it has the same rational canonical form as A. Since this form can be found without any operations that might change when extending the field F (whence the "rational"), notably without factoring polynomials, this shows that whether two matrices are similar does not change upon field extensions. The form is named after German mathematician Ferdinand Georg Frobenius.

Some authors use the term rational canonical form for a somewhat different form that is more properly called the primary rational canonical form. Instead of decomposing into a minimum number of cyclic subspaces, the primary form decomposes into a maximum number of cyclic subspaces. It is also defined over F, but has somewhat different properties: finding the form requires factorization of polynomials, and as a consequence the primary rational canonical form may change when the same matrix is considered over an extension field of F. This article mainly deals with the form that does not require factorization, and explicitly mentions "primary" when the form using factorization is meant.

Motivation

When trying to find out whether two square matrices A and B are similar, one approach is to try, for each of them, to decompose the vector space as far as possible into a direct sum of stable subspaces, and compare the respective actions on these subspaces. For instance, if both are diagonalizable, then one can take the decomposition into eigenspaces (for which the action is as simple as it can get, namely by a scalar), and then similarity can be decided by comparing eigenvalues and their multiplicities. While in practice this is often a quite insightful approach, it has several drawbacks as a general method. First, it requires finding all eigenvalues, say as roots of the characteristic polynomial, but it may not be possible to give an explicit expression for them. Second, a complete set of eigenvalues might exist only in an extension of the field one is working over, and then one does not get a proof of similarity over the original field. Finally, A and B might not be diagonalizable even over this larger field, in which case one must instead use a decomposition into generalized eigenspaces, and possibly into Jordan blocks.

But obtaining such a fine decomposition is not necessary to just decide whether two matrices are similar. The rational canonical form is based on instead using a direct sum decomposition into stable subspaces that are as large as possible, while still allowing a very simple description of the action on each of them. These subspaces must be generated by a single nonzero vector v and all its images by repeated application of the linear operator associated to the matrix; such subspaces are called cyclic subspaces (by analogy with cyclic subgroups) and they are clearly stable under the linear operator. A basis of such a subspace is obtained by taking v and its successive images as long as they are linearly independent. The matrix of the linear operator with respect to such a basis is the companion matrix of a monic polynomial; this polynomial (the minimal polynomial of the operator restricted to the subspace, which notion is analogous to that of the order of a cyclic subgroup) determines the action of the operator on the cyclic subspace up to isomorphism, and is independent of the choice of the vector v generating the subspace.
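The construction of a cyclic subspace described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `mat_vec`, `solve_in_span` and `cyclic_subspace`, as well as the small diagonal test matrix, are hypothetical and not taken from the source; exact arithmetic is done with `fractions.Fraction`.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def solve_in_span(basis, target):
    """Coefficients c with sum(c[i] * basis[i]) == target, or None if target
    lies outside the span. The basis vectors must be linearly independent."""
    n, m = len(target), len(basis)
    # Row r holds coordinate r of every basis vector, then of target.
    rows = [[Fraction(b[r]) for b in basis] + [Fraction(target[r])] for r in range(n)]
    pivot_row = 0
    for col in range(m):
        pivot = next((r for r in range(pivot_row, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("basis vectors are linearly dependent")
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        scale = rows[pivot_row][col]
        rows[pivot_row] = [x / scale for x in rows[pivot_row]]
        for r in range(n):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Below the pivots the coefficient part is zero; a nonzero right-hand
    # side there means target is not a combination of the basis vectors.
    if any(rows[r][m] != 0 for r in range(m, n)):
        return None
    return [rows[i][m] for i in range(m)]

def cyclic_subspace(A, v):
    """Return (basis, coeffs): basis is v, Av, ..., A^(d-1)v, and coeffs are
    c_0, ..., c_(d-1) of the monic polynomial X^d + ... + c_0 annihilating v."""
    basis, current = [], list(v)
    while True:
        combo = solve_in_span(basis, current)
        if combo is not None:
            # A^d v = sum(combo[i] * A^i v), so the polynomial is X^d - sum(combo[i] X^i).
            return basis, [-c for c in combo]
        basis.append(current)
        current = mat_vec(A, current)

# Hypothetical example: diag(2, 3, 3) and the vector (1, 1, 0).
A_demo = [[2, 0, 0], [0, 3, 0], [0, 0, 3]]
basis_demo, coeffs_demo = cyclic_subspace(A_demo, [1, 1, 0])
print(len(basis_demo), coeffs_demo)
```

Here the cyclic subspace generated by (1, 1, 0) has dimension 2 and the associated polynomial is X² − 5X + 6 = (X − 2)(X − 3); the vector does not generate the whole space because the eigenvalue 3 occurs twice.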

A direct sum decomposition into cyclic subspaces always exists, and finding one does not require factoring polynomials. However, it is possible that cyclic subspaces do allow a decomposition as direct sum of smaller cyclic subspaces (essentially by the Chinese remainder theorem). Therefore, just having for both matrices some decomposition of the space into cyclic subspaces, and knowing the corresponding minimal polynomials, is not in itself sufficient to decide their similarity. An additional condition is imposed to ensure that for similar matrices one gets decompositions into cyclic subspaces that exactly match: in the list of associated minimal polynomials each one must divide the next (and the constant polynomial 1 is forbidden to exclude trivial cyclic subspaces). The polynomials in the resulting list are called the invariant factors of (the F[X]-module defined by) the matrix, and two matrices are similar if and only if they have identical lists of invariant factors. The rational canonical form of a matrix A is obtained by expressing it on a basis adapted to a decomposition into cyclic subspaces whose associated minimal polynomials are the invariant factors of A; two matrices are similar if and only if they have the same rational canonical form.
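A small worked example may help to see why the divisibility condition is needed. The diagonal matrices used here are chosen purely for illustration and do not come from the source. Over $\mathbb{Q}$, the matrix $\operatorname{diag}(2,3)$ splits the plane into two one-dimensional cyclic subspaces with minimal polynomials $X-2$ and $X-3$, yet the whole plane is also cyclic (generated by $e_1+e_2$), so the two decompositions cannot be compared directly. Requiring each polynomial to divide the next singles out the second one:

```latex
\mathbb{Q}[X]/(X-2)\;\oplus\;\mathbb{Q}[X]/(X-3)
  \;\cong\; \mathbb{Q}[X]/\bigl((X-2)(X-3)\bigr)
  \qquad\text{since } \gcd(X-2,\,X-3)=1,

\operatorname{diag}(2,3):\quad f_1=(X-2)(X-3),
\qquad
\operatorname{diag}(2,2,3):\quad f_1=X-2,\;\; f_2=(X-2)(X-3),\;\; f_1\mid f_2 .
```

In the second matrix the repeated eigenvalue 2 forces a second invariant factor, and the product $f_1 f_2=(X-2)^2(X-3)$ is its characteristic polynomial.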

Example

Consider the following matrix A, over Q:

$$A={\begin{pmatrix}-1&3&-1&0&-2&0&0&-2\\-1&-1&1&1&-2&-1&0&-1\\-2&-6&4&3&-8&-4&-2&1\\-1&8&-3&-1&5&2&3&-3\\0&0&0&0&0&0&0&1\\0&0&0&0&-1&0&0&0\\1&0&0&0&2&0&0&0\\0&0&0&0&4&0&1&0\end{pmatrix}}.$$

A has minimal polynomial $\mu =X^{6}-4X^{4}-2X^{3}+4X^{2}+4X+1$, so that the dimension of a subspace generated by the repeated images of a single vector is at most 6. The characteristic polynomial is $\chi =X^{8}-X^{7}-5X^{6}+2X^{5}+10X^{4}+2X^{3}-7X^{2}-5X-1$, which is a multiple of the minimal polynomial by a factor $X^{2}-X-1$. There always exist vectors such that the cyclic subspace that they generate has the same minimal polynomial as the operator has on the whole space; indeed most vectors will have this property, and in this case the first standard basis vector $e_{1}$ does so: the vectors $A^{k}(e_{1})$ for $k=0,1,\ldots ,5$ are linearly independent and span a cyclic subspace with minimal polynomial $\mu$. There exist complementary stable subspaces (of dimension 2) to this cyclic subspace, and the space generated by vectors $v=(3,4,8,0,-1,0,2,-1)^{\top }$ and $w=(5,4,5,9,-1,1,1,-2)^{\top }$ is an example. In fact one has $A\cdot v=w$ and $A\cdot w=v+w$, so the complementary subspace is a cyclic subspace generated by $v$; it has minimal polynomial $X^{2}-X-1$. Since $\mu$ is the minimal polynomial of A on the whole space, it is clear that $X^{2}-X-1$ must divide $\mu$ (and it is easily checked that it does), and we have found the invariant factors $X^{2}-X-1$ and $\mu =X^{6}-4X^{4}-2X^{3}+4X^{2}+4X+1$ of A. Then the rational canonical form of A is the block diagonal matrix with the corresponding companion matrices as diagonal blocks, namely

$$C=\left({\begin{array}{cc|cccccc}0&1&0&0&0&0&0&0\\1&1&0&0&0&0&0&0\\\hline 0&0&0&0&0&0&0&-1\\0&0&1&0&0&0&0&-4\\0&0&0&1&0&0&0&-4\\0&0&0&0&1&0&0&2\\0&0&0&0&0&1&0&4\\0&0&0&0&0&0&1&0\end{array}}\right).$$

A basis on which this form is attained is formed by the vectors $v,w$ above, followed by $A^{k}(e_{1})$ for $k=0,1,\ldots ,5$; explicitly this means that for

$$P={\begin{pmatrix}3&5&1&-1&0&0&-4&0\\4&4&0&-1&-1&-2&-3&-5\\8&5&0&-2&-5&-2&-11&-6\\0&9&0&-1&3&-2&0&0\\-1&-1&0&0&0&1&-1&4\\0&1&0&0&0&0&-1&1\\2&1&0&1&-1&0&2&-6\\-1&-2&0&0&1&-1&4&-2\end{pmatrix}},$$

one has $A=PCP^{-1}$.
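The relation between A, P and C can be checked mechanically. The following sketch uses only integer arithmetic and verifies the equivalent identity AP = PC, which avoids computing an inverse; the helper name `mat_mul` is an assumption of this illustration, while the three matrices are copied from the example above.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [
    [-1, 3, -1, 0, -2, 0, 0, -2],
    [-1, -1, 1, 1, -2, -1, 0, -1],
    [-2, -6, 4, 3, -8, -4, -2, 1],
    [-1, 8, -3, -1, 5, 2, 3, -3],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, -1, 0, 0, 0],
    [1, 0, 0, 0, 2, 0, 0, 0],
    [0, 0, 0, 0, 4, 0, 1, 0],
]
C = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, -1],
    [0, 0, 1, 0, 0, 0, 0, -4],
    [0, 0, 0, 1, 0, 0, 0, -4],
    [0, 0, 0, 0, 1, 0, 0, 2],
    [0, 0, 0, 0, 0, 1, 0, 4],
    [0, 0, 0, 0, 0, 0, 1, 0],
]
P = [
    [3, 5, 1, -1, 0, 0, -4, 0],
    [4, 4, 0, -1, -1, -2, -3, -5],
    [8, 5, 0, -2, -5, -2, -11, -6],
    [0, 9, 0, -1, 3, -2, 0, 0],
    [-1, -1, 0, 0, 0, 1, -1, 4],
    [0, 1, 0, 0, 0, 0, -1, 1],
    [2, 1, 0, 1, -1, 0, 2, -6],
    [-1, -2, 0, 0, 1, -1, 4, -2],
]

# Column j of A*P is A applied to the j-th basis vector; column j of P*C
# expresses that image in the basis, as prescribed by the companion blocks.
print(mat_mul(A, P) == mat_mul(P, C))
```

Reading the identity column by column recovers the statements of the example: the first two columns say that A maps v to w and w to v + w, and the remaining columns describe the action of A on the vectors $A^{k}(e_{1})$.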

General case and theory

Fix a base field F and a finite-dimensional vector space V over F. Given a monic polynomial P ∈ F[X], there is associated to it a companion matrix C_P whose characteristic polynomial and minimal polynomial are both equal to P.
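As a minimal sketch, the companion matrix can be built directly from the coefficients of P. The convention assumed here is the one visible in the example above (ones on the subdiagonal, negated coefficients in the last column); the function names `companion`, `mat_mul` and `evaluate` are hypothetical. The check that P(C_P) = 0 only confirms that P annihilates its companion matrix, not the full statement about minimality.

```python
def companion(coeffs):
    """Companion matrix of X^n + c[n-1] X^(n-1) + ... + c[0],
    given coeffs = [c[0], ..., c[n-1]] (leading 1 omitted)."""
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1            # ones on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -coeffs[i]   # last column: negated coefficients
    return C

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def evaluate(coeffs, M):
    """Evaluate the monic polynomial with low-order coefficients `coeffs`
    at the square matrix M, using Horner's rule."""
    n = len(M)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = identity              # start from the leading coefficient 1
    for c in reversed(coeffs):
        result = mat_mul(result, M)
        result = [[result[i][j] + c * identity[i][j] for j in range(n)]
                  for i in range(n)]
    return result

small = companion([-1, -1])        # X^2 - X - 1
print(small)
print(evaluate([-1, -1], small))   # P evaluated at its own companion matrix
```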

Theorem: Let V be a finite-dimensional vector space over a field F, and A a square matrix over F acting on V. Then V (viewed as an F[X]-module with the action of X given by A) admits an F[X]-module isomorphism

V ≅ F[X]/f₁ ⊕ … ⊕ F[X]/f_k

where the f_i ∈ F[X] may be taken to be monic polynomials of positive degree (so they are non-units in F[X]) that satisfy the relations

f₁ | f₂ | … | f_k

(where "a | b" is notation for "a divides b"); with these conditions the list of polynomials f_i is unique.

Sketch of Proof: Apply the structure theorem for finitely generated modules over a principal ideal domain to V, viewing it as an F[X]-module. The structure theorem provides a decomposition into cyclic factors, each of which is a quotient of F[X] by a proper ideal; the zero ideal cannot be present since the resulting free module would be infinite-dimensional as an F-vector space, while V is finite-dimensional. For the polynomials f_i one then takes the unique monic generators of the respective ideals, and since the structure theorem ensures containment of every ideal in the preceding ideal, one obtains the divisibility conditions for the f_i. See [DF] for details.

Given an arbitrary square matrix, the elementary divisors used in the construction of the Jordan normal form need not exist over F, so the invariant factors f_i as given above must be used instead. The last of these factors f_k is then the minimal polynomial, which all the invariant factors therefore divide, and the product of the invariant factors gives the characteristic polynomial. Note that this implies that the minimal polynomial divides the characteristic polynomial (which is essentially the Cayley–Hamilton theorem), and that every irreducible factor of the characteristic polynomial also divides the minimal polynomial (possibly with lower multiplicity).
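These relations can be checked on the two invariant factors of the 8 × 8 example above. The sketch below is an illustration only: polynomials are stored as coefficient lists with the constant term first, and the helper names `poly_mul` and `poly_divmod` are assumptions, not part of any library.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_divmod(num, den):
    """Quotient and remainder of num by den (den has nonzero leading coefficient)."""
    num = [Fraction(c) for c in num]
    quot = [Fraction(0)] * max(len(num) - len(den) + 1, 1)
    for shift in range(len(num) - len(den), -1, -1):
        factor = num[shift + len(den) - 1] / den[-1]
        quot[shift] = factor
        for j, d in enumerate(den):
            num[shift + j] -= factor * d   # cancel the current leading term
    return quot, num[:len(den) - 1]

f1 = [-1, -1, 1]                   # X^2 - X - 1
mu = [1, 4, 4, -2, -4, 0, 1]       # X^6 - 4X^4 - 2X^3 + 4X^2 + 4X + 1
chi = poly_mul(f1, mu)             # product of the invariant factors
quotient, remainder = poly_divmod(mu, f1)
print(chi)
print(quotient, remainder)
```

The product reproduces the coefficients of the characteristic polynomial stated in the example, and the zero remainder confirms that the first invariant factor divides the second.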

For each invariant factor f_i one takes its companion matrix C_{f_i}, and the block diagonal matrix formed from these blocks yields the rational canonical form of A. When the minimal polynomial is identical to the characteristic polynomial (the case k = 1), the Frobenius normal form is the companion matrix of the characteristic polynomial. As the rational canonical form is uniquely determined by the unique invariant factors associated to A, and these invariant factors are independent of basis, it follows that two square matrices A and B are similar if and only if they have the same rational canonical form.
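Assembling the form from a list of invariant factors is then a matter of placing companion blocks along the diagonal. The following sketch assumes the same companion-matrix convention as the example above (ones on the subdiagonal, negated coefficients in the last column); the function names are hypothetical, and the invariant factors are assumed to be already known rather than computed from A.

```python
def companion(coeffs):
    """Companion matrix of X^n + c[n-1] X^(n-1) + ... + c[0]."""
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1
    for i in range(n):
        C[i][n - 1] = -coeffs[i]
    return C

def block_diagonal(blocks):
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    size = sum(len(b) for b in blocks)
    M = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[offset + i][offset + j] = x
        offset += len(b)
    return M

def rational_canonical_form(invariant_factors):
    """Block diagonal matrix of companion blocks, one per invariant factor.
    Each factor is a list [c0, ..., c_(n-1)] of a monic polynomial, and the
    list is assumed to be ordered so that each factor divides the next."""
    return block_diagonal([companion(f) for f in invariant_factors])

# Invariant factors of the 8 x 8 example: X^2 - X - 1 and mu.
rcf = rational_canonical_form([[-1, -1], [1, 4, 4, -2, -4, 0]])
for row in rcf:
    print(row)
```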

A rational normal form generalizing the Jordan normal form

The Frobenius normal form does not reflect any form of factorization of the characteristic polynomial, even if it does exist over the ground field F. This implies that it is invariant when F is replaced by a different field (as long as it contains the entries of the original matrix A). On the other hand, this makes the Frobenius normal form rather different from other normal forms that do depend on factoring the characteristic polynomial, notably the diagonal form (if A is diagonalizable) or more generally the Jordan normal form (if the characteristic polynomial splits into linear factors). For instance, the Frobenius normal form of a diagonal matrix with distinct diagonal entries is just the companion matrix of its characteristic polynomial.

There is another way to define a normal form that, like the Frobenius normal form, is always defined over the same field F as A, but that does reflect a possible factorization of the characteristic polynomial (or equivalently the minimal polynomial) into irreducible factors over F, and which reduces to the Jordan normal form when this factorization only contains linear factors (corresponding to eigenvalues). This form^[1] is sometimes called the generalized Jordan normal form, or primary rational canonical form. It is based on the fact that the vector space can be canonically decomposed into a direct sum of stable subspaces corresponding to the distinct irreducible factors P of the characteristic polynomial (as stated by the lemme des noyaux^[2]), where the characteristic polynomial of each summand is a power of the corresponding P. These summands can be further decomposed, non-canonically, as a direct sum of cyclic F[X]-modules (as is done for the Frobenius normal form above), where the characteristic polynomial of each summand is still a (generally smaller) power of P. The primary rational canonical form is a block diagonal matrix corresponding to such a decomposition into cyclic modules, with a particular form called generalized Jordan block in the diagonal blocks, corresponding to a particular choice of a basis for the cyclic modules. This generalized Jordan block is itself a block matrix of the form

$${\begin{pmatrix}C&0&\cdots &0\\U&C&\cdots &0\\\vdots &\ddots &\ddots &\vdots \\0&\cdots &U&C\end{pmatrix}}$$

where $C$ is the companion matrix of the irreducible polynomial $P$, and $U$ is a matrix whose sole nonzero entry is a 1 in the upper right-hand corner. For the case of a linear irreducible factor $P = X - \lambda$, these blocks are reduced to single entries $C = \lambda$ and $U = 1$, and one finds a (transposed) Jordan block. In any generalized Jordan block, all entries immediately below the main diagonal are 1. A basis of the cyclic module giving rise to this form is obtained by choosing a generating vector $v$ (one that is not annihilated by $P^{k-1}(A)$, where the minimal polynomial of the cyclic module is $P^{k}$), and taking as basis

$$v,A(v),A^{2}(v),\ldots ,A^{d-1}(v),~P(A)(v),A(P(A)(v)),\ldots ,A^{d-1}(P(A)(v)),~P^{2}(A)(v),\ldots ,~P^{k-1}(A)(v),\ldots ,A^{d-1}(P^{k-1}(A)(v))$$

where $d = \deg P$.
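The shape of a generalized Jordan block can be illustrated with a short sketch that places k copies of the companion matrix C on the diagonal and a copy of U below each of them. The function names are assumptions of this illustration, the companion convention is the one used earlier in this article, and the polynomial passed in is not checked for irreducibility.

```python
def companion(coeffs):
    """Companion matrix of X^d + c[d-1] X^(d-1) + ... + c[0]."""
    d = len(coeffs)
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1
    for i in range(d):
        C[i][d - 1] = -coeffs[i]
    return C

def generalized_jordan_block(coeffs, k):
    """Block matrix with k diagonal copies of the companion matrix of P and,
    directly below each of the first k-1 copies, a block U whose only nonzero
    entry is a 1 in its upper right-hand corner."""
    d = len(coeffs)
    C = companion(coeffs)
    J = [[0] * (d * k) for _ in range(d * k)]
    for b in range(k):
        for i in range(d):
            for j in range(d):
                J[b * d + i][b * d + j] = C[i][j]
        if b > 0:
            # Upper right corner of the U block in block row b, block column b-1.
            J[b * d][b * d - 1] = 1
    return J

# P = X - 5 with k = 3 gives a transposed 3 x 3 Jordan block for eigenvalue 5.
print(generalized_jordan_block([-5], 3))
# P = X^2 + 1 (irreducible over the rationals) with k = 2.
print(generalized_jordan_block([1, 0], 2))
```

In both outputs every entry immediately below the main diagonal equals 1, in line with the remark above.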

References

[DF] David S. Dummit and Richard M. Foote. Abstract Algebra. 2nd Edition, John Wiley & Sons. pp. 442, 446, 452-458. ISBN 0-471-36857-1.

[1] Phani Bhushan Bhattacharya, Surender Kumar Jain, S. R. Nagpaul, Basic abstract algebra, Theorem 5.4, p. 423.
[2] Xavier Gourdon, Les maths en tête, Mathématiques pour M', Algèbre, 1998, Ellipses, Th. 1, p. 173.

External links

Rational Canonical Form (Mathworld)
