Companion matrix

Last updated January 31, 2024

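In linear algebra, the Frobenius companion matrix of the monic polynomial

p(x)=c_{0}+c_{1}x+\cdots +c_{n-1}x^{n-1}+x^{n}

is the square matrix defined as

C(p)={\begin{bmatrix}0&0&\cdots &0&-c_{0}\\1&0&\cdots &0&-c_{1}\\0&1&\cdots &0&-c_{2}\\\vdots &\vdots &\ddots &\vdots &\vdots \\0&0&\cdots &1&-c_{n-1}\end{bmatrix}}.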

Similarity to companion matrix

Any matrix $A$ with entries in a field $F$ has characteristic polynomial $p(x)=\det(xI-A)$ , which in turn has companion matrix $C(p)$ . These matrices are related as follows.

The following statements are equivalent:

$A$ is similar over $F$ to $C(p)$ , i.e. $A$ can be conjugated to its companion matrix by matrices in $GL_{n}(F)$ ;
the characteristic polynomial $p(x)$ coincides with the minimal polynomial of $A$ , i.e. the minimal polynomial has degree n;
the linear mapping $A:F^{n}\to F^{n}$ makes $F^{n}$ a cyclic $F[A]$ -module, having a basis of the form $\{v,Av,\ldots ,A^{n-1}v\}$ ; or equivalently $F^{n}\cong F[x]/(p(x))$ as $F[A]$ -modules.

If the above hold, one says that A is non-derogatory.
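As a small numerical illustration of the cyclic-basis criterion, the following Python sketch builds the basis $\{v,Av\}$ for a 2 × 2 matrix with distinct eigenvalues and checks that it conjugates the matrix to $C(p)$ . NumPy is assumed, and the helpers `companion` and `krylov` as well as the sample matrices are illustrative choices, not part of the source. For the identity matrix, which is derogatory, no vector is cyclic.

```python
import numpy as np

def companion(c):
    """C(p) for p(x) = c[0] + c[1] x + ... + c[n-1] x^(n-1) + x^n (hypothetical helper)."""
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)          # ones on the subdiagonal
    C[:, -1] = -np.asarray(c, float)    # last column holds -c_0, ..., -c_{n-1}
    return C

def krylov(A, v):
    """Matrix whose columns are v, Av, ..., A^(n-1) v."""
    n = A.shape[0]
    return np.column_stack([np.linalg.matrix_power(A, k) @ v for k in range(n)])

# Non-derogatory example: distinct eigenvalues 2 and 3.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
p = np.poly(A)                 # characteristic polynomial, highest degree first
C = companion(p[:0:-1])        # reorder to [c_0, ..., c_{n-1}]
K = krylov(A, np.array([0.0, 1.0]))
is_similar = np.allclose(np.linalg.inv(K) @ A @ K, C)

# Derogatory example: for the identity, v and Iv are always parallel.
rank_identity = np.linalg.matrix_rank(krylov(np.eye(2), np.array([1.0, 2.0])))
print(is_similar, rank_identity)
```

The choice $v=(0,1)$ matters here: an eigenvector of A would give a singular matrix of iterates, so not every vector is cyclic even when A is non-derogatory.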

Not every square matrix is similar to a companion matrix, but every square matrix is similar to a block diagonal matrix made of companion matrices. If we also demand that the polynomial of each diagonal block divides the next one, they are uniquely determined by A, and this gives the rational canonical form of A.

Diagonalizability

The roots of the characteristic polynomial $p(x)$ are the eigenvalues of $C(p)$ . If there are n distinct eigenvalues $\lambda _{1},\ldots ,\lambda _{n}$ , then $C(p)$ is diagonalizable as $C(p)=V^{-1}\!DV$ , where D is the diagonal matrix of eigenvalues and V is the n × n Vandermonde matrix corresponding to the $\lambda _{i}$ :

D={\begin{bmatrix}\lambda _{1}&0&\!\!\!\cdots \!\!\!&0\\0&\lambda _{2}&\!\!\!\cdots \!\!\!&0\\[-1em]\vdots &\vdots &\!\!\!\ddots \!\!\!&\vdots \\0&0&\!\!\!\cdots \!\!\!&\lambda _{n}\end{bmatrix}},\qquad V={\begin{bmatrix}1&\lambda _{1}&\lambda _{1}^{2}&\!\!\!\cdots \!\!\!&\lambda _{1}^{n-1}\\1&\lambda _{2}&\lambda _{2}^{2}&\!\!\!\cdots \!\!\!&\lambda _{2}^{n-1}\\[-1em]\vdots &\vdots &\vdots &\!\!\!\ddots \!\!\!&\vdots \\1&\lambda _{n}&\lambda _{n}^{2}&\!\!\!\cdots \!\!\!&\lambda _{n}^{n-1}\end{bmatrix}}.

Indeed, an easy computation shows that the transpose $C(p)^{T}$ has eigenvectors $v_{i}=(1,\lambda _{i},\ldots ,\lambda _{i}^{n-1})$ with $C(p)^{T}\!(v_{i})=\lambda _{i}v_{i}$ , which follows from $p(\lambda _{i})=c_{0}+c_{1}\lambda _{i}+\cdots +c_{n-1}\lambda _{i}^{n-1}+\lambda _{i}^{n}=0$ . Thus, its diagonalizing change of basis matrix is $V^{T}=[v_{1}^{T}\ldots v_{n}^{T}]$ , meaning $C(p)^{T}=V^{T}D\,(V^{T})^{-1}$ , and taking the transpose of both sides gives $C(p)=V^{-1}\!DV$ .
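The factorization $C(p)=V^{-1}\!DV$ can be checked numerically. The sketch below assumes NumPy and uses a hypothetical helper `companion` together with an arbitrarily chosen cubic with roots 1, 2, 4; it is an illustration, not a general-purpose routine.

```python
import numpy as np

def companion(c):
    """C(p) for p(x) = c[0] + c[1] x + ... + c[n-1] x^(n-1) + x^n (hypothetical helper)."""
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)          # ones on the subdiagonal
    C[:, -1] = -np.asarray(c, float)    # last column holds -c_0, ..., -c_{n-1}
    return C

roots = np.array([1.0, 2.0, 4.0])       # distinct eigenvalues (sample data)
high_first = np.poly(roots)             # x^3 - 7x^2 + 14x - 8, highest degree first
C = companion(high_first[:0:-1])        # [c_0, c_1, c_2] = [-8, 14, -7]

V = np.vander(roots, increasing=True)   # row i is (1, lambda_i, lambda_i^2)
D = np.diag(roots)

diagonalizes = np.allclose(C, np.linalg.inv(V) @ D @ V)
print(diagonalizes)
```

Vandermonde matrices become badly conditioned as n grows or as roots cluster, so this check is only reliable for small, well-separated examples.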

We can read the eigenvectors of $C(p)$ with $C(p)(w_{i})=\lambda _{i}w_{i}$ from the equation $C(p)=V^{-1}\!DV$ : they are the column vectors of the inverse Vandermonde matrix $V^{-1}=[w_{1}^{T}\cdots w_{n}^{T}]$ . This matrix is known explicitly, giving the eigenvectors $w_{i}=(L_{0i},\ldots ,L_{(n-1)i})$ , with coordinates equal to the coefficients of the Lagrange polynomials

L_{i}(x)=L_{0i}+L_{1i}x+\cdots +L_{(n-1)i}x^{n-1}=\prod _{j\neq i}{\frac {x-\lambda _{j}}{\lambda _{i}-\lambda _{j}}}={\frac {p(x)}{(x-\lambda _{i})\,p'(\lambda _{i})}}.

Alternatively, the scaled eigenvectors ${\tilde {w}}_{i}=p'\!(\lambda _{i})\,w_{i}$ have simpler coefficients.
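A minimal sketch of this eigenvector description, assuming NumPy and the same illustrative cubic with roots 1, 2, 4 (the helper `companion` and all variable names are hypothetical): the coefficients of $p(x)/(x-\lambda _{i})$ are compared with the columns of $V^{-1}$ scaled by $p'(\lambda _{i})$ , and each is tested as an eigenvector of $C(p)$ .

```python
import numpy as np

def companion(c):
    """C(p) for p(x) = c[0] + c[1] x + ... + c[n-1] x^(n-1) + x^n (hypothetical helper)."""
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -np.asarray(c, float)
    return C

roots = np.array([1.0, 2.0, 4.0])
C = companion(np.poly(roots)[:0:-1])
V_inv = np.linalg.inv(np.vander(roots, increasing=True))

is_eigenvector = []
matches_inverse_column = []
for i, lam in enumerate(roots):
    others = np.delete(roots, i)
    # coefficients of prod_{j != i} (x - lambda_j), lowest degree first
    w_tilde = np.poly(others)[::-1]
    # p'(lambda_i) = prod_{j != i} (lambda_i - lambda_j)
    p_prime = np.prod(lam - others)
    is_eigenvector.append(np.allclose(C @ w_tilde, lam * w_tilde))
    matches_inverse_column.append(np.allclose(p_prime * V_inv[:, i], w_tilde))

print(is_eigenvector, matches_inverse_column)
```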

If $p(x)$ has multiple roots, then $C(p)$ is not diagonalizable. Rather, the Jordan canonical form of $C(p)$ contains exactly one Jordan block for each distinct root: if the root $\lambda$ has multiplicity m, its block is an m × m Jordan block with $\lambda$ on the diagonal.
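The repeated-root case can be illustrated on $p(x)=(x-2)^{3}=x^{3}-6x^{2}+12x-8$ . The sketch below (NumPy assumed; the example polynomial and variable names are illustrative) checks that the eigenspace for $\lambda =2$ is one-dimensional and that $C(p)-2I$ is nilpotent of index exactly 3, which is what a single 3 × 3 Jordan block requires.

```python
import numpy as np

# Companion matrix of p(x) = x^3 - 6x^2 + 12x - 8, i.e. (c_0, c_1, c_2) = (-8, 12, -6).
C = np.array([[0, 0, 8],
              [1, 0, -12],
              [0, 1, 6]])

N = C - 2 * np.eye(3, dtype=int)        # C(p) - lambda I for the triple root lambda = 2

# dimension of the eigenspace = n - rank(C - lambda I)
geometric_multiplicity = 3 - np.linalg.matrix_rank(N)

N2 = np.linalg.matrix_power(N, 2)       # still nonzero
N3 = np.linalg.matrix_power(N, 3)       # vanishes (exact integer arithmetic)

print(geometric_multiplicity, bool(N2.any()), bool(N3.any()))
```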

Linear recursive sequences

A linear recursive sequence defined by $a_{k+n}=-c_{0}a_{k}-c_{1}a_{k+1}-\cdots -c_{n-1}a_{k+n-1}$ for $k\geq 0$ has the characteristic polynomial $p(x)=c_{0}+c_{1}x+\cdots +c_{n-1}x^{n-1}+x^{n}$ , whose transpose companion matrix $C(p)^{T}$ generates the sequence:

{\begin{bmatrix}a_{k+1}\\a_{k+2}\\\vdots \\a_{k+n-1}\\a_{k+n}\end{bmatrix}}={\begin{bmatrix}0&1&0&\cdots &0\\0&0&1&\cdots &0\\\vdots &\vdots &\vdots &\ddots &\vdots \\0&0&0&\cdots &1\\-c_{0}&-c_{1}&-c_{2}&\cdots &-c_{n-1}\end{bmatrix}}{\begin{bmatrix}a_{k}\\a_{k+1}\\\vdots \\a_{k+n-2}\\a_{k+n-1}\end{bmatrix}}.

The vector $v=(1,\lambda ,\lambda ^{2},\ldots ,\lambda ^{n-1})$ is an eigenvector of this matrix, where the eigenvalue $\lambda$ is a root of $p(x)$ . Setting the initial values of the sequence equal to this vector produces a geometric sequence $a_{k}=\lambda ^{k}$ which satisfies the recurrence. In the case of n distinct eigenvalues, an arbitrary solution $a_{k}$ can be written as a linear combination of such geometric solutions, and the eigenvalues of largest absolute value give an asymptotic approximation.
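For a concrete instance, the Fibonacci recurrence $a_{k+2}=a_{k+1}+a_{k}$ corresponds to $c_{0}=c_{1}=-1$ , so $p(x)=x^{2}-x-1$ . The following sketch (NumPy assumed; variable names are illustrative) iterates $C(p)^{T}$ on the state vector and compares the ratio of consecutive terms with the eigenvalue of largest absolute value.

```python
import numpy as np

c = [-1, -1]                              # p(x) = x^2 - x - 1 (Fibonacci)
n = len(c)

# Transpose companion matrix: ones on the superdiagonal, -c_i in the last row.
Ct = np.zeros((n, n), dtype=int)
Ct[:-1, 1:] = np.eye(n - 1, dtype=int)
Ct[-1, :] = [-ci for ci in c]

state = np.array([0, 1])                  # (a_0, a_1)
terms = []
for _ in range(10):
    terms.append(int(state[0]))
    state = Ct @ state                    # (a_k, a_{k+1}) -> (a_{k+1}, a_{k+2})

dominant = max(abs(np.linalg.eigvals(Ct)))   # eigenvalue of largest absolute value
ratio = state[1] / state[0]                  # a_11 / a_10
print(terms, dominant, ratio)
```

Integer arithmetic keeps the terms exact here; for long sequences or large coefficients, Python integers (or repeated squaring of the matrix) would avoid fixed-width overflow.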

From linear ODE to first-order linear ODE system

Similarly to the above case of linear recursions, consider a homogeneous linear ODE of order n for the scalar function $y=y(t)$ :

y^{(n)}+c_{n-1}y^{(n-1)}+\dots +c_{1}y^{(1)}+c_{0}y=0.

This can be equivalently described as a coupled system of homogeneous linear ODEs of order 1 for the vector function $z(t)=(y(t),y'(t),\ldots ,y^{(n-1)}(t))$ :

z'=C(p)^{T}z

where $C(p)^{T}$ is the transpose companion matrix for the characteristic polynomial

p(x)=x^{n}+c_{n-1}x^{n-1}+\cdots +c_{1}x+c_{0}.

Here the coefficients $c_{i}=c_{i}(t)$ may also be functions, not just constants.

If $C(p)^{T}$ is diagonalizable, then a diagonalizing change of basis will transform this into a decoupled system equivalent to one scalar homogeneous first-order linear ODE in each coordinate.
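The decoupling can be sketched for the constant-coefficient equation $y''+3y'+2y=0$ with $y(0)=1$ , $y'(0)=0$ , whose exact solution is $y(t)=2e^{-t}-e^{-2t}$ . The example, the initial data, and the variable names are assumptions made for illustration; NumPy is assumed.

```python
import numpy as np

# y'' + 3 y' + 2 y = 0, so c_0 = 2, c_1 = 3 and p(x) = x^2 + 3x + 2.
Ct = np.array([[0.0, 1.0],
               [-2.0, -3.0]])             # transpose companion matrix C(p)^T
z0 = np.array([1.0, 0.0])                 # z(0) = (y(0), y'(0))

lam, P = np.linalg.eig(Ct)                # columns of P are eigenvectors of C(p)^T
u0 = np.linalg.solve(P, z0)               # initial data in eigen-coordinates

def z(t):
    # In eigen-coordinates each component solves u_i' = lam_i * u_i on its own.
    return P @ (np.exp(lam * t) * u0)

t = 0.7
y_numeric = z(t)[0]
y_exact = 2 * np.exp(-t) - np.exp(-2 * t)
print(y_numeric, y_exact)
```

This relies on the coefficients being constant; with time-dependent $c_{i}(t)$ the eigenvectors vary with t and the same shortcut does not apply directly.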

An inhomogeneous equation

y^{(n)}+c_{n-1}y^{(n-1)}+\dots +c_{1}y^{(1)}+c_{0}y=f(t)

is equivalent to the system:

z'=C(p)^{T}z+F(t)

with the inhomogeneity term $F(t)=(0,\ldots ,0,f(t))$ .

Again, a diagonalizing change of basis will transform this into a decoupled system of scalar inhomogeneous first-order linear ODEs.
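As a sketch of the inhomogeneous case, take $y''+3y'+2y=2$ with zero initial data, whose exact solution is $y(t)=1-2e^{-t}+e^{-2t}$ . With a constant forcing term, each decoupled equation $u_{i}'=\lambda _{i}u_{i}+g_{i}$ has the closed form used below. The example and all names are illustrative assumptions; NumPy is assumed, and the formula requires nonzero eigenvalues.

```python
import numpy as np

Ct = np.array([[0.0, 1.0],
               [-2.0, -3.0]])             # C(p)^T for p(x) = x^2 + 3x + 2
F = np.array([0.0, 2.0])                  # F(t) = (0, f(t)) with constant f = 2

lam, P = np.linalg.eig(Ct)
g = np.linalg.solve(P, F)                 # forcing term in eigen-coordinates

def z(t):
    # u_i' = lam_i u_i + g_i with u_i(0) = 0 gives u_i = g_i (e^{lam_i t} - 1) / lam_i
    u = g * (np.exp(lam * t) - 1.0) / lam
    return P @ u

t = 1.3
y_numeric = z(t)[0]
y_exact = 1 - 2 * np.exp(-t) + np.exp(-2 * t)
print(y_numeric, y_exact)
```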

Cyclic shift matrix

In the case of $p(x)=x^{n}-1$ , when the eigenvalues are the complex roots of unity, the companion matrix and its transpose both reduce to Sylvester's cyclic shift matrix, a circulant matrix.
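A quick numerical check of this special case, assuming NumPy and a hypothetical helper `companion` (the size n = 5 is an arbitrary choice): the companion matrix of $x^{n}-1$ is a permutation matrix that cycles the standard basis vectors, its n-th power is the identity, and its eigenvalues are n-th roots of unity.

```python
import numpy as np

def companion(c):
    """C(p) for p(x) = c[0] + c[1] x + ... + c[n-1] x^(n-1) + x^n (hypothetical helper)."""
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)
    C[:, -1] = -np.asarray(c, float)
    return C

n = 5
C = companion([-1] + [0] * (n - 1))       # p(x) = x^n - 1

is_cyclic_shift = np.array_equal(C, np.roll(np.eye(n), 1, axis=0))
has_order_n = np.array_equal(np.linalg.matrix_power(C, n), np.eye(n))
eigs = np.linalg.eigvals(C)
are_roots_of_unity = np.allclose(eigs ** n, 1.0)
print(is_cyclic_shift, has_order_n, are_roots_of_unity)
```

The transpose shifts in the opposite direction, so it is again a cyclic shift matrix.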

Multiplication map on a simple field extension

Consider a polynomial $p(x)=x^{n}+c_{n-1}x^{n-1}+\cdots +c_{1}x+c_{0}$ with coefficients in a field $F$ , and suppose $p(x)$ is irreducible in the polynomial ring $F[x]$ . Then adjoining a root $\lambda$ of $p(x)$ produces a field extension $K=F(\lambda )\cong F[x]/(p(x))$ , which is also a vector space over $F$ with standard basis $\{1,\lambda ,\lambda ^{2},\ldots ,\lambda ^{n-1}\}$ . Then the $F$ -linear multiplication mapping

m_{\lambda }:K\to K

defined by

m_{\lambda }(\alpha )=\lambda \alpha

has an n × n matrix $[m_{\lambda }]$ with respect to the standard basis. Since $m_{\lambda }(\lambda ^{i})=\lambda ^{i+1}$ for $i<n-1$ and $m_{\lambda }(\lambda ^{n-1})=\lambda ^{n}=-c_{0}-\cdots -c_{n-1}\lambda ^{n-1}$ , this is the companion matrix of $p(x)$ :

[m_{\lambda }]=C(p).
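To make this concrete, take $F=\mathbb {Q}$ and $p(x)=x^{3}-2$ , which is irreducible over $\mathbb {Q}$ , so $\lambda ={\sqrt[{3}]{2}}$ . The sketch below (NumPy assumed; the function `times_lambda` and the sample element are hypothetical names chosen for illustration) computes the matrix of $m_{\lambda }$ column by column from the rule $\lambda ^{n}=-c_{0}-\cdots -c_{n-1}\lambda ^{n-1}$ and compares it with a floating-point evaluation.

```python
import numpy as np

c = [-2, 0, 0]                            # p(x) = x^3 - 2
n = len(c)

def times_lambda(alpha):
    """Coordinates of lambda * alpha in the basis 1, lambda, ..., lambda^(n-1)."""
    top = alpha[-1]                       # coefficient that would land on lambda^n
    shifted = [0] + list(alpha[:-1])      # multiplying by lambda shifts coefficients up
    # replace lambda^n by -c_0 - c_1 lambda - ... - c_{n-1} lambda^(n-1)
    return [s - top * ci for s, ci in zip(shifted, c)]

# Column j of the matrix is the image of the basis element lambda^j.
M = np.column_stack([times_lambda(list(e)) for e in np.eye(n, dtype=int)])

# Floating-point sanity check with lambda = 2^(1/3) and alpha = 1 + 2 lambda + 3 lambda^2.
lam = 2 ** (1 / 3)
alpha = np.array([1, 2, 3])
powers = lam ** np.arange(n)
lhs = lam * (alpha @ powers)              # lambda * alpha evaluated as a real number
rhs = (M @ alpha) @ powers                # value of the vector M alpha
print(M, lhs, rhs)
```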

Assuming this extension is separable (for example if $F$ has characteristic zero or is a finite field), $p(x)$ has distinct roots $\lambda _{1},\ldots ,\lambda _{n}$ with $\lambda _{1}=\lambda$ , so that

p(x)=(x-\lambda _{1})\cdots (x-\lambda _{n}),

and it has splitting field $L=F(\lambda _{1},\ldots ,\lambda _{n})$ . Now $m_{\lambda }$ is not diagonalizable over $F$ ; rather, we must extend it to an $L$ -linear map on $L^{n}\cong L\otimes _{F}K$ , a vector space over $L$ with standard basis $\{1{\otimes }1,\,1{\otimes }\lambda ,\,1{\otimes }\lambda ^{2},\ldots ,1{\otimes }\lambda ^{n-1}\}$ , containing vectors $w=(\beta _{1},\ldots ,\beta _{n})=\beta _{1}{\otimes }1+\cdots +\beta _{n}{\otimes }\lambda ^{n-1}$ . The extended mapping is defined by $m_{\lambda }(\beta \otimes \alpha )=\beta \otimes (\lambda \alpha )$ .

The matrix $[m_{\lambda }]=C(p)$ is unchanged, but as above, it can be diagonalized by matrices with entries in $L$ :

[m_{\lambda }]=C(p)=V^{-1}\!DV,

for the diagonal matrix $D=\operatorname {diag} (\lambda _{1},\ldots ,\lambda _{n})$ and the Vandermonde matrix V corresponding to $\lambda _{1},\ldots ,\lambda _{n}\in L$ . The explicit formula for the eigenvectors (the scaled column vectors of the inverse Vandermonde matrix $V^{-1}$ ) can be written as:

{\tilde {w}}_{i}=\beta _{0i}{\otimes }1+\beta _{1i}{\otimes }\lambda +\cdots +\beta _{(n-1)i}{\otimes }\lambda ^{n-1}=\prod _{j\neq i}(1{\otimes }\lambda -\lambda _{j}{\otimes }1)

where $\beta _{ij}\in L$ are the coefficients of the scaled Lagrange polynomial

{\frac {p(x)}{x-\lambda _{i}}}=\prod _{j\neq i}(x-\lambda _{j})=\beta _{0i}+\beta _{1i}x+\cdots +\beta _{(n-1)i}x^{n-1}.

Notes

Horn, Roger A.; Johnson, Charles R. (1985). Matrix Analysis. Cambridge, UK: Cambridge University Press. pp. 146–147. ISBN 0-521-30586-1. Retrieved 2010-02-10.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
