Sylvester matrix

In mathematics, a Sylvester matrix is a matrix associated to two univariate polynomials with coefficients in a field or a commutative ring. The entries of the Sylvester matrix of two polynomials are coefficients of the polynomials. The determinant of the Sylvester matrix of two polynomials is their resultant, which is zero when the two polynomials have a common root (in the case of coefficients in a field) or a non-constant common divisor (in the case of coefficients in an integral domain).

Sylvester matrices are named after James Joseph Sylvester.

Definition

Formally, let p and q be two nonzero polynomials, respectively of degree m and n. Thus:

$$p(z)=p_0+p_1z+p_2z^2+\cdots+p_mz^m,\qquad q(z)=q_0+q_1z+q_2z^2+\cdots+q_nz^n.$$

The Sylvester matrix associated to p and q is then the $(n+m)\times(n+m)$ matrix constructed as follows:

- if n > 0, the first row is $(p_m \; p_{m-1} \;\cdots\; p_1 \; p_0 \; 0 \;\cdots\; 0)$;
- the second row is the first row, shifted one column to the right; the first element of the row is zero;
- the following n − 2 rows are obtained the same way, shifting the coefficients one column to the right each time and setting the other entries of the row to zero;
- if m > 0, the (n + 1)th row is $(q_n \; q_{n-1} \;\cdots\; q_1 \; q_0 \; 0 \;\cdots\; 0)$;
- the following rows are obtained the same way as before.

Thus, if m = 4 and n = 3, the matrix is:

$$S_{p,q}=\begin{pmatrix}p_4&p_3&p_2&p_1&p_0&0&0\\0&p_4&p_3&p_2&p_1&p_0&0\\0&0&p_4&p_3&p_2&p_1&p_0\\q_3&q_2&q_1&q_0&0&0&0\\0&q_3&q_2&q_1&q_0&0&0\\0&0&q_3&q_2&q_1&q_0&0\\0&0&0&q_3&q_2&q_1&q_0\end{pmatrix}.$$

If one of the degrees is zero (that is, the corresponding polynomial is a nonzero constant polynomial), then there are zero rows consisting of coefficients of the other polynomial, and the Sylvester matrix is a diagonal matrix whose dimension is the degree of the non-constant polynomial, with all diagonal entries equal to the constant polynomial. If m = n = 0, then the Sylvester matrix is the empty matrix with zero rows and zero columns.
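As a concrete illustration, here is a minimal Python sketch of this construction (the helper name sylvester and the NumPy representation are our own choices for illustration, not part of the original article; coefficients are listed from the highest-degree term down):

```python
import numpy as np

def sylvester(p, q):
    """Sylvester matrix of p and q, given as coefficient lists ordered
    from the highest-degree term down to the constant term."""
    m, n = len(p) - 1, len(q) - 1      # degrees of p and q
    S = np.zeros((m + n, m + n))
    for i in range(n):                 # n shifted rows of p's coefficients
        S[i, i:i + m + 1] = p
    for i in range(m):                 # m shifted rows of q's coefficients
        S[n + i, i:i + n + 1] = q
    return S

# p = x^2 - 1 = (x - 1)(x + 1) and q = x - 1 share the root x = 1,
# so the determinant (the resultant) vanishes.
print(np.linalg.det(sylvester([1, 0, -1], [1, -1])))  # ~0.0
```

The degenerate cases discussed above fall out correctly: for n = 0 the loop over p's rows is empty, leaving the diagonal matrix $q_0 I_m$, and for m = n = 0 the function returns the empty 0 × 0 matrix.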

A variant

The above defined Sylvester matrix appears in a Sylvester paper of 1840. In a paper of 1853, Sylvester introduced the following matrix, which is, up to a permutation of the rows, the Sylvester matrix of p and q, which are both considered as having degree max(m, n). [1] This is thus a $2\max(m,n)\times 2\max(m,n)$ matrix containing $\max(m,n)$ pairs of rows. Assuming $m \ge n$, it is obtained as follows:

- the first pair of rows is $(p_m \; p_{m-1} \;\cdots\; p_0 \; 0 \;\cdots\; 0)$ and $(0 \;\cdots\; 0 \; q_n \;\cdots\; q_0 \; 0 \;\cdots\; 0)$, where the second row begins with $m-n$ zeros;
- each of the following pairs of rows is the preceding pair shifted one column to the right.

Thus, if m = 4 and n = 3, the matrix is:

$$S_{p,q}=\begin{pmatrix}p_4&p_3&p_2&p_1&p_0&0&0&0\\0&q_3&q_2&q_1&q_0&0&0&0\\0&p_4&p_3&p_2&p_1&p_0&0&0\\0&0&q_3&q_2&q_1&q_0&0&0\\0&0&p_4&p_3&p_2&p_1&p_0&0\\0&0&0&q_3&q_2&q_1&q_0&0\\0&0&0&p_4&p_3&p_2&p_1&p_0\\0&0&0&0&q_3&q_2&q_1&q_0\end{pmatrix}.$$

The determinant of the 1853 matrix is, up to sign, the product of the determinant of the Sylvester matrix (which is called the resultant of p and q) by $p_m^{\,m-n}$ (still supposing $m \ge n$).
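Along the same lines, a minimal sketch of the 1853 construction, reusing the illustrative sylvester helper from the Definition section to check the determinant relation numerically:

```python
def sylvester_1853(p, q):
    """1853 variant: both polynomials treated as having degree
    d = max(m, n); the d row pairs (p-row, q-row) shift in lockstep."""
    d = max(len(p), len(q)) - 1
    p = [0] * (d + 1 - len(p)) + list(p)   # pad to degree d with leading zeros
    q = [0] * (d + 1 - len(q)) + list(q)
    S = np.zeros((2 * d, 2 * d))
    for i in range(d):
        S[2 * i, i:i + d + 1] = p          # p-row of pair i, shifted i columns
        S[2 * i + 1, i:i + d + 1] = q      # q-row of pair i, shifted i columns
    return S

p, q = [1, 2, 3, 4, 5], [6, 7, 8, 9]       # m = 4, n = 3, p_m = 1
# With p_m = 1 the two determinants agree up to sign:
print(np.linalg.det(sylvester_1853(p, q)), np.linalg.det(sylvester(p, q)))
```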

Applications

These matrices are used in commutative algebra, e.g., to test whether two polynomials have a (non-constant) common factor: this is the case if and only if the determinant of the associated Sylvester matrix (which is called the resultant of the two polynomials) equals zero.
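For instance, the test is a one-liner in SymPy, whose resultant function computes this quantity:

```python
from sympy import symbols, resultant

x = symbols('x')
p = x**2 - 1            # (x - 1)(x + 1)
q = x**2 - 3*x + 2      # (x - 1)(x - 2)
print(resultant(p, q))  # 0, because x - 1 divides both polynomials
```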

The solutions of the simultaneous linear equations

$$S_{p,q}^{\mathrm T}\begin{pmatrix}x\\y\end{pmatrix}=0,$$

where $x$ is a vector of size $n$ and $y$ is a vector of size $m$, comprise the coefficient vectors of those and only those pairs $(x, y)$ of polynomials (of degrees $n-1$ and $m-1$, respectively) which fulfill

$$x(z)\,p(z)+y(z)\,q(z)=0,$$

where polynomial multiplication and addition is used. This means the kernel of the transposed Sylvester matrix gives all solutions of the Bézout equation where $\deg x<\deg q$ and $\deg y<\deg p$.
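A numerical sketch of this kernel computation, again with the illustrative sylvester helper from above (scipy.linalg.null_space returns an orthonormal basis of the kernel):

```python
from scipy.linalg import null_space

p = [1, 0, -1]                       # p = x^2 - 1, m = 2
q = [1, -1]                          # q = x - 1,   n = 1
K = null_space(sylvester(p, q).T)    # kernel basis of the transposed matrix
xv, yv = K[:1, 0], K[1:, 0]          # first n entries -> x, last m -> y
# x(z) p(z) + y(z) q(z) is the zero polynomial, up to rounding:
print(np.polyadd(np.polymul(xv, p), np.polymul(yv, q)))  # ~[0. 0. 0.]
```

Here one kernel vector is proportional to (1, −1, −1), i.e. x(z) = 1 and y(z) = −z − 1, and indeed $1\cdot(z^2-1)+(-z-1)(z-1)=0$.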

Consequently the rank of the Sylvester matrix determines the degree of the greatest common divisor of p and q:

$$\deg(\gcd(p,q))=m+n-\operatorname{rank}(S_{p,q}).$$

Moreover, the coefficients of this greatest common divisor may be expressed as determinants of submatrices of the Sylvester matrix (see Subresultant).
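A quick numerical illustration of the rank formula, once more with the illustrative sylvester helper (floating-point rank estimation is fragile for ill-conditioned inputs; exact arithmetic is preferable in practice):

```python
p = [1, -3, 2]         # p = (x - 1)(x - 2),    m = 2
q = [1, -4, 5, -2]     # q = (x - 1)^2 (x - 2), n = 3
m, n = len(p) - 1, len(q) - 1
rank = np.linalg.matrix_rank(sylvester(p, q))
print(m + n - rank)    # 2 = deg gcd(p, q) = deg (x - 1)(x - 2)
```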

Related Research Articles

In mathematics, the determinant is a scalar value that is a function of the entries of a square matrix. It allows characterizing some properties of the matrix and the linear map represented by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible, and the linear map represented by the matrix is an isomorphism. The determinant of a product of matrices is the product of their determinants. The determinant of a matrix A is denoted det(A), det A, or |A|.

In mathematics, the discriminant of a polynomial is a quantity that depends on the coefficients and determines various properties of the roots. It is generally defined as a polynomial function of the coefficients of the original polynomial. The discriminant is widely used in polynomial factoring, number theory, and algebraic geometry. It is often denoted by the symbol $\Delta$.

In arithmetic and computer programming, the extended Euclidean algorithm is an extension to the Euclidean algorithm, and computes, in addition to the greatest common divisor (gcd) of integers a and b, also the coefficients of Bézout's identity, which are integers x and y such that

$$ax+by=\gcd(a,b).$$
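A compact recursive sketch of this algorithm in Python (a standard textbook formulation, added here for illustration):

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

print(extended_gcd(240, 46))  # (2, -9, 47): 240*(-9) + 46*47 == 2
```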

In linear algebra, the Cayley–Hamilton theorem states that every square matrix over a commutative ring satisfies its own characteristic equation.

In linear algebra, the characteristic polynomial of a square matrix is a polynomial which is invariant under matrix similarity and has the eigenvalues as roots. It has the determinant and the trace of the matrix among its coefficients. The characteristic polynomial of an endomorphism of a finite-dimensional vector space is the characteristic polynomial of the matrix of the endomorphism over any basis; it does not depend on the choice of a basis. The characteristic equation, also known as the determinantal equation, is the equation obtained by equating the characteristic polynomial to zero.

In mathematics, a quadratic form is a polynomial with terms all of degree two. For example,

$$4x^2+2xy-3y^2$$

is a quadratic form in the variables x and y.

In linear algebra, a Vandermonde matrix, named after Alexandre-Théophile Vandermonde, is a matrix with the terms of a geometric progression in each row, i.e., an m × n matrix

$$V=\begin{pmatrix}1&x_1&x_1^2&\cdots&x_1^{n-1}\\1&x_2&x_2^2&\cdots&x_2^{n-1}\\\vdots&\vdots&\vdots&\ddots&\vdots\\1&x_m&x_m^2&\cdots&x_m^{n-1}\end{pmatrix}.$$
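As a quick illustration, NumPy's np.vander builds exactly this shape (increasing=True orders the powers as above):

```python
import numpy as np

# Each row is the geometric progression 1, x, x^2, x^3 for x = 2, 3, 5
print(np.vander([2, 3, 5], 4, increasing=True))
# [[  1   2   4   8]
#  [  1   3   9  27]
#  [  1   5  25 125]]
```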

In mathematics, specifically linear algebra, the Cauchy–Binet formula, named after Augustin-Louis Cauchy and Jacques Philippe Marie Binet, is an identity for the determinant of the product of two rectangular matrices of transpose shapes. It generalizes the statement that the determinant of a product of square matrices is equal to the product of their determinants. The formula is valid for matrices with the entries from any commutative ring.

In mathematics, especially in the field of algebra, a polynomial ring or polynomial algebra is a ring formed from the set of polynomials in one or more indeterminates with coefficients in another ring, often a field.

In mathematics, the determinant of a skew-symmetric matrix can always be written as the square of a polynomial in the matrix entries, a polynomial with integer coefficients that only depend on the size of the matrix. The value of this polynomial, when applied to the coefficients of a skew-symmetric matrix, is called the Pfaffian of that matrix. The term Pfaffian was introduced by Cayley (1852) who indirectly named them after Johann Friedrich Pfaff. The Pfaffian is nonvanishing only for 2n × 2n skew-symmetric matrices, in which case it is a polynomial of degree n.

A Savitzky–Golay filter is a digital filter that can be applied to a set of digital data points for the purpose of smoothing the data, that is, to increase the precision of the data without distorting the signal tendency. This is achieved, in a process known as convolution, by fitting successive sub-sets of adjacent data points with a low-degree polynomial by the method of linear least squares. When the data points are equally spaced, an analytical solution to the least-squares equations can be found, in the form of a single set of "convolution coefficients" that can be applied to all data sub-sets, to give estimates of the smoothed signal, at the central point of each sub-set. The method, based on established mathematical procedures, was popularized by Abraham Savitzky and Marcel J. E. Golay, who published tables of convolution coefficients for various polynomials and sub-set sizes in 1964. Some errors in the tables have been corrected. The method has been extended for the treatment of 2- and 3-dimensional data.

In mathematics, a Bézout matrix is a special square matrix associated with two polynomials, introduced by James Joseph Sylvester (1853) and Arthur Cayley (1857) and named after Étienne Bézout. Bézoutian may also refer to the determinant of this matrix, which is equal to the resultant of the two polynomials. Bézout matrices are sometimes used to test the stability of a given polynomial.

In mathematics, the resultant of two polynomials is a polynomial expression of their coefficients, which is equal to zero if and only if the polynomials have a common root, or, equivalently, a common factor. In some older texts, the resultant is also called the eliminant.

In mathematics, the Schwartz–Zippel lemma is a tool commonly used in probabilistic polynomial identity testing, i.e. in the problem of determining whether a given multivariate polynomial is the 0-polynomial. It was discovered independently by Jack Schwartz, Richard Zippel, and Richard DeMillo and Richard J. Lipton, although DeMillo and Lipton's version was shown a year prior to Schwartz and Zippel's result. The finite field version of this bound was proved by Øystein Ore in 1922.

In mathematics, the degree of a polynomial is the highest of the degrees of the polynomial's monomials with non-zero coefficients. The degree of a term is the sum of the exponents of the variables that appear in it, and thus is a non-negative integer. For a univariate polynomial, the degree of the polynomial is simply the highest exponent occurring in the polynomial. The term order has been used as a synonym of degree but, nowadays, may refer to several other concepts.

In mathematics, a polynomial matrix or matrix of polynomials is a matrix whose elements are univariate or multivariate polynomials. Equivalently, a polynomial matrix is a polynomial whose coefficients are matrices.

In algebra, the greatest common divisor of two polynomials is a polynomial, of the highest possible degree, that is a factor of both of the original polynomials. This concept is analogous to the greatest common divisor of two integers.

In coding theory, list decoding is an alternative to unique decoding of error-correcting codes in the presence of many errors. If a code has relative distance $\delta$, then it is possible in principle to recover an encoded message when up to a $\delta/2$ fraction of the codeword symbols are corrupted. But when the error rate is greater than $\delta/2$, this will not in general be possible. List decoding overcomes that issue by allowing the decoder to output a short list of messages that might have been encoded. List decoding can correct more than a $\delta/2$ fraction of errors.

In mathematics, Manin matrices, named after Yuri Manin who introduced them around 1987–88, are a class of matrices with elements in a not-necessarily commutative ring, which in a certain sense behave like matrices whose elements commute. In particular, there is a natural definition of the determinant for them, and most linear algebra theorems, such as Cramer's rule and the Cayley–Hamilton theorem, hold true for them. Any matrix with commuting elements is a Manin matrix. These matrices have applications in representation theory, in particular to Capelli's identity, Yangians, and quantum integrable systems.

In mathematics, a P-recursive equation can be solved for polynomial solutions. Sergei A. Abramov in 1989 and Marko Petkovšek in 1992 described an algorithm which finds all polynomial solutions of those recurrence equations with polynomial coefficients. The algorithm computes a degree bound for the solution in a first step. In a second step, an ansatz for a polynomial of this degree is used, and the unknown coefficients are computed from a system of linear equations.

References

  1. Akritas, A. G., Malaschonok, G. I., Vigklas, P. S.: Sturm Sequences and Modified Subresultant Polynomial Remainder Sequences. Serdica Journal of Computing, Vol. 8, No. 1, 29–46, 2014.