Elementary symmetric polynomial

Last updated

In mathematics, specifically in commutative algebra, the elementary symmetric polynomials are one type of basic building block for symmetric polynomials, in the sense that any symmetric polynomial can be expressed as a polynomial in elementary symmetric polynomials. That is, any symmetric polynomial P is given by an expression involving only additions and multiplication of constants and elementary symmetric polynomials. There is one elementary symmetric polynomial of degree d in n variables for each positive integer dn, and it is formed by adding together all distinct products of d distinct variables.

Contents

Definition

The elementary symmetric polynomials in n variables X1, ..., Xn, written ek(X1, ..., Xn) for k = 1, ..., n, are defined by

and so forth, ending with

In general, for k ≥ 0 we define

so that ek(X1, ..., Xn) = 0 if k > n. (Sometimes, 1 = e0(X1, ..., Xn) is included among the elementary symmetric polynomials, but excluding it allows generally simpler formulation of results and properties.)

Thus, for each positive integer k less than or equal to n there exists exactly one elementary symmetric polynomial of degree k in n variables. To form the one that has degree k, we take the sum of all products of k-subsets of the n variables. (By contrast, if one performs the same operation using multisets of variables, that is, taking variables with repetition, one arrives at the complete homogeneous symmetric polynomials.)

Given an integer partition (that is, a finite non-increasing sequence of positive integers) λ = (λ1, ..., λm), one defines the symmetric polynomial eλ(X1, ..., Xn), also called an elementary symmetric polynomial, by

.

Sometimes the notation σk is used instead of ek.

Examples

The following lists the n elementary symmetric polynomials for the first four positive values of n.

For n = 1:

For n = 2:

For n = 3:

For n = 4:

Properties

The elementary symmetric polynomials appear when we expand a linear factorization of a monic polynomial: we have the identity

That is, when we substitute numerical values for the variables X1, X2, ..., Xn, we obtain the monic univariate polynomial (with variable λ) whose roots are the values substituted for X1, X2, ..., Xn and whose coefficients are – up to their sign – the elementary symmetric polynomials. These relations between the roots and the coefficients of a polynomial are called Vieta's formulas.

The characteristic polynomial of a square matrix is an example of application of Vieta's formulas. The roots of this polynomial are the eigenvalues of the matrix. When we substitute these eigenvalues into the elementary symmetric polynomials, we obtain – up to their sign – the coefficients of the characteristic polynomial, which are invariants of the matrix. In particular, the trace (the sum of the elements of the diagonal) is the value of e1, and thus the sum of the eigenvalues. Similarly, the determinant is – up to the sign – the constant term of the characteristic polynomial, i.e. the value of en. Thus the determinant of a square matrix is the product of the eigenvalues.

The set of elementary symmetric polynomials in n variables generates the ring of symmetric polynomials in n variables. More specifically, the ring of symmetric polynomials with integer coefficients equals the integral polynomial ring [e1(X1, ..., Xn), ..., en(X1, ..., Xn)]. (See below for a more general statement and proof.) This fact is one of the foundations of invariant theory. For another system of symmetric polynomials with the same property see Complete homogeneous symmetric polynomials, and for a system with a similar, but slightly weaker, property see Power sum symmetric polynomial.

Fundamental theorem of symmetric polynomials

For any commutative ring A, denote the ring of symmetric polynomials in the variables X1, ..., Xn with coefficients in A by A[X1, ..., Xn]Sn. This is a polynomial ring in the n elementary symmetric polynomials ek(X1, ..., Xn) for k = 1, ..., n.

This means that every symmetric polynomial P(X1, ..., Xn) ∈ A[X1, ..., Xn]Sn has a unique representation

for some polynomial QA[Y1, ..., Yn]. Another way of saying the same thing is that the ring homomorphism that sends Yk to ek(X1, ..., Xn) for k = 1, ..., n defines an isomorphism between A[Y1, ..., Yn] and A[X1, ..., Xn]Sn.

Proof sketch

The theorem may be proved for symmetric homogeneous polynomials by a double induction with respect to the number of variables n and, for fixed n, with respect to the degree of the homogeneous polynomial. The general case then follows by splitting an arbitrary symmetric polynomial into its homogeneous components (which are again symmetric).

In the case n = 1 the result is trivial because every polynomial in one variable is automatically symmetric.

Assume now that the theorem has been proved for all polynomials for m < n variables and all symmetric polynomials in n variables with degree < d. Every homogeneous symmetric polynomial P in A[X1, ..., Xn]Sn can be decomposed as a sum of homogeneous symmetric polynomials

Here the "lacunary part" Placunary is defined as the sum of all monomials in P which contain only a proper subset of the n variables X1, ..., Xn, i.e., where at least one variable Xj is missing.

Because P is symmetric, the lacunary part is determined by its terms containing only the variables X1, ..., Xn − 1, i.e., which do not contain Xn. More precisely: If A and B are two homogeneous symmetric polynomials in X1, ..., Xn having the same degree, and if the coefficient of A before each monomial which contains only the variables X1, ..., Xn − 1 equals the corresponding coefficient of B, then A and B have equal lacunary parts. (This is because every monomial which can appear in a lacunary part must lack at least one variable, and thus can be transformed by a permutation of the variables into a monomial which contains only the variables X1, ..., Xn − 1.)

But the terms of P which contain only the variables X1, ..., Xn − 1 are precisely the terms that survive the operation of setting Xn to 0, so their sum equals P(X1, ..., Xn − 1, 0), which is a symmetric polynomial in the variables X1, ..., Xn − 1 that we shall denote by (X1, ..., Xn − 1). By the inductive hypothesis, this polynomial can be written as

for some . Here the doubly indexed σj,n − 1 denote the elementary symmetric polynomials in n − 1 variables.

Consider now the polynomial

Then R(X1, ..., Xn) is a symmetric polynomial in X1, ..., Xn, of the same degree as Placunary, which satisfies

(the first equality holds because setting Xn to 0 in σj,n gives σj,n − 1, for all j < n). In other words, the coefficient of R before each monomial which contains only the variables X1, ..., Xn − 1 equals the corresponding coefficient of P. As we know, this shows that the lacunary part of R coincides with that of the original polynomial P. Therefore the difference PR has no lacunary part, and is therefore divisible by the product X1···Xn of all variables, which equals the elementary symmetric polynomial σn,n. Then writing PR = σn,nQ, the quotient Q is a homogeneous symmetric polynomial of degree less than d (in fact degree at most dn) which by the inductive hypothesis can be expressed as a polynomial in the elementary symmetric functions. Combining the representations for PR and R one finds a polynomial representation for P.

The uniqueness of the representation can be proved inductively in a similar way. (It is equivalent to the fact that the n polynomials e1, ..., en are algebraically independent over the ring A.) The fact that the polynomial representation is unique implies that A[X1, ..., Xn]Sn is isomorphic to A[Y1, ..., Yn].

Alternative proof

The following proof is also inductive, but does not involve other polynomials than those symmetric in X1, ..., Xn, and also leads to a fairly direct procedure to effectively write a symmetric polynomial as a polynomial in the elementary symmetric ones. Assume the symmetric polynomial to be homogeneous of degree d; different homogeneous components can be decomposed separately. Order the monomials in the variables Xi lexicographically, where the individual variables are ordered X1 > ... > Xn, in other words the dominant term of a polynomial is one with the highest occurring power of X1, and among those the one with the highest power of X2, etc. Furthermore parametrize all products of elementary symmetric polynomials that have degree d (they are in fact homogeneous) as follows by partitions of d. Order the individual elementary symmetric polynomials ei(X1, ..., Xn) in the product so that those with larger indices i come first, then build for each such factor a column of i boxes, and arrange those columns from left to right to form a Young diagram containing d boxes in all. The shape of this diagram is a partition of d, and each partition λ of d arises for exactly one product of elementary symmetric polynomials, which we shall denote by eλt(X1, ..., Xn) (the t is present only because traditionally this product is associated to the transpose partition of λ). The essential ingredient of the proof is the following simple property, which uses multi-index notation for monomials in the variables Xi.

Lemma. The leading term of eλt(X1, ..., Xn) is Xλ.

Proof. The leading term of the product is the product of the leading terms of each factor (this is true whenever one uses a monomial order, like the lexicographic order used here), and the leading term of the factor ei(X1, ..., Xn) is clearly X1X2···Xi. To count the occurrences of the individual variables in the resulting monomial, fill the column of the Young diagram corresponding to the factor concerned with the numbers 1, ..., i of the variables, then all boxes in the first row contain 1, those in the second row 2, and so forth, which means the leading term is Xλ.

Now one proves by induction on the leading monomial in lexicographic order, that any nonzero homogeneous symmetric polynomial P of degree d can be written as polynomial in the elementary symmetric polynomials. Since P is symmetric, its leading monomial has weakly decreasing exponents, so it is some Xλ with λ a partition of d. Let the coefficient of this term be c, then Pceλt(X1, ..., Xn) is either zero or a symmetric polynomial with a strictly smaller leading monomial. Writing this difference inductively as a polynomial in the elementary symmetric polynomials, and adding back ceλt(X1, ..., Xn) to it, one obtains the sought for polynomial expression for P.

The fact that this expression is unique, or equivalently that all the products (monomials) eλt(X1, ..., Xn) of elementary symmetric polynomials are linearly independent, is also easily proved. The lemma shows that all these products have different leading monomials, and this suffices: if a nontrivial linear combination of the eλt(X1, ..., Xn) were zero, one focuses on the contribution in the linear combination with nonzero coefficient and with (as polynomial in the variables Xi) the largest leading monomial; the leading term of this contribution cannot be cancelled by any other contribution of the linear combination, which gives a contradiction.

See also

Related Research Articles

In mathematics, the discriminant of a polynomial is a quantity that depends on the coefficients and allows deducing some properties of the roots without computing them. More precisely, it is a polynomial function of the coefficients of the original polynomial. The discriminant is widely used in polynomial factoring, number theory, and algebraic geometry.

In mathematics, a quadratic form is a polynomial with terms all of degree two. For example,

In algebra and in particular in algebraic combinatorics, the ring of symmetric functions is a specific limit of the rings of symmetric polynomials in n indeterminates, as n goes to infinity. This ring serves as universal structure in which relations between symmetric polynomials can be expressed in a way independent of the number n of indeterminates. Among other things, this ring plays an important role in the representation theory of the symmetric group.

<span class="mw-page-title-main">Polynomial ring</span> Algebraic structure

In mathematics, especially in the field of algebra, a polynomial ring or polynomial algebra is a ring formed from the set of polynomials in one or more indeterminates with coefficients in another ring, often a field.

In mathematics, a symmetric polynomial is a polynomial P(X1, X2, …, Xn) in n variables, such that if any of the variables are interchanged, one obtains the same polynomial. Formally, P is a symmetric polynomial if for any permutation σ of the subscripts 1, 2, ..., n one has P(Xσ(1), Xσ(2), …, Xσ(n)) = P(X1, X2, …, Xn).

In mathematics, a homogeneous polynomial, sometimes called quantic in older texts, is a polynomial whose nonzero terms all have the same degree. For example, is a homogeneous polynomial of degree 5, in two variables; the sum of the exponents in each term is always 5. The polynomial is not homogeneous, because the sum of exponents does not match from term to term. The function defined by a homogeneous polynomial is always a homogeneous function.

In mathematics, the resultant of two polynomials is a polynomial expression of their coefficients that is equal to zero if and only if the polynomials have a common root, or, equivalently, a common factor. In some older texts, the resultant is also called the eliminant.

In mathematics, Newton's identities, also known as the Girard–Newton formulae, give relations between two types of symmetric polynomials, namely between power sums and elementary symmetric polynomials. Evaluated at the roots of a monic polynomial P in one variable, they allow expressing the sums of the k-th powers of all roots of P in terms of the coefficients of P, without actually finding those roots. These identities were found by Isaac Newton around 1666, apparently in ignorance of earlier work (1629) by Albert Girard. They have applications in many areas of mathematics, including Galois theory, invariant theory, group theory, combinatorics, as well as further applications outside mathematics, including general relativity.

In mathematics, Schur polynomials, named after Issai Schur, are certain symmetric polynomials in n variables, indexed by partitions, that generalize the elementary symmetric polynomials and the complete homogeneous symmetric polynomials. In representation theory they are the characters of polynomial irreducible representations of the general linear groups. The Schur polynomials form a linear basis for the space of all symmetric polynomials. Any product of Schur polynomials can be written as a linear combination of Schur polynomials with non-negative integral coefficients; the values of these coefficients is given combinatorially by the Littlewood–Richardson rule. More generally, skew Schur polynomials are associated with pairs of partitions and have similar properties to Schur polynomials.

In mathematics, the Jack function is a generalization of the Jack polynomial, introduced by Henry Jack. The Jack polynomial is a homogeneous, symmetric polynomial which generalizes the Schur and zonal polynomials, and is in turn generalized by the Heckman–Opdam polynomials and Macdonald polynomials.

In mathematics, specifically in algebraic combinatorics and commutative algebra, the complete homogeneous symmetric polynomials are a specific kind of symmetric polynomials. Every symmetric polynomial can be expressed as a polynomial expression in complete homogeneous symmetric polynomials.

In mathematics, specifically in commutative algebra, the power sum symmetric polynomials are a type of basic building block for symmetric polynomials, in the sense that every symmetric polynomial with rational coefficients can be expressed as a sum and difference of products of power sum symmetric polynomials with rational coefficients. However, not every symmetric polynomial with integral coefficients is generated by integral combinations of products of power-sum polynomials: they are a generating set over the rationals, but not over the integers.

In mathematics, Macdonald polynomialsPλ(x; t,q) are a family of orthogonal symmetric polynomials in several variables, introduced by Macdonald in 1987. He later introduced a non-symmetric generalization in 1995. Macdonald originally associated his polynomials with weights λ of finite root systems and used just one variable t, but later realized that it is more natural to associate them with affine root systems rather than finite root systems, in which case the variable t can be replaced by several different variables t=(t1,...,tk), one for each of the k orbits of roots in the affine root system. The Macdonald polynomials are polynomials in n variables x=(x1,...,xn), where n is the rank of the affine root system. They generalize many other families of orthogonal polynomials, such as Jack polynomials and Hall–Littlewood polynomials and Askey–Wilson polynomials, which in turn include most of the named 1-variable orthogonal polynomials as special cases. Koornwinder polynomials are Macdonald polynomials of certain non-reduced root systems. They have deep relationships with affine Hecke algebras and Hilbert schemes, which were used to prove several conjectures made by Macdonald about them.

In mathematics, the Hall–Littlewood polynomials are symmetric functions depending on a parameter t and a partition λ. They are Schur functions when t is 0 and monomial symmetric functions when t is 1 and are special cases of Macdonald polynomials. They were first defined indirectly by Philip Hall using the Hall algebra, and later defined directly by Dudley E. Littlewood (1961).

In algebra, a multivariate polynomial

In algebra and in particular in algebraic combinatorics, a quasisymmetric function is any element in the ring of quasisymmetric functions which is in turn a subring of the formal power series ring with a countable number of variables. This ring generalizes the ring of symmetric functions. This ring can be realized as a specific limit of the rings of quasisymmetric polynomials in n variables, as n goes to infinity. This ring serves as universal structure in which relations between quasisymmetric polynomials can be expressed in a way independent of the number n of variables.

In algebra, a λ-ring or lambda ring is a commutative ring together with some operations λn on it that behave like the exterior powers of vector spaces. Many rings considered in K-theory carry a natural λ-ring structure. λ-rings also provide a powerful formalism for studying an action of the symmetric functions on the ring of polynomials, recovering and extending many classical results.

In mathematics, the ring of polynomial functions on a vector space V over a field k gives a coordinate-free analog of a polynomial ring. It is denoted by k[V]. If V is finite dimensional and is viewed as an algebraic variety, then k[V] is precisely the coordinate ring of V.

In combinatorial mathematics, the hook length formula is a formula for the number of standard Young tableaux whose shape is a given Young diagram. It has applications in diverse areas such as representation theory, probability, and algorithm analysis; for example, the problem of longest increasing subsequences. A related formula gives the number of semi-standard Young tableaux, which is a specialization of a Schur polynomial.

Plethystic substitution is a shorthand notation for a common kind of substitution in the algebra of symmetric functions and that of symmetric polynomials. It is essentially basic substitution of variables, but allows for a change in the number of variables used.

References