Cramer's rule


In linear algebra, Cramer's rule is an explicit formula for the solution of a system of linear equations with as many equations as unknowns, valid whenever the system has a unique solution. It expresses the solution in terms of the determinants of the (square) coefficient matrix and of matrices obtained from it by replacing one column by the column vector of right-hand sides of the equations. It is named after Gabriel Cramer, who published the rule for an arbitrary number of unknowns in 1750, [1] [2] although Colin Maclaurin also published special cases of the rule in 1748, [3] and possibly knew of it as early as 1729. [4] [5] [6]


Cramer's rule implemented in a naive way is computationally inefficient for systems of more than two or three equations. [7] In the case of n equations in n unknowns, it requires computation of n + 1 determinants, while Gaussian elimination produces the result with the same computational complexity as the computation of a single determinant. [8] [9] Cramer's rule can also be numerically unstable even for 2×2 systems. [10] However, Cramer's rule can be implemented with the same complexity as Gaussian elimination; [11] [12] such an implementation consistently requires twice as many arithmetic operations and has the same numerical stability when the same permutation matrices are applied.

General case

Consider a system of n linear equations for n unknowns, represented in matrix multiplication form as follows:

$$A\mathbf{x} = \mathbf{b}$$

where the n × n matrix A has a nonzero determinant, and the vector $\mathbf{x} = (x_1, \ldots, x_n)^{\mathsf{T}}$ is the column vector of the variables. Then the theorem states that in this case the system has a unique solution, whose individual values for the unknowns are given by:

$$x_i = \frac{\det(A_i)}{\det(A)}, \qquad i = 1, \ldots, n$$

where $A_i$ is the matrix formed by replacing the i-th column of A by the column vector b.
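
The statement above translates almost directly into code. The following is a minimal sketch, not an efficient solver: it assumes integer (or `Fraction`) entries so that the arithmetic is exact, and it computes determinants by Laplace expansion, which is only practical for small n. The names `det` and `cramer_solve` and the sample system are illustrative choices, not part of the theorem.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: drop row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def cramer_solve(a, b):
    """Solve a x = b by Cramer's rule; a is a list of rows, b a list.

    Assumes integer or Fraction entries so that the division is exact.
    """
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    solution = []
    for i in range(len(a)):
        # A_i: copy of A with column i replaced by b.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


# Hypothetical sample system with det(A) = -1.
a = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = cramer_solve(a, b)
print(x)
```

Each unknown costs one extra determinant, which is why this direct form is rarely used beyond very small systems.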

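A more general version of Cramer's rule [13] considers the matrix equation

$$AX = B$$

where the n × n matrix A has a nonzero determinant, and X, B are n × m matrices. Given sequences $1 \le i_1 < i_2 < \cdots < i_k \le n$ and $1 \le j_1 < j_2 < \cdots < j_k \le m$, let $X_{I,J}$ be the k × k submatrix of X with rows in $I := (i_1, \ldots, i_k)$ and columns in $J := (j_1, \ldots, j_k)$. Let $A_B(I,J)$ be the n × n matrix formed by replacing the $i_s$ column of A by the $j_s$ column of B, for all $s = 1, \ldots, k$. Then

$$\det X_{I,J} = \frac{\det(A_B(I,J))}{\det(A)}.$$

In the case $k = 1$, this reduces to the normal Cramer's rule.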

The rule holds for systems of equations with coefficients and unknowns in any field, not just in the real numbers.

Proof

The proof for Cramer's rule uses the following properties of the determinants: linearity with respect to any given column and the fact that the determinant is zero whenever two columns are equal, which is implied by the property that the sign of the determinant flips if you switch two columns.

Fix the index j of a column, and consider that the entries of the other columns have fixed values. This makes the determinant a function of the entries of the jth column. Linearity with respect to this column means that this function has the form

$$D_j(a_{1,j}, \ldots, a_{n,j}) = C_{1,j}a_{1,j} + \cdots + C_{n,j}a_{n,j},$$

where the $C_{i,j}$ are coefficients that depend on the entries of A that are not in column j. So, one has

$$\det(A) = D_j(a_{1,j}, \ldots, a_{n,j}) = C_{1,j}a_{1,j} + \cdots + C_{n,j}a_{n,j}.$$

(Laplace expansion provides a formula for computing the $C_{i,j}$ but their expression is not important here.)

If the function $D_j$ is applied to any other column k of A, then the result is the determinant of the matrix obtained from A by replacing column j by a copy of column k, so the resulting determinant is 0 (the case of two equal columns).

Now consider a system of n linear equations in n unknowns $x_1, \ldots, x_n$, whose coefficient matrix is A, with det(A) assumed to be nonzero:

$$\begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2 + \cdots + a_{1,n}x_n &= b_1 \\
a_{2,1}x_1 + a_{2,2}x_2 + \cdots + a_{2,n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n,1}x_1 + a_{n,2}x_2 + \cdots + a_{n,n}x_n &= b_n
\end{aligned}$$

If one combines these equations by taking $C_{1,j}$ times the first equation, plus $C_{2,j}$ times the second, and so forth until $C_{n,j}$ times the last, then for every k the resulting coefficient of $x_k$ becomes

$$D_j(a_{1,k}, \ldots, a_{n,k}).$$

So, all coefficients become zero, except the coefficient of $x_j$, which becomes $\det(A)$. Similarly, the constant coefficient becomes $D_j(b_1, \ldots, b_n)$, and the resulting equation is thus

$$\det(A)\, x_j = D_j(b_1, \ldots, b_n),$$

which gives the value of $x_j$ as

$$x_j = \frac{1}{\det(A)} D_j(b_1, \ldots, b_n).$$

As, by construction, the numerator is the determinant of the matrix obtained from A by replacing column j by b, we get the expression of Cramer's rule as a necessary condition for a solution.

It remains to prove that these values for the unknowns form a solution. Let M be the n × n matrix that has the coefficients of $D_j$ as jth row, for $j = 1, \ldots, n$ (this is the adjugate matrix for A). Expressed in matrix terms, we have thus to prove that

$$\mathbf{x} = \frac{1}{\det(A)} M \mathbf{b}$$

is a solution; that is, that

$$A\left(\frac{1}{\det(A)} M\right)\mathbf{b} = \mathbf{b}.$$

For that, it suffices to prove that

$$A\left(\frac{1}{\det(A)} M\right) = I_n,$$

where $I_n$ is the identity matrix.

The above properties of the functions $D_j$ show that one has $MA = \det(A) I_n$, and therefore,

$$\frac{1}{\det(A)} M = A^{-1}.$$

This completes the proof, since a left inverse of a square matrix is also a right inverse (see Invertible matrix theorem).

For other proofs, see below.

Finding inverse matrix

Let A be an n × n matrix with entries in a field F. Then

$$A\,\operatorname{adj}(A) = \operatorname{adj}(A)\,A = \det(A)\, I$$

where adj(A) denotes the adjugate matrix, det(A) is the determinant, and I is the identity matrix. If det(A) is nonzero, then the inverse matrix of A is

$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A).$$

This gives a formula for the inverse of A, provided det(A) ≠ 0. In fact, this formula works whenever F is a commutative ring, provided that det(A) is a unit. If det(A) is not a unit, then A is not invertible over the ring (it may be invertible over a larger ring in which some non-unit elements of F may be invertible).
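
As an illustration of the adjugate formula, the sketch below builds adj(A) as the transpose of the cofactor matrix and divides by det(A). It is a minimal example under stated assumptions: entries are integers or `Fraction` values, determinants are computed by Laplace expansion, and the helper names (`det`, `adjugate`, `inverse`) and the sample matrix are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def adjugate(m):
    """Transpose of the cofactor matrix of m."""
    n = len(m)
    if n == 1:
        return [[1]]

    def cofactor(i, j):
        # Signed determinant of m with row i and column j removed.
        minor = [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]
        return (-1) ** (i + j) * det(minor)

    # Entry (j, i) of adj(A) is the cofactor of entry (i, j) of A.
    return [[cofactor(i, j) for i in range(n)] for j in range(n)]


def inverse(m):
    """Inverse via A^{-1} = adj(A) / det(A), with exact rational arithmetic."""
    d = det(m)
    if d == 0:
        raise ValueError("det(A) = 0: A is not invertible")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(m)]


a = [[4, 7], [2, 6]]  # hypothetical example, det(A) = 10
a_inv = inverse(a)
print(a_inv)
```

Over the integers, the same code shows why det(A) must be a unit: the entries of `a_inv` above are not integers because 10 is not invertible in that ring.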

Applications

Explicit formulas for small systems

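Consider the linear system

$$\begin{aligned}
a_1 x + b_1 y &= c_1 \\
a_2 x + b_2 y &= c_2
\end{aligned}$$

which in matrix format is

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.$$

Assume $a_1 b_2 - b_1 a_2$ is nonzero. Then, with the help of determinants, x and y can be found with Cramer's rule as

$$x = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} = \frac{c_1 b_2 - b_1 c_2}{a_1 b_2 - b_1 a_2}, \qquad y = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} = \frac{a_1 c_2 - c_1 a_2}{a_1 b_2 - b_1 a_2}.$$

The rules for 3 × 3 matrices are similar. Given

$$\begin{aligned}
a_1 x + b_1 y + c_1 z &= d_1 \\
a_2 x + b_2 y + c_2 z &= d_2 \\
a_3 x + b_3 y + c_3 z &= d_3
\end{aligned}$$

which in matrix format is

$$\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}.$$

Then the values of x, y and z can be found as follows:

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \quad y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \quad z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}.$$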
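
For the 2 × 2 system $a_1 x + b_1 y = c_1$, $a_2 x + b_2 y = c_2$, the closed-form expressions can be coded without any general determinant routine. The sketch below assumes integer or `Fraction` coefficients so that results are exact; the function name `solve_2x2` and the sample numbers are illustrative only.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by the explicit 2x2 formulas."""
    denom = a1 * b2 - b1 * a2  # determinant of the coefficient matrix
    if denom == 0:
        raise ValueError("coefficient determinant is zero")
    x = Fraction(c1 * b2 - b1 * c2, denom)  # first column replaced by (c1, c2)
    y = Fraction(a1 * c2 - c1 * a2, denom)  # second column replaced by (c1, c2)
    return x, y


# Hypothetical example: 3x + 2y = 7 and x - y = -1.
x, y = solve_2x2(3, 2, 7, 1, -1, -1)
print(x, y)
```

With floating-point inputs the same formulas apply, but a near-zero denominator can amplify rounding errors, so exact arithmetic is used here.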

Differential geometry

Ricci calculus

Cramer's rule is used in the Ricci calculus in various calculations involving the Christoffel symbols of the first and second kind. [14]

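In particular, Cramer's rule can be used to prove that the divergence operator on a Riemannian manifold is invariant with respect to change of coordinates. We give a direct proof, suppressing the role of the Christoffel symbols. Let $(M, g)$ be a Riemannian manifold equipped with local coordinates $(x^1, x^2, \dots, x^n)$. Let $A = A^i \frac{\partial}{\partial x^i}$ be a vector field. We use the summation convention throughout.

Theorem.
The divergence of $A$,
$$\operatorname{div} A = \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i}\left(A^i \sqrt{\det g}\right),$$
is invariant under change of coordinates.
Proof

Let $(x^1, \ldots, x^n) \mapsto (\bar{x}^1, \ldots, \bar{x}^n)$ be a coordinate transformation with non-singular Jacobian. Then the classical transformation laws imply that $A = \bar{A}^k \frac{\partial}{\partial \bar{x}^k}$ where $\bar{A}^k = \frac{\partial \bar{x}^k}{\partial x^j} A^j$. Similarly, if $g = g_{mk}\, dx^m \otimes dx^k = \bar{g}_{ij}\, d\bar{x}^i \otimes d\bar{x}^j$, then $\bar{g}_{ij} = \frac{\partial x^m}{\partial \bar{x}^i} \frac{\partial x^k}{\partial \bar{x}^j} g_{mk}$. Writing this transformation law in terms of matrices yields $\bar{g} = \left(\frac{\partial x}{\partial \bar{x}}\right)^{\mathsf{T}} g \left(\frac{\partial x}{\partial \bar{x}}\right)$, which implies $\det \bar{g} = \left(\det\left(\frac{\partial x}{\partial \bar{x}}\right)\right)^2 \det g$.

Now one computes

$$\begin{aligned}
\operatorname{div} A &= \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i}\left(A^i \sqrt{\det g}\right) \\
&= \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{1}{\sqrt{\det \bar{g}}} \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial}{\partial \bar{x}^k}\left(\frac{\partial x^i}{\partial \bar{x}^\ell} \bar{A}^\ell \det\left(\frac{\partial x}{\partial \bar{x}}\right)^{-1} \sqrt{\det \bar{g}}\right).
\end{aligned}$$

In order to show that this equals $\frac{1}{\sqrt{\det \bar{g}}} \frac{\partial}{\partial \bar{x}^k}\left(\bar{A}^k \sqrt{\det \bar{g}}\right)$, it is necessary and sufficient to show that

$$\frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial}{\partial \bar{x}^k}\left(\frac{\partial x^i}{\partial \bar{x}^\ell} \det\left(\frac{\partial x}{\partial \bar{x}}\right)^{-1}\right) = 0 \qquad \text{for all } \ell,$$

which is equivalent to

$$\frac{\partial}{\partial \bar{x}^\ell} \det\left(\frac{\partial x}{\partial \bar{x}}\right) = \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{\partial \bar{x}^k}{\partial x^i} \frac{\partial^2 x^i}{\partial \bar{x}^k \partial \bar{x}^\ell}.$$

Carrying out the differentiation on the left-hand side, we get:

$$\begin{aligned}
\frac{\partial}{\partial \bar{x}^\ell} \det\left(\frac{\partial x}{\partial \bar{x}}\right) &= (-1)^{i+j} \frac{\partial^2 x^i}{\partial \bar{x}^\ell \partial \bar{x}^j} \det M(i|j) \\
&= \frac{\partial^2 x^i}{\partial \bar{x}^\ell \partial \bar{x}^j} \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{(-1)^{i+j}}{\det\left(\frac{\partial x}{\partial \bar{x}}\right)} \det M(i|j) = (\ast),
\end{aligned}$$

where $M(i|j)$ denotes the matrix obtained from $\left(\frac{\partial x}{\partial \bar{x}}\right)$ by deleting the $i$th row and $j$th column. But Cramer's Rule says that

$$\frac{(-1)^{i+j}}{\det\left(\frac{\partial x}{\partial \bar{x}}\right)} \det M(i|j)$$

is the $(j,i)$th entry of the matrix $\left(\frac{\partial \bar{x}}{\partial x}\right)$. Thus

$$(\ast) = \det\left(\frac{\partial x}{\partial \bar{x}}\right) \frac{\partial^2 x^i}{\partial \bar{x}^\ell \partial \bar{x}^j} \frac{\partial \bar{x}^j}{\partial x^i},$$

completing the proof.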

Computing derivatives implicitly

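Consider the two equations $F(x, y, u, v) = 0$ and $G(x, y, u, v) = 0$. When u and v are independent variables, we can define $x = X(u, v)$ and $y = Y(u, v)$.

An equation for $\frac{\partial x}{\partial u}$ can be found by applying Cramer's rule.

Calculation of $\frac{\partial x}{\partial u}$

First, calculate the first derivatives of F, G, x, and y:

$$\begin{aligned}
dF &= \frac{\partial F}{\partial x} dx + \frac{\partial F}{\partial y} dy + \frac{\partial F}{\partial u} du + \frac{\partial F}{\partial v} dv = 0 \\
dG &= \frac{\partial G}{\partial x} dx + \frac{\partial G}{\partial y} dy + \frac{\partial G}{\partial u} du + \frac{\partial G}{\partial v} dv = 0 \\
dx &= \frac{\partial X}{\partial u} du + \frac{\partial X}{\partial v} dv \\
dy &= \frac{\partial Y}{\partial u} du + \frac{\partial Y}{\partial v} dv.
\end{aligned}$$

Substituting dx, dy into dF and dG, we have:

$$\begin{aligned}
dF &= \left(\frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial F}{\partial u}\right) du + \left(\frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial F}{\partial v}\right) dv = 0 \\
dG &= \left(\frac{\partial G}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial G}{\partial u}\right) du + \left(\frac{\partial G}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial G}{\partial v}\right) dv = 0.
\end{aligned}$$

Since u, v are both independent, the coefficients of du, dv must be zero. So we can write out equations for the coefficients:

$$\begin{aligned}
\frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} &= -\frac{\partial F}{\partial u} \\
\frac{\partial G}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial u} &= -\frac{\partial G}{\partial u} \\
\frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} &= -\frac{\partial F}{\partial v} \\
\frac{\partial G}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial G}{\partial y}\frac{\partial y}{\partial v} &= -\frac{\partial G}{\partial v}.
\end{aligned}$$

Now, by Cramer's rule, we see that:

$$\frac{\partial x}{\partial u} = \frac{\begin{vmatrix} -\frac{\partial F}{\partial u} & \frac{\partial F}{\partial y} \\ -\frac{\partial G}{\partial u} & \frac{\partial G}{\partial y} \end{vmatrix}}{\begin{vmatrix} \frac{\partial F}{\partial x} & \frac{\partial F}{\partial y} \\ \frac{\partial G}{\partial x} & \frac{\partial G}{\partial y} \end{vmatrix}}.$$

This is now a formula in terms of two Jacobians:

$$\frac{\partial x}{\partial u} = -\frac{\left(\frac{\partial (F, G)}{\partial (u, y)}\right)}{\left(\frac{\partial (F, G)}{\partial (x, y)}\right)}.$$

Similar formulas can be derived for $\frac{\partial x}{\partial v}$, $\frac{\partial y}{\partial u}$, $\frac{\partial y}{\partial v}$.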
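
The Jacobian formula $\partial x/\partial u = -\,\partial(F,G)/\partial(u,y) \,\big/\, \partial(F,G)/\partial(x,y)$ can be checked numerically on a case where x and y are known explicitly. The sketch below uses a hypothetical pair chosen only for illustration, $F = xy - u$ and $G = x + y - v$, so that x and y are the roots of $t^2 - vt + u = 0$; the branch with $x \ge y$ and the evaluation point are assumptions of the example.

```python
import math


def solve_xy(u, v):
    """Explicit solution branch with x >= y for F = x*y - u, G = x + y - v."""
    root = math.sqrt(v * v - 4 * u)
    return (v + root) / 2, (v - root) / 2


def dx_du_cramer(u, v):
    """dx/du from the 2x2 linear system in (dx/du, dy/du), via Cramer's rule."""
    x, y = solve_xy(u, v)
    # Partial derivatives of F and G at the point (x, y, u, v).
    f_x, f_y, f_u = y, x, -1.0
    g_x, g_y, g_u = 1.0, 1.0, 0.0
    numerator = (-f_u) * g_y - f_y * (-g_u)  # first column replaced by (-F_u, -G_u)
    denominator = f_x * g_y - f_y * g_x      # Jacobian d(F, G)/d(x, y)
    return numerator / denominator


def dx_du_finite_difference(u, v, h=1e-6):
    """Central-difference estimate of dx/du, used only as a cross-check."""
    return (solve_xy(u + h, v)[0] - solve_xy(u - h, v)[0]) / (2 * h)


u, v = 1.0, 3.0
print(dx_du_cramer(u, v), dx_du_finite_difference(u, v))
```

For this example the exact value is $-1/\sqrt{v^2 - 4u}$, which both computations should approximate at the chosen point.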

Integer programming

Cramer's rule can be used to prove that an integer programming problem whose constraint matrix is totally unimodular and whose right-hand side is integer has integer basic solutions. This makes the integer program substantially easier to solve.
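
The argument can be seen on a small case: every nonsingular square submatrix of a totally unimodular matrix has determinant +1 or −1, so each quotient in Cramer's rule is an integer divided by ±1. The sketch below is only an illustration; the basis matrix (a consecutive-ones matrix, a standard totally unimodular example) and the right-hand side are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


# Hypothetical basis matrix with consecutive ones in each row.
basis = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
rhs = [4, 7, 3]  # integer right-hand side

d = det(basis)
solution = []
for i in range(len(basis)):
    # Replace column i of the basis matrix by the right-hand side.
    replaced = [row[:i] + [rhs[r]] + row[i + 1:] for r, row in enumerate(basis)]
    solution.append(Fraction(det(replaced), d))

print(d, solution)
```

Because the denominator is ±1 and every numerator is a determinant of an integer matrix, each component of the basic solution is an integer.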

Ordinary differential equations

Cramer's rule is used to derive the general solution to an inhomogeneous linear differential equation by the method of variation of parameters.

Geometric interpretation

Figure: Geometric interpretation of Cramer's rule. The areas of the second and third shaded parallelograms are the same, and the second is $x_1$ times the first. From this equality Cramer's rule follows.

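Cramer's rule has a geometric interpretation that can serve as a proof or simply as insight into its geometric nature. These geometric arguments work in general, not only in the case of two equations with two unknowns presented here.

Given the system of equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2
\end{aligned}$$

it can be considered as an equation between vectors

$$x_1 \binom{a_{11}}{a_{21}} + x_2 \binom{a_{12}}{a_{22}} = \binom{b_1}{b_2}.$$

The area of the parallelogram determined by $\binom{a_{11}}{a_{21}}$ and $\binom{a_{12}}{a_{22}}$ is given by the determinant of the system of equations:

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}.$$

In general, when there are more variables and equations, the determinant of n vectors of length n will give the volume of the parallelepiped determined by those vectors in n-dimensional Euclidean space.

Therefore, the area of the parallelogram determined by $x_1 \binom{a_{11}}{a_{21}}$ and $\binom{a_{12}}{a_{22}}$ has to be $x_1$ times the area of the first one since one of the sides has been multiplied by this factor. Now, this last parallelogram, by Cavalieri's principle, has the same area as the parallelogram determined by $\binom{b_1}{b_2} = x_1 \binom{a_{11}}{a_{21}} + x_2 \binom{a_{12}}{a_{22}}$ and $\binom{a_{12}}{a_{22}}$.

Equating the areas of this last and the second parallelogram gives the equation

$$\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix} = \begin{vmatrix} a_{11}x_1 & a_{12} \\ a_{21}x_1 & a_{22} \end{vmatrix} = x_1 \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

from which Cramer's rule follows.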
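
The area argument can be checked on concrete numbers. In the sketch below, `area` is the signed area of the parallelogram spanned by two plane vectors (the 2 × 2 determinant with those vectors as columns); the column vectors and the values of $x_1$, $x_2$ are hypothetical and chosen only for illustration.

```python
def area(p, q):
    """Signed area of the parallelogram spanned by 2D vectors p and q."""
    return p[0] * q[1] - p[1] * q[0]


col1 = (2, 1)    # first column of the coefficient matrix
col2 = (1, 3)    # second column of the coefficient matrix
x1, x2 = 3, -1   # assumed solution used to build the right-hand side

# Right-hand side b = x1 * col1 + x2 * col2.
b = (x1 * col1[0] + x2 * col2[0], x1 * col1[1] + x2 * col2[1])
scaled = (x1 * col1[0], x1 * col1[1])

first = area(col1, col2)     # parallelogram of the two columns
second = area(scaled, col2)  # one side stretched by x1
third = area(b, col2)        # sheared copy of the second (Cavalieri)

print(first, second, third, third / first)
```

The second and third areas agree, and dividing by the first area recovers $x_1$, which is exactly the quotient of determinants in Cramer's rule.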

Other proofs

A proof by abstract linear algebra

This is a restatement of the proof above in abstract language.

Consider the map $\mathbf{x} = (x_1, \ldots, x_n) \mapsto \frac{1}{\det A}\left(\det(A_1), \ldots, \det(A_n)\right)$, where $A_i$ is the matrix $A$ with $\mathbf{x}$ substituted in the $i$th column, as in Cramer's rule. Because of linearity of the determinant in every column, this map is linear. Observe that it sends the $i$th column of $A$ to the $i$th basis vector $\mathbf{e}_i = (0, \ldots, 1, \ldots, 0)$ (with 1 in the $i$th place), because the determinant of a matrix with a repeated column is 0. So we have a linear map which agrees with the inverse of $A$ on the columns of $A$; hence it agrees with $A^{-1}$ on the span of the columns. Since $A$ is invertible, the column vectors span all of $\mathbb{R}^n$, so our map really is the inverse of $A$. Cramer's rule follows.

A short proof

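A short proof of Cramer's rule [15] can be given by noticing that $x_1$ is the determinant of the matrix

$$X_1 = \begin{bmatrix} x_1 & 0 & 0 & \cdots & 0 \\ x_2 & 1 & 0 & \cdots & 0 \\ x_3 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_n & 0 & 0 & \cdots & 1 \end{bmatrix}.$$

On the other hand, assuming that our original matrix A is invertible, this matrix $X_1$ has columns $A^{-1}\mathbf{b}, A^{-1}\mathbf{v}_2, \ldots, A^{-1}\mathbf{v}_n$, where $\mathbf{v}_k$ is the k-th column of the matrix A. Recall that the matrix $A_1$ has columns $\mathbf{b}, \mathbf{v}_2, \ldots, \mathbf{v}_n$, and therefore $X_1 = A^{-1}A_1$. Hence, by using that the determinant of the product of two matrices is the product of the determinants, we have

$$x_1 = \det(X_1) = \det(A^{-1}) \det(A_1) = \frac{\det(A_1)}{\det(A)}.$$

The proof for other $x_j$ is similar.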

Using Geometric Algebra

A proof of Cramer's rule can also be given in the language of geometric algebra.

Inconsistent and indeterminate cases

A system of equations is said to be inconsistent when there are no solutions and it is called indeterminate when there is more than one solution. For linear equations, an indeterminate system will have infinitely many solutions (if it is over an infinite field), since the solutions can be expressed in terms of one or more parameters that can take arbitrary values.

Cramer's rule applies to the case where the coefficient determinant is nonzero. In the 2×2 case, if the coefficient determinant is zero, then the system is inconsistent if the numerator determinants are nonzero, or indeterminate if the numerator determinants are zero.

For 3×3 or higher systems, the only thing one can say when the coefficient determinant equals zero is that if any of the numerator determinants are nonzero, then the system must be inconsistent. However, having all determinants zero does not imply that the system is indeterminate. A simple example where all determinants vanish (equal zero) but the system is still inconsistent is the 3×3 system x+y+z=1, x+y+z=2, x+y+z=3.
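
The 3×3 example can be verified directly. The sketch below computes the coefficient determinant and the three numerator determinants for x+y+z=1, x+y+z=2, x+y+z=3; the helper names `det3` and `replace_column` are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def replace_column(m, j, col):
    """Copy of m with column j replaced by the vector col."""
    return [row[:j] + [col[r]] + row[j + 1:] for r, row in enumerate(m)]


coeffs = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
rhs = [1, 2, 3]

# Coefficient determinant followed by the three numerator determinants.
determinants = [det3(coeffs)] + [
    det3(replace_column(coeffs, j, rhs)) for j in range(3)
]
print(determinants)
```

All four determinants vanish because each matrix still contains two identical columns of ones, yet the system has no solution, since x+y+z cannot equal both 1 and 2.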


References

  1. Cramer, Gabriel (1750). "Introduction à l'Analyse des lignes Courbes algébriques" (in French). Geneva: Europeana. pp. 656–659. Retrieved 2012-05-18.
  2. Kosinski, A. A. (2001). "Cramer's Rule is due to Cramer". Mathematics Magazine. 74 (4): 310–312. doi:10.2307/2691101. JSTOR 2691101.
  3. MacLaurin, Colin (1748). A Treatise of Algebra, in Three Parts.
  4. Boyer, Carl B. (1968). A History of Mathematics (2nd ed.). Wiley. p. 431.
  5. Katz, Victor (2004). A History of Mathematics (Brief ed.). Pearson Education. pp. 378–379.
  6. Hedman, Bruce A. (1999). "An Earlier Date for "Cramer's Rule"" (PDF). Historia Mathematica. 26 (4): 365–368. doi:10.1006/hmat.1999.2247. S2CID 121056843.
  7. Poole, David (2014). Linear Algebra: A Modern Introduction. Cengage Learning. p. 276. ISBN 978-1-285-98283-0.
  8. Hoffman, Joe D.; Frankel, Steven (2001). Numerical Methods for Engineers and Scientists, Second Edition. CRC Press. p. 30. ISBN 978-0-8247-0443-8.
  9. Shores, Thomas S. (2007). Applied Linear Algebra and Matrix Analysis. Springer Science & Business Media. p. 132. ISBN 978-0-387-48947-6.
  10. Higham, Nicholas J. (2002). Accuracy and Stability of Numerical Algorithms: Second Edition. SIAM. p. 13. ISBN 978-0-89871-521-7.
  11. Habgood, Ken; Arel, Itamar (2012). "A condensation-based application of Cramer's rule for solving large-scale linear systems". Journal of Discrete Algorithms. 10: 98–109. doi:10.1016/j.jda.2011.06.007.
  12. Malaschonok, G. I. (1983). "Solution of a System of Linear Equations in an Integral Ring". USSR J. Of Comput. Math. And Math. Phys. 23: 1497–1500. arXiv:1711.09452.
  13. Gong, Zhiming; Aldeen, M.; Elsner, L. (2002). "A note on a generalized Cramer's rule". Linear Algebra and Its Applications. 340 (1–3): 253–254. doi:10.1016/S0024-3795(01)00469-4.
  14. Levi-Civita, Tullio (1926). The Absolute Differential Calculus (Calculus of Tensors). Dover. pp. 111–112. ISBN 9780486634012.
  15. Robinson, Stephen M. (1970). "A Short Proof of Cramer's Rule". Mathematics Magazine. 43 (2): 94–95. doi:10.1080/0025570X.1970.11976018.