Lagrange's four-square theorem

Figure: Unlike in three dimensions, where the distances between vertices of a polycube with unit edges exclude √7 by Legendre's three-square theorem, Lagrange's four-square theorem implies that the analogue in four dimensions yields the square root of every natural number.

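Lagrange's four-square theorem, also known as Bachet's conjecture, states that every natural number can be represented as a sum of four non-negative integer squares. [1] That is, the squares form an additive basis of order four:

$$p = a^2 + b^2 + c^2 + d^2,$$

where the four numbers $a, b, c, d$ are integers. For illustration, 3, 31, and 310 can be represented as the sum of four squares as follows (310 in several ways):

$$\begin{aligned}
3 &= 1^2 + 1^2 + 1^2 + 0^2 \\
31 &= 5^2 + 2^2 + 1^2 + 1^2 \\
310 &= 17^2 + 4^2 + 2^2 + 1^2 \\
&= 16^2 + 7^2 + 2^2 + 1^2 \\
&= 15^2 + 9^2 + 2^2 + 0^2 \\
&= 12^2 + 11^2 + 6^2 + 3^2.
\end{aligned}$$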
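The statement can be explored by exhaustive search. The following is a minimal sketch, not an efficient algorithm; the function name `four_square_representations` is an illustrative choice. It lists every representation of `n` with the four numbers in non-increasing order.

```python
from math import isqrt

def four_square_representations(n):
    """Return all (a, b, c, d) with a >= b >= c >= d >= 0 and
    a^2 + b^2 + c^2 + d^2 == n, found by exhaustive search."""
    reps = []
    for a in range(isqrt(n), -1, -1):
        rest_a = n - a * a
        for b in range(min(a, isqrt(rest_a)), -1, -1):
            rest_b = rest_a - b * b
            for c in range(min(b, isqrt(rest_b)), -1, -1):
                rest_c = rest_b - c * c
                d = isqrt(rest_c)
                # d must be an exact square root and respect the ordering
                if d <= c and d * d == rest_c:
                    reps.append((a, b, c, d))
    return reps

for n in (3, 31, 310):
    print(n, four_square_representations(n))
```

Lagrange's theorem asserts that the returned list is never empty for a natural number `n`.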

This theorem was proven by Joseph Louis Lagrange in 1770. It is a special case of the Fermat polygonal number theorem.

Historical development

From examples given in the Arithmetica, it is clear that Diophantus was aware of the theorem. This book was translated in 1621 into Latin by Bachet (Claude Gaspard Bachet de Méziriac), who stated the theorem in the notes of his translation. But the theorem was not proved until 1770 by Lagrange. [2]

Adrien-Marie Legendre extended the theorem in 1797–8 with his three-square theorem, by proving that a positive integer can be expressed as the sum of three squares if and only if it is not of the form $4^k(8m + 7)$ for non-negative integers $k$ and $m$. Later, in 1834, Carl Gustav Jakob Jacobi discovered a simple formula for the number of representations of an integer as the sum of four squares with his own four-square theorem.
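Legendre's criterion is easy to test numerically. The sketch below is illustrative only (the function names are hypothetical); it compares the closed-form test "not of the form 4^k(8m + 7)" with a brute-force search over three squares.

```python
from math import isqrt

def is_sum_of_three_squares(n):
    """Legendre's criterion: n >= 0 is a sum of three squares
    unless n has the form 4^k * (8m + 7)."""
    if n == 0:
        return True
    while n % 4 == 0:      # strip the factor 4^k
        n //= 4
    return n % 8 != 7

def three_squares_bruteforce(n):
    """Direct search for a >= b >= c >= 0 with a^2 + b^2 + c^2 == n."""
    r = isqrt(n)
    return any(a * a + b * b + c * c == n
               for a in range(r + 1)
               for b in range(a + 1)
               for c in range(b + 1))

# Numbers below 100 that need four squares
print([n for n in range(100) if not is_sum_of_three_squares(n)])
```

The smallest excluded value is 7, which matches the missing distance √7 in the three-dimensional picture.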

The formula is also linked to Descartes' theorem of four "kissing circles", which involves the sum of the squares of the curvatures of four circles. This is also linked to Apollonian gaskets, which were more recently related to the Ramanujan–Petersson conjecture. [3]

Proofs

The classical proof

Several very similar modern versions [4] [5] [6] of Lagrange's proof exist. The proof below is a slightly simplified version, in which the cases for which m is even or odd do not require separate arguments.

It is sufficient to prove the theorem for every odd prime number p. This immediately follows from Euler's four-square identity (and from the fact that the theorem is true for the numbers 1 and 2).
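The reduction to primes rests on the fact that a product of two sums of four squares is again a sum of four squares. A minimal sketch of Euler's four-square identity follows, assuming one common sign convention (other conventions differ only by signs); the name `euler_product` is illustrative.

```python
def euler_product(x, y):
    """Given x = (x1..x4) and y = (y1..y4), return z = (z1..z4) with
    (sum of x_i^2) * (sum of y_i^2) == sum of z_i^2."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x2 * y4 - x3 * y1 + x4 * y2,
        x1 * y4 + x2 * y3 - x3 * y2 - x4 * y1,
    )

def sum_sq(v):
    return sum(t * t for t in v)

# 3 = 1+1+1+0 and 31 = 25+4+1+1 combine into a representation of 93
z = euler_product((1, 1, 1, 0), (5, 2, 1, 1))
print(z, sum_sq(z))  # (8, -2, -5, 0) 93
```

Applying the identity repeatedly along the prime factorization of a number turns representations of its prime factors into a representation of the number itself.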

The residues of $a^2$ modulo $p$ are distinct for every $a$ between 0 and $(p - 1)/2$ (inclusive). To see this, take some $a$ and define $c$ as $a^2 \bmod p$. Then $a$ is a root of the polynomial $x^2 - c$ over the field $\mathbb{Z}/p\mathbb{Z}$. So is $p - a$ (which is different from $a$ when $a \neq 0$). In a field $K$, any polynomial of degree $n$ has at most $n$ distinct roots (Lagrange's theorem (number theory)), so there are no other $a$ with this property, in particular not among 0 to $(p - 1)/2$.

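Similarly, for $b$ taking integral values between 0 and $(p - 1)/2$ (inclusive), the residues of $-b^2 - 1$ modulo $p$ are distinct. The two lists together contain $p + 1$ residues, but there are only $p$ residue classes. By the pigeonhole principle, there are $a$ and $b$ in this range for which $a^2$ and $-b^2 - 1$ are congruent modulo $p$, that is, for which

$$a^2 + b^2 + 1^2 + 0^2 = np$$

for some positive integer $n$. Since $a$ and $b$ are at most $(p - 1)/2$, the left-hand side is less than $p^2$, so $n < p$.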

Now let $m$ be the smallest positive integer such that $mp$ is the sum of four squares, $x_1^2 + x_2^2 + x_3^2 + x_4^2$ (we have just shown that there is some $m$ (namely $n$) with this property, so there is a least such $m$, and it is smaller than $p$). We show by contradiction that $m$ equals 1: supposing it is not the case, we prove the existence of a positive integer $r$ less than $m$, for which $rp$ is also the sum of four squares (this is in the spirit of the infinite descent [7] method of Fermat).

For this purpose, we consider for each $x_i$ the $y_i$ which is in the same residue class modulo $m$ and between $(-m + 1)/2$ and $m/2$ (possibly included). It follows that $y_1^2 + y_2^2 + y_3^2 + y_4^2 = mr$, for some strictly positive integer $r$ less than $m$.

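Finally, another appeal to Euler's four-square identity shows that $mp \cdot mr = z_1^2 + z_2^2 + z_3^2 + z_4^2$. But the fact that each $x_i$ is congruent to its corresponding $y_i$ implies that all of the $z_i$ are divisible by $m$. Indeed,

$$\begin{aligned}
z_1 &= x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4 \equiv x_1^2 + x_2^2 + x_3^2 + x_4^2 = mp \equiv 0 \pmod{m}, \\
z_2 &= x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3 \equiv x_1 x_2 - x_2 x_1 + x_3 x_4 - x_4 x_3 = 0 \pmod{m}, \\
z_3 &= x_1 y_3 - x_2 y_4 - x_3 y_1 + x_4 y_2 \equiv x_1 x_3 - x_2 x_4 - x_3 x_1 + x_4 x_2 = 0 \pmod{m}, \\
z_4 &= x_1 y_4 + x_2 y_3 - x_3 y_2 - x_4 y_1 \equiv x_1 x_4 + x_2 x_3 - x_3 x_2 - x_4 x_1 = 0 \pmod{m}.
\end{aligned}$$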

It follows that, for $w_i = z_i/m$, $w_1^2 + w_2^2 + w_3^2 + w_4^2 = rp$, and this is in contradiction with the minimality of $m$.

In the descent above, we must rule out both the case $y_1 = y_2 = y_3 = y_4 = m/2$ (which would give $r = m$ and no descent), and also the case $y_1 = y_2 = y_3 = y_4 = 0$ (which would give $r = 0$ rather than strictly positive). For both of those cases, one can check that $mp = x_1^2 + x_2^2 + x_3^2 + x_4^2$ would be a multiple of $m^2$, contradicting the fact that $p$ is a prime greater than $m$.
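The classical proof is constructive and can be run as an algorithm. The sketch below is one possible reading of it, not an optimized method: it assumes the input is an odd prime, finds a and b by direct search, and then repeats the descent step until the multiplier reaches 1. The names `centered`, `euler_product` and `four_squares_of_prime` are illustrative.

```python
def euler_product(x, y):
    """Euler's four-square identity in the sign convention for which
    z1 = x1*y1 + x2*y2 + x3*y3 + x4*y4."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x2 * y4 - x3 * y1 + x4 * y2,
        x1 * y4 + x2 * y3 - x3 * y2 - x4 * y1,
    )

def centered(t, m):
    """Residue of t modulo m lying between (-m + 1)/2 and m/2."""
    r = t % m
    return r - m if r > m // 2 else r

def four_squares_of_prime(p):
    """Assumes p is an odd prime. Returns four non-negative integers
    whose squares sum to p, following the descent in the proof."""
    half = (p - 1) // 2
    # Pigeonhole step: find a, b with a^2 + b^2 + 1 divisible by p.
    by_residue = {(-b * b - 1) % p: b for b in range(half + 1)}
    a = next(a for a in range(half + 1) if (a * a) % p in by_residue)
    x = (a, by_residue[(a * a) % p], 1, 0)
    m = sum(t * t for t in x) // p
    # Descent step: replace m*p by r*p with 0 < r < m.
    while m > 1:
        y = tuple(centered(t, m) for t in x)
        r = sum(t * t for t in y) // m
        z = euler_product(x, y)
        x = tuple(t // m for t in z)   # each z_i is divisible by m
        m = r
    return tuple(sorted((abs(t) for t in x), reverse=True))

for p in (3, 7, 13, 101, 7919):
    print(p, four_squares_of_prime(p))
```

For p = 7 the search starts from 2² + 3² + 1² + 0² = 14 = 2·7, and a single descent step produces 7 = 2² + 1² + 1² + 1².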

Proof using the Hurwitz integers

Another way to prove the theorem relies on Hurwitz quaternions, which are the analog of integers for quaternions. [8]

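The Hurwitz quaternions consist of all quaternions with integer components and all quaternions with half-integer components. These two sets can be combined into a single formula

$$\alpha = \tfrac{1}{2} E_0 (1 + \mathbf{i} + \mathbf{j} + \mathbf{k}) + E_1 \mathbf{i} + E_2 \mathbf{j} + E_3 \mathbf{k} = a_0 + a_1 \mathbf{i} + a_2 \mathbf{j} + a_3 \mathbf{k},$$

where $E_0, E_1, E_2, E_3$ are integers. Thus, the quaternion components $a_0, a_1, a_2, a_3$ are either all integers or all half-integers, depending on whether $E_0$ is even or odd, respectively. The set of Hurwitz quaternions forms a ring; that is to say, the sum or product of any two Hurwitz quaternions is likewise a Hurwitz quaternion.

The (arithmetic, or field) norm $N(\alpha)$ of a rational quaternion $\alpha$ is the nonnegative rational number

$$N(\alpha) = \alpha \bar{\alpha} = a_0^2 + a_1^2 + a_2^2 + a_3^2,$$

where $\bar{\alpha} = a_0 - a_1 \mathbf{i} - a_2 \mathbf{j} - a_3 \mathbf{k}$ is the conjugate of $\alpha$. Note that the norm of a Hurwitz quaternion is always an integer. (If the coefficients are half-integers, then their squares are of the form $\tfrac{1}{4} + n$ with $n$ an integer, and the sum of four such numbers is an integer.)

Since quaternion multiplication is associative, and real numbers commute with other quaternions, the norm of a product of quaternions equals the product of the norms:

$$N(\alpha \beta) = \alpha \beta \, \overline{\alpha \beta} = \alpha \beta \bar{\beta} \bar{\alpha} = \alpha N(\beta) \bar{\alpha} = \alpha \bar{\alpha} N(\beta) = N(\alpha) N(\beta).$$

For any $\alpha \neq 0$, $\alpha^{-1} = \bar{\alpha} N(\alpha)^{-1}$. It follows easily that $\alpha$ is a unit in the ring of Hurwitz quaternions if and only if $N(\alpha) = 1$.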

The proof of the main theorem begins by reduction to the case of prime numbers. Euler's four-square identity implies that if Lagrange's four-square theorem holds for two numbers, it holds for the product of the two numbers. Since any natural number can be factored into powers of primes, it suffices to prove the theorem for prime numbers. It is true for $2 = 1^2 + 1^2 + 0^2 + 0^2$. To show this for an odd prime integer $p$, represent it as a quaternion $(p, 0, 0, 0)$ and assume for now (as we shall show later) that it is not a Hurwitz irreducible; that is, it can be factored into two non-unit Hurwitz quaternions

$$p = \alpha \beta.$$

The norms of $p, \alpha, \beta$ are integers such that $N(p) = p^2 = N(\alpha \beta) = N(\alpha) N(\beta)$ and $N(\alpha), N(\beta) > 1$. This shows that both $N(\alpha)$ and $N(\beta)$ are equal to $p$ (since they are integers), and $p$ is the sum of four squares

$$p = N(\alpha) = a_0^2 + a_1^2 + a_2^2 + a_3^2.$$

If it happens that the $\alpha$ chosen has half-integer coefficients, it can be replaced by another Hurwitz quaternion. Choose $\omega = (\pm 1 \pm \mathbf{i} \pm \mathbf{j} \pm \mathbf{k})/2$ in such a way that $\gamma = \omega + \alpha$ has even integer coefficients. Then

$$p = (\bar{\gamma} - \bar{\omega}) \, \omega \bar{\omega} \, (\gamma - \omega) = (\bar{\gamma} \omega - 1)(\bar{\omega} \gamma - 1).$$

Since $\gamma$ has even integer coefficients, $(\bar{\omega} \gamma - 1)$ will have integer coefficients and can be used instead of the original $\alpha$ to give a representation of $p$ as the sum of four squares.
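The replacement of a half-integer quaternion by one with integer coefficients can be sketched with exact rational arithmetic. In the code below a quaternion a0 + a1·i + a2·j + a3·k is stored as the tuple (a0, a1, a2, a3); the helper names `qmul`, `conj`, `norm` and `integral_replacement` are illustrative choices, and the input is assumed to have all four coefficients half-integers. With ω a unit of the form (±1 ± i ± j ± k)/2 chosen so that γ = ω + α has even integer coefficients, the function returns conj(ω)·γ − 1.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def norm(q):
    return sum(t * t for t in q)

def integral_replacement(alpha):
    """Assumes every coefficient of alpha is a half-integer.
    Returns a quaternion with integer coefficients and the same norm."""
    # Pick each sign so that the matching coefficient of gamma is even.
    omega = tuple(HALF if (a + HALF) % 2 == 0 else -HALF for a in alpha)
    gamma = tuple(a + w for a, w in zip(alpha, omega))
    prod = qmul(conj(omega), gamma)
    return (prod[0] - 1, prod[1], prod[2], prod[3])

alpha = (Fraction(3, 2), HALF, HALF, HALF)   # norm 3
beta = integral_replacement(alpha)
print(beta, norm(alpha), norm(beta))
```

The norm is preserved because conj(ω)·γ − 1 equals conj(ω)·α and ω has norm 1.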

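As for showing that $p$ is not a Hurwitz irreducible, Lagrange proved that any odd prime $p$ divides at least one number of the form $u = 1 + l^2 + m^2$, where $l$ and $m$ are integers. [8] This can be seen as follows: since $p$ is prime, $a^2 \equiv b^2 \pmod{p}$ can hold for integers $a, b$ only when $a \equiv \pm b \pmod{p}$. Thus, the set $X = \{0^2, 1^2, \dots, ((p - 1)/2)^2\}$ of squares contains $(p + 1)/2$ distinct residues modulo $p$. Likewise, $Y = \{-(1 + x) : x \in X\}$ contains $(p + 1)/2$ residues. Since there are only $p$ residues in total, and $|X| + |Y| = p + 1 > p$, the sets $X$ and $Y$ must intersect.

The number $u$ can be factored in Hurwitz quaternions:

$$1 + l^2 + m^2 = (1 + l\,\mathbf{i} + m\,\mathbf{j})(1 - l\,\mathbf{i} - m\,\mathbf{j}).$$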
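The pigeonhole argument translates directly into a search. In this sketch, which assumes an odd prime as input, `find_l_m` is a hypothetical helper that builds the set of squares modulo p and looks for an element of the shifted set that lands in it.

```python
def find_l_m(p):
    """Assumes p is an odd prime. Returns (l, m) with
    0 <= l, m <= (p - 1) / 2 and 1 + l^2 + m^2 divisible by p."""
    half = (p - 1) // 2
    # X: squares modulo p, remembering which l produced each residue
    squares = {(l * l) % p: l for l in range(half + 1)}
    # Y: residues of -(1 + m^2); the two sets must share a residue
    for m in range(half + 1):
        target = (-(1 + m * m)) % p
        if target in squares:
            return squares[target], m
    raise ValueError("no solution found; is p an odd prime?")

for p in (3, 7, 11, 13, 101):
    l, m = find_l_m(p)
    print(p, (l, m), 1 + l * l + m * m)
```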

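The norm on Hurwitz quaternions satisfies a form of the Euclidean property: for any quaternion $\alpha = a_0 + a_1 \mathbf{i} + a_2 \mathbf{j} + a_3 \mathbf{k}$ with rational coefficients we can choose a Hurwitz quaternion $\beta = b_0 + b_1 \mathbf{i} + b_2 \mathbf{j} + b_3 \mathbf{k}$ so that $N(\alpha - \beta) < 1$ by first choosing $b_0$ so that $|a_0 - b_0| \leq 1/4$ and then $b_1, b_2, b_3$ so that $|a_i - b_i| \leq 1/2$ for $i = 1, 2, 3$. Then we obtain

$$N(\alpha - \beta) = (a_0 - b_0)^2 + (a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 \leq \left(\tfrac{1}{4}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 = \tfrac{13}{16} < 1.$$

It follows that for any Hurwitz quaternions $\alpha, \beta$ with $\alpha \neq 0$, there exists a Hurwitz quaternion $\gamma$ such that

$$N(\beta - \alpha \gamma) < N(\alpha).$$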
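The rounding step behind this Euclidean property can be sketched as follows. The function name `nearest_hurwitz` is illustrative; quaternions are stored as 4-tuples of rationals. The first coordinate is rounded to the nearest multiple of 1/2, which decides whether the result has integer or half-integer coordinates, and the remaining coordinates are rounded within that class.

```python
from fractions import Fraction
from math import floor

def nearest_hurwitz(alpha):
    """Return a Hurwitz quaternion beta (as a 4-tuple of Fractions)
    with N(alpha - beta) <= 13/16 for a rational quaternion alpha."""
    a0, a1, a2, a3 = (Fraction(t) for t in alpha)
    # Nearest multiple of 1/2, so |a0 - b0| <= 1/4.
    b0 = Fraction(round(2 * a0), 2)
    # 0 if b0 is an integer, 1/2 if it is a half-integer.
    offset = b0 - floor(b0)
    # Nearest element of (integers + offset), so |a_i - b_i| <= 1/2.
    rest = [round(a - offset) + offset for a in (a1, a2, a3)]
    return (b0, rest[0], rest[1], rest[2])

def norm(q):
    return sum(t * t for t in q)

alpha = (Fraction(7, 5), Fraction(-1, 3), Fraction(9, 4), Fraction(2, 7))
beta = nearest_hurwitz(alpha)
diff = tuple(a - b for a, b in zip(alpha, beta))
print(beta, norm(diff))
```

Applying this rounding to the quotient of two Hurwitz quaternions gives the division with remainder stated above.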

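The ring $H$ of Hurwitz quaternions is not commutative, hence it is not an actual Euclidean domain, and it does not have unique factorization in the usual sense. Nevertheless, the property above implies that every right ideal is principal. Thus, there is a Hurwitz quaternion $\alpha$ such that

$$\alpha H = pH + (1 - l\,\mathbf{i} - m\,\mathbf{j})H.$$

In particular, $p = \alpha \beta$ for some Hurwitz quaternion $\beta$. If $\beta$ were a unit, $1 - l\,\mathbf{i} - m\,\mathbf{j}$ would be a multiple of $p$; however, this is impossible as $1/p - (l/p)\,\mathbf{i} - (m/p)\,\mathbf{j}$ is not a Hurwitz quaternion for $p > 2$. Similarly, if $\alpha$ were a unit, we would have

$$(1 + l\,\mathbf{i} + m\,\mathbf{j})H = (1 + l\,\mathbf{i} + m\,\mathbf{j})pH + (1 + l\,\mathbf{i} + m\,\mathbf{j})(1 - l\,\mathbf{i} - m\,\mathbf{j})H \subseteq pH,$$

so $p$ divides $1 + l\,\mathbf{i} + m\,\mathbf{j}$, which again contradicts the fact that $1/p - (l/p)\,\mathbf{i} - (m/p)\,\mathbf{j}$ is not a Hurwitz quaternion. Thus, $p$ is not Hurwitz irreducible, as claimed.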

Generalizations

Lagrange's four-square theorem is a special case of the Fermat polygonal number theorem and Waring's problem. Another possible generalization is the following problem: Given natural numbers $a, b, c, d$, can we solve

$$n = a x_1^2 + b x_2^2 + c x_3^2 + d x_4^2$$

for all positive integers $n$ in integers $x_1, x_2, x_3, x_4$? The case $a = b = c = d = 1$ is answered in the positive by Lagrange's four-square theorem. The general solution was given by Ramanujan. [9] He proved that if we assume, without loss of generality, that $a \leq b \leq c \leq d$, then there are exactly 54 possible choices for $a, b, c, d$ such that the problem is solvable in integers $x_1, x_2, x_3, x_4$ for all $n$. (Ramanujan listed a 55th possibility $a = 1, b = 2, c = 5, d = 5$, but in this case the problem is not solvable if $n = 15$. [10] )
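The exceptional case can be confirmed by direct search. The sketch below is illustrative and only checks a finite range; the names `is_represented` and `unrepresented` are hypothetical. It tests which values up to a limit are represented by a form with given positive coefficients.

```python
from math import isqrt

def is_represented(n, coeffs):
    """True if n = a*x1^2 + b*x2^2 + c*x3^2 + d*x4^2 has an integer
    solution, for positive integer coefficients (a, b, c, d)."""
    a, b, c, d = coeffs
    for x1 in range(isqrt(n // a) + 1):
        r1 = n - a * x1 * x1
        for x2 in range(isqrt(r1 // b) + 1):
            r2 = r1 - b * x2 * x2
            for x3 in range(isqrt(r2 // c) + 1):
                r3 = r2 - c * x3 * x3
                # The remainder must be d times a perfect square.
                if r3 % d == 0 and isqrt(r3 // d) ** 2 == r3 // d:
                    return True
    return False

def unrepresented(coeffs, limit):
    """Positive integers up to limit that the form fails to represent."""
    return [n for n in range(1, limit + 1)
            if not is_represented(n, coeffs)]

print(unrepresented((1, 1, 1, 1), 200))
print(unrepresented((1, 2, 5, 5), 15))
```

A finite search of this kind can expose a counterexample such as 15, but it cannot by itself prove that a form represents every positive integer.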

Algorithms

In 1986, Michael O. Rabin and Jeffrey Shallit [11] proposed randomized polynomial-time algorithms for computing a single representation $n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ for a given integer $n$, in expected running time $O(\log(n)^2)$. It was further improved to $O(\log(n)^2 \log(\log(n))^{-1})$ by Paul Pollack and Enrique Treviño in 2018. [12]

Number of representations

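The number of representations of a natural number $n$ as the sum of four squares of integers is denoted by $r_4(n)$. Jacobi's four-square theorem states that this is eight times the sum of the divisors of $n$ if $n$ is odd and 24 times the sum of the odd divisors of $n$ if $n$ is even (see divisor function), i.e.

$$r_4(n) = \begin{cases} 8 \sum\limits_{m \mid n} m & \text{if } n \text{ is odd}, \\ 24 \sum\limits_{\substack{m \mid n \\ m \text{ odd}}} m & \text{if } n \text{ is even}. \end{cases}$$

Equivalently, it is eight times the sum of all its divisors which are not divisible by 4, i.e.

$$r_4(n) = 8 \sum_{\substack{m \mid n \\ 4 \nmid m}} m.$$

We may also write this as

$$r_4(n) = 8 \sigma(n) - 32 \sigma(n/4),$$

where the second term is to be taken as zero if $n$ is not divisible by 4. In particular, for a prime number $p$ we have the explicit formula $r_4(p) = 8(p + 1)$. [13]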
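Jacobi's formula can be checked against a direct count. The sketch below is illustrative (the function names are hypothetical): it counts ordered quadruples of integers, signs included, and compares the total with eight times the sum of the divisors not divisible by 4.

```python
from math import isqrt

def r4_bruteforce(n):
    """Count integer quadruples (a, b, c, d), signs and order
    distinguished, with a^2 + b^2 + c^2 + d^2 == n."""
    r = isqrt(n)
    count = 0
    for a in range(-r, r + 1):
        for b in range(-r, r + 1):
            for c in range(-r, r + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    count += 1 if d == 0 else 2   # d and -d
    return count

def r4_jacobi(n):
    """Eight times the sum of the divisors of n not divisible by 4."""
    return 8 * sum(m for m in range(1, n + 1)
                   if n % m == 0 and m % 4 != 0)

for n in (1, 2, 4, 7, 12):
    print(n, r4_bruteforce(n), r4_jacobi(n))
```

For n = 1 both counts give 8: the value ±1 can sit in any of the four positions.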

Some values of $r_4(n)$ occur infinitely often as $r_4(n) = r_4(2^m n)$ whenever $n$ is even. The values of $r_4(n)/n$ can be arbitrarily large: indeed, r4(n)/n is infinitely often larger than 8log n. [13]

Uniqueness

The sequence of positive integers which have only one representation as a sum of four squares of non-negative integers (up to order) is:

1, 2, 3, 5, 6, 7, 8, 11, 14, 15, 23, 24, 32, 56, 96, 128, 224, 384, 512, 896 ... (sequence A006431 in the OEIS ).

These integers consist of the seven odd numbers 1, 3, 5, 7, 11, 15, 23 and all numbers of the form $2(4^k)$, $6(4^k)$ or $14(4^k)$ with $k \geq 0$.

The sequence of positive integers which cannot be represented as a sum of four non-zero squares is:

1, 2, 3, 5, 6, 8, 9, 11, 14, 17, 24, 29, 32, 41, 56, 96, 128, 224, 384, 512, 896 ... (sequence A000534 in the OEIS ).

These integers consist of the eight odd numbers 1, 3, 5, 9, 11, 17, 29, 41 and all numbers of the form $2(4^k)$, $6(4^k)$ or $14(4^k)$ with $k \geq 0$.
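Both sequences can be reproduced for small values by enumeration. This is a minimal sketch that checks only numbers up to 100; the function name `representations` is an illustrative choice.

```python
from math import isqrt

def representations(n):
    """All (a, b, c, d) with a >= b >= c >= d >= 0 whose squares
    sum to n, i.e. representations up to order."""
    r = isqrt(n)
    return [(a, b, c, d)
            for a in range(r + 1)
            for b in range(a + 1)
            for c in range(b + 1)
            for d in range(c + 1)
            if a * a + b * b + c * c + d * d == n]

# Exactly one representation up to order
unique = [n for n in range(1, 101) if len(representations(n)) == 1]

# No representation in which all four squares are non-zero
# (d is the smallest entry, so d > 0 means all entries are non-zero)
no_nonzero = [n for n in range(1, 101)
              if not any(d > 0 for (_, _, _, d) in representations(n))]

print(unique)
print(no_nonzero)
```

The two printed lists should agree with the initial terms of the two sequences quoted above.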

Further refinements

Lagrange's four-square theorem can be refined in various ways. For example, Zhi-Wei Sun [14] proved that each natural number can be written as a sum of four squares with some requirements on the choice of these four numbers.

One may also wonder whether it is necessary to use the entire set of square integers to write each natural number as the sum of four squares. Eduard Wirsing proved that there exists a set of squares $S$ with $|S| = O(n^{1/4} \log^{1/4} n)$ such that every positive integer smaller than or equal to $n$ can be written as a sum of at most 4 elements of $S$. [15]

Notes

  1. Andrews, George E. (1994), Number Theory, Dover Publications, p. 144, ISBN 0-486-68252-8
  2. Ireland & Rosen 1990.
  3. Sarnak 2013.
  4. Landau 1958, Theorems 166 to 169.
  5. Hardy & Wright 2008, Theorem 369.
  6. Niven & Zuckerman 1960, paragraph 5.7.
  7. Here the argument is a direct proof by contradiction. Starting instead from the assumption that m, with m > 2 and m < p, is some integer such that mp is the sum of four squares (not necessarily the smallest), the argument could be modified to become an infinite descent argument in the spirit of Fermat.
  8. Stillwell 2003, pp. 138–157.
  9. Ramanujan 1917.
  10. Oh 2000.
  11. Rabin & Shallit 1986.
  12. Pollack & Treviño 2018.
  13. Williams 2011, p. 119.
  14. Sun 2017.
  15. Spencer 1996.
