Dixon's factorization method

Last updated October 30, 2023

In number theory, Dixon's factorization method (also Dixon's random squares method^[1] or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run-time bound comes with a rigorous proof that does not rely on conjectures about the smoothness properties of the values taken by a polynomial.

Basic idea

Dixon's method is based on finding a congruence of squares modulo the integer N that is to be factored. Fermat's factorization method finds such a congruence by selecting random or pseudo-random x values and hoping that the integer x² mod N is a perfect square (in the integers):

x^{2}\equiv y^{2}\quad ({\hbox{mod }}N),\qquad x\not \equiv \pm y\quad ({\hbox{mod }}N).

For example, if N = 84923, counting up from 292 (the first integer greater than √N) eventually reaches x = 505, for which 505² mod 84923 is 256, the square of 16. So (505 − 16)(505 + 16) ≡ 0 (mod 84923). Computing the greatest common divisor of 505 − 16 and N using Euclid's algorithm gives 163, which is a factor of N.

In practice, selecting random x values will take an impractically long time to find a congruence of squares, since there are only about √N squares less than N.
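The arithmetic in this example can be checked with a short Python sketch. The function name `factor_from_square_residue` is hypothetical and introduced only for illustration; it tests a single candidate x and does not perform the search itself.

```python
import math

def factor_from_square_residue(x, n):
    """If x^2 mod n is a perfect square y^2, return (y, gcd(x - y, n)); otherwise None."""
    r = pow(x, 2, n)
    y = math.isqrt(r)
    if y * y != r:
        return None  # the residue is not a perfect square
    return y, math.gcd(x - y, n)

N = 84923
print(math.isqrt(N) + 1)                   # 292, the first integer above sqrt(N)
print(factor_from_square_residue(505, N))  # (16, 163)
```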

Dixon's method replaces the condition "is the square of an integer" with the much weaker one "has only small prime factors"; for example, there are 292 squares smaller than 84923; 662 numbers smaller than 84923 whose prime factors are only 2, 3, 5 or 7; and 4767 whose prime factors are all less than 30. (Such numbers are called B-smooth with respect to some bound B.)

If there are many numbers $a_{1}\ldots a_{n}$ whose squares can be factorized as $a_{i}^{2}\mod N=\prod _{j=1}^{m}b_{j}^{e_{ij}}$ for a fixed set $b_{1}\ldots b_{m}$ of small primes, then linear algebra modulo 2 on the matrix $e_{ij}$ yields a subset of the $a_{i}$ whose squares multiply to a product of small primes in which every exponent is even; that is, a subset of the $a_{i}$ whose squares multiply to the square of a (hopefully different) number mod N.
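A brute-force count of smooth numbers can be sketched as follows. The helper names `is_smooth` and `count_smooth` are assumptions made for this illustration, and the sketch counts 1 as smooth, so its totals may differ slightly from the figures quoted above depending on the convention used.

```python
def is_smooth(m, primes):
    """True if the positive integer m has no prime factors outside `primes`."""
    if m < 1:
        return False
    for p in primes:
        while m % p == 0:
            m //= p
    return m == 1

def count_smooth(limit, primes):
    """Count integers m with 1 <= m < limit that are smooth over `primes`."""
    return sum(1 for m in range(1, limit) if is_smooth(m, primes))

N = 84923
print(count_smooth(N, [2, 3, 5, 7]))
print(count_smooth(N, [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]))
```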

Method

Suppose the composite number N is being factored. A bound B is chosen, and the factor base P is identified as the set of all primes less than or equal to B. Next, positive integers z are sought such that z² mod N is B-smooth. For each such z we can therefore write, for suitable exponents a_i,

z^{2}{\text{ mod }}N=\prod _{p_{i}\in P}p_{i}^{a_{i}}
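One simple way to obtain the exponents in such a relation is trial division over the factor base. The sketch below assumes this approach; the function name `exponent_vector` is hypothetical.

```python
def exponent_vector(m, base):
    """Return the exponents of m over the primes in `base`,
    or None if m is zero or not smooth over `base`."""
    if m == 0:
        return None
    exps = []
    for p in base:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps if m == 1 else None

N = 84923
P = [2, 3, 5, 7]
print(exponent_vector(pow(513, 2, N), P))  # [4, 1, 2, 1], since 8400 = 2^4 * 3 * 5^2 * 7
print(exponent_vector(pow(292, 2, N), P))  # None, since 341 = 11 * 31
```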

When enough of these relations have been generated (it is generally sufficient that the number of relations be a few more than the size of P), the methods of linear algebra, such as Gaussian elimination, can be used to multiply together these various relations in such a way that the exponents of the primes on the right-hand side are all even:

{z_{1}^{2}z_{2}^{2}\cdots z_{k}^{2}\equiv \prod _{p_{i}\in P}p_{i}^{a_{i,1}+a_{i,2}+\cdots +a_{i,k}}\ {\pmod {N}}\quad ({\text{where }}a_{i,1}+a_{i,2}+\cdots +a_{i,k}\equiv 0{\pmod {2}})}
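The linear algebra step can be sketched as Gaussian elimination over GF(2), with each exponent vector packed into an integer bitmask. This is one possible implementation under that assumption, not the only one; the name `find_dependency` is hypothetical.

```python
def find_dependency(vectors):
    """Return indices of a non-empty subset of `vectors` whose sum is even
    in every coordinate, or None if the vectors are independent mod 2."""
    pivots = {}  # highest set bit -> (reduced row, mask of original rows combined)
    for i, vec in enumerate(vectors):
        row = 0
        for j, e in enumerate(vec):
            if e % 2:
                row |= 1 << j      # keep only the parity of each exponent
        combo = 1 << i             # this row starts as "itself"
        while row:
            pivot = row.bit_length() - 1
            if pivot not in pivots:
                pivots[pivot] = (row, combo)
                break
            prow, pcombo = pivots[pivot]
            row ^= prow            # eliminate the leading bit
            combo ^= pcombo
        else:
            # row reduced to zero: the rows recorded in combo sum to 0 mod 2
            return [k for k in range(len(vectors)) if (combo >> k) & 1]
    return None

print(find_dependency([[1, 0, 1], [0, 1, 1], [1, 1, 0]]))  # [0, 1, 2]
```

Any collection of more vectors than coordinates is linearly dependent modulo 2, which is why collecting a few more relations than the size of P guarantees that this step finds a subset.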

This yields a congruence of squares of the form a² ≡ b² (mod N), which can be turned into a factorization of N, N = gcd(a + b, N) × (N/gcd(a + b, N)). This factorization might turn out to be trivial (i.e. N = N × 1), which can only happen if a ≡ ±b (mod N), in which case another try must be made with a different combination of relations; but if a nontrivial pair of factors of N is reached, the algorithm terminates.
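The final step can be sketched as a small helper that reports a trivial outcome explicitly. The name `split_from_congruence` and the toy modulus 15 are assumptions chosen for illustration.

```python
import math

def split_from_congruence(a, b, n):
    """Given a^2 ≡ b^2 (mod n), return (g, n // g) with g = gcd(a + b, n),
    or None when the resulting factorization is trivial."""
    if (a * a - b * b) % n != 0:
        raise ValueError("a^2 and b^2 are not congruent modulo n")
    g = math.gcd(a + b, n)
    if g in (1, n):
        return None
    return g, n // g

print(split_from_congruence(4, 1, 15))   # (5, 3), because 16 ≡ 1 (mod 15)
print(split_from_congruence(4, 11, 15))  # None, because 11 ≡ -4 (mod 15)
```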

Pseudocode

input: positive integer ${\textstyle N}$
output: non-trivial factor of ${\textstyle N}$

Choose bound ${\textstyle B}$
Let ${\textstyle P:=\{p_{1},p_{2},\ldots ,p_{k}\}}$ be all primes ${\textstyle \leq B}$

repeat
    for ${\textstyle i=1}$ to ${\textstyle k+1}$ do
        Choose ${\textstyle 0<z_{i}<N}$ such that ${\textstyle z_{i}^{2}{\text{ mod }}N}$ is ${\textstyle B}$-smooth
        Let ${\textstyle a_{i}:=\{a_{i1},a_{i2},\ldots ,a_{ik}\}}$ such that ${\textstyle z_{i}^{2}{\text{ mod }}N=\prod _{p_{j}\in P}p_{j}^{a_{ij}}}$
    end for
    Find non-empty ${\textstyle T\subseteq \{1,2,\ldots ,k+1\}}$ such that ${\textstyle \sum _{i\in T}a_{i}\equiv {\vec {0}}{\pmod {2}}}$
    Let ${\textstyle x:=\left(\prod _{i\in T}z_{i}\right){\text{ mod }}N}$
        ${\textstyle y:=\left(\prod _{p_{j}\in P}p_{j}^{\left(\sum _{i\in T}a_{ij}\right)/2}\right){\text{ mod }}N}$
while ${\textstyle x\equiv \pm y{\pmod {N}}}$

return ${\textstyle \gcd(x+y,N)}$
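The pseudocode can be turned into a small runnable Python sketch. Several choices here are assumptions and not part of the method as stated: candidates z are drawn at random from above √N, smoothness is tested by trial division, the dependency is found by bitmask elimination over GF(2), and a `max_rounds` cap is added so the loop stops on inputs that are not composite. All function names are hypothetical.

```python
import math
import random

def primes_up_to(bound):
    """All primes <= bound (sieve of Eratosthenes); assumes bound >= 2."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(bound) + 1):
        if sieve[p]:
            for m in range(p * p, bound + 1, p):
                sieve[m] = False
    return [p for p, flag in enumerate(sieve) if flag]

def exponent_vector(m, base):
    """Exponents of m over `base`, or None if m is 0 or not smooth."""
    if m == 0:
        return None
    exps = []
    for p in base:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps if m == 1 else None

def find_dependency(vectors):
    """Indices of a non-empty subset summing to the zero vector mod 2, or None."""
    pivots = {}
    for i, vec in enumerate(vectors):
        row = sum(1 << j for j, e in enumerate(vec) if e % 2)
        combo = 1 << i
        while row:
            pivot = row.bit_length() - 1
            if pivot not in pivots:
                pivots[pivot] = (row, combo)
                break
            prow, pcombo = pivots[pivot]
            row, combo = row ^ prow, combo ^ pcombo
        else:
            return [k for k in range(len(vectors)) if (combo >> k) & 1]
    return None

def dixon(n, bound, seed=0, max_rounds=1000):
    """Return a non-trivial factor of the composite n, or None if none was found."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    base = primes_up_to(bound)
    for _ in range(max_rounds):
        zs, vecs = [], []
        while len(zs) < len(base) + 1:  # collect k + 1 relations
            z = rng.randrange(math.isqrt(n) + 1, n)
            vec = exponent_vector(z * z % n, base)
            if vec is not None:
                zs.append(z)
                vecs.append(vec)
        subset = find_dependency(vecs)  # always exists: k + 1 vectors, k coordinates
        x = math.prod(zs[i] for i in subset) % n
        y = 1
        for j, p in enumerate(base):
            half = sum(vecs[i][j] for i in subset) // 2
            y = y * pow(p, half, n) % n
        if x != y and (x + y) % n != 0:  # skip the trivial case x ≡ ±y
            return math.gcd(x + y, n)
    return None

factor = dixon(84923, 7)
print(factor, 84923 // factor)  # expected to print 163 and 521 in some order
```

Trial division and random sampling keep the sketch short; they are far too slow for numbers of practical size.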

Example

This example will try to factor N = 84923 using bound B = 7. The factor base is then P = {2, 3, 5, 7}. A search can be made for integers between $\left\lceil {\sqrt {84923}}\right\rceil =292$ and N whose squares mod N are B-smooth. Suppose that two of the numbers found are 513 and 537:

513^{2}\mod 84923=8400=2^{4}\cdot 3\cdot 5^{2}\cdot 7

537^{2}\mod 84923=33600=2^{6}\cdot 3\cdot 5^{2}\cdot 7

So

(513\cdot 537)^{2}\mod 84923=2^{10}\cdot 3^{2}\cdot 5^{4}\cdot 7^{2}\mod 84923

Then

${\begin{aligned}&{}(513\cdot 537)^{2}\mod 84923\\&=(275481)^{2}\mod 84923\\&=(84923\cdot 3+20712)^{2}\mod 84923\\&=(84923\cdot 3)^{2}+2\cdot (84923\cdot 3\cdot 20712)+20712^{2}\mod 84923\\&=0+0+20712^{2}\mod 84923\end{aligned}}$

That is, $20712^{2}\mod 84923=(2^{5}\cdot 3\cdot 5^{2}\cdot 7)^{2}\mod 84923=16800^{2}\mod 84923.$

The resulting factorization is 84923 = gcd(20712 − 16800, 84923) × gcd(20712 + 16800, 84923) = 163 × 521.
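Each step of this worked example can be reproduced with a few lines of Python, shown here purely as a check of the arithmetic.

```python
import math

N = 84923

print(pow(513, 2, N))  # 8400  = 2^4 * 3 * 5^2 * 7
print(pow(537, 2, N))  # 33600 = 2^6 * 3 * 5^2 * 7

x = (513 * 537) % N          # product of the chosen z values, reduced mod N
y = 2**5 * 3 * 5**2 * 7      # half of each summed exponent
print(x, y)                  # 20712 16800

assert (x * x - y * y) % N == 0              # congruence of squares holds
print(math.gcd(x - y, N), math.gcd(x + y, N))  # 163 521
```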

Optimizations

The quadratic sieve is an optimization of Dixon's method. It selects values of x close to the square root of N such that x² modulo N is small, thereby greatly increasing the chance of obtaining a smooth number.

Other ways to optimize Dixon's method include using a better algorithm to solve the matrix equation, taking advantage of the sparsity of the matrix: a number z cannot have more than $\log _{2}z$ prime factors (counted with multiplicity), so each row of the matrix is almost all zeros. In practice, the block Lanczos algorithm is often used. Also, the size of the factor base must be chosen carefully: if it is too small, it will be difficult to find numbers that factorize completely over it, and if it is too large, more relations will have to be collected.
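The effect of choosing x just above √N can be seen numerically. The following sketch only lists the residues for the running example N = 84923; it does not implement any sieving.

```python
import math

N = 84923
start = math.isqrt(N) + 1  # 292, the first integer above sqrt(N)

# For x slightly above sqrt(N), x^2 mod N equals x^2 - N and stays small.
near = [x * x - N for x in range(start, start + 5)]
print(near)  # [341, 926, 1513, 2102, 2693]
```

These residues grow by roughly 2√N per step and remain far below N, whereas the residue of a randomly chosen x is typically of the same order as N.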

A more sophisticated analysis, using the approximation that a number has all its prime factors less than $N^{1/a}$ with probability about $a^{-a}$ (an approximation to the Dickman–de Bruijn function), indicates that choosing too small a factor base is much worse than too large, and that the ideal factor base size is some power of $\exp \left({\sqrt {\log N\log \log N}}\right)$ .
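The approximation mentioned here is easy to evaluate. The sketch below assumes natural logarithms and uses a hypothetical function name; the estimate is asymptotic and can be far from the true proportion for small numbers.

```python
import math

def smooth_probability_estimate(n, bound):
    """Crude u^(-u) estimate of the chance that a number near n has all
    prime factors below `bound`, where bound = n^(1/u)."""
    u = math.log(n) / math.log(bound)
    return u ** (-u)

# With n = 10^6 and bound = 100 we have u = 3, so the estimate is about 1/27.
print(smooth_probability_estimate(10**6, 100))
```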

The optimal complexity of Dixon's method is

O\left(\exp \left(2{\sqrt {2}}{\sqrt {\log n\log \log n}}\right)\right)

in big-O notation, or

L_{n}[1/2,2{\sqrt {2}}]

in L-notation.
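For reference, the standard definition of L-notation, given here as background and not taken from the text above, is

```latex
L_{n}[\alpha ,c]=\exp \left((c+o(1))(\log n)^{\alpha }(\log \log n)^{1-\alpha }\right)
```

so that $\alpha =1/2$ and $c=2{\sqrt {2}}$ give $\exp \left((2{\sqrt {2}}+o(1)){\sqrt {\log n\log \log n}}\right)$, matching the bound stated above up to the $o(1)$ term.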


References

1. Kleinjung, Thorsten; et al. (2010). "Factorization of a 768-Bit RSA Modulus". Advances in Cryptology – CRYPTO 2010. Lecture Notes in Computer Science. Vol. 6223. pp. 333–350. doi:10.1007/978-3-642-14623-7_18. ISBN 978-3-642-14622-0. S2CID 11556080.
2. Dixon, J. D. (1981). "Asymptotically fast factorization of integers". Math. Comp. 36 (153): 255–260. doi:10.1090/S0025-5718-1981-0595059-1. JSTOR 2007743.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
