Pollard's rho algorithm

Last updated

Pollard's rho algorithm is an algorithm for integer factorization. It was invented by John Pollard in 1975. [1] It uses only a small amount of space, and its expected running time is proportional to the square root of the smallest prime factor of the composite number being factorized.

Contents

Core ideas

The algorithm is used to factorize a number , where is a non-trivial factor. A polynomial modulo , called (e.g., ), is used to generate a pseudorandom sequence. It is important to note that must be a polynomial. A starting value, say 2, is chosen, and the sequence continues as , , , etc. The sequence is related to another sequence . Since is not known beforehand, this sequence cannot be explicitly computed in the algorithm. Yet, in it lies the core idea of the algorithm.

Because the number of possible values for these sequences is finite, both the sequence, which is mod , and sequence will eventually repeat, even though these values are unknown. If the sequences were to behave like random numbers, the birthday paradox implies that the number of before a repetition occurs would be expected to be , where is the number of possible values. So the sequence will likely repeat much earlier than the sequence . When one has found a such that but , the number is a multiple of , so has been found.

Once a sequence has a repeated value, the sequence will cycle, because each value depends only on the one before it. This structure of eventual cycling gives rise to the name "rho algorithm", owing to similarity to the shape of the Greek letter ρ when the values , , etc. are represented as nodes in a directed graph.

Cycle diagram resembling the Greek letter r Pollard rho cycle.svg
Cycle diagram resembling the Greek letter ρ

This is detected by Floyd's cycle-finding algorithm: two nodes and (i.e., and ) are kept. In each step, one moves to the next node in the sequence and the other moves forward by two nodes. After that, it is checked whether . If it is not 1, then this implies that there is a repetition in the sequence (i.e. . This works because if the is the same as , the difference between and is necessarily a multiple of . Although this always happens eventually, the resulting greatest common divisor (GCD) is a divisor of other than 1. This may be itself, since the two sequences might repeat at the same time. In this (uncommon) case the algorithm fails, and can be repeated with a different parameter.

Algorithm

The algorithm takes as its inputs n, the integer to be factored; and , a polynomial in x computed modulo n. In the original algorithm, , but nowadays it is more common to use . The output is either a non-trivial factor of n, or failure. It performs the following steps: [2]

Pseudocode for Pollard's rho algorithm

    x ← 2 // starting value     y ← x     d ← 1      while d = 1:         x ← g(x)         y ← g(g(y))         d ← gcd(|x - y|, n)      if d = n:          return failureelse:         return d

Here x and y corresponds to and in the previous section. Note that this algorithm may fail to find a nontrivial factor even when n is composite. In that case, the method can be tried again, using a starting value other than 2 or a different .

Example factorization

Let and .

Pollard's rho algorithm example factorization for
n
=
253
{\displaystyle n=253}
and
g
(
x
)
=
x
2
mod
2
53
{\displaystyle g(x)=x^{2}{\bmod {2}}53}
, with starting value 2. The example is using Floyd's cycle-finding algorithm. Rho-example-animated.gif
Pollard's rho algorithm example factorization for and , with starting value 2. The example is using Floyd's cycle-finding algorithm.
ixygcd(|xy|, 8051)
15261
22674741
367787197
4747414811

Now 97 is a non-trivial factor of 8051. Starting values other than x = y = 2 may give the cofactor (83) instead of 97. One extra iteration is shown above to make it clear that y moves twice as fast as x. Note that even after a repetition, the GCD can return to 1.

Variants

In 1980, Richard Brent published a faster variant of the rho algorithm. He used the same core ideas as Pollard but a different method of cycle detection, replacing Floyd's cycle-finding algorithm with the related Brent's cycle finding method. [3]

A further improvement was made by Pollard and Brent. They observed that if , then also for any positive integer . In particular, instead of computing at every step, it suffices to define as the product of 100 consecutive terms modulo , and then compute a single . A major speed up results as 100 gcd steps are replaced with 99 multiplications modulo and a single gcd. Occasionally it may cause the algorithm to fail by introducing a repeated factor, for instance when is a square. But it then suffices to go back to the previous gcd term, where , and use the regular ρ algorithm from there.

Application

The algorithm is very fast for numbers with small factors, but slower in cases where all factors are large. The ρ algorithm's most remarkable success was the 1980 factorization of the Fermat number F8 = 1238926361552897 × 93461639715357977769163558199606896584051237541638188580280321. [4] The ρ algorithm was a good choice for F8 because the prime factor p = 1238926361552897 is much smaller than the other factor. The factorization took 2 hours on a UNIVAC 1100/42. [4]

Example: factoring n = 10403 = 101 · 103

The following table shows numbers produced by the algorithm, starting with and using the polynomial . The third and fourth columns of the table contain additional information not known by the algorithm. They are included to show how the algorithm works.

step
22220
52521
2622622
6772671263
5982693264
39032665265
34182685266
156341855857
3531341897858
5168341817859
37243418888510
9783418698511
98123418158512
59833418248513
99703418728514
2369970347215
36829970467216
20169970977217
70879970177218
102899970887219
25949970697220
84999970157221
49739970247222
27999970727223

The first repetition modulo 101 is 97 which occurs in step 17. The repetition is not detected until step 23, when . This causes to be , and a factor is found.

Complexity

If the pseudorandom number occurring in the Pollard ρ algorithm were an actual random number, it would follow that success would be achieved half the time, by the birthday paradox in iterations. It is believed that the same analysis applies as well to the actual rho algorithm, but this is a heuristic claim, and rigorous analysis of the algorithm remains open. [5]

See also

Related Research Articles

<span class="mw-page-title-main">Euclidean algorithm</span> Algorithm for computing greatest common divisors

In mathematics, the Euclidean algorithm, or Euclid's algorithm, is an efficient method for computing the greatest common divisor (GCD) of two integers (numbers), the largest number that divides them both without a remainder. It is named after the ancient Greek mathematician Euclid, who first described it in his Elements . It is an example of an algorithm, a step-by-step procedure for performing a calculation according to well-defined rules, and is one of the oldest algorithms in common use. It can be used to reduce fractions to their simplest form, and is a part of many other number-theoretic and cryptographic calculations.

<span class="mw-page-title-main">Quadratic reciprocity</span> Gives conditions for the solvability of quadratic equations modulo prime numbers

In number theory, the law of quadratic reciprocity is a theorem about modular arithmetic that gives conditions for the solvability of quadratic equations modulo prime numbers. Due to its subtlety, it has many formulations, but the most standard statement is:

RSA (Rivest–Shamir–Adleman) is a public-key cryptosystem, one of the oldest widely used for secure data transmission. The initialism "RSA" comes from the surnames of Ron Rivest, Adi Shamir and Leonard Adleman, who publicly described the algorithm in 1977. An equivalent system was developed secretly in 1973 at Government Communications Headquarters (GCHQ), the British signals intelligence agency, by the English mathematician Clifford Cocks. That system was declassified in 1997.

The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose factoring, ECM is the third-fastest known factoring method. The second-fastest is the multiple polynomial quadratic sieve, and the fastest is the general number field sieve. The Lenstra elliptic-curve factorization is named after Hendrik Lenstra.

The Rabin cryptosystem is a family of public-key encryption schemes based on a trapdoor function whose security, like that of RSA, is related to the difficulty of integer factorization.

Pollard's p − 1 algorithm is a number theoretic integer factorization algorithm, invented by John Pollard in 1974. It is a special-purpose algorithm, meaning that it is only suitable for integers with specific types of factors; it is the simplest example of an algebraic-group factorisation algorithm.

The quadratic sieve algorithm (QS) is an integer factorization algorithm and, in practice, the second-fastest method known. It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve. It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties. It was invented by Carl Pomerance in 1981 as an improvement to Schroeppel's linear sieve.

In computer science, cycle detection or cycle finding is the algorithmic problem of finding a cycle in a sequence of iterated function values.

In number theory, Dixon's factorization method is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run-time bound comes with a rigorous proof that does not rely on conjectures about the smoothness properties of the values taken by a polynomial.

Pollard's rho algorithm for logarithms is an algorithm introduced by John Pollard in 1978 to solve the discrete logarithm problem, analogous to Pollard's rho algorithm to solve the integer factorization problem.

In mathematics, Hensel's lemma, also known as Hensel's lifting lemma, named after Kurt Hensel, is a result in modular arithmetic, stating that if a univariate polynomial has a simple root modulo a prime number p, then this root can be lifted to a unique root modulo any higher power of p. More generally, if a polynomial factors modulo p into two coprime polynomials, this factorization can be lifted to a factorization modulo any higher power of p.

In computational number theory, Williams's p + 1 algorithm is an integer factorization algorithm, one of the family of algebraic-group factorisation algorithms. It was invented by Hugh C. Williams in 1982.

<span class="mw-page-title-main">Eisenstein integer</span> Complex number whose mapping on a coordinate plane produces a triangular lattice

In mathematics, the Eisenstein integers, occasionally also known as Eulerian integers, are the complex numbers of the form

In mathematics, the rational sieve is a general algorithm for factoring integers into prime factors. It is a special case of the general number field sieve. While it is less efficient than the general algorithm, it is conceptually simpler. It serves as a helpful first step in understanding how the general number field sieve works.

The Blum–Goldwasser (BG) cryptosystem is an asymmetric key encryption algorithm proposed by Manuel Blum and Shafi Goldwasser in 1984. Blum–Goldwasser is a probabilistic, semantically secure cryptosystem with a constant-size ciphertext expansion. The encryption algorithm implements an XOR-based stream cipher using the Blum-Blum-Shub (BBS) pseudo-random number generator to generate the keystream. Decryption is accomplished by manipulating the final state of the BBS generator using the private key, in order to find the initial seed and reconstruct the keystream.

In mathematics and computer algebra, factorization of polynomials or polynomial factorization expresses a polynomial with coefficients in a given field or in the integers as the product of irreducible factors with coefficients in the same domain. Polynomial factorization is one of the fundamental components of computer algebra systems.

In mathematics, particularly computational algebra, Berlekamp's algorithm is a well-known method for factoring polynomials over finite fields. The algorithm consists mainly of matrix reduction and polynomial GCD computations. It was invented by Elwyn Berlekamp in 1967. It was the dominant algorithm for solving the problem until the Cantor–Zassenhaus algorithm of 1981. It is currently implemented in many well-known computer algebra systems.

In mathematics, elliptic curve primality testing techniques, or elliptic curve primality proving (ECPP), are among the quickest and most widely used methods in primality proving. It is an idea put forward by Shafi Goldwasser and Joe Kilian in 1986 and turned into an algorithm by A. O. L. Atkin the same year. The algorithm was altered and improved by several collaborators subsequently, and notably by Atkin and François Morain, in 1993. The concept of using elliptic curves in factorization had been developed by H. W. Lenstra in 1985, and the implications for its use in primality testing followed quickly.

In mathematics and computer algebra the factorization of a polynomial consists of decomposing it into a product of irreducible factors. This decomposition is theoretically possible and is unique for polynomials with coefficients in any field, but rather strong restrictions on the field of the coefficients are needed to allow the computation of the factorization by means of an algorithm. In practice, algorithms have been designed only for polynomials with coefficients in a finite field, in the field of rationals or in a finitely generated field extension of one of them.

In number theory, Berlekamp's root finding algorithm, also called the Berlekamp–Rabin algorithm, is the probabilistic method of finding roots of polynomials over the field with elements. The method was discovered by Elwyn Berlekamp in 1970 as an auxiliary to the algorithm for polynomial factorization over finite fields. The algorithm was later modified by Rabin for arbitrary finite fields in 1979. The method was also independently discovered before Berlekamp by other researchers.

References

  1. Pollard, J. M. (1975). "A Monte Carlo method for factorization". BIT Numerical Mathematics. 15 (3): 331–334. doi:10.1007/bf01933667. S2CID   122775546.
  2. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. & Stein, Clifford (2009). "Section 31.9: Integer factorization". Introduction to Algorithms (third ed.). Cambridge, MA: MIT Press. pp. 975–980. ISBN   978-0-262-03384-8. (this section discusses only Pollard's rho algorithm).
  3. Brent, Richard P. (1980). "An Improved Monte Carlo Factorization Algorithm". BIT. 20 (2): 176–184. doi:10.1007/BF01933190. S2CID   17181286.
  4. 1 2 Brent, R.P.; Pollard, J. M. (1981). "Factorization of the Eighth Fermat Number". Mathematics of Computation. 36 (154): 627–630. doi: 10.2307/2007666 . JSTOR   2007666.
  5. Galbraith, Steven D. (2012). "14.2.5 Towards a rigorous analysis of Pollard rho". Mathematics of Public Key Cryptography. Cambridge University Press. pp. 272–273. ISBN   9781107013926..

Further reading