Sieve of Atkin

Last updated

In mathematics, the sieve of Atkin is a modern algorithm for finding all prime numbers up to a specified integer. Compared with the ancient sieve of Eratosthenes, which marks off multiples of primes, the sieve of Atkin does some preliminary work and then marks off multiples of squares of primes, thus achieving a better theoretical asymptotic complexity. It was created in 2003 by A. O. L. Atkin and Daniel J. Bernstein. [1]

Contents

Algorithm

In the algorithm:

The algorithm:

  1. Create a results list, filled with 2, 3, and 5.
  2. Create a sieve list with an entry for each positive integer; all entries of this list should initially be marked non prime (composite).
  3. For each entry number n in the sieve list, with modulo-sixty remainder r :
    1. If r is 1, 13, 17, 29, 37, 41, 49, or 53, flip the entry for each possible solution to 4x2 + y2 = n. The number of flipping operations as a ratio to the sieving range for this step approaches 4π/15 [1] × 8/60 (the "8" in the fraction comes from the eight modulos handled by this quadratic and the 60 because Atkin calculated this based on an even number of modulo 60 wheels), which results in a fraction of about 0.1117010721276....
    2. If r is 7, 19, 31, or 43, flip the entry for each possible solution to 3x2 + y2 = n. The number of flipping operations as a ratio to the sieving range for this step approaches π0.12 [1] × 4/60 (the "4" in the fraction comes from the four modulos handled by this quadratic and the 60 because Atkin calculated this based on an even number of modulo 60 wheels), which results in a fraction of about 0.072551974569....
    3. If r is 11, 23, 47, or 59, flip the entry for each possible solution to 3x2  y2 = n when x > y. The number of flipping operations as a ratio to the sieving range for this step approaches 1.92ln(0.5+1.5) [1] × 4/60 (the "4" in the fraction comes from the four modulos handled by this quadratic and the 60 because Atkin calculated this based on an even number of modulo 60 wheels), which results in a fraction of about 0.060827679704....
    4. If r is something else, ignore it completely.
  4. Start with the lowest number in the sieve list.
  5. Take the next number in the sieve list still marked prime.
  6. Include the number in the results list.
  7. Square the number and mark all multiples of that square as non prime. Note that the multiples that can be factored by 2, 3, or 5 need not be marked, as these will be ignored in the final enumeration of primes.
  8. Repeat steps four through seven. The total number of operations for these repetitions of marking the squares of primes as a ratio of the sieving range is the sum of the inverse of the primes squared, which approaches the prime zeta function(2) of 0.45224752004... minus 1/22, 1/32, and 1/52 for those primes which have been eliminated by the wheel, with the result multiplied by 16/60 for the ratio of wheel hits per range; this results in a ratio of about 0.01363637571....

Adding the above ratios of operations together, the above algorithm takes a constant ratio of flipping/marking operations to the sieving range of about 0.2587171021...; From an actual implementation of the algorithm, the ratio is about 0.25 for sieving ranges as low as 67.

Pseudocode

The following is pseudocode which combines Atkin's algorithms 3.1, 3.2, and 3.3 [1] by using a combined set "s" of all the numbers modulo 60 excluding those which are multiples of the prime numbers 2, 3, and 5, as per the algorithms, for a straightforward version of the algorithm that supports optional bit packing of the wheel; although not specifically mentioned in the referenced paper, this pseudocode eliminates some obvious combinations of odd/even x's/y's in order to reduce computation where those computations would never pass the modulo tests anyway (i.e. would produce even numbers, or multiples of 3 or 5):

limit1000000000// arbitrary search limit// set of wheel "hit" positions for a 2/3/5 wheel rolled twice as per the Atkin algorithms{1,7,11,13,17,19,23,29,31,37,41,43,47,49,53,59}// Initialize the sieve with enough wheels to include limit:forn60×w+xwherew{0,1,...,limit÷60},xs:is_prime(n)false// Put in candidate primes://   integers which have an odd number of//   representations by certain quadratic forms.// Algorithm step 3.1:fornlimit,n4+wherex{1,2,...}andy{1,3,...}// all x's odd y'sifnmod60{1,13,17,29,37,41,49,53}:is_prime(n)¬is_prime(n)// toggle state// Algorithm step 3.2:fornlimit,n3+wherex{1,3,...}andy{2,4,...}// only odd x'sifnmod60{7,19,31,43}:// and even y'sis_prime(n)¬is_prime(n)// toggle state// Algorithm step 3.3:fornlimit,n3-wherex{2,3,...}andy{x-1,x-3,...,1}//all even/oddifnmod60{11,23,47,59}:// odd/even combosis_prime(n)¬is_prime(n)// toggle state// Eliminate composites by sieving, only for those occurrences on the wheel:forlimit,n60×w+xwherew{0,1,...},xs,n7:ifis_prime(n):// n is prime, omit multiples of its square; this is sufficient // because square-free composites can't get on this listforclimit,c×(60×w+x)wherew{0,1,...},xs:is_prime(c)false// one sweep to produce a sequential list of primes up to limit:output2,3,5for7nlimit,n60×w+xwherew{0,1,...},xs:ifis_prime(n):outputn

This pseudocode is written for clarity; although some redundant computations have been eliminated by controlling the odd/even x/y combinations, it still wastes almost half of its quadratic computations on non-productive loops that don't pass the modulo tests such that it will not be faster than an equivalent wheel factorized (2/3/5) sieve of Eratosthenes. To improve its efficiency, a method must be devised to minimize or eliminate these non-productive computations.

Explanation

The algorithm completely ignores any numbers with remainder modulo 60 that is divisible by two, three, or five, since numbers with a modulo 60 remainder divisible by one of these three primes are themselves divisible by that prime.

All numbers n with modulo-sixty remainder 1, 13, 17, 29, 37, 41, 49, or 53 have a modulo-four remainder of 1. These numbers are prime if and only if the number of solutions to 4x2 + y2 = n is odd and the number is squarefree (proven as theorem 6.1 of [1] ).

All numbers n with modulo-sixty remainder 7, 19, 31, or 43 have a modulo-six remainder of 1. These numbers are prime if and only if the number of solutions to 3x2 + y2 = n is odd and the number is squarefree (proven as theorem 6.2 of [1] ).

All numbers n with modulo-sixty remainder 11, 23, 47, or 59 have a modulo-twelve remainder of 11. These numbers are prime if and only if the number of solutions to 3x2y2 = n is odd and the number is squarefree (proven as theorem 6.3 of [1] ).

None of the potential primes are divisible by 2, 3, or 5, so they can't be divisible by their squares. This is why squarefree checks don't include 22, 32, and 52.

Computational complexity

It can be computed [1] that the above series of three quadratic equation operations each have a number of operations that is a constant ratio of the range as the range goes to infinity; as well the prime square free culling operations can be described by the prime zeta function(2) with constant offsets and factors so it is also a constant factor of the range as the range goes to infinity. Therefore, the algorithm described above can compute primes up to N using O(N) operations with only O(N) bits of memory.

The page segmented version implemented by the authors has the same O(N) operations but reduces the memory requirement to just that required by the base primes below the square root of the range of O(N1/2/log N) bits of memory plus a minimal page buffer. This is slightly better performance with the same memory requirement as the page segmented sieve of Eratosthenes which uses O(N log log N) operations and the same O(N1/2/log N) bits of memory [2] plus a minimal page buffer. However, such a sieve does not outperform a Sieve of Eratosthenes with maximum practical wheel factorization (a combination of a 2/3/5/7 sieving wheel and pre-culling composites in the segment page buffers using a 2/3/5/7/11/13/17/19 pattern), which although it has slightly more operations than the Sieve of Atkin for very large but practical ranges, has a constant factor of less complexity per operation by about three times in comparing the per operation time between the algorithms implemented by Bernstein in CPU clock cycles per operation. The main problem with the Page Segmented Sieve of Atkin is the difficulty in implementing the "prime square free" culling sequences due to the span between culls rapidly growing far beyond the page buffer span; the time expended for this operation in Bernstein's implementation rapidly grows to many times the time expended in the actual quadratic equation calculations, meaning that the linear complexity of the part that would otherwise be quite negligible becomes a major consumer of execution time. Thus, even though an optimized implementation may again settle to a O(n) time complexity, this constant factor of increased time per operations means that the Sieve of Atkin is slower.

A special modified "enumerating lattice points" variation which is not the above version of the Sieve of Atkin can theoretically compute primes up to N using O(N/log log N) operations with N1/2 + o(1) bits of memory [1] but this variation is rarely implemented. That is a little better in performance at a very high cost in memory as compared to both the ordinary page segmented version and to an equivalent but rarely-implemented version of the sieve of Eratosthenes which uses O(N) operations and O(N1/2(log log N)/log N) bits of memory. [3] [4] [5]

Pritchard observed that for the wheel sieves, one can reduce memory consumption while preserving Big O time complexity, but this generally comes at a cost in an increased constant factor for time per operation due to the extra complexity. Therefore, this special version is likely more of value as an intellectual exercise than a practical prime sieve with reduced real time expended for a given large practical sieving range.

See also

Related Research Articles

Modular arithmetic Computation modulo a fixed integer

In mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" when reaching a certain value, called the modulus. The modern approach to modular arithmetic was developed by Carl Friedrich Gauss in his book Disquisitiones Arithmeticae, published in 1801.

Prime number Positive integer with exactly two divisors, 1 and itself

A prime number is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways of writing it as a product, 1 × 5 or 5 × 1, involve 5 itself. However, 4 is composite because it is a product in which both numbers are smaller than 4. Primes are central in number theory because of the fundamental theorem of arithmetic: every natural number greater than 1 is either a prime itself or can be factorized as a product of primes that is unique up to their order.

Gaussian integer Complex number whose real and imaginary parts are both integers

In number theory, a Gaussian integer is a complex number whose real and imaginary parts are both integers. The Gaussian integers, with ordinary addition and multiplication of complex numbers, form an integral domain, usually written as Z[i]. This integral domain is a particular case of a commutative ring of quadratic integers. It does not have a total ordering that respects arithmetic.

Sieve of Eratosthenes Ancient algorithm for generating prime numbers

In mathematics, the sieve of Eratosthenes is an ancient algorithm for finding all prime numbers up to any given limit.

In number theory, Euler's criterion is a formula for determining whether an integer is a quadratic residue modulo a prime. Precisely,

In number theory, the general number field sieve (GNFS) is the most efficient classical algorithm known for factoring integers larger than 10100. Heuristically, its complexity for factoring an integer n (consisting of ⌊log2n⌋ + 1 bits) is of the form

A primality test is an algorithm for determining whether an input number is prime. Among other fields of mathematics, it is used for cryptography. Unlike integer factorization, primality tests do not generally give prime factors, only stating whether the input number is prime or not. Factorization is thought to be a computationally difficult problem, whereas primality testing is comparatively easy. Some primality tests prove that a number is prime, while others like Miller–Rabin prove that a number is composite. Therefore, the latter might more accurately be called compositeness tests instead of primality tests.

In number theory, an integer q is called a quadratic residue modulo n if it is congruent to a perfect square modulo n; i.e., if there exists an integer x such that:

Trial division is the most laborious but easiest to understand of the integer factorization algorithms. The essential idea behind trial division tests to see if an integer n, the integer to be factored, can be divided by each number in turn that is less than n. For example, for the integer n = 12, the only numbers that divide it are 1, 2, 3, 4, 6, 12. Selecting only the largest powers of primes in this list gives that 12 = 3 × 4 = 3 × 22.

The quadratic sieve algorithm (QS) is an integer factorization algorithm and, in practice, the second fastest method known. It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve. It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties. It was invented by Carl Pomerance in 1981 as an improvement to Schroeppel's linear sieve.

In number theory, a branch of mathematics, the special number field sieve (SNFS) is a special-purpose integer factorization algorithm. The general number field sieve (GNFS) was derived from it.

In number theory, the Erdős–Straus conjecture states that for all integers n ≥ 2, the rational number 4/n can be expressed as the sum of three positive unit fractions. It is named after Paul Erdős and Ernst G. Straus, who formulated the conjecture in 1948; it is one of many conjectures by Erdős.

In computational number theory, a variety of algorithms make it possible to generate prime numbers efficiently. These are used in various applications, for example hashing, public-key cryptography, and search of prime factors in large numbers.

Schoof's algorithm is an efficient algorithm to count points on elliptic curves over finite fields. The algorithm has applications in elliptic curve cryptography where it is important to know the number of points to judge the difficulty of solving the discrete logarithm problem in the group of points on an elliptic curve.

The Tonelli–Shanks algorithm is used in modular arithmetic to solve for r in a congruence of the form r2n, where p is a prime: that is, to find a square root of n modulo p.

Wheel factorization

Wheel factorization is an improvement of the trial division method for integer factorization.

In mathematics, the sieve of Sundaram is a simple deterministic algorithm for finding all the prime numbers up to a specified integer. It was discovered by Indian mathematician S. P. Sundaram in 1934.

An important aspect in the study of elliptic curves is devising effective ways of counting points on the curve. There have been several approaches to do so, and the algorithms devised have proved to be useful tools in the study of various fields such as number theory, and more recently in cryptography and Digital Signature Authentication. While in number theory they have important consequences in the solving of Diophantine equations, with respect to cryptography, they enable us to make effective use of the difficulty of the discrete logarithm problem (DLP) for the group , of elliptic curves over a finite field , where q = pk and p is a prime. The DLP, as it has come to be known, is a widely used approach to public key cryptography, and the difficulty in solving this problem determines the level of security of the cryptosystem. This article covers algorithms to count points on elliptic curves over fields of large characteristic, in particular p > 3. For curves over fields of small characteristic more efficient algorithms based on p-adic methods exist.

In mathematics, elliptic curve primality testing techniques, or elliptic curve primality proving (ECPP), are among the quickest and most widely used methods in primality proving. It is an idea put forward by Shafi Goldwasser and Joe Kilian in 1986 and turned into an algorithm by A. O. L. Atkin the same year. The algorithm was altered and improved by several collaborators subsequently, and notably by Atkin and François Morain, in 1993. The concept of using elliptic curves in factorization had been developed by H. W. Lenstra in 1985, and the implications for its use in primality testing followed quickly.

References

  1. 1 2 3 4 5 6 7 8 9 10 A.O.L. Atkin, D.J. Bernstein, Prime sieves using binary quadratic forms, Math. Comp. 73 (2004), 1023-1030.
  2. Pritchard, Paul, "Linear prime-number sieves: a family tree," Sci. Comput. Programming9:1 (1987), pp. 17–35.
  3. Paul Pritchard, A sublinear additive sieve for finding prime numbers, Communications of the ACM 24 (1981), 18–23. MR 600730
  4. Paul Pritchard, Explaining the wheel sieve, Acta Informatica 17 (1982), 477–485. MR 685983
  5. Paul Pritchard, Fast compact prime number sieves (among others), Journal of Algorithms 4 (1983), 332–344. MR 729229