Irrational base discrete weighted transform

Last updated November 07, 2025

In mathematics, the irrational base discrete weighted transform (IBDWT) is a variant of the fast Fourier transform using an irrational base; it was developed by Richard Crandall (Reed College), Barry Fagin (Dartmouth College) and Joshua Doenias (NeXT Software)^[1] in the early 1990s using Mathematica. It enables a fast, practical implementation of large-number modular multiplication on modern computers, running asymptotically 2× faster than non-modular FFT multiplication.^[2]^[3]

Algorithm

The IBDWT method, as applied to the Lucas–Lehmer test for Mersenne primes (which requires repeated squaring modulo a Mersenne number $M_{p}=2^{p}-1$), is based on four key elements developed by Crandall and Fagin:^[4]

Balanced-radix representations: Digits may be signed (e.g., in the range $-W/2\leq x_{j}<W/2$), which reduces error bounds.^[4]
Variable-base digit representations: Each digit $x_{j}$ may have its own base $W_{j}$.^[4]
Weighted cyclic convolutions: The multiplication is performed using a discrete weighted transform (DWT).^[4]
Irrational numeric bases: The base used for the transform is irrational; with the digit positions and weights defined below, the effective base is $2^{p/N}$.^[4]
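As a small illustration of the first two elements, the following Python sketch splits an integer into signed digits whose bit widths vary with position, using digit positions $\lceil pj/N\rceil$ as in the algorithm described in this section. The function names (`bit_positions`, `to_balanced_digits`, `from_digits`) and the handling of the final carry are assumptions made for this example, not taken from any cited implementation.

```python
def bit_positions(p, n):
    """Bit offset ceil(p*j/n) of digit j, for j = 0..n (integer arithmetic only)."""
    return [-(-p * j // n) for j in range(n + 1)]


def to_balanced_digits(x, p, n):
    """Split 0 <= x < 2**p into n signed digits with variable bit widths."""
    pos = bit_positions(p, n)
    digits = []
    carry = 0
    for j in range(n):
        width = pos[j + 1] - pos[j]      # digit j holds this many bits
        base = 1 << width                # W_j for this digit
        d = ((x >> pos[j]) & (base - 1)) + carry
        carry = 0
        if d >= base // 2:               # move into the range [-W_j/2, W_j/2)
            d -= base
            carry = 1
        digits.append(d)
    # A carry out of the top digit has weight 2**p, which is 1 modulo 2**p - 1,
    # so it wraps around to digit 0. Simplification: digit 0 may then sit one
    # above its balanced range; this sketch does not re-balance it.
    digits[0] += carry
    return digits


def from_digits(digits, p, n):
    """Recombine the digits and reduce modulo the Mersenne number 2**p - 1."""
    pos = bit_positions(p, n)
    total = sum(d * (1 << pos[j]) for j, d in enumerate(digits))
    return total % ((1 << p) - 1)


p, n = 37, 4
x = 123456789012
print(bit_positions(p, n))               # [0, 10, 19, 28, 37]
print(to_balanced_digits(x, p, n))
print(from_digits(to_balanced_digits(x, p, n), p, n) == x)
```

With $p=37$ and $N=4$ the digits are 10, 9, 9 and 9 bits wide, so no single fixed base describes the representation.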

The IBDWT approach avoids the need to zero-pad the arrays and performs the multiplication modulo $M_{p}$ directly.^[4] The algorithm to compute the product $xy{\pmod {M_{p}}}$ is as follows:^[4]

Choose a run length (signal length) $N<p$.^[4]
Establish a variable-base representation for the numbers. For example, $x=\sum _{j=0}^{N-1}x_{j}2^{\lceil pj/N\rceil }$.^[4] Each digit typically holds 16 to 20 bits when double-precision floats are used.
Define a weight signal $a$ with components $a_{j}=2^{\lceil pj/N\rceil -pj/N}$, approximated by floats in the interval [1, 2).^[4]
Compute the forward DWT for both numbers: ${\mathcal {X}}\leftarrow DWT_{(N,a)}x$ and ${\mathcal {Y}}\leftarrow DWT_{(N,a)}y$. In practice this is computed with a standard DFT (such as an FFT) as $DWT_{(N,a)}x\triangleq DFT_{(N)}(ax)$.^[4]
Perform a component-wise product of the transformed arrays: ${\mathcal {Z}}\leftarrow {\mathcal {X}}{\mathcal {Y}}$.^[4]
Compute the inverse DWT: $z\leftarrow DWT_{(N,a)}^{-1}{\mathcal {Z}}$. This is computed as $z=a^{-1}DFT_{(N)}^{-1}({\mathcal {Z}})$.^[4]
Round the resulting components to the nearest integer: $z\leftarrow {\text{round}}(z)$, optionally checking that the roundoff error is no greater than 0.4 (a larger error indicates that too many bits have been packed into each digit).^[4]
Adjust the resulting digits $\{z_{n}\}$ to restore the variable-base radix representation. This step handles carries and borrows. Single-step partial carrying is sufficient.^[4]
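The steps above can be sketched in plain Python. This is a minimal illustration under several assumptions, not the code of any cited program: it uses unsigned rather than balanced digits, a simple recursive FFT written for the example (so $N$ must be a power of two), and it replaces the final carry step by recombining the rounded digits with Python's arbitrary-precision integers. The names `fft`, `ibdwt_mulmod` and `lucas_lehmer` are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 DFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ibdwt_mulmod(x, y, p, n):
    """Return x*y mod 2**p - 1 for 0 <= x, y < 2**p, with run length n < p."""
    assert n < p and n & (n - 1) == 0, "n must be a power of two below p"
    # Digit j starts at bit ceil(p*j/n); the weight is a_j = 2**(ceil(p*j/n) - p*j/n).
    pos = [-(-p * j // n) for j in range(n + 1)]
    weights = [2.0 ** (pos[j] - p * j / n) for j in range(n)]

    def digits(v):
        # Unsigned variable-width digits (a simplification; no balancing).
        return [(v >> pos[j]) & ((1 << (pos[j + 1] - pos[j])) - 1)
                for j in range(n)]

    # Forward DWT: an ordinary DFT applied to the weighted digits.
    X = fft([d * w for d, w in zip(digits(x), weights)])
    Y = fft([d * w for d, w in zip(digits(y), weights)])
    # Component-wise product in the transform domain.
    Z = [u * v for u, v in zip(X, Y)]
    # Inverse DWT: inverse DFT (normalized by n), then divide out the weights.
    raw = [c.real / n / w for c, w in zip(fft(Z, invert=True), weights)]
    rounded = [round(r) for r in raw]
    max_err = max(abs(r - k) for r, k in zip(raw, rounded))
    assert max_err < 0.4, "roundoff too large: use fewer bits per digit"
    # Shortcut in place of digit-wise carrying: recombine with big integers.
    total = sum(k << pos[j] for j, k in enumerate(rounded))
    return total % ((1 << p) - 1)


def lucas_lehmer(p, n):
    """Lucas-Lehmer test for 2**p - 1 (p an odd prime), squaring via ibdwt_mulmod."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (ibdwt_mulmod(s, s, p, n) - 2) % m
    return s == 0


m127 = (1 << 127) - 1
a, b = pow(3, 70), pow(5, 50)
print(ibdwt_mulmod(a, b, 127, 16) == a * b % m127)
print(lucas_lehmer(13, 4))
```

The cyclic convolution of length $N$ gives the product modulo $M_{p}$ without zero-padding because a digit product that wraps past position $N$ picks up a factor $2^{p}\equiv 1{\pmod {M_{p}}}$. The weights ensure that each output component is an integer before rounding.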

Applications

Double-precision IBDWT is used in the Great Internet Mersenne Prime Search's x86 client Prime95 to perform modular multiplication in the Lucas–Lehmer test and Fermat primality tests. The Prime95 IBDWT library gwnum is also used in programs such as PrimeGrid's LLR2 and PRST. It is chosen because x86 CPUs since the Pentium 4 have so much double-precision floating-point computing power that it is much faster to multiply numbers using IBDWT than to do so using a more straightforward integer FFT (NTT).

Double-precision IBDWT has also been ported to other CPU architectures in the form of Glucas. It has also been ported to GPUs in the form of CUDALucas, GPUowl, and PRPLL.^[4]

IBDWT can also be carried out in integer arithmetic, as a number-theoretic transform modulo 2⁶⁴-2³²+1. This approach was first demonstrated by Nick Craig-Wood in ARMPrime.^[5] It too has been ported to GPUs, providing an alternative for consumer GPUs with weak double-precision computing power but acceptable 32-bit integer power, especially Nvidia models from the 2020s that offer "1:1" or "1:2" 32-bit integer multiplication speed but only "1:64" double-precision speed relative to 32-bit floating-point.^[6]
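Two arithmetic facts about the modulus 2⁶⁴-2³²+1 can be checked directly in Python, as in the short sketch below. The constant name `P` and the helper `order_of_two` are assumptions made for this example; the source does not say which of these properties a particular implementation relies on.

```python
P = 2**64 - 2**32 + 1  # the modulus named in the text


def order_of_two(modulus, limit=1000):
    """Smallest k >= 1 with 2**k == 1 (mod modulus), searched up to limit."""
    value = 1
    for k in range(1, limit + 1):
        value = value * 2 % modulus
        if value == 1:
            return k
    return None


# P - 1 is divisible by 2**32, so roots of unity exist for every
# power-of-two transform length up to 2**32.
print((P - 1) % 2**32 == 0)

# 2**96 is congruent to -1 modulo P, so 2 is a root of unity of order 192;
# multiplying by such a root amounts to a bit shift plus a cheap reduction.
print(pow(2, 96, P) == P - 1)
print(order_of_two(P))
```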

Derived methods

Granger and Scott demonstrated using IBDWT-inspired "GRP (generalized repunit prime) multiplication" to accelerate elliptic curve cryptography over F(2⁵²¹-1), the prime field used by P-521. This is a Karatsuba-like technique featuring a cyclic convolution similar to IBDWT.^[3]

References

[1] Crandall, Richard (1997). "The Challenge of Large Numbers". Scientific American. 276 (2): 74–78. Bibcode:1997SciAm.276b..74C. doi:10.1038/scientificamerican0297-74. JSTOR 24993611. Retrieved 29 March 2023.
[2] "Mathematica Use of Renowned Computational Scientist and Author Richard Crandall". Wolfram Research. Retrieved 29 March 2023.
[3] Granger, Robert; Scott, Michael (March 31, 2015). Faster ECC over F(2^521−1) (PDF). PKC 2015; Granger, Robert; Scott, Michael (2014), Faster ECC over $\mathbb{F}_{2^{521}-1}$, retrieved 2025-11-03.
[4] Thall, Andrew. "Fast Mersenne Prime Testing on the GPU" (PDF). Retrieved 29 March 2023.
[5] Craig-Wood, Nick (4 October 2023). "ncw/iprime". GitHub.
[6] Gallot, Yves (21 September 2025). "galloty/marin". GitHub.

Richard Crandall, Barry Fagin: Discrete weighted transforms and large-integer arithmetic, Mathematics of Computation 62 (205): 305–324, January 1994
Richard Crandall: Topics in Advanced Scientific Computation, TELOS/Springer-Verlag


This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

