Split-radix FFT algorithm

The split-radix FFT is a fast Fourier transform (FFT) algorithm for computing the discrete Fourier transform (DFT), and was first described in an initially little-appreciated paper by R. Yavne (1968) and subsequently rediscovered simultaneously by various authors in 1984. (The name "split radix" was coined by two of these reinventors, P. Duhamel and H. Hollmann.) In particular, split radix is a variant of the Cooley–Tukey FFT algorithm that uses a blend of radices 2 and 4: it recursively expresses a DFT of length N in terms of one smaller DFT of length N/2 and two smaller DFTs of length N/4.

The split-radix FFT, along with its variations, long had the distinction of achieving the lowest published arithmetic operation count (total exact number of required real additions and multiplications) to compute a DFT of power-of-two sizes N. The arithmetic count of the original split-radix algorithm was improved upon in 2004 (with the initial gains made in unpublished work by J. Van Buskirk via hand optimization for N = 64), but it turns out that one can still achieve the new lowest count by a modification of split radix (Johnson and Frigo, 2007). Although the number of arithmetic operations is not the sole factor (or even necessarily the dominant factor) in determining the time required to compute a DFT on a computer, the question of the minimum possible count is of longstanding theoretical interest. (No tight lower bound on the operation count has currently been proven.)

The split-radix algorithm can only be applied when N is a multiple of 4, but since it breaks a DFT into smaller DFTs it can be combined with any other FFT algorithm as desired.

Split-radix decomposition

Recall that the DFT is defined by the formula:

$$X_k = \sum_{n=0}^{N-1} x_n \omega_N^{nk}$$

where $k$ is an integer ranging from $0$ to $N-1$ and $\omega_N$ denotes the primitive root of unity:

$$\omega_N = e^{-\frac{2\pi i}{N}},$$

and thus: $\omega_N^N = 1$.
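
For concreteness, the definition above can be evaluated directly in a few lines of code. The following Python sketch (the function name dft_naive is ours, not from any particular library) computes the sum exactly as written, at O(N^2) cost; it serves only as a reference point for the faster decomposition described next.

```python
import cmath

def dft_naive(x):
    """Direct O(N^2) evaluation of the DFT definition (reference sketch only)."""
    N = len(x)
    # omega_N = exp(-2*pi*i/N) is the primitive N-th root of unity from the text
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]
```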

The split-radix algorithm works by expressing this summation in terms of three smaller summations. (Here, we give the "decimation in time" version of the split-radix FFT; the dual decimation in frequency version is essentially just the reverse of these steps.)

First, a summation over the even indices $x_{2n_2}$. Second, a summation over the odd indices broken into two pieces: $x_{4n_4+1}$ and $x_{4n_4+3}$, according to whether the index is 1 or 3 modulo 4. Here, $n_m$ denotes an index that runs from 0 to $N/m - 1$. The resulting summations look like:

$$X_k = \sum_{n_2=0}^{N/2-1} x_{2n_2} \omega_{N/2}^{n_2 k} + \omega_N^k \sum_{n_4=0}^{N/4-1} x_{4n_4+1} \omega_{N/4}^{n_4 k} + \omega_N^{3k} \sum_{n_4=0}^{N/4-1} x_{4n_4+3} \omega_{N/4}^{n_4 k},$$

where we have used the fact that $\omega_N^{m n_m k} = \omega_{N/m}^{n_m k}$. These three sums correspond to portions of radix-2 (size N/2) and radix-4 (size N/4) Cooley–Tukey steps, respectively. (The underlying idea is that the even-index subtransform of radix-2 has no multiplicative factor in front of it, so it should be left as-is, while the odd-index subtransform of radix-2 benefits by combining a second recursive subdivision.)
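
To verify the identity used above, it suffices to expand the definition of the root of unity (a one-line check, shown here for completeness):

$$\omega_N^{m n_m k} = e^{-\frac{2\pi i}{N}\, m n_m k} = e^{-\frac{2\pi i}{N/m}\, n_m k} = \omega_{N/m}^{n_m k}.$$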

These smaller summations are now exactly DFTs of length N/2 and N/4, which can be performed recursively and then recombined.

More specifically, let $U_k$ denote the result of the DFT of length N/2 (for $k = 0, \ldots, N/2 - 1$), and let $Z_k$ and $Z'_k$ denote the results of the DFTs of length N/4 (for $k = 0, \ldots, N/4 - 1$). Then the output $X_k$ is simply:

$$X_k = U_k + \omega_N^k Z_k + \omega_N^{3k} Z'_k .$$

This, however, performs unnecessary calculations, since the outputs for $k \geq N/4$ turn out to share many calculations with those for $k < N/4$. In particular, if we add N/4 to k, the size-N/4 DFTs are not changed (because they are periodic in N/4), while the size-N/2 DFT is unchanged if we add N/2 to k. So, the only things that change are the $\omega_N^k$ and $\omega_N^{3k}$ terms, known as twiddle factors. Here, we use the identities:

$$\omega_N^{k+N/4} = -i\,\omega_N^k, \qquad \omega_N^{3(k+N/4)} = i\,\omega_N^{3k}$$

to finally arrive at:

$$X_k = U_k + \left(\omega_N^k Z_k + \omega_N^{3k} Z'_k\right),$$
$$X_{k+N/2} = U_k - \left(\omega_N^k Z_k + \omega_N^{3k} Z'_k\right),$$
$$X_{k+N/4} = U_{k+N/4} - i\left(\omega_N^k Z_k - \omega_N^{3k} Z'_k\right),$$
$$X_{k+3N/4} = U_{k+N/4} + i\left(\omega_N^k Z_k - \omega_N^{3k} Z'_k\right),$$

which gives all of the outputs $X_k$ if we let $k$ range from $0$ to $N/4 - 1$ in the above four expressions.
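
As an illustration, the recursion just described translates into the following compact Python sketch (our own unoptimized rendering of the decimation-in-time split-radix decomposition; the name split_radix_fft and the explicit twiddle computation inside the loop are choices made for readability, not speed).

```python
import cmath

def split_radix_fft(x):
    """Decimation-in-time split-radix FFT sketch for len(x) a power of two.

    Computes one half-size DFT (U) of the even-indexed samples and two
    quarter-size DFTs (Z, Z') of the samples with index 1 mod 4 and 3 mod 4,
    then recombines them with the four output expressions above.
    """
    N = len(x)
    if N == 1:                      # base case: the DFT is a copy
        return [x[0]]
    if N == 2:                      # base case: one addition and one subtraction
        return [x[0] + x[1], x[0] - x[1]]

    U = split_radix_fft(x[0::2])    # length-N/2 DFT of the even indices
    Z = split_radix_fft(x[1::4])    # length-N/4 DFT of indices 1 mod 4
    Zp = split_radix_fft(x[3::4])   # length-N/4 DFT of indices 3 mod 4

    X = [0j] * N
    for k in range(N // 4):
        w1 = cmath.exp(-2j * cmath.pi * k / N)       # omega_N^k
        w3 = cmath.exp(-2j * cmath.pi * 3 * k / N)   # omega_N^{3k}
        t_sum = w1 * Z[k] + w3 * Zp[k]
        t_diff = w1 * Z[k] - w3 * Zp[k]
        X[k]              = U[k] + t_sum
        X[k + N // 2]     = U[k] - t_sum
        X[k + N // 4]     = U[k + N // 4] - 1j * t_diff
        X[k + 3 * N // 4] = U[k + N // 4] + 1j * t_diff
    return X
```

Its output can be checked against the dft_naive reference above on power-of-two-length inputs; a production implementation would additionally precompute the twiddle factors and exploit the special cases discussed next.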

Notice that these expressions are arranged so that we need to combine the various DFT outputs by pairs of additions and subtractions, which are known as butterflies. In order to obtain the minimal operation count for this algorithm, one needs to take into account special cases for $k = 0$ (where the twiddle factors are unity) and for $k = N/8$ (where the twiddle factors are $(1 \mp i)/\sqrt{2}$ and can be multiplied more quickly); see, e.g. Sorensen et al. (1986). Multiplications by $\pm 1$ and $\pm i$ are ordinarily counted as free (all negations can be absorbed by converting additions into subtractions or vice versa).
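
As a one-line illustration of why the $k = N/8$ case is cheaper than a general twiddle multiplication: multiplying a complex number $a + bi$ by $(1 - i)/\sqrt{2}$ requires only two real multiplications (both by the single constant $1/\sqrt{2}$) and two real additions, versus the four multiplications and two additions (or three of each) of a general complex multiplication:

$$(a + bi)\,\frac{1 - i}{\sqrt{2}} \;=\; \frac{a + b}{\sqrt{2}} \;+\; \frac{b - a}{\sqrt{2}}\, i .$$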

This decomposition is performed recursively when N is a power of two. The base cases of the recursion are N=1, where the DFT is just a copy $X_0 = x_0$, and N=2, where the DFT is an addition $X_0 = x_0 + x_1$ and a subtraction $X_1 = x_0 - x_1$.

These considerations result in a count of $4 N \log_2 N - 6N + 8$ real additions and multiplications, for N > 1 a power of two. This count assumes that, for odd powers of 2, the leftover factor of 2 (after all the split-radix steps, which divide N by 4) is handled directly by the DFT definition (4 real additions and multiplications), or equivalently by a radix-2 Cooley–Tukey FFT step.
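
As a sanity check on this closed form, the same count can be obtained from the recurrence implied by the decomposition. The short Python sketch below uses our own per-butterfly tallies, derived from the counting conventions above (a general complex multiplication costs 6 real operations, a multiplication by $(1 \mp i)/\sqrt{2}$ costs 4, multiplications by unity and by $\pm i$ are free, and each complex addition or subtraction costs 2).

```python
def split_radix_flops(N):
    """Real additions + multiplications of the classic split-radix FFT for
    power-of-two N, via the recurrence T(N) = T(N/2) + 2 T(N/4) + butterflies."""
    if N == 1:
        return 0            # the DFT is just a copy
    if N == 2:
        return 4            # one complex addition and one complex subtraction
    if N == 4:
        butterflies = 12    # only the k = 0 butterfly exists (twiddles are unity)
    else:
        # k = 0 costs 12 real ops, k = N/8 costs 20 (twiddles (1 -/+ i)/sqrt(2)),
        # and each of the remaining N/4 - 2 values of k costs 24.
        butterflies = 12 + 20 + 24 * (N // 4 - 2)
    return split_radix_flops(N // 2) + 2 * split_radix_flops(N // 4) + butterflies

# Matches the closed form, e.g. split_radix_flops(64) == 4*64*6 - 6*64 + 8 == 1160.
```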

Related Research Articles

In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT (IDFT) is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous, and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.

A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain to a representation in the frequency domain and vice versa. The DFT is obtained by decomposing a sequence of values into components of different frequencies. This operation is useful in many fields, but computing it directly from the definition is often too slow to be practical. An FFT rapidly computes such transformations by factorizing the DFT matrix into a product of sparse factors. As a result, it manages to reduce the complexity of computing the DFT from $O(N^2)$, which arises if one simply applies the definition of DFT, to $O(N \log N)$, where $N$ is the data size. The difference in speed can be enormous, especially for long data sets where N may be in the thousands or millions. In the presence of round-off error, many FFT algorithms are much more accurate than evaluating the DFT definition directly or indirectly. There are many different FFT algorithms based on a wide range of published theories, from simple complex-number arithmetic to group theory and number theory.

A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and data compression. It is used in most digital media, including digital images, digital video, digital audio, digital television, digital radio, and speech coding. DCTs are also important to numerous other applications in science and engineering, such as digital signal processing, telecommunication devices, reducing network bandwidth usage, and spectral methods for the numerical solution of partial differential equations.

In computer science, divide and conquer is an algorithm design paradigm. A divide-and-conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.

A discrete Hartley transform (DHT) is a Fourier-related transform of discrete, periodic data similar to the discrete Fourier transform (DFT), with analogous applications in signal processing and related fields. Its main distinction from the DFT is that it transforms real inputs to real outputs, with no intrinsic involvement of complex numbers. Just as the DFT is the discrete analogue of the continuous Fourier transform (FT), the DHT is the discrete analogue of the continuous Hartley transform (HT), introduced by Ralph V. L. Hartley in 1942.

Rader's algorithm (1968), named for Charles M. Rader of MIT Lincoln Laboratory, is a fast Fourier transform (FFT) algorithm that computes the discrete Fourier transform (DFT) of prime sizes by re-expressing the DFT as a cyclic convolution.

The chirp Z-transform (CZT) is a generalization of the discrete Fourier transform (DFT). While the DFT samples the Z plane at uniformly-spaced points along the unit circle, the chirp Z-transform samples along spiral arcs in the Z-plane, corresponding to straight lines in the S plane. The DFT, real DFT, and zoom DFT can be calculated as special cases of the CZT.

The prime-factor algorithm (PFA), also called the Good–Thomas algorithm (1958/1963), is a fast Fourier transform (FFT) algorithm that re-expresses the discrete Fourier transform (DFT) of a size $N = N_1 N_2$ as a two-dimensional $N_1 \times N_2$ DFT, but only for the case where $N_1$ and $N_2$ are relatively prime. These smaller transforms of size $N_1$ and $N_2$ can then be evaluated by applying PFA recursively or by using some other FFT algorithm.

Bruun's algorithm is a fast Fourier transform (FFT) algorithm based on an unusual recursive polynomial-factorization approach, proposed for powers of two by G. Bruun in 1978 and generalized to arbitrary even composite sizes by H. Murakami in 1996. Because its operations involve only real coefficients until the last computation stage, it was initially proposed as a way to efficiently compute the discrete Fourier transform (DFT) of real data. Bruun's algorithm has not seen widespread use, however, as approaches based on the ordinary Cooley–Tukey FFT algorithm have been successfully adapted to real data with at least as much efficiency. Furthermore, there is evidence that Bruun's algorithm may be intrinsically less accurate than Cooley–Tukey in the face of finite numerical precision.

The Cooley–Tukey algorithm, named after J. W. Cooley and John Tukey, is the most common fast Fourier transform (FFT) algorithm. It re-expresses the discrete Fourier transform (DFT) of an arbitrary composite size $N = N_1 N_2$ in terms of $N_1$ smaller DFTs of sizes $N_2$, recursively, to reduce the computation time to O(N log N) for highly composite N (smooth numbers). Because of the algorithm's importance, specific variants and implementation styles have become known by their own names.

In signal processing, a finite impulse response (FIR) filter is a filter whose impulse response is of finite duration, because it settles to zero in finite time. This is in contrast to infinite impulse response (IIR) filters, which may have internal feedback and may continue to respond indefinitely.

In applied mathematics, a DFT matrix is an expression of a discrete Fourier transform (DFT) as a transformation matrix, which can be applied to a signal through matrix multiplication.

In mathematics, the discrete-time Fourier transform (DTFT), also called the finite Fourier transform, is a form of Fourier analysis that is applicable to a sequence of values.

The Schönhage–Strassen algorithm is an asymptotically fast multiplication algorithm for large integers. It was developed by Arnold Schönhage and Volker Strassen in 1971. The run-time bit complexity is, in big O notation, $O(n \log n \log\log n)$ for two n-digit numbers. The algorithm uses recursive fast Fourier transforms in rings with $2^n + 1$ elements, a specific type of number theoretic transform.

The Goertzel algorithm is a technique in digital signal processing (DSP) for efficient evaluation of the individual terms of the discrete Fourier transform (DFT). It is useful in certain practical applications, such as recognition of dual-tone multi-frequency signaling (DTMF) tones produced by the push buttons of the keypad of a traditional analog telephone. The algorithm was first described by Gerald Goertzel in 1958.

In the context of fast Fourier transform algorithms, a butterfly is a portion of the computation that combines the results of smaller discrete Fourier transforms (DFTs) into a larger DFT, or vice versa. The name "butterfly" comes from the shape of the data-flow diagram in the radix-2 case. The earliest occurrence in print of the term is thought to be in a 1969 MIT technical report. The same structure can also be found in the Viterbi algorithm, used for finding the most likely sequence of hidden states.

In mathematics, the discrete Fourier transform over a ring generalizes the discrete Fourier transform (DFT), of a function whose values are commonly complex numbers, over an arbitrary ring.

In quantum computing, the quantum Fourier transform (QFT) is a linear transformation on quantum bits, and is the quantum analogue of the discrete Fourier transform. The quantum Fourier transform is a part of many quantum algorithms, notably Shor's algorithm for factoring and computing the discrete logarithm, the quantum phase estimation algorithm for estimating the eigenvalues of a unitary operator, and algorithms for the hidden subgroup problem. The quantum Fourier transform was discovered by Don Coppersmith.

In mathematical analysis and applications, multidimensional transforms are used to analyze the frequency content of signals in a domain of two or more dimensions.

The vector-radix FFT algorithm is a multidimensional fast Fourier transform (FFT) algorithm, a generalization of the ordinary Cooley–Tukey FFT algorithm that divides the transform dimensions by arbitrary radices. It breaks a multidimensional (MD) discrete Fourier transform (DFT) down into successively smaller MD DFTs until, ultimately, only trivial MD DFTs need to be evaluated.

References