Goertzel algorithm

Last updated

The Goertzel algorithm is a technique in digital signal processing (DSP) for efficient evaluation of the individual terms of the discrete Fourier transform (DFT). It is useful in certain practical applications, such as recognition of dual-tone multi-frequency signaling (DTMF) tones produced by the push buttons of the keypad of a traditional analog telephone. The algorithm was first described by Gerald Goertzel in 1958. [1]

Contents

Like the DFT, the Goertzel algorithm analyses one selectable frequency component from a discrete signal. [2] [3] [4] Unlike direct DFT calculations, the Goertzel algorithm applies a single real-valued coefficient at each iteration, using real-valued arithmetic for real-valued input sequences. For covering a full spectrum (except when using for continuous stream of data where coefficients are reused for subsequent calculations, which has computational complexity equivalent of sliding DFT), the Goertzel algorithm has a higher order of complexity than fast Fourier transform (FFT) algorithms, but for computing a small number of selected frequency components, it is more numerically efficient. The simple structure of the Goertzel algorithm makes it well suited to small processors and embedded applications.

The Goertzel algorithm can also be used "in reverse" as a sinusoid synthesis function, which requires only 1 multiplication and 1 subtraction per generated sample. [5]

The algorithm

The main calculation in the Goertzel algorithm has the form of a digital filter, and for this reason the algorithm is often called a Goertzel filter. The filter operates on an input sequence in a cascade of two stages with a parameter , giving the frequency to be analysed, normalised to radians per sample.

The first stage calculates an intermediate sequence, :

 

 

 

 

(1)

The second stage applies the following filter to , producing output sequence :

 

 

 

 

(2)

The first filter stage can be observed to be a second-order IIR filter with a direct-form structure. This particular structure has the property that its internal state variables equal the past output values from that stage. Input values for are presumed all equal to 0. To establish the initial filter state so that evaluation can begin at sample , the filter states are assigned initial values . To avoid aliasing hazards, frequency is often restricted to the range 0 to π (see Nyquist–Shannon sampling theorem); using a value outside this range is not meaningless, but is equivalent to using an aliased frequency inside this range, since the exponential function is periodic with a period of 2π in .

The second-stage filter can be observed to be a FIR filter, since its calculations do not use any of its past outputs.

Z-transform methods can be applied to study the properties of the filter cascade. The Z transform of the first filter stage given in equation (1) is

 

 

 

 

(3)

The Z transform of the second filter stage given in equation (2) is

 

 

 

 

(4)

The combined transfer function of the cascade of the two filter stages is then

 

 

 

 

(5)

This can be transformed back to an equivalent time-domain sequence, and the terms unrolled back to the first input term at index :[ citation needed ]

 

 

 

 

(6)

Numerical stability

It can be observed that the poles of the filter's Z transform are located at and , on a circle of unit radius centered on the origin of the complex Z-transform plane. This property indicates that the filter process is marginally stable and vulnerable to numerical-error accumulation when computed using low-precision arithmetic and long input sequences. [6] A numerically stable version was proposed by Christian Reinsch. [7]

DFT computations

For the important case of computing a DFT term, the following special restrictions are applied.

 

 

 

 

(7)

 

 

 

 

(8)

Making these substitutions into equation (6) and observing that the term , equation (6) then takes the following form:

 

 

 

 

(9)

We can observe that the right side of equation (9) is extremely similar to the defining formula for DFT term , the DFT term for index number , but not exactly the same. The summation shown in equation (9) requires input terms, but only input terms are available when evaluating a DFT. A simple but inelegant expedient is to extend the input sequence with one more artificial value . [8] We can see from equation (9) that the mathematical effect on the final result is the same as removing term from the summation, thus delivering the intended DFT value.

However, there is a more elegant approach that avoids the extra filter pass. From equation (1), we can note that when the extended input term is used in the final step,

 

 

 

 

(10)

Thus, the algorithm can be completed as follows:

The last two mathematical operations are simplified by combining them algebraically:

 

 

 

 

(11)

Note that stopping the filter updates at term and immediately applying equation (2) rather than equation (11) misses the final filter state updates, yielding a result with incorrect phase. [9]

The particular filtering structure chosen for the Goertzel algorithm is the key to its efficient DFT calculations. We can observe that only one output value is used for calculating the DFT, so calculations for all the other output terms are omitted. Since the FIR filter is not calculated, the IIR stage calculations , etc. can be discarded immediately after updating the first stage's internal state.

This seems to leave a paradox: to complete the algorithm, the FIR filter stage must be evaluated once using the final two outputs from the IIR filter stage, while for computational efficiency the IIR filter iteration discards its output values. This is where the properties of the direct-form filter structure are applied. The two internal state variables of the IIR filter provide the last two values of the IIR filter output, which are the terms required to evaluate the FIR filter stage.

Applications

Power-spectrum terms

Examining equation (6), a final IIR filter pass to calculate term using a supplemental input value applies a complex multiplier of magnitude 1 to the previous term . Consequently, and represent equivalent signal power. It is equally valid to apply equation (11) and calculate the signal power from term or to apply equation (2) and calculate the signal power from term . Both cases lead to the following expression for the signal power represented by DFT term :

 

 

 

 

(12)

In the pseudocode below, the real-valued input data is stored in the array x and the variables sprev and sprev2 temporarily store output history from the IIR filter. Nterms is the number of samples in the array, and Kterm corresponds to the frequency of interest, multiplied by the sampling period.

Nterms defined here Kterm selected here ω = 2 × π × Kterm / Nterms; coeff := 2 × cos(ω)  sprev := 0 sprev2 := 0 for each index n in range 0 to Nterms-1 do     s := x[n] + coeff × sprev - sprev2     sprev2 := sprev     sprev := s end  power := sprev2 + sprev22 - (coeff × sprev × sprev2)

It is possible [10] to organise the computations so that incoming samples are delivered singly to a software object that maintains the filter state between updates, with the final power result accessed after the other processing is done.

Single DFT term with real-valued arithmetic

The case of real-valued input data arises frequently, especially in embedded systems where the input streams result from direct measurements of physical processes. When the input data are real-valued, the filter internal state variables sprev and sprev2 can be observed also to be real-valued, consequently, no complex arithmetic is required in the first IIR stage. Optimizing for real-valued arithmetic typically is as simple as applying appropriate real-valued data types for the variables.

After the calculations using input term , and filter iterations are terminated, equation (11) must be applied to evaluate the DFT term. The final calculation uses complex-valued arithmetic, but this can be converted into real-valued arithmetic by separating real and imaginary terms:

 

 

 

 

(13)

Comparing to the power-spectrum application, the only difference are the calculation used to finish:

(Same IIR filter calculations as in the signal power implementation) XKreal = sprev * cr - sprev2; XKimag = sprev * ci; 

Phase detection

This application requires the same evaluation of DFT term , as discussed in the previous section, using a real-valued or complex-valued input stream. Then the signal phase can be evaluated as

 

 

 

 

(14)

taking appropriate precautions for singularities, quadrant, and so forth when computing the inverse tangent function.

Complex signals in real arithmetic

Since complex signals decompose linearly into real and imaginary parts, the Goertzel algorithm can be computed in real arithmetic separately over the sequence of real parts, yielding , and over the sequence of imaginary parts, yielding . After that, the two complex-valued partial results can be recombined:

 

 

 

 

(15)

Computational complexity

To compute a single DFT bin for a complex input sequence of length , the Goertzel algorithm requires multiplications and additions/subtractions within the loop, as well as 4 multiplications and 4 final additions/subtractions, for a total of multiplications and additions/subtractions. This is repeated for each of the frequencies.
This is harder to apply directly because it depends on the FFT algorithm used, but a typical example is a radix-2 FFT, which requires multiplications and additions/subtractions per DFT bin, for each of the bins.

In the complexity order expressions, when the number of calculated terms is smaller than , the advantage of the Goertzel algorithm is clear. But because FFT code is comparatively complex, the "cost per unit of work" factor is often larger for an FFT, and the practical advantage favours the Goertzel algorithm even for several times larger than .

As a rule-of-thumb for determining whether a radix-2 FFT or a Goertzel algorithm is more efficient, adjust the number of terms in the data set upward to the nearest exact power of 2, calling this , and the Goertzel algorithm is likely to be faster if

FFT implementations and processing platforms have a significant impact on the relative performance. Some FFT implementations [11] perform internal complex-number calculations to generate coefficients on-the-fly, significantly increasing their "cost K per unit of work." FFT and DFT algorithms can use tables of pre-computed coefficient values for better numerical efficiency, but this requires more accesses to coefficient values buffered in external memory, which can lead to increased cache contention that counters some of the numerical advantage.

Both algorithms gain approximately a factor of 2 efficiency when using real-valued rather than complex-valued input data. However, these gains are natural for the Goertzel algorithm but will not be achieved for the FFT without using certain algorithm variants [ which? ] specialised for transforming real-valued data.

See also

Related Research Articles

<span class="mw-page-title-main">Discrete Fourier transform</span> Type of Fourier transform in discrete mathematics

In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT (IDFT) is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous, and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.

Linear filters process time-varying input signals to produce output signals, subject to the constraint of linearity. In most cases these linear filters are also time invariant in which case they can be analyzed exactly using LTI system theory revealing their transfer functions in the frequency domain and their impulse responses in the time domain. Real-time implementations of such linear signal processing filters in the time domain are inevitably causal, an additional constraint on their transfer functions. An analog electronic circuit consisting only of linear components will necessarily fall in this category, as will comparable mechanical systems or digital signal processing systems containing only linear elements. Since linear time-invariant filters can be completely characterized by their response to sinusoids of different frequencies, they are sometimes known as frequency filters.

<span class="mw-page-title-main">Fast Fourier transform</span> O(N log N) discrete Fourier transform algorithm

A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain to a representation in the frequency domain and vice versa. The DFT is obtained by decomposing a sequence of values into components of different frequencies. This operation is useful in many fields, but computing it directly from the definition is often too slow to be practical. An FFT rapidly computes such transformations by factorizing the DFT matrix into a product of sparse factors. As a result, it manages to reduce the complexity of computing the DFT from , which arises if one simply applies the definition of DFT, to , where is the data size. The difference in speed can be enormous, especially for long data sets where N may be in the thousands or millions. In the presence of round-off error, many FFT algorithms are much more accurate than evaluating the DFT definition directly or indirectly. There are many different FFT algorithms based on a wide range of published theories, from simple complex-number arithmetic to group theory and number theory.

<span class="mw-page-title-main">Digital filter</span> Device for suppressing part of a discretely-sampled signal

In signal processing, a digital filter is a system that performs mathematical operations on a sampled, discrete-time signal to reduce or enhance certain aspects of that signal. This is in contrast to the other major type of electronic filter, the analog filter, which is typically an electronic circuit operating on continuous-time analog signals.

A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and data compression. It is used in most digital media, including digital images, digital video, digital audio, digital television, digital radio, and speech coding. DCTs are also important to numerous other applications in science and engineering, such as digital signal processing, telecommunication devices, reducing network bandwidth usage, and spectral methods for the numerical solution of partial differential equations.

A discrete Hartley transform (DHT) is a Fourier-related transform of discrete, periodic data similar to the discrete Fourier transform (DFT), with analogous applications in signal processing and related fields. Its main distinction from the DFT is that it transforms real inputs to real outputs, with no intrinsic involvement of complex numbers. Just as the DFT is the discrete analogue of the continuous Fourier transform (FT), the DHT is the discrete analogue of the continuous Hartley transform (HT), introduced by Ralph V. L. Hartley in 1942.

The modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries. As a result of these advantages, the MDCT is the most widely used lossy compression technique in audio data compression. It is employed in most modern audio coding standards, including MP3, Dolby Digital (AC-3), Vorbis (Ogg), Windows Media Audio (WMA), ATRAC, Cook, Advanced Audio Coding (AAC), High-Definition Coding (HDC), LDAC, Dolby AC-4, and MPEG-H 3D Audio, as well as speech coding standards such as AAC-LD (LD-MDCT), G.722.1, G.729.1, CELT, and Opus.

Rader's algorithm (1968), named for Charles M. Rader of MIT Lincoln Laboratory, is a fast Fourier transform (FFT) algorithm that computes the discrete Fourier transform (DFT) of prime sizes by re-expressing the DFT as a cyclic convolution.

The chirp Z-transform (CZT) is a generalization of the discrete Fourier transform (DFT). While the DFT samples the Z plane at uniformly-spaced points along the unit circle, the chirp Z-transform samples along spiral arcs in the Z-plane, corresponding to straight lines in the S plane. The DFT, real DFT, and zoom DFT can be calculated as special cases of the CZT.

The Cooley–Tukey algorithm, named after J. W. Cooley and John Tukey, is the most common fast Fourier transform (FFT) algorithm. It re-expresses the discrete Fourier transform (DFT) of an arbitrary composite size in terms of N1 smaller DFTs of sizes N2, recursively, to reduce the computation time to O(N log N) for highly composite N (smooth numbers). Because of the algorithm's importance, specific variants and implementation styles have become known by their own names, as described below.

In signal processing, a finite impulse response (FIR) filter is a filter whose impulse response is of finite duration, because it settles to zero in finite time. This is in contrast to infinite impulse response (IIR) filters, which may have internal feedback and may continue to respond indefinitely.

<span class="mw-page-title-main">Discrete wavelet transform</span> Transform in numerical harmonic analysis

In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.

In mathematics, the discrete-time Fourier transform (DTFT), also called the finite Fourier transform, is a form of Fourier analysis that is applicable to a sequence of values.

<span class="mw-page-title-main">Schönhage–Strassen algorithm</span> Multiplication algorithm

The Schönhage–Strassen algorithm is an asymptotically fast multiplication algorithm for large integers, published by Arnold Schönhage and Volker Strassen in 1971. It works by recursively applying number-theoretic transforms over the integers modulo 2n+1. The run-time bit complexity to multiply two n-digit numbers using the algorithm is in big O notation.

<span class="mw-page-title-main">Filter bank</span> Tool for Digital Signal Processing

In signal processing, a filter bank is an array of bandpass filters that separates the input signal into multiple components, each one carrying a single frequency sub-band of the original signal. One application of a filter bank is a graphic equalizer, which can attenuate the components differently and recombine them into a modified version of the original signal. The process of decomposition performed by the filter bank is called analysis ; the output of analysis is referred to as a subband signal with as many subbands as there are filters in the filter bank. The reconstruction process is called synthesis, meaning reconstitution of a complete signal resulting from the filtering process.

<span class="mw-page-title-main">Butterfly diagram</span>

In the context of fast Fourier transform algorithms, a butterfly is a portion of the computation that combines the results of smaller discrete Fourier transforms (DFTs) into a larger DFT, or vice versa. The name "butterfly" comes from the shape of the data-flow diagram in the radix-2 case, as described below. The earliest occurrence in print of the term is thought to be in a 1969 MIT technical report. The same structure can also be found in the Viterbi algorithm, used for finding the most likely sequence of hidden states.

<span class="mw-page-title-main">Constant-Q transform</span> Short-time Fourier transform with variable resolution

In mathematics and signal processing, the constant-Q transform and variable-Q transform, simply known as CQT and VQT, transforms a data series to the frequency domain. It is related to the Fourier transform and very closely related to the complex Morlet wavelet transform. Its design is suited for musical representation.

<span class="mw-page-title-main">Overlap–save method</span>

In signal processing, overlap–save is the traditional name for an efficient way to evaluate the discrete convolution between a very long signal and a finite impulse response (FIR) filter :

Similar to 1-D Digital signal processing in case of the Multidimensional signal processing we have Efficient algorithms. The efficiency of an Algorithm can be evaluated by the amount of computational resources it takes to compute output or the quantity of interest. In this page, two of the very efficient algorithms for multidimensional signals are explained. For the sake of simplicity and description it is explained for 2-D Signals. However, same theory holds good for M-D signals. The exact computational savings for each algorithm is also mentioned.

Parallelmultidimensionaldigitalsignal processing (mD-DSP) is defined as the application of parallel programming and multiprocessing to digital signal processing techniques to process digital signals that have more than a single dimension. The use of mD-DSP is fundamental to many application areas such as digital image and video processing, medical imaging, geophysical signal analysis, sonar, radar, lidar, array processing, computer vision, computational photography, and augmented and virtual reality. However, as the number of dimensions of a signal increases the computational complexity to operate on the signal increases rapidly. This relationship between the number of dimensions and the amount of complexity, related to both time and space, as studied in the field of algorithm analysis, is analogues to the concept of the curse of dimensionality. This large complexity generally results in an extremely long execution run-time of a given mD-DSP application rendering its usage to become impractical for many applications; especially for real-time applications. This long run-time is the primary motivation of applying parallel algorithmic techniques to mD-DSP problems.

References

  1. Goertzel, G. (January 1958), "An Algorithm for the Evaluation of Finite Trigonometric Series", American Mathematical Monthly, 65 (1): 34–35, doi:10.2307/2310304, JSTOR   2310304
  2. Mock, P. (March 21, 1985), "Add DTMF Generation and Decoding to DSP-μP Designs" (PDF), EDN, ISSN   0012-7515 ; also found in DSP Applications with the TMS320 Family, Vol. 1, Texas Instruments, 1989.
  3. Chen, Chiouguey J. (June 1996), Modified Goertzel Algorithm in DTMF Detection Using the TMS320C80 DSP (PDF), Application Report, Texas Instruments, SPRA066
  4. Schmer, Gunter (May 2000), DTMF Tone Generation and Detection: An Implementation Using the TMS320C54x (PDF), Application Report, Texas Instruments, SPRA096a
  5. Cheng, Eric; Hudak, Paul (January 2009), Audio Processing and Sound Synthesis in Haskell (PDF), archived from the original (PDF) on 2017-03-28
  6. Gentleman, W. M. (1 February 1969). "An error analysis of Goertzel's (Watt's) method for computing Fourier coefficients". The Computer Journal. 12 (2): 160–164. doi: 10.1093/comjnl/12.2.160 .
  7. Stoer, J.; Bulirsch, R. (2002), Introduction to Numerical Analysis, Springer, ISBN   9780387954523
  8. "Goertzel's Algorithm". Cnx.org. 2006-09-12. Retrieved 2014-02-03.
  9. "Electronic Engineering Times | Connecting the Global Electronics Community". EE Times. Retrieved 2014-02-03.
  10. Elmenreich, Wilfried (August 25, 2011). "Efficiently detecting a frequency using a Goertzel filter" . Retrieved 16 September 2014.
  11. Press; Flannery; Teukolsky; Vetterling (2007), "Chapter 12", Numerical Recipes, The Art of Scientific Computing, Cambridge University Press

Further reading