Noise shaping

Noise shaping is a technique typically used in digital audio, image, and video processing, usually in combination with dithering, as part of the process of quantization or bit-depth reduction of a signal. Its purpose is to increase the apparent signal-to-noise ratio of the resulting signal. It does this by altering the spectral shape of the error introduced by dithering and quantization, so that the noise power is lower in frequency bands where noise is more objectionable and correspondingly higher in bands where it is less objectionable. A popular noise shaping algorithm used in image processing is Floyd–Steinberg dithering, and many noise shaping algorithms used in audio processing are based on a model of the absolute threshold of hearing.

Operation

Any feedback loop functions as a filter. Noise shaping works by putting quantization noise in a feedback loop designed to filter the noise as desired.

Low-pass boxcar filter example

For example, consider the feedback system

$$y[n] = x[n] + b\,e[n-1],$$

where b is a constant, n is the cycle number, x[n] is the input sample value, y[n] is the value being quantized, and e[n] is its quantization error:

$$e[n] = y_{\text{quantized}}[n] - y[n].$$

In this model, when a sample's bit depth is reduced, the quantization error is measured and, on the next cycle, added to the next sample before quantization. The error that reaches the output is therefore e[n] + b·e[n−1]; for b = 1 this is the quantization error low-pass filtered by a 2-sample boxcar filter (also known as a simple moving average filter). As a result, the output error has lower power at higher frequencies and higher power at lower frequencies than the unshaped error. The filter's cutoff frequency can be adjusted by modifying b, the proportion of the previous sample's error that is fed back.
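The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `error_feedback_quantize`, the `step` parameter, the rounding quantizer and the random test signal are all assumptions made for the example.

```python
import numpy as np

def error_feedback_quantize(x, b=1.0, step=1.0):
    """Quantize x to multiples of `step`, feeding back b times the previous error."""
    out = np.empty(len(x))
    err = np.empty(len(x))
    prev_e = 0.0
    for n, sample in enumerate(x):
        y = sample + b * prev_e            # y[n] = x[n] + b*e[n-1]
        yq = step * np.round(y / step)     # quantizer: round to the nearest level
        prev_e = yq - y                    # e[n] = y_quantized[n] - y[n]
        out[n] = yq
        err[n] = prev_e
    return out, err

# Hypothetical test signal: wide-band random input, so e[n] is close to white.
rng = np.random.default_rng(0)
x = rng.uniform(-8.0, 8.0, size=4096)
out, e = error_feedback_quantize(x, b=1.0)

total_error = out - x                      # equals e[n] + b*e[n-1]
spectrum = np.abs(np.fft.rfft(total_error)) ** 2
half = len(spectrum) // 2
low = spectrum[1:half].sum()               # error power in the lower half of the band
high = spectrum[half:].sum()               # error power in the upper half of the band
print(low / high)
```

With b = 1 the printed ratio is well above 1, reflecting the low-pass shape of the output error. Setting b to a negative value moves the error power towards high frequencies instead.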

Impulse response filters in general

More generally, any FIR filter or IIR filter can be used to create a more complex frequency response curve. Such filters can be designed using the weighted least squares method. [1] In the case of digital audio, the weighting function typically used is one divided by the absolute threshold of hearing curve, i.e.

$$W(f) = \frac{1}{A(f)},$$

where A(f) is the absolute threshold of hearing at frequency f.
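A weighted least-squares design of this kind can be sketched as follows. Several things here are assumptions rather than part of the cited method: the threshold curve is a commonly used Terhardt-style approximation, the weight is taken as the inverse of that threshold expressed as power, the error filter is FIR with noise transfer function 1 + Σ h[k] z^(−k), and the names `threshold_of_hearing_db`, `design_noise_shaper` and `weighted_noise_power` are hypothetical.

```python
import numpy as np

def threshold_of_hearing_db(f_hz):
    """Approximate absolute threshold of hearing in dB SPL (assumed model)."""
    k = f_hz / 1000.0
    return 3.64 * k**-0.8 - 6.5 * np.exp(-0.6 * (k - 3.3) ** 2) + 1e-3 * k**4

def design_noise_shaper(order, fs=44100.0, n_grid=512):
    """Least-squares fit of taps h minimizing sum_f W(f) |1 + sum_k h[k] e^{-jwk}|^2."""
    f = np.linspace(20.0, fs / 2, n_grid)       # start above 0 Hz, where the curve diverges
    w = 2 * np.pi * f / fs
    weight = 10.0 ** (-threshold_of_hearing_db(f) / 10.0)   # W(f) = 1 / A(f), as power
    sqrt_w = np.sqrt(weight)
    k = np.arange(1, order + 1)
    E = np.exp(-1j * np.outer(w, k))            # one column per filter tap
    # Stack real and imaginary parts so that a real-valued solver can be used.
    A = np.vstack([(sqrt_w[:, None] * E).real, (sqrt_w[:, None] * E).imag])
    rhs = -np.concatenate([sqrt_w, np.zeros(n_grid)])
    h, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return h, f, weight

def weighted_noise_power(h, f, weight, fs=44100.0):
    """Perceptually weighted noise power for a given set of feedback taps."""
    w = 2 * np.pi * f / fs
    k = np.arange(1, len(h) + 1)
    ntf = 1.0 + np.exp(-1j * np.outer(w, k)) @ h
    return float(np.sum(weight * np.abs(ntf) ** 2))

h, f, weight = design_noise_shaper(order=5)
shaped = weighted_noise_power(h, f, weight)
flat = weighted_noise_power(np.zeros_like(h), f, weight)
print(shaped / flat)
```

The printed ratio compares the weighted noise power of the designed filter with that of no shaping at all (all taps zero). A practical design would also constrain the total noise gain, which this sketch does not attempt.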

Dithering

Adding an appropriate amount of dither during quantization prevents deterministic errors that are correlated with the signal. If dither is not used, noise shaping effectively functions merely as distortion shaping: it pushes the distortion energy around to different frequency bands, but it is still distortion. If dither is added to the process as

$$y[n] = x[n] + b\,e[n-1] + d[n],$$

where d[n] is the dither value added on cycle n, then the quantization error truly becomes noise, and the process indeed yields noise shaping.
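The effect of dither can be illustrated with a constant input that lies between two quantization levels. The sketch below assumes triangular (TPDF) dither spanning ±1 step, which the text does not specify, and uses a hypothetical function name `dithered_error_feedback`.

```python
import numpy as np

def dithered_error_feedback(x, b=0.0, step=1.0, rng=None, use_dither=True):
    """Error-feedback quantizer with optional TPDF dither added before quantization."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(len(x))
    prev_e = 0.0
    for n, sample in enumerate(x):
        # TPDF dither (assumed): sum of two uniform values, spanning +/- one step.
        d = step * (rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)) if use_dither else 0.0
        y = sample + b * prev_e + d        # dither enters alongside the fed-back error
        yq = step * np.round(y / step)
        prev_e = yq - y                    # error of the value that was quantized
        out[n] = yq
    return out

rng = np.random.default_rng(1)
x = np.full(50_000, 0.3)                   # constant input, 0.3 of a quantization step

plain = dithered_error_feedback(x, use_dither=False)
dithered = dithered_error_feedback(x, rng=rng)

print(plain.mean())      # every sample rounds to 0: the error is fully signal-dependent
print(dithered.mean())   # the average recovers a value close to the input level
```

Without dither and without feedback the output is stuck at the nearest level, so the error is a deterministic function of the input. With dither the individual samples are noisier, but their average tracks the input.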

In digital audio

Noise shaping in audio is most commonly applied as part of a bit-reduction scheme. The most basic form of dither is flat, white noise. At low levels, however, the ear is less sensitive to some frequencies than to others (see Equal-loudness contour). Noise shaping redistributes the quantization error so that more of it falls at frequencies where hearing is less sensitive and less of it falls at frequencies where hearing is more sensitive. The quantization error is thereby greatly reduced where the ear is most critical, at the cost of much greater noise where it is less sensitive. This can give a perceived noise reduction of 4 bits compared to straight dither. [2] So although 16-bit samples have only 96 dB of dynamic range across the entire spectrum (see quantization distortion calculations), noise-shaped dithering can increase the perceived audio dynamic range to 120 dB. [3]
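The figures quoted above are consistent with the usual rule of about 6.02 dB per bit. The short check below is only arithmetic on that rule, not a reconstruction of how the cited sources arrived at their numbers; `bits_to_db` is a hypothetical helper name.

```python
import math

def bits_to_db(bits):
    """Ratio between full scale and one quantization step, expressed in dB."""
    return 20.0 * math.log10(2.0 ** bits)

print(round(bits_to_db(16), 1))       # 96.3  (16-bit samples)
print(round(bits_to_db(4), 1))        # 24.1  (a 4-bit perceived improvement)
print(round(bits_to_db(16 + 4), 1))   # 120.4 (16 bits plus the 4-bit improvement)
```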

Noise shaping and 1-bit converters

Since around 1989, 1-bit delta-sigma modulators have been used in analog-to-digital converters. This involves sampling the audio at a very high rate (2.8224 million samples per second, for example) but only using a single bit. Because only 1 bit is used, this converter only has 6.02 dB of dynamic range. The noise floor, however, is spread throughout the entire non-aliased frequency range below the Nyquist frequency of 1.4112 MHz. Noise shaping is used to lower the noise present in the audible range (20 Hz to 20 kHz) and increase the noise above the audible range. This results in a broadband dynamic range of only 7.78 dB, but it is not consistent among frequency bands, and in the lowest frequencies (the audible range) the dynamic range is much greater — over 100 dB. Noise shaping is inherently built into the delta-sigma modulators.
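A first-order 1-bit modulator is enough to show how the noise is pushed out of the audible band. The sketch below is an idealized software model under stated assumptions (first-order loop, ideal integrator, 1 kHz test tone at half of full scale); real converters use higher-order loops, and the name `delta_sigma_1bit` is hypothetical.

```python
import numpy as np

def delta_sigma_1bit(x):
    """First-order 1-bit modulator: integrate (input - previous output), keep the sign."""
    out = np.empty(len(x))
    v = 0.0       # integrator state
    prev = 0.0    # previous 1-bit output (+1 or -1); zero before the first sample
    for n, sample in enumerate(x):
        v += sample - prev
        prev = 1.0 if v >= 0.0 else -1.0
        out[n] = prev
    return out

fs = 2_822_400                               # sample rate quoted in the text
n = 1 << 16
t = np.arange(n) / fs
x = 0.5 * np.sin(2 * np.pi * 1000.0 * t)     # assumed test tone
bits = delta_sigma_1bit(x)

# Spectrum of the error between the 1-bit stream and the input.
noise = np.abs(np.fft.rfft(bits - x)) ** 2
freqs = np.fft.rfftfreq(n, d=1 / fs)
in_band = noise[freqs <= 20_000].sum() / noise.sum()
flat_share = 20_000 / (fs / 2)               # share the band would hold if noise were flat

print(in_band, flat_share)
```

If the noise were spread evenly up to the Nyquist frequency, the band up to 20 kHz would hold about 1.4% of it. In this sketch the in-band share comes out far smaller, which is the mechanism behind the large audible-band dynamic range described above. The 7.78 dB broadband figure matches the standard 6.02N + 1.76 dB expression for a full-scale sine with N = 1.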

The 1-bit converter is the basis of the DSD format by Sony. One criticism of the 1-bit converter (and thus the DSD system) is that because only 1 bit is used in both the signal and the feedback loop, adequate amounts of dither cannot be used in the feedback loop and distortion can be heard under some conditions (more discussion at Direct Stream Digital § DSD vs. PCM). [4] [5]

Most A/D converters made since 2000 use multi-bit or multi-level delta-sigma modulators that yield more than 1 bit output so that proper dither can be added in the feedback loop. For traditional PCM sampling the signal is then decimated to 44.1 kHz or other appropriate sample rates.

In modern ADCs

Analog Devices uses what it calls a "Noise Shaping Requantizer" [6] and Texas Instruments uses what it calls "SNRBoost" [7] [8] to lower the noise floor by approximately 30 dB relative to the surrounding frequencies. This comes at the cost of non-continuous operation but produces a bathtub-shaped noise floor in the spectrum. It can be combined with other techniques such as Bit-Boost to further enhance the resolution of the spectrum.

References

  1. Verhelst, Werner; De Koning, Dreten (24 October 2001). Noise shaping filter design for minimally audible signal requantization. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. IEEE.
  2. Gerzon, Michael; Peter Craven; Robert Stuart; Rhonda Wilson (16–19 March 1993). Psychoacoustic Noise Shaped Improvements in CD and Other Linear Digital Media. 94th Convention of the Audio Engineering Society, Berlin. AES. Preprint 3501.
  3. "24/192 Music Downloads are Very Silly Indeed". xiph.org. Retrieved 2015-08-01.
  4. Lipshitz, Stanley P.; Vanderkooy, John (2000-09-22). "Why Professional 1-Bit Sigma-Delta Conversion is a Bad Idea" (PDF). Archived from the original (PDF) on 2022-11-02.
  5. Lipshitz, Stanley P.; Vanderkooy, John (2001-05-12). "Why 1-Bit Sigma-Delta Conversion is Unsuitable for High-Quality Applications" (PDF). Archived (PDF) from the original on 2023-04-30. Retrieved 2023-08-28.
  6. AD6677 80 MHz Bandwidth IF Receiver (on Page 23)
  7. Using Windowing With SNRBoost3G Technology (PDF)
  8. Understanding Low-Amplitude Behavior of 11-bit ADCs (PDF)