Sample-rate conversion

Last updated

Sample-rate conversion, sampling-frequency conversion or resampling is the process of changing the sampling rate or sampling frequency of a discrete signal to obtain a new discrete representation of the underlying continuous signal. [1] Application areas include image scaling [2] and audio/visual systems, where different sampling rates may be used for engineering, economic, or historical reasons.

Contents

For example, Compact Disc Digital Audio and Digital Audio Tape systems use different sampling rates, and American television, European television, and movies all use different frame rates. Sample-rate conversion prevents changes in speed and pitch that would otherwise occur when transferring recorded material between such systems.

More specific types of resampling include: upsampling or upscaling; downsampling , downscaling, or decimation; and interpolation . The term multi-rate digital signal processing is sometimes used to refer to systems that incorporate sample-rate conversion.

Techniques

Conceptual approaches to sample-rate conversion include: converting to an analog continuous signal, then re-sampling at the new rate, or calculating the values of the new samples directly from the old samples. The latter approach is more satisfactory since it introduces less noise and distortion. [3] Two possible implementation methods are as follows:

  1. If the ratio of the two sample rates is (or can be approximated by) [upper-alpha 1] [4] a fixed rational number L/M: generate an intermediate signal by inserting L − 1 zeros between each of the original samples. Low-pass filter this signal at half of the lower of the two rates. Select every M-th sample from the filtered output, to obtain the result. [5]
  2. Treat the samples as geometric points and create any needed new points by interpolation. Choosing an interpolation method is a trade-off between implementation complexity and conversion quality (according to application requirements). Commonly used are: ZOH (for film/video frames), cubic (for image processing) and windowed sinc function (for audio).

The two methods are mathematically identical: picking an interpolation function in the second scheme is equivalent to picking the impulse response of the filter in the first scheme. Linear interpolation is equivalent to a triangular impulse response; windowed sinc approximates a brick-wall filter (it approaches the desirable brick-wall filter as the number of points increases). The length of the impulse response of the filter in method 1 corresponds to the number of points used in interpolation in method 2.

In method 1, a slow pre-computation (such as the Remez algorithm) can be used to obtain an optimal (per application requirements) filter design. Method 2 will work in more general cases, e.g. where the ratio of sample rates is not rational, or two real-time streams must be accommodated, or the sample rates are time-varying.

See decimation and upsampling for further information on sample-rate conversion filter design/implementation.

Examples

Film and television

The slow-scan TV signals from the Apollo Moon missions were converted to the conventional TV rates for the viewers at home. Digital interpolation schemes were not practical at that time, so analog conversion was used. This was based on a TV rate camera viewing a monitor displaying the Apollo slow-scan images. [6]

Movies (shot at 24 frames per second) are converted to television (roughly 50 or 60 fields [upper-alpha 2] per second). To convert a 24 frame/sec movie to 60 field/sec television, for example, alternate movie frames are shown 2 and 3 times, respectively. For 50 Hz systems such as PAL each frame is shown twice. Since 50 is not exactly 2×24, the movie will run 50/48 = 4% faster, and the audio pitch will be 4% higher, an effect known as PAL speed-up. This is often accepted for simplicity, but more complex methods are possible that preserve the running time and pitch. Every twelfth frame can be repeated 3 times rather than twice, or digital interpolation (see above) can be used in a video scaler.

Audio

Audio on Compact Disc has a sampling rate of 44.1 kHz; to transfer it to a digital medium that uses 48 kHz, method 1 above can be used with L = 160, M = 147 (since 48000/44100 = 160/147). [5] For the reverse conversion, the values of L and M are swapped. Per above, in both cases, the low-pass filter should be set to 22.05 kHz.

See also

Sample rate conversion in multiple dimensions:

Techniques and processing that may involve sample-rate conversion:

Techniques used in related processes:

Notes

  1. For example, the irrational ratio 21/12, corresponding to one equal-temperament semitone, might be approximated by 196/185 (roughly 0.99994 semitones).
  2. A field is half of an interlaced frame just the odd or even lines.

Related Research Articles

<span class="mw-page-title-main">Nyquist–Shannon sampling theorem</span> Sufficiency theorem for reconstructing signals from samples

The Nyquist–Shannon sampling theorem is an essential principle for digital signal processing linking the frequency range of a signal and the sample rate required to avoid a type of distortion called aliasing. The theorem states that the sample rate must be at least twice the bandwidth of the signal to avoid aliasing distortion. In practice, it is used to select band-limiting filters to keep aliasing distortion below an acceptable amount when an analog signal is sampled or when sample rates are changed within a digital signal processing function.

Time stretching is the process of changing the speed or duration of an audio signal without affecting its pitch. Pitch scaling is the opposite: the process of changing the pitch without affecting the speed. Pitch shift is pitch scaling implemented in an effects unit and intended for live performance. Pitch control is a simpler process which affects pitch and speed simultaneously by slowing down or speeding up a recording.

<span class="mw-page-title-main">Digital audio</span> Technology that records, stores, and reproduces sound

Digital audio is a representation of sound recorded in, or converted into, digital form. In digital audio, the sound wave of the audio signal is typically encoded as numerical samples in a continuous sequence. For example, in CD audio, samples are taken 44,100 times per second, each with 16-bit sample depth. Digital audio is also the name for the entire technology of sound recording and reproduction using audio signals that have been encoded in digital form. Following significant advances in digital audio technology during the 1970s and 1980s, it gradually replaced analog audio technology in many areas of audio engineering, record production and telecommunications in the 1990s and 2000s.

The Whittaker–Shannon interpolation formula or sinc interpolation is a method to construct a continuous-time bandlimited function from a sequence of real numbers. The formula dates back to the works of E. Borel in 1898, and E. T. Whittaker in 1915, and was cited from works of J. M. Whittaker in 1935, and in the formulation of the Nyquist–Shannon sampling theorem by Claude Shannon in 1949. It is also commonly called Shannon's interpolation formula and Whittaker's interpolation formula. E. T. Whittaker, who published it in 1915, called it the Cardinal series.

In digital signal processing, spatial anti-aliasing is a technique for minimizing the distortion artifacts (aliasing) when representing a high-resolution image at a lower resolution. Anti-aliasing is used in digital photography, computer graphics, digital audio, and many other applications.

<span class="mw-page-title-main">Aliasing</span> Signal processing effect

In signal processing and related disciplines, aliasing is the overlapping of frequency components resulting from a sample rate below the Nyquist frequency. This overlap results in distortion or artifacts when the signal is reconstructed from samples which causes the reconstructed signal to differ from the original continuous signal. Aliasing that occurs in signals sampled in time, for instance in digital audio or the stroboscopic effect, is referred to as temporal aliasing. Aliasing in spatially sampled signals is referred to as spatial aliasing.

<span class="mw-page-title-main">Telecine</span> Process for broadcasting content stored on film stock

Telecine is the process of transferring film into video and is performed in a color suite. The term is also used to refer to the equipment used in this post-production process.

<span class="mw-page-title-main">Sampling (signal processing)</span> Measurement of a signal at discrete time intervals

In signal processing, sampling is the reduction of a continuous-time signal to a discrete-time signal. A common example is the conversion of a sound wave to a sequence of "samples". A sample is a value of the signal at a point in time and/or space; this definition differs from the term's usage in statistics, which refers to a set of such values.

<span class="mw-page-title-main">Undersampling</span> Signal processing sample technique

In signal processing, undersampling or bandpass sampling is a technique where one samples a bandpass-filtered signal at a sample rate below its Nyquist rate, but is still able to reconstruct the signal.

In digital signal processing, downsampling, compression, and decimation are terms associated with the process of resampling in a multi-rate digital signal processing system. Both downsampling and decimation can be synonymous with compression, or they can describe an entire process of bandwidth reduction (filtering) and sample-rate reduction. When the process is performed on a sequence of samples of a signal or a continuous function, it produces an approximation of the sequence that would have been obtained by sampling the signal at a lower rate.

In digital signal processing, upsampling, expansion, and interpolation are terms associated with the process of resampling in a multi-rate digital signal processing system. Upsampling can be synonymous with expansion, or it can describe an entire process of expansion and filtering (interpolation). When upsampling is performed on a sequence of samples of a signal or other continuous function, it produces an approximation of the sequence that would have been obtained by sampling the signal at a higher rate. For example, if compact disc audio at 44,100 samples/second is upsampled by a factor of 5/4, the resulting sample-rate is 55,125.

<span class="mw-page-title-main">576i</span> Standard-definition video mode

576i is a standard-definition digital video mode, originally used for digitizing analog television in most countries of the world where the utility frequency for electric power distribution is 50 Hz. Because of its close association with the legacy colour encoding systems, it is often referred to as PAL, PAL/SECAM or SECAM when compared to its 60 Hz NTSC-colour-encoded counterpart, 480i.

In signal processing, oversampling is the process of sampling a signal at a sampling frequency significantly higher than the Nyquist rate. Theoretically, a bandwidth-limited signal can be perfectly reconstructed if sampled at the Nyquist rate or above it. The Nyquist rate is defined as twice the bandwidth of the signal. Oversampling is capable of improving resolution and signal-to-noise ratio, and can be helpful in avoiding aliasing and phase distortion by relaxing anti-aliasing filter performance requirements.

<span class="mw-page-title-main">Delta-sigma modulation</span> Method for converting signals between digital and analog

Delta-sigma modulation is an oversampling method for encoding signals into low bit depth digital signals at a very high sample-frequency as part of the process of delta-sigma analog-to-digital converter (ADCs) and digital-to-analog converters (DACs). Delta-sigma modulation achieves high quality by utilizing a negative feedback loop during quantization to the lower bit depth that continuously corrects quantization errors and moves quantization noise to higher frequencies well above the original signal's bandwidth. Subsequent low-pass filtering for demodulation easily removes this high frequency noise and time averages to achieve high accuracy in amplitude.

In a mixed-signal system, a reconstruction filter, sometimes called an anti-imaging filter, is used to construct a smooth analog signal from a digital input, as in the case of a digital to analog converter (DAC) or other sampled data output device.

Resampling may refer to:

<span class="mw-page-title-main">Lanczos resampling</span> Application of a mathematical formula

Lanczos filtering and Lanczos resampling are two applications of a mathematical formula. It can be used as a low-pass filter or used to smoothly interpolate the value of a digital signal between its samples. In the latter case, it maps each sample of the given signal to a translated and scaled copy of the Lanczos kernel, which is a sinc function windowed by the central lobe of a second, longer, sinc function. The sum of these translated and scaled kernels is then evaluated at the desired points.

<span class="mw-page-title-main">Image scaling</span> Changing the resolution of a digital image

In computer graphics and digital imaging, imagescaling refers to the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement.

The zero-order hold (ZOH) is a mathematical model of the practical signal reconstruction done by a conventional digital-to-analog converter (DAC). That is, it describes the effect of converting a discrete-time signal to a continuous-time signal by holding each sample value for one sample interval. It has several applications in electrical communication.

Television standards conversion is the process of changing a television transmission or recording from one video system to another. Converting video between different numbers of lines, frame rates, and color models in video pictures is a complex technical problem. However, the international exchange of television programming makes standards conversion necessary so that video may be viewed in another nation with a differing standard. Typically video is fed into video standards converter which produces a copy according to a different video standard. One of the most common conversions is between the NTSC and PAL standards.

References

  1. Oppenheim, Alan V.; Schafer, Ronald W.; Buck, John R. (1999). Discrete-time signal processing (2nd ed.). Upper Saddle River, N.J.: Prentice Hall. ISBN   0-13-754920-2.
  2. Lyons, Richard (2010). "10. Sample Rate Conversion". Understanding Digital Signal Processing. Prentice Hall. ISBN   978-0137027415. In satellite and medical image processing, sample rate conversion is necessary for image enhancement, scale change, and image rotation
  3. Antoniou, Andreas (2006). Digital Signal Processing. McGraw-Hill. p. 830. ISBN   0-07-145424-1. An alternative and more satisfactory approach...
  4. Partch, Harry (2009). Genesis of a Music (Second ed.). Da Capo Press. p. 101. ISBN   978-0306801068. The ratios of the "equal semitone" are, with progressive accuracy: 18/17, 89/84, 196/185
  5. 1 2 Rajamani, K.; Yhean-Sen Lai; Furrow, C. W. (2000). "An efficient algorithm for sample rate conversion from CD to DAT" (PDF). IEEE Signal Processing Letters. 7 (10): 288. doi:10.1109/97.870683.
  6. Bill Wood (2005), Apollo Television (PDF), NASA, p. 5

Further reading