Digital audio

Last updated
Audio levels display on a digital audio recorder (Zoom H4n) Zoom H4n audio recording levels.jpg
Audio levels display on a digital audio recorder (Zoom H4n)

Digital audio is sound that has been recorded in, or converted into, digital form. In digital audio, the sound wave of the audio signal is encoded as numerical samples in continuous sequence. For example, in CD audio, samples are taken 44100 times per second each with 16 bit sample depth. Digital audio is also the name for the entire technology of sound recording and reproduction using audio signals that have been encoded in digital form. Following significant advances in digital audio technology during the 1970s, it gradually replaced analog audio technology in many areas of audio engineering and telecommunications in the 1990s and 2000s.

Contents

In a digital audio system, an analog electrical signal representing the sound is converted with an analog-to-digital converter (ADC) into a digital signal, typically using pulse-code modulation. This digital signal can then be recorded, edited, modified, and copied using computers, audio playback machines, and other digital tools. When the sound engineer wishes to listen to the recording on headphones or loudspeakers (or when a consumer wishes to listen to a digital sound file), a digital-to-analog converter (DAC) performs the reverse process, converting a digital signal back into an analog signal, which is then sent through an audio power amplifier and ultimately to a loudspeaker.

Digital audio systems may include compression, storage, processing, and transmission components. Conversion to a digital format allows convenient manipulation, storage, transmission, and retrieval of an audio signal. Unlike analog audio, in which making copies of a recording results in generation loss and degradation of signal quality, digital audio allows an infinite number of copies to be made without any degradation of signal quality.

Overview

A sound wave, in red, represented digitally, in blue (after sampling and 4-bit quantization). 4-bit-linear-PCM.svg
A sound wave, in red, represented digitally, in blue (after sampling and 4-bit quantization).

Digital audio technologies are used in the recording, manipulation, mass-production, and distribution of sound, including recordings of songs, instrumental pieces, podcasts, sound effects, and other sounds. Modern online music distribution depends on digital recording and data compression. The availability of music as data files, rather than as physical objects, has significantly reduced the costs of distribution. [1] Before digital audio, the music industry distributed and sold music by selling physical copies in the form of records and cassette tapes. With digital-audio and online distribution systems such as iTunes, companies sell digital sound files to consumers, which the consumer receives over the Internet.

An analog audio system converts physical waveforms of sound into electrical representations of those waveforms by use of a transducer, such as a microphone. The sounds are then stored on an analog medium such as magnetic tape, or transmitted through an analog medium such as a telephone line or radio. The process is reversed for reproduction: the electrical audio signal is amplified and then converted back into physical waveforms via a loudspeaker. Analog audio retains its fundamental wave-like characteristics throughout its storage, transformation, duplication, and amplification.

Analog audio signals are susceptible to noise and distortion, due to the innate characteristics of electronic circuits and associated devices. Disturbances in a digital system do not result in error unless the disturbance is so large as to result in a symbol being misinterpreted as another symbol or disturb the sequence of symbols. It is therefore generally possible to have an entirely error-free digital audio system in which no noise or distortion is introduced between conversion to digital format, and conversion back to analog.

A digital audio signal may optionally be encoded for correction of any errors that might occur in the storage or transmission of the signal. This technique, known as channel coding, is essential for broadcast or recorded digital systems to maintain bit accuracy. Eight-to-fourteen modulation is a channel code used in the audio compact disc (CD).

Conversion process

The lifecycle of sound from its source, through an ADC, digital processing, a DAC, and finally as sound again. A-D-A Flow.svg
The lifecycle of sound from its source, through an ADC, digital processing, a DAC, and finally as sound again.

A digital audio system starts with an ADC that converts an analog signal to a digital signal. [note 1] The ADC runs at a specified sampling rate and converts at a known bit resolution. CD audio, for example, has a sampling rate of 44.1  kHz (44,100 samples per second), and has 16-bit resolution for each stereo channel. Analog signals that have not already been bandlimited must be passed through an anti-aliasing filter before conversion, to prevent the aliasing distortion that is caused by audio signals with frequencies higher than the Nyquist frequency (half the sampling rate).

A digital audio signal may be stored or transmitted. Digital audio can be stored on a CD, a digital audio player, a hard drive, a USB flash drive, or any other digital data storage device. The digital signal may be altered through digital signal processing, where it may be filtered or have effects applied. Sample-rate conversion including upsampling and downsampling may be used to conform signals that have been encoded with a different sampling rate to a common sampling rate prior to processing. Audio data compression techniques, such as MP3, Advanced Audio Coding, Ogg Vorbis, or FLAC, are commonly employed to reduce the file size. Digital audio can be carried over digital audio interfaces such as AES3 or MADI. Digital audio can be carried over a network using audio over Ethernet, audio over IP or other streaming media standards and systems.

For playback, digital audio must be converted back to an analog signal with a DAC which may use oversampling.

History

Coding

Pulse-code modulation (PCM) was invented by British scientist Alec Reeves in 1937. [2] In 1950, C. Chapin Cutler of Bell Labs filed the patent on differential pulse-code modulation (DPCM), [3] a data compression algorithm. Adaptive DPCM (ADPCM) was introduced by P. Cummiskey, Nikil S. Jayant and James L. Flanagan at Bell Labs in 1973. [4] [5]

Perceptual coding was first used for speech coding compression, with linear predictive coding (LPC). [6] Initial concepts for LPC date back to the work of Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966. [7] During the 1970s, Bishnu S. Atal and Manfred R. Schroeder at Bell Labs developed a form of LPC called adaptive predictive coding (APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s with the code-excited linear prediction (CELP) algorithm. [6]

Discrete cosine transform (DCT) coding, a lossy compression method first proposed by Nasir Ahmed in 1972, [8] [9] provided the basis for the modified discrete cosine transform (MDCT), which was developed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987. [10] The MDCT is the basis for most audio coding standards, such as Dolby Digital (AC-3), [11] MP3 (MPEG Layer III), [12] [6] Advanced Audio Coding (AAC), Windows Media Audio (WMA), and Vorbis (Ogg). [11]

Recording

PCM was used in telecommunications applications long before its first use in commercial broadcast and recording. Commercial digital recording was pioneered in Japan by NHK and Nippon Columbia and their Denon brand, in the 1960s. The first commercial digital recordings were released in 1971. [13]

The BBC also began to experiment with digital audio in the 1960s. By the early 1970s, it had developed a 2-channel recorder, and in 1972 it deployed a digital audio transmission system that linked their broadcast center to their remote transmitters. [13]

The first 16-bit PCM recording in the United States was made by Thomas Stockham at the Santa Fe Opera in 1976, on a Soundstream recorder. An improved version of the Soundstream system was used to produce several classical recordings by Telarc in 1978. The 3M digital multitrack recorder in development at the time was based on BBC technology. The first all-digital album recorded on this machine was Ry Cooder's Bop till You Drop in 1979. British record label Decca began development of its own 2-track digital audio recorders in 1978 and released the first European digital recording in 1979. [13]

Popular professional digital multitrack recorders produced by Sony/Studer (DASH) and Mitsubishi (ProDigi) in the early 1980s helped to bring about digital recording's acceptance by the major record companies. The 1982 introduction of the CD popularized digital audio with consumers. [13]

Telephony

The rapid development and wide adoption of pulse-code modulation (PCM) digital telephony was enabled by metal–oxide–semiconductor (MOS) switched capacitor (SC) circuit technology, developed in the early 1970s. [14] This led to the development of PCM codec-filter chips in the late 1970s. [14] [15] The silicon-gate CMOS (complementary MOS) PCM codec-filter chip, developed by David A. Hodges and W.C. Black in 1980, [14] has since been the industry standard for digital telephony. [14] [15] By the 1990s, telecommunication networks such as the public switched telephone network (PSTN) had been largely digitized with VLSI (very large-scale integration) CMOS PCM codec-filters, widely used in electronic switching systems for telephone exchanges, user-end modems and a range of digital transmission applications such as the integrated services digital network (ISDN), cordless telephones and cell phones. [15]

Technologies

Sony digital audio tape recorder PCM-7030 Sony PCM-7030 of DR 20111102a.jpg
Sony digital audio tape recorder PCM-7030

Digital audio is used in broadcasting of audio. Standard technologies include Digital audio broadcasting (DAB), Digital Radio Mondiale (DRM), HD Radio and In-band on-channel (IBOC).

Digital audio in recording applications is stored on audio-specific technologies including Compact disc (CD), Digital Audio Tape (DAT), Digital Compact Cassette (DCC) and MiniDisc. Digital audio may be stored in a standard audio file formats and stored on a Hard disk recorder, Blu-ray or DVD-Audio. Files may be played back on smartphones, computers or MP3 player.

Interfaces

Digital-audio-specific interfaces include:

Several interfaces are engineered to carry digital video and audio together, including HDMI and DisplayPort.

In professional architectural or installation applications, many digital audio audio over Ethernet protocols and interfaces exist.

See also

Notes

  1. Some audio signals such as those created by digital synthesis originate entirely in the digital domain, in which case analog to digital conversion does not take place.

Related Research Articles

A codec is a device or computer program which encodes or decodes a digital data stream or signal. Codec is a portmanteau of coder-decoder.

In signal processing, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

Lossy compression data compression approach that reduces data size while discarding or channing some of it

In information technology, lossy compression or irreversible compression is the class of data encoding methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. The different versions of the photo of the cat to the right show how higher degrees of approximation create coarser images as more details are removed. This is opposed to lossless data compression which does not degrade the data. The amount of data reduction possible using lossy compression is much higher than through lossless techniques.

Speech coding is an application of data compression of digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting modeled parameters in a compact bitstream.

Telephony is the field of technology involving the development, application, and deployment of telecommunication services for the purpose of electronic transmission of voice, fax, or data, between distant parties. The history of telephony is intimately linked to the invention and development of the telephone.

An audio codec is a codec that encodes or decodes audio. In software, an audio codec is a computer program implementing an algorithm that compresses and decompresses digital audio data according to a given audio file or streaming media audio coding format. The objective of the algorithm is to represent the high-fidelity audio signal with minimum number of bits while retaining quality. This can effectively reduce the storage space and the bandwidth required for transmission of the stored audio file. Most software codecs are implemented as libraries which interface to one or more multimedia players. Most modern audio compression algorithms are based on modified discrete cosine transform (MDCT) coding and linear predictive coding (LPC).

Dolby Digital, also known as Dolby AC-3, is the name for audio compression technologies developed by Dolby Laboratories. Originally named Dolby Stereo Digital until 1995, except for Dolby TrueHD, the audio compression is lossy, based on the modified discrete cosine transform (MDCT) algorithm. The first use of Dolby Digital was to provide digital sound in cinemas from 35mm film prints; today, it is now also used for other applications such as TV broadcast, radio broadcast via satellite, digital video streaming, DVDs, Blu-ray discs and game consoles.

Sound can be recorded and stored and played using either digital or analog techniques. Both techniques introduce errors and distortions in the sound, and these methods can be systematically compared. Musicians and listeners have argued over the superiority of digital versus analog sound recordings. Arguments for analog systems include the absence of fundamental error mechanisms which are present in digital audio systems, including aliasing and quantization noise. Advocates of digital point to the high levels of performance possible with digital audio, including excellent linearity in the audible band and low levels of noise and distortion.

Sound quality

Sound quality is typically an assessment of the accuracy, fidelity, or intelligibility of audio output from an electronic device. Quality can be measured objectively, such as when tools are used to gauge the accuracy with which the device reproduces an original sound; or it can be measured subjectively, such as when human listeners respond to the sound or gauge its perceived similarity to another sound.

Direct Stream Digital system for digitally recreating audible signals

DSD Records (DSD) is a trademark used by Sony and Philips for their system of digitally recreating audible signals for the Super Audio CD (SACD).

Digital recording

In digital recording, audio signals picked up by a microphone or other transducer or video signals picked up by a camera or similar device are converted into a stream of discrete numbers, representing the changes over time in air pressure for audio, and chroma and luminance values for video, then recorded to a storage device. To play back a digital sound recording, the numbers are retrieved and converted back into their original analog waveforms so that they can be heard through a loudspeaker. To play back a digital video recording, the numbers are retrieved and converted back into their original analog waveforms so that they can be viewed on a video monitor, television or other display.

The digital sound revolution refers to the widespread adoption of digital audio technology in the computer industry beginning in the 1980s.

PCM adaptor a device that encodes digital audio as video

A PCM adaptor is a device that encodes digital audio as video for recording on a videocassette recorder. The adapter also has the ability to decode a video signal back to digital audio for playback. This digital audio system was used for mastering early compact discs.

Dolby Digital Plus, also known as Enhanced AC-3 is a digital audio compression scheme developed by Dolby Labs for transport and storage of multi-channel digital audio. It is a successor to Dolby Digital (AC-3), also developed by Dolby, and has a number of improvements including support for a wider range of data rates, increased channel count and multi-program support, and additional tools (algorithms) for representing compressed data and counteracting artifacts. While Dolby Digital (AC-3) supports up to five full-bandwidth audio channels at a maximum bitrate of 640 kbit/s, E-AC-3 supports up to 15 full-bandwidth audio channels at a maximum bitrate of 6.144 Mbit/s.

The dbx Model 700 Digital Audio Processor was a professional audio ADC/DAC combination unit, which digitized a stereo analog audio input into a bitstream, which was then encoded and encapsulated in an analog composite video signal, for recording to tape using a VCR as a transport. Unlike other similar pieces of equipment like the Sony PCM-F1, the Model 700 used a technique called Companded Predictive Delta Modulation, rather than the now-common pulse-code modulation. At the time of its introduction in the mid-1980s the device was the first commercial product to use this method, although it had been proposed in the 1960s and prototyped in the late '70s.

Adaptive differential pulse-code modulation (ADPCM) is a variant of differential pulse-code modulation (DPCM) that varies the size of the quantization step, to allow further reduction of the required data bandwidth for a given signal-to-noise ratio.

Sub-band coding

In signal processing, sub-band coding (SBC) is any form of transform coding that breaks a signal into a number of different frequency bands, typically by using a fast Fourier transform, and encodes each one independently. This decomposition is often the first step in data compression for audio and video signals.

Pulse-code modulation (PCM) is a method used to digitally represent sampled analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled regularly at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps.

Audio coding format digitally coded format for audio signals

An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, which is one of several different codecs which implements encoding and decoding audio in the MP3 audio coding format in software.

References

  1. Janssens, Jelle; Stijn Vandaele; Tom Vander Beken (2009). "The Music Industry on (the) Line? Surviving Music Piracy in a Digital Era". European Journal of Crime, Criminal Law and Criminal Justice. 77 (96): 77–96. doi:10.1163/157181709X429105. hdl:1854/LU-608677.
  2. Genius Unrecognised, BBC, 2011-03-27, retrieved 2011-03-30
  3. USpatent 2605361,C. Chapin Cutler,"Differential Quantization of Communication Signals",issued 1952-07-29
  4. P. Cummiskey, Nikil S. Jayant, and J. L. Flanagan, "Adaptive quantization in differential PCM coding of speech", Bell Syst. Tech. J., vol. 52, pp. 1105—1118, Sept. 1973
  5. Cummiskey, P.; Jayant, Nikil S.; Flanagan, J. L. (1973). "Adaptive quantization in differential PCM coding of speech". The Bell System Technical Journal. 52 (7): 1105–1118. doi:10.1002/j.1538-7305.1973.tb02007.x. ISSN   0005-8580.
  6. 1 2 3 Schroeder, Manfred R. (2014). "Bell Laboratories". Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder. Springer. p. 388. ISBN   9783319056609.
  7. Gray, Robert M. (2010). "A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol" (PDF). Found. Trends Signal Process. 3 (4): 203–303. doi:10.1561/2000000036. ISSN   1932-8346.
  8. Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing . 1 (1): 4–5. doi:10.1016/1051-2004(91)90086-Z.
  9. Nasir Ahmed; T. Natarajan; Kamisetty Ramamohan Rao (January 1974). "Discrete Cosine Transform" (PDF). IEEE Transactions on Computers. C-23 (1): 90–93. doi:10.1109/T-C.1974.223784.
  10. J. P. Princen, A. W. Johnson und A. B. Bradley: Subband/transform coding using filter bank designs based on time domain aliasing cancellation, IEEE Proc. Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2161–2164, 1987.
  11. 1 2 Luo, Fa-Long (2008). Mobile Multimedia Broadcasting Standards: Technology and Practice. Springer Science & Business Media. p. 590. ISBN   9780387782638.
  12. Guckert, John (Spring 2012). "The Use of FFT and MDCT in MP3 Audio Compression" (PDF). University of Utah . Retrieved 14 July 2019.
  13. 1 2 3 4 Fine, Thomas (2008). Barry R. Ashpole (ed.). "The Dawn of Commercial Digital Recording" (PDF). ARSC Journal. Retrieved 2010-05-02.
  14. 1 2 3 4 Allstot, David J. (2016). "Switched Capacitor Filters" (PDF). In Maloberti, Franco; Davies, Anthony C. (eds.). A Short History of Circuits and Systems: From Green, Mobile, Pervasive Networking to Big Data Computing. IEEE Circuits and Systems Society. pp. 105–110. ISBN   9788793609860.
  15. 1 2 3 Floyd, Michael D.; Hillman, Garth D. (8 October 2018) [1st pub. 2000]. "Pulse-Code Modulation Codec-Filters". The Communications Handbook (2nd ed.). CRC Press. pp. 26–1, 26–2, 26–3.

Further reading