Audio signal processing

Last updated

Audio signal processing is a subfield of signal processing that is concerned with the electronic manipulation of audio signals. Audio signals are electronic representations of sound waveslongitudinal waves which travel through air, consisting of compressions and rarefactions. The energy contained in audio signals is typically measured in decibels. As audio signals may be represented in either digital or analog format, processing may occur in either domain. Analog processors operate directly on the electrical signal, while digital processors operate mathematically on its digital representation.

Signal processing models and analyzes data representations of physical events

Signal processing is an electrical engineering subfield that focuses on analysing, modifying and synthesizing signals such as sound, images and biological measurements. Signal processing techniques can be used to improve transmission, storage efficiency and subjective quality and to also emphasize or detect components of interest in a measured signal.

An audio signal is a representation of sound, typically using a level of electrical voltage for analog signals, and a series of binary numbers for digital signals. Audio signals have frequencies in the audio frequency range of roughly 20 to 20,000 Hz, which corresponds to the lower and upper limits of human hearing. Audio signals may be synthesized directly, or may originate at a transducer such as a microphone, musical instrument pickup, phonograph cartridge, or tape head. Loudspeakers or headphones convert an electrical audio signal back into sound.

Longitudinal wave waves in which the displacement of the medium is in the same direction as, or the opposite direction to, the direction of propagation of the wave

Longitudinal waves are waves in which the displacement of the medium is in the same direction as, or the opposite direction to, the direction of propagation of the wave. Mechanical longitudinal waves are also called compressional or compression waves, because they produce compression and rarefaction when traveling through a medium, and pressure waves, because they produce increases and decreases in pressure.



The motivation for audio signal processing began at the beginning of the 20th century with inventions like the telephone, phonograph, and radio that allowed for the transmission and storage of audio signals. Audio processing was necessary for early radio broadcasting, as there were many problems with studio-to-transmitter links. [1] The theory of signal processing and its application to audio was largely developed at Bell Labs in the mid 20th century. Claude Shannon and Harry Nyquist's early work on communication theory, sampling theory, and Pulse-code modulation laid the foundations for the field. In 1957, Max Mathews became the first person to synthesize audio from a computer, giving birth to computer music.

Telephone Telecommunications device

A telephone, or phone, is a telecommunications device that permits two or more users to conduct a conversation when they are too far apart to be heard directly. A telephone converts sound, typically and most efficiently the human voice, into electronic signals that are transmitted via cables and other communication channels to another telephone which reproduces the sound to the receiving user.

Phonograph Device for playback of acoustic sounds stored as deviations on a disk or cylinder

The phonograph is a device for the mechanical recording and reproduction of sound. In its later forms, it is also called a gramophone or, since the 1940s, a record player. The sound vibration waveforms are recorded as corresponding physical deviations of a spiral groove engraved, etched, incised, or impressed into the surface of a rotating cylinder or disc, called a "record". To recreate the sound, the surface is similarly rotated while a playback stylus traces the groove and is therefore vibrated by it, very faintly reproducing the recorded sound. In early acoustic phonographs, the stylus vibrated a diaphragm which produced sound waves which were coupled to the open air through a flaring horn, or directly to the listener's ears through stethoscope-type earphones.

Radio Technology of using radio waves to carry information

Radio is the technology of signaling and communicating using radio waves. Radio waves are electromagnetic waves of frequency between 30 hertz (Hz) and 300 gigahertz (GHz). They are generated by an electronic device called a transmitter connected to an antenna which radiates the waves, and received by a radio receiver connected to another antenna. Radio is very widely used in modern technology, in radio communication, radar, radio navigation, remote control, remote sensing and other applications. In radio communication, used in radio and television broadcasting, cell phones, two-way radios, wireless networking and satellite communication among numerous other uses, radio waves are used to carry information across space from a transmitter to a receiver, by modulating the radio signal in the transmitter. In radar, used to locate and track objects like aircraft, ships, spacecraft and missiles, a beam of radio waves emitted by a radar transmitter reflects off the target object, and the reflected waves reveal the object's location. In radio navigation systems such as GPS and VOR, a mobile receiver receives radio signals from navigational radio beacons whose position is known, and by precisely measuring the arrival time of the radio waves the receiver can calculate its position on Earth. In wireless radio remote control devices like drones, garage door openers, and keyless entry systems, radio signals transmitted from a controller device control the actions of a remote device.

Analog signals

An analog audio signal is a continuous signal represented by an electrical voltage or current that is “analogous” to the sound waves in the air. Analog signal processing then involves physically altering the continuous signal by changing the voltage or current or charge via electrical circuits.

Historically, before the advent of widespread digital technology, analog was the only method by which to manipulate a signal. Since that time, as computers and software have become more capable and affordable and digital signal processing has become the method of choice. However, in music applications, analog technology is often still desirable as it often produces nonlinear responses that are difficult to replicate with digital filters.

Digital signals

A digital representation expresses the audio waveform as a sequence of symbols, usually binary numbers. This permits signal processing using digital circuits such as digital signal processors, microprocessors and general-purpose computers. Most modern audio systems use a digital approach as the techniques of digital signal processing are much more powerful and efficient than analog domain signal processing. [2]

Digital signal processor specialized microprocessor optimized for digital signal processing

A digital signal processor (DSP) is a specialized microprocessor, with its architecture optimized for the operational needs of digital signal processing.

Microprocessor Computer processor contained on an integrated-circuit chip

A microprocessor is a computer processor that incorporates the functions of a central processing unit on a single integrated circuit (IC), or at most a few integrated circuits. The microprocessor is a multipurpose, clock driven, register based, digital integrated circuit that accepts binary data as input, processes it according to instructions stored in its memory and provides results as output. Microprocessors contain both combinational logic and sequential digital logic. Microprocessors operate on numbers and symbols represented in the binary number system.

A computer is a machine that can be instructed to carry out sequences of arithmetic or logical operations automatically via computer programming. Modern computers have the ability to follow generalized sets of operations, called programs. These programs enable computers to perform an extremely wide range of tasks. A "complete" computer including the hardware, the operating system, and peripheral equipment required and used for "full" operation can be referred to as a computer system. This term may as well be used for a group of computers that are connected and work together, in particular a computer network or computer cluster.

Application areas

Processing methods and application areas include storage, data compression, music information retrieval, speech processing, localization, acoustic detection, transmission, noise cancellation, acoustic fingerprinting, sound recognition, synthesis, and enhancement (e.g. equalization, filtering, level compression, echo and reverb removal or addition, etc.).

Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. MIR is a small but growing field of research with many real-world applications. Those involved in MIR may have a background in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these.

Speech processing is the study of speech signals and the processing methods of signals. The signals are usually processed in a digital representation, so speech processing can be regarded as a special case of digital signal processing, applied to speech signals. Aspects of speech processing includes the acquisition, manipulation, storage, transfer and output of speech signals. The input is called speech recognition and the output is called speech synthesis.

Acoustic location the general use of sound to locate objects

Acoustic location is the use of sound to determine the distance and direction of its source or reflector. Location can be done actively or passively, and can take place in gases, liquids, and in solids.

Audio broadcasting

Audio signal processing is used when broadcasting audio signals in order to enhance their fidelity or optimize for bandwidth or latency. In this domain, the most important audio processing takes place just before the transmitter. The audio processor here must prevent or minimize overmodulation, compensate for non-linear transmitters (a potential issue with medium wave and shortwave broadcasting), and adjust overall loudness to desired level.

Active noise control

Active noise control is a technique designed to reduce unwanted sound. By creating a signal that is identical to the unwanted noise but with the opposite polarity, the two signals cancel out due to destructive interference.

Audio synthesis

Audio synthesis is the electronic generation of audio signals. A musical instrument that accomplishes this is called a synthesizer. Synthesizers can either imitate sounds or generate new ones. Audio synthesis is also used to generate human speech using speech synthesis.

Audio effects

Audio effects are systems designed to alter how an audio signal sounds. Unprocessed audio is metaphorically referred to as dry, while processed audio is referred to as wet. [3]

See also

Related Research Articles

Digital signal processing (DSP) is the use of digital processing, such as by computers or more specialized digital signal processors, to perform a wide variety of signal processing operations. The signals processed in this manner are a sequence of numbers that represent samples of a continuous variable in a domain such as time, space, or frequency.

Effects unit electronic or digital device that alters how a musical instrument or other audio source sounds

An effects unit or effectspedal is an electronic or digital device that alters the sound of a musical instrument or other audio source. Common effects include distortion/overdrive, often used with electric guitar in electric blues and rock music; dynamic effects such as volume pedals and compressors, which affect loudness; filters such as wah-wah pedals and graphic equalizers, which modify frequency ranges; modulation effects, such as chorus, flangers and phasers; pitch effects such as pitch shifters; and time effects, such as reverb and delay, which create echoing sounds and emulate the sound of different spaces.

Vocoder electronic device

A vocoder is a category of voice codec that analyzes and synthesizes the human voice signal for audio data compression, multiplexing, voice encryption, voice transformation, etc.

Sound effect artificially created or enhanced sounds, or sound processes used to emphasize artistic or other content of films, television shows, live performance, animation, video games, music, or other media

A sound effect is an artificially created or enhanced sound, or sound process used to emphasize artistic or other content of films, television shows, live performance, animation, video games, music, or other media. These are normally created with foley. In motion picture and television production, a sound effect is a sound recorded and presented to make a specific storytelling or creative point without the use of dialogue or music. The term often refers to a process applied to a recording, without necessarily referring to the recording itself. In professional motion picture and television production, dialogue, music, and sound effects recordings are treated as separate elements. Dialogue and music recordings are never referred to as sound effects, even though the processes applied to such as reverberation or flanging effects, often are called "sound effects".

Time stretching is the process of changing the speed or duration of an audio signal without affecting its pitch. Pitch scaling is the opposite: the process of changing the pitch without affecting the speed. Pitch shift is pitch scaling implemented in an effects unit and intended for live performance. Pitch control is a simpler process which affects pitch and speed simultaneously by slowing down or speeding up a recording.

Flanging is an audio effect produced by mixing two identical signals together, one signal delayed by a small and gradually changing period, usually smaller than 20 milliseconds. This produces a swept comb filter effect: peaks and notches are produced in the resulting frequency spectrum, related to each other in a linear harmonic series. Varying the time delay causes these to sweep up and down the frequency spectrum. A flanger is an effects unit that creates this effect.

Reason (software) software for creating and editing music

Reason is a digital audio workstation for creating and editing music and audio developed by Swedish software company Reason Studios. Reason emulates a rack of hardware synthesizers, samplers, signal processors, sequencers, and mixers, all of which can be freely interconnected in an arbitrary manner. Reason can be used either as a complete virtual music studio or as a set of virtual instruments to be used with other sequencing software in a fashion that mimics live performance.

Audio filter

An audio filter is a frequency dependent amplifier circuit, working in the audio frequency range, 0 Hz to beyond 20 kHz. Audio filters can amplify (boost), pass or attenuate (cut) some frequency ranges. Many types of filters exist for different audio applications including hi-fi stereo systems, musical synthesizers, sound effects, sound reinforcement systems, instrument amplifiers and virtual reality systems.

Sound reinforcement system combination of microphones, signal processors, amplifiers, and loudspeakers in enclosures all controlled by a mixing console that makes live or pre-recorded sounds louder and may also distribute those sounds to a larger or more distant audience

A sound reinforcement system is the combination of microphones, signal processors, amplifiers, and loudspeakers in enclosures all controlled by a mixing console that makes live or pre-recorded sounds louder and may also distribute those sounds to a larger or more distant audience. In many situations, a sound reinforcement system is also used to enhance or alter the sound of the sources on the stage, typically by using electronic effects, such as reverb, as opposed to simply amplifying the sources unaltered.

Telephone hybrid type of telephone circuit

A telephone hybrid is the component at the ends of a subscriber line of the public switched telephone network (PSTN) that converts between two-wire and four-wire forms of bidirectional audio paths. When used in broadcast facilities to enable the airing of telephone callers, the broadcast-quality telephone hybrid is known as a broadcast telephone hybrid or telephone balance unit.

A phaser is an electronic sound processor used to filter a signal by creating a series of peaks and troughs in the frequency spectrum. The position of the peaks and troughs of the waveform being affected is typically modulated so that they vary over time, creating a sweeping effect. For this purpose, phasers usually include a low-frequency oscillator.


Moogerfooger is the trademark for a series of analog effects pedals manufactured by Moog Music. There are currently eight different pedals produced; however, one of these models is designed for processing control voltages rather than audio signal. A sixth model, the Analog Delay, was released in a limited edition of 1000 units and has become a collector's item.

Delay (audio effect) audio effect reminiscent of an echo

Delay is an audio effect and an effects unit which records an input signal to an audio storage medium, and then plays it back after a period of time. The delayed signal may either be played back multiple times, or played back into the recording again, to create the sound of a repeating, decaying echo.

De-essing is any technique intended to reduce or eliminate the excessive prominence of sibilant consonants, such as the sounds normally represented in English by "s", "z", "ch", "j" and "sh", in recordings of the human voice. Sibilance lies in frequencies anywhere between 2–10 kHz, depending on the individual voice.

Audio mixing (recorded music) audio mixing to yield recorded sound

In sound recording and reproduction, audio mixing is the process of combining multitrack recordings into a final mono, stereo or surround sound product. In the process of combining the separate tracks, their relative levels are adjusted and balanced and various processes such as equalization and compression are commonly applied to individual tracks, groups of tracks, and the overall mix. In stereo and surround sound mixing, the placement of the tracks within the stereo field are adjusted and balanced. Audio mixing techniques and approaches vary widely and have a significant influence on the final product.

Musical "outboard equipment" or "gear" is used to alter how a musical instrument sounds. Outboard, (external effects units) can be used either during a live performance or in the recording studio. These are separate from the effects that may be applied by using a mixing console or a digital audio workstation. Some outboard effects units and digital signal processing (DSP) boxes commonly found in a studio are:

In telecommunication, equalization is the reversal of distortion incurred by a signal transmitted through a channel. Equalizers are used to render the frequency response—for instance of a telephone line—flat from end-to-end. When a channel has been equalized the frequency domain attributes of the signal at the input are faithfully reproduced at the output. Telephones, DSL lines and television cables use equalizers to prepare data signals for transmission.

A mixing engineer is a person responsible for combining ("mixing") the different sonic elements of a piece of recorded music into a final version of a song. They mix the elements of a recorded piece together to achieve a good balance of volume, while at the same time deciding other properties such as pan positioning, effects, and so on.

Calf Studio Gear, often referred to as Calf Plugins, is a set of open source LV2 plugins for the Linux platform. The suite intends to be a complete set of plugins for audio mixing, virtual instruments and mastering. As of version 0.90.0 there are 47 plugins in the suite.


  1. Atti, Andreas Spanias, Ted Painter, Venkatraman (2006). Audio signal processing and coding ([Online-Ausg.] ed.). Hoboken, NJ: John Wiley & Sons. p. 464. ISBN   0-471-79147-4.
  2. Zölzer, Udo (1997). Digital Audio Signal Processing. John Wiley and Sons. ISBN   0-471-97226-6.
  3. Hodgson, Jay (2010). Understanding Records, p.95. ISBN   978-1-4411-5607-5.
  4. "WHOLE LOTTA LOVE by LED ZEPPELIN" . Retrieved 5 January 2018.
  5. O'Neil, Bill. "Page's Studio Tricks III (Backwards echo)" . Retrieved 5 January 2018.
  6. Audrey. "The History of Reverse Reverb" . Retrieved 5 January 2018.

Further reading