Dereverberation

Last updated December 15, 2024

Dereverberation is the process of removing the effects of reverberation from sound after the reverberant sound has been picked up by microphones. It is a subtopic of acoustic digital signal processing and is most commonly applied to speech, although it is also relevant to some aspects of music processing. Dereverberation of audio (speech or music) is the counterpart of blind deconvolution of images, although the techniques used are usually very different. Reverberation itself is caused by sound reflections in a room (or other enclosed space) and is quantified by the room reverberation time and the direct-to-reverberant ratio. The effect of dereverberation is to increase the direct-to-reverberant ratio so that the sound is perceived as closer and clearer.
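As a rough illustration of the direct-to-reverberant ratio mentioned above, the following sketch computes it from a room impulse response as the energy near the direct-path peak divided by the energy of everything that arrives later, expressed in decibels. The 2.5 ms direct-path window, the synthetic impulse response, and all function and parameter names are assumptions made for this example; they do not come from the article or from any particular library.

```python
import math

def direct_to_reverberant_ratio_db(rir, fs, direct_window_ms=2.5):
    """Estimate the direct-to-reverberant ratio (dB) of an impulse response.

    The direct path is taken as the samples within +/- direct_window_ms of
    the largest-magnitude sample (an assumed convention); everything after
    that window counts as reverberation.
    """
    peak = max(range(len(rir)), key=lambda n: abs(rir[n]))
    half = int(round(direct_window_ms * 1e-3 * fs))
    lo = max(0, peak - half)
    hi = min(len(rir), peak + half + 1)
    direct_energy = sum(v * v for v in rir[lo:hi])
    reverberant_energy = sum(v * v for v in rir[hi:])
    return 10.0 * math.log10(direct_energy / reverberant_energy)

def toy_rir(tail_gain, length=4000, tail_start=80, decay_samples=800.0):
    """Hypothetical impulse response: a unit direct impulse followed by an
    exponentially decaying tail with alternating sign."""
    rir = [0.0] * length
    rir[0] = 1.0
    for n in range(tail_start, length):
        sign = 1.0 if n % 2 == 0 else -1.0
        rir[n] = sign * tail_gain * math.exp(-(n - tail_start) / decay_samples)
    return rir

fs = 16000  # sample rate in Hz (assumed)
drr_before = direct_to_reverberant_ratio_db(toy_rir(tail_gain=0.03), fs)
# Halving the tail amplitude stands in for the effect of dereverberation.
drr_after = direct_to_reverberant_ratio_db(toy_rir(tail_gain=0.015), fs)
print(f"DRR before: {drr_before:.2f} dB, after: {drr_after:.2f} dB")
```

In this toy case the reverberant tail is simply scaled down, so the ratio rises by the corresponding amount; a real dereverberation system changes the effective impulse response in a less uniform way.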

A main application of dereverberation is in hands-free phones and desktop conferencing terminals because, in these cases, the microphones are not close to the source of sound (the talker’s mouth) but at arm’s length or farther away. Beyond telecommunications, dereverberation is also important in automatic speech recognition, because speech recognizers are usually error-prone in reverberant conditions.

Dereverberation became established as a topic of scientific research in the years 2000 to 2005,^[1] although a few notable early articles exist.^[2] The first scientific textbook on the topic was published in 2010.^[3] A global scientific study sponsored by the IEEE Technical Committee for Audio and Acoustic Signal Processing took place in 2014.^[4]

Three different approaches can be followed^[5] to perform dereverberation. In the first approach, reverberation is cancelled by exploiting a mathematical model of the acoustic system (or room) and, after estimation of the room acoustic model parameters, forming an estimate of the original signal. In the second approach, reverberation is suppressed by treating it as a type of (convolutional) noise and performing a de-noising process specifically adapted to reverberation. In the third approach, the original dereverberated signal is estimated directly from the microphone signals using, for example, a deep neural network or, alternatively, a multichannel linear filter. Examples of the most effective methods in the state of the art include approaches based on linear prediction.^[6]^[7]
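The first approach can be illustrated with a deliberately simplified sketch. Here the "room" is assumed to add a single attenuated, delayed copy of the source, and the model parameters (delay and gain) are assumed to be already known rather than estimated. Real rooms have impulse responses with thousands of taps that must be estimated blindly, so this is a toy instance of model-based cancellation, not a practical method. All names are hypothetical.

```python
def add_single_echo(dry, delay, gain):
    """Toy room model (assumed): x[n] = s[n] + gain * s[n - delay]."""
    return [
        dry[n] + (gain * dry[n - delay] if n >= delay else 0.0)
        for n in range(len(dry))
    ]

def cancel_single_echo(wet, delay, gain):
    """Invert the toy model recursively: s[n] = x[n] - gain * s[n - delay].

    The recursion is stable only when abs(gain) < 1.
    """
    estimate = []
    for n, x in enumerate(wet):
        echo = gain * estimate[n - delay] if n >= delay else 0.0
        estimate.append(x - echo)
    return estimate

# Hypothetical dry signal and model parameters.
dry = [0.0, 1.0, 0.5, -0.25, 0.75, -1.0, 0.3, 0.0, 0.2, -0.6]
delay, gain = 3, 0.6

wet = add_single_echo(dry, delay, gain)
restored = cancel_single_echo(wet, delay, gain)
max_error = max(abs(a - b) for a, b in zip(dry, restored))
print("max reconstruction error:", max_error)
```

With exact model parameters the source is recovered up to floating-point rounding. In practice the parameters are only estimates, and the resulting mismatch is one reason the suppression and direct-estimation approaches are also used.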

Related Research Articles

Acoustics is a branch of physics that deals with the study of mechanical waves in gases, liquids, and solids including topics such as vibration, sound, ultrasound and infrasound. A scientist who works in the field of acoustics is an acoustician while someone working in the field of acoustics technology may be called an acoustical engineer. The application of acoustics is present in almost all aspects of modern society with the most obvious being the audio and noise control industries.

Audio signal processing is a subfield of signal processing that is concerned with the electronic manipulation of audio signals. Audio signals are electronic representations of sound waves—longitudinal waves which travel through air, consisting of compressions and rarefactions. The energy contained in audio signals or sound power level is typically measured in decibels. As audio signals may be represented in either digital or analog format, processing may occur in either domain. Analog processors operate directly on the electrical signal, while digital processors operate mathematically on its digital representation.

Linear predictive coding (LPC) is a method used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital signal of speech in compressed form, using the information of a linear predictive model.

A microphone array is any number of microphones operating in tandem. There are many applications:

Reverberation, in acoustics, is a persistence of sound after it is produced. Reverberation is created when a sound or signal is reflected. This causes numerous reflections to build up and then decay as the sound is absorbed by the surfaces of objects in the space – which could include furniture, people, and air. This is most noticeable when the sound source stops but the reflections continue, their amplitude decreasing, until zero is reached.

A recording studio is a specialized facility for recording and mixing of instrumental or vocal musical performances, spoken words, and other sounds. They range in size from a small in-home project studio large enough to record a single singer-guitarist, to a large building with space for a full orchestra of 100 or more musicians. Ideally, both the recording and monitoring spaces are specially designed by an acoustician or audio engineer to achieve optimum acoustic properties.

Room acoustics is a subfield of acoustics dealing with the behaviour of sound in enclosed or partially-enclosed spaces. The architectural details of a room influence the behaviour of sound waves within it, with the effects varying by frequency. Acoustic reflection, diffraction, and diffusion can combine to create audible phenomena such as room modes and standing waves at specific frequencies and locations, echoes, and unique reverberation patterns.

An echo chamber is a hollow enclosure used to produce reverberation, usually for recording purposes. A traditional echo chamber is covered in highly acoustically reflective surfaces. By using directional microphones pointed away from the speakers, echo capture is maximized. Some portions of the room can be moved to vary the room's decay time. Nowadays, effects units are more widely used to create such effects, but echo chambers are still used today, such as the famous echo chambers at Capitol Studios.

Acoustical engineering is the branch of engineering dealing with sound and vibration. It includes the application of acoustics, the science of sound and vibration, in technology. Acoustical engineers are typically concerned with the design, analysis and control of sound.

A sound reinforcement system is the combination of microphones, signal processors, amplifiers, and loudspeakers in enclosures all controlled by a mixing console that makes live or pre-recorded sounds louder and may also distribute those sounds to a larger or more distant audience. In many situations, a sound reinforcement system is also used to enhance or alter the sound of the sources on the stage, typically by using electronic effects, such as reverb, as opposed to simply amplifying the sources unaltered.

A noise gate or simply gate is an electronic device or software that is used to control the volume of an audio signal. Comparable to a limiter, which attenuates signals above a threshold, such as loud attacks from the start of musical notes, noise gates attenuate signals that register below the threshold. However, noise gates attenuate signals by a fixed amount, known as the range. In its simplest form, a noise gate allows a main signal to pass through only when it is above a set threshold: the gate is "open". If the signal falls below the threshold, no signal is allowed to pass: the gate is "closed". A noise gate is used when the level of the "signal" is above the level of the unwanted "noise". The threshold is set above the level of the "noise", and so when there is no main "signal", the gate is closed.

Digital room correction is a process in the field of acoustics where digital filters designed to ameliorate unfavorable effects of a room's acoustics are applied to the input of a sound reproduction system. Modern room correction systems produce substantial improvements in the time domain and frequency domain response of the sound reproduction system.

Adaptive feedback cancellation is a common method of cancelling audio feedback in a variety of electro-acoustic systems such as digital hearing aids. The time-varying acoustic feedback leakage paths can only be eliminated with adaptive feedback cancellation. When an electro-acoustic system with an adaptive feedback canceller is presented with a correlated input signal, a recurrent distortion artifact, entrainment, is generated. There is a difference between system identification and feedback cancellation.

Gated reverb or gated ambience is an audio processing technique that combines strong reverb and a noise gate that cuts the tail of the reverb. The effect is typically applied to recordings of drums to make the hits sound powerful and "punchy" while keeping the overall mix sound clean and transparent.

Computational auditory scene analysis (CASA) is the study of auditory scene analysis by computational means. In essence, CASA systems are "machine listening" systems that aim to separate mixtures of sound sources in the same way that human listeners do. CASA differs from the field of blind signal separation in that it is based on the mechanisms of the human auditory system, and thus uses no more than two microphone recordings of an acoustic environment. It is related to the cocktail party problem.

In sound recording and reproduction, audio mixing is the process of optimizing and combining multitrack recordings into a final mono, stereo or surround sound product. In the process of combining the separate tracks, their relative levels are adjusted and balanced and various processes such as equalization and compression are commonly applied to individual tracks, groups of tracks, and the overall mix. In stereo and surround sound mixing, the placement of the tracks within the stereo field is adjusted and balanced. Audio mixing techniques and approaches vary widely and have a significant influence on the final product.

An assistive listening device (ALD) is part of a system used to improve hearing ability for people in a variety of situations where they are unable to distinguish speech in noisy environments. Often, in a noisy or crowded room it is almost impossible for an individual who is hard of hearing to distinguish one voice among many. This is often exacerbated by the effect of room acoustics on the quality of perceived speech. Hearing aids are able to amplify and process these sounds, and improve the speech to noise ratio. However, if the sound is too distorted by the time it reaches the listener, even the best hearing aids will struggle to unscramble the signal. Assistive listening devices offer a more adaptive alternative to hearing aids, but can be more complex and cumbersome.

Direct Field Acoustic Testing, DFAT or DFAN, is a technique used for acoustic testing of aerospace structures by subjecting them to sound waves created by an array of acoustic drivers. The method uses electro-dynamic acoustic loudspeakers, arranged around the test article to provide a uniform, well-controlled, direct sound field at the surface of the unit under test. The system employs high capability acoustic drivers, powerful audio amplifiers, a narrow-band multiple-input-multiple-output (MIMO) controller and precision laboratory microphones to produce an acoustic environment that can simulate a helicopter, aircraft, jet engine or launch vehicle sound pressure field. A high-level system is capable of overall sound pressure levels in the range 125–147 dB for more than one minute over a frequency range from 25 Hz to 10 kHz.

Echo suppression and echo cancellation are methods used in telephony to improve voice quality by preventing echo from being created or removing it after it is already present. In addition to improving subjective audio quality, echo suppression increases the capacity achieved through silence suppression by preventing echo from traveling across a telecommunications network. Echo suppressors were developed in the 1950s in response to the first use of satellites for telecommunications.

A reverb effect, or reverb, is an audio effect applied to a sound signal to simulate reverberation. It may be created through physical means, such as echo chambers, or electronically through audio signal processing. The American producer Bill Putnam is credited with the first artistic use of artificial reverb in music, on the 1947 song "Peg o' My Heart" by the Harmonicats.

References

[IWAENC-1] P. A. Naylor and N. D. Gaubitch, “Speech dereverberation,” in Proc. Intl. Workshop Acoust. Echo Noise Control (IWAENC), 2005.

[RYALL-2] L. E. Ryall, "Improvements in electric signal amplifiers incorporating voice-operated devices", Patent GB509613A, 1938.

[Springer2010-3] P. A. Naylor and N. D. Gaubitch, Eds., Speech Dereverberation. Springer, 2010.

[REVERB-4] The REVERB Challenge

[HABETS-5] E. Habets, "Fifty Years of Reverberation Reduction", Audio Engineering Society 60th Conference on Dereverberation and Reverberation of Audio, Music, and Speech.

[JUKIC-6] A. Jukic et al., "Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors".

[DELCROIX-7] M. Delcroix et al., "Linear Prediction-based Dereverberation with Advanced Speech Enhancement and Recognition Technologies", REVERB Challenge Workshop, 2014.
