Audio normalization

Last updated May 30, 2024

Audio normalization is the application of a constant amount of gain to an audio recording to bring the amplitude to a target level (the norm). Because the same amount of gain is applied across the entire recording, the signal-to-noise ratio and relative dynamics are unchanged. Normalization is one of the functions commonly provided by a digital audio workstation.

Peak normalization

One type of normalization is peak normalization, wherein the gain is changed to bring the highest PCM sample value or analog signal peak to a given level, usually 0 dBFS, the loudest level allowed in a digital system.^[1]
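A minimal sketch of peak normalization follows. It assumes floating-point samples in which 1.0 represents full scale (0 dBFS); the function name `peak_normalize` and the sample values are illustrative only, not part of any particular tool.

```python
def peak_normalize(samples, target_dbfs=0.0):
    """Scale samples so the largest absolute value lands on target_dbfs.

    Assumption: samples are floats where 1.0 is full scale (0 dBFS).
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: no finite gain reaches the target
    target_linear = 10.0 ** (target_dbfs / 20.0)  # dBFS -> linear amplitude
    gain = target_linear / peak                   # one constant gain for all samples
    return [s * gain for s in samples]


quiet = [0.1, -0.25, 0.2, -0.05]
normalized = peak_normalize(quiet, target_dbfs=-1.0)
```

Because every sample is multiplied by the same factor, the ratios between samples, and therefore the relative dynamics, stay the same.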

Since it searches only for the highest level, peak normalization alone does not account for the apparent loudness of the content. As such, peak normalization is generally used to change the volume in such a way as to ensure optimal use of available dynamic range during the mastering stage of a digital recording. When combined with compression/limiting, however, peak normalization becomes a feature that can provide a loudness advantage over non–peak-normalized material. This feature of digital recording systems, compression and limiting followed by peak normalization, enables contemporary trends in program loudness.^[2]^[3]

Loudness normalization

Another type of normalization is based on a measure of loudness, wherein the gain is changed to bring the average loudness to a target level. This average may be approximate, such as a simple measurement of average power (e.g. RMS), or more accurate, such as a measure that addresses human perception, for example the one defined by EBU R 128 and offered by ReplayGain, Sound Check and GoldWave.
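The approximate variant, normalization by average power, can be sketched as below. This is only an illustration: the helper names `rms_dbfs` and `normalize_rms` are hypothetical, samples are assumed to be floats with 1.0 as full scale, and the level is expressed relative to a full-scale square wave (under this convention a full-scale sine reads about −3 dBFS; other conventions exist). A perceptual measure such as the one in EBU R 128 adds frequency weighting and gating that this sketch omits.

```python
import math


def rms_dbfs(samples):
    """Average power of the samples in dB relative to full scale (1.0)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # silence has no finite level
    return 10.0 * math.log10(mean_square)


def normalize_rms(samples, target_dbfs):
    """Apply one constant gain so the RMS level matches target_dbfs."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)
    gain = 10.0 ** ((target_dbfs - level) / 20.0)  # dB difference -> linear factor
    return [s * gain for s in samples]


tone = [0.5, -0.5, 0.5, -0.5]
leveled = normalize_rms(tone, target_dbfs=-20.0)
```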

For example, YouTube's preferred loudness level is −14 LUFS, so if an audio program is analyzed to be −10 LUFS, YouTube will lower the loudness by 4 dB to bring it to the preferred level.
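The arithmetic in this example can be written out as a short sketch. One loudness unit (LU) corresponds to 1 dB, so the gain in dB is simply the target minus the measured loudness. The function names here are hypothetical and do not represent any platform's actual implementation.

```python
def normalization_gain_db(measured_lufs, target_lufs=-14.0):
    """Gain in dB needed to move measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


gain_db = normalization_gain_db(-10.0)  # -4.0 dB, i.e. turned down by 4 dB
factor = db_to_linear(gain_db)          # roughly 0.63 as an amplitude multiplier
```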

Loudness normalization combats varying loudness when listening to multiple songs in a sequence. Before loudness normalization, one song in a playlist might be quieter than the rest, so the listener would have to turn a volume knob up to adjust the playback volume.^[4]

Depending on the dynamic range of the content and the target level, loudness normalization can result in peaks that exceed the recording medium's limits, causing clipping. Software offering loudness normalization typically provides the option of dynamic range compression to prevent clipping when this happens. In this situation, signal-to-noise ratio and relative dynamics are altered.
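Whether a given loudness gain would clip can be checked before it is applied by comparing it with the headroom left above the current peak. The sketch below makes several assumptions: the function name is hypothetical, the −1 dBFS ceiling is an arbitrary illustrative choice, and the peak is a plain sample peak. It also caps the gain instead of compressing, which is a simpler substitute for the dynamic range compression described above; it preserves the relative dynamics but leaves the programme quieter than the target.

```python
def loudness_gain_with_ceiling(measured_lufs, target_lufs, peak_dbfs,
                               ceiling_dbfs=-1.0):
    """Return (wanted_db, applied_db, would_clip) for a loudness target.

    wanted_db  : gain that would reach the loudness target
    applied_db : that gain, capped so the peak stays at or below the ceiling
    would_clip : True if the uncapped gain pushes the peak past the ceiling
    """
    wanted_db = target_lufs - measured_lufs   # 1 LU corresponds to 1 dB
    allowed_db = ceiling_dbfs - peak_dbfs     # headroom left above the peak
    would_clip = wanted_db > allowed_db
    applied_db = min(wanted_db, allowed_db)
    return wanted_db, applied_db, would_clip


# Dynamic material: quiet on average (-23 LUFS) but peaking at -3 dBFS.
wanted, applied, clips = loudness_gain_with_ceiling(
    measured_lufs=-23.0, target_lufs=-14.0, peak_dbfs=-3.0)
```

In this case the target calls for 9 dB of gain but only 2 dB of headroom is available, so reaching −14 LUFS without clipping would require reducing the peaks by compression or limiting.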

Loudness standards

Standardised normalized loudness levels vary by territory and application.^[5]

−24 LUFS: ATSC A/85 (US TV), NPRSS, and PRX radio broadcast^[6]^[5]
−23 LUFS: EBU R 128 broadcast^[5]
−19 to −16 LUFS: PRX podcasts^[7]
−14 LUFS: Spotify, YouTube and other streaming platforms^[5]

References

[1] Des (20 April 2008). "10 Myths About Normalization". Hometracked. Retrieved 10 June 2012.
[2] Shelvock, Matt (2012). Audio Mastering as Musical Practice. London: University of Western Ontario: EDT. p. 26.
[3] Katz, Bob (2007). Mastering Audio: The Art and the Science. Focal Press. p. 168. ISBN 978-0-240-80837-6.
[4] "What are the "loudness wars" and loudness normalization?". Hybrid Studios. Archived from the original on 27 June 2018. Retrieved 1 July 2018.
[5] Tépper, Allan (23 March 2018). "How many LUFS for ideal audio loudness? Why can't we be friends?". Pro Video Coalition. Retrieved 11 July 2019.
[6] "Formatting Audio Files for Broadcast". PRX – Help Desk. PRX. Retrieved 21 March 2022. Loudness at -24 LUFS, ± 2 LU (recommended)
[7] "How should I format my audio files for Publish?". PRX – Help Desk. PRX. Retrieved 21 March 2022. Set your loudness between -16db LUFS and -19db LUFS. There is no set industry standard, [...]

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
