Dolby Digital Plus

MIME / IANA: audio/eac3
Created by: Dolby Laboratories
Encoding formats: E-AC-3

Dolby Digital Plus, also known as Enhanced AC-3 (and commonly abbreviated as DDP, DD+, E-AC-3 or EC-3), is a digital audio compression scheme developed by Dolby Labs for the transport and storage of multi-channel digital audio. It is a successor to Dolby Digital (AC-3), and has a number of improvements over that codec, including support for a wider range of data rates (32 kbit/s to 6144 kbit/s), an increased channel count, and multi-program support (via substreams), as well as additional tools (algorithms) for representing compressed data and counteracting artifacts. Whereas Dolby Digital (AC-3) supports up to five full-bandwidth audio channels at a maximum bitrate of 640 kbit/s, E-AC-3 supports up to 15 full-bandwidth audio channels at a maximum bitrate of 6.144 Mbit/s.

The full set of technical specifications for E-AC-3 (and AC-3) is standardized and published in Annex E of ATSC A/52:2012, [1] as well as Annex E of ETSI TS 102 366. [2]

Technical details

Specifications

Dolby Digital Plus is capable of the following:

Structure

A Dolby Digital Plus service consists of one or more substreams. There are three types of substreams:

All DD+ streams must contain at least one independent substream or legacy substream, which contains the first (or only) 5.1 channels of the primary audio program. Additional independent substreams may be used for secondary audio programs such as foreign language soundtracks, commentary, or descriptions/voiceovers for the visually impaired. Dependent substreams may be provided for programs that have additional soundstage channels beyond 5.1.

Within each substream, provision is made for encoding five full-bandwidth channels, one low-frequency channel, and one coupling channel. The coupling channel is used for medium-to-high-frequency information which is common to multiple full-bandwidth channels. Its content is mixed in with the other channels in a fashion prescribed by the metadata; it is not reproduced as a discrete channel by the decoder.

Dolby Digital Plus includes comprehensive bitstream metadata for decoder control over output loudness (via dialnorm), downmixing, and reversible dynamic range control (via DRC).

Syntax

Dolby Digital Plus is nominally a 16-bit-aligned protocol, though very few fields in the syntax respect any byte or word boundaries. As many syntax elements are optional or variable-length, including some whose presence or length is dependent on complex preceding calculations, and there is little redundancy in the syntax, DD+ can be extremely difficult to parse correctly, with syntactically valid but incorrect parsings easily produced by defective encoders.

A DD+ stream is a collection of fixed-length syncframe packets, each of which corresponds to either 256, 512, 768, or 1536 consecutive time-domain audio samples. (The 1536-sample case is the most common, and corresponds to Dolby Digital; the shorter syncframe lengths are intended for use in interactive applications like video games, where reducing encoder latency is an important concern.) Each syncframe is independently decodable, and belongs to a specific substream within the service. A syncframe consists of the following syntax elements (some of which may be elided when a Dolby Digital Plus service is encapsulated into another format or transport):
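The relationship between syncframe length, playback time and average frame size is simple arithmetic, and can be sketched as follows. This is only an illustration: the 48 kHz sample rate, the constant-bitrate assumption and the function names are not taken from the text above.

```python
# Illustrative arithmetic only. The 48 kHz default and the bitrates passed in
# are assumptions for this sketch, not values mandated by the article.
SYNCFRAME_SAMPLES = (256, 512, 768, 1536)  # syncframe lengths listed in the text


def syncframe_duration_ms(samples, sample_rate_hz=48_000):
    """Playback time covered by one syncframe, in milliseconds."""
    return 1000.0 * samples / sample_rate_hz


def syncframe_bytes(samples, bitrate_bps, sample_rate_hz=48_000):
    """Average syncframe size in bytes, assuming a constant-bitrate stream."""
    return bitrate_bps * samples / (8 * sample_rate_hz)
```

Under these assumptions a 1536-sample syncframe covers 32 ms of audio, while a 256-sample syncframe covers roughly 5.3 ms, which is consistent with the shorter lengths being aimed at latency-sensitive uses.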

Storage of transform coefficients

At the heart of both Dolby Digital and DD+ is a modified discrete cosine transform (MDCT), which is used to transform the audio signal into the frequency domain; within each block up to 256 frequency coefficients may be transmitted. Coefficients are transmitted in a binary floating-point format, with exponents transmitted separately from mantissas. This allows for highly efficient coding.
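The exponent/mantissa split can be pictured as coefficient = mantissa × 2^(−exponent). The following is a minimal sketch of that idea only; the exponent upper bound, the use of unquantized floating-point mantissas and the function names are assumptions for illustration, and the real bitstream quantizes and packs both parts.

```python
import math

MAX_EXPONENT = 24  # assumed upper bound for this sketch


def split_coefficient(value):
    """Return (exponent, mantissa) such that value == mantissa * 2**-exponent.

    The exponent counts how far the value sits below full scale (1.0).
    """
    if value == 0.0:
        return MAX_EXPONENT, 0.0
    _, e = math.frexp(value)                 # value == m * 2**e, 0.5 <= |m| < 1
    exponent = min(max(-e, 0), MAX_EXPONENT)
    return exponent, value * 2.0 ** exponent


def join_coefficient(exponent, mantissa):
    """Rebuild the coefficient from its two separately stored parts."""
    return mantissa * 2.0 ** (-exponent)
```

Small coefficients get large exponents, so their mantissas stay near full scale and can be quantized coarsely without losing the overall magnitude.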

Exponents for each channel are encoded in a highly packed differential format, with the deltas between consecutive frequency bins (other than the first) being given in the stream. Three formats, or exponent strategies, are used; these are known as "D15", "D25", and "D45". In D15, each bin has a unique exponent, while in D25 and D45, delta values correspond to either pairs or quads of frequency bins. Audio blocks other than the first in a syncframe may additionally reuse the prior block's exponent set (this is required for channels that use the Adaptive Hybrid Transform).
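The effect of the three exponent strategies can be sketched as follows. This is a simplified illustration: it takes the deltas as already-unpacked integers and ignores how the standard packs them into codewords or limits their range, and the function name is hypothetical.

```python
# Number of consecutive frequency bins that share one decoded exponent.
GROUP_SIZE = {"D15": 1, "D25": 2, "D45": 4}


def decode_exponents(first, deltas, strategy):
    """Expand differentially coded exponents into one exponent per bin.

    `first` is the absolute exponent of the first bin; each delta is added to
    the running value and the result is repeated for 1, 2 or 4 bins.
    """
    group = GROUP_SIZE[strategy]
    exponents = [first]
    current = first
    for delta in deltas:
        current += delta
        exponents.extend([current] * group)
    return exponents
```

For the same number of transmitted deltas, D45 covers four times as many bins as D15, trading frequency resolution of the exponent envelope for fewer bits.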

The decoded exponents, along with a set of metadata parameters, are used to derive the bit allocation pointers (BAPs), which specify the number of bits allocated to each mantissa. Bins which correspond to frequencies in which human hearing is more precise are allocated more bits; bins which correspond to frequencies that humans are less sensitive to are allocated fewer. Anywhere between zero and 16 bits may be allocated for each mantissa; if zero bits are transmitted, a dither function may be optionally applied to generate the frequency coefficient.

Algorithm

Dolby Digital Plus, like many lossy audio codecs, uses a heavily quantized frequency-domain representation of the signal to achieve coding gain; this section describes the operation of the base transform as well as various optional "tools" specified by the standard, which are used to achieve either greater compression or to reduce audible coding artifacts. [3]

Modified discrete cosine transform

Both Dolby Digital and DD+ encoders convert a multichannel audio signal to the frequency domain using the modified discrete cosine transform (MDCT), with a switchable block length of either 256 or 512 samples (the latter is used with stationary signals, the former with transient signals). The frequency-domain representation is then quantized according to a psychoacoustic model and transmitted. A floating-point format for frequency coefficients is used, and mantissas and exponents are stored and transmitted separately, with both being heavily compressed.
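The general MDCT technique (not the codec's exact transform) can be sketched with a direct, unoptimized implementation. The sine window, the 2/N scaling and all function names are assumptions chosen so that overlapping blocks reconstruct the input; the standard specifies its own window and fast algorithm, which are not reproduced here.

```python
import math


def sine_window(n):
    """Length-2n sine window; w[i]**2 + w[i + n]**2 == 1 (assumed window)."""
    return [math.sin(math.pi * (i + 0.5) / (2 * n)) for i in range(2 * n)]


def mdct(frame):
    """Windowed forward MDCT: 2n time samples -> n frequency coefficients."""
    n = len(frame) // 2
    w = sine_window(n)
    return [
        sum(w[i] * frame[i]
            * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(2 * n))
        for k in range(n)
    ]


def imdct(coefs):
    """Windowed inverse MDCT: n coefficients -> 2n aliased time samples."""
    n = len(coefs)
    w = sine_window(n)
    return [
        w[i] * (2.0 / n)
        * sum(coefs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
              for k in range(n))
        for i in range(2 * n)
    ]


def roundtrip(signal, n):
    """Transform 50%-overlapped blocks and overlap-add them back together."""
    padded = [0.0] * n + list(signal) + [0.0] * n
    padded += [0.0] * (-len(padded) % n)      # pad to a whole number of hops
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        block = imdct(mdct(padded[start:start + 2 * n]))
        for i, value in enumerate(block):
            out[start + i] += value
    return out[n:n + len(signal)]
```

Each block of 2n samples yields only n coefficients, yet the aliasing introduced by one block is cancelled by its neighbours during overlap-add, so `roundtrip` returns the input to within floating-point error when nothing is quantized.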

Adaptive hybrid transform (AHT)

For highly stationary signals, such as long notes in musical performance, the Adaptive Hybrid Transform (AHT) is used. This tool is unique to Dolby Digital Plus (and unsupported in Dolby Digital), and uses an additional Type II discrete cosine transform (DCT) to combine six adjacent transform blocks (located within a syncframe) into an effectively longer block. In addition to the two-stage transform, a different bit-allocation structure is used, and two ways of representing encoded mantissas are deployed: vector quantization, which gives the highest coding gain, and gain-adaptive quantization (GAQ), which is used when greater signal fidelity is required. Gain-adaptive quantization may be independently enabled for each frequency bin within a channel, and permits variable-length mantissa encoding.
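Why a second transform across six blocks helps with stationary signals can be sketched with a plain Type II DCT applied to the six values that one frequency bin takes in six consecutive blocks. The unnormalized scaling and the function name are assumptions for illustration; the standard defines its own scaling and the surrounding quantization.

```python
import math


def dct_ii(values):
    """Unnormalized Type II DCT of a short sequence (here: one bin, six blocks)."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k)
            for i, v in enumerate(values))
        for k in range(n)
    ]
```

If a bin holds the same value in all six blocks, the output has a single non-zero term, so six numbers collapse into one that needs to be coded with precision.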

Coupling

As many multi-channel audio programs have high degrees of correlation between individual channels, a coupling channel is typically used. High-frequency information which is common among two or more channels is transmitted in a separate channel known as the coupling channel (one that is not reproduced by a decoder, but only mixed back into the original channels), along with coefficients known as "coupling coordinates" that guide the decoder on how to reconstruct the original channels.
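Decoder-side reconstruction from a coupling channel can be sketched as a per-band scaling. This is a deliberately simplified picture: the equal-width bands, the plain multiplicative coordinates and the function name are assumptions, and the real syntax defines its own band structure and coordinate encoding.

```python
def decouple(coupling_bins, coordinates, band_width):
    """Rebuild each channel's high-frequency bins from the shared coupling channel.

    coupling_bins: coefficients of the coupling channel.
    coordinates:   {channel name: one scale factor per band} (hypothetical layout).
    band_width:    number of bins per band (assumed equal for all bands).
    """
    channels = {}
    for name, per_band in coordinates.items():
        channels[name] = [
            value * per_band[i // band_width]
            for i, value in enumerate(coupling_bins)
        ]
    return channels
```

Only one set of high-frequency coefficients is transmitted; each coupled channel costs just a handful of coordinates on top of it.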

Dolby Digital Plus supports a more elaborate version of the coupling tool known as Enhanced Coupling (ECPL). This algorithm, which is considerably more expensive to process (both for encoders and decoders), allows phase information to be included in coupling coordinates, so that phase relationships between coupled channels are preserved.

Spectral extension

Dolby Digital Plus provides another tool for high frequencies. As high frequency components are often harmonics of lower-frequency sounds, Spectral Extension (SPX) allows high frequency components to be synthesized algorithmically from lower-frequency components. This tool is also unique to Dolby Digital Plus, and unsupported in Dolby Digital.

Rematrixing

Stereo programs are typically rematrixed and encoded as L+R and L−R channels. This is done both to increase coding gain (the L−R channel can typically be heavily compressed, and the subsequent un-matrixing will cause many compression artifacts to cancel), and to preserve phase relationships necessary for proper playback of Dolby Surround-encoded material.
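The sum/difference idea can be sketched in a few lines. The factor of one half (which keeps the matrixed values in range) and the function names are assumptions for illustration; in the codec the operation is applied to frequency coefficients rather than to whole signals.

```python
def rematrix(left, right):
    """Encode a stereo pair as sum and difference signals."""
    total = [(l + r) / 2 for l, r in zip(left, right)]
    diff = [(l - r) / 2 for l, r in zip(left, right)]
    return total, diff


def unmatrix(total, diff):
    """Recover left and right from the sum and difference signals."""
    left = [t + d for t, d in zip(total, diff)]
    right = [t - d for t, d in zip(total, diff)]
    return left, right
```

When left and right are nearly identical, the difference signal is close to zero and needs very few bits.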

Transient pre-noise processing

Transient pre-noise processing (TPNP) is a Dolby Digital Plus-specific tool to reduce the resulting artifacts of signal quantization and other compression techniques. Unlike the other tools described above, which operate in the frequency domain and precede the conversion back into PCM samples, TPNP is a tool which essentially performs a windowed cut-and-paste operation on the time-domain signal to erase certain predictable quantization artifacts.

Relation to Dolby Digital and Dolby Atmos

Dolby Digital Plus bitstreams are not directly backward compatible with legacy Dolby Digital decoders. However, Dolby Digital Plus is a functional superset of Dolby Digital, and decoders include a mandatory component that directly converts (without decoding and re-encoding) the Dolby Digital Plus bitstream to a Dolby Digital bitstream (operating at 640 kbit/s) for carriage via legacy S/PDIF connections (including S/PDIF over HDMI) to external decoders (e.g. AVRs, etc.). All Dolby Digital Plus decoders can decode Dolby Digital bitstreams.

Dolby Atmos bitstreams, by contrast, are encoded to be backward compatible with Dolby Digital Plus decoders, so Atmos content carried this way can be decoded by Dolby Digital Plus-compatible devices. Dolby markets this lossy variant of Dolby Atmos under the label "Dolby Digital Plus Atmos" to differentiate it from the lossless, Dolby TrueHD-based original. Most Dolby Digital Plus bitstreams are now encoded with Atmos.


Dynamic range compression

One design goal of DD+ is quality playback in a variety of environments, ranging from home theaters and other acoustically controlled environments where high dynamic range playback is feasible, to portable and automotive environments where much background noise is present, and dynamic range compression may be necessary to make all parts of an audio program audible.

DD+ provides the following operating modes for different listener/viewer environments.

Dolby Digital Plus decoder operating modes:

| Mode | Reference loudness (LKFS) | Application |
|---|---|---|
| Line | −31 LKFS | Home theatre playback; provides full "cinema" dynamic range |
| RF | −20 LKFS | TV speaker playback; provides typical "broadcast" dynamic range |
| Portable | −11 LKFS | Portable device speaker and headphone playback; provides minimum dynamic range (similar to music production/mixing/mastering techniques) |

Note: All of the decoder operating modes (listed above) are available in every Dolby Digital Plus decoder. The default operating mode is governed by device category and application. In some devices, users may also have a choice (via menu) to select an alternate mode that suits their particular taste and/or application.
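How the reference loudness of each mode translates into a playback gain can be sketched as a simple offset. This assumes that dialnorm states the program's dialogue loudness in LKFS and that the decoder applies a static gain to reach the mode's reference level; real decoders combine this with the dynamic range control data, which is not modelled here, and the names below are hypothetical.

```python
# Reference loudness per decoder operating mode, as listed above (LKFS).
REFERENCE_LOUDNESS_LKFS = {"line": -31, "rf": -20, "portable": -11}


def normalization_gain_db(dialnorm_lkfs, mode):
    """Static gain (dB) that moves the program loudness to the mode's reference."""
    return REFERENCE_LOUDNESS_LKFS[mode] - dialnorm_lkfs
```

For a program whose dialnorm indicates −24 LKFS, this sketch gives −7 dB in Line mode, +4 dB in RF mode and +13 dB in Portable mode; the positive gains are where dynamic range compression becomes necessary to avoid clipping.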

In addition, Dolby Digital and DD+ contain additional metadata to permit error-free translation into range-restricted downstream channels, such as RF modulation, where excessive output signal amplitude may result in significant distortion or modulation errors.

Encapsulation, use, and storage of Dolby Digital streams

Physical transport for consumer devices

IEC 61937-3 defines how to transmit Dolby Digital (AC-3) and Dolby Digital Plus (E-AC-3) bitstreams via an IEC 60958/61937 (S/PDIF) interface. However, the S/PDIF interface has insufficient bandwidth to transport Dolby Digital Plus (E-AC-3) bitstreams at the 3.0 Mbit/s data rate specified by HD DVD; lower data rates are possible.
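The bandwidth limit can be checked with back-of-the-envelope arithmetic. The figures below are assumptions for illustration: a two-channel link at 48 kHz carrying 16 bits of payload per sample slot, with the overhead of burst preambles ignored.

```python
def spdif_payload_bps(sample_rate_hz=48_000, channels=2, bits_per_sample=16):
    """Raw payload capacity of a PCM-style link used to carry a compressed stream."""
    return sample_rate_hz * channels * bits_per_sample


def fits_on_link(stream_bps, **link):
    """True if a compressed stream of the given bitrate fits in the payload."""
    return stream_bps <= spdif_payload_bps(**link)
```

Under these assumptions the link offers 1.536 Mbit/s, which accommodates a 640 kbit/s stream but not one at 3.0 Mbit/s.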

Much consumer gear, and even some professional gear, does not recognize Dolby Digital Plus as an encoded format, and will treat DD+ signals over an S/PDIF or similar interface, or stored in a .WAV file or similar container format, as though they were linear PCM data. This is not problematic if the data is passed unchanged, but any gain scaling or sample rate conversion, operations which are aurally harmless to PCM data, will corrupt and destroy a Dolby Digital Plus stream. (Older codecs such as DTS or AC-3 are more likely to be recognized as compressed formats and protected from such processing.)
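Why PCM-style processing is destructive can be sketched by applying a gain to bitstream words as if they were samples. The sync word 0x0B77 begins every AC-3 and E-AC-3 syncframe (a detail not stated in the text above); the second word in the usage example is an arbitrary placeholder, and the function name is hypothetical.

```python
SYNC_WORD = 0x0B77  # sync word that begins AC-3 / E-AC-3 syncframes


def apply_gain(words, gain):
    """Scale 16-bit words as though they were signed PCM samples."""
    out = []
    for word in words:
        sample = word - 0x10000 if word & 0x8000 else word   # to signed
        scaled = max(-32768, min(32767, int(sample * gain)))  # clip like PCM
        out.append(scaled & 0xFFFF)                            # back to unsigned
    return out
```

With unity gain the words pass through untouched, but `apply_gain([SYNC_WORD, 0x1234], 0.5)` changes every word, so a downstream decoder can no longer even find the start of a syncframe.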

Dolby Digital Plus may be transmitted across HDMI 1.3 or newer, according to IEC 61937-3.

Physical transport for professional devices and applications

As the AES-3 interface is the professional analog to S/PDIF, Dolby Digital Plus streams may be carried over AES-3 connections with sufficient bandwidth, and/or over other interfaces that encapsulate AES-3 (such as SMPTE 259M and SMPTE 299M embedded audio). Additional standards promulgated by SMPTE specify the encoding of Dolby transports, including Dolby Digital, Dolby Digital Plus, and Dolby E (a professional-only codec used in audio/video applications) on an AES interface. The SMPTE 337 standard specifies the signalling and carriage of signals that are not PCM audio over an AES-3 interface, and the SMPTE 340-2008 standard specifies how Dolby Digital Plus and Dolby Digital are to be transmitted over that interface. The combination of SMPTE 340-2008 and 337M allows the Dolby Digital Plus bitstream to be stored and transported within professional production, contribution and distribution workflows prior to emission to consumers.

Consumer broadcast in digital television systems

Either DD+ or Dolby Digital is specified by the Advanced Television Systems Committee as the primary audio codec for the ATSC digital television system, and is commonly used for other DTV applications (such as cable and satellite broadcast) in countries which use ATSC for digital television.

For broadcast (emission) to consumers, the Dolby Digital Plus bitstream is packetized in an MPEG elementary stream, and multiplexed (with video) into an MPEG Transport Stream. In ATSC systems, the specification for carrying Dolby Digital Plus is described in ATSC A/53 Part 3 & Part 6. In DVB systems, the specification for carrying Dolby Digital Plus is described in ETSI TS 101 154 and ETSI EN 300 468.

Dolby Digital Plus is seeing increasing use in digital television systems, particularly in cable and satellite systems, as a replacement for Dolby Digital. Many such applications do not take advantage of its higher channel count or ability to support multiple independent programs; instead it is used as a higher-efficiency codec than AC-3.

HD DVD and Blu-ray Disc

Both the now-defunct HD DVD standard and Blu-ray Disc include Dolby Digital Plus. It is a mandatory component of HD DVD and an optional component of Blu-ray. The maximum number of discrete coded channels is the same for both formats: 7.1. However, HD DVD and Blu-ray impose different technical constraints on the supported audio codecs. Hence, the usage of DD+ differs substantially between HD DVD and Blu-ray Disc.

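Dolby Digital (AC-3) and Dolby Digital Plus (E-AC-3) bitrate comparison

| Codec | HD DVD decoding | HD DVD channels | HD DVD bitrate | Blu-ray Disc decoding | Blu-ray Disc channels | Blu-ray Disc bitrate |
|---|---|---|---|---|---|---|
| AC-3 | mandatory | 1 to 5.1 | 448 kbit/s | mandatory | 1 to 5.1 | 640 kbit/s |
| E-AC-3 | mandatory | 1 to 7.1 | 3.024 Mbit/s | optional, available for rear channels only | 6.1 to 7.1 | 1.664 Mbit/s |
| TrueHD (1 or 2 channels on HD DVD) | mandatory | 1 or 2 | 18.0 Mbit/s | optional | 1 to 8 | 18.0 Mbit/s |
| TrueHD (3 to 8 channels on HD DVD) | optional | 3 to 8 | 18.0 Mbit/s | optional | 1 to 8 | 18.0 Mbit/s |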

On HD DVD, DD+ is designated a mandatory audio codec. An HD DVD movie may use DD+ as the primary (or only) audio track. An HD DVD player is required to support DD+ audio by decoding and outputting it to the player's output jacks. As stored on disc, the DD+ bitstream can carry any number of audio channels up to the maximum allowed, at any bitrate up to 3.0 Mbit/s.

On Blu-ray Disc, DD+ is an optional codec, and is deployed as an extension to a "core" AC-3 5.1 audio track. The AC-3 core is encoded at 640 kbit/s, carries 5 primary channels (and 1 LFE), and is independently playable as a movie audio track by any Blu-ray Disc player. The DD+ extension bitstream is used on players that support it by replacing the rear channels in the 5.1 setup with higher-fidelity versions, along with providing a possible channel extension to 6.1 or 7.1. The complete audio track is allowed a combined bitrate of 1.7 Mbit/s: 640 kbit/s for the AC-3 5.1 core, and 1 Mbit/s for the DD+ extension. During playback, both the core and extension bitstreams contribute to the final audio output, according to rules embedded in the bitstream metadata. [4]

Media players and downmixing

Generally, a Dolby Digital Plus bitstream can only be transported over an HDMI 1.3 or greater link. Older receivers support earlier versions of HDMI, or only have support for the S/PDIF system for digital audio, or analog inputs.

For non-HDMI 1.3 links, the player can decode the audio and then transmit it via a variety of different methods.

Most receivers and players support S/PDIF. This lower-bandwidth digital connection is not capable of transmitting lossless PCM audio with more than two channels, but a player can transmit an S/PDIF-compatible audio stream to the receiver in one of the following ways:

Should the player need to decode the audio for a non-HDMI 1.3 receiver, the results should be predictable. The DD+ specification explicitly defines downmixing modes and mechanics, so any source soundfield (up to 14.1) can be reproduced predictably for any listening environment (down to a single channel).

See also

References

  1. Advanced Television Systems Committee (17 December 2012), ATSC Standard: Digital Audio Compression (AC-3, E-AC-3) (PDF), Washington, DC: Author, ATSC A/52:2012
  2. Digital Audio Compression (AC-3, Enhanced AC-3) Standard (PDF), European Telecommunications Standards Institute, 20 September 2017, ETSI TS 102 366 V1.4.1 (2017-09), retrieved 21 September 2023
  3. Andersen, Robert Loring; Crockett, B.; Davidson, G.; Davis, Mark; Fielder, L.; Turner, Stephen C.; Vinton, M.; Williams, P. (1 October 2004). "Introduction to Dolby Digital Plus, an Enhancement to the Dolby Digital Coding System" (PDF). Journal of The Audio Engineering Society. Archived from the original (PDF) on 2016-11-19.
  4. "avcodec/eac3: add support for dependent stream · FFmpeg/FFmpeg@ae92970". GitHub. Retrieved 2019-06-10.