MPEG-H 3D Audio

Last updated August 09, 2024

MPEG-H 3D Audio, specified as ISO/IEC 23008-3 (MPEG-H Part 3), is an audio coding standard developed by the ISO/IEC Moving Picture Experts Group (MPEG) to support coding audio as audio channels, audio objects, or higher order ambisonics (HOA). MPEG-H 3D Audio can support up to 64 loudspeaker channels and 128 codec core channels.

Objects may be used alone or in combination with channels or HOA components. The use of audio objects allows for interactivity or personalization of a program by adjusting the gain or position of the objects during rendering in the MPEG-H decoder. Audio is encoded using an improved modified discrete cosine transform (MDCT) algorithm.^[1]
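As a rough illustration of what gain interactivity means at render time, the sketch below mixes mono object signals after applying a per-object user gain offset. The `AudioObject` class, the `min_gain_db` and `max_gain_db` limits, and the `mix_objects` function are hypothetical names chosen for this example; they are not taken from the MPEG-H specification or from any decoder API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AudioObject:
    """A mono audio object with an assumed permitted user-gain range."""
    name: str
    samples: List[float]
    min_gain_db: float = -12.0  # assumed lower limit, for illustration only
    max_gain_db: float = 12.0   # assumed upper limit, for illustration only


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def mix_objects(objects: List[AudioObject],
                user_gain_db: Dict[str, float]) -> List[float]:
    """Sum all objects into one signal, applying clamped user gains."""
    length = max((len(o.samples) for o in objects), default=0)
    out = [0.0] * length
    for obj in objects:
        requested = user_gain_db.get(obj.name, 0.0)
        # Keep the listener's request inside the range the object allows.
        clamped = min(max(requested, obj.min_gain_db), obj.max_gain_db)
        gain = db_to_linear(clamped)
        for i, sample in enumerate(obj.samples):
            out[i] += gain * sample
    return out
```

A listener who wants louder dialogue and a quieter crowd could call `mix_objects(objects, {"dialogue": 6.0, "crowd": -6.0})`; requests outside an object's range are clamped rather than rejected.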

Channels, objects, and HOA components may be used to transmit immersive sound as well as mono, stereo, or surround sound. The MPEG-H 3D Audio decoder renders the bitstream to a number of standard speaker configurations as well as to setups in which the speakers are not at their standard positions (misplaced speakers). Binaural rendering of sound for headphone listening is also supported.
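To give a feel for rendering to loudspeakers that are not at their nominal positions, the following sketch applies simple two-dimensional pairwise amplitude panning. It is a simplified, generic technique and is not the renderer specified by the standard. The function name `pan_2d` and the use of azimuth angles in degrees are assumptions made for this example.

```python
import math
from typing import List


def pan_2d(source_az_deg: float, speaker_az_deg: List[float]) -> List[float]:
    """Return one gain per speaker (same order as the input list).

    Only the two speakers adjacent to the source direction receive a
    non-zero gain; the gains are normalized to unit total power.
    """
    n = len(speaker_az_deg)
    # Visit speakers in order of increasing azimuth around the circle.
    order = sorted(range(n), key=lambda i: speaker_az_deg[i] % 360.0)
    src = source_az_deg % 360.0
    gains = [0.0] * n
    for k in range(n):
        ia, ib = order[k], order[(k + 1) % n]
        a = speaker_az_deg[ia] % 360.0
        b = speaker_az_deg[ib] % 360.0
        span = (b - a) % 360.0      # arc covered by this speaker pair
        offset = (src - a) % 360.0  # source position within that arc
        if 0.0 < span < 180.0 and offset <= span:
            # Solve g_a * u(a) + g_b * u(b) = c * u(src) for the pair;
            # the common scale factor drops out in the normalization.
            ga = math.sin(math.radians(span - offset))
            gb = math.sin(math.radians(offset))
            norm = math.hypot(ga, gb)
            gains[ia] = ga / norm
            gains[ib] = gb / norm
            return gains
    raise ValueError("no loudspeaker pair spans the source direction")
```

With speakers at +45 and -20 degrees instead of the usual ±30, `pan_2d(0.0, [45.0, -20.0])` gives the closer speaker the larger gain, which keeps the source centered despite the irregular layout.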

These are the ISO standards relating to MPEG-H 3D Audio:

ISO/IEC 23008-3:2022 - Part 3: 3D audio

ISO/IEC 23008-6:2021/Amd 1:2024 - Part 6: 3D audio reference software

ISO/IEC 23008-9:2023 - Part 9: 3D Audio conformance testing

History

In January 2013, the requirements for MPEG-H 3D Audio were released; they called for more immersive audio and for support of a greater number of loudspeakers for audio localization.^[2] The permitted audio types were audio channels, audio objects, and HOA.^[2]

On September 10, 2014, Fraunhofer IIS demonstrated a real-time MPEG-H 3D Audio encoder.^[3]

In February 2015, MPEG announced that MPEG-H 3D Audio would be published as an International Standard.^[4]

On March 10, 2015, the Advanced Television Systems Committee announced that MPEG-H 3D Audio was one of the three standards proposed for the audio system of ATSC 3.0.^[5]

On April 10, 2015, Fraunhofer, Technicolor, and Qualcomm demonstrated a live broadcast signal chain consisting of all the elements needed to implement MPEG-H based audio in broadcast television. The demonstration featured a simulated remote truck at a sports event, a network control center, a local affiliate station, and a consumer living room. The audio was produced and encoded through an MPEG-H audio monitoring and authoring unit, MPEG-H real-time broadcast encoders, and real-time professional and consumer MPEG-H decoders. The audio was decoded in the consumer living room on a Technicolor set-top box.^[6]^[7]

In April 2015, the Advanced Television Systems Committee announced that systems from Dolby Laboratories and the MPEG-H Audio Alliance (Fraunhofer, Technicolor, and Qualcomm) would be tested in the coming months for use as the audio layer for the ATSC 3.0 signal.^[8]

In August 2015, the Advanced Television Systems Committee announced that systems from Dolby Laboratories and the MPEG-H Audio Alliance were demonstrated to the ATSC showing how they would work in both professional broadcast facilities and consumer home environments.^[9]^[10]

On April 18, 2016, South Korean broadcast equipment manufacturers Kai Media and DS Broadcast announced the availability of MPEG-H 3D Audio in their latest 4K broadcast encoders.^[11]

On May 2, 2016, the Advanced Television Systems Committee elevated the A/342 audio standard for ATSC 3.0 to the status of a Candidate Standard. The MPEG-H Audio Alliance TV audio system and Dolby AC-4 are part of the A/342 standard.^[12]

On June 24, 2016, the South Korean standardization organization Telecommunications Technology Association (TTA) published the standard "Transmission and Reception of Terrestrial UHD TV Broadcasting Service" for the South Korean terrestrial UHD TV broadcasting service, which was scheduled to launch in February 2017. The TTA standard is based on ATSC 3.0 and specifies MPEG-H 3D Audio as the sole audio codec for the 4K TV system.^[13]^[14]^[15]

On January 3, 2017, Fraunhofer IIS announced a trademark program to identify interoperable products that include MPEG-H.^[16]

On January 8, 2019, Sony announced an immersive music service "360 Reality Audio" that uses MPEG-H.^[17]^[18]^[19]

Profiles

The Main profile of MPEG-H 3D Audio has five levels.^[20]

Levels for the Main profile of MPEG-H 3D Audio^[20]
Level	Maximum number of core channels	Maximum number of loudspeaker channels
1	8	8
2	16	16
3	32	24
4	64	24
5	128	64
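
The level limits lend themselves to a simple lookup. The sketch below encodes the table's values and returns the lowest Main profile level that accommodates a given configuration. The names `MAIN_PROFILE_LEVELS` and `minimum_main_profile_level` are hypothetical, and the check covers only the two channel limits listed here, not any other constraint a level may impose.

```python
from typing import Optional

# (level, max core channels, max loudspeaker channels), as listed in the table
MAIN_PROFILE_LEVELS = [
    (1, 8, 8),
    (2, 16, 16),
    (3, 32, 24),
    (4, 64, 24),
    (5, 128, 64),
]


def minimum_main_profile_level(core_channels: int,
                               loudspeaker_channels: int) -> Optional[int]:
    """Return the lowest level whose limits cover both counts, or None."""
    for level, max_core, max_speakers in MAIN_PROFILE_LEVELS:
        if core_channels <= max_core and loudspeaker_channels <= max_speakers:
            return level
    return None
```

For example, a configuration with 40 core channels and 30 loudspeaker channels exceeds the 24-loudspeaker limit of levels 3 and 4, so the function returns level 5.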

MPEG announced that Amendment 3 to MPEG-H 3D Audio would be available in late 2016. This amendment defines the Low Complexity Profile, which includes technology that increases coding efficiency and adds features designed for use in the broadcast industry.^[21]

Related Research Articles

MP3: Digital audio format

MP3 is a coding format for digital audio developed largely by the Fraunhofer Society in Germany under the lead of Karlheinz Brandenburg, with support from other digital scientists in other countries. Originally defined as the third audio format of the MPEG-1 standard, it was retained and further extended—defining additional bit rates and support for more audio channels—as the third audio format of the subsequent MPEG-2 standard. A third version, known as MPEG-2.5—extended to better support lower bit rates—is commonly implemented but is not a recognized standard.

Moving Picture Experts Group: Alliance of working groups to set standards for multimedia coding

The Moving Picture Experts Group (MPEG) is an alliance of working groups established jointly by ISO and IEC that sets standards for media coding, including compression coding of audio, video, graphics, and genomic data; and transmission and file formats for various applications. Together with JPEG, MPEG is organized under ISO/IEC JTC 1/SC 29 – Coding of audio, picture, multimedia and hypermedia information.

MPEG-2: Video encoding standard

MPEG-2 is a standard for "the generic coding of moving pictures and associated audio information". It describes a combination of lossy video compression and lossy audio data compression methods, which permit storage and transmission of movies using currently available storage media and transmission bandwidth. While MPEG-2 is not as efficient as newer standards such as H.264/AVC and H.265/HEVC, backwards compatibility with existing hardware and software means it is still widely used, for example in over-the-air digital television broadcasting and in the DVD-Video standard.

MPEG-4 is a group of international standards for the compression of digital audio and visual data, multimedia systems, and file storage formats. It was originally introduced in late 1998 as a group of audio and video coding formats and related technology agreed upon by the ISO/IEC Moving Picture Experts Group (MPEG) under the formal standard ISO/IEC 14496 – Coding of audio-visual objects. Uses of MPEG-4 include compression of audiovisual data for Internet video and CD distribution, voice and broadcast television applications. The MPEG-4 standard was developed by a group led by Touradj Ebrahimi and Fernando Pereira.

MPEG-1 Audio Layer II or MPEG-2 Audio Layer II is a lossy audio compression format defined by ISO/IEC 11172-3 alongside MPEG-1 Audio Layer I and MPEG-1 Audio Layer III (MP3). While MP3 is much more popular for PC and Internet applications, MP2 remains a dominant standard for audio broadcasting.

Advanced Audio Coding (AAC) is an audio coding standard for lossy digital audio compression. It was designed to be the successor of the MP3 format and generally achieves higher sound quality than MP3 at the same bit rate.

Advanced Video Coding (AVC), also referred to as H.264 or MPEG-4 Part 10, is a video compression standard based on block-oriented, motion-compensated coding. It is by far the most commonly used format for the recording, compression, and distribution of video content, used by 91% of video industry developers as of September 2019. It supports a maximum resolution of 8K UHD.

Advanced Television Systems Committee (ATSC) standards are an international set of standards for broadcast and digital television transmission over terrestrial, cable and satellite networks. They are largely a replacement for the analog NTSC standard and, like that standard, are used mostly in the United States, Mexico, Canada, South Korea and Trinidad & Tobago. Several former NTSC users, such as Japan, did not adopt ATSC during their digital television transition because they chose other systems, such as ISDB, developed by Japan, or DVB, developed in Europe.

High-Efficiency Advanced Audio Coding (HE-AAC) is an audio coding format for lossy data compression of digital audio defined as an MPEG-4 Audio profile in ISO/IEC 14496–3. It is an extension of Low Complexity AAC (AAC-LC) optimized for low-bitrate applications such as streaming audio. The usage profile HE-AAC v1 uses spectral band replication (SBR) to enhance the modified discrete cosine transform (MDCT) compression efficiency in the frequency domain. The usage profile HE-AAC v2 couples SBR with Parametric Stereo (PS) to further enhance the compression efficiency of stereo signals.

MPEG-4 Part 2, MPEG-4 Visual is a video compression format developed by the Moving Picture Experts Group (MPEG). It belongs to the MPEG-4 ISO/IEC standards. It uses block-wise motion compensation and a discrete cosine transform (DCT), similar to previous standards such as MPEG-1 Part 2 and H.262/MPEG-2 Part 2.

MPEG Multichannel, also known as MPEG-2 Backwards Compatible, or MPEG-2 BC, is an extension to the MPEG-1 Layer II audio compression specification, as defined in the MPEG-2 Audio standard, which allows it to provide up to 5.1 channels of audio. To maintain backwards compatibility with the older 2-channel (stereo) audio specification, it uses a channel matrixing scheme, where the additional channels are mixed into the two backwards compatible channels. Extra information in the data stream contains signals to process extra channels from the matrix.

The Video Coding Experts Group or Visual Coding Experts Group is a working group of the ITU Telecommunication Standardization Sector (ITU-T) concerned with standards for compression coding of video, images, audio signals, biomedical waveforms, and other signals. It is responsible for standardization of the "H.26x" line of video coding standards, the "T.8xx" line of image coding standards, and related technologies.

The MPEG-4 Low Delay Audio Coder is an audio compression standard designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. It is closely derived from the MPEG-2 Advanced Audio Coding (AAC) standard. It was published in MPEG-4 Audio Version 2 and in its later revisions.

MPEG Surround, also known as Spatial Audio Coding (SAC) is a lossy compression format for surround sound that provides a method for extending mono or stereo audio services to multi-channel audio in a backwards compatible fashion. The total bit rates used for the core and the MPEG Surround data are typically only slightly higher than the bit rates used for coding of the core. MPEG Surround adds a side-information stream to the core bit stream, containing spatial image data. Legacy stereo playback systems will ignore this side-information while players supporting MPEG Surround decoding will output the reconstructed multi-channel audio.

Unified Speech and Audio Coding (USAC) is an audio compression format and codec for both music and speech or any mix of speech and audio using very low bit rates between 12 and 64 kbit/s. It was developed by the Moving Picture Experts Group (MPEG) and was published as an international standard ISO/IEC 23003-3 and also as an MPEG-4 Audio Object Type in ISO/IEC 14496-3:2009/Amd 3 in 2012.

MPEG media transport (MMT), specified as ISO/IEC 23008-1, is a digital container standard developed by the Moving Picture Experts Group (MPEG) that supports High Efficiency Video Coding (HEVC) video. MMT was designed to transfer data over an all-Internet Protocol (All-IP) network.

MPEG-H is a group of international standards under development by the ISO/IEC Moving Picture Experts Group (MPEG). It has various "parts" – each of which can be considered a separate standard. These include a media transport protocol standard, a video compression standard, an audio compression standard, a digital file format container standard, three reference software packages, three conformance testing standards, and related technologies and technical reports. The group of standards is formally known as ISO/IEC 23008 – High efficiency coding and media delivery in heterogeneous environments. Development of the standards began around 2010, and the first fully approved standard in the group was published in 2013. Most of the standards in the group have been revised or amended several times to add additional extended features since their first edition.

Dolby AC-4 is an audio compression technology developed by Dolby Laboratories. Dolby AC-4 bitstreams can contain audio channels and/or audio objects. Dolby AC-4 has been adopted by the DVB project and standardized by the ETSI.

References

[1] Bleidt, R. L.; Sen, D.; Niedermeier, A.; Czelhan, B.; Füg, S.; et al. (2017). "Development of the MPEG-H TV Audio System for ATSC 3.0" (PDF). IEEE Transactions on Broadcasting. 63 (1): 202–236. doi:10.1109/TBC.2017.2661258. S2CID 30821673.
[2] "Call for Proposals on 3D Audio". MPEG. Retrieved 2015-03-14.
[3] "Fraunhofer IIS Demonstrates Real-Time MPEG-H Audio Encoder System for Broadcast Applications at IBC". Business Wire. 2014-09-10. Retrieved 2015-03-15.
[4] "MPEG-H 3D Audio progresses to International Standard". MPEG. Retrieved 2015-03-14.
[5] "Advanced Television Systems Committee Begins Review of ATSC 3.0 Audio System Proposals". Advanced Television Systems Committee. 2015-03-10. Archived from the original on 2015-03-13. Retrieved 2015-03-14.
[6] "Fraunhofer IIS, Qualcomm and Technicolor to Demonstrate the World's First Live Broadcast of MPEG-H Interactive and Immersive TV Audio". Business Wire. Retrieved 9 October 2015.
[7] "MPEG-H Audio Brings New Features to TV and Streaming Sound". Electronic Design. July 10, 2015.
[8] "Evaluation of Proposed ATSC 3.0 Audio Systems Begins". Advanced Television Systems Committee. April 2015. Retrieved 9 October 2015.
[9] "Listen Up! Atlanta Hears ATSC 3.0 Audio As Proponents Demonstrate Advantages". Advanced Television Systems Committee. 3 August 2015. Retrieved 9 October 2015.
[10] "Demonstrations Show Off Potential of ATSC 3.0 Audio Standard". Sports Video Group. Retrieved 9 October 2015.
[11] "First broadcast encoders with MPEG-H Audio launched | Fraunhofer Audio Blog". 2016-04-18. Retrieved 2016-08-02.
[12] "More ATSC 3.0 Standards Progress! - ATSC". 3 May 2016. Retrieved 2016-08-02.
[13] "Transmission and Reception for Terrestrial UHDTV Broadcasting Service". TTA. 2016-06-24. Retrieved 2016-08-02.
[14] "Korea Reveals Its Plans for UHDTV at NAB Show". 4 May 2016. Retrieved 2016-08-02.
[15] "World's 1st Terrestrial UHD TV Service With MPEG-H Audio | Fraunhofer Audio Blog". 2016-06-30. Retrieved 2016-08-02.
[16] "Fraunhofer Announces MPEG-H Trademark to Identify Interoperable Products". Business Wire. 3 January 2017. Retrieved 11 January 2017.
[17] Sony. "Sony Introduces All New "360 Reality Audio" Music Experience That Immerses Listeners in a Three-Dimensional Sound Field Powered by Object-Based Spatial Audio Technology". www.prnewswire.com (Press release). Retrieved 2019-01-26.
[18] "Sony Unveils the 'Future of Music' With 360 Reality Audio at CES 2019". Billboard. 11 January 2019. Retrieved 2019-01-26.
[19] "I want Sony's 360 Reality Audio to be the future of music". Engadget. Retrieved 2019-01-26.
[20] "Text of ISO/IEC 23008-3/ DAM, 3D Audio Profiles". MPEG. Retrieved 2015-03-14.
[21] "MPEG 115 - Geneva - MPEG-H 3D Audio AMD 3 reaches FDAM status | MPEG". mpeg.chiariglione.org. Retrieved 2016-08-02.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
