MPEG-4 Part 3


MPEG-4 Part 3 or MPEG-4 Audio (formally ISO/IEC 14496-3) is the third part of the ISO/IEC MPEG-4 international standard developed by the Moving Picture Experts Group. [1] It specifies audio coding methods. The first version of ISO/IEC 14496-3 was published in 1999. [2]


MPEG-4 Part 3 comprises a variety of audio coding technologies: lossy speech coding (HVXC, CELP), general audio coding (AAC, TwinVQ, BSAC), lossless audio compression (MPEG-4 SLS, Audio Lossless Coding, MPEG-4 DST), a Text-To-Speech Interface (TTSI), Structured Audio (using SAOL, SASL, MIDI) and many additional audio synthesis and coding techniques. [3] [4] [5] [6] [7] [8] [9] [10] [11]

MPEG-4 Audio does not target a single application such as real-time telephony or high-quality audio compression. It applies to every application which requires the use of advanced sound compression, synthesis, manipulation, or playback. MPEG-4 Audio is a new type of audio standard that integrates numerous different types of audio coding: natural sound and synthetic sound, low bitrate delivery and high-quality delivery, speech and music, complex soundtracks and simple ones, traditional content and interactive content. [7]

Versions

MPEG-4 Audio versions and editions [12]
| Edition | Release date | Latest amendment | Standard | Description |
|---|---|---|---|---|
| First edition | 1999 | 2001 | ISO/IEC 14496-3:1999 [2] | also known as "MPEG-4 Audio Version 1" |
| First edition | 2000 | | ISO/IEC 14496-3:1999/Amd 1:2000 [13] | also known as "MPEG-4 Audio Version 2", an Amendment to first edition [7] [8] |
| Second edition | 2001 | 2005 | ISO/IEC 14496-3:2001 [14] | |
| Third edition | 2005 | 2008 | ISO/IEC 14496-3:2005 [15] | |
| Fourth edition | 2009 | 2015 and under development [12] | ISO/IEC 14496-3:2009 [1] [16] | |
| Fifth edition | 2019 | | ISO/IEC 14496-3:2019 [17] | Current version |

Subparts

MPEG-4 Part 3 contains the following subparts: [16]

MPEG-4 Audio Object Types

MPEG-4 Audio includes a system for handling a diverse group of audio formats in a uniform manner. Each format is assigned a unique Audio Object Type to represent it. [18] [19] The object type is used to distinguish between different coding methods, and it directly determines the MPEG-4 tool subset required to decode a specific object. The MPEG-4 profiles are based on the object types, and each profile supports a different list of object types. [19]

MPEG-4 Audio Object Types [7] [9] [18] [20] [21]
| Object Type ID | Audio Object Type | First public release date | Description |
|---|---|---|---|
| 1 | AAC Main | 1999 | contains AAC LC |
| 2 | AAC LC (Low Complexity) | 1999 | Used in the "AAC Profile". MPEG-4 AAC LC Audio Object Type is based on the MPEG-2 Part 7 Low Complexity profile (LC) combined with Perceptual Noise Substitution (PNS) (defined in MPEG-4 Part 3 Subpart 4). [4] [22] |
| 3 | AAC SSR (Scalable Sample Rate) | 1999 | MPEG-4 AAC SSR Audio Object Type is based on the MPEG-2 Part 7 Scalable Sampling Rate profile (SSR) combined with Perceptual Noise Substitution (PNS) (defined in MPEG-4 Part 3 Subpart 4). [4] [22] |
| 4 | AAC LTP (Long Term Prediction) | 1999 | contains AAC LC |
| 5 | SBR (Spectral Band Replication) | 2003 [23] | used with AAC LC in the "High Efficiency AAC Profile" (HE-AAC v1) |
| 6 | AAC Scalable | 1999 | |
| 7 | TwinVQ | 1999 | audio coding at very low bitrates |
| 8 | CELP (Code Excited Linear Prediction) | 1999 | speech coding |
| 9 | HVXC (Harmonic Vector eXcitation Coding) | 1999 | speech coding |
| 10 | (Reserved) | | |
| 11 | (Reserved) | | |
| 12 | TTSI (Text-To-Speech Interface) | 1999 | |
| 13 | Main synthesis | 1999 | contains 'wavetable' sample-based synthesis [24] and Algorithmic Synthesis and Audio Effects |
| 14 | 'wavetable' sample-based synthesis | 1999 | based on SoundFont and DownLoadable Sounds, [24] contains General MIDI |
| 15 | General MIDI | 1999 | |
| 16 | Algorithmic Synthesis and Audio Effects | 1999 | |
| 17 | ER AAC LC | 2000 | Error Resilient |
| 18 | (Reserved) | | |
| 19 | ER AAC LTP | 2000 | Error Resilient |
| 20 | ER AAC Scalable | 2000 | Error Resilient |
| 21 | ER TwinVQ | 2000 | Error Resilient |
| 22 | ER BSAC (Bit-Sliced Arithmetic Coding) | 2000 | Error Resilient. It is also known as "Fine Granule Audio" or fine grain scalability tool. It is used in combination with the AAC coding tools and replaces the noiseless coding and the bitstream formatting of the MPEG-4 Version 1 GA coder. |
| 23 | ER AAC LD (Low Delay) | 2000 | Error Resilient, used with CELP, ER CELP, HVXC, ER HVXC and TTSI in the "Low Delay Profile" (commonly used for real-time conversation applications) |
| 24 | ER CELP | 2000 | Error Resilient |
| 25 | ER HVXC | 2000 | Error Resilient |
| 26 | ER HILN (Harmonic and Individual Lines plus Noise) | 2000 | Error Resilient |
| 27 | ER Parametric | 2000 | Error Resilient |
| 28 | SSC (SinuSoidal Coding) | 2004 [25] [26] | |
| 29 | PS (Parametric Stereo) | 2004 [27] and 2006 [28] [29] | used with AAC LC and SBR in the "HE-AAC v2 Profile". The PS coding tool was defined in 2004 and the Object Type was defined in 2006. |
| 30 | MPEG Surround | 2007 [30] | also known as MPEG Spatial Audio Coding (SAC), it is a type of spatial audio coding [31] [32] (MPEG Surround was also defined in ISO/IEC 23003-1 in 2007 [33]) |
| 31 | (ESCAPE) | | |
| 32 | MPEG-1/2 Layer-1 | 2005 [34] | |
| 33 | MPEG-1/2 Layer-2 | 2005 [34] | |
| 34 | MPEG-1/2 Layer-3 | 2005 [34] | also known as "MP3onMP4" |
| 35 | DST (Direct Stream Transfer) | 2005 [35] | lossless audio coding, used on Super Audio CD |
| 36 | ALS (Audio Lossless Coding) | 2006 [29] | lossless audio coding |
| 37 | SLS (Scalable Lossless Coding) | 2006 [36] | two-layer audio coding with lossless layer and lossy General Audio core/layer (e.g. AAC) |
| 38 | SLS non-core | 2006 | lossless audio coding without lossy General Audio core/layer (e.g. AAC) |
| 39 | ER AAC ELD (Enhanced Low Delay) | 2008 [37] | Error Resilient |
| 40 | SMR (Symbolic Music Representation) Simple | 2008 | note: Symbolic Music Representation is also the MPEG-4 Part 23 standard (ISO/IEC 14496-23:2008) [38] [39] |
| 41 | SMR Main | 2008 | |
| 42 | USAC (Unified Speech and Audio Coding) | 2012 | Unified Speech and Audio Coding is defined in MPEG-D Part 3 (ISO/IEC 23003-3:2012) [40] |
| 43 | SAOC (Spatial Audio Object Coding) | 2010 [41] [42] | note: Spatial Audio Object Coding is also the MPEG-D Part 2 standard (ISO/IEC 23003-2:2010) [43] |
| 44 | LD MPEG Surround | 2010 [44] | This object type conveys Low Delay MPEG Surround Coding side information (that was defined in MPEG-D Part 2 – ISO/IEC 23003-2 [43]) in the MPEG-4 Audio framework. |
| 45 | SAOC-DE | 2013 | Spatial Audio Object Coding Dialogue Enhancement |
| 46 | Audio Sync | 2015 | The audio synchronization tool provides capability of synchronizing multiple contents in multiple devices. |
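
As a rough illustration of how a decoder might use these IDs, the Python sketch below reads the Audio Object Type from the first bits of a decoder configuration and looks up its name. It rests on assumptions that the article itself does not spell out: that the ID is stored as a 5-bit field, and that the value 31 (ESCAPE) means the real ID is 32 plus the following 6 bits. The helper names (`read_bits`, `audio_object_type`, `describe`) are hypothetical, and the name table is only a subset of the list above.

```python
# Subset of the Audio Object Type table above (ID -> short name).
AUDIO_OBJECT_TYPES = {
    1: "AAC Main", 2: "AAC LC", 3: "AAC SSR", 4: "AAC LTP", 5: "SBR",
    6: "AAC Scalable", 7: "TwinVQ", 8: "CELP", 9: "HVXC", 12: "TTSI",
    17: "ER AAC LC", 22: "ER BSAC", 23: "ER AAC LD", 29: "PS",
    30: "MPEG Surround", 36: "ALS", 37: "SLS", 39: "ER AAC ELD", 42: "USAC",
}

ESCAPE = 31  # ID 31 in the table is the escape value


def read_bits(data: bytes, pos: int, count: int) -> int:
    """Return `count` bits starting at bit offset `pos`, most significant bit first."""
    value = 0
    for i in range(pos, pos + count):
        bit = (data[i // 8] >> (7 - i % 8)) & 1
        value = (value << 1) | bit
    return value


def audio_object_type(config: bytes) -> int:
    """Read the object type ID (assumed layout: 5 bits, optional 6-bit extension)."""
    aot = read_bits(config, 0, 5)
    if aot == ESCAPE:
        # Assumption: escaped IDs are coded as 32 + the next 6 bits.
        aot = 32 + read_bits(config, 5, 6)
    return aot


def describe(config: bytes) -> str:
    """Map the object type ID of a configuration to a name from the table."""
    return AUDIO_OBJECT_TYPES.get(audio_object_type(config), "reserved or unknown")
```

For example, `describe(bytes([0x12, 0x10]))` yields `"AAC LC"`, because the first five bits of `0x12` are `00010`.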

Audio Profiles

Hierarchical structure of AAC Profile, HE-AAC Profile and HE-AAC v2 Profile, and compatibility between them. The HE-AAC Profile decoder is fully capable of decoding any AAC Profile stream. Similarly the HE-AAC v2 decoder can handle all HE-AAC Profile streams as well as all AAC Profile streams. Based on the MPEG-4 Part 3 technical specification.

The MPEG-4 Audio standard defines several profiles. These profiles are based on the object types, and each profile supports a different list of object types. Each profile may also have several levels, which limit some parameters of the tools present in a profile. These parameters are usually the sampling rate and the number of audio channels decoded at the same time.

MPEG-4 Audio Profiles [19] [21]
| Audio Profile | Audio Object Types | First public release date |
|---|---|---|
| AAC Profile | AAC LC | 2003 |
| High Efficiency AAC Profile | AAC LC, SBR | 2003 |
| HE-AAC v2 Profile | AAC LC, SBR, PS | 2006 |
| Main Audio Profile | AAC Main, AAC LC, AAC SSR, AAC LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI, Main synthesis | 1999 |
| Scalable Audio Profile | AAC LC, AAC LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI | 1999 |
| Speech Audio Profile | CELP, HVXC, TTSI | 1999 |
| Synthetic Audio Profile | TTSI, Main synthesis | 1999 |
| High Quality Audio Profile | AAC LC, AAC LTP, AAC Scalable, CELP, ER AAC LC, ER AAC LTP, ER AAC Scalable, ER CELP | 2000 |
| Low Delay Audio Profile | CELP, HVXC, TTSI, ER AAC LD, ER CELP, ER HVXC | 2000 |
| Natural Audio Profile | AAC Main, AAC LC, AAC SSR, AAC LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI, ER AAC LC, ER AAC LTP, ER AAC Scalable, ER TwinVQ, ER BSAC, ER AAC LD, ER CELP, ER HVXC, ER HILN, ER Parametric | 2000 |
| Mobile Audio Internetworking Profile | ER AAC LC, ER AAC Scalable, ER TwinVQ, ER BSAC, ER AAC LD | 2000 |
| HD-AAC Profile | AAC LC, SLS [45] | 2009 [46] |
| ALS Simple Profile | ALS | 2010 [42] [47] |
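
The relationship between profiles and object types can be modelled as simple set containment. The Python sketch below is an illustration under stated assumptions, not a conformance check: it copies a few rows from the table above, ignores levels (sampling rate and channel limits) entirely, and uses hypothetical function names.

```python
# A few rows of the profile table above, as sets of object type names.
PROFILE_OBJECT_TYPES = {
    "AAC Profile": {"AAC LC"},
    "High Efficiency AAC Profile": {"AAC LC", "SBR"},
    "HE-AAC v2 Profile": {"AAC LC", "SBR", "PS"},
    "Speech Audio Profile": {"CELP", "HVXC", "TTSI"},
    "Low Delay Audio Profile": {
        "CELP", "HVXC", "TTSI", "ER AAC LD", "ER CELP", "ER HVXC",
    },
}


def profile_supports(profile: str, object_types) -> bool:
    """True if every object type used by a stream is listed for the profile."""
    return set(object_types) <= PROFILE_OBJECT_TYPES[profile]


def covers(decoder_profile: str, stream_profile: str) -> bool:
    """True if the decoder profile lists every object type of the stream profile.

    Levels are not modelled here, so this is only a necessary condition.
    """
    return PROFILE_OBJECT_TYPES[stream_profile] <= PROFILE_OBJECT_TYPES[decoder_profile]
```

Under this model, `covers("HE-AAC v2 Profile", "AAC Profile")` is true while `covers("AAC Profile", "High Efficiency AAC Profile")` is false, which mirrors the hierarchy of the AAC, HE-AAC and HE-AAC v2 profiles described in this section.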

Audio storage and transport

Multiplex, storage and transmission formats for MPEG-4 Audio [16]
| Type | Standard | Description |
|---|---|---|
| Multiplex | ISO/IEC 14496-1 | MPEG-4 Multiplex scheme (M4Mux) [48] |
| Multiplex | ISO/IEC 14496-3 | Low Overhead Audio Transport Multiplex (LATM) |
| Storage | ISO/IEC 14496-3 (informative) | Audio Data Interchange Format (ADIF) – only for AAC |
| Storage | ISO/IEC 14496-12 | MPEG-4 file format (MP4) / ISO base media file format |
| Transmission | ISO/IEC 14496-3 (informative) | Audio Data Transport Stream (ADTS) – only for AAC |
| Transmission | ISO/IEC 14496-3 | Low Overhead Audio Stream (LOAS), based on LATM |
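
To make the ADTS entry more concrete, the following Python sketch decodes the fixed part of an ADTS frame header. The article does not describe the header layout, so every field position used here is an assumption based on the commonly documented 7-byte header (12-bit syncword, 2-bit profile stored as object type minus one, 4-bit sampling frequency index, 3-bit channel configuration, 13-bit frame length). The names `AdtsHeader` and `parse_adts_header` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed sampling frequency index table (index -> Hz).
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]


@dataclass
class AdtsHeader:
    object_type: int            # Audio Object Type ID (profile field + 1)
    sample_rate: Optional[int]  # None if the index is not in the table
    channel_config: int
    frame_length: int           # whole frame in bytes, header included
    has_crc: bool


def parse_adts_header(data: bytes) -> AdtsHeader:
    """Decode the first 7 bytes of an ADTS frame (assumed field layout)."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("missing 0xFFF syncword")
    has_crc = (data[1] & 0x01) == 0            # protection_absent == 0
    profile = (data[2] >> 6) & 0x03
    sf_index = (data[2] >> 2) & 0x0F
    channel_config = ((data[2] & 0x01) << 2) | (data[3] >> 6)
    frame_length = ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5)
    rate = SAMPLE_RATES[sf_index] if sf_index < len(SAMPLE_RATES) else None
    return AdtsHeader(profile + 1, rate, channel_config, frame_length, has_crc)
```

Because the profile field is only two bits wide in this layout, such a header can signal only object types 1 to 4, which is consistent with the table's remark that ADTS is used only for AAC.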

There is no standard for transport of elementary streams over a channel, because the delivery requirements of the broad range of MPEG-4 applications are too varied to be characterized by a single solution.

The capabilities of a transport layer and the communication between transport, multiplex, and demultiplex functions are described in the Delivery Multimedia Integration Framework (DMIF) in ISO/IEC 14496-6. [16] A wide variety of delivery mechanisms exist below this interface, e.g., MPEG transport stream, Real-time Transport Protocol (RTP), etc.

Transport in Real-time Transport Protocol is defined in RFC 3016 (RTP Payload Format for MPEG-4 Audio/Visual Streams), RFC 3640 (RTP Payload Format for Transport of MPEG-4 Elementary Streams), RFC 4281 (The Codecs Parameter for "Bucket" Media Types) and RFC 4337 (MIME Type Registration for MPEG-4).
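
The codecs parameter mentioned above is where Audio Object Type IDs most often surface in practice. The Python sketch below assumes the widely used string form `mp4a.40.N`, where `40` is a hexadecimal object type indication for MPEG-4 Audio and `N` is the decimal Audio Object Type ID. That syntax is not described in this article and is stated here as an assumption; `parse_mp4a_codec` is a hypothetical helper name.

```python
from typing import Optional

MPEG4_AUDIO_OTI = 0x40  # assumed object type indication for MPEG-4 Audio


def parse_mp4a_codec(codec: str) -> Optional[int]:
    """Return the Audio Object Type ID from a string such as 'mp4a.40.2'.

    Returns None when the optional object type element is absent.
    Raises ValueError for strings that do not describe MPEG-4 Audio.
    """
    parts = codec.strip().split(".")
    if parts[0] != "mp4a" or len(parts) not in (2, 3):
        raise ValueError(f"not an mp4a codec string: {codec!r}")
    if int(parts[1], 16) != MPEG4_AUDIO_OTI:
        raise ValueError(f"not MPEG-4 Audio: {codec!r}")
    if len(parts) == 2:
        return None
    return int(parts[2])  # decimal Audio Object Type ID
```

With this reading, `mp4a.40.2` refers to AAC LC, `mp4a.40.5` to SBR and `mp4a.40.29` to PS in the Audio Object Type table.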

LATM and LOAS were defined for natural audio applications, which do not require sophisticated object-based coding or other functions provided by MPEG-4 Systems.

Bifurcation in the AAC technical standard

The Advanced Audio Coding in MPEG-4 Part 3 (MPEG-4 Audio) Subpart 4 was enhanced relative to the previous standard MPEG-2 Part 7 (Advanced Audio Coding), in order to provide better sound quality for a given encoding bitrate.

It is assumed that any Part 3 and Part 7 differences will be ironed out by the ISO standards body in the near future to avoid the possibility of future bitstream incompatibilities. At present there are no known player or codec incompatibilities due to the newness of the standard.

The MPEG-2 Part 7 standard (Advanced Audio Coding) was first published in 1997 and offers three default profiles: [49] [50] Low Complexity profile (LC), Main profile and Scalable Sampling Rate profile (SSR).

The MPEG-4 Part 3 Subpart 4 (General Audio Coding) combined the profiles from MPEG-2 Part 7 with Perceptual Noise Substitution (PNS) and defined them as Audio Object Types (AAC LC, AAC Main, AAC SSR). [4]

HE-AAC

High-Efficiency Advanced Audio Coding is an extension of AAC LC using spectral band replication (SBR) and Parametric Stereo (PS). It is designed to increase coding efficiency at low bitrates by using a partial parametric representation of audio.

AAC-SSR

AAC Scalable Sample Rate was introduced by Sony to the MPEG-2 Part 7 and MPEG-4 Part 3 standards.[citation needed] It was first published in ISO/IEC 13818-7, Part 7: Advanced Audio Coding (AAC) in 1997. [49] [50] The audio signal is first split into 4 bands using a 4-band polyphase quadrature filter (PQF) bank. These 4 bands are then further split using MDCTs with a size k of 32 or 256 samples. This is similar to normal AAC LC, which uses MDCTs with a size k of 128 or 1024 directly on the audio signal.

The advantage of this technique is that short block switching can be done separately for every PQF band. High frequencies can therefore be encoded using short blocks to enhance temporal resolution, while low frequencies are still encoded with high spectral resolution. However, due to aliasing between the 4 PQF bands, coding efficiency around the band edges at fs/8, 2·fs/8 and 3·fs/8 (where fs is the sampling frequency) is worse than that of normal MPEG-4 AAC LC.[citation needed]

MPEG-4 AAC-SSR is very similar to ATRAC and ATRAC-3.

Why AAC-SSR was introduced

The idea behind AAC-SSR was not only the advantage listed above, but also the possibility of reducing the data rate by removing 1, 2 or 3 of the upper PQF bands. A very simple bitstream splitter can remove these bands and thus reduce the bitrate and sample rate.
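
The effect of dropping bands can be estimated with simple arithmetic. The Python sketch below assumes that the 4 PQF bands divide the range from 0 to fs/2 equally, so that each band is fs/8 wide and keeping n bands gives a sample rate of n·fs/4. It reports nominal figures only and says nothing about the resulting bitrate or quality. The function name is hypothetical.

```python
def ssr_after_dropping(sample_rate_hz: int, bands_kept: int) -> dict:
    """Nominal sample rate and audio bandwidth after keeping the lowest PQF bands.

    Assumes 4 equal-width bands covering 0 .. sample_rate_hz / 2.
    """
    if not 1 <= bands_kept <= 4:
        raise ValueError("bands_kept must be between 1 and 4")
    band_width_hz = sample_rate_hz / 8          # width of one PQF band
    return {
        "bands_kept": bands_kept,
        "sample_rate_hz": sample_rate_hz * bands_kept / 4,
        "bandwidth_hz": band_width_hz * bands_kept,
    }


if __name__ == "__main__":
    # Print the four possible configurations for a 48 kHz source.
    for n in (4, 3, 2, 1):
        print(ssr_after_dropping(48000, n))
```

For a 48 kHz source, keeping only the lowest band gives a 12 kHz sample rate and a 6 kHz audio bandwidth under these assumptions.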

Example:

Note: although possible, the resulting quality is much worse than is typical for this bitrate. For normal 64 kbit/s AAC LC, a bandwidth of 14–16 kHz is achieved by using intensity stereo and reduced NMRs (noise-to-mask ratios). This degrades audible quality less than transmitting a 6 kHz bandwidth with perfect quality.

BSAC

Bit Sliced Arithmetic Coding is an MPEG-4 standard (ISO/IEC 14496-3 subpart 4) for scalable audio coding. BSAC uses an alternative noiseless coding to AAC, with the rest of the processing being identical to AAC. This support for scalability allows for nearly transparent sound quality at 64 kbit/s and graceful degradation at lower bit rates. BSAC coding is best performed in the range of 40 kbit/s to 64 kbit/s, though it operates in the range of 16 kbit/s to 64 kbit/s. The AAC-BSAC codec is used in Digital Multimedia Broadcasting (DMB) applications.

Licensing

In 2002, the MPEG-4 Audio Licensing Committee selected the Via Licensing Corporation as the Licensing Administrator for the MPEG-4 Audio patent pool. [3] [51] [52]


References

  1. ISO (2009). "ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. Retrieved 2009-10-06.
  2. ISO (1999). "ISO/IEC 14496-3:1999 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. Retrieved 2009-10-06.
  3. Business Wire (2002-12-02). "MPEG-4 Audio Licensing Committee Selects Via Licensing Corporation as Administrator; MPEG-4 Audio Licensing Committee Finalizing Terms for Audio Profile Licensing". The Free Library. Retrieved 2009-10-06.
  4. Karlheinz Brandenburg; Oliver Kunz; Akihiko Sugiyama (1999). "MPEG-4 Natural Audio Coding – Audio profiles and levels". chiariglione.org. Archived from the original on 2010-07-17. Retrieved 2009-10-06.
  5. Karlheinz Brandenburg; Oliver Kunz; Akihiko Sugiyama. "MPEG-4 Natural Audio Coding – scalability in MPEG-4 natural audio". chiariglione.org. Archived from the original on 2010-02-28. Retrieved 2009-10-06.
  6. D. Thom, H. Purnhagen, and the MPEG Audio Subgroup (October 1998). "MPEG Audio FAQ – MPEG-4". chiariglione.org. Retrieved 2009-10-06.
  7. ISO/IEC JTC 1/SC 29/WG 11 (July 1999), ISO/IEC 14496-3:/Amd.1 – Final Committee Draft – MPEG-4 Audio Version 2 (PDF), archived from the original (PDF) on 2012-08-01, retrieved 2009-10-07
  8. Heiko Purnhagen (1999-06-07), An Overview of MPEG-4 Audio Version 2 (PDF), Heiko Purnhagen, archived from the original (PDF) on 2017-07-06, retrieved 2009-10-07
  9. Heiko Purnhagen (2001-06-01). "The MPEG-4 Audio Standard: Overview and Applications". Heiko Purnhagen. Retrieved 2009-10-07. [dead link]
  10. Heiko Purnhagen (2001-11-07). "The MPEG Audio Web Page – MPEG-4 Audio (ISO/IEC 14496-3)". Retrieved 2009-10-07. [dead link]
  11. Rob Koenen, ISO/IEC JTC1/SC29/WG11 (March 2002). "Overview of the MPEG-4 Standard". chiariglione.org. Retrieved 2009-10-06.
  12. MPEG. "MPEG standards – Full list of standards developed or under development". chiariglione.org. Archived from the original on April 20, 2010. Retrieved 2009-10-31.
  13. ISO (2000). "ISO/IEC 14496-3:1999/Amd 1:2000 - Audio extensions". ISO. Retrieved 2009-10-07.
  14. ISO (2001). "ISO/IEC 14496-3:2001 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. Retrieved 2009-10-14.
  15. ISO (2005). "ISO/IEC 14496-3:2005 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. Retrieved 2009-10-14.
  16. ISO/IEC (2009-09-01), ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio (PDF), IEC, retrieved 2009-10-07
  17. ISO/IEC (2019-12-01), ISO/IEC 14496-3:2019 - Information technology -- Coding of audio-visual objects -- Part 3: Audio, IEC, retrieved 2020-06-02
  18. MultimediaWiki (2009). "MPEG-4 Audio". MultimediaWiki. Retrieved 2009-10-09.
  19. Bernhard Grill; Stefan Geyersberger; Johannes Hilpert; Bodo Teichmann (July 2004), Implementation of MPEG-4 Audio Components on various Platforms (PDF), Fraunhofer Gesellschaft, archived from the original (PDF) on 2007-06-10, retrieved 2009-10-09
  20. ISO/IEC JTC1/SC29/WG11 N2203 (March 1998). "MPEG-4 Audio (Final Committee Draft 14496-3)". Heiko Purnhagen. Retrieved 2009-10-07. [dead link]
  21. ISO/IEC JTC1/SC29/WG11/N7016 (2005-01-11), Text of ISO/IEC 14496-3:2001/FPDAM 4, Audio Lossless Coding (ALS), new audio profiles and BSAC extensions, archived from the original (DOC) on 2014-05-12, retrieved 2009-10-09
  22. Karlheinz Brandenburg; Oliver Kunz; Akihiko Sugiyama (1999). "MPEG-4 Natural Audio Coding – General Audio Coding (AAC based)". chiariglione.org. Archived from the original on 2010-02-19. Retrieved 2009-10-06.
  23. ISO (2003). "Bandwidth extension, ISO/IEC 14496-3:2001/Amd 1:2003". ISO. Retrieved 2009-10-13.
  24. Scheirer, Eric D.; Ray, Lee (1998). "Algorithmic and Wavetable Synthesis in the MPEG-4 Multimedia Standard". Audio Engineering Society Convention 105, 1998. CiteSeerX 10.1.1.35.2773. 2.2 Wavetable synthesis with SASBF: The SASBF wavetable-bank format had a somewhat complex history of development. The original specification was contributed by E-Mu Systems and was based on their "SoundFont" format [15]. After integration of this component in the MPEG-4 reference software was complete, the MIDI Manufacturers Association (MMA) approached MPEG requesting that MPEG-4 SASBF be compatible with their "Downloaded Sounds" format [13]. E-Mu agreed that this compatibility was desirable, and so a new format was negotiated and designed collaboratively by all parties.
  25. ISO (2004). "Parametric coding for high-quality audio, ISO/IEC 14496-3:2001/Amd 2:2004". ISO. Retrieved 2009-10-13.
  26. ISO/IEC JTC1/SC29/WG11 (2003-07-25). "Text of ISO/IEC 14496-3:2001/FPDAM2 (Parametric Audio) - N5713". Archived from the original (DOC) on 2014-05-12. Retrieved 2009-10-13.
  27. 3GPP (2004-09-30). "3GPP TS 26.401 V6.0.0 (2004-09), General Audio Codec audio processing functions; Enhanced aacPlus General Audio Codec; General Description (Release 6)" (DOC). 3GPP. Retrieved 2009-10-13.
  28. 3GPP (2005-01-04). "ETSI TS 126 401 V6.1.0 (2004-12) - Universal Mobile Telecommunications System (UMTS); General audio codec audio processing functions; Enhanced aacPlus general audio codec; General description (3GPP TS 26.401 version 6.1.0 Release 6)". 3GPP. Retrieved 2009-10-13.
  29. ISO (2006). "Audio Lossless Coding (ALS), new audio profiles and BSAC extensions, ISO/IEC 14496-3:2005/Amd 2:2006". ISO. Retrieved 2009-10-13.
  30. ISO (2007). "BSAC extensions and transport of MPEG Surround, ISO/IEC 14496-3:2005/Amd 5:2007". ISO. Retrieved 2009-10-13.
  31. ISO/IEC JTC1/SC29/WG11 (July 2005). "Tutorial on MPEG Surround Audio Coding". Archived from the original on 2010-04-30. Retrieved 2010-02-09.
  32. ISO/IEC JTC1/SC29/WG11 (July 2005). "Tutorial on MPEG Surround Audio Coding". Archived from the original on 2008-03-24. Retrieved 2010-02-09.
  33. ISO (2007-01-29). "ISO/IEC 23003-1:2007 - Information technology -- MPEG audio technologies -- Part 1: MPEG Surround". ISO. Retrieved 2009-10-24.
  34. ISO (2005). "MPEG-1/2 audio in MPEG-4, ISO/IEC 14496-3:2001/Amd 3:2005". ISO. Retrieved 2009-10-13.
  35. ISO (2005). "Lossless coding of oversampled audio, ISO/IEC 14496-3:2001/Amd 6:2005". ISO. Retrieved 2009-10-13.
  36. ISO (2006). "Scalable Lossless Coding (SLS), ISO/IEC 14496-3:2005/Amd 3:2006". ISO. Retrieved 2009-10-13.
  37. ISO (2008). "Enhanced low delay AAC, ISO/IEC 14496-3:2005/Amd 9:2008". ISO. Retrieved 2009-10-13.
  38. ISO (2008). "ISO/IEC 14496-23:2008, Information technology -- Coding of audio-visual objects -- Part 23: Symbolic Music Representation". ISO. Retrieved 2009-10-13.
  39. ISO (2008). "Symbolic Music Representation conformance, ISO/IEC 14496-4:2004/Amd 29:2008". ISO. Retrieved 2009-10-13.
  40. ISO (2012). "ISO/IEC 23003-3:2012 - Information technology -- MPEG audio technologies -- Part 3: Unified speech and audio coding". ISO. Retrieved 2019-11-07.
  41. ISO (2009). "ISO/IEC 14496-3:2009/Amd 2:2010, ALS simple profile and transport of SAOC". ISO. Retrieved 2009-10-13.
  42. ISO/IEC JTC1/SC29/WG11 (2009-07-03), ISO/IEC 14496-3:200X/PDAM 2 – ALS Simple Profile and Transport of SAOC, N10826, archived from the original (DOC) on 2014-07-29, retrieved 2009-10-13
  43. ISO (2010). "ISO/IEC 23003-2:2010 - Information technology -- MPEG audio technologies -- Part 2: Spatial Audio Object Coding (SAOC)". ISO. Retrieved 2010-12-27.
  44. AES Convention Paper 8099 – A new parametric stereo and Multi Channel Extension for MPEG-4 Enhanced Low Delay AAC (AAC-ELD) (PDF), retrieved 2019-11-07
  45. ISO/IEC JTC1/SC29/WG11 (2008-10-17), ISO/IEC 14496-3:2005/PDAM 10:200X HD-AAC profile, MPEG2008/N10188, archived from the original (DOC) on 2014-05-12, retrieved 2009-10-19
  46. ISO (2009-09-11). "ISO/IEC 14496-3:2009/Amd 1:2009 - HD-AAC profile and MPEG Surround signaling". ISO. Retrieved 2009-10-15.
  47. ISO (2009-10-08). "ISO/IEC 14496-3:2009/Amd 2:2010 - ALS simple profile and transport of SAOC". ISO. Retrieved 2009-10-15.
  48. ISO (2004-11-15), ISO/IEC 14496-1, Third edition 2004-11-15, Part 1: Systems (PDF), ISO, archived from the original (PDF) on June 14, 2011, retrieved 2009-10-14
  49. ISO (2004-10-15), ISO/IEC 13818-7, Third edition, Part 7 – Advanced Audio Coding (AAC) (PDF), p. 32, archived from the original (PDF) on 2011-07-13, retrieved 2009-10-19
  50. ISO (1997). "ISO/IEC 13818-7:1997, Information technology -- Generic coding of moving pictures and associated audio information -- Part 7: Advanced Audio Coding (AAC)". Retrieved 2009-10-19.
  51. Business Wire (2009-01-05). "Via Licensing Announces MPEG-4 SLS Patent Pool License". Reuters. Archived from the original on 2013-01-04. Retrieved 2009-10-09.
  52. Via Licensing Corporation (2009-05-12). "Via Licensing Announces the Availability of an MPEG-4 SLS Joint Patent Licensing Program". Business Wire. Retrieved 2009-10-09.