Advanced Audio Coding

Filename extension MPEG/3GPP container

Apple container

ADTS stream

.aac
Internet media type
audio/aac
audio/aacp
audio/3gpp
audio/3gpp2
audio/mp4
audio/mp4a-latm
audio/mpeg4-generic
Developed by Bell, Fraunhofer, Dolby, Sony, Nokia, LG Electronics, NEC, NTT Docomo, Panasonic [1]
Initial release December 1997 (1997-12) [2]
Latest release
ISO/IEC 14496-3:2019
December 2019 (2019-12)
Type of format Lossy audio
Contained by MPEG-4 Part 14, 3GP and 3G2, ISO base media file format and Audio Data Interchange Format (ADIF)
Standard ISO/IEC 13818-7,
ISO/IEC 14496-3
Open format? Yes
Free format? No [3]

Advanced Audio Coding (AAC) is an audio coding standard for lossy digital audio compression. It was designed to be the successor of the MP3 format and generally achieves higher sound quality than MP3 at the same bit rate. [4]

AAC has been standardized by ISO and IEC as part of the MPEG-2 and MPEG-4 specifications. [5] [6] HE-AAC ("AAC+"), an extension of AAC, is part of MPEG-4 Audio and has been adopted into the digital radio standards DAB+ and Digital Radio Mondiale and the mobile television standards DVB-H and ATSC-M/H.

AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels in one stream plus 16 low frequency effects (LFE, limited to 120 Hz) channels, up to 16 "coupling" or dialog channels, and up to 16 data streams. Stereo quality is satisfactory for modest requirements at 96 kbit/s in joint stereo mode; however, hi-fi transparency demands data rates of at least 128 kbit/s (VBR). Tests of MPEG-4 audio have shown that AAC meets the requirements referred to as "transparent" for the ITU at 128 kbit/s for stereo, and 384 kbit/s for 5.1 audio. [7] AAC uses only a modified discrete cosine transform (MDCT) algorithm, giving it higher compression efficiency than MP3, which uses a hybrid coding algorithm that is part MDCT and part FFT. [4]

AAC is the default or standard audio format for iPhone, iPod, iPad, Nintendo DSi, Nintendo 3DS, Apple Music, [a] iTunes, DivX Plus Web Player, PlayStation 4 and various Nokia Series 40 phones. It is supported on a wide range of devices and software such as PlayStation Vita, Wii, digital audio players like Sony Walkman or SanDisk Clip, Android and BlackBerry devices, and various in-dash car audio systems, and is also one of the audio formats used on the Spotify web player. [8]

History

Background

The discrete cosine transform (DCT), a type of transform coding for lossy compression, was proposed by Nasir Ahmed in 1972 and developed by Ahmed with T. Natarajan and K. R. Rao in 1973; they published their results in 1974. [9] [10] [11] This led to the development of the modified discrete cosine transform (MDCT), proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987, [12] following earlier work by Princen and Bradley in 1986. [13] The MP3 audio coding standard introduced in 1992 used a hybrid coding algorithm that is part MDCT and part FFT. [14] AAC uses a pure MDCT algorithm, giving it higher compression efficiency than MP3. [4] Development further advanced when Lars Liljeryd introduced a method that radically shrank the amount of information needed to store the digitized form of a song or speech. [15]

AAC was developed with the cooperation and contributions of companies including Bell Labs, Fraunhofer IIS, Dolby Laboratories, LG Electronics, NEC, Panasonic, Sony Corporation, [1] ETRI, JVC Kenwood, Philips, Microsoft, and NTT. [16] It was officially declared an international standard by the Moving Picture Experts Group in April 1997. It is specified both as Part 7 of the MPEG-2 standard, and Subpart 4 in Part 3 of the MPEG-4 standard. [17]

Standardization

In 1997, AAC was first introduced as MPEG-2 Part 7, formally known as ISO/IEC 13818-7:1997. This part of MPEG-2 was a new part, since MPEG-2 already included MPEG-2 Part 3, formally known as ISO/IEC 13818-3: MPEG-2 BC (Backwards Compatible). [18] [19] Therefore, MPEG-2 Part 7 is also known as MPEG-2 NBC (Non-Backward Compatible), because it is not compatible with the MPEG-1 audio formats (MP1, MP2 and MP3). [18] [20] [21] [22]

MPEG-2 Part 7 defined three profiles: Low-Complexity profile (AAC-LC / LC-AAC), Main profile (AAC Main) and Scalable Sampling Rate profile (AAC-SSR). AAC-LC profile consists of a base format very much like AT&T's Perceptual Audio Coding (PAC) coding format, [23] [24] [25] with the addition of temporal noise shaping (TNS), [26] the Kaiser window (described below), a nonuniform quantizer, and a reworking of the bitstream format to handle up to 16 stereo channels, 16 mono channels, 16 low-frequency effect (LFE) channels and 16 commentary channels in one bitstream. The Main profile adds a set of recursive predictors that are calculated on each tap of the filterbank. The SSR uses a 4-band PQMF filterbank, with four shorter filterbanks following, in order to allow for scalable sampling rates.

In 1999, MPEG-2 Part 7 was updated and included in the MPEG-4 family of standards and became known as MPEG-4 Part 3, MPEG-4 Audio or ISO/IEC 14496-3:1999. This update included several improvements. One of these improvements was the addition of Audio Object Types, which are used to allow interoperability with a diverse range of other audio formats such as TwinVQ, CELP, HVXC, speech synthesis and MPEG-4 Structured Audio. Another notable addition in this version of the AAC standard is Perceptual Noise Substitution (PNS). In that regard, the AAC profiles (AAC-LC, AAC Main and AAC-SSR profiles) are combined with perceptual noise substitution and are defined in the MPEG-4 audio standard as Audio Object Types. [27] MPEG-4 Audio Object Types are combined in four MPEG-4 Audio profiles: Main (which includes most of the MPEG-4 Audio Object Types), Scalable (AAC LC, AAC LTP, CELP, HVXC, TwinVQ, Wavetable Synthesis, TTSI), Speech (CELP, HVXC, TTSI) and Low Rate Synthesis (Wavetable Synthesis, TTSI). [27] [28]

The reference software for MPEG-4 Part 3 is specified in MPEG-4 Part 5 and the conformance bit-streams are specified in MPEG-4 Part 4. MPEG-4 Audio remains backward-compatible with MPEG-2 Part 7. [29]

The MPEG-4 Audio Version 2 (ISO/IEC 14496-3:1999/Amd 1:2000) defined new audio object types: the low delay AAC (AAC-LD) object type, bit-sliced arithmetic coding (BSAC) object type, parametric audio coding using harmonic and individual line plus noise and error resilient (ER) versions of object types. [30] [31] [32] It also defined four new audio profiles: High Quality Audio Profile, Low Delay Audio Profile, Natural Audio Profile and Mobile Audio Internetworking Profile. [33]

The HE-AAC Profile (AAC LC with SBR) and AAC Profile (AAC LC) were first standardized in ISO/IEC 14496-3:2001/Amd 1:2003. [34] The HE-AAC v2 Profile (AAC LC with SBR and Parametric Stereo) was first specified in ISO/IEC 14496-3:2005/Amd 2:2006. [35] [36] [37] The Parametric Stereo audio object type used in HE-AAC v2 was first defined in ISO/IEC 14496-3:2001/Amd 2:2004. [38] [39] [40]

The current version of the AAC standard is defined in ISO/IEC 14496-3:2009. [41]

AAC+ v2 is also standardized by ETSI (European Telecommunications Standards Institute) as TS 102005. [38]

The MPEG-4 Part 3 standard also contains other ways of compressing sound. These include lossless compression formats, synthetic audio and low bit-rate compression formats generally used for speech.

AAC's improvements over MP3

Advanced Audio Coding is designed to be the successor of MPEG-1 Audio Layer 3, known as the MP3 format, which was specified by ISO/IEC in 11172-3 (MPEG-1 Audio) and 13818-3 (MPEG-2 Audio).

Improvements include:

Overall, the AAC format allows developers more flexibility to design codecs than MP3 does, and corrects many of the design choices made in the original MPEG-1 audio specification. This increased flexibility often leads to more concurrent encoding strategies and, as a result, to more efficient compression. This is especially true at very low bit rates where the superior stereo coding, pure MDCT, and better transform window sizes leave MP3 unable to compete.

While the MP3 format has near-universal hardware and software support, primarily because MP3 was the format of choice during the crucial first few years of widespread music file-sharing/distribution over the internet, AAC is a strong contender due to some unwavering industry support. [42]

Functionality

AAC is a wideband audio coding algorithm that exploits two primary coding strategies to dramatically reduce the amount of data needed to represent high-quality digital audio:

The actual encoding process consists of the following steps:

The MPEG-4 audio standard does not define a single or small set of highly efficient compression schemes but rather a complex toolbox to perform a wide range of operations from low bit rate speech coding to high-quality audio coding and music synthesis.

AAC encoders can switch dynamically between a single MDCT block of length 1024 points or 8 blocks of 128 points (or between 960 points and 120 points, respectively).
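
The trade-off behind this block switching can be illustrated with a small sketch. The following pure-Python example is not taken from the standard or from any encoder: it uses a direct O(N²) MDCT with a plain sine window (AAC's Kaiser-Bessel-derived window is left out), and the function names and the 48 kHz sample rate are assumptions chosen for illustration. It reports the frequency and time resolution of the long and short block sizes, and checks that windowed MDCT blocks with 50% overlap reconstruct the interior of a signal.

```python
import math


def resolution(sample_rate, n_coef):
    """Return (bin width in Hz, hop in ms) for an MDCT with n_coef coefficients."""
    bin_hz = (sample_rate / 2) / n_coef
    hop_ms = 1000.0 * n_coef / sample_rate
    return bin_hz, hop_ms


def sine_window(n_coef):
    """Sine window of length 2*n_coef; satisfies w[n]^2 + w[n + N]^2 = 1."""
    length = 2 * n_coef
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def _basis(n, k, n_coef):
    """MDCT basis function shared by the forward and inverse transforms."""
    return math.cos(math.pi / n_coef * (n + 0.5 + n_coef / 2) * (k + 0.5))


def mdct(block, n_coef):
    """Direct MDCT: 2*n_coef input samples -> n_coef coefficients."""
    return [sum(block[n] * _basis(n, k, n_coef) for n in range(2 * n_coef))
            for k in range(n_coef)]


def imdct(coefs, n_coef):
    """Direct inverse MDCT with the 2/N scaling used for windowed overlap-add."""
    return [2.0 / n_coef * sum(coefs[k] * _basis(n, k, n_coef)
                               for k in range(n_coef))
            for n in range(2 * n_coef)]


def analyze_synthesize(signal, n_coef):
    """Window, transform, inverse-transform and overlap-add with hop n_coef."""
    win = sine_window(n_coef)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coef + 1, n_coef):
        block = [signal[start + n] * win[n] for n in range(2 * n_coef)]
        rec = imdct(mdct(block, n_coef), n_coef)
        for n in range(2 * n_coef):
            out[start + n] += rec[n] * win[n]
    return out


if __name__ == "__main__":
    for n_coef in (1024, 128):
        print(n_coef, resolution(48000, n_coef))
    signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
    rebuilt = analyze_synthesize(signal, 8)
    # The first and last half-blocks have no overlapping partner, so compare
    # only the interior samples.
    print(max(abs(a - b) for a, b in zip(signal[8:56], rebuilt[8:56])))
```

Under these assumptions, the long block gives finer frequency resolution per coefficient, while the short block updates eight times as often, which is why an encoder would prefer short blocks around transients.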

Modular encoding

AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance and the acceptable output, implementers may create profiles to define which of a specific set of tools they want to use for a particular application.

The MPEG-2 Part 7 standard (Advanced Audio Coding) was first published in 1997 and offers three default profiles: [2] [44]

The MPEG-4 Part 3 standard (MPEG-4 Audio) defined various new compression tools (a.k.a. Audio Object Types) and their usage in brand new profiles. AAC is not used in some of the MPEG-4 Audio profiles. The MPEG-2 Part 7 AAC LC profile, AAC Main profile and AAC SSR profile are combined with Perceptual Noise Substitution and defined in the MPEG-4 Audio standard as Audio Object Types (under the name AAC LC, AAC Main and AAC SSR). These are combined with other Object Types in MPEG-4 Audio profiles. [27] Here is a list of some audio profiles defined in the MPEG-4 standard: [35] [45]

One of many improvements in MPEG-4 Audio is an Object Type called Long Term Prediction (LTP), which is an improvement of the Main profile using a forward predictor with lower computational complexity. [29]
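
As an illustration of how Audio Object Types are signalled in practice, the sketch below decodes the first two bytes of an MPEG-4 AudioSpecificConfig, a structure that the text above does not describe and that is introduced here only as an example. It assumes the common short form (5-bit object type, 4-bit sampling frequency index, 4-bit channel configuration) and does not handle the escape values; the table lists only a subset of object types, and the function name is hypothetical.

```python
# Subset of MPEG-4 Audio Object Type identifiers (not exhaustive).
AUDIO_OBJECT_TYPES = {
    1: "AAC Main",
    2: "AAC LC",
    3: "AAC SSR",
    4: "AAC LTP",
    5: "SBR",
    7: "TwinVQ",
    8: "CELP",
    9: "HVXC",
    23: "ER AAC LD",
    29: "PS",
}

# Sampling rates addressed by the 4-bit sampling frequency index.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]


def parse_audio_specific_config(data):
    """Decode object type, sample rate and channel configuration.

    Only the short form is handled: escape codes (object type 31,
    frequency index 15) raise ValueError in this sketch.
    """
    if len(data) < 2:
        raise ValueError("need at least two bytes")
    bits = (data[0] << 8) | data[1]
    object_type = bits >> 11          # top 5 bits
    freq_index = (bits >> 7) & 0xF    # next 4 bits
    channel_config = (bits >> 3) & 0xF  # next 4 bits
    if object_type == 31 or freq_index == 15:
        raise ValueError("escape values are not handled in this sketch")
    if freq_index >= len(SAMPLE_RATES):
        raise ValueError("reserved sampling frequency index")
    return {
        "object_type": object_type,
        "object_type_name": AUDIO_OBJECT_TYPES.get(object_type, "other"),
        "sample_rate": SAMPLE_RATES[freq_index],
        "channel_config": channel_config,
    }


if __name__ == "__main__":
    # 0x12 0x10 is a frequently seen configuration for stereo AAC LC.
    print(parse_audio_specific_config(bytes([0x12, 0x10])))
```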

AAC error protection toolkit

Applying error protection enables error correction up to a certain extent. Error correcting codes are usually applied equally to the whole payload. However, since different parts of an AAC payload show different sensitivity to transmission errors, this would not be a very efficient approach.

The AAC payload can be subdivided into parts with different error sensitivities.

Error Resilient (ER) AAC

Error Resilience (ER) techniques can be used to make the coding scheme itself more robust against errors.

For AAC, three custom-tailored methods were developed and defined in MPEG-4 Audio

AAC Low Delay

The audio coding standards MPEG-4 Low Delay (AAC-LD), Enhanced Low Delay (AAC-ELD), and Enhanced Low Delay v2 (AAC-ELDv2) as defined in ISO/IEC 14496-3:2009 and ISO/IEC 14496-3:2009/Amd 3 are designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. They are closely derived from the MPEG-2 Advanced Audio Coding (AAC) format. [47] [48] [49] AAC-ELD is recommended by GSMA as super-wideband voice codec in the IMS Profile for High Definition Video Conference (HDVC) Service. [50]

Licensing and patents

No licenses or payments are required for a user to stream or distribute audio in AAC format. [51] This reason alone might have made AAC a more attractive format to distribute audio than its predecessor MP3, particularly for streaming audio (such as Internet radio) depending on the use case.

However, a patent license is required for all manufacturers or developers of AAC "end-user" codecs. [52] The terms (as disclosed to the SEC) use per-unit pricing. In the case of software, each computer running the software is considered a separate "unit". [53]

It used to be common for free and open source software implementations such as FFmpeg and FAAC to be distributed only in source code form so as to not "otherwise supply" an AAC codec. However, FFmpeg has since become more lenient on patent matters: the "gyan.dev" builds recommended by the official site now contain its AAC codec, with the FFmpeg legal page stating that patent law conformance is the user's responsibility. [54] (See below under Products that support AAC, Software.) The Fedora Project, a community backed by Red Hat, imported the "Third-Party Modified Version of the Fraunhofer FDK AAC Codec Library for Android" to its repositories on September 25, 2018, [55] and enabled FFmpeg's native AAC encoder and decoder for its ffmpeg-free package on January 31, 2023. [56]

The AAC patent holders include Bell Labs, Dolby, ETRI, Fraunhofer, JVC Kenwood, LG Electronics, Microsoft, NEC, NTT (and its subsidiary NTT Docomo), Panasonic, Philips, and Sony Corporation. [16] [1] Based on the list of patents from the SEC terms, the last baseline AAC patent expires in 2028, and the last patent for all AAC extensions mentioned expires in 2031. [57]

Extensions and improvements

Some extensions have been added to the first AAC standard (defined in MPEG-2 Part 7 in 1997):

Container formats

In addition to the MP4, 3GP and other container formats based on ISO base media file format for file storage, AAC audio data was first packaged in a file for the MPEG-2 standard using Audio Data Interchange Format (ADIF), [62] consisting of a single header followed by the raw AAC audio data blocks. [63] However, if the data is to be streamed within an MPEG-2 transport stream, a self-synchronizing format called an Audio Data Transport Stream (ADTS) is used, consisting of a series of frames, each frame having a header followed by the AAC audio data. [62] This file and streaming-based format are defined in MPEG-2 Part 7, but are only considered informative by MPEG-4, so an MPEG-4 decoder does not need to support either format. [62] These containers, as well as a raw AAC stream, may bear the .aac file extension. MPEG-4 Part 3 also defines its own self-synchronizing format called a Low Overhead Audio Stream (LOAS) that encapsulates not only AAC, but any MPEG-4 audio compression scheme such as TwinVQ and ALS. This format is what was defined for use in DVB transport streams when encoders use either SBR or parametric stereo AAC extensions. However, it is restricted to only a single non-multiplexed AAC stream. This format is also referred to as a Low Overhead Audio Transport Multiplex (LATM), which is just an interleaved multiple stream version of a LOAS. [62]
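
The self-synchronizing structure of ADTS can be sketched as follows. This is a minimal illustration, not a complete demuxer: it assumes the commonly documented 7-byte fixed-plus-variable header layout (12-bit syncword, 13-bit frame length, and so on), skips CRC verification, and uses hypothetical function names.

```python
# Sampling rates addressed by the 4-bit sampling frequency index.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]


def parse_adts_header(h):
    """Decode the fields of one ADTS header from at least 7 bytes."""
    if len(h) < 7:
        raise ValueError("ADTS header needs 7 bytes")
    if h[0] != 0xFF or (h[1] & 0xF0) != 0xF0:
        raise ValueError("syncword 0xFFF not found")
    freq_index = (h[2] >> 2) & 0xF
    if freq_index >= len(SAMPLE_RATES):
        raise ValueError("reserved sampling frequency index")
    protection_absent = h[1] & 0x1
    return {
        "mpeg_version": 2 if (h[1] >> 3) & 0x1 else 4,
        # The 2-bit profile field stores the object type minus one.
        "audio_object_type": ((h[2] >> 6) & 0x3) + 1,
        "sample_rate": SAMPLE_RATES[freq_index],
        "channel_config": ((h[2] & 0x1) << 2) | ((h[3] >> 6) & 0x3),
        # Frame length counts the header as well as the payload.
        "frame_length": ((h[3] & 0x3) << 11) | (h[4] << 3) | (h[5] >> 5),
        # A 2-byte CRC follows the header when protection is present.
        "header_length": 7 if protection_absent else 9,
    }


def iter_adts_frames(stream):
    """Yield (header, payload) pairs by hopping from frame to frame."""
    pos = 0
    while pos + 7 <= len(stream):
        header = parse_adts_header(stream[pos:pos + 7])
        end = pos + header["frame_length"]
        if header["frame_length"] < header["header_length"] or end > len(stream):
            break  # truncated or corrupt frame; stop in this sketch
        yield header, stream[pos + header["header_length"]:end]
        pos = end


if __name__ == "__main__":
    # Synthetic frame: AAC LC, 44.1 kHz, stereo, 371 bytes in total.
    header = bytes([0xFF, 0xF1, 0x50, 0x80, 0x2E, 0x7F, 0xFC])
    stream = (header + bytes(364)) * 2
    for hdr, payload in iter_adts_frames(stream):
        print(hdr["sample_rate"], hdr["channel_config"], len(payload))
```

Because every frame repeats the header, a decoder that joins the stream mid-way can search for the next syncword and resume, which is the property the single-header ADIF layout lacks.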

Products that support AAC

HDTV Standards

Japanese ISDB-T

In December 2003, Japan started broadcasting the terrestrial DTV ISDB-T standard, which implements MPEG-2 video and MPEG-2 AAC audio. In April 2006, Japan started broadcasting the ISDB-T mobile sub-program, called 1seg, which was the first implementation of H.264/AVC video with HE-AAC audio in a terrestrial HDTV broadcasting service anywhere in the world.

International ISDB-Tb

In December 2007, Brazil started broadcasting a terrestrial DTV standard called International ISDB-Tb, which implements H.264/AVC video with AAC-LC audio on the main program (single or multi) and H.264/AVC video with HE-AAC v2 audio in the 1seg mobile sub-program.

DVB

ETSI, the standards governing body for the DVB suite, has supported AAC, HE-AAC and HE-AAC v2 audio coding in DVB applications since at least 2004. [64] DVB broadcasts which use H.264 compression for video normally use HE-AAC for audio.

Hardware

iTunes and iPod

In April 2003, Apple brought mainstream attention to AAC by announcing that its iTunes and iPod products would support songs in MPEG-4 AAC format (via a firmware update for older iPods). Customers could download music in a closed-source digital rights management (DRM)-restricted form of 128 kbit/s AAC (see FairPlay) via the iTunes Store or create files without DRM from their own CDs using iTunes. In later years, Apple began offering music videos and movies, which also use AAC for audio encoding.

On May 29, 2007, Apple began selling songs and music videos from participating record labels at a higher bitrate (256 kbit/s cVBR) and free of DRM, a format dubbed "iTunes Plus". These files mostly adhere to the AAC standard and are playable on many non-Apple products, but they include custom iTunes information such as album artwork and a purchase receipt, so as to identify the customer in case the file is leaked onto peer-to-peer networks. It is possible, however, to remove these custom tags to restore interoperability with players that conform strictly to the AAC specification. On January 6, 2009, nearly all music on the US iTunes Store became DRM-free, with the remainder becoming DRM-free by the end of March 2009. [65]

iTunes offers a "Variable Bit Rate" encoding option which encodes AAC tracks in the Constrained Variable Bitrate scheme (a less strict variant of ABR encoding); the underlying QuickTime API, however, does offer a true VBR encoding profile. [66]

As of September 2009, Apple has added support for HE-AAC (which is fully part of the MP4 standard) only for radio streams, not file playback, and iTunes still lacks support for true VBR encoding.

Other portable players

Mobile phones

For a number of years, many mobile phones from manufacturers such as Nokia, Motorola, Samsung, Sony Ericsson, BenQ-Siemens and Philips have supported AAC playback. The first such phone was the Nokia 5510, released in 2002, which also plays MP3s. However, this phone was a commercial failure, and phones with integrated music players did not gain mainstream popularity until 2005, when the trend of supporting AAC as well as MP3 continued. Most new smartphones and music-themed phones support playback of these formats.

  • Sony Ericsson phones support various AAC formats in the MP4 container. AAC-LC is supported in all phones beginning with the K700, and phones beginning with the W550 support HE-AAC. The latest devices, such as the P990, K610, W890i and later, support HE-AAC v2.
  • Nokia XpressMusic and other new generation Nokia multimedia phones like N- and E-Series also support AAC format in LC, HE, M4A and HEv2 profiles. These also support playing LTP-encoded AAC audio.
  • BlackBerry phones running the BlackBerry 10 operating system support AAC playback natively. Select previous generation BlackBerry OS devices also support AAC.
  • bada OS
  • Apple's iPhone supports AAC and FairPlay protected AAC files formerly used as the default encoding format in the iTunes Store until the removal of DRM restrictions in March 2009.
  • Android 2.3 [67] and later supports AAC-LC, HE-AAC and HE-AAC v2 in MP4 or M4A containers along with several other audio formats. Android 3.1 and later supports raw ADTS files. Android 4.1 can encode AAC. [68]
  • WebOS by HP/Palm supports AAC, AAC+, eAAC+, and .m4a containers in its native music player as well as several third-party players. However, it does not support Apple's FairPlay DRM files downloaded from iTunes. [69]
  • Windows Phone's Silverlight runtime supports AAC-LC, HE-AAC and HE-AAC v2 decoding.

Other devices

  • Apple's iPad: supports AAC and FairPlay protected AAC files used as the default encoding format in the iTunes Store
  • Palm OS PDAs: many Palm OS based PDAs and smartphones can play AAC and HE-AAC with the third-party software Pocket Tunes. Version 4.0, released in December 2006, added support for native AAC and HE-AAC files. The AAC codec for TCPMP, a popular video player, was withdrawn after version 0.66 due to patent issues, but can still be downloaded from sites other than corecodec.org. CorePlayer, the commercial follow-on to TCPMP, includes AAC support. Other Palm OS programs supporting AAC include Kinoma Player and AeroPlayer.
  • Windows Mobile: supports AAC either by the native Windows Media Player or by third-party products (TCPMP, CorePlayer)
  • Epson: supports AAC playback in the P-2000 and P-4000 Multimedia/Photo Storage Viewers
  • Sony Reader: plays M4A files containing AAC, and displays metadata created by iTunes. Other Sony products, including the A and E series Network Walkmans, support AAC with firmware updates (released May 2006) while the S series supports it out of the box.
  • Sonos Digital Media Player: supports playback of AAC files
  • Barnes & Noble Nook Color: supports playback of AAC encoded files
  • Roku SoundBridge: a network audio player, supports playback of AAC encoded files
  • Squeezebox: network audio player (made by Slim Devices, a Logitech company) that supports playback of AAC files
  • PlayStation 3: supports encoding and decoding of AAC files
  • Xbox 360: supports streaming of AAC through the Zune software, and of supported iPods connected through the USB port
  • Wii: supports AAC files through version 1.1 of the Photo Channel as of December 11, 2007. All AAC profiles and bitrates are supported as long as the file uses the .m4a extension. The 1.1 update removed MP3 compatibility, but according to Nintendo, users who have installed this may freely downgrade to the old version if they wish. [70]
  • Livescribe Pulse and Echo Smartpens: record and store audio in AAC format. The audio files can be replayed using the pen's integrated speaker, attached headphones, or on a computer using the Livescribe Desktop software. The AAC files are stored in the user's "My Documents" folder of the Windows OS and can be distributed and played without specialized hardware or software from Livescribe.
  • Google Chromecast: supports playback of LC-AAC and HE-AAC audio [71]

Software

Almost all current computer media players include built-in decoders for AAC, or can utilize a library to decode it. On Microsoft Windows, DirectShow can be used this way with the corresponding filters to enable AAC playback in any DirectShow based player. Mac OS X supports AAC via the QuickTime libraries.

Adobe Flash Player, since version 9 update 3, can also play back AAC streams. [72] [73] Since Flash Player is also a browser plugin, it can play AAC files through a browser as well.

The Rockbox open source firmware (available for multiple portable players) also offers support for AAC to varying degrees, depending on the model of player and the AAC profile.

Optional iPod support (playback of unprotected AAC files) for the Xbox 360 is available as a free download from Xbox Live. [74]

The following is a non-comprehensive list of other software player applications:

Some of these players (e.g., foobar2000, Winamp, and VLC) also support the decoding of ADTS (Audio Data Transport Stream) using the SHOUTcast protocol. Plug-ins for Winamp and foobar2000 enable the creation of such streams.

Nero Digital Audio

In May 2006, Nero AG released an AAC encoding tool free of charge, Nero Digital Audio (the AAC codec portion has become Nero AAC Codec), [75] which is capable of encoding LC-AAC, HE-AAC and HE-AAC v2 streams. The tool is a command-line interface tool only. A separate utility is also included to decode to PCM WAV.

Various tools including the foobar2000 audio player and MediaCoder can provide a GUI for this encoder.

FAAC and FAAD2

FAAC and FAAD2 stand for Freeware Advanced Audio Coder and Decoder 2 respectively. FAAC supports audio object types LC, Main and LTP. [76] FAAD2 supports audio object types LC, Main, LTP, SBR and PS. [77] Although FAAD2 is free software, FAAC is not free software.

Fraunhofer FDK AAC

A Fraunhofer-authored open-source encoder/decoder included in Android has been ported to other platforms. FFmpeg's native AAC encoder does not support HE-AAC or HE-AAC v2, but because the GPL 2.0+ license of FFmpeg is not compatible with that of FDK AAC, FFmpeg builds that include libfdk-aac are not redistributable. The QAAC encoder, which uses Apple's Core Media Audio, is still considered higher quality than FDK.

FFmpeg and Libav

The native AAC encoder created in FFmpeg's libavcodec, and forked with Libav, was considered experimental and poor. A significant amount of work was done for the 3.0 release of FFmpeg (February 2016) to make its version usable and competitive with the rest of the AAC encoders. [78] Libav has not merged this work and continues to use the older version of the AAC encoder. These encoders are LGPL-licensed open-source software and can be built for any platform on which the FFmpeg or Libav frameworks can be built.

Both FFmpeg and Libav can use the Fraunhofer FDK AAC library via libfdk-aac, and while the FFmpeg native encoder has become stable and good enough for common use, FDK is still considered the highest quality encoder available for use with FFmpeg. [79] Libav also recommends using FDK AAC if it is available. [80] FFmpeg 4.4 and above can also use the Apple audiotoolbox encoder. [79]

Although the native AAC encoder only produces AAC-LC, ffmpeg's native decoder is able to deal with a wide range of input formats.
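
The encoder choices described in this section correspond to different values of FFmpeg's `-c:a` option. The sketch below only assembles a command line and does not run it; the helper name, the file names and the default bitrate are assumptions for illustration. The `libfdk_aac` encoder is present only in builds configured with FDK AAC support, and `aac_at` only in builds with AudioToolbox support.

```python
# Friendly names (hypothetical) mapped to FFmpeg audio encoder names.
ENCODERS = {
    "native": "aac",            # FFmpeg's built-in AAC-LC encoder
    "fdk": "libfdk_aac",        # Fraunhofer FDK AAC, if compiled in
    "audiotoolbox": "aac_at",   # Apple AudioToolbox, if compiled in
}


def build_ffmpeg_command(src, dst, encoder="native", bitrate="128k"):
    """Return an argument list that encodes src to AAC at the given bitrate."""
    if encoder not in ENCODERS:
        raise ValueError("unknown encoder: " + encoder)
    return ["ffmpeg", "-i", src,
            "-c:a", ENCODERS[encoder],
            "-b:a", bitrate,
            dst]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run to execute.
    print(" ".join(build_ffmpeg_command("input.wav", "output.m4a")))
    print(" ".join(build_ffmpeg_command("input.wav", "output.m4a", "fdk")))
```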

See also

Notes

  a. Only used on web player, Google Home, Amazon Alexa, and Microsoft Windows app.

Related Research Articles

<span class="mw-page-title-main">MP3</span> Digital audio format

MP3 is a coding format for digital audio developed largely by the Fraunhofer Society in Germany under the lead of Karlheinz Brandenburg. It was designed to greatly reduce the amount of data required to represent audio, yet still sound like a faithful reproduction of the original uncompressed audio to most listeners; for example, compared to CD-quality digital audio, MP3 compression can commonly achieve a 75–95% reduction in size, depending on the bit rate. In popular usage, MP3 often refers to files of sound or music recordings stored in the MP3 file format (.mp3) on consumer electronic devices.

MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s without excessive quality loss, making video CDs, digital cable/satellite TV and digital audio broadcasting (DAB) practical.

Windows Media Audio (WMA) is a series of audio codecs and their corresponding audio coding formats developed by Microsoft. It is a proprietary technology that forms part of the Windows Media framework. WMA consists of four distinct codecs. The original WMA codec, known simply as WMA, was conceived as a competitor to the popular MP3 and RealAudio codecs. WMA Pro, a newer and more advanced codec, supports multichannel and high-resolution audio. A lossless codec, WMA Lossless, compresses audio data without loss of audio fidelity. WMA Voice, targeted at voice content, applies compression using a range of low bit rates. Microsoft has also developed a digital container format called Advanced Systems Format to store audio encoded by WMA.

MPEG-4 Part 3 or MPEG-4 Audio is the third part of the ISO/IEC MPEG-4 international standard developed by Moving Picture Experts Group. It specifies audio coding methods. The first version of ISO/IEC 14496-3 was published in 1999.

3GP is a multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS multimedia services. It is used on 5G phones.

Harmonic Vector Excitation Coding, abbreviated as HVXC is a speech coding algorithm specified in MPEG-4 Part 3 standard for very low bit rate speech coding. HVXC supports bit rates of 2 and 4 kbit/s in the fixed and variable bit rate mode and sampling frequency of 8 kHz. It also operates at lower bitrates, such as 1.2 - 1.7 kbit/s, using a variable bit rate technique. The total algorithmic delay for the encoder and decoder is 36 ms.

<span class="mw-page-title-main">High-Efficiency Advanced Audio Coding</span> Audio codec

High-Efficiency Advanced Audio Coding (HE-AAC) is an audio coding format for lossy data compression of digital audio defined as an MPEG-4 Audio profile in ISO/IEC 14496–3. It is an extension of Low Complexity AAC (AAC-LC) optimized for low-bitrate applications such as streaming audio. The usage profile HE-AAC v1 uses spectral band replication (SBR) to enhance the modified discrete cosine transform (MDCT) compression efficiency in the frequency domain. The usage profile HE-AAC v2 couples SBR with Parametric Stereo (PS) to further enhance the compression efficiency of stereo signals.

TwinVQ is an audio compression technique developed by Nippon Telegraph and Telephone Corporation (NTT) Human Interface Laboratories in 1994. The compression technique has been used in both standardized and proprietary designs.

MPEG-4 Part 2, MPEG-4 Visual is a video compression format developed by the Moving Picture Experts Group (MPEG). It belongs to the MPEG-4 ISO/IEC standards. It uses block-wise motion compensation and a discrete cosine transform (DCT), similar to previous standards such as MPEG-1 Part 2 and H.262/MPEG-2 Part 2.

FAAC is a software project which includes the AAC encoder FAAC and decoder FAAD2. It supports MPEG-2 AAC as well as MPEG-4 AAC. It supports several MPEG-4 Audio object types, file formats, multichannel and gapless encoding/decoding and MP4 metadata tags. The encoder and decoder is compatible with standard-compliant audio applications using one or more of these object types and facilities. It also supports Digital Radio Mondiale.

MPEG-4 Audio Lossless Coding, also known as MPEG-4 ALS, is an extension to the MPEG-4 Part 3 audio standard to allow lossless audio compression. The extension was finalized in December 2005 and published as ISO/IEC 14496-3:2005/Amd 2:2006 in 2006. The latest description of MPEG-4 ALS was published as subpart 11 of the MPEG-4 Audio standard in December 2019.

<span class="mw-page-title-main">MPEG-4 SLS</span> Extension to the MPEG-4 Audio standard

MPEG-4 SLS, or MPEG-4 Scalable to Lossless as per ISO/IEC 14496-3:2005/Amd 3:2006 (Scalable Lossless Coding), is an extension to the MPEG-4 Part 3 (MPEG-4 Audio) standard to allow lossless audio compression scalable to lossy MPEG-4 General Audio coding methods (e.g., variations of AAC). It was developed jointly by the Institute for Infocomm Research (I2R) and Fraunhofer, which commercializes its implementation of a limited subset of the standard under the name of HD-AAC. Standardization of the HD-AAC profile for MPEG-4 Audio is under development (as of September 2009).

<span class="mw-page-title-main">MP4 file format</span> Digital format for storing video and audio

MPEG-4 Part 14, or MP4, is a digital multimedia container format most commonly used to store video and audio, but it can also be used to store other data such as subtitles and still images. Like most modern container formats, it allows streaming over the Internet. The only filename extension for MPEG-4 Part 14 files as defined by the specification is .mp4. MPEG-4 Part 14 is a standard specified as a part of MPEG-4.

The MPEG-4 Low Delay Audio Coder is audio compression standard designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. It is closely derived from the MPEG-2 Advanced Audio Coding (AAC) standard. It was published in MPEG-4 Audio Version 2 and in its later revisions.

MPEG Surround, also known as Spatial Audio Coding (SAC) is a lossy compression format for surround sound that provides a method for extending mono or stereo audio services to multi-channel audio in a backwards compatible fashion. The total bit rates used for the core and the MPEG Surround data are typically only slightly higher than the bit rates used for coding of the core. MPEG Surround adds a side-information stream to the core bit stream, containing spatial image data. Legacy stereo playback systems will ignore this side-information while players supporting MPEG Surround decoding will output the reconstructed multi-channel audio.

The ISO base media file format (ISOBMFF) is a container file format that defines a general structure for files that contain time-based multimedia data such as video and audio. It is standardized in ISO/IEC 14496-12, a.k.a. MPEG-4 Part 12, and was formerly also published as ISO/IEC 15444-12, a.k.a. JPEG 2000 Part 12.

Unified Speech and Audio Coding (USAC) is an audio compression format and codec for music, speech, or any mix of the two at very low bit rates between 12 and 64 kbit/s. It was developed by the Moving Picture Experts Group (MPEG) and was published as international standard ISO/IEC 23003-3 and also as an MPEG-4 Audio Object Type in ISO/IEC 14496-3:2009/Amd 3 in 2012.

Audio coding format: Digitally coded format for audio signals

An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to or from a specific audio coding format is called an audio codec; an example is LAME, one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.

Nero AAC Codec is a set of software tools for encoding and decoding Advanced Audio Coding (AAC) format audio, and editing MPEG-4 metadata. It was developed and distributed by Nero AG, and is available at no cost for Windows and Linux for non-commercial use. The codec was originally part of Nero Digital, but was later released as a stand-alone package.

References

  1. "Via Licensing Announces Updated AAC Joint Patent License". Business Wire. 5 January 2009. Retrieved 18 June 2019.
  2. ISO (1997). "ISO/IEC 13818-7:1997, Information technology — Generic coding of moving pictures and associated audio information — Part 7: Advanced Audio Coding (AAC)". Archived from the original on 2012-09-25. Retrieved 2010-07-18.
  3. Advanced Audio Coding (MPEG-4) (Full draft). Sustainability of Digital Formats. Washington, D.C.: Library of Congress. 22 June 2010. Retrieved 1 December 2021.
  4. Brandenburg, Karlheinz (1999). "MP3 and AAC Explained" (PDF). Archived from the original (PDF) on 2017-02-13.
  5. ISO (2006). ISO/IEC 13818-7:2006 – Information technology — Generic coding of moving pictures and associated audio information — Part 7: Advanced Audio Coding (AAC). Archived 2016-03-03 at the Wayback Machine. Retrieved 2009-08-06.
  6. ISO (2006). ISO/IEC 14496-3:2005 – Information technology — Coding of audio-visual objects — Part 3: Audio. Archived 2016-04-13 at the Wayback Machine. Retrieved 2009-08-06.
  7. "The AAC Audio Coding Family For Broadcast and Cable TV" (PDF). 2013. p. 6. Archived from the original (PDF) on 2023-09-28. Retrieved 2024-01-29.
  8. "Audio file formats for Spotify". Spotify. Retrieved 20 September 2021.
  9. Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing. 1 (1): 4–5. Bibcode:1991DSP.....1....4A. doi:10.1016/1051-2004(91)90086-Z.
  10. Ahmed, Nasir; Natarajan, T.; Rao, K. R. (January 1974), "Discrete Cosine Transform", IEEE Transactions on Computers, C-23 (1): 90–93, doi:10.1109/T-C.1974.223784, S2CID 149806273
  11. Rao, K. R.; Yip, P. (1990), Discrete Cosine Transform: Algorithms, Advantages, Applications, Boston: Academic Press, ISBN 978-0-12-580203-1
  12. J. P. Princen, A. W. Johnson and A. B. Bradley: Subband/transform coding using filter bank designs based on time domain aliasing cancellation, IEEE Proc. Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2161–2164, 1987
  13. John P. Princen, Alan B. Bradley: Analysis/synthesis filter bank design based on time domain aliasing cancellation, IEEE Trans. Acoust. Speech Signal Processing, ASSP-34 (5), 1153–1161, 1986
  14. Guckert, John (Spring 2012). "The Use of FFT and MDCT in MP3 Audio Compression" (PDF). University of Utah. Retrieved 14 July 2019.
  15. Borland, John (March 18, 2004). "The sound of science". CNET. Retrieved 2023-04-21.
  16. "AAC Licensors". Via Corp. Retrieved 15 January 2020.
  17. ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio (PDF) (Technical report). ISO/IEC. 1 September 2009. Archived (PDF) from the original on 14 June 2011. Retrieved 2009-10-07.
  18. "AAC". MPEG.ORG. Archived from the original on 3 October 2009. Retrieved 2009-10-28.
  19. "ISO/IEC 13818-7, Fourth edition, Part 7 - Advanced Audio Coding (AAC)" (PDF). ISO. 15 January 2006. Archived (PDF) from the original on 6 March 2009. Retrieved 2009-10-28.
  20. Bouvigne, Gabriel (2003). "MPEG-2/MPEG-4 - AAC". MP3'Tech. Archived from the original on 2010-01-05. Retrieved 2009-10-28.
  21. "MPEG Audio FAQ Version 9 - MPEG-1 and MPEG-2 BC". ISO. October 1998. Archived from the original on 2010-02-18. Retrieved 2009-10-28.
  22. "Florence Press Release". ISO. March 1996. Archived from the original on 2010-04-08. Retrieved 2009-10-28.
  23. Johnston, J. D. and Ferreira, A. J., "Sum-difference stereo transform coding", ICASSP '92, March 1992, pp. II-569-572.
  24. Sinha, D. and Johnston, J. D., "Audio compression at low bit rates using a signal adaptive switched filterbank", IEEE ASSP, 1996, pp. 1053-1057.
  25. Johnston, J. D., Sinha, D., Dorward, S. and Quackenbush, S., "AT&T perceptual audio coder (PAC)" in Collected Papers on Digital Audio Bit-Rate Reduction, Gilchrist, N. and Grewin, C. (Ed.), Audio Engineering Society, 1996.
  26. Herre, J. and Johnston, J. D., "Enhancing the performance of perceptual audio coders by using temporal noise shaping", AES 101st Convention, no. preprint 4384, 1996
  27. Brandenburg, Karlheinz; Kunz, Oliver; Sugiyama, Akihiko. "MPEG-4 Natural Audio Coding - Audio profiles and levels". chiariglione.org. Archived from the original on 2010-07-17. Retrieved 2009-10-06.
  28. "ISO/IEC FCD 14496-3 Subpart 1 - Draft - N2203" (PDF). ISO/IEC JTC 1/SC 29/WG 11. 15 May 1998. Retrieved 2009-10-07.
  29. Brandenburg, Karlheinz; Kunz, Oliver; Sugiyama, Akihiko (1999). "MPEG-4 Natural Audio Coding - General Audio Coding (AAC based)". chiariglione.org. Archived from the original on 2010-02-19. Retrieved 2009-10-06.
  30. "ISO/IEC 14496-3:1999/Amd 1:2000 - Audio extensions". ISO. 2000. Archived from the original on 2011-06-06. Retrieved 2009-10-07.
  31. "ISO/IEC 14496-3:/Amd.1 - Final Committee Draft - MPEG-4 Audio Version 2" (PDF). ISO/IEC JTC 1/SC 29/WG 11. July 1999. Archived from the original (PDF) on 2012-08-01. Retrieved 2009-10-07.
  32. Purnhagen, Heiko (19 February 2000). "MPEG-4 Version 2 Audio Workshop:HILN - Parametric Audio Coding" (PDF). Paris. AES 108th Convention: MPEG-4 Version 2 Audio What is it about?. Retrieved 2009-10-07.
  33. Pereira, Fernando (October 2001). "Levels for Audio Profiles". MPEG Industry Forum. Archived from the original on 2010-01-08. Retrieved 2009-10-15.
  34. "ISO/IEC 14496-3:2001/Amd 1:2003 - Bandwidth extension". ISO. 2003. Archived from the original on 2011-06-06. Retrieved 2009-10-07.
  35. "Text of ISO/IEC 14496-3:2001/FPDAM 4, Audio Lossless Coding (ALS), new audio profiles and BSAC extensions". ISO/IEC JTC1/SC29/WG11/N7016. 11 January 2005. Archived from the original (DOC) on 12 May 2014. Retrieved 2009-10-09.
  36. "Audio Lossless Coding (ALS), new audio profiles and BSAC extensions, ISO/IEC 14496-3:2005/Amd 2:2006". ISO. 2006. Archived from the original on 2012-01-04. Retrieved 2009-10-13.
  37. Mody, Mihir (6 June 2005). "Audio compression gets better and more complex". Embedded.com. Archived from the original on 8 February 2016. Retrieved 2009-10-13.
  38. "MPEG-4 aacPlus - Audio coding for today's digital media world" (PDF). Archived from the original (PDF) on 2006-10-26. Retrieved 2007-01-29.
  39. "Parametric coding for high-quality audio, ISO/IEC 14496-3:2001/Amd 2:2004". ISO. 2004. Archived from the original on 2012-01-04. Retrieved 2009-10-13.
  40. "3GPP TS 26.401 V6.0.0 (2004-09), General Audio Codec audio processing functions; Enhanced aacPlus General Audio Codec; General Description (Release 6)" (DOC). 3GPP. 30 September 2004. Archived from the original on 19 August 2006. Retrieved 2009-10-13.
  41. "ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. 2009. Archived from the original on 2011-06-06. Retrieved 2009-10-07.
  42. "AAC". Hydrogenaudio. Archived from the original on 2014-07-06. Retrieved 2011-01-24.
  43. US patent application 20070297624 Digital audio encoding
  44. "ISO/IEC 13818-7, Third edition, Part 7 - Advanced Audio Coding (AAC)" (PDF). ISO. 15 October 2004. p. 32. Archived from the original (PDF) on 13 July 2011. Retrieved 2009-10-19.
  45. Grill, Bernhard; Geyersberger, Stefan; Hilpert, Johannes; Teichmann, Bodo (July 2004). Implementation of MPEG-4 Audio Components on various Platforms (PDF). 109th AES Convention 2000 September 22–25 Los Angeles. Fraunhofer Gesellschaft. Archived from the original (PDF) on 2007-06-10. Retrieved 2009-10-09.
  46. "ISO/IEC 14496-3:2009/Amd 3:2012 - Transport of unified speech and audio coding (USAC)". ISO. Archived from the original on 2016-03-08. Retrieved 2016-08-03.
  47. "ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. Archived from the original on 2016-05-20. Retrieved 2016-08-02.
  48. "ISO/IEC 14496-3:2009/Amd 3:2012 - Transport of unified speech and audio coding (USAC)". ISO. Archived from the original on 2016-08-19. Retrieved 2016-08-02.
  49. "The AAC-ELD Family for High Quality Communication Services | MPEG". mpeg.chiariglione.org. Archived from the original on 2016-08-20. Retrieved 2016-08-02.
  50. IMS Profile for High Definition Video Conference (HDVC) Service (PDF). GSMA. 24 May 2016. p. 10. Archived (PDF) from the original on 18 August 2016.
  51. "AAC Licensing FAQ Q5". Via Licensing. Retrieved 2020-01-15.
  52. "AAC License Fees". Via Licensing. Retrieved 2020-01-15.
  53. Via Licensing Corporation (June 5, 2018). "AAC PATENT LICENSE AGREEMENT". www.sec.gov. Retrieved 21 April 2023.
  54. "FFmpeg License and Legal Considerations". ffmpeg.org.
  55. "Commit - rpms/fdk-aac-free - b27d53fbad872ea0ec103653fddaec83238132d9 - src.fedoraproject.org". src.fedoraproject.org.
  56. "Commit - rpms/ffmpeg - 45f894ec0e43a37775393c159021a4ac60170a55 - src.fedoraproject.org". src.fedoraproject.org.
  57. "List of AAC related patents". hydrogenaud.io.
  58. Thom, D.; Purnhagen, H. (October 1998). "MPEG Audio FAQ Version 9 - MPEG-4". chiariglione.org. MPEG Audio Subgroup. Archived from the original on 2010-02-14. Retrieved 2009-10-06.
  59. "The xHE-AAC Trademark Program". Fraunhofer Institute for Integrated Circuits IIS. Retrieved 2021-02-11.
  60. "Fraunhofer's xHE-AAC Audio Codec Software Extends Native AAC Support In Android P For Better Quality At Low Bitrates". Fraunhofer Institute for Integrated Circuits IIS. Retrieved 2020-07-11.
  61. "ISO/IEC 14496-3:2019". ISO. Retrieved 2022-02-19.
  62. Wolters, Martin; Kjorling, Kristofer; Homm, Daniel; Purnhagen, Heiko. A closer look into MPEG-4 High Efficiency AAC (PDF). p. 3. Archived from the original (PDF) on 2003-12-19. Retrieved 2008-07-31. Presented at the 115th Convention of the Audio Engineering Society, 10–13 October 2003.
  63. "Advanced Audio Coding (MPEG-2), Audio Data Interchange Format". Library of Congress / National Digital Information Infrastructure and Preservation Program. 7 March 2007. Archived from the original on 30 July 2008. Retrieved 2008-07-31.
  64. ETSI TS 101 154 v1.5.1: Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG transport stream
  65. Cohen, Peter (2010-05-27). "iTunes Store goes DRM-free". Macworld. Mac Publishing. Archived from the original on 18 February 2009. Retrieved 2009-02-10.
  66. "Apple AAC". Hydrogenaudio. Archived from the original on 2021-11-23. Retrieved 2021-11-22.
  67. "Gingerbread - Android Developers". Android Developers. Archived from the original on 29 December 2017. Retrieved 8 May 2018.
  68. "Supported media formats - Android Developers". Android Developers. Archived from the original on 11 March 2012. Retrieved 8 May 2018.
  69. "Palm Pre Phone / Features, Details". Palm USA. Archived from the original on 2011-05-24.
  70. "Nintendo - Customer Service - Wii - Photo Channel". nintendo.com. Archived from the original on 5 May 2017. Retrieved 8 May 2018.
  71. "Supported Media for Google Cast". Archived from the original on 2015-09-23. Retrieved 2015-09-22.
  72. "Statistics - Adobe Flash runtimes". www.adobe.com. Archived from the original on 2 October 2011. Retrieved 8 May 2018.
  73. "Adobe Delivers Flash Player 9 with H.264 Video Support". Adobe press release. 2007-12-04. Archived from the original on 2014-08-21. Retrieved 2014-08-20.
  74. "Xbox.com | System Use - Use an Apple iPod with Xbox 360". Archived from the original on April 8, 2007.
  75. "Nero Platinum 2018 Suite - Award-winning all-rounder". Nero AG. Archived from the original on 14 December 2012. Retrieved 8 May 2018.
  76. "FAAC". AudioCoding.com. Archived from the original on 2009-12-11. Retrieved 2009-11-03.
  77. "FAAD2". AudioCoding.com. Archived from the original on 2009-12-11. Retrieved 2009-11-03.
  78. "December 5th, 2015, The native FFmpeg AAC encoder is now stable!". ffmpeg.org. Archived from the original on 16 July 2016. Retrieved 26 June 2016.
  79. "FFmpeg AAC Encoding Guide". Archived from the original on 17 April 2016. Retrieved 11 April 2016. Which encoder provides the best quality? ... the likely answer is: libfdk_aac
  80. "Libav Wiki - Encoding AAC". Archived from the original on 2016-04-20. Retrieved 11 April 2016.