An audio file format is a file format for storing digital audio data on a computer system. The bit layout of the audio data (excluding metadata) is called the audio coding format and can be uncompressed, or compressed to reduce the file size, often using lossy compression. The data can be a raw bitstream in an audio coding format, but it is usually embedded in a container format or an audio data format with a defined storage layer.
It is important to distinguish between the audio coding format, the container containing the raw audio data, and an audio codec. A codec performs the encoding and decoding of the raw audio data, while the encoded data is (usually) stored in a container file. Although most audio file formats support only one type of audio coding data (created with an audio coder), a multimedia container format (such as Matroska or AVI) may support multiple types of audio and video data.
There are three major groups of audio file formats: uncompressed audio formats, such as WAV, AIFF, AU or raw header-less PCM; formats with lossless compression, such as FLAC, Monkey's Audio (filename extension .ape), WavPack (filename extension .wv), TTA, ATRAC Advanced Lossless, ALAC (filename extension .m4a), MPEG-4 SLS, MPEG-4 ALS, MPEG-4 DST, Windows Media Audio Lossless (WMA Lossless), and Shorten (SHN); and formats with lossy compression, such as MP3, AAC, Vorbis, Opus and Musepack.
One major uncompressed audio format, LPCM, is the same variety of PCM as used in Compact Disc Digital Audio and is the format most commonly accepted by low-level audio APIs and D/A converter hardware. Although LPCM can be stored on a computer as a raw audio format, it is usually stored in a .wav file on Windows or in a .aiff file on macOS. The Audio Interchange File Format (AIFF) is based on the Interchange File Format (IFF), and the WAV format is based on the similar Resource Interchange File Format (RIFF). WAV and AIFF are designed to store a wide variety of audio formats, lossless and lossy; they just add a small, metadata-containing header before the audio data to declare the format of the audio data, such as LPCM with a particular sample rate, bit depth, endianness and number of channels. Since WAV and AIFF are widely supported and can store LPCM, they are suitable file formats for storing and archiving an original recording.
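The role of the WAV header can be illustrated with a minimal sketch that uses Python's standard `wave` module. The helper names `write_lpcm_wav` and `read_wav_format` and the parameter values are assumptions chosen for illustration; the sketch writes silent LPCM frames to an in-memory WAV file and reads back the format fields that the header declares.

```python
import io
import wave


def write_lpcm_wav(sample_rate=44100, channels=2, sample_width=2, n_frames=1000):
    """Write silent LPCM frames to an in-memory WAV file and return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00" * (n_frames * channels * sample_width))
    return buffer.getvalue()


def read_wav_format(data):
    """Return the format fields declared in the WAV header."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        return {
            "channels": wav_file.getnchannels(),
            "bit_depth": wav_file.getsampwidth() * 8,
            "sample_rate": wav_file.getframerate(),
            "frames": wav_file.getnframes(),
        }


wav_bytes = write_lpcm_wav()
info = read_wav_format(wav_bytes)
# Everything that is not sample data is header and chunk bookkeeping.
header_size = len(wav_bytes) - 1000 * 2 * 2
print(info, header_size)
```

The file starts with the RIFF identifier, and the header is tiny compared with the audio payload it describes.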
BWF (Broadcast Wave Format) is a standard audio format created by the European Broadcasting Union as a successor to WAV. Among other enhancements, BWF allows more robust metadata to be stored in the file. See European Broadcasting Union: Specification of the Broadcast Wave Format (EBU Technical document 3285, July 1997). BWF is the primary recording format used in many professional audio workstations in the television and film industry. BWF files include a standardized timestamp reference which allows for easy synchronization with a separate picture element. Stand-alone, file-based, multi-track recorders from AETA, [1] Sound Devices, [2] Zaxcom, [3] HHB Communications Ltd, [4] Fostex, Nagra, Aaton, [5] and TASCAM all use BWF as their preferred format.
A lossless compressed audio format stores data in less space without losing any information. The original, uncompressed data can be recreated from the compressed version.
Uncompressed audio formats encode both sound and silence with the same number of bits per unit of time. Encoding an uncompressed minute of absolute silence produces a file of the same size as encoding an uncompressed minute of music. In a lossless compressed format, however, the music would occupy a smaller file than an uncompressed format and the silence would take up almost no space at all.
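The contrast between silence and a busy signal can be sketched as follows. This is an illustration under stated assumptions, not a real audio codec: `zlib` is a general-purpose lossless compressor standing in for a format such as FLAC, the "music" is just a sine tone with seeded noise, and the sample rate and duration are arbitrary.

```python
import math
import random
import struct
import zlib

SAMPLE_RATE = 8000  # assumed low rate to keep the example fast
SECONDS = 5


def to_pcm16(samples):
    """Pack floats in [-1.0, 1.0] as little-endian signed 16-bit LPCM."""
    ints = [int(s * 32767) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = SAMPLE_RATE * SECONDS

silence = to_pcm16([0.0] * n)
signal = to_pcm16(
    [
        0.5 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) + 0.1 * rng.uniform(-1, 1)
        for i in range(n)
    ]
)

# zlib is only a stand-in for a lossless audio codec.
silence_packed = zlib.compress(silence, 9)
signal_packed = zlib.compress(signal, 9)

print("uncompressed:", len(silence), len(signal))
print("compressed:  ", len(silence_packed), len(signal_packed))
```

Both uncompressed buffers have the same length, while the compressed silence shrinks to a tiny fraction of the compressed signal. A dedicated audio codec would typically compress the non-silent signal better than `zlib` does.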
Lossless compression formats include FLAC, WavPack, Monkey's Audio, and ALAC (Apple Lossless). They provide a compression ratio of about 2:1 (i.e. their files take up about half the space of PCM). Development in lossless compression formats aims to reduce processing time while maintaining a good compression ratio.
Lossy audio formats enable even greater reductions in file size by removing some of the audio information and simplifying the data. This results in a reduction in audio quality, but a variety of techniques, mainly exploiting psychoacoustics, are used to remove the parts of the sound that have the least effect on perceived quality and to minimize the amount of audible noise added during the process. The popular MP3 format is probably the best-known example, but the AAC format found on the iTunes Music Store is also common. Most formats offer a range of degrees of compression, generally measured in bit rate. The lower the rate, the smaller the file and the more significant the quality loss.
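As a rough worked example of the 2:1 figure, the sketch below computes the size of one minute of CD-quality LPCM and halves it. The function name `pcm_bytes` is an assumption for illustration, and the 2:1 ratio is only the approximate value quoted above; real ratios depend on the material.

```python
def pcm_bytes(sample_rate, bit_depth, channels, seconds):
    """Size in bytes of uncompressed LPCM audio, excluding any header."""
    return sample_rate * (bit_depth // 8) * channels * seconds


# One minute of CD-quality audio: 44.1 kHz, 16-bit, stereo.
cd_minute = pcm_bytes(44100, 16, 2, 60)

# Approximate lossless size at the roughly 2:1 ratio quoted in the text.
lossless_estimate = cd_minute // 2

print(cd_minute, lossless_estimate)  # 10584000 5292000
```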
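The relationship between bit rate and file size is simple arithmetic, sketched below. The function name and the chosen bit rates are assumptions for illustration; the calculation assumes a constant bit rate and ignores container overhead and metadata.

```python
CD_PCM_BIT_RATE = 44100 * 16 * 2  # bits per second for 16-bit stereo at 44.1 kHz


def encoded_size_bytes(bit_rate_kbps, seconds):
    """Approximate payload size of a constant-bit-rate stream."""
    return bit_rate_kbps * 1000 * seconds // 8


for kbps in (64, 128, 320):
    size = encoded_size_bytes(kbps, 60)
    ratio = CD_PCM_BIT_RATE / (kbps * 1000)
    print(f"{kbps} kbit/s: {size} bytes per minute, about {ratio:.1f}x smaller than CD PCM")
```

At 128 kbit/s, one minute occupies 960,000 bytes, roughly an eleventh of the uncompressed CD-quality size.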
File Extension | Creation Company | Description |
---|---|---|
.3gp | | Multimedia container format that can contain proprietary formats such as AMR, AMR-WB or AMR-WB+, as well as some open formats. |
.aa | Audible (Amazon) | A low-bitrate audiobook container format with DRM, containing audio encoded as either MP3 or the ACELP speech codec. |
.aac | | The Advanced Audio Coding format is based on the MPEG-2 and MPEG-4 standards. AAC files are usually ADTS or ADIF containers. |
.aax | Audible (Amazon) | An audiobook format, which is a variable-bitrate (allowing high quality) M4B file encrypted with DRM. M4B contains AAC or ALAC encoded audio in an MPEG-4 container. (More details below.) |
.act | | ACT is a lossy ADPCM 8 kbit/s compressed audio format recorded by most Chinese MP3 and MP4 players with a recording function, and by voice recorders. |
.aiff | Apple | A standard uncompressed, CD-quality audio file format used by Apple. Introduced three years before Microsoft's uncompressed counterpart, WAV. |
.alac | Apple | An audio coding format developed by Apple Inc. for lossless data compression of digital music. |
.amr | | AMR-NB audio, used primarily for speech. |
.ape | Matthew T. Ashland | Monkey's Audio lossless audio compression format. |
.au | Sun Microsystems | The standard audio file format used by Sun, Unix and Java. The audio in au files can be PCM or compressed with the μ-law, a-law or G.729 codecs. |
.awb | | AMR-WB audio, used primarily for speech, same as the ITU-T's G.722.2 specification. |
.dss | Olympus | DSS files are an Olympus proprietary format. DSS files use a high compression rate, which reduces the file size and allows files to be copied and transferred quickly. [6] It allows additional data to be held in the file header. |
.dvf | Sony | A Sony proprietary format for compressed voice files; commonly used by Sony dictation recorders. |
.flac | | A file format for the Free Lossless Audio Codec, an open-source lossless compression codec. |
.gsm | | Designed for telephony use in Europe, GSM is used to store telephone voice messages and conversations. With a bitrate of 13 kbit/s, GSM files can compress and encode audio at telephone quality. [7] Note that WAV files can also be encoded with the GSM codec. |
.iklax | iKlax | An iKlax Media proprietary format, the iKlax format is a multi-track digital audio format allowing various actions on musical data, for instance on mixing and volumes arrangements. |
.ivs | 3D Solar UK Ltd | A proprietary version with DRM developed by 3D Solar UK Ltd for use in music downloaded from their Tronme Music Store and interactive music and video player. |
.m4a | | An audio-only MPEG-4 file, used by Apple for unprotected music downloaded from their iTunes Music Store. Audio within the m4a file is typically encoded with AAC, although lossless ALAC may also be used. |
.m4b | | An audiobook / podcast extension with AAC or ALAC encoded audio in an MPEG-4 container. Both M4A and M4B formats can contain metadata including chapter markers, images, and hyperlinks, but M4B allows "bookmarks" (remembering the last listening spot), whereas M4A does not. [8] |
.m4p | Apple | A version of AAC with proprietary DRM developed by Apple for use in music downloaded from their iTunes Music Store and their music streaming service known as Apple Music. |
.mmf | Yamaha, Samsung | An audio format used in Samsung ringtones. Developed by the Yamaha Corporation as SMAF ("Synthetic music Mobile Application Format"), a multimedia data format that uses the .mmf file extension. |
.movpkg | Apple | An Apple audio format primarily used for Lossless and Hi-Res audio files through Apple Music. Also used for storing Apple TV videos. |
.mp3 | | MPEG Layer III Audio. |
.mpc | | Musepack or MPC (formerly known as MPEGplus, MPEG+ or MP+) is an open source lossy audio codec, specifically optimized for transparent compression of stereo audio at bitrates of 160–180 kbit/s. |
.msv | Sony | A Sony proprietary format for Memory Stick compressed voice files. |
.nmf | NICE | NICE Media Player audio file |
.ogg, .oga, .mogg | Xiph.Org Foundation | A free, open source container format supporting a variety of formats, the most popular of which is the audio format Vorbis. Vorbis offers compression similar to MP3 but is less popular. Mogg, the "Multi-Track-Single-Logical-Stream Ogg-Vorbis", is the multi-channel or multi-track Ogg file format. |
.opus | Internet Engineering Task Force | A lossy audio compression format developed by the Internet Engineering Task Force (IETF) and made especially suitable for interactive real-time applications over the Internet. As an open format standardised through RFC 6716, a reference implementation is provided under the 3-clause BSD license. |
.ra, .rm | RealNetworks | A RealAudio format designed for streaming audio over the Internet. The .ra format allows files to be stored in a self-contained fashion on a computer, with all of the audio data contained inside the file itself. |
.raw | | A raw file can contain audio in any format but is usually used with PCM audio data. It is rarely used except for technical tests. |
.rf64 | | One successor to the WAV format, overcoming the 4 GiB size limitation. |
.sln | | Signed Linear PCM format used by Asterisk. Prior to v.10 the standard formats were 16-bit Signed Linear PCM sampled at 8 kHz and at 16 kHz. With v.10 many more sampling rates were added. [9] |
.tta | | The True Audio, real-time lossless audio codec. |
.voc | Creative Technology | The file format consists of a 26-byte header and a series of subsequent data blocks containing the audio information |
.vox | | The vox format most commonly uses the Dialogic ADPCM (Adaptive Differential Pulse Code Modulation) codec. Similar to other ADPCM formats, it compresses to 4 bits per sample. Vox files are similar to wave files, except that they contain no information about the file itself, so the codec, sample rate and number of channels must first be specified in order to play a vox file. |
.wav | IBM and Microsoft | Standard audio file container format used mainly in Windows PCs. Commonly used for storing uncompressed (PCM), CD-quality sound files, which means that they can be large in size—around 10 MB per minute. Wave files can also contain data encoded with a variety of (lossy) codecs to reduce the file size (for example the GSM or MP3 formats). Wav files use a RIFF structure. |
.wma | Microsoft | Windows Media Audio format, created by Microsoft. Designed with DRM abilities for copy protection. |
.wv | | Format for WavPack files. |
.webm | | Royalty-free format created for HTML video. |
.8svx | Electronic Arts | The IFF-8SVX format for 8-bit sound samples, created by Electronic Arts in 1984 at the birth of the Amiga. |
.cda | | Format for cda files for Radio. |
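Several of the container formats in the table above begin with a short identifying byte sequence, whereas header-less formats such as .raw and .vox carry no such marker. The sketch below shows one way to sniff a few of them from the first 12 bytes of a file. The function name `sniff_audio_container` is hypothetical, and the check covers only a handful of formats; it is not a complete or authoritative detector.

```python
def sniff_audio_container(header: bytes) -> str:
    """Guess the container from the first 12 bytes of a file (partial check)."""
    if len(header) >= 12 and header[8:12] == b"WAVE":
        if header[:4] == b"RIFF":
            return "WAV"
        if header[:4] == b"RF64":
            return "RF64"
    if len(header) >= 12 and header[:4] == b"FORM":
        # IFF-based files name their type in bytes 8..11.
        form_type = header[8:12]
        if form_type in (b"AIFF", b"AIFC"):
            return "AIFF"
        if form_type == b"8SVX":
            return "8SVX"
    if header[:4] == b"fLaC":
        return "FLAC"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[:4] == b".snd":
        return "AU"
    # Header-less data (for example .raw or .vox) cannot be identified this way.
    return "unknown"


print(sniff_audio_container(b"RIFF\x24\x00\x00\x00WAVE"))  # WAV
print(sniff_audio_container(b"\x00" * 12))                 # unknown
```

Because raw and vox files return "unknown" here, a player has to be told their sample rate, bit depth and channel count separately, as the table notes.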
A codec is a device or computer program that encodes or decodes a data stream or signal. Codec is a portmanteau of coder/decoder.
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.
In information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. Higher degrees of approximation create coarser images as more details are removed. This is opposed to lossless data compression which does not degrade the data. The amount of data reduction possible using lossy compression is much higher than using lossless techniques.
Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits statistical redundancy. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved compression rates.
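The difference between perfect and approximate reconstruction can be sketched with a few 16-bit samples. Both techniques here are stand-ins chosen for illustration: `zlib` represents lossless compression in general, and the hypothetical `quantize` helper discards low-order bits as a crude form of lossy reduction. Real lossy audio codecs rely on psychoacoustic models rather than simple bit truncation.

```python
import struct
import zlib

samples = [0, 1000, -1000, 12345, -32768, 32767, 257, -3]
pcm = struct.pack("<%dh" % len(samples), *samples)

# Lossless: decompression returns exactly the bytes that went in.
restored = zlib.decompress(zlib.compress(pcm))
lossless_ok = restored == pcm


def quantize(sample, kept_bits=8):
    """Crude lossy step: keep only the top `kept_bits` of a 16-bit sample."""
    shift = 16 - kept_bits
    return (sample >> shift) << shift


# Lossy: the reconstruction is only an approximation of the original.
approx = [quantize(s) for s in samples]
max_error = max(abs(a - s) for a, s in zip(approx, samples))

print(lossless_ok, approx != samples, max_error)
```

The lossless round trip reproduces the input exactly, while the quantized samples differ from the originals by a bounded error that cannot be undone.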
Waveform Audio File Format is an audio file format standard for storing an audio bitstream on personal computers. The format was developed and published for the first time in 1991 by IBM and Microsoft. It is the main format used on Microsoft Windows systems for uncompressed audio. The usual bitstream encoding is the linear pulse-code modulation (LPCM) format.
Audio Video Interleave is a proprietary multimedia container format and Windows standard introduced by Microsoft in November 1992 as part of its Video for Windows software. AVI files can contain both audio and video data in a file container that allows synchronous audio-with-video playback. Like the DVD video format, AVI files support multiple streaming audio and video, although these features are seldom used.
Audio Interchange File Format (AIFF) is an audio file format standard used for storing sound data for personal computers and other electronic audio devices. The format was developed by Apple Inc. in 1988 based on Electronic Arts' Interchange File Format and is most commonly used on Apple Macintosh computer systems.
Monkey's Audio is an algorithm and file format for lossless audio data compression. Lossless data compression does not discard data during the process of encoding, unlike lossy compression methods such as Advanced Audio Coding, MP3, Vorbis, and Opus. Therefore, it may be decompressed to a file that is identical to the source material.
8-Bit Sampled Voice (8SVX) is an audio file format standard developed by Electronic Arts for the Amiga computer series. It is a data subtype of the IFF file container format. It typically contains linear pulse-code modulation (LPCM) digital audio.
The Apple Lossless Audio Codec, also known as Apple Lossless, or Apple Lossless Encoder (ALE), is an audio coding format, and its reference audio codec implementation, developed by Apple Inc. for lossless data compression of digital music. After initially keeping it proprietary from its inception in 2004, in late 2011 Apple made the codec available open source and royalty-free. Traditionally, Apple has referred to the codec as Apple Lossless, though more recently it has begun to use the abbreviated term ALAC when referring to the codec.
Transcoding is the direct digital-to-digital conversion of one encoding to another, such as for video data files, audio files, or character encoding. This is usually done in cases where a target device does not support the format or has limited storage capacity that mandates a reduced file size, or to convert incompatible or obsolete data to a better-supported or modern format.
WavPack is a free and open-source lossless audio compression format and an application implementing the format. It is unusual in that it supports hybrid audio compression alongside normal lossless compression, the latter being similar to how FLAC works. It also supports compressing a wide variety of lossless formats, including various variants of PCM and also DSD as used in SACDs, together with support for surround audio.
MPEG-4 SLS, or MPEG-4 Scalable to Lossless as per ISO/IEC 14496-3:2005/Amd 3:2006 (Scalable Lossless Coding), is an extension to the MPEG-4 Part 3 (MPEG-4 Audio) standard to allow lossless audio compression scalable to lossy MPEG-4 General Audio coding methods (e.g., variations of AAC). It was developed jointly by the Institute for Infocomm Research (I2R) and Fraunhofer, which commercializes its implementation of a limited subset of the standard under the name of HD-AAC. Standardization of the HD-AAC profile for MPEG-4 Audio is under development (as of September 2009).
OptimFROG is a proprietary, lossless audio codec developed by Florin Ghido. OptimFROG is optimized for high compression at the expense of encoding and decoding speed, and consistently measures among the highest compressing lossless codecs. OptimFROG comes with three compressors: a lossless codec for integer LPCM format in WAV files, one for IEEE 754 floating-point WAV files, and a third codec called DualStream.
Pulse-code modulation (PCM) is a method used to digitally represent analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps. Alec Reeves, Claude Shannon, Barney Oliver and John R. Pierce are credited with its invention.
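The two steps described above, sampling at uniform intervals and quantizing each sample to the nearest digital step, can be sketched as follows. The function name `sample_and_quantize` and the 1 kHz tone at an 8 kHz sampling rate are assumptions chosen so that the values are easy to check by hand.

```python
import math


def sample_and_quantize(frequency_hz, sample_rate, n_samples, bit_depth=16):
    """Sample a unit sine wave and round each sample to a signed integer step."""
    max_level = 2 ** (bit_depth - 1) - 1  # 32767 for 16-bit audio
    return [
        round(math.sin(2 * math.pi * frequency_hz * i / sample_rate) * max_level)
        for i in range(n_samples)
    ]


# A 1 kHz tone sampled at 8 kHz gives eight samples per cycle.
codes = sample_and_quantize(1000, 8000, 8)
print(codes)  # [0, 23170, 32767, 23170, 0, -23170, -32767, -23170]
```

Each integer is one PCM sample; a 16-bit LPCM stream stores such values one after another.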
MPEG-1 Audio Layer III HD was an audio compression codec developed by Technicolor, formerly known as Thomson.
A video coding format is a content representation format of digital video content, such as in a data file or bitstream. It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. A specific software, firmware, or hardware implementation capable of compression or decompression in a specific video coding format is called a video codec.
An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, which is one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.