Developer(s) | David Bryant |
---|---|
Stable release | |
Repository | |
Operating system | Cross-platform |
Type | Audio codec Container |
License | BSD license |
Website | wavpack.com |
Filename extension | .wv (for both lossless and hybrid files), .wvc (correction file for hybrid files only) |
---|---|
Internet media type | audio/x-wavpack (.wv), audio/x-wavpack-correction (.wvc) |
Magic number | wvpk |
Type of format | Lossless audio |
Contained by | Matroska (not required) |
Open format? | Yes |
Free format? | Yes |
WavPack is a free and open-source lossless audio compression format and application implementing the format. It is unique in the way that it supports hybrid audio compression alongside normal compression which is similar to how FLAC works. It also supports compressing a wide variety of lossless formats, including various variants of PCM and also DSD as used in SACDs, together with its support for surround audio.
WavPack compression can compress (and losslessly restore) 8, 16, 24, and 32-bit fixed-point, and 32-bit floating-point PCM audio files in the .WAV file format. It can also handle DSD input in DSDIFF or DSF format. [2] It also supports surround sound streams and high sampling rates. Like other lossless compression schemes, the data reduction rate varies with the source, but it is generally between 30% and 70% for typical popular music and somewhat better than that for classical music and other sources with greater dynamic range. [3]
WavPack also incorporates a "hybrid" mode, which still provides the features of lossless compression, but creates two files: a relatively small, high-quality, lossy file (.wv) that can be used by itself; and a "correction" file (.wvc) that, when combined with the lossy file, provides full lossless restoration. This allows the use of lossy and lossless codecs together. The lossy algorithm is similar to ADPCM. [4]
Hybrid mode can handle floating-point data, but only when "exception" values like infinities or NaNs are not present. It cannot handle DSD because there is no lossy algorithm for DSD. [2]
A similar "hybrid" feature is also offered by OptimFROG DualStream, MPEG-4 SLS and DTS-HD Master Audio.
David Bryant started development on WavPack in mid-1998 with the release of version 1.0 (1998-08-15). This first version compressed and decompressed audio losslessly, and it already featured one of the best efficiency vs. speed ratios among lossless encoders. [5]
Very soon after the release of version 1.0, v. 2.0 (2 September 1998) was released, featuring lossy encoding (using only quantization of prediction residue for data reduction – no psychoacoustic masking model was applied to the stream).
In 1999, version 3.0 (12 September 1999) was released, with a new "fast mode" (albeit with reduced compression ratio), compression of raw (headerless) PCM audio files, and error detection using a 32-bit cyclic redundancy check.
A feature added in late 3.x versions is the "hybrid" mode where the encoder generates a lossy file and a correction file such that both can be decompressed back to a PCM stream that has same quality as the original. [6]
In WavPack 4, a new file format structure is introduced. A "roadmap" is also published by the author with 4.40, containing possible hints on future development. [7] WavPack 5 introduced support for compressing DSD.
Some software supports the format natively (like DeaDBeeF, foobar2000, [8] and Jack! The Knife), while others require plugins. The official WavPack website offers plugins for Winamp, Nero Burning ROM, MediaChest 2.1, and several other applications, as well as a DirectShow filter. [9] dBpoweramp CD-Ripper, [10] by the author of foobar2000, as well as foobar2000 [11] itself, and Asunder allow ripping Audio CDs directly into Wavpack files.
Linux support is available with a native port.
FFmpeg has a native WavPack encoder, which may be combined with software like GNU parallel to use multiple CPU cores to quickly transcode other lossless formats into WavPack, and from WavPack to any format that FFmpeg supports, without the need for additional software. However, FFMpeg's encoder is somewhat limited.
As of 2023, FFmpeg's WavPack encoder has some considerable limitations. It can only produce version 4 bitstreams, which do not support fast verification for file integrity checks, or more than 16 channels. It will also discard RIFF chunks and may not behave predictably with 24-bit input. It also defaults to less than optimal compression to achieve faster encoding. Due to lack of support for Hybrid mode, FFmpeg-based playback software will fail to consider the .wvc correction file if there is one present, and will play or transcode only the lossy section. (However, this is not the usual mode of operation for WavPack.) As native Wavpack support for Direct Stream Digital was added in version 5 of the reference encoder, FFmpeg WavPack also is limited to encoding PCM input. [12]
Native support:
Non-native support:
The WavPack website also includes a plugin that allows support for the format on the Roku PhotoBridge HD Archived 2005-07-08 at the Wayback Machine . [9]
To ensure high-speed operation, WavPack uses a predictor that is implemented entirely in integer math. [14] In its "fast" mode the prediction is simply the arithmetic extrapolation of the previous two samples. For example, if the previous two samples were −10 and 20, then the prediction would be 50. For the default mode a simple adaptive factor is added to weigh the influence of the earlier sample on the prediction. In the example above the resulting prediction could then vary between 20 for no influence to 50 for full influence. This weight factor is constantly updated based on the audio data's changing spectral characteristics.
The prediction generated is then subtracted from the actual sample to be encoded to generate the error value. In mono mode this value is sent directly to the coder. However, stereo signals tend to have some correlation between the two channels that can be further exploited. Therefore, two error values are calculated that represent the difference and average of the left and right error values. In the "fast" mode of operation these two new values are simply sent to the coder instead of the left and right values. In the default mode, the difference value is always sent to the coder along with one of the other three values (average, left, or right). An adaptive algorithm continuously determines the most efficient of the three to send based on the changing balance of the channels.
Instead of Rice coding, a special data encoder for WavPack is used. Rice coding is the optimal bit coding for this type of data, and WavPack's encoder is less efficient, but only by about 0.15 bits per sample (or less than 1% for 16-bit data). However, there are some advantages in exchange. The first one is that WavPack's encoder does not require the data to be buffered ahead of encoding; instead it converts each sample directly to bitcodes. This is more computationally efficient and is better in some applications where coding delay is critical. The second advantage is that it is easily adaptable to lossy encoding, since all significant bits (except the implied "one" MSB) are transmitted directly. In this way it is possible to only transmit, for example, the 3 most significant bits (with sign) of each sample. In fact, it is possible to transmit only the sign and implied MSB for each sample with an average of only 3.65 bits per sample.
This coding scheme is used to implement the "lossy" mode of WavPack. In the "fast" mode the output of the non-adaptive decorrelator is simply rounded to the nearest codable value for the specified number of bits. In the default mode the adaptive decorrelator is used (which reduces the average noise about 1 dB), and both the current and the next sample are considered in choosing the better of the two available codes (which reduces noise another 1 dB).
No floating-point arithmetic is used in WavPack's data path because, according to the author, integer operations are less susceptible to subtle chip-to-chip variations that could corrupt the lossless nature of the compression (the Pentium floating-point bug being an example). It is possible that a lossless compressor that used floating-point math could generate different output when running on that faulty Pentium. Even disregarding actual bugs, floating-point math is complicated enough that there could be subtle differences between "correct" implementations that could cause trouble for this type of application. [15] A 32-bit error detection code to the generated streams is included to maintain user confidence in the integrity of WavPack's compression.
WavPack source code is portable and has been compiled on several Unix and Unix-like operating systems (Linux, Mac OS X, Solaris, FreeBSD, OpenBSD, NetBSD, Compaq Tru64, HP-UX...) as well as Windows, DOS, Palm OS, and OpenVMS. It works on many architectures, including x86, ARM, PowerPC, AMD64, IA-64, SPARC, Alpha, PA-RISC, MIPS and Motorola 68k.
A cut-down version of WavPack was developed for the Texas Instruments TMS320 series Digital Signal Processor. This was aimed predominantly at encouraging manufacturers to incorporate WavPack compression (and de-compression) into portable memory audio recorders. This version supported features that were applicable only to embedded applications (stream compression in real-time, selectable compression rate) and dropped off features that only applied to full computer systems (self extraction, high compression modes, 32-bit floats). The TMS320 series DSPs are native integer devices, and support WavPack well. Some "special" features of the full WavPack software were included (ability to generate a correction "file" (stream), for example), and others were excluded. The port was based on version 4.
WavPack support was added to WinZip starting with version 11.0 beta, released in October 2006. [16] This extension to the ZIP file format was included by PKWARE, the maintainers of the format, in the official APPNOTE.TXT description file starting with version 6.3.2, released on 28 September 2007. [17]
An audio file format is a file format for storing digital audio data on a computer system. The bit layout of the audio data is called the audio coding format and can be uncompressed, or compressed to reduce the file size, often using lossy compression. The data can be a raw bitstream in an audio coding format, but it is usually embedded in a container format or an audio data format with defined storage layer.
The Au file format is a simple audio file format introduced by Sun Microsystems. The format was common on NeXT systems and on early Web pages. Originally it was headerless, being 8-bit μ-law-encoded data at an 8000 Hz sample rate. Hardware from other vendors often used sample rates as high as 8192 Hz, often integer multiples of video clock signal frequencies. Newer files have a header that consists of six unsigned 32-bit words, an optional information chunk which is always of non-zero size, and then the data.
Windows Media Audio (WMA) is a series of audio codecs and their corresponding audio coding formats developed by Microsoft. It is a proprietary technology that forms part of the Windows Media framework. WMA consists of four distinct codecs. The original WMA codec, known simply as WMA, was conceived as a competitor to the popular MP3 and RealAudio codecs. WMA Pro, a newer and more advanced codec, supports multichannel and high-resolution audio. A lossless codec, WMA Lossless, compresses audio data without loss of audio fidelity. WMA Voice, targeted at voice content, applies compression using a range of low bit rates. Microsoft has also developed a digital container format called Advanced Systems Format to store audio encoded by WMA.
Adaptive Transform Acoustic Coding (ATRAC) is a family of proprietary audio compression algorithms developed by Sony. MiniDisc was the first commercial product to incorporate ATRAC, in 1992. ATRAC allowed a relatively small disc like MiniDisc to have the same running time as CD while storing audio information with minimal perceptible loss in quality. Improvements to the codec in the form of ATRAC3, ATRAC3plus, and ATRAC Advanced Lossless followed in 1999, 2002, and 2006 respectively.
Shorten (SHN) is a file format used for compressing audio data. It is a form of data compression of files and is used to losslessly compress CD-quality audio files. Shorten is no longer developed and other lossless audio codecs such as FLAC, Monkey's Audio (APE), TTA, and WavPack (WV) have become more popular. It is still in use to trade concert recordings that are already encoded as Shorten files. Shorten files use the .shn file extension.
Super Audio CD (SACD) is an optical disc format for audio storage introduced in 1999. It was developed jointly by Sony and Philips Electronics and intended to be the successor to the compact disc (CD) format.
Monkey's Audio is an algorithm and file format for lossless audio data compression. Lossless data compression does not discard data during the process of encoding, unlike lossy compression methods such as Advanced Audio Coding, MP3, Vorbis, and Opus. Therefore, it may be decompressed to a file that is identical to the source material.
Advanced Audio Coding (AAC) is an audio coding standard for lossy digital audio compression. It was designed to be the successor of the MP3 format and generally achieves higher sound quality than MP3 at the same bit rate.
Direct Stream Digital (DSD) is a trademark used by Sony and Philips for their system for digitally encoding audio signals for the Super Audio CD (SACD).
The Apple Lossless Audio Codec (ALAC), also known as Apple Lossless, or Apple Lossless Encoder (ALE), is an audio coding format, and its reference audio codec implementation, developed by Apple Inc. for lossless data compression of digital music. After initially keeping it proprietary from its inception in 2004, in late 2011 Apple made the codec available open source and royalty-free. Traditionally, Apple has referred to the codec as Apple Lossless, though more recently it has begun to use the abbreviated term ALAC when referring to the codec.
Transcoding is the direct digital-to-digital conversion of one encoding to another, such as for video data files, audio files, or character encoding. This is usually done in cases where a target device does not support the format or has limited storage capacity that mandates a reduced file size, or to convert incompatible or obsolete data to a better-supported or modern format.
QuickTime Animation format is a video compression format and codec created by Apple Computer to enable playback of RGB video in real time without expensive hardware. It is generally found in the QuickTime container with the FourCC 'rle '. It can perform either lossless or lossy compression and is one of the few video codecs that supports an alpha channel. Supported color depths are 1-bit (monochrome), 15-bit RGB, 24-bit RGB, 32-bit ARGB, as well as palettized RGB. As a result of reverse-engineering of the format, a decoder is implemented in XAnim as well as an encoder and decoder in libavcodec.
Rockbox is a free and open-source software replacement for the OEM firmware in various forms of digital audio players (DAPs) with an original kernel. It offers an alternative to the player's operating system, in many cases without removing the original firmware, which provides a plug-in architecture for adding various enhancements and functions. Enhancements include personal digital assistant (PDA) functions, applications, utilities, and games. Rockbox can also retrofit video playback functions on players first released in mid-2000. Rockbox includes a voice-driven user-interface suitable for operation by visually impaired users.
Dolby Digital Plus, also known as Enhanced AC-3, is a digital audio compression scheme developed by Dolby Labs for the transport and storage of multi-channel digital audio. It is a successor to Dolby Digital (AC-3), and has a number of improvements over that codec, including support for a wider range of data rates, an increased channel count, and multi-program support, as well as additional tools (algorithms) for representing compressed data and counteracting artifacts. Whereas Dolby Digital (AC-3) supports up to five full-bandwidth audio channels at a maximum bitrate of 640 kbit/s, E-AC-3 supports up to 15 full-bandwidth audio channels at a maximum bitrate of 6.144 Mbit/s.
OptimFROG is a proprietary, lossless audio codec developed by Florin Ghido. OptimFROG is optimized for high compression at the expense of encoding and decoding speed, and consistently measures among the highest compressing lossless codecs. OptimFROG comes with three compressors: a lossless codec for integer LPCM format in WAV files, one for IEEE 754 floating-point WAV files, and third codec called DualStream.
MPEG-1 Audio Layer III HD was an audio compression codec developed by Technicolor, formerly known as Thomson.
An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, which is one of several different codecs which implements encoding and decoding audio in the MP3 audio coding format in software.