WavPack

WavPack
Developer(s): David Bryant
Stable release: 5.1.0 / 18 January 2017 [1]
Operating system: Cross-platform
Type: Audio codec, container
License: BSD license
Website: wavpack.com
Filename extension: .wv

WavPack is a free and open-source lossless audio compression format.

Features

WavPack can losslessly compress and restore 8-, 16-, 24-, and 32-bit fixed-point and 32-bit floating-point audio stored in the .WAV file format; compressed files use the .wv extension. It also supports surround sound streams and high sampling rates. As with other lossless compression schemes, the amount of data reduction varies with the source, but it is generally between 30% and 70% for typical popular music and somewhat better than that for classical music and other sources with greater dynamic range. [2]

Hybrid mode

WavPack also incorporates a "hybrid" mode that still provides lossless restoration but creates two files: a relatively small, high-quality lossy file (.wv) that can be used by itself, and a "correction" file (.wvc) that, when combined with the lossy file, restores the original audio exactly. This allows lossy and lossless coding to be used together.

A similar "hybrid" feature is also offered by OptimFROG DualStream, MPEG-4 SLS and DTS-HD Master Audio.

Summary

History

David Bryant began developing WavPack in mid-1998, releasing version 1.0 on 15 August 1998. This first version compressed and decompressed audio losslessly, and it already offered one of the best trade-offs between compression efficiency and speed among lossless encoders. [3]

Version 2.0 followed very soon after, on 2 September 1998, and added lossy encoding. Data reduction came only from quantizing the prediction residual; no psychoacoustic masking model was applied to the stream.

Version 3.0, released on 12 September 1999, added a new "fast" mode (at the cost of a reduced compression ratio), compression of raw (headerless) PCM audio files, and error detection using a 32-bit cyclic redundancy check.

A feature added in late 3.x versions is the "hybrid" mode, in which the encoder generates a lossy file and a correction file that together can be decompressed back to a PCM stream of the same quality as the original. [citation needed] The author has also published a "roadmap" containing possible hints on future development. [4]

Support

Software

Some software supports the format natively (such as Jack! The Knife), while other programs require plugins. The official WavPack website offers plugins for Winamp, Nero Burning ROM, MediaChest 2.1, foobar2000 and several other applications, as well as a DirectShow filter. [5] Asunder allows ripping audio CDs directly into WavPack files.

Hardware

Native support:

Non-native support:

The WavPack website also includes a plugin that allows support for the format on the Roku PhotoBridge HD. [5]

Technology

To ensure high-speed operation, WavPack uses a predictor that is implemented entirely in integer math. [7] In its "fast" mode the prediction is simply the arithmetic extrapolation of the previous two samples. For example, if the previous two samples were −10 and 20, then the prediction would be 50. In the default mode a simple adaptive factor is added to weight the influence of the earlier sample on the prediction. In this example the resulting prediction could then vary between 20 (no influence) and 50 (full influence). The weight factor is constantly updated based on the audio data's changing spectral characteristics.
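The extrapolation described above can be sketched in a few lines of integer-only Python. This is an illustration rather than WavPack's actual code: the function names, the fixed-point `shift` of 10 bits, and the way the weight scales the slope are assumptions made for the example, and the rule that updates the weight is left out because the text does not specify it.

```python
def predict_fast(prev2, prev1):
    """Linear extrapolation from the previous two samples (integers only)."""
    return 2 * prev1 - prev2


def predict_weighted(prev2, prev1, weight, shift=10):
    """Extrapolation with a weight on the earlier sample's influence.

    weight runs from 0 (no influence) to 1 << shift (full influence).
    The fixed-point scaling by `shift` is an assumption of this sketch.
    """
    return prev1 + ((weight * (prev1 - prev2)) >> shift)


def residual(actual, prediction):
    """Error value passed on to the later coding stages."""
    return actual - prediction
```

With the previous samples −10 and 20, `predict_fast(-10, 20)` gives 50, while `predict_weighted(-10, 20, w)` moves from 20 at `w = 0` to 50 at `w = 1024`, matching the range given in the text.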

The prediction generated is then subtracted from the actual sample to be encoded to generate the error value. In mono mode this value is sent directly to the coder. However, stereo signals tend to have some correlation between the two channels that can be further exploited. Therefore, two error values are calculated that represent the difference and average of the left and right error values. In the "fast" mode of operation these two new values are simply sent to the coder instead of the left and right values. In the default mode, the difference value is always sent to the coder along with one of the other three values (average, left, or right). An adaptive algorithm continuously determines the most efficient of the three to send based on the changing balance of the channels.
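A difference/average pair can be made exactly reversible in integer arithmetic, which is what a lossless coder needs. The sketch below shows one common way to do this and is not taken from WavPack's source; the function names are hypothetical. It relies on the fact that the sum and the difference of two integers always share the same lowest bit, so the bit dropped by the halving can be recovered from the difference.

```python
def to_diff_avg(left, right):
    """Map a left/right pair of error values to (difference, average)."""
    diff = left - right
    avg = (left + right) >> 1  # floor average; the dropped bit equals diff's lowest bit
    return diff, avg


def from_diff_avg(diff, avg):
    """Invert to_diff_avg exactly, using only integer operations."""
    total = (avg << 1) | (diff & 1)  # restore the bit lost in the average
    left = (total + diff) >> 1
    right = (total - diff) >> 1
    return left, right
```

When the difference is sent together with the left or right value instead of the average, the missing channel follows directly from `right = left - diff` or `left = right + diff`.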

WavPack uses its own data encoder rather than Rice coding. Rice coding is the optimal bit coding for this type of data, and WavPack's encoder is less efficient, but only by about 0.15 bits/sample (less than 1% for 16-bit data). In exchange, it offers two advantages. First, WavPack's encoder does not require the data to be buffered ahead of encoding; instead it converts each sample directly to bitcodes. This is more computationally efficient, and it is better in applications where coding delay is critical. Second, it is easily adapted to lossy encoding, since all significant bits (except the implied "one" MSB) are transmitted directly. It is therefore possible to transmit, for example, only the 3 most significant bits (with sign) of each sample. In fact, it is possible to transmit only the sign and implied MSB for each sample, at an average of only 3.65 bits/sample.
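The effect of transmitting only the leading bits of each value can be illustrated with a small helper. This is not WavPack's bitstream coder; `keep_top_bits` is a hypothetical name, the bit count is assumed to include the leading "one" bit, and the helper truncates toward zero for simplicity.

```python
def keep_top_bits(value, bits):
    """Keep the sign and the `bits` most significant magnitude bits of value.

    Lower-order bits are zeroed (truncation toward zero). Counting the
    leading one bit within `bits` is an assumption of this sketch.
    """
    magnitude = abs(value)
    drop = max(magnitude.bit_length() - bits, 0)  # low bits to discard
    kept = (magnitude >> drop) << drop
    return -kept if value < 0 else kept
```

For example, `keep_top_bits(1000, 3)` yields 896, and `keep_top_bits(1000, 1)` yields 512, which corresponds to sending only the sign and the leading bit. Values that already fit in the requested number of bits pass through unchanged.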

This coding scheme is used to implement the "lossy" mode of WavPack. In the "fast" mode the output of the non-adaptive decorrelator is simply rounded to the nearest codable value for the specified number of bits. In the default mode the adaptive decorrelator is used (which reduces the average noise by about 1 dB), and both the current and the next sample are considered in choosing the better of the two available codes (which reduces noise by another 1 dB).

No floating-point arithmetic is used in WavPack's data path because, according to the author, integer operations are less susceptible to subtle chip-to-chip variations that could corrupt the lossless nature of the compression (the Pentium floating-point bug being an example). A lossless compressor that used floating-point math could generate different output when running on that faulty Pentium. Even disregarding actual bugs, floating-point math is complicated enough that there could be subtle differences between "correct" implementations that could cause trouble for this type of application. [8] A 32-bit error detection code is included in the generated streams to maintain user confidence in the integrity of WavPack's compression.
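One source of such differences is that floating-point addition is not associative, so two implementations that merely evaluate the same sum in a different order can disagree in the last bit. The snippet below is a generic illustration on IEEE 754 double precision and does not reproduce anything from WavPack; the values and the scaling by 10 are arbitrary choices for the example.

```python
# Same three values, summed in two different orders as floating point.
a, b, c = 0.1, 0.2, 0.3
float_left_first = (a + b) + c
float_right_first = a + (b + c)
floats_agree = float_left_first == float_right_first  # False for IEEE 754 doubles

# The same values scaled to integers (tenths): order no longer matters.
ia, ib, ic = 1, 2, 3
ints_agree = (ia + ib) + ic == ia + (ib + ic)  # always True
```

A mismatch of this size is harmless in most programs, but a lossless codec needs the encoder and every decoder to reproduce exactly the same predictions, which integer arithmetic guarantees.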

WavPack source code is portable, and has been compiled on several Unix and Unix-like operating systems (Linux, Mac OS X, Solaris, FreeBSD, OpenBSD, NetBSD, Compaq Tru64, HP-UX...) as well as Windows, DOS, Palm OS, and OpenVMS. It works on many architectures, including x86, ARM, PowerPC, AMD64, IA-64, SPARC, Alpha, PA-RISC, MIPS and Motorola 68k.

A cut-down version of WavPack was developed for the Texas Instruments TMS320 series of digital signal processors. It was aimed predominantly at encouraging manufacturers to incorporate WavPack compression and decompression into portable memory audio recorders. This version supported features that were applicable only to embedded applications (real-time stream compression, selectable compression rate) and dropped features that applied only to full computer systems (self-extraction, high compression modes, 32-bit floats). The TMS320 series DSPs are native integer devices and support WavPack well. Some "special" features of the full WavPack software were included, such as the ability to generate a correction "file" (stream), while others were excluded. The port was based on version 4.

WavPack support was added to WinZip starting with version 11.0 beta, released in October 2006. [9] This extension to the ZIP file format was included by PKWARE, the maintainers of the format, in the official APPNOTE.TXT description file starting with version 6.3.2, released on 28 September 2007. [10]

See also

References

  1. Changelog
  2. Heijden, Hans (11 July 2006). "Compression and speed of lossless audio formats". Retrieved 17 July 2009.
  3. Speek (7 February 2005). "Performance comparison of lossless audio compressors". Retrieved 17 July 2009.
  4. http://www.hydrogenaud.io/forums/index.php?s=&showtopic=50911&view=findpost&p=456571
  5. "WavPack downloads".
  6. "Sound Codecs, Rockbox Wiki".
  7. Bryant, David (21 March 2007). "Forum comment by developer". Retrieved 17 July 2009.
  8. Goldberg, David (March 1991). "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (PDF). ACM Computing Surveys. 23 (1): 5–48. doi:10.1145/103162.103163. Retrieved 20 January 2016.
  9. "WinZip - Additional Compression Methods Specification". WinZip International LLC. 15 November 2006. Retrieved 6 January 2008.
  10. "APPNOTE.TXT - .ZIP File Format Specification". PKWARE Inc. 28 September 2007. Retrieved 6 January 2008.