TwinVQ

TwinVQ (transform-domain weighted interleave vector quantization) is an audio compression technique developed by Nippon Telegraph and Telephone Corporation (NTT) Human Interface Laboratories (now Cyber Space Laboratories) in 1994. [1] [2] [3] [4] The compression technique has been used in both standardized and proprietary designs.

TwinVQ in MPEG-4

In the context of MPEG-4 Audio (MPEG-4 Part 3), TwinVQ is an audio codec optimized for coding at ultra-low bit rates of around 8 kbit/s.

TwinVQ is one of the object types defined in MPEG-4 Audio, published as subpart 4 of ISO/IEC 14496-3 (first published in 1999 as MPEG-4 Audio version 1). [5] [6] [7] [8] [9] This object type is based on a general audio transform coding scheme that is integrated with the AAC coding framework, a spectral flattening module, and a weighted interleave vector quantization module. The scheme reportedly has high coding gain at low bit rates and is potentially robust against channel errors and packet loss, since it uses neither variable-length coding nor adaptive bit allocation. It supports bit rate scalability, both through layered TwinVQ coding and in combination with scalable AAC.
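The interleave-and-quantize idea can be illustrated with a toy sketch. This is not the MPEG-4 TwinVQ algorithm or bitstream: the codebook, the weights, the coefficient values, and all function names are hypothetical, and spectral flattening is omitted. It only shows coefficients being interleaved into sub-vectors, each sub-vector being matched to a codebook entry under a weighted squared-error measure, and every index taking the same number of bits.

```python
from typing import List, Sequence

# Hypothetical 4-entry codebook of 2-dimensional code vectors (2 bits per index).
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]


def interleave(values: Sequence[float], n_sub: int) -> List[List[float]]:
    """Split values into n_sub sub-vectors; element i goes to sub-vector i % n_sub."""
    return [list(values[k::n_sub]) for k in range(n_sub)]


def deinterleave(subs: Sequence[Sequence[float]]) -> List[float]:
    """Inverse of interleave: put each sub-vector back on its strided positions."""
    n_sub = len(subs)
    out = [0.0] * sum(len(s) for s in subs)
    for k, sub in enumerate(subs):
        out[k::n_sub] = sub
    return out


def nearest_index(vec, weights, codebook) -> int:
    """Index of the code vector with the smallest weighted squared error."""
    def dist(code):
        return sum(w * (x - c) ** 2 for x, c, w in zip(vec, code, weights))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode(coeffs, weights, codebook, n_sub) -> List[int]:
    """Quantize interleaved sub-vectors; the output is one fixed-width index each."""
    pairs = zip(interleave(coeffs, n_sub), interleave(weights, n_sub))
    return [nearest_index(v, w, codebook) for v, w in pairs]


def decode(indices, codebook) -> List[float]:
    """Look up each index and undo the interleaving."""
    return deinterleave([codebook[i] for i in indices])


# Made-up coefficients and uniform weights, for illustration only.
coeffs = [0.9, 0.1, 1.1, -0.8, 1.2, -0.1, -0.9, 0.7]
weights = [1.0] * len(coeffs)

indices = encode(coeffs, weights, CODEBOOK, n_sub=4)
bits_per_frame = len(indices) * (len(CODEBOOK) - 1).bit_length()
reconstructed = decode(indices, CODEBOOK)
```

Because every frame in this sketch produces the same number of fixed-width indices, a corrupted index affects only its own sub-vector, which is one way to read the robustness argument above.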

Note that some commercial products, such as MetaSound (Voxware), [10] [11] SoundVQ (Yamaha), [12] [13] [14] and SolidAudio (Hagiwara), are also based on TwinVQ technology, but their configurations differ from that of MPEG-4 TwinVQ. [6]

TwinVQ as a proprietary audio format

A proprietary audio compression format called TwinVQ was developed by Nippon Telegraph and Telephone Corporation (NTT) at its Human Interface Laboratories [15] [16] and marketed by Yamaha under the name SoundVQ. [13] NTT also offered TwinVQ demonstration software for non-commercial purposes: the NTT TwinVQ Encoder and TwinVQ Player, encoder and decoder APIs, and a header file format. [17] [18] The filename extension is .vqf.

TwinVQ uses Twin vector quantization. The proprietary TwinVQ codec supports constant bit rate encoding at 80, 96, 112, 128, 160 and 192 kbit/s. It was claimed that TwinVQ files are about 30 to 35% smaller than MP3 files of equivalent quality. For example, a 96 kbit/s TwinVQ file allegedly has roughly the same quality as a 128 kbit/s MP3 file. The higher quality is achieved at the cost of higher processor usage.
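The size comparison for the 96 and 128 kbit/s example can be worked through with a short calculation. This is a minimal sketch, assuming 1 kbit = 1000 bits, a hypothetical four-minute track, and audio payload only (no headers or metadata); the function names are illustrative. For this particular pair of rates the reduction comes out at 25%; the 30 to 35% figure is the claim as reported and is not derived here.

```python
def cbr_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Payload size of a constant-bit-rate stream, ignoring headers and metadata."""
    return bitrate_kbps * 1000 * duration_s / 8


def relative_saving(smaller_kbps: float, larger_kbps: float) -> float:
    """Fraction by which the lower-rate file is smaller than the higher-rate one."""
    return 1 - smaller_kbps / larger_kbps


duration_s = 240  # hypothetical four-minute track

vqf_bytes = cbr_size_bytes(96, duration_s)    # 2,880,000 bytes
mp3_bytes = cbr_size_bytes(128, duration_s)   # 3,840,000 bytes
saving = relative_saving(96, 128)             # 0.25, i.e. 25% smaller
```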

Yamaha marketed TwinVQ as an alternative to MP3, but the format never became very popular. This could be attributed to the proprietary nature of the format: third-party software was scarce and there was no hardware support. In addition, encoding was extremely slow, and little music was available in TwinVQ format. As other MP3 alternatives emerged, TwinVQ quickly became obsolete.

The proprietary version of TwinVQ can also be used for speech encoding. NTT published compression technology designed specifically for voice. The NTT TwinVQ implementation supported sampling frequencies as low as 8 kHz or 11.025 kHz and bit rates as low as 8 kbit/s. [14] [19] [20] [21] [22]

Software support

Official

NTT in Japan once offered a player and encoder for download on its website. [23] It was not as successful as the Yamaha version (see below) and can now be found at ReallyRareWares. [24]

Yamaha released an English-language player application called SoundVQ. [25] Several third-party players also supported the format, including Winamp (with the appropriate input plugin) and K-Jöfol (which supported the format natively).

Third-party software

The format was reverse-engineered in 2009 by the FFmpeg project, and decoding of .vqf files is supported by the open-source libavcodec library. [26] As a result, players that use the library, such as VLC media player, can play the format.
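As an illustration of the libavcodec support described above, a .vqf file could be converted to WAV by calling the `ffmpeg` command-line tool from a script. This is a minimal sketch, assuming an `ffmpeg` build on the PATH that includes the VQF demuxer and TwinVQ decoder; the file names and helper function names are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_decode_command(src: str, dst: str) -> List[str]:
    """Return the ffmpeg invocation that reads src and writes dst.

    ffmpeg infers the output format from the dst extension (e.g. .wav).
    """
    return ["ffmpeg", "-i", src, dst]


def decode_vqf(src: str, dst: str) -> None:
    """Decode a .vqf file with ffmpeg; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg executable not found on PATH")
    subprocess.run(build_decode_command(src, dst), check=True)


# Hypothetical file names; call decode_vqf("song.vqf", "song.wav") to run it.
command = build_decode_command("song.vqf", "song.wav")
```

Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.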

Some older versions of Nero Burning ROM are able to encode to TwinVQ/VQF.

Some CD-ripping and converter software also supports encoding to the .vqf format.

References

  1. Nippon Telegraph and Telephone Corp. (1995). "R&D activities of NTT's Research and Development Headquarters in 1994 - An Integral Multimedia Capability - Compression Encoding of Music with TwinVQ (archived website)". Archived from the original on 1997-10-09. Retrieved 2010-08-06.
  2. NTT (1996). "Welcome to the home of TwinVQ! (archived website) (Japanese)". Archived from the original on 2000-08-30. Retrieved 2010-08-06.
  3. "AES E-Library - Transform-Domain Weighted Interleave Vector Quantization (TwinVQ)". Audio Engineering Society. 1996. Retrieved 2010-08-06.
  4. "Our research of Audio". NTT HI Labs. 1997. Archived from the original on 1999-01-28. Retrieved 2010-08-06.
  5. ISO (1999). "ISO/IEC 14496-3:1999 - Information technology -- Coding of audio-visual objects -- Part 3: Audio". ISO. Retrieved 2009-10-09.
  6. D. Thom, H. Purnhagen, and the MPEG Audio Subgroup (October 1998). "MPEG Audio FAQ Version 9 - MPEG-4 - an introduction to MPEG-4 Audio". chiariglione.org. Retrieved 2009-10-06.
  7. ISO/IEC JTC 1/SC 29/WG 11 (July 1999), ISO/IEC 14496-3:/Amd.1 - Final Committee Draft - MPEG-4 Audio Version 2 (PDF), archived from the original (PDF) on 2012-08-01, retrieved 2009-10-07.
  8. Heiko Purnhagen (2001-06-01). "The MPEG-4 Audio Standard: Overview and Applications". Heiko Purnhagen. Retrieved 2009-10-07.
  9. ISO/IEC JTC1/SC29/WG11 N2203 (March 1998). "MPEG-4 Audio (Final Committee Draft 14496-3)". Heiko Purnhagen. Retrieved 2009-10-07.
  10. Business Wire (1996-12-11). "Voxware expands technology offerings & signs licensing agreement with NTT". The Free Library. Retrieved 2009-10-06.
  11. Business Wire (1997-05-13). "IBM Licenses Voxware's MetaSound Technology for Use in Multimedia Products". The Free Library. Retrieved 2009-10-06.
  12. YAMAHA CORPORATION (2000). "Yamaha SoundVQ". Archived from the original on 2003-02-27. Retrieved 2009-10-06.
  13. YAMAHA CORPORATION (1997). "Yamaha 'SoundVQ'". Archived from the original on 1998-12-07. Retrieved 2010-08-06.
  14. NTT-East Multimedia Business Department (2000-03-31). "About TwinVQ". Archived from the original on 2000-08-17. Retrieved 2009-10-06.
  15. "Music compression technology "TwinVQ" (archived website) (Japanese)". 1996. Archived from the original on 1997-06-27. Retrieved 2010-08-06.
  16. "About TwinVQ (archived website) (Japanese)". 1997. Archived from the original on 1997-07-25. Retrieved 2010-08-06.
  17. NTT-East Multimedia Business Department (2008). "TwinVQ Software". Archived from the original on 2005-11-09. Retrieved 2009-10-07.
  18. NTT-East Multimedia Business Department (2002). "TwinVQ - libraries and sample programs". Archived from the original on 2002-04-11. Retrieved 2009-10-07.
  19. NTT-East Multimedia Business Department (2000). "TwinVQ F.A.Q." Archived from the original on 2000-08-19. Retrieved 2009-10-06.
  20. NTT (1998-03-24). "TwinVQ (archived website)". Archived from the original on 1998-04-30. Retrieved 2009-10-06.
  21. MultimediaWiki (2009). "VQF". MultimediaWiki. Retrieved 2009-10-07.
  22. "TwinVQ F.A.Q. (archived website) (Japanese)". 1997. Archived from the original on 1997-07-25. Retrieved 2010-08-06.
  23. "TwinVQ". 2004-10-12. Archived from the original on 2004-10-12. Retrieved 2017-10-01.
  24. "ReallyRareWares". www.rarewares.org. Retrieved 2017-10-01.
  25. "ReallyRareWares". www.rarewares.org. Retrieved 2017-10-01.
  26. "TwinVQ decoder source-code". Archived from the original on 2012-03-23. Retrieved 2009-08-23.