Variable-Rate Multimode Wideband

Variable-Rate Multimode Wideband (VMR-WB)
Internet media type: audio/VMR-WB
Developed by: 3GPP2
Type of format: Lossy audio
Contained by: 3G2, RTP
Extended from: AMR-WB/G.722.2
Website: 3GPP2 standards

Variable-Rate Multimode Wideband (VMR-WB) is a source-controlled, variable-rate, multimode codec designed for robust encoding and decoding of wideband and narrowband speech. Its operation is controlled both by the characteristics of the speech signal (source-controlled rate selection) and by the traffic conditions of the network (network-controlled mode switching). Depending on the traffic conditions and the desired quality of service (QoS), one of the 4 operational modes is used. All operating modes of the existing VMR-WB standard are fully compliant with cdma2000 rate-set II. VMR-WB modes 0, 1, and 2 are cdma2000 native modes, with mode 0 providing the highest quality and mode 2 the lowest average data rate (ADR). VMR-WB mode 3 is the AMR-WB interoperable mode. It operates at an ADR slightly higher than that of mode 0 and, when interconnected with AMR-WB at 12.65 kbit/s, provides quality equal to or better than that of AMR-WB at 12.65 kbit/s.
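The mode structure described above can be captured in a small lookup table. The following Python sketch is illustrative only: the names `VmrWbMode`, `MODES`, `native_modes`, and `mode_for_peer` are hypothetical and do not come from the standard, and the peer-based selection rule is a simplifying assumption rather than the codec's actual mode-switching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmrWbMode:
    number: int
    rate_set: str               # cdma2000 rate set the mode complies with
    amr_wb_interoperable: bool  # True only for the AMR-WB interoperable mode


# Modes 0-2 are cdma2000 native modes; mode 3 interoperates with AMR-WB.
MODES = {
    0: VmrWbMode(0, "II", False),
    1: VmrWbMode(1, "II", False),
    2: VmrWbMode(2, "II", False),
    3: VmrWbMode(3, "II", True),
}


def native_modes():
    """Return the numbers of the cdma2000 native modes."""
    return [m.number for m in MODES.values() if not m.amr_wb_interoperable]


def mode_for_peer(peer_codec, preferred_native=0):
    """Hypothetical helper: pick mode 3 for an AMR-WB peer, else a native mode."""
    if peer_codec == "AMR-WB":
        return 3
    if preferred_native not in native_modes():
        raise ValueError(f"not a native mode: {preferred_native}")
    return preferred_native
```

For example, `mode_for_peer("AMR-WB")` selects mode 3, while `mode_for_peer("VMR-WB", 2)` keeps the lowest-ADR native mode.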

A cdma2000 rate-set I compliant mode has since been added to the coder as mode 4. Its average bitrate is 6.1 kbit/s, with a maximum of 8.55 kbit/s. The source coding bitrates are 8.55, 4.0, 2.0, and 0.8 kbit/s for Rate-Set I and 13.3, 6.2, 2.7, and 1.0 kbit/s for Rate-Set II. VMR-WB uses a 16 kHz sampling frequency. [1] [2] The algorithmic delay is 33.75 ms.
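To make these figures concrete, the sketch below converts each source coding bitrate into bits per frame and the algorithmic delay into samples. The 20 ms frame duration is an assumption not stated in the text above, and the function names are hypothetical.

```python
FRAME_MS = 20            # assumed frame duration in milliseconds
SAMPLE_RATE_HZ = 16_000  # sampling frequency given in the text

# Source coding bitrates in kbit/s, as listed in the text.
RATE_SETS_KBPS = {
    "I": [8.55, 4.0, 2.0, 0.8],
    "II": [13.3, 6.2, 2.7, 1.0],
}


def bits_per_frame(kbps, frame_ms=FRAME_MS):
    """kbit/s multiplied by milliseconds gives bits; round off float error."""
    return round(kbps * frame_ms)


def samples(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples spanned by a duration given in milliseconds."""
    return round(duration_ms * sample_rate_hz / 1000)


for name, rates in RATE_SETS_KBPS.items():
    print(name, [bits_per_frame(r) for r in rates])
print("delay in samples:", samples(33.75))
```

Under the 20 ms assumption, the Rate-Set II full rate of 13.3 kbit/s corresponds to 266 bits per frame, and the 33.75 ms algorithmic delay corresponds to 540 samples at 16 kHz.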

VMR-WB can also be stored in the 3GPP2 container file format, 3G2.

VMR-WB was designed by Nokia and VoiceAge. It is based on AMR-WB. [2]

Related Research Articles

Speex is a free software audio compression codec tuned specifically for the reproduction of human speech. It may be used in VoIP applications and podcasts. It is based on the CELP speech coding algorithm. Its creators claim Speex to be free of any patent restrictions, and it is licensed under the revised (3-clause) BSD license. It may be used with the Ogg container format or transmitted directly over UDP/RTP. It may also be used with the FLV container format.

MPEG-4 Part 3 or MPEG-4 Audio is the third part of the ISO/IEC MPEG-4 international standard developed by Moving Picture Experts Group. It specifies audio coding methods. The first version of ISO/IEC 14496-3 was published in 1999.

The Adaptive Multi-Rate (AMR) audio codec is an audio compression format optimized for speech coding. AMR is a multi-rate narrowband speech codec that encodes narrowband (200–3400 Hz) signals at variable bit rates ranging from 4.75 to 12.2 kbit/s, with toll-quality speech starting at 7.4 kbit/s.

Enhanced Full Rate or EFR or GSM-EFR or GSM 06.60 is a speech coding standard that was developed in order to improve the quality of GSM.

Full Rate was the first digital speech coding standard used in the GSM digital mobile phone system. It uses linear predictive coding (LPC). The bit rate of the codec is 13 kbit/s, or 1.625 bits/audio sample. The quality of the coded speech is quite poor by modern standards, but at the time of development it was a good compromise between computational complexity and quality, requiring only on the order of a million additions and multiplications per second. The codec is still widely used in networks around the world. Gradually FR will be replaced by Enhanced Full Rate (EFR) and Adaptive Multi-Rate (AMR) standards, which provide much higher speech quality with lower bit rate.

Half Rate is a speech coding system for GSM, developed in the early 1990s.

Adaptive Multi-Rate Wideband (AMR-WB) is a patented wideband speech audio coding standard developed based on Adaptive Multi-Rate encoding, using a similar methodology to algebraic code-excited linear prediction (ACELP). AMR-WB provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz compared to narrowband speech coders which in general are optimized for POTS wireline quality of 300–3400 Hz. AMR-WB was developed by Nokia and VoiceAge and it was first specified by 3GPP.


G.722 is an ITU-T standard 7 kHz wideband audio codec operating at 48, 56 and 64 kbit/s. It was approved by ITU-T in November 1988. Technology of the codec is based on sub-band ADPCM (SB-ADPCM). The corresponding narrow-band codec based on the same technology is G.726.

Extended Adaptive Multi-Rate – Wideband (AMR-WB+) is an audio codec that extends AMR-WB. It adds support for stereo signals and higher sampling rates. Another main improvement is the use of transform coding in addition to ACELP, which greatly improves generic audio coding. Automatic switching between transform coding and ACELP provides both good speech and good audio quality at moderate bit rates.

3GP is a multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS multimedia services. It is used on 3G mobile phones but can also be played on some 2G and 4G phones.

Enhanced Variable Rate CODEC (EVRC) is a speech codec used in CDMA networks. It was developed in 1995 to replace the QCELP vocoder which used more bandwidth on the carrier's network, thus EVRC's primary goal was to offer the mobile carriers more capacity on their networks while not increasing the amount of bandwidth or wireless spectrum needed. EVRC uses RCELP technology.

Selectable Mode Vocoder (SMV) is a variable bitrate speech coding standard used in CDMA2000 networks. SMV provides multiple modes of operation that are selected based on input speech characteristics.

Enhanced Variable Rate Codec B (EVRC-B) is a speech codec used by CDMA networks. EVRC-B is an enhancement to EVRC. It compresses each 20 milliseconds of speech input, sampled at 8000 Hz with 16 bits per sample, into an output frame of one of four sizes: Rate 1 (171 bits), Rate 1/2 (80 bits), Rate 1/4 (40 bits), or Rate 1/8 (16 bits).
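The frame sizes above imply a bitrate and a compression ratio for each rate. The following sketch works through that arithmetic using only the figures stated in the paragraph; the constant and function names are hypothetical.

```python
FRAME_MS = 20          # frame duration in milliseconds
SAMPLE_RATE_HZ = 8000  # input sampling rate
BITS_PER_SAMPLE = 16   # input sample width

# Output frame sizes in bits for each EVRC-B rate.
FRAME_BITS = {"1": 171, "1/2": 80, "1/4": 40, "1/8": 16}


def input_bits_per_frame():
    """Uncompressed bits in one frame: 160 samples of 16 bits each."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000 * BITS_PER_SAMPLE


def bitrate_kbps(frame_bits):
    """Bits per millisecond is numerically the same as kbit/s."""
    return frame_bits / FRAME_MS


def compression_ratio(frame_bits):
    """Ratio of uncompressed input bits to coded output bits per frame."""
    return input_bits_per_frame() / frame_bits


for rate, bits in FRAME_BITS.items():
    print(rate, bitrate_kbps(bits), round(compression_ratio(bits), 1))
```

The resulting bitrates (8.55, 4.0, 2.0, and 0.8 kbit/s) coincide with the cdma2000 Rate-Set I source coding bitrates.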


G.729.1 is an 8-32 kbit/s embedded speech and audio codec providing bitstream interoperability with G.729, G.729 Annex A and G.729 Annex B. Its official name is G.729-based embedded variable bit rate codec: An 8-32 kbit/s scalable wideband coder bitstream interoperable with G.729. It was introduced in 2006.

G.719 is an ITU-T standard audio coding format providing high quality, moderate bit rate wideband audio coding at low computational load. It was produced through a collaboration between Polycom and Ericsson.

Wideband audio, also known as wideband voice or HD voice, is high definition voice quality for telephony audio, contrasted with standard digital telephony "toll quality". It extends the frequency range of audio signals transmitted over telephone lines, resulting in higher quality speech. The range of the human voice extends from 100 Hz to 17 kHz but traditional, voiceband or narrowband telephone calls limit audio frequencies to the range of 300 Hz to 3.4 kHz. Wideband audio relaxes the bandwidth limitation and transmits in the audio frequency range of 50 Hz to 7 kHz. In addition, some wideband codecs may use a higher audio bit depth of 16 bits to encode samples, also resulting in much better voice quality.


G.718 is an ITU-T Recommendation embedded scalable speech and audio codec providing high quality narrowband speech over the lower bit rates and high quality wideband speech over the complete range of bit rates. In addition, G.718 is designed to be highly robust to frame erasures, thereby enhancing the speech quality when used in Internet Protocol (IP) transport applications on fixed, wireless and mobile networks. Despite its embedded nature, the codec also performs well with both narrowband and wideband generic audio signals. The codec has an embedded scalable structure, enabling maximum flexibility in the transport of voice packets through IP networks of today and in future media-aware networks. In addition, the embedded structure of G.718 will easily allow the codec to be extended to provide a superwideband and stereo capability through additional layers which are currently under development in ITU-T Study Group 16. The bitstream may be truncated at the decoder side or by any component of the communication system to instantaneously adjust the bit rate to the desired value without the need for out-of-band signalling. The encoder produces an embedded bitstream structured in five layers corresponding to the five available bit rates: 8, 12, 16, 24 & 32 kbit/s.
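The idea of truncating an embedded bitstream can be illustrated with a short sketch that picks how many layers fit within a target bit rate. This is a simplified model, not the G.718 bitstream format: the function name `layers_to_keep` is hypothetical, and the sketch only reasons about the five layer rates listed above.

```python
# Cumulative bit rates (kbit/s) reached after each of the five layers.
LAYER_RATES_KBPS = (8, 12, 16, 24, 32)


def layers_to_keep(target_kbps):
    """Return (layer count, resulting kbit/s) for the highest rate that fits.

    Layers are embedded, so dropping trailing layers leaves a valid
    lower-rate stream; the core layer is always required.
    """
    kept = [rate for rate in LAYER_RATES_KBPS if rate <= target_kbps]
    if not kept:
        raise ValueError("target is below the 8 kbit/s core layer")
    return len(kept), kept[-1]


print(layers_to_keep(20))
```

With a 20 kbit/s budget, for instance, the first three layers are kept and the stream runs at 16 kbit/s.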


Voice over LTE (VoLTE) is an LTE high-speed wireless communication standard for voice calls using mobile phones and data terminals. VoLTE has up to three times more voice and data capacity than older 3G UMTS and up to six times more than 2G GSM. It uses less bandwidth because VoLTE's packet headers are smaller than those of unoptimized VoIP/LTE. VoLTE calls are usually charged at the same rate as other calls.

Enhanced Voice Services (EVS) is a superwideband speech audio coding standard that was developed for VoLTE. It offers up to 20 kHz audio bandwidth and is highly robust to delay jitter and packet loss thanks to its channel-aware coding and improved packet loss concealment. It was developed in 3GPP and is described in 3GPP TS 26.441. The application areas of EVS include improved telephony and teleconferencing, audiovisual conferencing services, and streaming audio. Source code for both the decoder and the encoder, written in ANSI C, is available as 3GPP TS 26.442 and is updated regularly. Samsung uses the term HD+ for calls made using EVS.

References

  1. PacketCable 2.0 - Codec and Media Specification - PKT-SP-CODEC-MEDIA-I07-090702 (PDF), 2009, p. 87, retrieved 2011-06-23
  2. VoiceAge website: VMR-WB - Source-controlled Variable Bit Rate Wideband Compression, archived from the original on 2008-04-19, retrieved 2011-06-23