Tandem Free Operation

Last updated April 24, 2024

Tandem Free Operation (TFO) is part of ETSI's 3GPP standard specifications^[1] and has been included from Release 99 (R99) of the specifications onwards.

Overview

In traditional GSM networks, a call between two Mobile Stations (MS) involves a dual encoding/decoding process. Speech is first encoded in the originating MS, converted to G.711 in the local transcoder, converted back to a GSM codec in the remote transcoder, and finally decoded back to speech at the terminating MS. In this configuration the two transcoders operate in tandem, which degrades voice quality. If both MS use the same codec, this degradation can be avoided by removing the two transcoding operations from the voice path.

Details

Broadly, the equipment on the path between the two mobile handsets can be divided into two types:

Active voice equipment, which performs the transcoding operation, either from a GSM/UMTS speech codec (e.g. GSM-EFR, GSM-AMR) to G.711/PCM or in the opposite direction.
Passive equipment (or In-Path Equipment), which does not transcode but changes the voice signal in some way. Examples include line echo cancellers, attenuation algorithms, and any other equipment that changes the voice samples.

Active Equipment

This equipment typically sits at the edge of the core network and acts as a gateway between the mobile core network (IP based) and the digital PSTN. The mobile core IP network carries voice encoded with one of the GSM/UMTS codecs (e.g. GSM AMR). When this voice has to be carried over a G.711/PCM based PSTN network, the gateway transcodes from the GSM/UMTS codec to G.711 PCM samples, which results in some loss of voice quality.

A single G.711/PCM sample is an 8-bit value, and samples are taken at a rate of 8 kHz. Hence the bandwidth requirement is 8 bits × 8,000 samples/s = 64 kbit/s, and each bit position of the sample carries 8 kbit/s.
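The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the calculation ignores any framing or signalling overhead that a real embedded stream would add.

```python
import math

SAMPLE_RATE_HZ = 8000   # G.711 sampling rate, from the text
BITS_PER_SAMPLE = 8     # size of one G.711/PCM sample, from the text


def channel_rate_bps() -> int:
    """Total rate of one G.711/PCM channel in bit/s."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE


def bit_positions_needed(stream_rate_bps: int) -> int:
    """Bit positions per sample needed to carry a stream of the given rate.

    Each bit position carries SAMPLE_RATE_HZ bit/s. Overhead is ignored
    (an assumption made to keep the example simple).
    """
    return math.ceil(stream_rate_bps / SAMPLE_RATE_HZ)
```

For example, `bit_positions_needed(16000)` gives 2, matching the figure of two bits for a 16 kbit/s stream.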

TFO is a mechanism that steals the least significant bits (LSBs) of the PCM samples and embeds the bits of the encoded stream in them. Since most GSM/UMTS codec rates are in the range of 8 kbit/s to 16 kbit/s (with higher rates of up to 32 kbit/s for wideband codecs sampled at 16 kHz), typically only 1 or 2 of the 8 bits need to be stolen. This is important because, if the TFO connection breaks, the remaining most significant bits (MSBs) can still carry the transcoded G.711/PCM sample values. The degradation caused by losing 1 or 2 LSBs is small.
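The bit-stealing idea can be sketched as follows. This is a simplified illustration, not the frame format defined in the specification: the function name `embed_bits`, the MSB-first bit ordering, and the absence of any frame synchronisation are all assumptions made for the example.

```python
from typing import List


def embed_bits(pcm: bytes, payload_bits: List[int], n_lsbs: int = 2) -> bytes:
    """Overwrite the n_lsbs lowest bits of each 8-bit PCM sample with payload bits.

    Hypothetical helper: bits are written MSB-first into each stolen field,
    and samples beyond the end of the payload are left untouched.
    """
    out = bytearray(pcm)
    mask = (1 << n_lsbs) - 1
    pos = 0
    for i in range(len(out)):
        if pos >= len(payload_bits):
            break  # payload exhausted, remaining samples stay as they are
        chunk = 0
        for _ in range(n_lsbs):
            bit = payload_bits[pos] if pos < len(payload_bits) else 0
            chunk = (chunk << 1) | bit
            pos += 1
        # Keep the upper bits of the sample, replace only the stolen LSBs.
        out[i] = (out[i] & ~mask & 0xFF) | chunk
    return bytes(out)
```

Because only the low bits are replaced, the upper six bits of every sample still carry the transcoded G.711 value, which is what allows a fallback to ordinary PCM if the embedded stream is lost.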

The remote transcoder then extracts the encoded stream from the LSBs of the PCM samples, reassembles it into codec frames, and sends these frames onward as though it had encoded them itself, thereby avoiding two rounds of transcoding.
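The extraction step on the receiving side can be sketched in the same simplified way. The function name `extract_bits` and the MSB-first ordering within the stolen bits are assumptions for illustration; the real frame layout and synchronisation are defined in the specification and are not reproduced here.

```python
from typing import List


def extract_bits(pcm: bytes, n_bits: int, n_lsbs: int = 2) -> List[int]:
    """Read up to n_bits payload bits back out of the low bits of PCM samples.

    Hypothetical helper: takes the n_lsbs lowest bits of each 8-bit sample,
    MSB-first, and stops once n_bits bits have been collected.
    """
    bits: List[int] = []
    for sample in pcm:
        for shift in range(n_lsbs - 1, -1, -1):
            if len(bits) == n_bits:
                return bits
            bits.append((sample >> shift) & 1)
    return bits
```

The recovered bits would then be grouped into codec frames and forwarded without being decoded and re-encoded.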

Flow

Transcoder equipment that supports TFO runs a well-defined state machine. For each sequence of events, the state machine table defines the actions to be performed.
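A table-driven state machine of this kind can be sketched as below. The state names, event names, and actions are hypothetical placeholders chosen for illustration; they are not taken from the specification, whose actual table is considerably larger.

```python
from typing import Dict, List, Tuple

# (current state, event) -> (next state, actions to perform)
# All names here are illustrative placeholders, not the specification's.
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, List[str]]] = {
    ("IDLE", "start"): ("CONTACTING", ["send_request"]),
    ("CONTACTING", "request_received"): ("CONTACTING", ["send_ack"]),
    ("CONTACTING", "ack_received"): ("OPERATING", ["start_embedding_frames"]),
    ("OPERATING", "sync_lost"): ("IDLE", ["resume_transcoding"]),
}


def step(state: str, event: str) -> Tuple[str, List[str]]:
    """Look up the transition; unknown events leave the state unchanged."""
    next_state, actions = TRANSITIONS.get((state, event), (state, []))
    return next_state, list(actions)
```

Keeping the behaviour in a lookup table rather than in nested conditionals mirrors the way such a specification is written: each row pairs an event in a given state with the actions to perform.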

As part of this sequence, the local transcoder sends TFO In-band Signalling messages (IS_Messages) in the LSBs of the PCM samples. The protocol is defined in detail in the specification documents.

The remote transcoder that receives these messages acknowledges (ACKs) them with its own IS_Messages. After this initial exchange, the two transcoders exchange their capabilities (the codecs they support, etc.). Once a common codec has been agreed, both start streaming PCM samples whose LSBs contain the encoded stream.
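The capability exchange ends with both sides settling on one codec. A minimal sketch of such a decision is shown below; the shared ranking `HYPOTHETICAL_RANKING` and the function name are assumptions for illustration, and the actual selection rules are those given in the specification. A fixed ranking known to both ends is used here so that each side reaches the same answer independently.

```python
from typing import Optional, Set

# Hypothetical shared ranking, most preferred first. The order is an
# assumption for this example, not the rule defined in the specification.
HYPOTHETICAL_RANKING = ["AMR-WB", "AMR", "EFR", "FR", "HR"]


def choose_common_codec(local: Set[str], remote: Set[str]) -> Optional[str]:
    """Return the highest-ranked codec supported by both ends, if any."""
    common = local & remote
    for codec in HYPOTHETICAL_RANKING:
        if codec in common:
            return codec
    return None  # no common codec: keep transcoding via G.711
```

If no common codec exists, the function returns `None` and the call simply continues with ordinary tandem transcoding.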

In-Path Equipment (IPE)

As described earlier, this equipment does not transcode. It typically sits in the path between the two transcoders. To ensure that the TFO stream embedded in the LSBs of the PCM samples is not altered by this equipment, it too has to be aware of TFO.

The specifications define the role of such equipment. In a nutshell, it has to detect TFO traffic (by checking for IS_Messages on its input) and ensure that this traffic passes to the output unmodified.
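The detect-and-bypass behaviour can be sketched as follows. Everything specific in this example is an assumption: `HYPOTHETICAL_PATTERN` stands in for whatever bit sequence identifies an IS_Message, the detector looks only at the lowest bit of each sample, and `alter_samples` is a stand-in for real processing such as echo cancellation.

```python
from typing import List

# Placeholder bit pattern standing in for an IS_Message marker.
# The real message format is defined in the specification.
HYPOTHETICAL_PATTERN: List[int] = [0, 1, 0, 1, 0, 1, 1, 0]


def tfo_detected(pcm: bytes, pattern: List[int] = HYPOTHETICAL_PATTERN) -> bool:
    """Return True if the pattern appears in the stream of sample LSBs."""
    bits = [s & 1 for s in pcm]
    n = len(pattern)
    return any(bits[i:i + n] == pattern for i in range(len(bits) - n + 1))


def alter_samples(pcm: bytes) -> bytes:
    """Stand-in for signal processing that changes the low bits of samples."""
    return bytes(s & 0xFC for s in pcm)


def ipe_process(pcm: bytes) -> bytes:
    """Pass TFO traffic through untouched; otherwise process as usual."""
    if tfo_detected(pcm):
        return pcm
    return alter_samples(pcm)
```

The key property is that once TFO traffic is recognised, the output is bit-for-bit identical to the input, so the embedded stream survives the hop.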

References

[1] "3GPP Specification detail 28.062".
