FS-1016

FS-1016 (also called FED-STD-1016) is a deprecated secure telephony speech encoding standard based on code-excited linear prediction (CELP), developed by the United States Department of Defense and finalized on February 14, 1991. [1]

Unlike the vocoder used in FS-1015, CELP produces more natural-sounding speech. However, because of its low bit rate and early development (1991), the decoded speech is noisy and its quality is below that of commercial cellular speech codecs such as AMR. FS-1016 is no longer used, since its successor, MELP, provides better performance in all applications.

Technical details

The codec operates at a bit rate of 4.8 kbit/s. Its computational complexity is 19 MIPS, and it requires 1.5 kilobytes of RAM. The frame size is 30 ms, and a look-ahead of 7.5 ms is also required.
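The figures above imply a fixed bit budget per frame, which can be checked with a little arithmetic. The following is a minimal sketch rather than part of the standard's text; the 8 kHz sampling rate is an assumption (the usual narrowband telephony rate) that is not stated in this article, and all constant names are illustrative.

```python
# Figures taken from the text above.
BIT_RATE_BPS = 4800      # 4.8 kbit/s
FRAME_MS = 30            # frame size
LOOKAHEAD_MS = 7.5       # look-ahead

# Assumption (not stated in the article): 8 kHz narrowband sampling.
SAMPLE_RATE_HZ = 8000

# Bits available to encode one frame.
bits_per_frame = BIT_RATE_BPS * FRAME_MS // 1000

# Input samples covered by one frame, under the sampling-rate assumption.
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000

# Signal the encoder must buffer before it can emit a frame.
buffering_delay_ms = FRAME_MS + LOOKAHEAD_MS

print(bits_per_frame, samples_per_frame, buffering_delay_ms)
```

Under these assumptions each 30 ms frame is represented by 144 bits, and the encoder has to buffer 37.5 ms of signal before it can produce a frame.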

The variant of CELP used in FS-1016 employs a so-called ternary codebook, meaning that every excitation amplitude is +1, −1 or 0. A gain is calculated for each subframe and sent to the receiver.
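The idea of a ternary codebook with a per-subframe gain can be illustrated with a simplified sketch. This is not the FS-1016 search procedure: a real CELP encoder passes each codevector through the synthesis filter before comparing it with the target, and the standard defines its own codebook contents and sizes. Here the filter is omitted, and the function names, codebook size, and sparsity are hypothetical choices made only for illustration.

```python
import random


def make_ternary_codebook(num_vectors, length, zero_prob=0.75, seed=0):
    """Build a hypothetical codebook whose entries are only +1, -1 or 0."""
    rng = random.Random(seed)
    return [
        [0 if rng.random() < zero_prob else rng.choice((-1, 1))
         for _ in range(length)]
        for _ in range(num_vectors)
    ]


def search_codebook(target, codebook):
    """Return (index, gain) of the codevector that best matches target.

    For each codevector c, the least-squares gain is <t, c> / <c, c>,
    and the squared error it removes is <t, c>^2 / <c, c>.
    The synthesis filter used by real CELP coders is omitted here.
    """
    best_index, best_gain, best_score = None, 0.0, -1.0
    for index, vec in enumerate(codebook):
        energy = sum(v * v for v in vec)
        if energy == 0:
            continue  # an all-zero codevector cannot match anything
        corr = sum(t * v for t, v in zip(target, vec))
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


# Usage: a tiny hand-written ternary codebook and a scaled target.
small_book = [[1, 0, -1, 0], [0, 1, 0, 0], [1, 1, 0, -1]]
target = [2.0 * v for v in small_book[2]]
index, gain = search_codebook(target, small_book)
print(index, gain)
```

Because the amplitudes are restricted to +1, −1 and 0, the correlation and energy sums reduce to additions and subtractions, which is one reason such codebooks are attractive for low-complexity encoders. Only the chosen index and the gain need to be transmitted.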

Related Research Articles

Speech coding is an application of data compression to digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting modeled parameters in a compact bitstream.

Linear predictive coding (LPC) is a method used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital signal of speech in compressed form, using the information of a linear predictive model.

Speex is an audio compression codec specifically tuned for the reproduction of human speech and also a free software speech codec that may be used on voice over IP applications and podcasts. It is based on the code excited linear prediction speech coding algorithm. Its creators claim Speex to be free of any patent restrictions and it is licensed under the revised (3-clause) BSD license. It may be used with the Ogg container format or directly transmitted over UDP/RTP. It may also be used with the FLV container format.

G.723.1 (ITU-T Recommendation)

G.723.1 is an audio codec for voice that compresses voice audio in 30 ms frames. An algorithmic look-ahead of 7.5 ms duration means that total algorithmic delay is 37.5 ms. Its official name is Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s. It is sometimes associated with a Truespeech trademark in coprocessors produced by DSP Group.

Personal Digital Cellular (PDC) was a 2G mobile telecommunications standard used exclusively in Japan.

The Adaptive Multi-Rate (AMR) audio codec is an audio compression format optimized for speech coding. AMR is a multi-rate narrowband speech codec that encodes narrowband (200–3400 Hz) signals at variable bit rates ranging from 4.75 to 12.2 kbit/s with toll quality speech starting at 7.4 kbit/s.

Full Rate was the first digital speech coding standard used in the GSM digital mobile phone system. It uses linear predictive coding (LPC). The bit rate of the codec is 13 kbit/s, or 1.625 bits/audio sample. The quality of the coded speech is quite poor by modern standards, but at the time of development it was a good compromise between computational complexity and quality, requiring only on the order of a million additions and multiplications per second. The codec is still widely used in networks around the world. Gradually FR will be replaced by Enhanced Full Rate (EFR) and Adaptive Multi-Rate (AMR) standards, which provide much higher speech quality with lower bit rate.

Adaptive Multi-Rate Wideband (AMR-WB) is a patented wideband speech audio coding standard developed based on Adaptive Multi-Rate encoding, using a similar methodology to algebraic code-excited linear prediction (ACELP). AMR-WB provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz compared to narrowband speech coders which in general are optimized for POTS wireline quality of 300–3400 Hz. AMR-WB was developed by Nokia and VoiceAge and it was first specified by 3GPP.

Algebraic code-excited linear prediction (ACELP) is a speech coding algorithm in which a limited set of pulses is distributed as excitation to a linear prediction filter. It is a linear predictive coding (LPC) algorithm that is based on the code-excited linear prediction (CELP) method and has an algebraic structure. ACELP was developed in 1989 by the researchers at the Université de Sherbrooke in Canada.

G.729 (ITU-T Recommendation)

G.729 is a royalty-free narrow-band vocoder-based audio data compression algorithm using a frame length of 10 milliseconds. It is officially described as Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP), and was introduced in 1996. The wide-band extension of G.729 is called G.729.1, which is equivalent to G.729 Annex J.

G.728 is an ITU-T standard for speech coding operating at 16 kbit/s. It is officially described as Coding of speech at 16 kbit/s using low-delay code excited linear prediction.

Mixed-excitation linear prediction (MELP) is a United States Department of Defense speech coding standard used mainly in military applications and satellite communications, secure voice, and secure radio devices. Its standardization and later development was led and supported by the NSA and NATO. The current "enhanced" version is known as MELPe.

Harmonic Vector Excitation Coding (HVXC) is a speech coding algorithm specified in the MPEG-4 Part 3 standard for very low bit rate speech coding. HVXC supports bit rates of 2 and 4 kbit/s in fixed and variable bit rate modes at a sampling frequency of 8 kHz. It also operates at lower bit rates, such as 1.2–1.7 kbit/s, using a variable bit rate technique. The total algorithmic delay for the encoder and decoder is 36 ms.

FIPS 137, originally issued as FED-STD-1015, is a secure telephony speech encoding standard for a linear predictive coding (LPC) vocoder, developed by the United States Department of Defense and finished on November 28, 1984. It was based on the earlier STANAG 4198 promulgated by NATO on February 13, 1984.

Code-excited linear prediction (CELP) is a linear predictive speech coding algorithm originally proposed by Manfred R. Schroeder and Bishnu S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as residual-excited linear prediction (RELP) and linear predictive coding (LPC) vocoders. Along with its variants, such as algebraic CELP, relaxed CELP, low-delay CELP and vector sum excited linear prediction, it is currently the most widely used speech coding algorithm. It is also used in MPEG-4 Audio speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.

Vector sum excited linear prediction (VSELP) is a speech coding method used in several cellular standards. The VSELP algorithm is an analysis-by-synthesis coding technique and belongs to the class of speech coding algorithms known as CELP.

G.729.1 (ITU-T Recommendation)

G.729.1 is an 8-32 kbit/s embedded speech and audio codec providing bitstream interoperability with G.729, G.729 Annex A and G.729 Annex B. Its official name is G.729-based embedded variable bit rate codec: An 8-32 kbit/s scalable wideband coder bitstream interoperable with G.729. It was introduced in 2006.

G.718 (ITU-T Recommendation)

G.718 is an ITU-T Recommendation embedded scalable speech and audio codec providing high quality narrowband speech over the lower bit rates and high quality wideband speech over the complete range of bit rates. In addition, G.718 is designed to be highly robust to frame erasures, thereby enhancing the speech quality when used in Internet Protocol (IP) transport applications on fixed, wireless and mobile networks. Despite its embedded nature, the codec also performs well with both narrowband and wideband generic audio signals. The codec has an embedded scalable structure, enabling maximum flexibility in the transport of voice packets through IP networks of today and in future media-aware networks. In addition, the embedded structure of G.718 will easily allow the codec to be extended to provide a superwideband and stereo capability through additional layers which are currently under development in ITU-T Study Group 16. The bitstream may be truncated at the decoder side or by any component of the communication system to instantaneously adjust the bit rate to the desired value without the need for out-of-band signalling. The encoder produces an embedded bitstream structured in five layers corresponding to the five available bit rates: 8, 12, 16, 24 & 32 kbit/s.

Audio coding format (digitally coded format for audio signals)

An audio coding format is a content representation format for storage or transmission of digital audio. Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to and from a specific audio coding format is called an audio codec. An example is LAME, one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.

References

  1. "FED-STD-1016, FEDERAL STANDARD: TELECOMMUNICATIONS: ANALOG TO DIGITAL CONVERSION OF RADIO VOICE BY 4,800 BIT/SECOND CODE EXCITED LINEAR PREDICTION (CELP) (14 FEB 1991)". Retrieved 2018-08-17.