The μ-law algorithm may be described in an analog form and in a quantized digital form.
For a given input x, the equation for μ-law encoding is[3]

$$ F(x) = \operatorname{sgn}(x)\,\frac{\ln(1 + \mu |x|)}{\ln(1 + \mu)}, \qquad -1 \le x \le 1, $$

where μ = 255 in the North American and Japanese standards, and sgn(x) is the sign function. The range of this function is −1 to 1.
μ-law expansion is then given by the inverse equation:[3]

$$ F^{-1}(y) = \operatorname{sgn}(y)\,\frac{(1 + \mu)^{|y|} - 1}{\mu}, \qquad -1 \le y \le 1. $$
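These two characteristics translate directly into code. The following C sketch implements the continuous (analog-form) compression and expansion above; the function names mu_compress and mu_expand and the use of double precision are illustrative choices, not part of any standard.

```c
#include <math.h>

#define MU 255.0  /* North American / Japanese standard value */

/* Continuous-form mu-law compression: maps x in [-1, 1] to [-1, 1]. */
double mu_compress(double x)
{
    double sign = (x < 0.0) ? -1.0 : 1.0;
    return sign * log(1.0 + MU * fabs(x)) / log(1.0 + MU);
}

/* Continuous-form mu-law expansion: inverse of mu_compress. */
double mu_expand(double y)
{
    double sign = (y < 0.0) ? -1.0 : 1.0;
    return sign * (pow(1.0 + MU, fabs(y)) - 1.0) / MU;
}
```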
The discrete form is defined in ITU-T Recommendation G.711. [4]
G.711 is unclear about how to code the values at the limit of a range (e.g. whether +31 codes to 0xEF or 0xF0).[citation needed] However, G.191 provides example code in the C language for a μ-law encoder.[5] The positive and negative ranges are offset from one another; e.g. the negative range corresponding to +30 to +1 is −31 to −2. This offset is accounted for by the use of ones' complement (simple bit inversion) rather than two's complement to convert a negative value to a positive value during encoding; a sketch of such an encoder is given after the table below.
| 14-bit binary linear input code | 8-bit compressed code |
|---|---|
| +8158 to +4063 in 16 intervals of 256 | 0x80 + interval number |
| +4062 to +2015 in 16 intervals of 128 | 0x90 + interval number |
| +2014 to +991 in 16 intervals of 64 | 0xA0 + interval number |
| +990 to +479 in 16 intervals of 32 | 0xB0 + interval number |
| +478 to +223 in 16 intervals of 16 | 0xC0 + interval number |
| +222 to +95 in 16 intervals of 8 | 0xD0 + interval number |
| +94 to +31 in 16 intervals of 4 | 0xE0 + interval number |
| +30 to +1 in 15 intervals of 2 | 0xF0 + interval number |
| 0 | 0xFF |
| −1 | 0x7F |
| −31 to −2 in 15 intervals of 2 | 0x70 + interval number |
| −95 to −32 in 16 intervals of 4 | 0x60 + interval number |
| −223 to −96 in 16 intervals of 8 | 0x50 + interval number |
| −479 to −224 in 16 intervals of 16 | 0x40 + interval number |
| −991 to −480 in 16 intervals of 32 | 0x30 + interval number |
| −2015 to −992 in 16 intervals of 64 | 0x20 + interval number |
| −4063 to −2016 in 16 intervals of 128 | 0x10 + interval number |
| −8159 to −4064 in 16 intervals of 256 | 0x00 + interval number |
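The segment structure of the table lends itself to a compact digital encoder: bias the magnitude by 33 so that the segment boundaries become powers of two, locate the segment, take the four bits below it as the mantissa, and invert the result. The C sketch below follows that recipe; it is not the G.191 reference code, the function name ulaw_encode and the symmetric clipping are illustrative choices, and boundary values follow one of the two conventions discussed above (+31 encodes to 0xEF).

```c
#include <stdint.h>

/* End of each segment for the biased magnitude (magnitude + 33). */
static const int16_t seg_end[8] = {
    0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF
};

/* Encode a 14-bit signed linear sample (about -8159 .. +8158)
 * to an 8-bit mu-law code following the segment table above. */
uint8_t ulaw_encode(int16_t sample)
{
    int16_t magnitude, seg;
    uint8_t mask, code;

    if (sample < 0) {
        /* Ones' complement (bit inversion) gives |sample| - 1, which
         * produces the offset negative ranges shown in the table. */
        magnitude = (int16_t)~sample;
        mask = 0x7F;              /* sign bit clear after final inversion */
    } else {
        magnitude = sample;
        mask = 0xFF;
    }

    if (magnitude > 8158)         /* clip to the coding range */
        magnitude = 8158;
    magnitude += 33;              /* bias: segment ends become powers of two */

    /* Find the segment (the exponent of the piecewise-linear chord). */
    for (seg = 0; seg < 8; seg++)
        if (magnitude <= seg_end[seg])
            break;

    /* Segment number plus the 4 mantissa bits below it, then invert. */
    code = (uint8_t)((seg << 4) | ((magnitude >> (seg + 1)) & 0x0F));
    return (uint8_t)(code ^ mask);
}
```

With this sketch, 0 encodes to 0xFF, +8158 to 0x80 and −8159 to 0x00, matching the table.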
The μ-law algorithm may be implemented in several ways: in the analog domain, using amplifiers with non-linear gain to achieve companding; with a non-linear analog-to-digital converter whose quantization levels are unequally spaced; or entirely digitally, using the quantized form defined above once the data is in the digital domain.
μ-law encoding is used because speech has a wide dynamic range. In analog signal transmission, in the presence of relatively constant background noise, the finer detail is lost. Given that the precision of the detail is compromised anyway, and assuming that the signal is to be perceived as audio by a human, one can take advantage of the fact that perceived acoustic intensity, or loudness, is logarithmic (the Weber–Fechner law) by compressing the signal with a logarithmic-response operational amplifier. In telecommunications circuits, most of the noise is injected on the lines, so after the compressor the intended signal is perceived as significantly louder than the static, compared with an uncompressed source. This became a common solution, and thus, prior to common digital usage, the μ-law specification was developed to define an interoperable standard.
This pre-existing algorithm had the effect of significantly reducing the number of bits required to encode a recognizable human voice in digital systems. A sample could be effectively encoded using μ-law in as few as 8 bits, which conveniently matched the symbol size of most common computers.
μ-law encoding effectively reduced the dynamic range of the signal, thereby increasing the coding efficiency while biasing the signal in a way that results in a signal-to-distortion ratio that is greater than that obtained by linear encoding for a given number of bits.
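This claim can be illustrated numerically. The following C sketch quantizes a low-level sine to 8 bits both linearly and through the continuous μ-law characteristic and prints the resulting signal-to-distortion ratios; the −40 dB test level, the 255-level uniform quantizer and all names are illustrative assumptions, not taken from the standard.

```c
#include <math.h>
#include <stdio.h>

#define MU 255.0
static const double PI = 3.14159265358979323846;

static double quant8(double x)     /* uniform 8-bit-style quantizer on [-1, 1] */
{
    return round(x * 127.0) / 127.0;
}

static double compress(double x)
{
    double s = (x < 0.0) ? -1.0 : 1.0;
    return s * log(1.0 + MU * fabs(x)) / log(1.0 + MU);
}

static double expand(double y)
{
    double s = (y < 0.0) ? -1.0 : 1.0;
    return s * (pow(1.0 + MU, fabs(y)) - 1.0) / MU;
}

int main(void)
{
    double amp = 0.01;                                /* roughly -40 dBFS */
    double sig = 0.0, err_lin = 0.0, err_mu = 0.0;

    for (int i = 0; i < 8000; i++) {                  /* one second at 8 kHz */
        double x  = amp * sin(2.0 * PI * 440.0 * i / 8000.0);
        double xl = quant8(x);                        /* plain 8-bit linear */
        double xm = expand(quant8(compress(x)));      /* 8-bit companded */
        sig     += x * x;
        err_lin += (x - xl) * (x - xl);
        err_mu  += (x - xm) * (x - xm);
    }
    printf("linear 8-bit SNR: %.1f dB\n", 10.0 * log10(sig / err_lin));
    printf("mu-law 8-bit SNR: %.1f dB\n", 10.0 * log10(sig / err_mu));
    return 0;
}
```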
The μ-law algorithm is also used in the .au format, which dates back at least to the SPARCstation 1 by Sun Microsystems as the native method used by the /dev/audio interface, widely used as a de facto standard for sound on Unix systems. The au format is also used in various common audio APIs such as the classes in the sun.audio Java package in Java 1.1 and in some C# methods.
Plot of μ-law decoding: the horizontal axis represents the byte values 0–255 and the vertical axis is the 16-bit linear decoded value, illustrating how μ-law concentrates sampling resolution in the smaller (softer) values.
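The curve in the plot can be reproduced with a small decoder. The following C sketch, modelled on the widely circulated public-domain g711.c decoding logic, reconstructs the 16-bit-scale linear value for each of the 256 byte codes; the function name is illustrative.

```c
#include <stdint.h>

/* Decode an 8-bit mu-law code to a linear sample on the 16-bit scale
 * used in the plot (the 14-bit result scaled up by four). */
int16_t ulaw_decode(uint8_t code)
{
    uint8_t u = (uint8_t)~code;             /* undo the bit inversion */
    int exponent = (u >> 4) & 0x07;         /* segment number */
    int mantissa = u & 0x0F;                /* step within the segment */

    /* Reconstruct the biased magnitude, then remove the bias of 33. */
    int magnitude = (((mantissa << 1) + 33) << exponent) - 33;
    int value = magnitude << 2;             /* scale 14-bit to the 16-bit axis */

    /* Bit 7 of the inverted code set means the sample was negative. */
    return (int16_t)((u & 0x80) ? -value : value);
}
```

Sweeping code from 0 to 255 through this function yields values from −32124 to +32124, with the step size growing as the magnitude grows.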
The μ-law algorithm provides a slightly larger dynamic range than the A-law at the cost of worse proportional distortions for small signals. By convention, A-law is used for an international connection if at least one country uses it.
The Au file format is a simple audio file format introduced by Sun Microsystems. The format was common on NeXT systems and on early Web pages. Originally it was headerless, consisting of 8-bit μ-law-encoded data at an 8000 Hz sample rate. Hardware from other vendors often used sample rates as high as 8192 Hz, typically integer multiples of video clock frequencies. Newer files have a header that consists of six unsigned 32-bit words, an optional information chunk which is always of non-zero size, and then the data.
Speech coding is an application of data compression to digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting modeled parameters in a compact bitstream.
In electronics, an analog-to-digital converter is a system that converts an analog signal, such as a sound picked up by a microphone or light entering a digital camera, into a digital signal. An ADC may also provide an isolated measurement such as an electronic device that converts an analog input voltage or current to a digital number representing the magnitude of the voltage or current. Typically the digital output is a two's complement binary number that is proportional to the input, but there are other possibilities.
An A-law algorithm is a standard companding algorithm, used in European 8-bit PCM digital communications systems to optimize, i.e. modify, the dynamic range of an analog signal for digitizing. It is one of the two companding algorithms in the G.711 standard from ITU-T, the other being the similar μ-law, used in North America and Japan.
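For comparison with the μ-law characteristic above, the A-law compression function used in G.711 (with A = 87.6) is commonly written as:

$$ F(x) = \operatorname{sgn}(x) \begin{cases} \dfrac{A |x|}{1 + \ln A}, & |x| < \dfrac{1}{A}, \\[2ex] \dfrac{1 + \ln(A |x|)}{1 + \ln A}, & \dfrac{1}{A} \le |x| \le 1. \end{cases} $$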
In telecommunications and signal processing, companding is a method of mitigating the detrimental effects of a channel with limited dynamic range. The name is a portmanteau of the words compressing and expanding, which are the functions of a compander at the transmitting and receiving ends, respectively. The use of companding allows signals with a large dynamic range to be transmitted over facilities that have a smaller dynamic range capability. Companding is employed in telephony and other audio applications such as professional wireless microphones and analog recording.
Delta modulation is an analog-to-digital and digital-to-analog signal conversion technique used for transmission of voice information where quality is not of primary importance. DM is the simplest form of differential pulse-code modulation (DPCM), where the difference between successive samples is encoded into n-bit data streams. In delta modulation, the transmitted data are reduced to a 1-bit data stream representing either up (↗) or down (↘), as in the sketch below.
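A minimal delta-modulation encoder can be sketched in a few lines of C; the fixed step size and the function name delta_modulate are illustrative assumptions rather than part of any particular standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Each output bit says whether the running approximation should step
 * up (1) or down (0) by a fixed step size. */
void delta_modulate(const int16_t *in, uint8_t *bits, size_t n, int16_t step)
{
    int32_t approx = 0;                     /* decoder's reconstructed value */

    for (size_t i = 0; i < n; i++) {
        if (in[i] >= approx) {
            bits[i] = 1;                    /* up */
            approx += step;
        } else {
            bits[i] = 0;                    /* down */
            approx -= step;
        }
    }
}
```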
Signal-to-noise ratio is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. SNR is defined as the ratio of signal power to noise power, often expressed in decibels. A ratio higher than 1:1 indicates more signal than noise.
In electronics, a digital-to-analog converter is a system that converts a digital signal into an analog signal. An analog-to-digital converter (ADC) performs the reverse function.
G.711 is a narrowband audio codec originally designed for use in telephony that provides toll-quality audio at 64 kbit/s. It is an ITU-T standard (Recommendation) for audio encoding, titled Pulse code modulation (PCM) of voice frequencies, released for use in 1972.
Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set to output values in a (countable) smaller set, often with a finite number of elements. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms.
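As a concrete illustration of quantization by rounding, the following C sketch maps a continuous value onto uniformly spaced levels; the function name and the mid-tread (round-to-nearest) choice are illustrative.

```c
#include <math.h>

/* Uniform mid-tread quantizer: maps x onto the nearest multiple of `step`. */
double quantize_uniform(double x, double step)
{
    return step * round(x / step);
}
```

For example, quantize_uniform(0.237, 0.1) returns 0.2, discarding the detail below the step size.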
Near Instantaneous Companded Audio Multiplex (NICAM) is an early form of lossy compression for digital audio. It was originally developed in the early 1970s for point-to-point links within broadcasting networks. In the 1980s, broadcasters began to use NICAM compression for transmissions of stereo TV sound to the public.
Continuously variable slope delta modulation is a voice coding method. It is a delta modulation with variable step size, first proposed by Greefkes and Riemens in 1970.
G.729 is a royalty-free narrow-band vocoder-based audio data compression algorithm using a frame length of 10 milliseconds. It is officially described as Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP), and was introduced in 1996. The wide-band extension of G.729 is called G.729.1, which equals G.729 Annex J.
G.726 is an ITU-T ADPCM speech codec standard covering the transmission of voice at rates of 16, 24, 32, and 40 kbit/s. It was introduced to supersede both G.721, which covered ADPCM at 32 kbit/s, and G.723, which described ADPCM for 24 and 40 kbit/s. G.726 also introduced a new 16 kbit/s rate. The four bit rates associated with G.726 are often referred to by the bit size of a sample, which are 2, 3, 4, and 5 bits, respectively. The corresponding wide-band codec based on the same technology is G.722.
In digital audio using pulse-code modulation (PCM), bit depth is the number of bits of information in each sample, and it directly corresponds to the resolution of each sample. Examples of bit depth include Compact Disc Digital Audio, which uses 16 bits per sample, and DVD-Audio and Blu-ray Disc, which can support up to 24 bits per sample.
Adaptive differential pulse-code modulation (ADPCM) is a variant of differential pulse-code modulation (DPCM) that varies the size of the quantization step, to allow further reduction of the required data bandwidth for a given signal-to-noise ratio.
In signal processing, sub-band coding (SBC) is any form of transform coding that breaks a signal into a number of different frequency bands, typically by using a fast Fourier transform, and encodes each one independently. This decomposition is often the first step in data compression for audio and video signals.
Pulse-code modulation (PCM) is a method used to digitally represent analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps. Alec Reeves, Claude Shannon, Barney Oliver and John R. Pierce are credited with its invention.
Pulse-density modulation, or PDM, is a form of modulation used to represent an analog signal with a binary signal. In a PDM signal, specific amplitude values are not encoded into codewords of pulses of different weight as they would be in pulse-code modulation (PCM); rather, the relative density of the pulses corresponds to the analog signal's amplitude. The output of a 1-bit DAC is the same as the PDM encoding of the signal.
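A first-order error-feedback modulator is the simplest way to see this density behaviour; the C sketch below is purely illustrative and far simpler than the higher-order, heavily oversampled modulators used in real PDM converters.

```c
#include <stddef.h>
#include <stdint.h>

/* First-order sigma-delta style PDM encoder: the density of 1s in the
 * output tracks the input amplitude (input assumed in [-1, 1]). */
void pdm_encode(const double *in, uint8_t *bits, size_t n)
{
    double qe = 0.0;                          /* accumulated quantization error */

    for (size_t i = 0; i < n; i++) {
        double v = in[i] + qe;                /* add back the previous error */
        double y = (v >= 0.0) ? 1.0 : -1.0;   /* 1-bit quantizer */
        bits[i] = (uint8_t)(y > 0.0);         /* emit the pulse */
        qe = v - y;                           /* error carried to the next sample */
    }
}
```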
In digital communications, an encoding law is an allocation of signal quantization levels across the possible analog signal levels in an analog-to-digital converter system. Encoding laws can be viewed as a simple form of instantaneous companding.
This article incorporates public domain material from Federal Standard 1037C. General Services Administration. Archived from the original on 22 January 2022.