Bit Rate Reduction

Last updated April 12, 2023

Bit Rate Reduction (BRR), also called Bit Rate Reduced, is an audio compression method used by the SPC700 sound coprocessor in the SNES, as well as by the audio processors of the Philips CD-i, the PlayStation, and the Apple Macintosh Quadra series.^[1] The method is a form of ADPCM.

BRR compresses each consecutive sequence of sixteen 16-bit PCM samples into a block of 9 bytes. The first byte of each block is a header that holds, from most to least significant bit: four bits indicating the range of the block (see below), two bits indicating the filter (see below), and two bits of control information for the SPC700. The range controls the step size between the 16 possible nibble values: with a small range the values are close together and minute changes can be recorded, while with a large range the values are far apart and minute changes are lost. The remaining eight bytes consist of 16 signed 4-bit nibbles, one per sample, packed in a big-endian manner. As 32 bytes of input become 9 bytes of output, the BRR algorithm yields a 3.56:1 compression ratio.
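The block layout described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `parse_brr_block` is hypothetical, and "big-endian" nibble packing is assumed here to mean that the high nibble of each byte holds the earlier sample. The two control bits are returned unmodified because their meaning is not described in this text.

```python
def parse_brr_block(block: bytes):
    """Split one 9-byte BRR block into header fields and 16 signed nibbles."""
    if len(block) != 9:
        raise ValueError("a BRR block is exactly 9 bytes")

    header = block[0]
    shift_range = header >> 4         # top four bits: range
    filter_id = (header >> 2) & 0x3   # next two bits: filter
    control = header & 0x3            # low two bits: control information

    nibbles = []
    for byte in block[1:]:
        # Assumption: the high nibble comes first ("big-endian" packing).
        for raw in (byte >> 4, byte & 0xF):
            # Sign-extend the 4-bit two's complement value to -8..7.
            nibbles.append(raw - 16 if raw >= 8 else raw)
    return shift_range, filter_id, control, nibbles


# 16 samples * 2 bytes = 32 bytes in, 9 bytes out.
compression_ratio = (16 * 2) / 9   # about 3.56
```

For example, a header byte of `0xC6` (binary `1100 0110`) splits into range 12, filter 1, and control bits `0b10`.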

Decompression algorithm

A nibble $n$ in a block with filter $f$ and range $r$ should be decoded into a PCM sample $s_{t}$ using the following second-order linear prediction equation:

$$s_{t} = 2^{r} n + k_{1} s_{t-1} - k_{2} s_{t-2}$$

Here, $s_{t-1}$ and $s_{t-2}$ are the last-output and next-to-last-output PCM samples, respectively. The filter type $f$ is translated into IIR prediction coefficients $k$ using the following table:

Filter f	k₁	k₂
0	0	0
1	15/16	0
2	61/32	15/16
3	115/64	13/16

These calculations are all done in signed 16.16 fixed-point arithmetic.
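As a rough sketch, the prediction equation and the coefficient table can be written in Python as shown below. It follows the equation exactly as given in this text and stores the coefficients scaled by $2^{16}$ to mimic 16.16 fixed point. The names `COEFFS`, `clamp16`, and `decode_nibbles` are hypothetical, and clamping the result to the signed 16-bit range is an assumption of this sketch rather than something stated above. Real hardware may differ in rounding and overflow behaviour.

```python
ONE = 1 << 16  # 1.0 in 16.16 fixed point

# (k1, k2) per filter, taken from the table and scaled by 2**16.
COEFFS = {
    0: (0, 0),
    1: (ONE * 15 // 16, 0),
    2: (ONE * 61 // 32, ONE * 15 // 16),
    3: (ONE * 115 // 64, ONE * 13 // 16),
}


def clamp16(x: int) -> int:
    """Assumption: results are limited to the signed 16-bit PCM range."""
    return max(-32768, min(32767, x))


def decode_nibbles(nibbles, r, f, s1=0, s2=0):
    """Decode signed nibbles with range r and filter f.

    s1 and s2 are the last and next-to-last output samples carried
    over from the previous block (zero at the start of a stream).
    """
    k1, k2 = COEFFS[f]
    out = []
    for n in nibbles:
        # s_t = 2^r * n + k1 * s_{t-1} - k2 * s_{t-2}, in 16.16 fixed point
        acc = (n << r) * ONE + k1 * s1 - k2 * s2
        s = clamp16(acc >> 16)  # drop the fractional bits
        out.append(s)
        s2, s1 = s1, s
    return out
```

With filter 0 and $r = 4$, the nibbles 1, -2, 7 decode to 16, -32, 112. With filter 1 and the same range, the nibbles 1, 0, 0 decode to 16, 15, 14, which shows the previous sample being scaled by 15/16 at each step.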

Or in words:

Filter 0 decodes each nibble directly, scaling it by $2^{r}$ with no prediction.
Filter 1 adds the scaled nibble to a slightly attenuated copy of the previous output sample (delta or differential coding).
Filters 2 and 3 add the scaled nibble to a linear extrapolation from the last two output samples (second-order differential coding).

The coefficients of the above filters are specified as slightly less than 1 or 2 in order to realize a leaky integrator that is more resilient to errors in the encoded bitstream. Otherwise, any error could propagate indefinitely, since the impulse response of an ideal integrator is a step function. The denominators are powers of 2 to facilitate implementation with bit shifts as opposed to a more expensive hardware multiplier.
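The effect of the leak can be illustrated with a small Python sketch. It is only a demonstration under simplifying assumptions: the helper names `leak_15_16` and `error_after` are hypothetical, only the first-order coefficient 15/16 is modelled, and the shift-and-subtract form is one plausible way to avoid a multiplier, not a description of any particular chip.

```python
def leak_15_16(s: int) -> int:
    """Approximate s * 15/16 using a shift and a subtraction only."""
    return s - (s >> 4)


def error_after(steps: int, error: int, leaky: bool = True) -> int:
    """Follow a one-off decoding error while all later residuals are zero.

    With leaky=False the predictor coefficient is exactly 1 (an ideal
    integrator), so the error is carried forward unchanged.
    """
    for _ in range(steps):
        if leaky:
            error = leak_15_16(error)
    return error


ideal = error_after(200, 1024, leaky=False)   # the error never decays
leaky = error_after(200, 1024, leaky=True)    # shrinks to a small residual
```

In this sketch the ideal integrator still carries the full error of 1024 after 200 samples, while the leaky version has shrunk it to a residual below 16. The truncating shift is what stops a positive error from reaching exactly zero here.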

The PlayStation APU and the Philips CD-i CDIC add another set of coefficients to the above and reorder them, for five unique filters out of eight in total (these come from the Green Book and Yellow Book specifications):

Filter f	k₁	k₂
0	0	0
1	15/16	0
2	115/64	13/16
3	49/32	55/64
4	61/32	15/16

These calculations are all done in signed 16.16 fixed-point arithmetic.

References

[1] "68kMLA".

SPC 700 Documentation
US Patent 4,685,115 [beginnings of system which became BRR]
US Patent 4,783,792 [further development toward BRR]
US Patent 4,797,902 [BRR; example coefficients can be seen on page 21]
US Patent 4,829,522 [BRR with error correction-aware interpolation for reading from a disc medium such as a MiniDisc; the final MiniDisc implementation did not use BRR]
US Patent 5,041,830 [BRR shifting/quantization]
US Patent 5,070,515 [BRR encoding/noise shaping; example coefficients can be seen on page 23]
US Patent 5,086,475 [BRR Looping, pitch/frequency detection for encoding]
US Patent 5,111,530 [Rather specific patent on the workings of the DSP in the SNES and PlayStation APU]
US Patent 5,128,963 [a later patent on the system which became BRR]
US Patent 5,166,981 [Using LPC analysis for assisting in encoding BRR]
US Patent 5,303,374 [Predictive error generator for assisting in encoding BRR; coefficients can be seen on page 6]
US Patent 5,430,241 [BRR Looping, pitch/frequency detection for encoding, similar to 5,086,475]
US Patent 5,519,166 [BRR Looping, pitch/frequency detection for encoding, continuation of 5,430,241]
US Patent 5,978,492 [BRR in the context of CD-XA on Sony PlayStation]

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
