Exponential-Golomb coding

Last updated February 01, 2025

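An exponential-Golomb code (or just Exp-Golomb code) is a type of universal code. To encode any nonnegative integer x using the exp-Golomb code:

Write down x+1 in binary, then
Count the bits written, subtract one, and prepend that many zero bits to the bit string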
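As a minimal Python sketch of the order-0 encoder, the function below writes x+1 in binary and prefixes one zero bit for every bit after the first. The name exp_golomb_encode and the choice to return the codeword as a string of '0' and '1' characters are assumptions made for illustration; a real codec would write into a bit stream.

```python
def exp_golomb_encode(x: int) -> str:
    """Return the order-0 exp-Golomb codeword of a nonnegative integer as a bit string."""
    if x < 0:
        raise ValueError("x must be nonnegative")
    binary = bin(x + 1)[2:]                    # x + 1 in binary, without the '0b' prefix
    return "0" * (len(binary) - 1) + binary    # one leading zero per bit after the first


# Print the codewords for the first few integers.
for x in range(9):
    print(x, exp_golomb_encode(x))
```

The number of leading zeros tells a decoder how many bits follow the first 1 bit, which is what makes the code a prefix code.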

Extension to negative numbers

Exp-Golomb coding is used in the H.264/MPEG-4 AVC and H.265 High Efficiency Video Coding video compression standards. These standards also define a variation for coding signed numbers: the value 0 is assigned to the binary codeword '1', and subsequent codewords are assigned to input values of increasing magnitude (and alternating sign, if the field can contain a negative number):

 0 ⇒ 0 ⇒ 1 ⇒ 1
 1 ⇒ 1 ⇒ 10 ⇒ 010
−1 ⇒ 2 ⇒ 11 ⇒ 011
 2 ⇒ 3 ⇒ 100 ⇒ 00100
−2 ⇒ 4 ⇒ 101 ⇒ 00101
 3 ⇒ 5 ⇒ 110 ⇒ 00110
−3 ⇒ 6 ⇒ 111 ⇒ 00111
 4 ⇒ 7 ⇒ 1000 ⇒ 0001000
−4 ⇒ 8 ⇒ 1001 ⇒ 0001001
...^[1]

In other words, a non-positive integer x≤0 is mapped to an even integer −2x, while a positive integer x>0 is mapped to an odd integer 2x−1.
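The signed mapping and its inverse can be sketched in Python as follows. The function names are hypothetical, and the small order-0 encoder is repeated here only so that the example is self-contained.

```python
def signed_to_code_number(v: int) -> int:
    """Map a signed integer to a nonnegative one: v > 0 -> 2v - 1, v <= 0 -> -2v."""
    return 2 * v - 1 if v > 0 else -2 * v


def code_number_to_signed(n: int) -> int:
    """Invert the mapping: odd n -> (n + 1) / 2, even n -> -(n / 2)."""
    return (n + 1) // 2 if n % 2 else -(n // 2)


def exp_golomb_encode(x: int) -> str:
    """Order-0 exp-Golomb codeword of a nonnegative integer, as a bit string."""
    binary = bin(x + 1)[2:]
    return "0" * (len(binary) - 1) + binary


# Print signed value, mapped value, and codeword for a few inputs.
for v in (0, 1, -1, 2, -2, 3, -3, 4, -4):
    n = signed_to_code_number(v)
    print(v, n, exp_golomb_encode(n))
```

Because positive values take the odd code numbers and non-positive values the even ones, values of small magnitude receive short codewords regardless of sign.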

Exp-Golomb coding is also used in the Dirac video codec.^[3]

Generalization to order k

To encode larger numbers in fewer bits (at the expense of using more bits to encode smaller numbers), the code can be generalized using a nonnegative integer parameter k. To encode a nonnegative integer x in an order-k exp-Golomb code:

Encode ⌊x/2^k⌋ using order-0 exp-Golomb code described above, then
Encode x mod 2^k in binary with k bits
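The two steps above can be sketched in Python as a quotient part followed by a fixed-width remainder. The name exp_golomb_encode_k and the bit-string return type are assumptions made for illustration.

```python
def exp_golomb_encode_k(x: int, k: int) -> str:
    """Order-k exp-Golomb codeword of a nonnegative integer x, as a bit string."""
    if x < 0 or k < 0:
        raise ValueError("x and k must be nonnegative")
    # Step 1: encode floor(x / 2^k) with the order-0 code.
    quotient = bin((x >> k) + 1)[2:]
    prefix = "0" * (len(quotient) - 1) + quotient
    # Step 2: append x mod 2^k as exactly k bits (nothing at all when k == 0).
    suffix = format(x & ((1 << k) - 1), f"0{k}b") if k else ""
    return prefix + suffix


# Print the codewords of one value for several orders.
for k in range(4):
    print(k, exp_golomb_encode_k(10, k))
```

The guard on k avoids emitting a stray bit for order 0, where the remainder field is empty.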

An equivalent way of expressing this is:

Encode x+2^k−1 using the order-0 exp-Golomb code (i.e. encode x+2^k using the Elias gamma code), then
Delete k leading zero bits from the encoding result
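A short Python sketch can check that the two formulations agree. Both function names are hypothetical, and the comparison covers only a finite range of inputs, so it is a sanity check and not a proof.

```python
def encode_k_by_parts(x: int, k: int) -> str:
    """Order-0 code of floor(x / 2^k), followed by x mod 2^k in k bits."""
    quotient = bin((x >> k) + 1)[2:]
    suffix = format(x & ((1 << k) - 1), f"0{k}b") if k else ""
    return "0" * (len(quotient) - 1) + quotient + suffix


def encode_k_by_offset(x: int, k: int) -> str:
    """Elias gamma code of x + 2^k with k leading zero bits deleted."""
    binary = bin(x + (1 << k))[2:]               # x + 2^k has at least k + 1 bits
    gamma = "0" * (len(binary) - 1) + binary     # so gamma starts with at least k zeros
    return gamma[k:]


# Compare both formulations over a small range of x and k.
agree = all(
    encode_k_by_parts(x, k) == encode_k_by_offset(x, k)
    for k in range(6)
    for x in range(2000)
)
print("formulations agree:", agree)
```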

Exp-Golomb-k coding examples
x	k=0	k=1	k=2	k=3	x	k=0	k=1	k=2	k=3	x	k=0	k=1	k=2	k=3
0	1	10	100	1000	10	0001011	001100	01110	010010	20	000010101	00010110	0011000	011100
1	010	11	101	1001	11	0001100	001101	01111	010011	21	000010110	00010111	0011001	011101
2	011	0100	110	1010	12	0001101	001110	0010000	010100	22	000010111	00011000	0011010	011110
3	00100	0101	111	1011	13	0001110	001111	0010001	010101	23	000011000	00011001	0011011	011111
4	00101	0110	01000	1100	14	0001111	00010000	0010010	010110	24	000011001	00011010	0011100	00100000
5	00110	0111	01001	1101	15	000010000	00010001	0010011	010111	25	000011010	00011011	0011101	00100001
6	00111	001000	01010	1110	16	000010001	00010010	0010100	011000	26	000011011	00011100	0011110	00100010
7	0001000	001001	01011	1111	17	000010010	00010011	0010101	011001	27	000011100	00011101	0011111	00100011
8	0001001	001010	01100	010000	18	000010011	00010100	0010110	011010	28	000011101	00011110	000100000	00100100
9	0001010	001011	01101	010001	19	000010100	00010101	0010111	011011	29	000011110	00011111	000100001	00100101

References

[1] Richardson, Iain (2010). The H.264 Advanced Video Compression Standard. Wiley. pp. 208, 221. ISBN 978-0-470-51692-8.
[2] Rupp, Markus (2009). Video and Multimedia Transmissions over Cellular Networks: Analysis, Modelling and Optimization in Live 3G Mobile Networks. Wiley. p. 149. ISBN 978-0-470-74776-6.
[3] "Dirac Specification" (PDF). BBC. Archived from the original on 2015-05-03. Retrieved 9 March 2011.

