Exponential-Golomb coding

Last updated

An exponential-Golomb code (or just Exp-Golomb code) is a type of universal code. To encode any nonnegative integer x using the exp-Golomb code:

Contents

  1. Write down x+1 in binary
  2. Count the bits written, subtract one, and write that number of starting zero bits preceding the previous bit string.

The first few values of the code are:

 0 ⇒ 1 ⇒ 1  1 ⇒ 10 ⇒ 010  2 ⇒ 11 ⇒ 011  3 ⇒ 100 ⇒ 00100  4 ⇒ 101 ⇒ 00101  5 ⇒ 110 ⇒ 00110  6 ⇒ 111 ⇒ 00111  7 ⇒ 1000 ⇒ 0001000  8 ⇒ 1001 ⇒ 0001001 ... [1] 

In the above examples, consider the case 3. For 3, x+1 = 3 + 1 = 4. 4 in binary is '100'. '100' has 3 bits, and 3-1 = 2. Hence add 2 zeros before '100', which is '00100'

Similarly, consider 8. '8 + 1' in binary is '1001'. '1001' has 4 bits, and 4-1 is 3. Hence add 3 zeros before 1001, which is '0001001'.

This is identical to the Elias gamma code of x+1, allowing it to encode 0. [2]

Extension to negative numbers

Exp-Golomb coding is used in the H.264/MPEG-4 AVC and H.265 High Efficiency Video Coding video compression standards, in which there is also a variation for the coding of signed numbers by assigning the value 0 to the binary codeword '0' and assigning subsequent codewords to input values of increasing magnitude (and alternating sign, if the field can contain a negative number):

 0 ⇒ 0 ⇒ 1 ⇒ 1  1 ⇒ 1 ⇒ 10 ⇒ 010 −1 ⇒ 2 ⇒ 11 ⇒ 011  2 ⇒ 3 ⇒ 100 ⇒ 00100 −2 ⇒ 4 ⇒ 101 ⇒ 00101  3 ⇒ 5 ⇒ 110 ⇒ 00110 −3 ⇒ 6 ⇒ 111 ⇒ 00111  4 ⇒ 7 ⇒ 1000 ⇒ 0001000 −4 ⇒ 8 ⇒ 1001 ⇒ 0001001 ... [1] 

In other words, a non-positive integer x≤0 is mapped to an even integer −2x, while a positive integer x>0 is mapped to an odd integer 2x−1.

Exp-Golomb coding is also used in the Dirac video codec. [3]

Generalization to order k

To encode larger numbers in fewer bits (at the expense of using more bits to encode smaller numbers), this can be generalized using a nonnegative integer parameter  k. To encode a nonnegative integer x in an order-k exp-Golomb code:

  1. Encode ⌊x/2k⌋ using order-0 exp-Golomb code described above, then
  2. Encode x mod 2k in binary

An equivalent way of expressing this is:

  1. Encode x+2k−1 using the order-0 exp-Golomb code (i.e. encode x+2k using the Elias gamma code), then
  2. Delete k leading zero bits from the encoding result
Exp-Golomb-k coding examples
 x k=0k=1k=2k=3 x k=0k=1k=2k=3 x k=0k=1k=2k=3
011010010001000010110011000111001001020000010101000101100011000011100
10101110110011100011000011010111101001121000010110000101110011001011101
201101001101010120001101001110001000001010022000010111000110000011010011110
30010001011111011130001110001111001000101010123000011000000110010011011011111
40010101100100011001400011110001000000100100101102400001100100011010001110000100000
5001100111010011101150000100000001000100100110101112500001101000011011001110100100001
600111001000010101110160000100010001001000101000110002600001101100011100001111000100010
70001000001001010111111170000100100001001100101010110012700001110000011101001111100100011
800010010010100110001000018000010011000101000010110011010280000111010001111000010000000100100
900010100010110110101000119000010100000101010010111011011290000111100001111100010000100100101

See also

Related Research Articles

<span class="mw-page-title-main">Binary-coded decimal</span> System of digitally encoding numbers

In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used for a sign or other indications.

<span class="mw-page-title-main">Huffman coding</span> Technique to compress data

In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed by David A. Huffman while he was a Sc.D. student at MIT, and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes".

<span class="mw-page-title-main">Hamming distance</span> Number of bits that differ between two strings

In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences. It is named after the American mathematician Richard Hamming.

In mathematics and computing, Fibonacci coding is a universal code which encodes positive integers into binary code words. It is one example of representations of integers based on Fibonacci numbers. Each code word ends with "11" and contains no other instances of "11" before the end.

Elias code or Elias gamma code is a universal code encoding positive integers developed by Peter Elias. It is used most commonly when coding integers whose upper-bound cannot be determined beforehand.

Elias δ code or Elias delta code is a universal code encoding the positive integers developed by Peter Elias.

<span class="mw-page-title-main">Arithmetic coding</span> Form of entropy encoding used in data compression

Arithmetic coding (AC) is a form of entropy encoding used in lossless data compression. Normally, a string of characters is represented using a fixed number of bits per character, as in the ASCII code. When a string is converted to arithmetic encoding, frequently used characters will be stored with fewer bits and not-so-frequently occurring characters will be stored with more bits, resulting in fewer bits used in total. Arithmetic coding differs from other forms of entropy encoding, such as Huffman coding, in that rather than separating the input into component symbols and replacing each with a code, arithmetic coding encodes the entire message into a single number, an arbitrary-precision fraction q, where 0.0 ≤ q < 1.0. It represents the current information as a range, defined by two numbers. A recent family of entropy coders called asymmetric numeral systems allows for faster implementations thanks to directly operating on a single natural number representing the current information.

A prefix code is a type of code system distinguished by its possession of the "prefix property", which requires that there is no whole code word in the system that is a prefix of any other code word in the system. It is trivially true for fixed-length code, so only a point of consideration in variable-length code.

Golomb coding is a lossless data compression method using a family of data compression codes invented by Solomon W. Golomb in the 1960s. Alphabets following a geometric distribution will have a Golomb code as an optimal prefix code, making Golomb coding highly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values.

Unary coding, or the unary numeral system and also sometimes called thermometer code, is an entropy encoding that represents a natural number, n, with a code of length n + 1, usually n ones followed by a zero or with n − 1 ones followed by a zero. For example 5 is represented as 111110 or 11110. Some representations use n or n − 1 zeros followed by a one. The ones and zeros are interchangeable without loss of generality. Unary coding is both a prefix-free code and a self-synchronizing code.

Truncated binary encoding is an entropy encoding typically used for uniform probability distributions with a finite alphabet. It is parameterized by an alphabet with total size of number n. It is a slightly more general form of binary encoding when n is not a power of two.

In computing, signed number representations are required to encode negative numbers in binary number systems.

<span class="mw-page-title-main">Aztec Code</span> Type of matrix barcode

The Aztec Code is a matrix code invented by Andrew Longacre, Jr. and Robert Hussey in 1995. The code was published by AIM, Inc. in 1997. Although the Aztec Code was patented, that patent was officially made public domain. The Aztec Code is also published as ISO/IEC 24778:2008 standard. Named after the resemblance of the central finder pattern to an Aztec pyramid, Aztec Code has the potential to use less space than other matrix barcodes because it does not require a surrounding blank "quiet zone".

Elias ω coding or Elias omega coding is a universal code encoding the positive integers developed by Peter Elias. Like Elias gamma coding and Elias delta coding, it works by prefixing the positive integer with a representation of its order of magnitude in a universal code. Unlike those other two codes, however, Elias omega recursively encodes that prefix; thus, they are sometimes known as recursive Elias codes.

<span class="mw-page-title-main">Universal code (data compression)</span>

In data compression, a universal code for integers is a prefix code that maps the positive integers onto binary codewords, with the additional property that whatever the true probability distribution on integers, as long as the distribution is monotonic (i.e., p(i) ≥ p(i + 1) for all positive i), the expected lengths of the codewords are within a constant factor of the expected lengths that the optimal code for that probability distribution would have assigned. A universal code is asymptotically optimal if the ratio between actual and optimal expected lengths is bounded by a function of the information entropy of the code that, in addition to being bounded, approaches 1 as entropy approaches infinity.

In computer science and information theory, a canonical Huffman code is a particular type of Huffman code with unique properties which allow it to be described in a very compact manner. Rather than storing the structure of the code tree explicitly, canonical Huffman codes are ordered in such a way that it suffices to only store the lengths of the codewords, which reduces the overhead of the codebook.

<span class="mw-page-title-main">Hadamard code</span> Error-correcting code

The Hadamard code is an error-correcting code named after Jacques Hadamard that is used for error detection and correction when transmitting messages over very noisy or unreliable channels. In 1971, the code was used to transmit photos of Mars back to Earth from the NASA space probe Mariner 9. Because of its unique mathematical properties, the Hadamard code is not only used by engineers, but also intensely studied in coding theory, mathematics, and theoretical computer science. The Hadamard code is also known under the names Walsh code, Walsh family, and Walsh–Hadamard code in recognition of the American mathematician Joseph Leonard Walsh.

A variable-length quantity (VLQ) is a universal code that uses an arbitrary number of binary octets to represent an arbitrarily large integer. A VLQ is essentially a base-128 representation of an unsigned integer with the addition of the eighth bit to mark continuation of bytes. VLQ is identical to LEB128 except in endianness. See the example below.

In coding theory, burst error-correcting codes employ methods of correcting burst errors, which are errors that occur in many consecutive bits rather than occurring in bits independently of each other.

The five-qubit error correcting code is the smallest quantum error correcting code that can protect a logical qubit from any arbitrary single qubit error. In this code, 5 physical qubits are used to encode the logical qubit. With and being Pauli matrices and the Identity matrix, this code's generators are . Its logical operators are and . Once the logical qubit is encoded, errors on the physical qubits can be detected via stabilizer measurements. A lookup table that maps the results of the stabilizer measurements to the types and locations of the errors gives the control system of the quantum computer enough information to correct errors.

References

  1. 1 2 Richardson, Iain (2010). The H.264 Advanced Video Compression Standard. Wiley. pp. 208, 221. ISBN   978-0-470-51692-8.
  2. Rupp, Markus (2009). Video and Multimedia Transmissions over Cellular Networks: Analysis, Modelling and Optimization in Live 3G Mobile Networks. Wiley. p. 149. ISBN   9780470747766.
  3. "Dirac Specification" (PDF). BBC. Archived from the original (PDF) on 2015-05-03. Retrieved 9 March 2011.