Golomb coding

Golomb coding is a lossless data compression method using a family of data compression codes invented by Solomon W. Golomb in the 1960s. Alphabets following a geometric distribution will have a Golomb code as an optimal prefix code, [1] making Golomb coding highly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values.

Rice coding

Rice coding (invented by Robert F. Rice) denotes using a subset of the family of Golomb codes to produce a simpler (but possibly suboptimal) prefix code. Rice used this set of codes in an adaptive coding scheme; "Rice coding" can refer either to that adaptive scheme or to using that subset of Golomb codes. Whereas a Golomb code has a tunable parameter that can be any positive integer value, Rice codes are those in which the tunable parameter is a power of two. This makes Rice codes convenient for use on a computer since multiplication and division by 2 can be implemented more efficiently in binary arithmetic.

Rice was motivated to propose this simpler subset due to the fact that geometric distributions are often varying with time, not precisely known, or both, so selecting the seemingly optimal code might not be very advantageous.

Rice coding is used as the entropy encoding stage in a number of lossless image compression and audio data compression methods.

Overview

Golomb coding example for a source x with geometric distribution, with parameter p(0) = 0.2, using a Golomb code with M = 3. The 2-bit code 00 is used 20% of the time; the 3-bit codes 010, 011, and 100 are used over 38% of the time; 4 bits or more are needed in a minority of cases. For this source, entropy = 3.610 bits; for this code with this source, rate = 3.639 bits; therefore redundancy = 0.030 bits, or efficiency = 0.992 bits per bit.

Construction of codes

Golomb coding uses a tunable parameter M to divide an input value x into two parts: q, the result of a division by M, and r, the remainder. The quotient is sent in unary coding, followed by the remainder in truncated binary encoding. When M = 1, Golomb coding is equivalent to unary coding.

Golomb–Rice codes can be thought of as codes that indicate a number by the position of the bin (q), and the offset within the bin (r). The example figure shows the position q and offset r for the encoding of integer x using Golomb–Rice parameter M = 3, with source probabilities following a geometric distribution with p(0) = 0.2.

Formally, the two parts are given by the following expressions, where x is the nonnegative integer being encoded:

    q = floor(x / M)

and

    r = x − qM.
This image shows the redundancy, in bits, of the Golomb code when M is chosen optimally, for 1 − p(0) ≥ 0.45.

Both q and r will be encoded using variable numbers of bits: q by a unary code, and r by b bits for a Rice code, or by a choice between b and b + 1 bits for a Golomb code (i.e. when M is not a power of 2), with b = floor(log2(M)). If r < 2^(b+1) − M, then use b bits to encode r; otherwise, use b + 1 bits to encode r. Clearly, if M is a power of 2, then b = log2(M) and we can encode all values of r with b bits.

The integer x treated by Golomb was the run length of a Bernoulli process, which has a geometric distribution starting at 0. The best choice of parameter M is a function of the corresponding Bernoulli process, which is parameterized by p, the probability of success in a given Bernoulli trial. M is either the median of the distribution or the median ±1. It can be determined by these inequalities:

    (1 − p)^M + (1 − p)^(M+1) ≤ 1 < (1 − p)^(M−1) + (1 − p)^M,

which are solved by

    M = ceil( −log(2 − p) / log(1 − p) ).

For the example with p(0) = 0.2:

    M = ceil( −log(1.8) / log(0.8) ) = ceil(2.634) = 3.

The Golomb code for this distribution is equivalent to the Huffman code for the same probabilities, if it were possible to compute the Huffman code for the infinite set of source values.
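As a quick illustration of this parameter choice, the following sketch (Python used only for illustration; the function name is invented) evaluates the ceiling formula above and reproduces M = 3 for p(0) = 0.2:

    import math

    def optimal_golomb_parameter(p):
        # Smallest M satisfying (1 - p)^M + (1 - p)^(M + 1) <= 1,
        # computed as M = ceil(-log(2 - p) / log(1 - p)).
        return math.ceil(-math.log(2 - p) / math.log(1 - p))

    print(optimal_golomb_parameter(0.2))  # -> 3, matching the example above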

Use with signed integers

Golomb's scheme was designed to encode sequences of non-negative numbers. However, it is easily extended to accept sequences containing negative numbers using an overlap and interleave scheme, in which all values are reassigned to some positive number in a unique and reversible way. The sequence begins: 0, −1, 1, −2, 2, −3, 3, −4, 4, ... The n-th negative value (i.e., −n) is mapped to the n-th odd number (2n − 1), and the m-th positive value is mapped to the m-th even number (2m). This may be expressed mathematically as follows: a positive value x is mapped to x′ = 2x, and a negative value y is mapped to y′ = −2y − 1. Such a code may be used for simplicity, even if suboptimal. Truly optimal codes for two-sided geometric distributions include multiple variants of the Golomb code, depending on the distribution parameters, including this one. [2]
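A minimal sketch of this overlap-and-interleave mapping and its inverse, with invented helper names:

    def interleave(n):
        # Map a signed integer onto the sequence 0, -1, 1, -2, 2, ...
        # Non-negative n goes to 2n; negative n goes to -2n - 1.
        return 2 * n if n >= 0 else -2 * n - 1

    def deinterleave(m):
        # Inverse mapping: even codes are non-negative, odd codes are negative.
        return m // 2 if m % 2 == 0 else -(m + 1) // 2

    print([interleave(v) for v in [0, -1, 1, -2, 2]])  # -> [0, 1, 2, 3, 4]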

Simple algorithm

Below is the Rice–Golomb encoding, where the remainder code uses simple truncated binary encoding, also named "Rice coding" (other variable-length binary encodings, such as arithmetic or Huffman encodings, are possible for the remainder codes if the statistical distribution of remainders is not flat, notably when not all possible remainders after the division are used). In this algorithm, if the M parameter is a power of 2, it becomes equivalent to the simpler Rice encoding; a code sketch of the encoder and decoder follows the step lists below:

  1. Fix the parameter M to an integer value.
  2. For N, the number to be encoded, find
    1. quotient = q = floor(N/M)
    2. remainder = r = N modulo M
  3. Generate codeword
    1. The code format : <Quotient code><Remainder code>, where
    2. Quotient code (in unary coding)
      1. Write a q-length string of 1 bits (alternatively, of 0 bits)
      2. Write a 0 bit (respectively, a 1 bit)
    3. Remainder code (in truncated binary encoding)
      1. Let b = floor(log2(M))
        1. If r < 2^(b+1) − M, code r in binary representation using b bits.
        2. If r ≥ 2^(b+1) − M, code the number r + 2^(b+1) − M in binary representation using b + 1 bits.
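The steps above can be collected into a compact encoder sketch (illustrative Python, not taken from any particular implementation):

    def golomb_encode(n, m):
        # Golomb-Rice codeword for the non-negative integer n with parameter m,
        # returned as a string of '0'/'1' characters.
        q, r = divmod(n, m)
        b = m.bit_length() - 1               # b = floor(log2(M))
        cutoff = (1 << (b + 1)) - m          # 2^(b+1) - M
        code = "1" * q + "0"                 # unary quotient, terminated by a 0
        if r < cutoff:
            code += format(r, "0{}b".format(b)) if b > 0 else ""
        else:
            code += format(r + cutoff, "0{}b".format(b + 1))
        return code

    print(golomb_encode(42, 10))             # -> '11110010' (q = 4, r = 2)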

Decoding:

  1. Decode the unary representation of q (count the number of 1s at the beginning of the code)
  2. Skip the 0 delimiter
  3. Let b = floor(log2(M))
    1. Interpret the next b bits as a binary number r'. If r' < 2^(b+1) − M holds, then the remainder r = r'.
    2. Otherwise interpret b + 1 bits as a binary number r'; the remainder is given by r = r' − (2^(b+1) − M).
  4. Compute N = q × M + r
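A matching decoder sketch (again with invented names), which consumes a codeword produced by the encoder above:

    def golomb_decode(bits, m):
        # Decode one Golomb-Rice codeword (a string of '0'/'1') with parameter m.
        q = 0
        i = 0
        while bits[i] == "1":                # unary quotient
            q += 1
            i += 1
        i += 1                               # skip the 0 delimiter
        b = m.bit_length() - 1               # floor(log2(M))
        cutoff = (1 << (b + 1)) - m
        r = int(bits[i:i + b], 2) if b > 0 else 0
        if r >= cutoff:                      # remainder was coded with b + 1 bits
            r = int(bits[i:i + b + 1], 2) - cutoff
        return q * m + r

    print(golomb_decode("11110010", 10))     # -> 42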

Example

Set M = 10. Thus b = floor(log2(10)) = 3. The cutoff is 2^(b+1) − M = 16 − 10 = 6.

Encoding of quotient part

    q      output bits
    0      0
    1      10
    2      110
    3      1110
    4      11110
    5      111110
    6      1111110
    N      N ones followed by a 0
Encoding of remainder part

    r      offset      binary      output bits
    0      0           0000        000
    1      1           0001        001
    2      2           0010        010
    3      3           0011        011
    4      4           0100        100
    5      5           0101        101
    6      12          1100        1100
    7      13          1101        1101
    8      14          1110        1110
    9      15          1111        1111
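The remainder table can be reproduced by applying the truncated binary rule with b = 3 and cutoff 6, for example with this short illustrative script:

    M = 10
    b = M.bit_length() - 1               # floor(log2(M)) = 3
    cutoff = (1 << (b + 1)) - M          # 2^(b+1) - M = 6
    for r in range(M):
        offset = r if r < cutoff else r + cutoff
        width = b if r < cutoff else b + 1
        print(r, offset, format(offset, "04b"), format(offset, "0{}b".format(width)))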

For example, with a Rice–Golomb encoding using parameter M = 10, the decimal number 42 would first be split into q = 4 and r = 2, and would be encoded as qcode(q),rcode(r) = qcode(4),rcode(2) = 11110,010 (the separating comma does not need to be encoded in the output stream, because the 0 at the end of the q code is enough to mark where q ends and r begins; both the qcode and rcode are self-delimited).

Use for run-length encoding

Note that p and 1 – p are reversed in this section compared to the use in earlier sections.

Given an alphabet of two symbols, or a set of two events, P and Q, with probabilities p and (1 − p) respectively, where p ≥ 1/2, Golomb coding can be used to encode runs of zero or more Ps separated by single Qs. In this application, the best setting of the parameter M is the nearest integer to −1/log2(p). When p = 1/2, M = 1, and the Golomb code corresponds to unary (n ≥ 0 Ps followed by a Q is encoded as n ones followed by a zero). If a simpler code is desired, one can assign Golomb–Rice parameter b (i.e., Golomb parameter M = 2^b) to the integer nearest to −log2(−log2(p)); although not always the best parameter, it is usually the best Rice parameter and its compression performance is quite close to that of the optimal Golomb code. (Rice himself proposed using various codes for the same data to figure out which was best. A later JPL researcher proposed various methods of optimizing or estimating the code parameter. [3])
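Both parameter choices can be computed directly. The following sketch (function names invented for this article) rounds each expression to the nearest integer:

    import math

    def best_golomb_m(p):
        # Nearest integer to -1/log2(p): best Golomb parameter for runs of Ps.
        return max(1, round(-1.0 / math.log2(p)))

    def best_rice_b(p):
        # Nearest integer to -log2(-log2(p)): usually the best Rice parameter.
        return max(0, round(-math.log2(-math.log2(p))))

    print(best_golomb_m(0.99), best_rice_b(0.99))  # -> 69 6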

Consider using a Rice code with a binary portion having b bits to run-length encode sequences where P has a probability p. If k(1 − p)^2 p^(k−1) is the probability that a bit will be part of a k-bit run (k − 1 Ps and one Q) and (b + 1 + floor((k − 1)/2^b))/k is the compression ratio of that run, then the expected compression ratio is

    E[compression ratio] = Σ_{k ≥ 1} [(b + 1 + floor((k − 1)/2^b))/k] · [k(1 − p)^2 p^(k−1)]
                         = (1 − p)^2 Σ_{k ≥ 1} p^(k−1) (b + 1 + floor((k − 1)/2^b)).

Compression is often expressed in terms of 1 − E[compression ratio], the proportion compressed. For p close to 1, this run-length coding approach gives compression ratios close to the entropy limit. For example, using Rice code b = 6 for p = 0.99 yields 91.89% compression, while the entropy limit is 91.92%.
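The 91.89% figure can be checked numerically by truncating the sum above at a large run length; the following short sketch (written for this article, not taken from any codec) does so:

    def expected_compression_ratio(p, b, kmax=10000):
        # Expected (compressed bits) / (original bits) for run-length Rice
        # coding with parameter b, per the truncated sum above.
        total = 0.0
        for k in range(1, kmax + 1):
            code_len = b + 1 + (k - 1) // (1 << b)   # unary quotient + stop bit + remainder bits
            prob = k * (1 - p) ** 2 * p ** (k - 1)   # probability a bit belongs to a k-bit run
            total += prob * code_len / k
        return total

    r = expected_compression_ratio(0.99, 6)
    print("compression: {:.2%}".format(1 - r))       # -> about 91.89%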

Adaptive run-length Golomb–Rice encoding

When a probability distribution for integers is not known, the optimal parameter for a Golomb–Rice encoder cannot be determined. Thus, in many applications, a two-pass approach is used: first, the block of data is scanned to estimate a probability density function (PDF) for the data. The Golomb–Rice parameter is then determined from that estimated PDF. A simpler variation of that approach is to assume that the PDF belongs to a parametrized family, estimate the PDF parameters from the data, and then compute the optimal Golomb–Rice parameter. That is the approach used in most of the applications discussed below.
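As a concrete instance of this idea, one common heuristic (an illustrative assumption here, not a rule taken from any particular codec) derives the Rice parameter from the sample mean, since for a roughly geometric source the optimal 2^b is on the order of the mean value:

    import math

    def estimate_rice_parameter(samples):
        # Heuristic first-pass estimate: choose b so that 2^b is close to the
        # mean of the non-negative samples (assumed roughly geometric).
        mean = sum(samples) / len(samples)
        return max(0, round(math.log2(mean))) if mean > 0 else 0

    print(estimate_rice_parameter([3, 0, 7, 2, 5, 1, 9, 4]))  # -> 2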

An alternative approach to efficiently encode integer data whose PDF is not known, or is varying, is to use a backwards-adaptive encoder. The run-length Golomb–Rice (RLGR) code achieves that using a very simple algorithm that adjusts the Golomb–Rice parameter up or down, depending on the last encoded symbol. A decoder can follow the same rule to track the variation of the encoding parameters, so no side information needs to be transmitted, just the encoded data. Assuming a generalized Gaussian PDF, which covers a wide range of statistics seen in data such as prediction errors or transform coefficients in multimedia codecs, the RLGR encoding algorithm can perform very well in such applications.
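The exact RLGR update rules are not reproduced here. The following simplified sketch (an invented rule, for illustration only) shows the backward-adaptive principle: the Rice parameter is nudged after each symbol using only information the decoder can recompute, so no side information is transmitted:

    def adapt_rice_parameter(k, quotient):
        # Simplified backward adaptation (not the actual RLGR rules): a large
        # unary quotient suggests k is too small; a zero quotient suggests k
        # may be too large. Encoder and decoder apply the same deterministic
        # rule, so they stay in sync without side information.
        if quotient > 1:
            return k + 1
        if quotient == 0 and k > 0:
            return k - 1
        return k

    k = 0
    for symbol in [0, 1, 3, 6, 12, 25]:    # hypothetical symbol stream
        q = symbol >> k                    # quotient under the current parameter
        k = adapt_rice_parameter(k, q)
    print(k)  # -> 4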

Applications

Golomb-coded Rice algorithm experiment compression ratios

Numerous signal codecs use a Rice code for prediction residues. In predictive algorithms, such residues tend to fall into a two-sided geometric distribution, with small residues being more frequent than large residues, and the Rice code closely approximates the Huffman code for such a distribution without the overhead of having to transmit the Huffman table. One signal that does not match a geometric distribution is a sine wave, because the differential residues are themselves sinusoidal rather than geometrically distributed (the highest and lowest residue values occur with similarly high frequency; only the median positive and negative residues occur less often).

Several lossless audio codecs, such as Shorten, [4] FLAC, [5] Apple Lossless, and MPEG-4 ALS, use a Rice code after the linear prediction step (called "adaptive FIR filter" in Apple Lossless). Rice coding is also used in the FELICS lossless image codec.

The Golomb–Rice coder is used in the entropy coding stage of Rice algorithm based lossless image codecs. One such experiment yields the compression ratio graph shown.

The JPEG-LS scheme uses Rice–Golomb coding to encode the prediction residuals.

The run-length Golomb–Rice (RLGR) adaptive version of Golomb–Rice coding, mentioned above, is used for encoding screen content in virtual machines in the RemoteFX component of the Microsoft Remote Desktop Protocol.


References

  1. Gallager, R. G.; van Voorhis, D. C. (1975). "Optimal source codes for geometrically distributed integer alphabets". IEEE Transactions on Information Theory . 21 (2): 228–230. doi:10.1109/tit.1975.1055357.
  2. Merhav, N.; Seroussi, G.; Weinberger, M. J. (2000). "Coding of sources with two-sided geometric distributions and unknown parameters". IEEE Transactions on Information Theory . 46 (1): 229–236. doi:10.1109/18.817520.
  3. Kiely, A. (2004). Selecting the Golomb Parameter in Rice Coding (Technical report). Jet Propulsion Laboratory. 42-159.
  4. "man shorten". Archived from the original on 2014-01-30. Retrieved 2008-12-07.
  5. "FLAC - Format overview". xiph.org.

Further reading