Decoding methods

In coding theory, decoding is the process of translating received messages into codewords of a given code. There are many common methods of mapping messages to codewords; these are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.

Notation

$C \subset \mathbb{F}_2^n$ is considered a binary code with the length $n$; $x$ and $y$ shall be elements of $\mathbb{F}_2^n$; and $d(x,y)$ is the distance between those elements.

Ideal observer decoding

One may be given the message $x \in \mathbb{F}_2^n$, then ideal observer decoding generates the codeword $y \in C$. The process results in this solution:

$$\mathbb{P}(y \text{ sent} \mid x \text{ received})$$

For example, a person can choose the codeword $y$ that is most likely to be received as the message $x$ after transmission.
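To make this concrete, here is a minimal Python sketch of ideal observer decoding, assuming a binary symmetric channel with crossover probability $p$ and a known prior over codewords; the repetition code, the prior, and the function names are illustrative choices for this example only.

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(u != v for u, v in zip(a, b))

def ideal_observer_decode(x, codewords, prior, p=0.1):
    """Pick the codeword y maximising P(y sent | x received) on a binary
    symmetric channel; the posterior is proportional to
    P(x received | y sent) * P(y sent) = p**d * (1-p)**(n-d) * prior[y]."""
    n = len(x)
    def posterior(y):
        d = hamming(x, y)
        return (p ** d) * ((1 - p) ** (n - d)) * prior[y]
    return max(codewords, key=posterior)

# Length-3 repetition code with a strongly non-uniform prior:
codewords = [(0, 0, 0), (1, 1, 1)]
prior = {(0, 0, 0): 0.9, (1, 1, 1): 0.1}
print(ideal_observer_decode((1, 1, 0), codewords, prior, p=0.2))  # (0, 0, 0)
```

Note how the prior can override proximity: the received word is closer to $(1,1,1)$, but the posterior favours $(0,0,0)$ because that codeword is sent far more often.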

Decoding conventions

A received message may not correspond to a unique most-likely codeword: there may be more than one codeword with an equal likelihood of mutating into the received message. In such a case, the sender and receiver(s) must agree ahead of time on a decoding convention. Popular conventions include:

  1. Request that the codeword be resent (automatic repeat-request).
  2. Choose any codeword at random from the set of most likely codewords.
  3. If another code follows, mark the ambiguous bits of the codeword as erasures and hope that the outer code disambiguates them.
  4. Report a decoding failure to the system.

Maximum likelihood decoding

Given a received vector $x \in \mathbb{F}_2^n$, maximum likelihood decoding picks a codeword $y \in C$ that maximizes

$$\mathbb{P}(x \text{ received} \mid y \text{ sent}),$$

that is, the codeword $y$ that maximizes the probability that $x$ was received, given that $y$ was sent. If all codewords are equally likely to be sent then this scheme is equivalent to ideal observer decoding. In fact, by Bayes' theorem,

$$\mathbb{P}(x \text{ received} \mid y \text{ sent}) = \frac{\mathbb{P}(x \text{ received}, \, y \text{ sent})}{\mathbb{P}(y \text{ sent})} = \mathbb{P}(y \text{ sent} \mid x \text{ received}) \cdot \frac{\mathbb{P}(x \text{ received})}{\mathbb{P}(y \text{ sent})}.$$

Upon fixing the received word $x$, $\mathbb{P}(x \text{ received})$ is a constant, and $\mathbb{P}(y \text{ sent})$ is also constant because all codewords are equally likely to be sent. Therefore, $\mathbb{P}(x \text{ received} \mid y \text{ sent})$ is maximised as a function of the variable $y$ precisely when $\mathbb{P}(y \text{ sent} \mid x \text{ received})$ is maximised, and the claim follows.

As with ideal observer decoding, a convention must be agreed to for non-unique decoding.

The maximum likelihood decoding problem can also be modeled as an integer programming problem.[1]

The maximum likelihood decoding algorithm is an instance of the "marginalize a product function" problem which is solved by applying the generalized distributive law.[2]
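For intuition, the following is a brute-force maximum likelihood decoder for a binary symmetric channel, a sketch that enumerates the whole codebook (infeasible for real codes, which is why the methods cited above matter); the name `ml_decode` is invented for this example.

```python
def ml_decode(x, codewords, p=0.1):
    """Return a codeword y maximising P(x received | y sent) on a binary
    symmetric channel: p**d * (1-p)**(n-d), where d is the Hamming
    distance between x and y."""
    n = len(x)
    def likelihood(y):
        d = sum(u != v for u, v in zip(x, y))
        return (p ** d) * ((1 - p) ** (n - d))
    return max(codewords, key=likelihood)

# With equally likely codewords this matches ideal observer decoding:
print(ml_decode((1, 1, 0), [(0, 0, 0), (1, 1, 1)], p=0.2))  # (1, 1, 1)
```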

Minimum distance decoding

Given a received vector $x \in \mathbb{F}_2^n$, minimum distance decoding picks a codeword $y \in C$ to minimise the Hamming distance:

$$d(x,y) = \#\{i : x_i \neq y_i\},$$

i.e. choose the codeword $y$ that is as close as possible to $x$.
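In code, minimum distance decoding is a one-line search over the codebook; this sketch assumes the code is small enough to enumerate, and the function name is illustrative.

```python
def min_distance_decode(x, codewords):
    """Nearest-neighbour decoding: return a codeword y minimising the
    Hamming distance d(x, y) to the received word x."""
    return min(codewords, key=lambda y: sum(u != v for u, v in zip(x, y)))

# Agrees with ml_decode above whenever p < 1/2:
print(min_distance_decode((1, 1, 0), [(0, 0, 0), (1, 1, 1)]))  # (1, 1, 1)
```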

Note that if the probability of error on a discrete memoryless channel $p$ is strictly less than one half, then minimum distance decoding is equivalent to maximum likelihood decoding, since if

$$d(x,y) = d,$$

then:

$$\mathbb{P}(y \text{ received} \mid x \text{ sent}) = (1-p)^{n-d} \cdot p^d = (1-p)^n \cdot \left(\frac{p}{1-p}\right)^d,$$

which (since $p$ is less than one half) is maximised by minimising $d$.

Minimum distance decoding is also known as nearest neighbour decoding. It can be assisted or automated by using a standard array. Minimum distance decoding is a reasonable decoding method when the following conditions are met:

  1. The probability that an error occurs is independent of the position of the symbol.
  2. Errors are independent events: an error at one position in the message does not affect other positions.

These assumptions may be reasonable for transmissions over a binary symmetric channel. They may be unreasonable for other media, such as a DVD, where a single scratch on the disk can cause an error in many neighbouring symbols or codewords.

As with other decoding methods, a convention must be agreed to for non-unique decoding.

Syndrome decoding

Syndrome decoding is a highly efficient method of decoding a linear code over a noisy channel, i.e. one on which errors are made. In essence, syndrome decoding is minimum distance decoding using a reduced lookup table. This is allowed by the linearity of the code.[3]

Suppose that $C \subset \mathbb{F}_2^n$ is a linear code of length $n$, dimension $k$, and minimum distance $d$ with parity-check matrix $H$. Then clearly $C$ is capable of correcting up to

$$t = \left\lfloor \frac{d-1}{2} \right\rfloor$$

errors made by the channel (since if no more than $t$ errors are made then minimum distance decoding will still correctly decode the incorrectly transmitted codeword).

Now suppose that a codeword $x \in C$ is sent over the channel and the error pattern $e \in \mathbb{F}_2^n$ occurs. Then $z = x + e$ is received. Ordinary minimum distance decoding would look up the vector $z$ in a table of size $|C|$ for the nearest match, i.e. an element (not necessarily unique) $c \in C$ with

$$d(c, z) \leq d(y, z)$$

for all $y \in C$. Syndrome decoding takes advantage of the property of the parity-check matrix that:

$$Hy = 0$$

for all $y \in C$. The syndrome of the received $z = x + e$ is defined to be:

$$Hz = H(x + e) = Hx + He = 0 + He = He.$$

To perform ML decoding in a binary symmetric channel, one has to look up a precomputed table of size $2^{n-k}$, mapping $He$ to $e$.

Note that this is already significantly less complex than standard array decoding.

However, under the assumption that no more than $t$ errors were made during transmission, the receiver can look up the value $He$ in a further reduced table of size

$$\sum_{i=0}^{t} \binom{n}{i}.$$
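As a worked sketch of the idea, the following Python fragment (assuming NumPy) builds the reduced syndrome table for the [7,4] Hamming code, chosen purely as a convenient example with $t = 1$; `syndrome_decode` is an illustrative name, not a standard routine.

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code: column i is the binary
# representation of i+1, so a single-bit error in position i produces
# the syndrome "i+1" directly.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

# Reduced table: syndrome -> error pattern, for all patterns of weight
# <= t = 1, i.e. C(7,0) + C(7,1) = 8 entries.
table = {(0, 0, 0): np.zeros(7, dtype=int)}
for i in range(7):
    e = np.zeros(7, dtype=int)
    e[i] = 1
    table[tuple(H @ e % 2)] = e

def syndrome_decode(z):
    """Correct up to one bit error: compute Hz = He, look up e, add it."""
    return (z + table[tuple(H @ z % 2)]) % 2

x = np.array([1, 1, 1, 0, 0, 0, 0])   # a codeword: H @ x % 2 == 0
z = x.copy(); z[4] ^= 1               # the channel flips bit 4
assert (syndrome_decode(z) == x).all()
```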

List decoding

In list decoding, the decoder outputs a list of candidate codewords rather than a single codeword, allowing a greater number of errors to be handled than unique decoding permits.

Information set decoding

This is a family of Las Vegas probabilistic methods, all based on the observation that it is easier to guess enough error-free positions than it is to guess all the error positions.

The simplest form is due to Prange: Let $G$ be the $k \times n$ generator matrix of $C$ used for encoding. Select $k$ columns of $G$ at random, and denote by $G'$ the corresponding $k \times k$ submatrix of $G$. With reasonable probability $G'$ will have full rank, which means that if we let $c'$ be the sub-vector for the corresponding positions of any codeword $c = mG$ of $C$ for a message $m$, we can recover $m$ as $m = c' G'^{-1}$. Hence, if we were lucky that these $k$ positions of the received word $y$ contained no errors, and hence equalled the positions of the sent codeword, then we may decode.

If $t$ errors occurred, the probability of such a fortunate selection of columns is given by $\binom{n-t}{k} / \binom{n}{k}$.
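The following Python sketch shows Prange's method with a retry loop (assuming NumPy; `gf2_inv` and `prange_decode` are names invented for this illustration, and no attempt is made at efficiency).

```python
import numpy as np

def gf2_inv(A):
    """Invert a square 0/1 matrix over GF(2); return None if singular."""
    n = A.shape[0]
    M = np.concatenate([A % 2, np.eye(n, dtype=int)], axis=1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r, col]), None)
        if pivot is None:
            return None                      # not full rank
        M[[col, pivot]] = M[[pivot, col]]    # move pivot row into place
        for r in range(n):
            if r != col and M[r, col]:
                M[r] ^= M[col]               # eliminate over GF(2)
    return M[:, n:]

def prange_decode(G, y, tries=1000, rng=None):
    """Guess k hopefully error-free positions, invert the submatrix G',
    recover a candidate message m = y' G'^{-1}, and keep the candidate
    whose re-encoding mG is closest to the received word y."""
    rng = rng or np.random.default_rng()
    k, n = G.shape
    best_m, best_d = None, n + 1
    for _ in range(tries):
        idx = rng.choice(n, size=k, replace=False)
        G_inv = gf2_inv(G[:, idx])
        if G_inv is None:
            continue                         # singular selection; retry
        m = y[idx] @ G_inv % 2
        d = int(((m @ G % 2) != y).sum())
        if d < best_d:
            best_m, best_d = m, d
    return best_m
```

Each iteration costs about one $k \times k$ inversion over GF(2); the refinements cited below improve the success probability per iteration, for instance by tolerating a few errors within the selected positions.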

This method has been improved in various ways, e.g. by Stern[4] and by Canteaut and Sendrier.[5]

Partial response maximum likelihood

Partial response maximum likelihood (PRML) is a method for converting the weak analog signal from the head of a magnetic disk or tape drive into a digital signal.

Viterbi decoder

A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been encoded using forward error correction based on a convolutional code. The Hamming distance is used as a metric for hard decision Viterbi decoders. The squared Euclidean distance is used as a metric for soft decision decoders.
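As an illustration, here is a compact hard-decision Viterbi decoder for the classic rate-1/2, constraint-length-3 convolutional code with octal generators (7, 5), using the Hamming distance as the branch metric; this is a teaching sketch (no trellis termination or windowed traceback), and the function names are invented for this example.

```python
def conv_encode(bits, state=0):
    """Rate-1/2 convolutional encoder, generators 7 and 5 (octal)."""
    out = []
    for u in bits:
        reg = (u << 2) | state                    # register (u, b1, b0)
        out += [bin(reg & 0b111).count("1") & 1,  # generator 111
                bin(reg & 0b101).count("1") & 1]  # generator 101
        state = reg >> 1                          # next state (u, b1)
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoding with Hamming branch metrics."""
    metrics, paths = {0: 0}, {0: []}              # start in state 00
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metrics, new_paths = {}, {}
        for state, metric in metrics.items():
            for u in (0, 1):                      # try both input bits
                reg = (u << 2) | state
                expected = [bin(reg & 0b111).count("1") & 1,
                            bin(reg & 0b101).count("1") & 1]
                bm = sum(a != b for a, b in zip(r, expected))
                nxt, cand = reg >> 1, metric + bm
                if cand < new_metrics.get(nxt, float("inf")):
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[state] + [u]
        metrics, paths = new_metrics, new_paths
    return paths[min(metrics, key=metrics.get)]   # best surviving path

msg = [1, 0, 1, 1, 0]
rx = conv_encode(msg)
rx[3] ^= 1                                        # one channel bit error
assert viterbi_decode(rx) == msg
```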

Optimal decision decoding algorithm (ODDA)

An optimal decision decoding algorithm (ODDA) has been proposed for an asymmetric two-way relay channel (TWRC) system.[6]

Related Research Articles

In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences. It is named after the American mathematician Richard Hamming.

Reed–Solomon codes are a group of error-correcting codes that were introduced by Irving S. Reed and Gustave Solomon in 1960. They have many applications, the most prominent of which include consumer technologies such as MiniDiscs, CDs, DVDs, Blu-ray discs, QR codes, data transmission technologies such as DSL and WiMAX, broadcast systems such as satellite communications, DVB and ATSC, and storage systems such as RAID 6.

A binary symmetric channel is a common communications channel model used in coding theory and information theory. In this model, a transmitter wishes to send a bit, and the receiver will receive a bit. The bit will be "flipped" with a "crossover probability" of p, and otherwise is received correctly. This model can be applied to varied communication channels such as telephone lines or disk drive storage.

Additive white Gaussian noise (AWGN) is a basic noise model used in information theory to mimic the effect of many random processes that occur in nature.

In coding theory, block codes are a large and important family of error-correcting codes that encode data in blocks. There is a vast number of examples for block codes, many of which have a wide range of practical applications. The abstract definition of block codes is conceptually useful because it allows coding theorists, mathematicians, and computer scientists to study the limitations of all block codes in a unified way. Such limitations often take the form of bounds that relate different parameters of the block code to each other, such as its rate and its ability to detect and correct errors.

In coding theory, a linear code is an error-correcting code for which any linear combination of codewords is also a codeword. Linear codes are traditionally partitioned into block codes and convolutional codes, although turbo codes can be seen as a hybrid of these two types. Linear codes allow for more efficient encoding and decoding algorithms than other codes.

In coding theory, the Singleton bound, named after Richard Collom Singleton, is a relatively crude upper bound on the size of an arbitrary block code with block length $n$, size $M$ and minimum distance $d$. It is also known as the Joshi bound, as it was proved by Joshi (1958) and even earlier by Komamiya (1953).

In mathematics and computer science, in the field of coding theory, the Hamming bound is a limit on the parameters of an arbitrary block code: it is also known as the sphere-packing bound or the volume bound from an interpretation in terms of packing balls in the Hamming metric into the space of all possible words. It gives an important limitation on the efficiency with which any error-correcting code can utilize the space in which its code words are embedded. A code that attains the Hamming bound is said to be a perfect code.

Reed–Muller codes are error-correcting codes that are used in wireless communications applications, particularly in deep-space communication. Moreover, the proposed 5G standard relies on the closely related polar codes for error correction in the control channel. Due to their favorable theoretical and mathematical properties, Reed–Muller codes have also been extensively studied in theoretical computer science.

In information theory, the noisy-channel coding theorem establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data nearly error-free up to a computable maximum rate through the channel. This result was presented by Claude Shannon in 1948 and was based in part on earlier work and ideas of Harry Nyquist and Ralph Hartley.

The Hadamard code is an error-correcting code named after Jacques Hadamard that is used for error detection and correction when transmitting messages over very noisy or unreliable channels. In 1971, the code was used to transmit photos of Mars back to Earth from the NASA space probe Mariner 9. Because of its unique mathematical properties, the Hadamard code is not only used by engineers, but also intensely studied in coding theory, mathematics, and theoretical computer science. The Hadamard code is also known under the names Walsh code, Walsh family, and Walsh–Hadamard code in recognition of the American mathematician Joseph Leonard Walsh.

In coding theory, concatenated codes form a class of error-correcting codes that are derived by combining an inner code and an outer code. They were conceived in 1966 by Dave Forney as a solution to the problem of finding a code that has both exponentially decreasing error probability with increasing block length and polynomial-time decoding complexity. Concatenated codes became widely used in space communications in the 1970s.

In coding theory, list decoding is an alternative to unique decoding of error-correcting codes for large error rates. The notion was proposed by Elias in the 1950s. The main idea behind list decoding is that the decoding algorithm instead of outputting a single possible message outputs a list of possibilities one of which is correct. This allows for handling a greater number of errors than that allowed by unique decoding.

In coding theory, a standard array is a $q^{n-k} \times q^k$ array that lists all elements of a particular $\mathbb{F}_q^n$ vector space. Standard arrays are used to decode linear codes; i.e. to find the corresponding codeword for any received vector.

A locally decodable code (LDC) is an error-correcting code that allows a single bit of the original message to be decoded with high probability by only examining a small number of bits of a possibly corrupted codeword. This property could be useful, say, in a context where information is being transmitted over a noisy channel, and only a small subset of the data is required at a particular time and there is no need to decode the entire message at once. Note that locally decodable codes are not a subset of locally testable codes, though there is some overlap between the two.

The Gilbert–Varshamov bound for linear codes is related to the general Gilbert–Varshamov bound, which gives a lower bound on the maximal number of elements in an error-correcting code of a given block length and minimum Hamming weight over a field $\mathbb{F}_q$. This may be translated into a statement about the maximum rate of a code with given length and minimum distance. The Gilbert–Varshamov bound for linear codes asserts the existence of q-ary linear codes for any relative minimum distance less than the given bound that simultaneously have high rate. The existence proof uses the probabilistic method, and thus is not constructive. The Gilbert–Varshamov bound is the best known in terms of relative distance for codes over alphabets of size less than 49. For larger alphabets, algebraic geometry codes sometimes achieve an asymptotically better rate vs. distance tradeoff than is given by the Gilbert–Varshamov bound.

In coding theory, generalized minimum-distance (GMD) decoding provides an efficient algorithm for decoding concatenated codes, which is based on using an errors-and-erasures decoder for the outer code.

In coding theory, folded Reed–Solomon codes are like Reed–Solomon codes, obtained by mapping Reed–Solomon codewords over a larger alphabet by careful bundling of codeword symbols.

In coding theory, burst error-correcting codes employ methods of correcting burst errors, which are errors that occur in many consecutive bits rather than occurring in bits independently of each other.

In coding theory, Zemor's algorithm, designed and developed by Gilles Zemor, is a recursive low-complexity approach to code construction. It is an improvement over the algorithm of Sipser and Spielman.

References

  1. Feldman, Jon; Wainwright, Martin J.; Karger, David R. (March 2005). "Using Linear Programming to Decode Binary Linear Codes". IEEE Transactions on Information Theory. 51 (3): 954–972. doi:10.1109/TIT.2004.842696. S2CID 3120399.
  2. Aji, Srinivas M.; McEliece, Robert J. (March 2000). "The Generalized Distributive Law" (PDF). IEEE Transactions on Information Theory. 46 (2): 325–343. doi:10.1109/18.825794.
  3. Beutelspacher, Albrecht; Rosenbaum, Ute (1998). Projective Geometry. Cambridge University Press. p. 190. ISBN 0-521-48277-1.
  4. Stern, Jacques (1989). "A method for finding codewords of small weight". Coding Theory and Applications. Lecture Notes in Computer Science. Vol. 388. Springer-Verlag. pp. 106–113. doi:10.1007/BFb0019850. ISBN 978-3-540-51643-9.
  5. Ohta, Kazuo; Pei, Dingyi, eds. (1998). Advances in Cryptology — ASIACRYPT'98. Lecture Notes in Computer Science. Vol. 1514. pp. 187–199. doi:10.1007/3-540-49649-1. ISBN 978-3-540-65109-3. S2CID 37257901.
  6. Ghadimi, Siamack (2020). "Optimal decision decoding algorithm (ODDA) for an asymmetric TWRC system". Universal Journal of Electrical and Electronic Engineering.
