In statistics and coding theory, a Hamming space is usually the set of all binary strings of length N, where different binary strings are considered to be adjacent when they differ only in one position. The total distance between any two binary strings is then the total number of positions at which the corresponding bits are different, called the Hamming distance. [1] [2] Hamming spaces are named after American mathematician Richard Hamming, who introduced the concept in 1950. [3] They are used in the theory of coding signals and transmission.
More generally, a Hamming space can be defined over any alphabet (set) Q as the set of words of a fixed length N with letters from Q. [4] [5] If Q is a finite field, then a Hamming space over Q is an N-dimensional vector space over Q. In the typical, binary case, the field is thus GF(2) (also denoted by Z2). [4]
In coding theory, if Q has q elements, then any subset C (usually assumed of cardinality at least two) of the N-dimensional Hamming space over Q is called a q-ary code of length N; the elements of C are called codewords . [4] [5] In the case where C is a linear subspace of its Hamming space, it is called a linear code. [4] A typical example of linear code is the Hamming code. Codes defined via a Hamming space necessarily have the same length for every codeword, so they are called block codes when it is necessary to distinguish them from variable-length codes that are defined by unique factorization on a monoid.
The Hamming distance endows a Hamming space with a metric, which is essential in defining basic notions of coding theory such as error detecting and error correcting codes. [4]
Hamming spaces over non-field alphabets have also been considered, especially over finite rings (most notably over Z4) giving rise to modules instead of vector spaces and ring-linear codes (identified with submodules) instead of linear codes. The typical metric used in this case the Lee distance. There exist a Gray isometry between (i.e. GF(22m)) with the Hamming distance and (also denoted as GR(4,m)) with the Lee distance. [6] [7] [8]
In information theory, the Hamming distance between two strings or vectors of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or equivalently, the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences. It is named after the American mathematician Richard Hamming.
Coding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are studied by various scientific disciplines—such as information theory, electrical engineering, mathematics, linguistics, and computer science—for the purpose of designing efficient and reliable data transmission methods. This typically involves the removal of redundancy and the correction or detection of errors in the transmitted data.
In mathematics and electronics engineering, a binary Golay code is a type of linear error-correcting code used in digital communications. The binary Golay code, along with the ternary Golay code, has a particularly deep and interesting connection to the theory of finite sporadic groups in mathematics. These codes are named in honor of Marcel J. E. Golay whose 1949 paper introducing them has been called, by E. R. Berlekamp, the "best single published page" in coding theory.
In coding theory, block codes are a large and important family of error-correcting codes that encode data in blocks. There is a vast number of examples for block codes, many of which have a wide range of practical applications. The abstract definition of block codes is conceptually useful because it allows coding theorists, mathematicians, and computer scientists to study the limitations of all block codes in a unified way. Such limitations often take the form of bounds that relate different parameters of the block code to each other, such as its rate and its ability to detect and correct errors.
In coding theory, the weight enumerator polynomial of a binary linear code specifies the number of words of each possible Hamming weight.
In coding theory, the dual code of a linear code
In coding theory, a linear code is an error-correcting code for which any linear combination of codewords is also a codeword. Linear codes are traditionally partitioned into block codes and convolutional codes, although turbo codes can be seen as a hybrid of these two types. Linear codes allow for more efficient encoding and decoding algorithms than other codes.
In coding theory, the Singleton bound, named after Richard Collom Singleton, is a relatively crude upper bound on the size of an arbitrary block code with block length , size and minimum distance . It is also known as the Joshibound. proved by Joshi (1958) and even earlier by Komamiya (1953).
In mathematics and computer science, in the field of coding theory, the Hamming bound is a limit on the parameters of an arbitrary block code: it is also known as the sphere-packing bound or the volume bound from an interpretation in terms of packing balls in the Hamming metric into the space of all possible words. It gives an important limitation on the efficiency with which any error-correcting code can utilize the space in which its code words are embedded. A code that attains the Hamming bound is said to be a perfect code.
In coding theory, decoding is the process of translating received messages into codewords of a given code. There have been many common methods of mapping messages to codewords. These are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.
Reed–Muller codes are error-correcting codes that are used in wireless communications applications, particularly in deep-space communication. Moreover, the proposed 5G standard relies on the closely related polar codes for error correction in the control channel. Due to their favorable theoretical and mathematical properties, Reed–Muller codes have also been extensively studied in theoretical computer science.
In coding theory, a cyclic code is a block code, where the circular shifts of each codeword gives another word that belongs to the code. They are error-correcting codes that have algebraic properties that are convenient for efficient error detection and correction.
In coding theory, a generator matrix is a matrix whose rows form a basis for a linear code. The codewords are all of the linear combinations of the rows of this matrix, that is, the linear code is the row space of its generator matrix.
The Hadamard code is an error-correcting code named after the French mathematician Jacques Hadamard that is used for error detection and correction when transmitting messages over very noisy or unreliable channels. In 1971, the code was used to transmit photos of Mars back to Earth from the NASA space probe Mariner 9. Because of its unique mathematical properties, the Hadamard code is not only used by engineers, but also intensely studied in coding theory, mathematics, and theoretical computer science. The Hadamard code is also known under the names Walsh code, Walsh family, and Walsh–Hadamard code in recognition of the American mathematician Joseph Leonard Walsh.
In coding theory, the Lee distance is a distance between two strings and of equal length n over the q-ary alphabet {0, 1, …, q − 1} of size q ≥ 2. It is a metric defined as If q = 2 or q = 3 the Lee distance coincides with the Hamming distance, because both distances are 0 for two single equal symbols and 1 for two single non-equal symbols. For q > 3 this is not the case anymore; the Lee distance between single letters can become bigger than 1. However, there exists a Gray isometry between with the Lee weight and with the Hamming weight.
In coding theory, Justesen codes form a class of error-correcting codes that have a constant rate, constant relative distance, and a constant alphabet size.
A locally decodable code (LDC) is an error-correcting code that allows a single bit of the original message to be decoded with high probability by only examining a small number of bits of a possibly corrupted codeword. This property could be useful, say, in a context where information is being transmitted over a noisy channel, and only a small subset of the data is required at a particular time and there is no need to decode the entire message at once. Note that locally decodable codes are not a subset of locally testable codes, though there is some overlap between the two.
DNA code construction refers to the application of coding theory to the design of nucleic acid systems for the field of DNA–based computation.
AN codes are error-correcting code that are used in arithmetic applications. Arithmetic codes were commonly used in computer processors to ensure the accuracy of its arithmetic operations when electronics were more unreliable. Arithmetic codes help the processor to detect when an error is made and correct it. Without these codes, processors would be unreliable since any errors would go undetected. AN codes are arithmetic codes that are named for the integers and that are used to encode and decode the codewords.
The Gilbert–Varshamov bound for linear codes is related to the general Gilbert–Varshamov bound, which gives a lower bound on the maximal number of elements in an error-correcting code of a given block length and minimum Hamming weight over a field . This may be translated into a statement about the maximum rate of a code with given length and minimum distance. The Gilbert–Varshamov bound for linear codes asserts the existence of q-ary linear codes for any relative minimum distance less than the given bound that simultaneously have high rate. The existence proof uses the probabilistic method, and thus is not constructive. The Gilbert–Varshamov bound is the best known in terms of relative distance for codes over alphabets of size less than 49. For larger alphabets, algebraic geometry codes sometimes achieve an asymptotically better rate vs. distance tradeoff than is given by the Gilbert–Varshamov bound.