Expander code

Expander codes
[Figure: Tanner graph example, a bipartite expander graph]
Classification
Type: Linear block code
Block length: $n$
Message length: $n - m$
Rate: $1 - m/n$
Distance: $2(1-\varepsilon)\gamma n$
Alphabet size: $2$
Notation: $[n, n-m]_2$-code

In coding theory, expander codes form a class of error-correcting codes that are constructed from bipartite expander graphs. Along with Justesen codes, expander codes are of particular interest since they have a constant positive rate, a constant positive relative distance, and a constant alphabet size. In fact, the alphabet contains only two elements, so expander codes belong to the class of binary codes. Furthermore, expander codes can be both encoded and decoded in time proportional to the block length of the code.


Expander codes

In coding theory, an expander code is a $[n, n-m]_2$ linear block code whose parity check matrix is the adjacency matrix of a bipartite expander graph. These codes have good relative distance $2(1-\varepsilon)\gamma$ (where $\varepsilon$ and $\gamma$ are properties of the expander graph, defined below), rate at least $1 - \tfrac{m}{n}$, and decodability (algorithms of running time $O(n)$ exist).

Definition

Let $G$ be a $(d, c)$-biregular graph between a set of $n$ nodes $\{v_1, \ldots, v_n\}$, called variables, each of degree $d$, and a set of $m = \tfrac{dn}{c}$ nodes $\{C_1, \ldots, C_m\}$, called constraints, each of degree $c$.

Let $b(i,j)$ be a function designed so that, for each constraint $C_i$, the variables neighboring $C_i$ are $v_{b(i,1)}, \ldots, v_{b(i,c)}$.

Let $\mathcal{S}$ be an error-correcting code of block length $c$. The expander code $\mathcal{C}(G, \mathcal{S})$ is the code of block length $n$ whose codewords are the words $(x_1, \ldots, x_n)$ such that, for $1 \le i \le m$, $(x_{b(i,1)}, \ldots, x_{b(i,c)})$ is a codeword of $\mathcal{S}$. [1] In the important special case where $\mathcal{S}$ is the parity-check code (the set of all even-weight words of length $c$), a constraint $C_i$ is satisfied exactly when an even number of its neighboring variables equal 1, and $\mathcal{C}(G, \mathcal{S})$ is the binary linear code whose $m \times n$ parity check matrix is the bipartite adjacency matrix of $G$, with rows indexed by constraints and columns by variables. This special case is the one analyzed in the remainder of the article.
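The following Python sketch only illustrates the definition just given; the adjacency representation and the names `is_expander_codeword`, `even_parity`, and `b` are assumptions made for this example, not notation from the construction.

    # Illustrative sketch of the definition above. The graph is represented by b:
    # for each constraint i, b[i] lists the indices of its c neighboring variables.

    def is_expander_codeword(x, b, in_inner_code):
        """Return True iff the word x (length n) satisfies every constraint, i.e.
        its restriction to the neighborhood of each C_i lies in the inner code S."""
        return all(in_inner_code([x[j] for j in nbrs]) for nbrs in b)

    # Special case used in the rest of the article: S is the parity-check code,
    # so a constraint is satisfied iff its neighborhood has even parity.
    def even_parity(symbols):
        return sum(symbols) % 2 == 0

    # Toy graph (purely illustrative): 4 variables, 2 constraints of degree 2.
    b = [[0, 1], [2, 3]]
    print(is_expander_codeword([1, 1, 0, 0], b, even_parity))  # True
    print(is_expander_codeword([1, 0, 0, 0], b, even_parity))  # False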

It has been shown that nontrivial lossless expander graphs exist. Moreover, we can explicitly construct them. [2]

Rate

The rate of $C$ is its dimension divided by its block length. In this case, the parity check matrix has size $m \times n$, so $C$ has dimension at least $n - m$ and hence rate at least $\tfrac{n-m}{n} = 1 - \tfrac{m}{n}$.
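As a worked instance (the degrees chosen here are illustrative, not taken from the article), a $(d, c)$-biregular graph has $m = \tfrac{dn}{c}$ constraints, so

$$\operatorname{rate}(C) \;\ge\; 1 - \frac{m}{n} \;=\; 1 - \frac{d}{c},$$

which for $d = 5$ and $c = 10$ guarantees rate at least $\tfrac{1}{2}$.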

Distance

Call the graph $G$ a $(n, m, d, \gamma, 1-\varepsilon)$ expander if every subset $S$ of variables with $|S| \le \gamma n$ has at least $d(1-\varepsilon)|S|$ distinct neighbors among the constraints. Suppose $\varepsilon < \tfrac{1}{2}$. Then the distance of a $(n, m, d, \gamma, 1-\varepsilon)$ expander code $C$ is at least $2(1-\varepsilon)\gamma n$.

Proof

Note that we can consider every codeword $c$ in $C$ as a subset of vertices $S \subset L$, by saying that vertex $v_i \in S$ if and only if the $i$-th index of the codeword is a 1. Then $c$ is a codeword iff every vertex $v \in R$ is adjacent to an even number of vertices in $S$. (In order to be a codeword, $Pc = 0$, where $P$ is the $m \times n$ parity check matrix and $c$ is viewed as a column vector. Each vertex in $L$ corresponds to a column of $P$ and each vertex in $R$ to a row; matrix multiplication over $\mathrm{GF}(2) = \{0, 1\}$ then gives the desired result.) So, if a vertex $v \in R$ is adjacent to a single vertex in $S$, we know immediately that $c$ is not a codeword. Let $N(S)$ denote the neighbors of $S$ in $R$, and $U(S)$ denote those neighbors of $S$ which are unique, i.e., adjacent to a single vertex of $S$.

Lemma 1

For every $S \subset L$ of size $|S| \le \gamma n$, we have $d|S| \ge |N(S)| \ge |U(S)| \ge d(1 - 2\varepsilon)|S|$.

Proof

Trivially $|N(S)| \ge |U(S)|$, since $v \in U(S)$ implies $v \in N(S)$. The bound $|N(S)| \le d|S|$ follows since the degree of every vertex in $S$ is $d$. By the expansion property of the graph, among the $d|S|$ edges leaving $S$ there must be a set of at least $d(1-\varepsilon)|S|$ edges which go to distinct vertices. The remaining at most $d\varepsilon|S|$ edges can make at most $d\varepsilon|S|$ of those neighbors non-unique, so $|U(S)| \ge d(1-\varepsilon)|S| - d\varepsilon|S| = d(1 - 2\varepsilon)|S|$.

Corollary

Every sufficiently small nonempty $S$, namely one with $|S| \le \gamma n$, has a unique neighbor. This follows since $\varepsilon < \tfrac{1}{2}$ makes the lower bound $d(1 - 2\varepsilon)|S|$ of Lemma 1 strictly positive.

Lemma 2

Every nonempty subset $T \subset L$ with $|T| < 2(1-\varepsilon)\gamma n$ has a unique neighbor.

Proof

Lemma 1 proves the case $|T| \le \gamma n$, so suppose $\gamma n < |T| < 2(1-\varepsilon)\gamma n$. Let $S \subset T$ be such that $|S| = \gamma n$. By Lemma 1, we know that $|U(S)| \ge d(1-2\varepsilon)|S| = d(1-2\varepsilon)\gamma n$. Then a vertex $v \in U(S)$ is in $U(T)$ iff $v$ has no neighbor in $T \setminus S$, and we know that $|T \setminus S| < 2(1-\varepsilon)\gamma n - \gamma n = (1-2\varepsilon)\gamma n$, so by the first part of Lemma 1, we know $|N(T \setminus S)| \le d|T \setminus S| < d(1-2\varepsilon)\gamma n$. Since $|U(S)| \ge d(1-2\varepsilon)\gamma n > |N(T \setminus S)|$, some vertex of $U(S)$ has no neighbor in $T \setminus S$, and hence $U(T)$ is not empty.

Corollary

Note that if a $T \subset L$ has at least one unique neighbor, i.e. $|U(T)| > 0$, then the word corresponding to $T$ cannot be a codeword, as it does not multiply to the all-zeros vector by the parity check matrix. By the previous argument, every nonzero codeword corresponds to a set $T$ of size at least $2(1-\varepsilon)\gamma n$, i.e. has Hamming weight at least $2(1-\varepsilon)\gamma n$. Since $C$ is linear, we conclude that $C$ has distance at least $2(1-\varepsilon)\gamma n$.

Encoding

The encoding time for an expander code is upper bounded by that of a general linear code, $O(n^2)$ by matrix multiplication with a generator matrix. A result due to Spielman shows that encoding is possible in $O(n)$ time. [3]

Decoding

Decoding of expander codes is possible in $O(n)$ time when $\varepsilon < \tfrac{1}{4}$, using the following algorithm.

Let $v_i$ be the vertex of $L$ that corresponds to the $i$-th index in the codewords of $C$. Let $y$ be a received word, and $V(y) = \{ v_i \mid \text{the } i\text{-th index of } y \text{ is } 1 \}$. For each $i$, let $e(i)$ be the number of satisfied constraints adjacent to $v_i$ (those adjacent to an even number of vertices of $V(y)$), and let $o(i)$ be the number of unsatisfied constraints adjacent to $v_i$ (those adjacent to an odd number). Then consider the greedy algorithm:


Input: received word $y$.

    initialize y' to y
    while there is a v in R adjacent to an odd number of vertices in V(y')
        if there is an i such that o(i) > e(i)
            flip entry i in y'
        else
            fail

Output: fail, or modified codeword $y'$.
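For concreteness, here is a minimal Python sketch of the flipping algorithm above. The data layout and the name `flip_decode` are assumptions made for this example: `checks[j]` lists the variable indices adjacent to constraint $C_j$, and `var_checks[i]` lists the constraint indices adjacent to variable $v_i$.

    def flip_decode(y, checks, var_checks):
        """Greedy bit-flipping decoding of a received 0/1 word y (as a list).
        Returns the corrected word, or None on failure (no variable has more
        unsatisfied than satisfied neighboring constraints)."""
        y = list(y)
        # A constraint is unsatisfied iff it sees an odd number of ones.
        unsat = {j for j, nbrs in enumerate(checks)
                 if sum(y[i] for i in nbrs) % 2 == 1}
        while unsat:  # some constraint is adjacent to an odd number of ones
            for i, nbrs in enumerate(var_checks):
                o = sum(1 for j in nbrs if j in unsat)  # unsatisfied neighbors o(i)
                e = len(nbrs) - o                       # satisfied neighbors e(i)
                if o > e:
                    y[i] ^= 1                           # flip entry i in y'
                    # flipping v_i toggles the parity of every adjacent constraint
                    unsat.symmetric_difference_update(nbrs)
                    break
            else:
                return None  # fail
        return y

This sketch rescans all variables before each flip, so it spends $O(n)$ time per iteration; the Complexity section below describes the bookkeeping that brings this down to constant time per iteration.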


Proof

We show first the correctness of the algorithm, and then examine its running time.

Correctness

We must show that the algorithm terminates with the correct codeword when the received word is within half the code's distance of the original codeword. Let the set of corrupt variables be $S$, with $s = |S|$; the vertices of $R$ adjacent to an odd number of vertices of $V(y')$ are called unsatisfied. The following lemma will prove useful.

Lemma 3

If $0 < s < \gamma n$, then there is a vertex $v_i \in S$ with $o(i) > e(i)$.

Proof

By Lemma 1, we know that $|U(S)| \ge d(1-2\varepsilon)s$. So an average vertex of $S$ has at least $d(1-2\varepsilon) > \tfrac{d}{2}$ unique neighbors (recall that unique neighbors of $S$ are unsatisfied and hence contribute to $o(i)$), since $\varepsilon < \tfrac{1}{4}$, and thus there is a vertex $v_i \in S$ with $o(i) > e(i)$.
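Spelling out the averaging step as a chain of inequalities:

$$\sum_{v_i \in S} o(i) \;\ge\; |U(S)| \;\ge\; d(1-2\varepsilon)\,s \;>\; \frac{d}{2}\,s,$$

so some $v_i \in S$ satisfies $o(i) > \tfrac{d}{2}$, and since $o(i) + e(i) = d$ this means $o(i) > e(i)$.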

So, if we have not yet reached a codeword, then there will always be some vertex to flip. Next, we show that the number of errors can never increase beyond $\gamma n$.

Lemma 4

If we start with $s < \gamma n / 2$, then we never reach $s \ge \gamma n$ at any point in the algorithm.

Proof

When we flip a vertex $v_i$, $o(i)$ and $e(i)$ are interchanged, and since we had $o(i) > e(i)$, the number of unsatisfied vertices on the right decreases by at least one after each flip. Since $s < \gamma n / 2$ initially, the initial number of unsatisfied vertices is at most $ds < d\gamma n / 2$, because every variable has degree $d$, and this number never increases during the algorithm. If the number of errors ever reached $\gamma n$ (it changes by one per flip, so it would first have to equal exactly $\gamma n$), then by Lemma 1 there would be at least $d(1-2\varepsilon)\gamma n > d\gamma n / 2$ unique neighbors, using $\varepsilon < \tfrac{1}{4}$, and hence more than $d\gamma n / 2$ unsatisfied vertices, a contradiction.

Lemmas 3 and 4 show us that if we start with $s < \gamma n / 2$ (fewer errors than half the distance of $C$), then we will always find a vertex $v_i$ to flip. Each flip reduces the number of unsatisfied vertices in $R$ by at least 1, and hence the algorithm terminates in at most $m$ steps, and it terminates at some codeword, by Lemma 3 (were it not at a codeword, there would be some vertex to flip). Lemma 4 shows us that we can never be farther than $\gamma n$ away from the correct codeword. Since the code has distance $2(1-\varepsilon)\gamma n > \gamma n$ (because $\varepsilon < \tfrac{1}{2}$), the codeword it terminates on must be the correct codeword: any other codeword would differ from the correct one in at least $2(1-\varepsilon)\gamma n$ positions, farther than the algorithm could have traveled.

Complexity

We now show that the algorithm can achieve linear-time decoding. Let $d$ be constant, and let $c$, the maximum degree of any vertex in $R$, also be constant; this holds for known constructions.

  1. Pre-processing: It takes $O(mc)$ time to compute, for each vertex in $R$, whether it is adjacent to an odd or even number of vertices of $V(y)$.
  2. Pre-processing 2: We take $O(nd)$ time to compute, for each $i$, the counts $o(i)$ and $e(i)$, and a list of the vertices $v_i$ in $L$ with $o(i) > e(i)$.
  3. Each iteration: We simply remove and flip the first list element. To update the list of odd / even vertices in $R$, we need only update the $d$ entries for the constraints adjacent to the flipped variable, inserting / removing as necessary. We then update the entries of the at most $dc$ variables adjacent to those constraints in the list of vertices of $L$ with more odd than even neighbors, inserting / removing as necessary. Thus each iteration takes $O(dc)$ time, which is constant.
  4. As argued above, the total number of iterations is at most $m$.

This gives a total runtime of $O(mc + nd + m \cdot dc) = O(n)$, since $d$ and $c$ are constants and $m = dn/c = O(n)$.
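The following Python sketch implements the bookkeeping just described, as a refinement of the `flip_decode` sketch above (again, the data layout and the name `flip_decode_linear` are assumptions made for the example). It maintains the parity of each constraint and the count $o(i)$ of each variable, so a flip only touches the $d$ constraints adjacent to the flipped variable and the at most $dc$ variables adjacent to those constraints.

    def flip_decode_linear(y, checks, var_checks):
        """Bit-flipping decoder with incremental bookkeeping: constant work
        per flip when the degrees d and c are constants."""
        y = list(y)
        # Pre-processing 1: parity (odd / even) of every constraint.
        odd = [sum(y[i] for i in nbrs) % 2 == 1 for nbrs in checks]
        num_unsat = sum(odd)
        # Pre-processing 2: o(i) for every variable, and the set of variables
        # with more odd (unsatisfied) than even neighboring constraints.
        o = [sum(1 for j in var_checks[i] if odd[j]) for i in range(len(y))]
        flippable = {i for i in range(len(y)) if 2 * o[i] > len(var_checks[i])}

        while num_unsat > 0:
            if not flippable:
                return None                    # fail
            i = flippable.pop()
            y[i] ^= 1                          # flip entry i
            for j in var_checks[i]:            # the d constraints adjacent to v_i
                odd[j] = not odd[j]
                delta = 1 if odd[j] else -1
                num_unsat += delta
                for k in checks[j]:            # the <= c variables adjacent to C_j
                    o[k] += delta
                    if 2 * o[k] > len(var_checks[k]):
                        flippable.add(k)
                    else:
                        flippable.discard(k)
        return y

With $d$ and $c$ constant, the two pre-processing passes cost $O(mc + nd)$ and each of the at most $m$ iterations costs $O(dc)$, matching the $O(n)$ bound above.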


Notes

This article is based on Dr. Venkatesan Guruswami's course notes. [4]


References

  1. Sipser, M.; Spielman, D.A. (1996). "Expander codes". IEEE Transactions on Information Theory. 42 (6): 1710–1722. doi:10.1109/18.556667.
  2. Capalbo, M.; Reingold, O.; Vadhan, S.; Wigderson, A. (2002). "Randomness conductors and constant-degree lossless expanders". STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing. ACM. pp. 659–668. doi:10.1145/509907.510003. ISBN 978-1-58113-495-7. S2CID 1918841.
  3. Spielman, D. (1996). "Linear-time encodable and decodable error-correcting codes". IEEE Transactions on Information Theory. 42 (6): 1723–31. CiteSeerX 10.1.1.47.2736. doi:10.1109/18.556668.
  4. Guruswami, V. (15 November 2006). "Lecture 13: Expander Codes" (PDF). CSE 533: Error-Correcting Codes. University of Washington.
    Guruswami, V. (March 2010). "Notes 8: Expander Codes and their decoding" (PDF). Introduction to Coding Theory. Carnegie Mellon University.
    Guruswami, V. (September 2004). "Guest column: error-correcting codes and expander graphs". ACM SIGACT News. 35 (3): 25–41. doi:10.1145/1027914.1027924. S2CID 17550280.