Computationally bounded adversary

Last updated

In information theory, the computationally bounded adversary problem is a different way of looking at the problem of sending data over a noisy channel. In previous models the best that could be done was ensuring correct decoding for up to d/2 errors, where d was the Hamming distance of the code. The problem with doing it this way is that it does not take into consideration the actual amount of computing power available to the adversary. Rather, it only concerns itself with how many bits of a given code word can change and still have the message decode properly. In the computationally bounded adversary model the channel – the adversary – is restricted to only being able to perform a reasonable amount of computation to decide which bits of the code word need to change. In other words, this model does not need to consider how many errors can possibly be handled, but only how many errors could possibly be introduced given a reasonable amount of computing power on the part of the adversary. Once the channel has been given this restriction it becomes possible to construct codes that are both faster to encode and decode compared to previous methods that can also handle a large number of errors.

Contents

Comparison to other models

Worst-case model

At first glance, the worst-case model seems intuitively ideal. The guarantee that an algorithm will succeed no matter what is, of course, highly alluring. However, it demands too much. A real-life adversary cannot spend an indefinite amount of time examining a message in order to find the one error pattern which an algorithm would struggle with.

As a comparison, consider the Quicksort algorithm. In the worst-case scenario, Quicksort makes O(n2) comparisons; however, such an occurrence is rare. Quicksort almost invariably makes O(n log n) comparisons instead, and even outperforms other algorithms which can guarantee O(n log n) behavior. Let us suppose an adversary wishes to force the Quicksort algorithm to make O(n2) comparisons. Then he would have to search all of the n! permutations of the input string and test the algorithm on each until he found the one for which the algorithm runs significantly slower. But since this would take O(n!) time, it is clearly infeasible for an adversary to do this. Similarly, it is unreasonable to assume an adversary for an encoding and decoding system would be able to test every single error pattern in order to find the most effective one.

Stochastic noise model

The stochastic noise model can be described as a kind of "dumb" noise model. That is to say that it does not have the adaptability to deal with "intelligent" threats. Even if the attacker is bounded it is still possible that they might be able to overcome the stochastic model with a bit of cleverness. The stochastic model has no real way to fight against this sort of attack and as such is unsuited to dealing with the kind of "intelligent" threats that would be preferable to have defenses against.


Therefore, a computationally bounded adversarial model has been proposed as a compromise between the two. [1] This forces one to consider that messages may be perverted in conscious, even malicious ways, but without forcing an algorithm designer to worry about rare cases which likely will never occur.

Applications

Comparison to stochastic noise channel

Since any computationally bounded adversary could in O(n) time flip a coin for each bit, it is intuitively clear that any encoding and decoding system which can work against this adversary must also work in the stochastic noise model. The converse is less simple; however, it can be shown that any system which works in the stochastic noise model can also efficiently encode and decode against a computationally bounded adversary, and only at an additional cost which is polynomial in n. [1] The following method to accomplish this was designed by Dick Lipton, and is taken from: [1]

An illustration of the method. The first row gives the initial encoded message; the second, after random permutation and random R; the third, after the adversary adds N; the fourth, after unpermuting; the fifth, the encoded message with the adversary's error removed. Illustration of Proof for Computationally Bounded Adversary.png
An illustration of the method. The first row gives the initial encoded message; the second, after random permutation and random R; the third, after the adversary adds N; the fourth, after unpermuting; the fifth, the encoded message with the adversary's error removed.

Let be an encoder for the stochastic noise model and be a simple decoder for the same, each of which runs in polynomial time. Furthermore, let both the sender and receiver share some random permutation function and a random pattern .

For encoding: 1. Let .

2. Let .

3. Transmit

Then for decoding: 1. Receive . Compute .

2. Calculate .

Similarly to the Quicksort comparison above, if the channel wants to do something smart, it must first test all the permutations. However, this is infeasible for a computationally bounded adversary, so the most it can do is make a random error pattern . But then:

since by definition.

, where

since any permutation is linear with respect to XOR,

as per the definition of above.

Since is random, is just random noise and we can use the simple decoder to decode the received message and get back .

Specific applications

By assuming a computationally bounded adversary, it is possibly to design a locally decodable code which is both efficient and near-optimal, with a negligible error probability. These codes are used in complexity theory for things like self-correcting computations, probabilistically checkable proof systems, and worst-case to average-case hardness reductions in the constructions of pseudo-random generators. They are useful in cryptography as a result of their connection with private information retrieval protocols. They are also in a number of database applications like fault-tolerant data storage. [2]

Furthermore, it is possible to construct codes which surpass known bounds for worst-case codes—specifically, unique decoding with a error rate. [3] This can be done by concatenating timestamped digital signatures onto messages. A computationally bounded channel cannot forge a signature; and while it may have valid past signatures, the receiver can use list decoding and select a message only if its signature has the correct timestamp.

See also

Related Research Articles

A binary symmetric channel is a common communications channel model used in coding theory and information theory. In this model, a transmitter wishes to send a bit, and the receiver will receive a bit. The bit will be "flipped" with a "crossover probability" of p, and otherwise is received correctly. This model can be applied to varied communication channels such as telephone lines or disk drive storage.

Additive white Gaussian noise (AWGN) is a basic noise model used in information theory to mimic the effect of many random processes that occur in nature. The modifiers denote specific characteristics:

The Viterbi algorithm is a dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models (HMM).

Channel capacity, in electrical engineering, computer science, and information theory, is the tight upper bound on the rate at which information can be reliably transmitted over a communication channel.

<span class="mw-page-title-main">Coding theory</span> Study of the properties of codes and their fitness

Coding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are studied by various scientific disciplines—such as information theory, electrical engineering, mathematics, linguistics, and computer science—for the purpose of designing efficient and reliable data transmission methods. This typically involves the removal of redundancy and the correction or detection of errors in the transmitted data.

In mathematics and statistics, a stationary process is a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time. If you draw a line through the middle of a stationary process then it should be flat; it may have 'seasonal' cycles, but overall it does not trend up nor down.

In information theory, turbo codes are a class of high-performance forward error correction (FEC) codes developed around 1990–91, but first published in 1993. They were the first practical codes to closely approach the maximum channel capacity or Shannon limit, a theoretical maximum for the code rate at which reliable communication is still possible given a specific noise level. Turbo codes are used in 3G/4G mobile communications and in satellite communications as well as other applications where designers seek to achieve reliable information transfer over bandwidth- or latency-constrained communication links in the presence of data-corrupting noise. Turbo codes compete with low-density parity-check (LDPC) codes, which provide similar performance.

<span class="mw-page-title-main">Complexity class</span> Set of problems in computational complexity theory

In computational complexity theory, a complexity class is a set of computational problems of related resource-based complexity. The two most commonly analyzed resources are time and memory.

In cryptography, the McEliece cryptosystem is an asymmetric encryption algorithm developed in 1978 by Robert McEliece. It was the first such scheme to use randomization in the encryption process. The algorithm has never gained much acceptance in the cryptographic community, but is a candidate for "post-quantum cryptography", as it is immune to attacks using Shor's algorithm and – more generally – measuring coset states using Fourier sampling.

Probabilistic encryption is the use of randomness in an encryption algorithm, so that when encrypting the same message several times it will, in general, yield different ciphertexts. The term "probabilistic encryption" is typically used in reference to public key encryption algorithms; however various symmetric key encryption algorithms achieve a similar property, and stream ciphers such as Freestyle which are inherently random. To be semantically secure, that is, to hide even partial information about the plaintext, an encryption algorithm must be probabilistic.

Code-excited linear prediction (CELP) is a linear predictive speech coding algorithm originally proposed by Manfred R. Schroeder and Bishnu S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as residual-excited linear prediction (RELP) and linear predictive coding (LPC) vocoders. Along with its variants, such as algebraic CELP, relaxed CELP, low-delay CELP and vector sum excited linear prediction, it is currently the most widely used speech coding algorithm. It is also used in MPEG-4 Audio speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.

In computational complexity theory, P/poly is a complexity class representing problems that can be solved by small circuits. More precisely, it is the set of formal languages that have polynomial-size circuit families. It can also be defined equivalently in terms of Turing machines with advice, extra information supplied to the Turing machine along with its input, that may depend on the input length but not on the input itself. In this formulation, P/poly is the class of decision problems that can be solved by a polynomial-time Turing machine with advice strings of length polynomial in the input size. These two different definitions make P/poly central to circuit complexity and non-uniform complexity.

In computer science, Luby transform codes are the first class of practical fountain codes that are near-optimal erasure correcting codes. They were invented by Michael Luby in 1998 and published in 2002. Like some other fountain codes, LT codes depend on sparse bipartite graphs to trade reception overhead for encoding and decoding speed. The distinguishing characteristic of LT codes is in employing a particularly simple algorithm based on the exclusive or operation to encode and decode the message.

In information theory, the noisy-channel coding theorem, establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data nearly error-free up to a computable maximum rate through the channel. This result was presented by Claude Shannon in 1948 and was based in part on earlier work and ideas of Harry Nyquist and Ralph Hartley.

In computing, telecommunication, information theory, and coding theory, forward error correction (FEC) or channel coding is a technique used for controlling errors in data transmission over unreliable or noisy communication channels.

<span class="mw-page-title-main">Autoencoder</span> Neural network that learns efficient data encoding in an unsupervised manner

An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data. An autoencoder learns two functions: an encoding function that transforms the input data, and a decoding function that recreates the input data from the encoded representation. The autoencoder learns an efficient representation (encoding) for a set of data, typically for dimensionality reduction.

In coding theory, list decoding is an alternative to unique decoding of error-correcting codes for large error rates. The notion was proposed by Elias in the 1950s. The main idea behind list decoding is that the decoding algorithm instead of outputting a single possible message outputs a list of possibilities one of which is correct. This allows for handling a greater number of errors than that allowed by unique decoding.

A locally decodable code (LDC) is an error-correcting code that allows a single bit of the original message to be decoded with high probability by only examining a small number of bits of a possibly corrupted codeword. This property could be useful, say, in a context where information is being transmitted over a noisy channel, and only a small subset of the data is required at a particular time and there is no need to decode the entire message at once. Note that locally decodable codes are not a subset of locally testable codes, though there is some overlap between the two.

Quantum block codes are useful in quantum computing and in quantum communications. The encoding circuit for a large block code typically has a high complexity although those for modern codes do have lower complexity.

In coding theory, folded Reed–Solomon codes are like Reed–Solomon codes, which are obtained by mapping Reed–Solomon codewords over a larger alphabet by careful bundling of codeword symbols.

References

  1. 1 2 3 Lipton (6 May 2009). "Worst Case Complexity". Gödel’s Lost Letter and P=NP. Retrieved 2013-04-01.
  2. Ostrovsky, Pandey, Sahai. "Private Locally Decodable Codes" (PDF). Retrieved 2013-04-01.{{cite web}}: CS1 maint: multiple names: authors list (link)
  3. Micali, Peikert; Sudan, A. Wilson. "Optimal Error Correction for Computationally Bounded Noise" (PDF). Retrieved 2013-04-01.