Pseudorandom generator theorem

In computational complexity theory and cryptography, the existence of pseudorandom generators is related to the existence of one-way functions through a number of theorems, collectively referred to as the pseudorandom generator theorem.

Introduction

Pseudorandomness

A distribution is considered pseudorandom if no efficient computation can distinguish it from the true uniform distribution with non-negligible advantage. Formally, a family of distributions Dn is pseudorandom if for every polynomial-size circuit C and every ε inversely polynomial in n,

| Probx~U [C(x)=1] − Probx~D [C(x)=1] | ≤ ε.
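This definition can be read as an experiment: sample inputs from U and from D, run the circuit on both, and compare acceptance rates. Below is a minimal Python sketch of that experiment; the circuit C and the sampler sample_d are illustrative stand-ins, not objects from the theorem.

    import random

    def advantage(C, sample_d, n, trials=100_000):
        """Empirically estimate | Prob_x~U [C(x)=1] - Prob_x~D [C(x)=1] |."""
        hits_u = sum(C([random.randint(0, 1) for _ in range(n)]) for _ in range(trials))
        hits_d = sum(C(sample_d()) for _ in range(trials))
        return abs(hits_u - hits_d) / trials

    # Toy example: D always sets the first bit to 0 and C reads that bit,
    # so the estimated advantage comes out near 1/2.
    n = 8
    sample_d = lambda: [0] + [random.randint(0, 1) for _ in range(n - 1)]
    C = lambda x: 1 if x[0] == 0 else 0
    print(advantage(C, sample_d, n))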

Pseudorandom generators

A function Gl: {0,1}l → {0,1}m, where l < m, is a pseudorandom generator if:

  • Gl can be computed in time polynomial in l
  • the output distribution Gl(Ul), for a uniformly random seed, is pseudorandom

One additional pseudorandom bit implies polynomially more pseudorandom bits

It can be shown that if there is a pseudorandom generator Gl: {0,1}l → {0,1}l+1, i.e. a generator that adds only one pseudorandom bit, then for any m = poly(l), there is a pseudorandom generator G'l: {0,1}l → {0,1}m.

The idea of the proof is as follows: first, a seed s of l bits is drawn from the uniform distribution Ul and used as input to the first instance of Gl, which is known to be a pseudorandom generator. Next, the output of this first instance is divided into two parts: the first l bits are fed into the second instance of Gl as a seed, while the last bit becomes the first bit of the output. Repeating this process m times yields an output of m pseudorandom bits.
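A minimal Python sketch of this construction, assuming a hypothetical one-bit-stretch generator g supplied by the caller (the toy g used in the wiring check is of course not pseudorandom):

    import random

    def stretch(g, l, m, seed):
        """G': {0,1}^l -> {0,1}^m built from a one-bit-stretch generator
        g: {0,1}^l -> {0,1}^(l+1), exactly as in the proof sketch: each
        round feeds the first l output bits back in as the next seed and
        emits the last bit as one output bit."""
        assert len(seed) == l
        state, out = seed, []
        for _ in range(m):
            y = g(state)                 # l+1 bits
            state, out = y[:l], out + [y[l]]
        return out

    # Toy stand-in for g -- appending the seed's parity is NOT pseudorandom;
    # it merely exercises the wiring of the construction.
    l = 16
    g = lambda s: s + [sum(s) % 2]
    seed = [random.randint(0, 1) for _ in range(l)]
    print(stretch(g, l, 64, seed))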

It can be shown that such a G'l, consisting of m instances of Gl, is indeed a pseudorandom generator by using a hybrid approach and proof by contradiction, as follows:

Consider m+1 intermediate distributions Hi, 0 ≤ i ≤ m, where the first i bits are chosen from the uniform distribution and the last m − i bits are chosen from the output of G'l. Thus, H0 is the full output of G'l and Hm is the true uniform distribution Um. Neighboring distributions Hi and Hi+1 therefore differ only in the origin of one bit (bit number i+1): in Hi+1 it is uniform, while in Hi it comes from Gl.
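The hybrids can be sampled directly; a sketch, reusing the hypothetical stretch function from the previous sketch:

    import random

    def sample_hybrid(g, l, m, i):
        """Sample H_i: the first i bits are truly uniform, the last m-i
        bits come from G' (the stretch function above) run on a fresh
        uniform seed. H_0 is exactly G'(U_l); H_m is exactly U_m."""
        uniform_part = [random.randint(0, 1) for _ in range(i)]
        seed = [random.randint(0, 1) for _ in range(l)]
        return uniform_part + stretch(g, l, m - i, seed)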

Now, assume that G'l is not a pseudorandom generator; that is, there exists some circuit C that can distinguish between G'l and Um with advantage ε = 1/poly(l). In other words, this circuit can distinguish between H0 and Hm. Since the advantages across the m pairs of neighboring hybrids sum to at least ε, there exists some i such that the circuit C can distinguish between Hi and Hi+1 with advantage at least ε / m. Note that since m is polynomial in l, ε / m is also inversely polynomial in l and is still a non-negligible advantage.

Now, assume we are given l+1 bits that are either the output of Gl or drawn from the uniform distribution. Reusing the approach of building large pseudorandom generators out of instances of Gl, construct a string of m − i − 1 pseudorandom bits in the same way G'l was constructed above, using the first l given bits as the seed. Then create a string consisting of i bits drawn from uniform, concatenated with the last of the given bits, followed by the constructed m − i − 1 bits. The resulting output is distributed as either Hi or Hi+1, since bit number i+1 is either drawn from uniform or from Gl. Since by assumption we can distinguish between Hi and Hi+1 with non-negligible advantage, we can distinguish between Ul+1 and Gl, which implies that Gl is not a pseudorandom generator, contradicting the hypothesis. Q.E.D.

Now, let's show that, if it exists, the circuit for distinguishing between Gl and Ul+1 does not have to toss random coins. As we showed above, if there exists a circuit C for distinguishing between G'l and Um, where m = poly(l), then there exists a circuit C' for distinguishing between Gl and Ul+1 that uses i random bits. For this circuit C':

| Probu,s [C'(u1,...,ui,Gl(s)) = 1] − Probu,y [C'(u1,...,ui,y) = 1] | ≥ ε / m,

where u is a string of i uniformly random bits, s is a string of l uniformly random bits, and y is a string of l+1 uniformly random bits.

Then, since the absolute value of an average is at most the average of the absolute values,

Eu [ | Probs [C'(u1,...,ui,Gl(s)) = 1] − Proby [C'(u1,...,ui,y) = 1] | ] ≥ ε / m,

which means there exists some fixed string u of i bits achieving advantage at least ε / m; this u can be hard-wired as "advice" into the circuit C' for distinguishing between Gl and Ul+1.
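For small i, the averaging argument can even be made algorithmic by brute force. A sketch, where estimate_advantage is a hypothetical procedure that estimates the advantage of C' under a fixed advice string:

    from itertools import product

    def best_advice(estimate_advantage, i):
        """The averaging argument made concrete: the advantage averaged over
        a uniform advice string u is at least eps/m, so some fixed u attains
        at least eps/m; for small i, plain enumeration finds the best one.
        estimate_advantage(u) should estimate the advantage of C' when its
        i random input bits are hard-wired to u."""
        return max(product((0, 1), repeat=i), key=estimate_advantage)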

Existence of pseudorandom generators

The existence of pseudorandom generators is related to the existence of one-way functions and hard-core predicates. Formally, pseudorandom generators exist if and only if one-way functions exist:

PRG ↔ OWF

Definitions

One-way functions

Intuitively, one-way functions are functions that are easy to compute and hard to invert. In other words, the complexity (or circuit size) of the function is much smaller than that of its inverse. Formally: a function ƒ: {0,1}n → {0,1}n is (S,ε) one-way if for any circuit C of size ≤ S,

Prob[ƒ(C(ƒ(x))) = ƒ(x)] ≤ ε.

Moreover, ƒ is a one-way function if

  • ƒ can be computed in polynomial time
  • ƒ is (poly(n), 1/poly(n)) one-way
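The inversion experiment in this definition can be estimated empirically. A minimal sketch; the names are illustrative, and the sanity check deliberately uses a function that is easy to invert:

    import random

    def inversion_success(f, inverter, n, trials=10_000):
        """Estimate Prob[ f(inverter(f(x))) = f(x) ] over uniform x: the
        inverter wins if it finds *any* preimage of f(x), not necessarily x."""
        wins = 0
        for _ in range(trials):
            x = tuple(random.randint(0, 1) for _ in range(n))
            y = f(x)
            if f(inverter(y)) == y:
                wins += 1
        return wins / trials

    # Sanity check with a function that is NOT one-way: XOR with a fixed
    # mask is its own inverse, so the success probability is 1.
    n = 8
    mask = tuple(random.randint(0, 1) for _ in range(n))
    f = lambda x: tuple(a ^ b for a, b in zip(x, mask))
    print(inversion_success(f, f, n))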

Hard-core predicate

Function B: {0,1}n → {0,1} is a hard-core predicate for function ƒ if

  • B can be computed in polynomial time
  • for any polynomial-size circuit C and any non-negligible ε = 1/poly(n), Probx~U [C(ƒ(x)) = B(x)] ≤ 1/2 + ε

In other words, it is hard to predict B(x) given ƒ(x).
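The prediction experiment can be sketched the same way; predictor here is a hypothetical candidate circuit:

    import random

    def prediction_advantage(f, b, predictor, n, trials=100_000):
        """Estimate Prob_x~U [ predictor(f(x)) = b(x) ] - 1/2: how much
        better than a coin flip a candidate does at the hard-core bit."""
        wins = 0
        for _ in range(trials):
            x = tuple(random.randint(0, 1) for _ in range(n))
            if predictor(f(x)) == b(x):
                wins += 1
        return wins / trials - 0.5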

Proof

Here an outline of the proof is given. Please see references for detailed proofs.

PRG → OWF

Consider a pseudorandom generator Gl: {0,1}l → {0,1}2l. Create the following one-way function ƒ: {0,1}2l → {0,1}2l that applies Gl to the first half of its input and ignores the second half. Formally,

ƒ(x,y) = Gl(x), where |x| = |y| = l.

A key observation that justifies this choice is that the image of ƒ has size at most 2l, a negligible fraction of its codomain {0,1}2l of size 22l.

To prove that ƒ is indeed a one-way function, let's construct an argument by contradiction. Assume there exists a circuit C that inverts ƒ with advantage ε:

Prob[ƒ(C(ƒ(x,y))) = ƒ(x,y)] > ε

Then we can create the following algorithm that distinguishes Gl from uniform, contradicting the hypothesis. The algorithm takes an input z of 2l bits and computes (x,y) = C(z). If Gl(x) = z the algorithm accepts; otherwise it rejects.
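A sketch of this reduction, assuming a hypothetical inverter that returns a candidate pair (x, y):

    def distinguisher_from_inverter(g, inverter, l):
        """The reduction above: accept a 2l-bit string z iff the candidate
        inverter returns a pair (x, y) with g(x) = z. On z ~ G_l(U_l) it
        accepts with probability > eps whenever the inverter succeeds; on
        z ~ U_2l it accepts with probability at most 2^l / 2^(2l) = 1/2^l,
        because acceptance forces z into the image of g."""
        def D(z):
            x, _y = inverter(z)
            return 1 if g(x) == z else 0
        return D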

Now, if z is drawn from the uniform distribution, the probability that the above algorithm accepts is ≤ 1/2l, since the image of Gl covers at most a 1/2l fraction of {0,1}2l. However, if z was drawn from the output of Gl, then the probability of acceptance is > ε by the assumed existence of circuit C. Therefore, the advantage of this algorithm in distinguishing between the uniform U2l and the output of Gl is > ε − 1/2l, which is non-negligible and thus contradicts our assumption that Gl is a pseudorandom generator. Q.E.D.

OWF → PRG

For this case we prove a weaker version of the theorem:

One-way permutation → pseudorandom generator

A one-way permutation is a one-way function that is also a permutation, i.e. a bijection on {0,1}n. A pseudorandom generator can be constructed from a one-way permutation ƒ as follows:

Gl: {0,1}l → {0,1}l+1 with Gl(x) = ƒ(x).B(x), where B is a hard-core predicate of ƒ and "." is the concatenation operator. Note that by the theorem proven above, it suffices to exhibit a generator that adds just one pseudorandom bit.
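A sketch of the construction; the toy f and b below only exercise the wiring and are neither one-way nor hard-core:

    def prg_from_owp(f, b):
        """G(x) = f(x).B(x): stretch by one bit using a one-way permutation
        f and its hard-core predicate B, as in the construction above."""
        return lambda x: f(x) + [b(x)]

    # Toy wiring check (illustration only):
    f = lambda x: x[1:] + x[:1]          # rotate the bit list: a permutation
    b = lambda x: sum(x) % 2             # parity as a stand-in predicate
    G = prg_from_owp(f, b)
    print(G([1, 0, 1, 1]))               # -> [0, 1, 1, 1, 1]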

Hard-core predicate → PRG

First, let's show that if B is a hard-core predicate for ƒ then Gl is indeed pseudorandom. Again, we'll use an argument by contradiction.

Assume that Gl is not a pseudorandom generator; that is, there exists a circuit C of polynomial size that distinguishes Gl(x) = ƒ(x).B(x) from Ul+1 with advantage ≥ ε, where ε is non-negligible. Note that since ƒ is a permutation, if x is drawn from the uniform distribution then so is ƒ(x). Therefore, Ul+1 is equivalent to ƒ(x).b, where b is a bit drawn independently from a uniform distribution. Formally,

| Probx~U [C(Gl(x))=1] − Probx~U,b~U [C(ƒ(x).b)=1] | ≥ ε

Let's construct the following algorithm C':

Given z = ƒ(x), guess bit b by tossing a random coin
Run C on z.b
IF C(z.b) = 1
    output b
ELSE
    output 1 − b

Given the output of ƒ, the algorithm first guesses bit b by tossing a fair coin, i.e. Prob[b=0] = Prob[b=1] = 0.5. Then the circuit C is run on ƒ(x).b; if the result is 1 then b is output, otherwise 1 − b is output.
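The same algorithm as a Python sketch (C is the assumed distinguisher; bit strings are lists):

    import random

    def predictor_from_distinguisher(C):
        """C' from the proof: on input z = f(x), guess b uniformly, run the
        distinguisher C on z.b, and keep the guess iff C answers 1."""
        def C_prime(z):
            b = random.randint(0, 1)
            return b if C(z + [b]) == 1 else 1 - b
        return C_prime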

Then the probability that C' guesses B(x) correctly is:

Probx~U [C'(z) = B(x)] =

Prob[b = B(x) ∧ C(z.b) = 1] + Prob[b ≠ B(x) ∧ C(z.b) = 0] =

Prob[b = B(x)]⋅Prob[C(z.b) = 1 | b = B(x)] + Prob[b ≠ B(x)]⋅Prob[C(z.b) = 0 | b ≠ B(x)] =

1/2⋅Prob[C(z.b) = 1 | b = B(x)] + 1/2⋅Prob[C(z.b) = 0 | b ≠ B(x)] =

1/2⋅Prob[C(z.b) = 1 | b = B(x)] + 1/2⋅(1 − Prob[C(z.b) = 1 | b ≠ B(x)]) =

1/2 + Probz.b~G(x) [C(z.b) = 1] − 1/2⋅(Prob[C(z.b) = 1 | b = B(x)] + Prob[C(z.b) = 1 | b ≠ B(x)]) =

1/2 + Probz.b~G(x) [C(z.b) = 1] − Probz.b~U [C(z.b) = 1] ≥ 1/2 + ε

This implies that the circuit C' can predict B(x) with probability at least 1/2 + ε, which means that B cannot be a hard-core predicate for ƒ, and the hypothesis is contradicted. Q.E.D.

OWP → hard-core predicate

The outline of the proof is as follows:

If ƒ: {0,1}n → {0,1}n is a one-way permutation, then so is ƒ': {0,1}2n → {0,1}2n, where ƒ'(x,y) = ƒ(x).y by definition. Then B(x,y) = x⊙y is a hard-core predicate for ƒ', where ⊙ denotes the vector dot product modulo 2. To prove that it is indeed hard-core, let's assume otherwise and show a contradiction with the hypothesis that ƒ is one-way. If B is not a hard-core predicate, then there exists a circuit C that predicts it, so

Probx,y [C(ƒ(x),y) = x⊙y] ≥ 1/2 + ε.

That fact can be used to recover x by cleverly choosing the vectors y so as to isolate bits of x. In fact, for a constant fraction of x, there exists a polynomial time algorithm that lists O(1/ε2) candidates that include all valid x. Thus, an algorithm can invert ƒ(x) in polynomial time for a non-negligible fraction of x, which contradicts the hypothesis.
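The noisy case is beyond a short sketch, but the noiseless warm-up shows why the inner product leaks x: a predictor that is always correct can simply be queried at the standard basis vectors. A sketch, with predict a hypothetical perfect predictor:

    def recover_x(predict, n):
        """Warm-up case of the inner-product step: if predict(y) returns
        x.y (mod 2) correctly for *every* y, then querying the standard
        basis vectors e_1, ..., e_n reads x off bit by bit, since
        x . e_i = x_i. The full algorithm tolerates a predictor that is
        right only with probability 1/2 + eps, at the cost of listing
        O(1/eps^2) candidates."""
        x = []
        for i in range(n):
            e_i = [0] * n
            e_i[i] = 1
            x.append(predict(e_i))
        return x

    # Toy check with a perfect predictor for a hidden x:
    hidden = [1, 0, 1, 1]
    predict = lambda y: sum(a * b for a, b in zip(hidden, y)) % 2
    print(recover_x(predict, 4))         # -> [1, 0, 1, 1]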

References