Pseudorandom generator

Last updated

In theoretical computer science and cryptography, a pseudorandom generator (PRG) for a class of statistical tests is a deterministic procedure that maps a random seed to a longer pseudorandom string such that no statistical test in the class can distinguish between the output of the generator and the uniform distribution. The random seed itself is typically a short binary string drawn from the uniform distribution.

Contents

Many different classes of statistical tests have been considered in the literature, among them the class of all Boolean circuits of a given size. It is not known whether good pseudorandom generators for this class exist, but it is known that their existence is in a certain sense equivalent to (unproven) circuit lower bounds in computational complexity theory. Hence the construction of pseudorandom generators for the class of Boolean circuits of a given size rests on currently unproven hardness assumptions.

Definition

Let be a class of functions. These functions are the statistical tests that the pseudorandom generator will try to fool, and they are usually algorithms. Sometimes the statistical tests are also called adversaries or distinguishers. [1] The notation in the codomain of the functions is the Kleene star.

A function with is a pseudorandom generator against with bias if, for every in , the statistical distance between the distributions and is at most , where is the uniform distribution on .

The quantity is called the seed length and the quantity is called the stretch of the pseudorandom generator.

A pseudorandom generator against a family of adversaries with bias is a family of pseudorandom generators , where is a pseudorandom generator against with bias and seed length .

In most applications, the family represents some model of computation or some set of algorithms, and one is interested in designing a pseudorandom generator with small seed length and bias, and such that the output of the generator can be computed by the same sort of algorithm.

In cryptography

In cryptography, the class usually consists of all circuits of size polynomial in the input and with a single bit output, and one is interested in designing pseudorandom generators that are computable by a polynomial-time algorithm and whose bias is negligible in the circuit size. These pseudorandom generators are sometimes called cryptographically secure pseudorandom generators (CSPRGs).

It is not known if cryptographically secure pseudorandom generators exist. Proving that they exist is difficult since their existence implies P ≠ NP, which is widely believed but a famously open problem. The existence of cryptographically secure pseudorandom generators is widely believed. This is because it has been proven that pseudorandom generators can be constructed from any one-way function which are believed to exist. [2] [3] Pseudorandom generators are necessary for many applications in cryptography.

The pseudorandom generator theorem shows that cryptographically secure pseudorandom generators exist if and only if one-way functions exist.

Uses

Pseudorandom generators have numerous applications in cryptography. For instance, pseudorandom generators provide an efficient analog of one-time pads. It is well known that in order to encrypt a message m in a way that the cipher text provides no information on the plaintext, the key k used must be random over strings of length |m|. Perfectly secure encryption is very costly in terms of key length. Key length can be significantly reduced using a pseudorandom generator if perfect security is replaced by semantic security. Common constructions of stream ciphers are based on pseudorandom generators.

Pseudorandom generators may also be used to construct symmetric key cryptosystems, where a large number of messages can be safely encrypted under the same key. Such a construction can be based on a pseudorandom function family, which generalizes the notion of a pseudorandom generator.

In the 1980s, simulations in physics began to use pseudorandom generators to produce sequences with billions of elements, and by the late 1980s, evidence had developed that a few common generators gave incorrect results in such cases as phase transition properties of the 3D Ising model and shapes of diffusion-limited aggregates. Then in the 1990s, various idealizations of physics simulations—based on random walks, correlation functions, localization of eigenstates, etc., were used as tests of pseudorandom generators. [4]

Testing

NIST announced SP800-22 Randomness tests to test whether a pseudorandom generator produces high quality random bits. Yongge Wang showed that NIST testing is not enough to detect weak pseudorandom generators and developed statistical distance based testing technique LILtest. [5]

For derandomization

A main application of pseudorandom generators lies in the derandomization of computation that relies on randomness, without corrupting the result of the computation. Physical computers are deterministic machines, and obtaining true randomness can be a challenge. Pseudorandom generators can be used to efficiently simulate randomized algorithms with using little or no randomness. In such applications, the class describes the randomized algorithm or class of randomized algorithms that one wants to simulate, and the goal is to design an "efficiently computable" pseudorandom generator against whose seed length is as short as possible. If a full derandomization is desired, a completely deterministic simulation proceeds by replacing the random input to the randomized algorithm with the pseudorandom string produced by the pseudorandom generator. The simulation does this for all possible seeds and averages the output of the various runs of the randomized algorithm in a suitable way.

Constructions

For polynomial time

A fundamental question in computational complexity theory is whether all polynomial time randomized algorithms for decision problems can be deterministically simulated in polynomial time. The existence of such a simulation would imply that BPP = P. To perform such a simulation, it is sufficient to construct pseudorandom generators against the family F of all circuits of size s(n) whose inputs have length n and output a single bit, where s(n) is an arbitrary polynomial, the seed length of the pseudorandom generator is O(log n) and its bias is ⅓.

In 1991, Noam Nisan and Avi Wigderson provided a candidate pseudorandom generator with these properties. In 1997 Russell Impagliazzo and Avi Wigderson proved that the construction of Nisan and Wigderson is a pseudorandom generator assuming that there exists a decision problem that can be computed in time 2O(n) on inputs of length n but requires circuits of size 2Ω(n).

For logarithmic space

While unproven assumption about circuit complexity are needed to prove that the Nisan–Wigderson generator works for time-bounded machines, it is natural to restrict the class of statistical tests further such that we need not rely on such unproven assumptions. One class for which this has been done is the class of machines whose work space is bounded by . Using a repeated squaring trick known as Savitch's theorem, it is easy to show that every probabilistic log-space computation can be simulated in space . Noam Nisan (1992) showed that this derandomization can actually be achieved with a pseudorandom generator of seed length that fools all -space machines. Nisan's generator has been used by Saks and Zhou (1999) to show that probabilistic log-space computation can be simulated deterministically in space . This result was improved by William Hoza in 2021 to space .

For linear functions

When the statistical tests consist of all multivariate linear functions over some finite field , one speaks of epsilon-biased generators. The construction of Naor & Naor (1990) achieves a seed length of , which is optimal up to constant factors. Pseudorandom generators for linear functions often serve as a building block for more complicated pseudorandom generators.

For polynomials

Viola (2008) proves that taking the sum of small-bias generators fools polynomials of degree . The seed length is .

For constant-depth circuits

Constant depth circuits that produce a single output bit.[ citation needed ]

Limitations on probability

The pseudorandom generators used in cryptography and universal algorithmic derandomization have not been proven to exist, although their existence is widely believed[ citation needed ]. Proofs for their existence would imply proofs of lower bounds on the circuit complexity of certain explicit functions. Such circuit lower bounds cannot be proved in the framework of natural proofs assuming the existence of stronger variants of cryptographic pseudorandom generators. [6]

Related Research Articles

In computational complexity theory, a branch of computer science, bounded-error probabilistic polynomial time (BPP) is the class of decision problems solvable by a probabilistic Turing machine in polynomial time with an error probability bounded by 1/3 for all instances. BPP is one of the largest practical classes of problems, meaning most problems of interest in BPP have efficient probabilistic algorithms that can be run quickly on real modern machines. BPP also contains P, the class of problems solvable in polynomial time with a deterministic machine, since a deterministic machine is a special case of a probabilistic machine.

A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process. Simply put, the problem is that many of the sources of randomness available to humans rely on physical processes not readily available to computer programs.

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generated sequence is not truly random, because it is completely determined by an initial value, called the PRNG's seed. Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.

<span class="mw-page-title-main">Interactive proof system</span>

In computational complexity theory, an interactive proof system is an abstract machine that models computation as the exchange of messages between two parties: a prover and a verifier. The parties interact by exchanging messages in order to ascertain whether a given string belongs to a language or not. The prover possesses unlimited computational resources but cannot be trusted, while the verifier has bounded computation power but is assumed to be always honest. Messages are sent between the verifier and prover until the verifier has an answer to the problem and has "convinced" itself that it is correct.

A cryptographically secure pseudorandom number generator (CSPRNG) or cryptographic pseudorandom number generator (CPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. It is also loosely known as a cryptographic random number generator (CRNG).

In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, specifically the theory of polynomial time problems. Not being one-to-one is not considered sufficient for a function to be called one-way.

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random determined by the random bits; thus either the running time, or the output are random variables.

<span class="mw-page-title-main">Complexity class</span> Set of problems in computational complexity theory

In computational complexity theory, a complexity class is a set of computational problems "of related resource-based complexity". The two most commonly analyzed resources are time and memory.

In computational complexity and cryptography, two families of distributions are computationally indistinguishable if no efficient algorithm can tell the difference between them except with negligible probability.

In computational complexity theory, P/poly is a complexity class representing problems that can be solved by small circuits. More precisely, it is the set of formal languages that have polynomial-size circuit families. It can also be defined equivalently in terms of Turing machines with advice, extra information supplied to the Turing machine along with its input, that may depend on the input length but not on the input itself. In this formulation, P/poly is the class of decision problems that can be solved by a polynomial-time Turing machine with advice strings of length polynomial in the input size. These two different definitions make P/poly central to circuit complexity and non-uniform complexity.

<span class="mw-page-title-main">Russell Impagliazzo</span> American computer scientist

Russell Graham Impagliazzo is a professor of computer science at the University of California, San Diego, specializing in computational complexity theory.

In computational complexity theory and cryptography, the existence of pseudorandom generators is related to the existence of one-way functions through a number of theorems, collectively referred to as the pseudorandom generator theorem.

In cryptography, a pseudorandom function family, abbreviated PRF, is a collection of efficiently-computable functions which emulate a random oracle in the following way: no efficient algorithm can distinguish between a function chosen randomly from the PRF family and a random oracle. Pseudorandom functions are vital tools in the construction of cryptographic primitives, especially secure encryption schemes.

In cryptography, a pseudorandom permutation (PRP) is a function that cannot be distinguished from a random permutation (that is, a permutation selected at random with uniform probability, from the family of all permutations on the function's domain) with practical effort.

<span class="mw-page-title-main">Circuit complexity</span> Model of computational complexity

In theoretical computer science, circuit complexity is a branch of computational complexity theory in which Boolean functions are classified according to the size or depth of the Boolean circuits that compute them. A related notion is the circuit complexity of a recursive language that is decided by a uniform family of circuits .

In theoretical computer science, a small-bias sample space is a probability distribution that fools parity functions. In other words, no parity function can distinguish between a small-bias sample space and the uniform distribution with high probability, and hence, small-bias sample spaces naturally give rise to pseudorandom generators for parity functions.

A randomness extractor, often simply called an "extractor", is a function, which being applied to output from a weak entropy source, together with a short, uniformly random seed, generates a highly random output that appears independent from the source and uniformly distributed. Examples of weakly random sources include radioactive decay or thermal noise; the only restriction on possible sources is that there is no way they can be fully controlled, calculated or predicted, and that a lower bound on their entropy rate can be established. For a given source, a randomness extractor can even be considered to be a true random number generator (TRNG); but there is no single extractor that has been proven to produce truly random output from any type of weakly random source.

In theoretical computer science, a pseudorandom generator for low-degree polynomials is an efficient procedure that maps a short truly random seed to a longer pseudorandom string in such a way that low-degree polynomials cannot distinguish the output distribution of the generator from the truly random distribution. That is, evaluating any low-degree polynomial at a point determined by the pseudorandom string is statistically close to evaluating the same polynomial at a point that is chosen uniformly at random.

In 1997, Moni Naor and Omer Reingold described efficient constructions for various cryptographic primitives in private key as well as public-key cryptography. Their result is the construction of an efficient pseudorandom function. Let p and l be prime numbers with l |p−1. Select an element g of multiplicative order l. Then for each (n+1)-dimensional vector a = (a0,a1, ..., an)∈ they define the function

In computational learning theory, Occam learning is a model of algorithmic learning where the objective of the learner is to output a succinct representation of received training data. This is closely related to probably approximately correct (PAC) learning, where the learner is evaluated on its predictive power of a test set.

References

  1. Katz, Jonathan (2014-11-06). Introduction to modern cryptography. Lindell, Yehuda (Second ed.). Boca Raton. ISBN   9781466570269. OCLC   893721520.{{cite book}}: CS1 maint: location missing publisher (link)
  2. HÅstad, Johan; Impagliazzo, Russell; Levin, Leonid A.; Luby, Michael (1 January 1999). "A Pseudorandom Generator from any One-way Function". SIAM Journal on Computing. 28 (4): 1364–1396. doi:10.1137/S0097539793244708.
  3. Katz, Jonathan; Lindell, Yehuda (20 December 2020). Introduction to Modern Cryptography. CRC Press. p. 262. ISBN   978-1-351-13302-9.
  4. Wolfram, Stephen (2002). A New Kind of Science . Wolfram Media, Inc. p.  1085. ISBN   978-1-57955-008-0.
  5. "Statistical Testing Techniques for Pseudorandom generation".
  6. Razborov, Alexander; Rudich, Steven (August 1997). "Natural Proofs". Journal of Computer and System Sciences. 55 (1): 24–35. doi: 10.1006/jcss.1997.1494 .