One-way function

Last updated
Unsolved problem in computer science:
Do one-way functions exist?

In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, specifically the theory of polynomial time problems. This has nothing to do with whether the function is one-to-one; finding any one input with the desired image is considered a successful inversion. (See § Theoretical definition, below.)

Contents

The existence of such one-way functions is still an open conjecture. Their existence would prove that the complexity classes P and NP are not equal, thus resolving the foremost unsolved question of theoretical computer science. [1] :ex. 2.2,page 70 The converse is not known to be true, i.e. the existence of a proof that P  NP would not directly imply the existence of one-way functions. [2]

In applied contexts, the terms "easy" and "hard" are usually interpreted relative to some specific computing entity; typically "cheap enough for the legitimate users" and "prohibitively expensive for any malicious agents".[ citation needed ] One-way functions, in this sense, are fundamental tools for cryptography, personal identification, authentication, and other data security applications. While the existence of one-way functions in this sense is also an open conjecture, there are several candidates that have withstood decades of intense scrutiny. Some of them are essential ingredients of most telecommunications, e-commerce, and e-banking systems around the world.

Theoretical definition

A function f : {0, 1}* → {0, 1}* is one-way if f can be computed by a polynomial-time algorithm, but any polynomial-time randomized algorithm that attempts to compute a pseudo-inverse for f succeeds with negligible probability. (The * superscript means any number of repetitions, see Kleene star.) That is, for all randomized algorithms , all positive integers c and all sufficiently large n = length(x),

where the probability is over the choice of x from the discrete uniform distribution on {0, 1} n, and the randomness of . [3]

Note that, by this definition, the function must be "hard to invert" in the average-case, rather than worst-case sense. This is different from much of complexity theory (e.g., NP-hardness), where the term "hard" is meant in the worst-case. That is why even if some candidates for one-way functions (described below) are known to be NP-complete, it does not imply their one-wayness. The latter property is only based on the lack of known algorithms to solve the problem.

It is not sufficient to make a function "lossy" (not one-to-one) to have a one-way function. In particular, the function that outputs the string of n zeros on any input of length n is not a one-way function because it is easy to come up with an input that will result in the same output. More precisely: For such a function that simply outputs a string of zeroes, an algorithm F that just outputs any string of length n on input f(x) will "find" a proper preimage of the output, even if it is not the input which was originally used to find the output string.

A one-way permutation is a one-way function that is also a permutation—that is, a one-way function that is bijective. One-way permutations are an important cryptographic primitive, and it is not known if their existence is implied by the existence of one-way functions.

A trapdoor one-way function or trapdoor permutation is a special kind of one-way function. Such a function is hard to invert unless some secret information, called the trapdoor, is known.

A collision-free hash functionf is a one-way function that is also collision-resistant; that is, no randomized polynomial time algorithm can find a collision—distinct values x, y such that f(x) = f(y)—with non-negligible probability. [4]

Theoretical implications of one-way functions

If f is a one-way function, then the inversion of f would be a problem whose output is hard to compute (by definition) but easy to check (just by computing f on it). Thus, the existence of a one-way function implies that FP    FNP, which in turn implies that P  NP. However, P  NP does not imply the existence of one-way functions.

The existence of a one-way function implies the existence of many other useful concepts, including:

Candidates for one-way functions

The following are several candidates for one-way functions (as of April 2009). Clearly, it is not known whether these functions are indeed one-way; but extensive research has so far failed to produce an efficient inverting algorithm for any of them.[ citation needed ]

Multiplication and factoring

The function f takes as inputs two prime numbers p and q in binary notation and returns their product. This function can be "easily" computed in O(b2) time, where b is the total number of bits of the inputs. Inverting this function requires finding the factors of a given integer N. The best factoring algorithms known run in time, where b is the number of bits needed to represent N.

This function can be generalized by allowing p and q to range over a suitable set of semiprimes. Note that f is not one-way for randomly selected integers p, q> 1, since the product will have 2 as a factor with probability 3/4 (because the probability that an arbitrary p is odd is 1/2, and likewise for q, so if they're chosen independently, the probability that both are odd is therefore 1/4; hence the probability that p or q is even, is 1 − 1/4 = 3/4).

The Rabin function (modular squaring)

The Rabin function, [1] :57 or squaring modulo , where p and q are primes is believed to be a collection of one-way functions. We write

to denote squaring modulo N: a specific member of the Rabin collection. It can be shown that extracting square roots, i.e. inverting the Rabin function, is computationally equivalent to factoring N (in the sense of polynomial-time reduction). Hence it can be proven that the Rabin collection is one-way if and only if factoring is hard. This also holds for the special case in which p and q are of the same bit length. The Rabin cryptosystem is based on the assumption that this Rabin function is one-way.

Discrete exponential and logarithm

Modular exponentiation can be done in polynomial time. Inverting this function requires computing the discrete logarithm. Currently there are several popular groups for which no algorithm to calculate the underlying discrete logarithm in polynomial time is known. These groups are all finite abelian groups and the general discrete logarithm problem can be described as thus.

Let G be a finite abelian group of cardinality n. Denote its group operation by multiplication. Consider a primitive element αG and another element βG. The discrete logarithm problem is to find the positive integer k, where 1 ≤ k ≤ n, such that:

The integer k that solves the equation αk = β is termed the discrete logarithm of β to the base α. One writes k = logαβ.

Popular choices for the group G in discrete logarithm cryptography are the cyclic groups (Zp)× (e.g. ElGamal encryption, Diffie–Hellman key exchange, and the Digital Signature Algorithm) and cyclic subgroups of elliptic curves over finite fields (see elliptic curve cryptography).

An elliptic curve is a set of pairs of elements of a field satisfying y2 = x3 + ax + b. The elements of the curve form a group under an operation called "point addition" (which is not the same as the addition operation of the field). Multiplication kP of a point P by an integer k (i.e., a group action of the additive group of the integers) is defined as repeated addition of the point to itself. If k and P are known, it is easy to compute R = kP, but if only R and P are known, it is assumed to be hard to compute k.

Cryptographically secure hash functions

There are a number of cryptographic hash functions that are fast to compute, such as SHA-256. Some of the simpler versions have fallen to sophisticated analysis, but the strongest versions continue to offer fast, practical solutions for one-way computation. Most of the theoretical support for the functions are more techniques for thwarting some of the previously successful attacks.

Other candidates

Other candidates for one-way functions include the hardness of the decoding of random linear codes, the hardness of certain lattice problems, and the subset sum problem (Naccache–Stern knapsack cryptosystem).

Universal one-way function

There is an explicit function f that has been proved to be one-way, if and only if one-way functions exist. [5] In other words, if any function is one-way, then so is f. Since this function was the first combinatorial complete one-way function to be demonstrated, it is known as the "universal one-way function". The problem of finding a one-way function is thus reduced to provingperhaps non-constructivelythat one such function exists.

There also exists a function that is one-way if polynomial-time bounded Kolmogorov complexity is mildly hard on average. Since the existence of one-way functions implies that polynomial-time bounded Kolmogorov complexity is mildly hard on average, the function is a universal one-way function. [6]

See also

Related Research Articles

The P versus NP problem is a major unsolved problem in theoretical computer science. Informally, it asks whether every problem whose solution can be quickly verified can also be quickly solved.

In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and explores the relationships between these classifications. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.

Shor's algorithm is a quantum algorithm for finding the prime factors of an integer. It was developed in 1994 by the American mathematician Peter Shor. It is one of the few known quantum algorithms with compelling potential applications and strong evidence of superpolynomial speedup compared to best known classical (non-quantum) algorithms. On the other hand, factoring numbers of practical significance requires far more qubits than available in the near future. Another concern is that noise in quantum circuits may undermine results, requiring additional qubits for quantum error correction.

In mathematics, for given real numbers a and b, the logarithm logba is a number x such that bx = a. Analogously, in any group G, powers bk can be defined for all integers k, and the discrete logarithm logba is an integer k such that bk = a. In number theory, the more commonly used term is index: we can write x = indra (mod m) (read "the index of a to the base r modulo m") for rxa (mod m) if r is a primitive root of m and gcd(a,m) = 1.

<span class="mw-page-title-main">Trapdoor function</span> One-way cryptographic tool

In theoretical computer science and cryptography, a trapdoor function is a function that is easy to compute in one direction, yet difficult to compute in the opposite direction without special information, called the "trapdoor". Trapdoor functions are a special case of one-way functions and are widely used in public-key cryptography.

<span class="mw-page-title-main">Combinatorial optimization</span> Subfield of mathematical optimization

Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combinatorial optimization problems are the travelling salesman problem ("TSP"), the minimum spanning tree problem ("MST"), and the knapsack problem. In many such problems, such as the ones previously mentioned, exhaustive search is not tractable, and so specialized algorithms that quickly rule out large parts of the search space or approximation algorithms must be resorted to instead.

<span class="mw-page-title-main">Complexity class</span> Set of problems in computational complexity theory

In computational complexity theory, a complexity class is a set of computational problems "of related resource-based complexity". The two most commonly analyzed resources are time and memory.

In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured as a function of the number of bits in the input. This appears to have been first demonstrated in Gurevich, Stockmeyer & Vishkin (1984). The first systematic work on parameterized complexity was done by Downey & Fellows (1999).

<span class="mw-page-title-main">PP (complexity)</span> Class of problems in computer science

In complexity theory, PP, or PPT is the class of decision problems solvable by a probabilistic Turing machine in polynomial time, with an error probability of less than 1/2 for all instances. The abbreviation PP refers to probabilistic polynomial time. The complexity class was defined by Gill in 1977.

In computational complexity theory, a function problem is a computational problem where a single output is expected for every input, but the output is more complex than that of a decision problem. For function problems, the output is not simply 'yes' or 'no'.

Random self-reducibility (RSR) is the rule that a good algorithm for the average case implies a good algorithm for the worst case. RSR is the ability to solve all instances of a problem by solving a large fraction of the instances.

In computational complexity, problems that are in the complexity class NP but are neither in the class P nor NP-complete are called NP-intermediate, and the class of such problems is called NPI. Ladner's theorem, shown in 1975 by Richard E. Ladner, is a result asserting that, if P ≠ NP, then NPI is not empty; that is, NP contains problems that are neither in P nor NP-complete. Since it is also true that if NPI problems exist, then P ≠ NP, it follows that P = NP if and only if NPI is empty.

In computational complexity theory, a computational hardness assumption is the hypothesis that a particular problem cannot be solved efficiently. It is not known how to prove (unconditional) hardness for essentially any useful problem. Instead, computer scientists rely on reductions to formally relate the hardness of a new or complicated problem to a computational hardness assumption about a problem that is better-understood.

In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case complexity which considers the maximal complexity of the algorithm over all possible inputs.

In computer science, lattice problems are a class of optimization problems related to mathematical objects called lattices. The conjectured intractability of such problems is central to the construction of secure lattice-based cryptosystems: lattice problems are an example of NP-hard problems which have been shown to be average-case hard, providing a test case for the security of cryptographic algorithms. In addition, some lattice problems which are worst-case hard can be used as a basis for extremely secure cryptographic schemes. The use of worst-case hardness in such schemes makes them among the very few schemes that are very likely secure even against quantum computers. For applications in such cryptosystems, lattices over vector spaces or free modules are generally considered.

Generic-case complexity is a subfield of computational complexity theory that studies the complexity of computational problems on "most inputs".

In cryptography, Very Smooth Hash (VSH) is a provably secure cryptographic hash function invented in 2005 by Scott Contini, Arjen Lenstra, and Ron Steinfeld. Provably secure means that finding collisions is as difficult as some known hard mathematical problem. Unlike other provably secure collision-resistant hashes, VSH is efficient and usable in practice. Asymptotically, it only requires a single multiplication per log(n) message-bits and uses RSA-type arithmetic. Therefore, VSH can be useful in embedded environments where code space is limited.

In computational complexity theory, the complexity class PPP is a subclass of TFNP. It is the class of search problems that can be shown to be total by an application of the pigeonhole principle. Christos Papadimitriou introduced it in the same paper that introduced PPAD and PPA. PPP contains both PPAD and PWPP as subclasses. These complexity classes are of particular interest in cryptography because they are strongly related to cryptographic primitives such as one-way permutations and collision-resistant hash functions.

In cryptography, cryptographic hash functions can be divided into two main categories. In the first category are those functions whose designs are based on mathematical problems, and whose security thus follows from rigorous mathematical proofs, complexity theory and formal reduction. These functions are called provably secure cryptographic hash functions. To construct these is very difficult, and few examples have been introduced. Their practical use is limited.

References

  1. 1 2 Oded Goldreich (2001). Foundations of Cryptography: Volume 1, Basic Tools (draft available from author's site). Cambridge University Press. ISBN   0-521-79172-3. See also wisdom.weizmann.ac.il.
  2. Goldwasser, S. and Bellare, M. "Lecture Notes on Cryptography". Summer course on cryptography, MIT, 1996–2001.
  3. Many authors view this definition as strong one-way function. A weak one-way function can be defined similarly except that the probability that every adversarial fails to invert f is noticeable. However, one may construct strong one-way functions based on weak ones. Loosely speaking, strong and weak versions of one-way function are equivalent theoretically. See Goldreich's Foundations of Cryptography, vol. 1, ch. 2.1–2.3.
  4. Russell, A. (1995). "Necessary and Sufficient Conditions for Collision-Free Hashing". Journal of Cryptology . 8 (2): 87–99. doi:10.1007/BF00190757. S2CID   26046704.
  5. Levin, Leonid A. (January 2003). "The Tale of One-Way Functions". Problems of Information Transmission. 39 (39): 92–103. arXiv: cs.CR/0012023 . doi:10.1023/A:1023634616182.
  6. Liu, Yanyi; Pass, Rafael (2020-09-24). "On One-way Functions and Kolmogorov Complexity". arXiv: 2009.11514 [cs.CC].

Further reading