Collision problem

Last updated January 31, 2024 • 1 min readFrom Wikipedia, The Free Encyclopedia

The r-to-1 collision problem is an important theoretical problem in complexity theory, quantum computing, and computational mathematics. The collision problem most often refers to the 2-to-1 version:^[1] given $n$ even and a function $f:\,\{1,\ldots ,n\}\rightarrow \{1,\ldots ,n\}$ , we are promised that f is either 1-to-1 or 2-to-1. We are only allowed to make queries about the value of $f(i)$ for any $i\in \{1,\ldots ,n\}$ . The problem then asks how many such queries we need to make to determine with certainty whether f is 1-to-1 or 2-to-1.

Classical solutions

Deterministic

Solving the 2-to-1 version deterministically requires ${\textstyle {\frac {n}{2}}+1}$ queries, and in general distinguishing r-to-1 functions from 1-to-1 functions requires ${\textstyle {\frac {n}{r}}+1}$ queries.

This is a straightforward application of the pigeonhole principle: if a function is r-to-1, then after ${\textstyle {\frac {n}{r}}+1}$ queries we are guaranteed to have found a collision. If a function is 1-to-1, then no collision exists. Thus, ${\textstyle {\frac {n}{r}}+1}$ queries suffice. If we are unlucky, then the first $n/r$ queries could return distinct answers, so ${\textstyle {\frac {n}{r}}+1}$ queries is also necessary.

Randomized

If we allow randomness, the problem is easier. By the birthday paradox, if we choose (distinct) queries at random, then with high probability we find a collision in any fixed 2-to-1 function after $\Theta ({\sqrt {n}})$ queries.

Quantum solution

The BHT algorithm, which uses Grover's algorithm, solves this problem optimally by only making $O(n^{1/3})$ queries to f.

Related Research Articles

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.

Shor's algorithm is a quantum algorithm for finding the prime factors of an integer. It was developed in 1994 by the American mathematician Peter Shor. It is one of the few known quantum algorithms with compelling potential applications and strong evidence of superpolynomial speedup compared to best known classical algorithms. On the other hand, factoring numbers of practical significance requires far more qubits than available in the near future. Another concern is that noise in quantum circuits may undermine results, requiring additional qubits for quantum error correction.

In quantum computing, Grover's algorithm, also known as the quantum search algorithm, is a quantum algorithm for unstructured search that finds with high probability the unique input to a black box function that produces a particular output value, using just $evaluations of the function, where is the size of the function's domain. It was devised by Lov Grover in 1996.$

In mathematics, and more specifically in computer algebra, computational algebraic geometry, and computational commutative algebra, a Gröbner basis is a particular kind of generating set of an ideal in a polynomial ring $K [x 1, ..., x n]$ over a field $K$ . A Gröbner basis allows many important properties of the ideal and the associated algebraic variety to be deduced easily, such as the dimension and the number of zeros when it is finite. Gröbner basis computation is one of the main practical tools for solving systems of polynomial equations and computing the images of algebraic varieties under projections or rational maps.

In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, specifically the theory of polynomial time problems. Not being one-to-one is not considered sufficient for a function to be called one-way.

In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random determined by the random bits; thus either the running time, or the output are random variables.

In quantum computing, a quantum algorithm is an algorithm which runs on a realistic model of quantum computation, the most commonly used model being the quantum circuit model of computation. A classical algorithm is a finite sequence of instructions, or a step-by-step procedure for solving a problem, where each step or instruction can be performed on a classical computer. Similarly, a quantum algorithm is a step-by-step procedure, where each of the steps can be performed on a quantum computer. Although all classical algorithms can also be performed on a quantum computer, the term quantum algorithm is usually used for those algorithms which seem inherently quantum, or use some essential feature of quantum computation such as quantum superposition or quantum entanglement.

In mathematical optimization, the ellipsoid method is an iterative method for minimizing convex functions over convex sets. The ellipsoid method generates a sequence of ellipsoids whose volume uniformly decreases at every step, thus enclosing a minimizer of a convex function.

Stochastic approximation methods are a family of iterative methods typically used for root-finding problems or for optimization problems. The recursive update rules of stochastic approximation methods can be used, among other things, for solving linear systems when the collected data is corrupted by noise, or for approximating extreme values of functions which cannot be computed directly, but only estimated via noisy observations.

In computational complexity theory, the element distinctness problem or element uniqueness problem is the problem of determining whether all the elements of a list are distinct.

In computer science, locality-sensitive hashing (LSH) is a fuzzy hashing technique that hashes similar input items into the same "buckets" with high probability. Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.

In computational complexity theory and quantum computing, Simon's problem is a computational problem that is proven to be solved exponentially faster on a quantum computer than on a classical computer. The quantum algorithm solving Simon's problem, usually called Simon's algorithm, served as the inspiration for Shor's algorithm. Both problems are special cases of the abelian hidden subgroup problem, which is now known to have efficient quantum algorithms.

In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of queries or tests that are done adaptively, so the outcome of previous tests can influence the tests performed next.

A locally decodable code (LDC) is an error-correcting code that allows a single bit of the original message to be decoded with high probability by only examining a small number of bits of a possibly corrupted codeword. This property could be useful, say, in a context where information is being transmitted over a noisy channel, and only a small subset of the data is required at a particular time and there is no need to decode the entire message at once. Note that locally decodable codes are not a subset of locally testable codes, though there is some overlap between the two.

The elliptic curve only hash (ECOH) algorithm was submitted as a candidate for SHA-3 in the NIST hash function competition. However, it was rejected in the beginning of the competition since a second pre-image attack was found.

In computer science, the range query problem consists of efficiently answering several queries regarding a given interval of elements within an array. For example, a common task, known as range minimum query, is finding the smallest value inside a given range within a list of numbers.

In quantum computing, the Brassard-Høyer-Tapp algorithm or BHT algorithm is a quantum algorithm that solves the collision problem. In this problem, one is given n and an r-to-1 function $and needs to find two inputs that f maps to the same output. The BHT algorithm only makes queries to f, which matches the lower bound of in the black box model.$

In post-quantum cryptography, ring learning with errors (RLWE) is a computational problem which serves as the foundation of new cryptographic algorithms, such as NewHope, designed to protect against cryptanalysis by quantum computers and also to provide the basis for homomorphic encryption. Public-key cryptography relies on construction of mathematical problems that are believed to be hard to solve if no further information is available, but are easy to solve if some information used in the problem construction is known. Some problems of this sort that are currently used in cryptography are at risk of attack if sufficiently large quantum computers can ever be built, so resistant problems are sought. Homomorphic encryption is a form of encryption that allows computation on ciphertext, such as arithmetic on numeric values stored in an encrypted database.

In PAC learning, error tolerance refers to the ability of an algorithm to learn when the examples received have been corrupted in some way. In fact, this is a very common and important issue since in many applications it is not possible to access noise-free data. Noise can interfere with the learning process at different levels: the algorithm may receive data that have been occasionally mislabeled, or the inputs may have some false information, or the classification of the examples may have been maliciously adulterated.

References

↑ Scott Aaronson (2004). "Limits on Efficient Computation in the Physical World" (PDF).

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Scott Aaronson (2004). "Limits on Efficient Computation in the Physical World" (PDF).

[1]