Galactic algorithm

A galactic algorithm is one with record-breaking theoretical (asymptotic) performance, but which is never used in practice. Typical reasons are that the performance gains only appear for problems that are so large they never occur, or the algorithm's complexity outweighs a relatively small gain in performance. Galactic algorithms were so named by Richard Lipton and Ken Regan, [1] because they will never be used on any data sets on Earth.

Possible use cases

Even if they are never used in practice, galactic algorithms may still contribute to computer science:

- An algorithm, even if impractical, may show new techniques that eventually lead to practical algorithms.
- Available computational power may catch up to the crossover point, so that a previously impractical algorithm becomes practical.
- An impractical algorithm can still demonstrate that conjectured bounds can be achieved, or that proposed bounds are wrong, and hence advance the theory of algorithms. For example, a polynomial-time algorithm for the Boolean satisfiability problem, even one with enormous constants, would settle the P versus NP problem, considered the most important open problem in computer science. [2]

Examples

Integer multiplication

An example of a galactic algorithm is the fastest known way to multiply two numbers, [3] which is based on a 1729-dimensional Fourier transform. [4] It needs O(n log n) bit operations, but as the constants hidden by the big O notation are large, it is never used in practice. However, it also shows why galactic algorithms may still be useful. The authors state: "we are hopeful that with further refinements, the algorithm might become practical for numbers with merely billions or trillions of digits." [4]
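The 1729-dimensional transform of the actual algorithm is far beyond a short example, but the classical FFT-multiplication idea it refines can be sketched. The following is an illustration only, not the Harvey–van der Hoeven algorithm; the function name and digit-list encoding are ours, and floating-point FFT limits it to modest sizes:

```python
# Sketch of FFT-based integer multiplication: treat each number as a
# polynomial in its digits, multiply the polynomials via FFT, then
# propagate carries. Illustrative only; not the galactic algorithm.
import numpy as np

def fft_multiply(a: int, b: int, base: int = 10) -> int:
    da = [int(d) for d in str(a)][::-1]   # least significant digit first
    db = [int(d) for d in str(b)][::-1]
    size = 1
    while size < len(da) + len(db):
        size *= 2                         # FFT length: next power of two
    fa = np.fft.fft(da, size)
    fb = np.fft.fft(db, size)
    coeffs = np.rint(np.fft.ifft(fa * fb).real).astype(np.int64)
    result, carry = 0, 0
    for i, c in enumerate(coeffs):        # carry propagation
        carry += int(c)
        result += (carry % base) * base**i
        carry //= base
    return result

assert fft_multiply(123456789, 987654321) == 123456789 * 987654321
```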

Matrix multiplication

The first improvement over brute-force matrix multiplication (which needs O(n^3) multiplications) was the Strassen algorithm: a recursive algorithm that needs O(n^2.807) multiplications. This algorithm is not galactic and is used in practice. Further extensions of this, using sophisticated group theory, are the Coppersmith–Winograd algorithm and its slightly better successors, needing O(n^2.373) multiplications. These are galactic – "We nevertheless stress that such improvements are only of theoretical interest, since the huge constants involved in the complexity of fast matrix multiplication usually make these algorithms impractical." [5]
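For contrast with its galactic successors, Strassen's recursion itself is short enough to sketch. A minimal illustration assuming square matrices whose size is a power of two (the cutoff value is an arbitrary choice):

```python
# Strassen's recursion: seven recursive products instead of eight.
# Production code tunes the cutoff; the galactic Coppersmith–Winograd-
# style algorithms are far more intricate than this.
import numpy as np

def strassen(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    n = A.shape[0]
    if n <= 64:                       # below cutoff, naive multiply wins
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)   # Strassen's seven products
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.randint(0, 10, (128, 128))
B = np.random.randint(0, 10, (128, 128))
assert np.array_equal(strassen(A, B), A @ B)
```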

Communication channel capacity

Claude Shannon showed a simple but asymptotically optimal code that can reach the theoretical capacity of a communication channel. It requires assigning a random code word to every possible n-bit message, then decoding by finding the closest code word. If n is chosen large enough, this beats any existing code and can get arbitrarily close to the capacity of the channel. Unfortunately, any n big enough to beat existing codes is also completely impractical. [6] These codes, though never used, inspired decades of research into more practical algorithms that today can achieve rates arbitrarily close to channel capacity. [7]
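A toy rendering of the random-coding scheme makes the impracticality concrete: both the codebook and the nearest-codeword search grow exponentially in the number of message bits. This is an illustration only, not Shannon's formal construction; all names and parameters below are ours:

```python
# Random coding on a binary symmetric channel: one random codeword per
# message, decoding by minimum Hamming distance over all 2^k codewords.
import numpy as np

rng = np.random.default_rng(0)
k, n, p = 8, 32, 0.05          # message bits, block length, flip probability

codebook = rng.integers(0, 2, size=(2**k, n))   # 2^k random codewords

def transmit(msg_index: int) -> np.ndarray:
    noise = rng.random(n) < p                   # BSC bit flips
    return codebook[msg_index] ^ noise

def decode(received: np.ndarray) -> int:
    # Nearest codeword by Hamming distance: O(2^k) work per decode.
    return int(np.argmin((codebook ^ received).sum(axis=1)))

errors = sum(decode(transmit(m)) != m for m in range(2**k))
print(f"decoding errors: {errors} / {2**k}")
```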

Sub-graphs

The problem of deciding whether a graph G contains H as a minor is NP-complete in general, but where H is fixed, it can be solved in polynomial time. The running time for testing whether H is a minor of G in this case is O(n^2), [8] where n is the number of vertices in G and the big O notation hides a constant that depends superexponentially on H. The constant is greater than 2 ↑↑ (2 ↑↑ (2 ↑↑ (h/2))) in Knuth's up-arrow notation, where h is the number of vertices in H. [9] Even the case of h = 4 cannot be reasonably computed, as the constant is greater than 2 tetrated by 65536, that is, 2 ↑↑ 65536.
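To convey the scale of such constants, the small helper below (illustrative; the function is ours) evaluates Knuth's up-arrow notation. Even 2 ↑↑ 5 already has 19,729 decimal digits, and the bound above towers far beyond anything computable:

```python
# Knuth's up-arrow notation: a ↑^n b, defined recursively.
def knuth_up(a: int, n: int, b: int) -> int:
    if n == 1:
        return a ** b                 # one arrow is exponentiation
    if b == 0:
        return 1
    return knuth_up(a, n - 1, knuth_up(a, n, b - 1))

print(len(str(knuth_up(2, 2, 5))))    # 2↑↑5 = 2^65536: 19,729 digits
```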

Cryptographic breaks

In cryptography jargon, a "break" is any attack faster in expectation than brute force – i.e., performing one trial decryption for each possible key. For many cryptographic systems, breaks are known, but are still practically infeasible with current technology. One example is the best attack known against 128-bit AES, which takes only 2^126 operations. [10] Despite being impractical, theoretical breaks can provide insight into vulnerability patterns, and sometimes lead to discovery of exploitable breaks.
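The brute-force baseline is simple to state in code. A toy sketch with a stand-in cipher and a 16-bit key (our own invented example; with real ciphers and 128-bit keys the same loop is hopeless, and even the 2^126 attack remains far out of reach):

```python
# Brute force: one trial decryption per candidate key.
def toy_encrypt(plaintext: int, key: int) -> int:
    return plaintext ^ key                 # stand-in for a real cipher

def brute_force(plaintext: int, ciphertext: int, key_bits: int = 16) -> int:
    for key in range(2 ** key_bits):       # expected 2^(key_bits - 1) trials
        if toy_encrypt(plaintext, key) == ciphertext:
            return key
    raise ValueError("key not found")

secret = 0xBEEF
assert brute_force(0x1234, toy_encrypt(0x1234, secret)) == secret
```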

Traveling salesman problem

For several decades, the best known approximation to the traveling salesman problem in a metric space was the very simple Christofides algorithm which produced a path at most 50% longer than the optimum. (Many other algorithms could usually do much better, but could not provably do so.) In 2020, a newer and much more complex algorithm was discovered that can beat this by 10^-34 percent. [11] Although no one will ever switch to this algorithm for its very slight worst-case improvement, it is still considered important because "this minuscule improvement breaks through both a theoretical logjam and a psychological one". [12]
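The Christofides algorithm itself is practical and widely implemented. For example, using the implementation in the networkx library (assuming a recent version that provides christofides) on a small Euclidean instance of our own devising:

```python
# Christofides heuristic on a complete metric graph; the tour is
# guaranteed to be at most 1.5 times the optimal length.
import itertools
import networkx as nx
from networkx.algorithms.approximation import christofides

points = [(0, 0), (0, 3), (4, 0), (4, 3), (2, 5)]
G = nx.Graph()
for (i, p), (j, q) in itertools.combinations(enumerate(points), 2):
    # Complete graph with Euclidean (hence metric) edge weights.
    G.add_edge(i, j, weight=((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5)

tour = christofides(G)                 # 3/2-approximate tour
if tour[0] != tour[-1]:                # ensure the cycle is closed
    tour.append(tour[0])
length = sum(G[u][v]["weight"] for u, v in zip(tour, tour[1:]))
print(tour, round(length, 2))
```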

Hutter search

A single algorithm, "Hutter search", can solve any well-defined problem in an asymptotically optimal time, barring some caveats. It works by searching through all possible algorithms (by runtime), while simultaneously searching through all possible proofs (by length of proof), looking for a proof of correctness for each algorithm. Since the proof of correctness is of finite size, it "only" adds a constant and does not affect the asymptotic runtime. However, this constant is so big that the algorithm is entirely impractical. [13] [14] For example, if the shortest proof of correctness of a given algorithm is 1000 bits long, the search will examine at least 2^999 other potential proofs first.
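A toy dovetailing search in the spirit of Levin/Hutter universal search can illustrate the interleaving, though it omits the proof search entirely; all names and the scheduling details below are ours:

```python
# Dovetailing over candidate "programs": in phase p, candidate i gets
# roughly 2^(p - i) steps, i.e. about a 2^-i share of the total time.
from itertools import count

def universal_search(candidates, is_solution):
    gens = [c() for c in candidates]      # step-wise generators
    for phase in count(1):
        for i, g in enumerate(gens):
            for _ in range(2 ** max(phase - i, 0)):
                result = next(g)          # run one step of candidate i
                if result is not None and is_solution(result):
                    return result

# Toy candidates "searching" for a number whose square is 12769.
def slow():
    for x in count(0):
        yield x if x * x == 12769 else None

def never():
    while True:
        yield None                        # a useless candidate program

print(universal_search([never, slow], lambda r: True))   # -> 113
```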

Hutter search is related to Solomonoff induction, which is a formalization of Bayesian inference. All computable theories (as implemented by programs) which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Again, the search over all possible explanations makes this procedure galactic.
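A toy version of the weighting scheme: each "theory" consistent with the observations so far gets prior weight 2^-length, so shorter theories dominate the prediction. The theories and description lengths below are invented for illustration; real Solomonoff induction sums over all programs, which is what makes it galactic:

```python
# Solomonoff-style prediction over a finite, hand-picked set of theories.
observations = [0, 1, 0, 1, 0, 1]

# (description length in bits, prediction function for position i)
theories = [
    (3, lambda i: i % 2),                  # short: "alternate 0,1,0,1,..."
    (9, lambda i: i % 2 if i < 6 else 1),  # longer: alternates, then all 1s
]

consistent = [(l, f) for l, f in theories
              if all(f(i) == o for i, o in enumerate(observations))]
total = sum(2.0 ** -l for l, _ in consistent)
prob_next_is_0 = sum(2.0 ** -l for l, f in consistent
                     if f(len(observations)) == 0) / total
print(round(prob_next_is_0, 3))            # ~0.985: the short theory wins
```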

Optimization

Simulated annealing, when used with a logarithmic cooling schedule, has been proven to find the global optimum of any optimization problem. However, such a cooling schedule results in entirely impractical runtimes, and is never used. [15] Nevertheless, knowing this ideal algorithm exists has led to practical variants that are able to find very good (though not provably optimal) solutions to complex optimization problems. [16]
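A sketch of simulated annealing with a logarithmic schedule T_k = c / log(k + 2) on a small multimodal objective (the objective and constants are our own choices); practical variants swap in a faster schedule, such as a geometric one, and give up the optimality guarantee:

```python
# Simulated annealing with logarithmic cooling: note how slowly the
# temperature falls, which is the source of the impractical runtime.
import math
import random

random.seed(42)

def f(x: float) -> float:
    return x * x + 10 * math.sin(3 * x)      # multimodal objective

x = best = 4.0                               # start far from the optimum
c = 10.0                                     # schedule constant
for k in range(200_000):
    T = c / math.log(k + 2)                  # logarithmic cooling schedule
    candidate = x + random.gauss(0, 0.5)
    delta = f(candidate) - f(x)
    if delta < 0 or random.random() < math.exp(-delta / T):
        x = candidate                        # accept (always if downhill)
        if f(x) < f(best):
            best = x
print(round(best, 3), round(f(best), 3))
```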

Minimum spanning trees

The expected linear time MST algorithm is able to discover the minimum spanning tree of a graph in expected time O(m + n), where m is the number of edges and n is the number of nodes of the graph. [17] However, the constant factor that is hidden by the big O notation is huge enough to make the algorithm impractical. An implementation is publicly available, [18] and given the experimentally estimated implementation constants, it would only be faster than Borůvka's algorithm for impractically enormous graphs. [19]
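Borůvka's algorithm, the practical baseline, is compact enough to sketch (this illustration assumes distinct edge weights, which Borůvka's algorithm needs to avoid cycles; the encoding of edges as tuples is ours):

```python
# Borůvka's MST: repeatedly add each component's cheapest outgoing edge.
def boruvka(n: int, edges: list[tuple[float, int, int]]) -> float:
    parent = list(range(n))
    def find(x: int) -> int:                 # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    total, components = 0.0, n
    while components > 1:
        cheapest = [None] * n                # best outgoing edge per root
        for e in edges:
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if cheapest[r] is None or w < cheapest[r][0]:
                    cheapest[r] = e
        for e in set(e for e in cheapest if e is not None):
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru != rv:                     # merge the two components
                parent[ru] = rv
                total += w
                components -= 1
    return total

edges = [(4.0, 0, 1), (8.0, 1, 2), (2.0, 0, 2), (6.0, 2, 3)]
print(boruvka(4, edges))                     # MST weight: 12.0
```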

Hash tables

Researchers have found an algorithm that achieves the provably best-possible [20] asymptotic performance in terms of time-space tradeoff. [21] But it remains purely theoretical: "Despite the new hash table's unprecedented efficiency, no one is likely to try building it anytime soon. It's just too complicated to construct," [22] and "in practice, constants really matter. In the real world, a factor of 10 is a game ender." [23]


References

  1. Lipton, Richard J.; Regan, Kenneth W. (2013). "David Johnson: Galactic Algorithms". People, Problems, and Proofs: Essays from Gödel's Lost Letter: 2010. Heidelberg: Springer Berlin. pp. 109–112. ISBN 9783642414220.
  2. Fortnow, L. (2009). "The status of the P versus NP problem" (PDF). Communications of the ACM. 52 (9): 78–86. doi:10.1145/1562164.1562186. S2CID 5969255.
  3. Harvey, David; van der Hoeven, Joris (March 2019). "Integer multiplication in time O(n log n)". HAL. hal-02070778.
  4. Harvey, David (9 April 2019). "We've found a quicker way to multiply really big numbers". The Conversation. Retrieved 9 March 2023.
  5. Le Gall, F. (2012). "Faster algorithms for rectangular matrix multiplication". Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2012). pp. 514–523. arXiv:1204.1111. doi:10.1109/FOCS.2012.80. ISBN 978-0-7695-4874-6. S2CID 2410545.
  6. Hardesty, Larry (January 19, 2010). "Explained: The Shannon limit". MIT News Office.
  7. "Capacity-approaching codes (Chapter 13 of Principles of Digital Communication II)" (PDF). MIT OpenCourseWare. 2005.
  8. Kawarabayashi, Ken-ichi; Kobayashi, Yusuke; Reed, Bruce (2012). "The disjoint paths problem in quadratic time". Journal of Combinatorial Theory, Series B. 102 (2): 424–435. doi:10.1016/j.jctb.2011.07.004.
  9. Johnson, David S. (1987). "The NP-completeness column: An ongoing guide (edition 19)". Journal of Algorithms. 8 (2): 285–303. CiteSeerX 10.1.1.114.3864. doi:10.1016/0196-6774(87)90043-5.
  10. Tao, Biaoshuai; Wu, Hongjun (2015). Information Security and Privacy. Lecture Notes in Computer Science. Vol. 9144. pp. 39–56. doi:10.1007/978-3-319-19962-7_3. ISBN 978-3-319-19961-0.
  11. Karlin, Anna R.; Klein, Nathan; Oveis Gharan, Shayan (September 1, 2020). "A (Slightly) Improved Approximation Algorithm for Metric TSP". arXiv:2007.01409 [cs.DS].
  12. Klarreich, Erica (8 October 2020). "Computer Scientists Break Traveling Salesperson Record". Quanta Magazine.
  13. Hutter, Marcus (2002-06-14). "The Fastest and Shortest Algorithm for All Well-Defined Problems". arXiv:cs/0206022.
  14. Gagliolo, Matteo (2007-11-20). "Universal search". Scholarpedia. 2 (11): 2575. Bibcode:2007SchpJ...2.2575G. doi:10.4249/scholarpedia.2575. ISSN 1941-6016.
  15. Liang, Faming; Cheng, Yichen; Lin, Guang (2014). "Simulated stochastic approximation annealing for global optimization with a square-root cooling schedule". Journal of the American Statistical Association. 109 (506): 847–863. doi:10.1080/01621459.2013.872993. S2CID 123410795.
  16. Ingber, Lester (1993). "Simulated annealing: Practice versus theory". Mathematical and Computer Modelling. 18 (11): 29–57. CiteSeerX 10.1.1.15.1046. doi:10.1016/0895-7177(93)90204-C.
  17. Karger, David R.; Klein, Philip N.; Tarjan, Robert E. (1995-03-01). "A randomized linear-time algorithm to find minimum spanning trees". Journal of the ACM. 42 (2): 321–328. doi:10.1145/201019.201022. ISSN 0004-5411.
  18. Thiesen, Francisco. "A C++ implementation for an Expected Linear-Time Minimum Spanning Tree Algorithm (Karger–Klein–Tarjan + Hagerup Minimum Spanning Tree Verification as a sub-routine)". GitHub. Retrieved 2022-11-19.
  19. Geiman Thiesen, Francisco. "Expected Linear-Time Minimum Spanning Trees". franciscothiesen.github.io. Retrieved 2022-11-13.
  20. Li, Tianxiao; Liang, Jingxun; Yu, Huacheng; Zhou, Renfei. "Tight Cell-Probe Lower Bounds for Dynamic Succinct Dictionaries" (PDF).
  21. Bender, Michael; Farach-Colton, Martin; Kuszmaul, John; Kuszmaul, William; Mingmou, Liu (4 Nov 2021). "On the Optimal Time/Space Tradeoff for Hash Tables". arXiv:2111.00602 [cs].
  22. "Scientists Find Optimal Balance of Data Storage and Time".
  23. William Kuszmaul, as quoted in "Scientists Find Optimal Balance of Data Storage and Time".