Asymptotically optimal algorithm

Last updated

In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor (independent of the input size) worse than the best possible algorithm. It is a term commonly encountered in computer science research as a result of widespread use of big-O notation.

Contents

More formally, an algorithm is asymptotically optimal with respect to a particular resource if the problem has been proven to require Ω(f(n)) of that resource, and the algorithm has been proven to use only O(f(n)).

These proofs require an assumption of a particular model of computation, i.e., certain restrictions on operations allowable with the input data.

As a simple example, it's known that all comparison sorts require at least Ω(n log n) comparisons in the average and worst cases. Mergesort and heapsort are comparison sorts which perform O(n log n) comparisons, so they are asymptotically optimal in this sense.

If the input data have some a priori properties which can be exploited in construction of algorithms, in addition to comparisons, then asymptotically faster algorithms may be possible. For example, if it is known that the N objects are integers from the range [1, N], then they may be sorted O(N) time, e.g., by the bucket sort.

A consequence of an algorithm being asymptotically optimal is that, for large enough inputs, no algorithm can outperform it by more than a constant factor. For this reason, asymptotically optimal algorithms are often seen as the "end of the line" in research, the attaining of a result that cannot be dramatically improved upon. Conversely, if an algorithm is not asymptotically optimal, this implies that as the input grows in size, the algorithm performs increasingly worse than the best possible algorithm.

In practice it's useful to find algorithms that perform better, even if they do not enjoy any asymptotic advantage. New algorithms may also present advantages such as better performance on specific inputs, decreased use of resources, or being simpler to describe and implement. Thus asymptotically optimal algorithms are not always the "end of the line".

Although asymptotically optimal algorithms are important theoretical results, an asymptotically optimal algorithm might not be used in a number of practical situations:

An example of an asymptotically optimal algorithm not used in practice is Bernard Chazelle's linear-time algorithm for triangulation of a simple polygon. Another is the resizable array data structure published in "Resizable Arrays in Optimal Time and Space", [1] which can index in constant time but on many machines carries a heavy practical penalty compared to ordinary array indexing.

Formal definitions

Formally, suppose that we have a lower-bound theorem showing that a problem requires Ω(f(n)) time to solve for an instance (input) of size n (see Big O notation § Big Omega notation for the definition of Ω). Then, an algorithm which solves the problem in O(f(n)) time is said to be asymptotically optimal. This can also be expressed using limits: suppose that b(n) is a lower bound on the running time, and a given algorithm takes time t(n). Then the algorithm is asymptotically optimal if:

This limit, if it exists, is always at least 1, as t(n) ≥ b(n).

Although usually applied to time efficiency, an algorithm can be said to use asymptotically optimal space, random bits, number of processors, or any other resource commonly measured using big-O notation.

Sometimes vague or implicit assumptions can make it unclear whether an algorithm is asymptotically optimal. For example, a lower bound theorem might assume a particular abstract machine model, as in the case of comparison sorts, or a particular organization of memory. By violating these assumptions, a new algorithm could potentially asymptotically outperform the lower bound and the "asymptotically optimal" algorithms.

Speedup

The nonexistence of an asymptotically optimal algorithm is called speedup. Blum's speedup theorem shows that there exist artificially constructed problems with speedup. However, it is an open problem whether many of the most well-known algorithms today are asymptotically optimal or not. For example, there is an algorithm for finding minimum spanning trees, where is the very slowly growing inverse of the Ackermann function, but the best known lower bound is the trivial . Whether this algorithm is asymptotically optimal is unknown, and would be likely to be hailed as a significant result if it were resolved either way. Coppersmith and Winograd (1982) proved that matrix multiplication has a weak form of speed-up among a restricted class of algorithms (Strassen-type bilinear identities with lambda-computation).

See also

Related Research Articles

<span class="mw-page-title-main">Analysis of algorithms</span> Study of resources used by an algorithm

In computer science, the analysis of algorithms is the process of finding the computational complexity of algorithms—the amount of time, storage, or other resources needed to execute them. Usually, this involves determining a function that relates the size of an algorithm's input to the number of steps it takes or the number of storage locations it uses. An algorithm is said to be efficient when this function's values are small, or grow slowly compared to a growth in the size of the input. Different inputs of the same size may cause the algorithm to have different behavior, so best, worst and average case descriptions might all be of practical interest. When not otherwise specified, the function describing the performance of an algorithm is usually an upper bound, determined from the worst case inputs to the algorithm.

In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time and memory storage requirements. The complexity of a problem is the complexity of the best algorithms that allow solving the problem.

<span class="mw-page-title-main">Merge sort</span> Divide and conquer-based sorting algorithm

In computer science, merge sort is an efficient, general-purpose, and comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the relative order of equal elements is the same in the input and output. Merge sort is a divide-and-conquer algorithm that was invented by John von Neumann in 1945. A detailed description and analysis of bottom-up merge sort appeared in a report by Goldstine and von Neumann as early as 1948.

<span class="mw-page-title-main">Sorting algorithm</span> Algorithm that arranges lists in order

In computer science, a sorting algorithm is an algorithm that puts elements of a list into an order. The most frequently used orders are numerical order and lexicographical order, and either ascending or descending. Efficient sorting is important for optimizing the efficiency of other algorithms that require input data to be in sorted lists. Sorting is also often useful for canonicalizing data and for producing human-readable output.

Big <i>O</i> notation Describes limiting behavior of a function

Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others, collectively called Bachmann–Landau notation or asymptotic notation. The letter O was chosen by Bachmann to stand for Ordnung, meaning the order of approximation.

<span class="mw-page-title-main">Shellsort</span> Sorting algorithm which uses multiple comparison intervals

Shellsort, also known as Shell sort or Shell's method, is an in-place comparison sort. It can be seen as either a generalization of sorting by exchange or sorting by insertion. The method starts by sorting pairs of elements far apart from each other, then progressively reducing the gap between elements to be compared. By starting with far apart elements, it can move some out-of-place elements into position faster than a simple nearest neighbor exchange. Donald Shell published the first version of this sort in 1959. The running time of Shellsort is heavily dependent on the gap sequence it uses. For many practical variants, determining their time complexity remains an open problem.

In computer science, bogosort is a sorting algorithm based on the generate and test paradigm. The function successively generates permutations of its input until it finds one that is sorted. It is not considered useful for sorting, but may be used for educational purposes, to contrast it with more efficient algorithms.

In computer science, divide and conquer is an algorithm design paradigm. A divide-and-conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.

The space complexity of an algorithm or a computer program is the amount of memory space required to solve an instance of the computational problem as a function of characteristics of the input. It is the memory required by an algorithm until it executes completely. This includes the memory space used by its inputs, called input space, and any other (auxiliary) memory it uses during execution, which is called auxiliary space.

<span class="mw-page-title-main">Self-balancing binary search tree</span> Any node-based binary search tree that automatically keeps its height the same

In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height small in the face of arbitrary item insertions and deletions. These operations when designed for a self-balancing binary search tree, contain precautionary measures against boundlessly increasing tree height, so that these abstract data structures receive the attribute "self-balancing".

<span class="mw-page-title-main">Time complexity</span> Estimate of time taken for running an algorithm

In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

In computer science, a selection algorithm is an algorithm for finding the th smallest value in a collection of ordered values, such as numbers. The value that it finds is called the th order statistic. Selection includes as special cases the problems of finding the minimum, median, and maximum element in the collection. Selection algorithms include quickselect, and the median of medians algorithm. When applied to a collection of values, these algorithms take linear time, as expressed using big O notation. For data that is already structured, faster algorithms may be possible; as an extreme case, selection in an already-sorted array takes time .

In computing, a cache-oblivious algorithm is an algorithm designed to take advantage of a processor cache without having the size of the cache as an explicit parameter. An optimal cache-oblivious algorithm is a cache-oblivious algorithm that uses the cache optimally. Thus, a cache-oblivious algorithm is designed to perform well, without modification, on multiple machines with different cache sizes, or for a memory hierarchy with different levels of cache having different sizes. Cache-oblivious algorithms are contrasted with explicit loop tiling, which explicitly breaks a problem into blocks that are optimally sized for a given cache.

<span class="mw-page-title-main">Klee's measure problem</span>

In computational geometry, Klee's measure problem is the problem of determining how efficiently the measure of a union of (multidimensional) rectangular ranges can be computed. Here, a d-dimensional rectangular range is defined to be a Cartesian product of d intervals of real numbers, which is a subset of Rd.

<span class="mw-page-title-main">Comparison sort</span> Type of sorting algorithm that works by comparing pairs of elements

A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator forms a total preorder over the data, with:

  1. if ab and bc then ac (transitivity)
  2. for all a and b, ab or ba (connexity).

In computer science, the longest increasing subsequence problem aims to find a subsequence of a given sequence in which the subsequence's elements are sorted in an ascending order and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous or unique. The longest increasing subsequences are studied in the context of various disciplines related to mathematics, including algorithmics, random matrix theory, representation theory, and physics. The longest increasing subsequence problem is solvable in time where denotes the length of the input sequence.

Algorithms that construct convex hulls of various objects have a broad range of applications in mathematics and computer science.

Because matrix multiplication is such a central operation in many numerical algorithms, much work has been invested in making matrix multiplication algorithms efficient. Applications of matrix multiplication in computational problems are found in many fields including scientific computing and pattern recognition and in seemingly unrelated problems such as counting the paths through a graph. Many different algorithms have been designed for multiplying matrices on different types of hardware, including parallel and distributed systems, where the computational work is spread over multiple processors.

Quantum complexity theory is the subfield of computational complexity theory that deals with complexity classes defined using quantum computers, a computational model based on quantum mechanics. It studies the hardness of computational problems in relation to these complexity classes, as well as the relationship between quantum complexity classes and classical complexity classes.

<i>X</i> + <i>Y</i> sorting Problem of sorting pairs of numbers by their sum

In computer science, sorting is the problem of sorting pairs of numbers by their sums. Applications of the problem include transit fare minimisation, VLSI design, and sparse polynomial multiplication. As with comparison sorting and integer sorting more generally, algorithms for this problem can be based only on comparisons of these sums, or on other operations that work only when the inputs are small integers.

References

  1. Brodnik, Andrej; Carlsson, Svante; Sedgewick, Robert; Munro, JI; Demaine, ED (1999), Resizable Arrays in Optimal Time and Space (PDF), Department of Computer Science, University of Waterloo