# Computational complexity

Last updated

In computer science, the computational complexity, or simply complexity of an algorithm is the amount of resources required for running it (a property unrelated to “complexity” in a conventional sense). The computational complexity of a problem is the minimum of the complexities of all possible algorithms for this problem (including the unknown algorithms).

## Contents

As the amount of needed resources varies with the input, the complexity is generally expressed as a function nf(n), where n is the size of the input, and f(n) is either the worst-case complexity, that is the maximum of the amount of resources that are needed for all inputs of size n, or the average-case complexity, that is average of the amount of resources over all input of size n.

When the nature of the resources is not explicitly given, this is usually the time needed for running the algorithm, and one talks of time complexity. However, this depends on the computer that is used, and the time is generally expressed as the number of needed elementary operations, which are supposed to take a constant time on a given computer, and to change by a constant factor when one changes computer.

Otherwise, the resource that is considered is often the size of the memory that is needed, and one talks of space complexity.

The study of the complexity of explicitly given algorithms is called analysis of algorithms, while the study of the complexity of problems is called computational complexity theory. Clearly, both areas are strongly related, as the complexity of an algorithm is always an upper bound of the complexity of the problem solved by this algorithm.

## Resources

### Time

The resource that is most commonly considered is the time, and one talks of time complexity. When "complexity" is used without being qualified, this generally means time complexity.

The usual units of time are not used in complexity theory, because they are too dependent on the choice of a specific computer and of the evolution of the technology. Therefore, instead of the real time, one generally consider the elementary operations that are done during the computation. These operations are supposed to take a constant time on a given machine, and are often called steps.

### Space

Another important resource is the size of computer memory that is needed for running algorithms.

### Others

The number of arithmetic operations is another resource that is commonly used. In this case, one talks of arithmetic complexity. If one knows an upper bound on the size of the binary representation of the numbers that occur during a computation, the time complexity is generally the product of the arithmetic complexity by a constant factor.

For many algorithms the size of the integers that are used during a computation is not bounded, and it is not realistic to consider that arithmetic operations take a constant time. Therefore, the time complexity, generally called bit complexity in this context, may be much larger than the arithmetic complexity. For example, the arithmetic complexity of the computation of the determinant of a n×n integer matrix is ${\displaystyle O(n^{3})}$ for the usual algorithms (Gaussian elimination). The bit complexity of the same algorithms is exponential in n, because the size of the coefficients may grow exponentially during the computation. On the other hand, if these algorithms are coupled with multi-modular arithmetic, the bit complexity may be reduced to O~(n4).

In sorting and searching, the resource that is generally considered is the number of entries comparisons. This is generally a good measure of the time complexity if data are suitably organized.

## Complexity as a function of input size

For clarity, only time complexity is considered in this section, but everything applies (with slight modifications) to the complexity with respect to other resources.

It is impossible to count the number of steps of an algorithm on all possible inputs. As the complexity increases generally with the size of the input, the complexity is generally expressed as a function of the size n (in bits) of the input, and therefore, the complexity is a function of n. However, the complexity of an algorithm may vary dramatically for different inputs of the same size. Therefore, several complexity functions are commonly used.

The worst-case complexity is the maximum of the complexity over all inputs of size n, and the average-case complexity is the average of the complexity over all inputs of size n (this makes sense, as the number of possible inputs of a given size is finite). Generally, when "complexity" is used without being further specified, this is the worst-case time complexity that is considered.

## Asymptotic complexity

It is generally difficult to compute precisely the worst-case and the average-case complexity. In addition, these exact values provide little practical application, as any change of computer or of model of computation would change the complexity somewhat. Moreover, the resource use is not critical for small values of n, and this makes that, for small n, the ease of implementation is generally more interesting than a good complexity.

For these reasons, one generally focuses on the behavior of the complexity for large n, that is on its asymptotic behavior when n tends to the infinity. Therefore, the complexity is generally expressed by using big O notation.

For example, the usual algorithm for integer multiplication has a complexity of ${\displaystyle O(n^{2}),}$ this means that there is a constant ${\displaystyle c_{u}}$ such that the multiplication of two integers of at most n digits may be done in a time less than ${\displaystyle c_{u}n^{2}.}$ This bound is sharp in the sense that the worst-case complexity and the average-case complexity are ${\displaystyle \Omega (n^{2}),}$ which means that there is a constant ${\displaystyle c_{l}}$ such that these complexities are larger than ${\displaystyle c_{l}n^{2}.}$ The radix does not appear in these complexity, as changing of radix changes only the constants ${\displaystyle c_{u}}$ and ${\displaystyle c_{l}.}$

## Models of computation

The evaluation of the complexity relies on the choice of a model of computation, which consists in defining the basic operations that are done in a unit of time. When the model of computation is not explicitly specified, this is generally meant as being multitape Turing machine.

### Deterministic models

A deterministic model of computation is a model of computation such that the successive states of the machine and the operations to be performed are completely determined by the preceding state. Historically, the first deterministic models were recursive functions, lambda calculus, and Turing machines. The model of Random access machines (also called RAM-machines) is also widely used, as closer to real computers.

When the model of computation is not specified, it is generally the model of multitape Turing machines. For most algorithms, the complexity is the same on multitape Turing machines and on RAM-machines, although, some care may be needed in storing data for getting this equivalence.

### Non-deterministic computation

In a non-deterministic model of computation, such as non-deterministic Turing machines, some choices may be done at some steps of the computation. In complexity theory, one considers all possible choices simultaneously, and the non-deterministic time complexity is the time needed, when the best choices are always done. In other words, one considers that the computation is done simultaneously on as many (identical) processors as needed, and the non-deterministic computation time is the time spent by the first processor that finishes the computation. This parallelism is partly amenable to quantum computing via superposed entangled states in running specific quantum algorithms, like e.g. Shor's factorization of yet only small integers (as of March 2018: 21 = 3 × 7).

Even when such a computation model is not realistic yet, it has theoretical importance, mostly related to the P = NP problem, which questions the identity of the complexity classes formed by taking "polynomial time" and "non-deterministic polynomial time" as least upper bounds. Simulating an NP-algorithm on a deterministic computer usually takes "exponential time". A problem is in the complexity class NP, if it may be solved in polynomial time on a non-deterministic machine. A problem is NP-complete if, roughly speaking, it is in NP and is not easier than any other NP problem. Many combinatorial problems, such as the Knapsack problem, the travelling salesman problem, and the Boolean satisfiability problem are NP-complete. For all these problems, the best known algorithm has exponential complexity. If any one of these problems could be solved in polynomial time on a deterministic machine, then all NP problems could also be solved in polynomial time, and one would have P = NP. As of 2017 it is generally conjectured that P ≠ NP, with the practical implication that the worst cases of NP problems are intrinsically difficult to solve, i.e., take longer than any reasonable time span (decades!) for interesting lengths of input.

### Parallel and distributed computation

Parallel and distributed computing consist of splitting computation on several processors, which work simultaneously. The difference between the different model lies mainly in the way of transmitting information between processors. Typically, in parallel computing the data transmission between processors is very fast, while, in distributed computing, the data transmission is done through a network and is therefore much slower.

The time needed for a computation on N processors is at least the quotient by N of the time needed by a single processor. In fact this theoretically optimal bound can never be reached, because some subtasks cannot be parallelized, and some processors may have to wait a result from another processor.

The main complexity problem is thus to design algorithms such that the product of the computation time by the number of processors is as close as possible to the time needed for the same computation on a single processor.

### Quantum computing

A quantum computer is a computer whose model of computation is based on quantum mechanics. The Church–Turing thesis applies to quantum computers, that is, every problem that can be solved by a quantum computer may be also solved by a Turing machine. However, some problems may theoretically be solved with a much lower complexity using a quantum computer than using a classical computer. This is, for the moment, purely theoretical, as no one knows how to build an efficient quantum computer.

Quantum complexity theory has been developed for studying computational complexity of quantum computing. It is used in post-quantum cryptography, which consists of designing cryptographic protocols that will resist to attacks with quantum computers, when such computers will really exist.

## Problem complexity (lower bounds)

The complexity of a problem is the infimum of the complexities of the algorithms that may solve the problem, including unknown algorithms. Thus the complexity of a problem is not greater than the complexity of any algorithm that solves the problems.

It follows that every complexity that is expressed with big O notation is a complexity of the algorithm as well as of the corresponding problem.

On the other hand, it is generally hard to obtain nontrivial lower bounds for problem complexity, and there are few methods for obtaining such lower bounds.

For solving most problems, it is required to read all input data, which, normally, needs a time proportional to the size of the data. Thus, such problems have a complexity that is at least linear, that is, using big omega notation, a complexity ${\displaystyle \Omega (n).}$

The solution of some problems, typically in computer algebra and computational algebraic geometry, may be very large. In such a case, the complexity is lower bounded by the maximal size of the output, since the output must be written. For example, a system of n polynomial equations of degree d in n indeterminates may have up to ${\displaystyle d^{n}}$ complex solutions, if the number of solutions is finite (this is Bézout's theorem). As these solutions must be written down, the complexity of this problem is ${\displaystyle \Omega (d^{n}).}$ For this problem, an algorithm of complexity ${\displaystyle d^{O(n)}}$ is known, which may thus be considered as asymptotically quasi-optimal.

A nonlinear lower bound of ${\displaystyle \Omega (n\log n)}$ is known for the number of comparisons needed for a sorting algorithm. Thus the best sorting algorithms are optimal, as their complexity is ${\displaystyle O(n\log n).}$ This lower bound results from the fact that there are n! ways of ordering n objects. As each comparison splits in two parts this set of n! orders, the number of N of comparisons that are needed for distinguishing all orders must verify ${\displaystyle 2^{N}>n!,}$ which implies ${\displaystyle N=\Omega (n\log n),}$ by Stirling's formula.

A standard method for getting lower bounds of complexity consists of reducing a problem to another problem. More precisely, suppose that one may encode a problem A of size n into a subproblem of size f(n) of a problem B, and that the complexity of A is ${\displaystyle \Omega (g(n)).}$ Without loss of generality, one may suppose that the function f increases with n and has an inverse function h. Then the complexity of the problem B is ${\displaystyle \Omega (g(h(n))).}$ This is this method that is used for proving that, if P ≠ NP (an unsolved conjecture), the complexity of every NP-complete problem is ${\displaystyle \Omega (n^{k}),}$ for every positive integer k.

## Use in algorithm design

Evaluating the complexity of an algorithm is an important part of algorithm design, as this gives useful information on the performance that may be expected.

It is a common misconception that the evaluation of the complexity of algorithms becomes less important, because of the exponential growth (Moore's law) of the power of modern computers. This is wrong because, this power increase allows working with large input data (big data). For example, when one wants to sort alphabetically a list of a few hundreds of entries, such as the bibliography of a book, any algorithm should work well in less than a second. On the other hand, for a list of a million of entries (the phone numbers of a large town, for example), the elementary algorithms that require ${\displaystyle O(n^{2})}$ comparisons would have to do a trillion of comparisons, which would need around three hours at the speed of 10 million of comparisons per second. On the other hand, the quicksort and merge sort require only ${\displaystyle n\log _{2}n}$ comparisons (as average-case complexity for the former, as worst-case complexity for the latter). For n = 1,000,000, this gives approximately 30,000,000 comparisons, which may be done in less than a second on a powerful computer.

Thus the evaluation of the complexity may allow eliminating many inefficient algorithms before any implementation. This may also be used for tuning complex algorithms without testing all variants. By determining the most costly steps of a complex algorithms, the study of complexity allows also focusing on these steps the effort for improving the efficiency of an implementation.

## Related Research Articles

In computational complexity theory, bounded-error probabilistic polynomial time (BPP) is the class of decision problems solvable by a probabilistic Turing machine in polynomial time with an error probability bounded away from 1/3 for all instances. BPP is one of the largest practical classes of problems, meaning most problems of interest in BPP have efficient probabilistic algorithms that can be run quickly on real modern machines. BPP also contains P, the class of problems solvable in polynomial time with a deterministic machine, since a deterministic machine is a special case of a probabilistic machine.

In computational complexity theory, bounded-error quantum polynomial time (BQP) is the class of decision problems solvable by a quantum computer in polynomial time, with an error probability of at most 1/3 for all instances. It is the quantum analogue of the complexity class BPP.

The P versus NP problem is a major unsolved problem in computer science. It asks whether every problem whose solution can be quickly verified can also be solved quickly.

Computational complexity theory focuses on classifying computational problems according to their inherent difficulty, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm.

In computational complexity theory, NP is a complexity class used to classify decision problems. NP is the set of decision problems for which the problem instances, where the answer is "yes", have proofs verifiable in polynomial time.

In computational complexity theory, randomized polynomial time (RP) is the complexity class of problems for which a probabilistic Turing machine exists with these properties:

In computability theory, a probabilistic Turing machine is a non-deterministic Turing machine which chooses between the available transitions at each point according to some probability distribution.

In computer science, the space complexity of an algorithm or a computer program is the amount of memory space required to solve an instance of the computational problem as a function of the size of the input. It is the memory required by an algorithm to execute a program and produce output.

In computer science, the time complexity is the computational complexity that describes the amount of time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to differ by at most a constant factor.

In computational complexity theory, a complexity class is a set of problems of related resource-based complexity. A typical complexity class has a definition of the form:

In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured by the number of bits in the input. The first systematic work on parameterized complexity was done by Downey & Fellows (1999).

In quantum computing, a quantum algorithm is an algorithm which runs on a realistic model of quantum computation, the most commonly used model being the quantum circuit model of computation. A classical algorithm is a finite sequence of instructions, or a step-by-step procedure for solving a problem, where each step or instruction can be performed on a classical computer. Similarly, a quantum algorithm is a step-by-step procedure, where each of the steps can be performed on a quantum computer. Although all classical algorithms can also be performed on a quantum computer, the term quantum algorithm is usually used for those algorithms which seem inherently quantum, or use some essential feature of quantum computation such as quantum superposition or quantum entanglement.

In computational complexity theory, Savitch's theorem, proved by Walter Savitch in 1970, gives a relationship between deterministic and non-deterministic space complexity. It states that for any function ,

In computational complexity theory, P, also known as PTIME or DTIME(nO ), is a fundamental complexity class. It contains all decision problems that can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time.

In complexity theory, PP is the class of decision problems solvable by a probabilistic Turing machine in polynomial time, with an error probability of less than 1/2 for all instances. The abbreviation PP refers to probabilistic polynomial time. The complexity class was defined by Gill in 1977.

The Valiant–Vazirani theorem is a theorem in computational complexity theory stating that if there is a polynomial time algorithm for Unambiguous-SAT, then NP = RP. It was proven by Leslie Valiant and Vijay Vazirani in their paper titled NP is as easy as detecting unique solutions published in 1986. The proof is based on the Mulmuley–Vazirani–Vazirani isolation lemma, which was subsequently used for a number of important applications in theoretical computer science.

In computational complexity theory and analysis of algorithms, a number of metrics are defined describing the resources, such as time or space, that a machine needs to solve a particular problem. Interpreting these metrics meaningfully requires context, and this context is frequently implicit and depends on the field and the problem under consideration. This article describes a number of important pieces of context and how they affect metrics.

In computational complexity theory of computer science, the structural complexity theory or simply structural complexity is the study of complexity classes, rather than computational complexity of individual problems and algorithms. It involves the research of both internal structures of various complexity classes and the relations between different complexity classes.

In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of branching operations based on comparisons of some quantities, the comparisons being assigned unit computational cost.

Quantum complexity theory is a part of computational complexity theory in theoretical computer science. It studies complexity classes defined using quantum computers and quantum information which are computational models based on quantum mechanics. It studies the hardness of problems in relation to these complexity classes, and the relationship between quantum complexity classes and classical complexity classes.

## References

• Arora, Sanjeev; Barak, Boaz (2009), Computational Complexity: A Modern Approach, Cambridge, ISBN   978-0-521-42426-4, Zbl   1193.68112
• Du, Ding-Zhu; Ko, Ker-I (2000), Theory of Computational Complexity, John Wiley & Sons, ISBN   978-0-471-34506-0
• Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness , W. H. Freeman, ISBN   0-7167-1045-5
• Goldreich, Oded (2008), Computational Complexity: A Conceptual Perspective, Cambridge University Press
• van Leeuwen, Jan, ed. (1990), Handbook of theoretical computer science (vol. A): algorithms and complexity, MIT Press, ISBN   978-0-444-88071-0
• Papadimitriou, Christos (1994), Computational Complexity (1st ed.), Addison Wesley, ISBN   0-201-53082-1
• Sipser, Michael (2006), Introduction to the Theory of Computation (2nd ed.), USA: Thomson Course Technology, ISBN   0-534-95097-3