Exponential time hypothesis

Last updated

In computational complexity theory, the exponential time hypothesis is an unproven computational hardness assumption that was formulated by Impagliazzo & Paturi (1999). It states that satisfiability of 3-CNF Boolean formulas cannot be solved in subexponential time, . More precisely, the usual form of the hypothesis asserts the existence of a number such that all algorithms that correctly solve this problem require time at least The exponential time hypothesis, if true, would imply that P ≠ NP, but it is a stronger statement. It implies that many computational problems are equivalent in complexity, in the sense that if one of them has a subexponential time algorithm then they all do, and that many known algorithms for these problems have optimal or near-optimal time complexity. [1]

Contents

Definition

The -SAT problem is a version of the Boolean satisfiability problem in which the input to the problem is a Boolean expression in conjunctive normal form (that is, an and of ors of variables and their negations) with at most variables per clause. The goal is to determine whether this expression can be made to be true by some assignment of Boolean values to its variables. 2-SAT has a linear time algorithm, but all known algorithms for larger take exponential time, with the base of the exponential function depending on . For instance, the WalkSAT probabilistic algorithm can solve -SAT in average time where is the number of variables in the given -SATinstance. [2] For each integer , define to be the smallest number such that -SAT can be solved in time . This minimum might not exist, if a sequence of better and better algorithms have correspondingly smaller exponential growth in their time bounds; in that case, define to be the infimum of the real numbers for which -SAT can be solved in time . Because problems with larger cannot be easier, these numbers are ordered as , and because of WalkSAT they are at most The exponential time hypothesis is the conjecture that they are all nonzero, or equivalently, that the smallest of them, , is nonzero. [1]

Some sources define the exponential time hypothesis to be the slightly weaker statement that 3-SAT cannot be solved in time . If there existed an algorithm to solve 3-SAT in time , then would equal zero. However, it is consistent with current knowledge that there could be a sequence of 3-SAT algorithms, each with running time for a sequence of numbers tending towards zero, but where the descriptions of these algorithms are so quickly growing that a single algorithm could not automatically select and run the most appropriate one. If this were to be the case, then would equal zero even though there would be no single algorithm running in time . [3] A related variant of the exponential time hypothesis is the non-uniform exponential time hypothesis, which posits that there is no family of algorithms (one for each length of the input, in the spirit of advice) that can solve 3-SAT in time . [4]

Because the numbers form a monotonic sequence that is bounded above by one, they must converge to a limit The strong exponential time hypothesis (SETH) is the conjecture that . [5]

Implications

Satisfiability

It is not possible for to equal for any finite : as Impagliazzo, Paturi & Zane (2001) showed, there exists a constant such that . Therefore, if the exponential time hypothesis is true, there must be infinitely many values of for which differs from . [6]

An important tool in this area is the sparsification lemma of Impagliazzo, Paturi & Zane (2001), which shows that, for every , any -CNF formula can be replaced by simpler -CNF formulas in which each variable appears only a constant number of times, and therefore in which the number of clauses is linear. The sparsification lemma is proven by repeatedly finding large sets of clauses that have a nonempty common intersection in a given formula, and replacing the formula by two simpler formulas, one of which has each of these clauses replaced by their common intersection and the other of which has the intersection removed from each clause. By applying the sparsification lemma and then using new variables to split the clauses, one may then obtain a set of 3-CNF formulas, each with a linear number of variables, such that the original -CNF formula is satisfiable if and only if at least one of these 3-CNF formulas is satisfiable. Therefore, if 3-SAT could be solved in subexponential time, one could use this reduction to solve -SAT in subexponential time as well. Equivalently, if for any , then as well, and the exponential time hypothesis would be true. [7] [6]

The limiting value of the sequence of numbers is at most equal to , where is the infimum of the numbers such that satisfiability of conjunctive normal form formulas without clause length limits can be solved in time . Therefore, if the strong exponential time hypothesis is true, then there would be no algorithm for general CNF satisfiability that is significantly faster than a brute-force search over all possible truth assignments. However, if the strong exponential time hypothesis fails, it would still be possible for to equal one. [8]

Other search problems

The exponential time hypothesis implies that many other problems in the complexity class SNP do not have algorithms whose running time is faster than for some constant . These problems include graph k-colorability, finding Hamiltonian cycles, maximum cliques, maximum independent sets, and vertex cover on -vertex graphs. Conversely, if any of these problems has a subexponential algorithm, then the exponential time hypothesis could be shown to be false. [7] [6]

If cliques or independent sets of logarithmic size could be found in polynomial time, the exponential time hypothesis would be false. Therefore, even though finding cliques or independent sets of such small size is unlikely to be NP-complete, the exponential time hypothesis implies that these problems are non-polynomial. [7] [9] More generally, the exponential time hypothesis implies that it is not possible to find cliques or independent sets of size in time . [10] The exponential time hypothesis also implies that it is not possible to solve the k-SUM problem (given real numbers, find of them that add to zero) in time . The strong exponential time hypothesis implies that it is not possible to find -vertex dominating sets more quickly than in time . [8]

The exponential time hypothesis implies also that the weighted feedback arc set problem on tournaments does not have a parametrized algorithm with running time . It does however have a parameterized algorithm with running time . [11]

The strong exponential time hypothesis leads to tight bounds on the parameterized complexity of several graph problems on graphs of bounded treewidth. In particular, if the strong exponential time hypothesis is true, then the optimal time bound for finding independent sets on graphs of treewidth is , the optimal time for the dominating set problem is , the optimum time for maximum cut is , and the optimum time for -coloringis . [12] Equivalently, any improvement on these running times would falsify the strong exponential time hypothesis. [13] The exponential time hypothesis also implies that any fixed-parameter tractable algorithm for edge clique cover must have double exponential dependence on the parameter. [14]

Communication complexity

In the three-party set disjointness problem in communication complexity, three subsets of the integers in some range are specified, and three communicating parties each know two of the three subsets. The goal is for the parties to transmit as few bits to each other on a shared communications channel in order for one of the parties to be able to determine whether the intersection of the three sets is empty or nonempty. A trivial -bit communications protocol would be for one of the three parties to transmit a bitvector describing the intersection of the two sets known to that party, after which either of the two remaining parties can determine the emptiness of the intersection. However, if there exists a protocol that solves the problem with communication and computation, it could be transformed into an algorithm for solving -SAT in time for any fixed constant , violating the strong exponential time hypothesis. Therefore, the strong exponential time hypothesis implies either that the trivial protocol for three-party set disjointness is optimal, or that any better protocol requires an exponential amount of computation. [8]

Structural complexity

If the exponential time hypothesis is true, then 3-SAT would not have a polynomial time algorithm, and therefore it would follow that P ≠ NP. More strongly, in this case, 3-SAT could not even have a quasi-polynomial time algorithm, so NP could not be a subset of QP. However, if the exponential time hypothesis fails, it would have no implication for the P versus NP problem. A padding argument proves the existence of NP-complete problems for which the best known running times have the form for , and if the best possible running time for 3-SAT were of this form, then P would be unequal to NP (because 3-SAT is NP-complete and this time bound is not polynomial) but the exponential time hypothesis would be false.

In parameterized complexity theory, because the exponential time hypothesis implies that there does not exist a fixed-parameter-tractable algorithm for maximum clique, it also implies that W[1] ≠ FPT. [10] It is an important open problem in this area whether this implication can be reversed: does W[1] ≠ FPT imply the exponential time hypothesis? There is a hierarchy of parameterized complexity classes called the M-hierarchy that interleaves the W-hierarchy in the sense that, for all ,; for instance, the problem of finding a vertex cover of size in an -vertex graph with parameter is complete for M[1]. The exponential time hypothesis is equivalent to the statement that M[1] ≠ FPT, and the question of whether for is also open. [3]

It is also possible to prove implications in the other direction, from the failure of a variation of the strong exponential time hypothesis to separations of complexity classes. As Williams (2010) shows, if there exists an algorithm that solves Boolean circuit satisfiability in time for some superpolynomially growing function , then NEXPTIME is not a subset of P/poly. Williams shows that, if algorithm exists, and a family of circuits simulating NEXPTIME in P/poly also existed, then algorithm could be composed with the circuits to simulate NEXPTIME problems nondeterministically in a smaller amount of time, violating the time hierarchy theorem. Therefore, the existence of algorithm proves the nonexistence of the family of circuits and the separation of these two complexity classes. [15]

See also

Notes

  1. 1 2 Impagliazzo, Russell; Paturi, Ramamohan (1999), "The Complexity of k-SAT", Proc. 14th IEEE Conf. on Computational Complexity, pp. 237–240, doi:10.1109/CCC.1999.766282, ISBN   978-0-7695-0075-1, S2CID   442454
  2. Schöning, Uwe (1999), "A probabilistic algorithm for -SAT and constraint satisfaction problems", 40th Annual Symposium on Foundations of Computer Science, FOCS '99, 17-18 October, 1999, New York, NY, USA, IEEE Computer Society, pp. 410–414, doi:10.1109/SFFCS.1999.814612, S2CID   1230959
  3. 1 2 Flum, Jörg; Grohe, Martin (2006), "16. Subexponential Fixed-Parameter Tractability", Parameterized Complexity Theory, EATCS Texts in Theoretical Computer Science, Springer-Verlag, pp. 417–451, ISBN   978-3-540-29952-3
  4. Chen, Yijia; Eickmeyer, Kord; Flum, Jörg (2012), "The exponential time hypothesis and the parameterized clique problem", in Thilikos, Dimitrios M.; Woeginger, Gerhard J. (eds.), Parameterized and Exact Computation – 7th International Symposium, IPEC 2012, Ljubljana, Slovenia, September 12–14, 2012, Proceedings, Lecture Notes in Computer Science, vol. 7535, Springer, pp. 13–24, CiteSeerX   10.1.1.680.8401 , doi:10.1007/978-3-642-33293-7_4
  5. Calabro, Chris; Impagliazzo, Russel; Paturi, Ramamohan (2009), "The Complexity of Satisfiability of Small Depth Circuits", Parameterized and Exact Computation, 4th International Workshop, IWPEC 2009, Copenhagen, Denmark, September 10-11, 2009, Revised Selected Papers, Lecture Notes in Computer Science, vol. 5917, pp. 75–85, CiteSeerX   10.1.1.331.764 , doi:10.1007/978-3-642-11269-0_6
  6. 1 2 3 Impagliazzo, Russell; Paturi, Ramamohan; Zane, Francis (2001), "Which problems have strongly exponential complexity?", Journal of Computer and System Sciences, 63 (4): 512–530, CiteSeerX   10.1.1.66.3717 , doi:10.1006/jcss.2001.1774
  7. 1 2 3 Woeginger, Gerhard (2003), "Exact algorithms for NP-hard problems: A survey", Combinatorial Optimization — Eureka, You Shrink! (PDF), Lecture Notes in Computer Science, vol. 2570, Springer-Verlag, pp. 185–207, CiteSeerX   10.1.1.168.5383 , doi:10.1007/3-540-36478-1_17, ISBN   978-3-540-00580-3, S2CID   289357, archived from the original (PDF) on 2020-09-30, retrieved 2011-03-31
  8. 1 2 3 Pătraşcu, Mihai; Williams, Ryan (2010), "On the possibility of faster SAT algorithms", Proc. 21st ACM/SIAM Symposium on Discrete Algorithms (SODA 2010) (PDF), pp. 1065–1075
  9. Feige, Uriel; Kilian, Joe (1997), "On limited versus polynomial nondeterminism", Chicago Journal of Theoretical Computer Science, 1: 1–20, doi: 10.4086/cjtcs.1997.001
  10. 1 2 Chen, Jianer; Huang, Xiuzhen; Kanj, Iyad A.; Xia, Ge (2006), "Strong computational lower bounds via parameterized complexity", Journal of Computer and System Sciences, 72 (8): 1346–1367, doi: 10.1016/j.jcss.2006.04.007
  11. Karpinski, Marek; Schudy, Warren (2010), "Faster Algorithms for Feedback Arc Set Tournament, Kemeny Rank Aggregation and Betweenness Tournament", Proc. ISAAC 2010, Part I, Lecture Notes in Computer Science, 6506: 3–14, arXiv: 1006.4396 , doi:10.1007/978-3-642-17517-6_3, ISBN   978-3-642-17516-9, S2CID   16512997
  12. Cygan, Marek; Fomin, Fedor V.; Kowalik, Lukasz; Lokshtanov, Daniel; Marx, Daniel; Pilipczuk, Marcin; Pilipczuk, Michal; Saurabh, Saket (2015), Parameterized Algorithms, Springer, p. 555, ISBN   978-3-319-21274-6
  13. Lokshtanov, Daniel; Marx, Dániel; Saurabh, Saket (2011), "Known algorithms on graphs of bounded treewidth are probably optimal", Proc. 22nd ACM/SIAM Symposium on Discrete Algorithms (SODA 2011), pp. 777–789, arXiv: 1007.5450 , doi:10.1137/1.9781611973082.61, S2CID   1810488
  14. Cygan, Marek; Pilipczuk, Marcin; Pilipczuk, Michał (2016), "Known algorithms for edge clique cover are probably optimal", SIAM Journal on Computing , 45 (1): 67–83, arXiv: 1203.1754 , doi:10.1137/130947076, MR   3448348, S2CID   11264145
  15. Williams, Ryan (2010), "Improving exhaustive search implies superpolynomial lower bounds", Proc. 42nd ACM Symposium on Theory of Computing (STOC 2010), New York, NY, USA: ACM, pp. 231–240, CiteSeerX   10.1.1.216.1299 , doi:10.1145/1806689.1806723, ISBN   9781450300506, S2CID   651703

Further reading

Related Research Articles

In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make (a AND NOT b) = TRUE. In contrast, "a AND NOT a" is unsatisfiable.

The P versus NP problem is a major unsolved problem in theoretical computer science. Informally, it asks whether every problem whose solution can be quickly verified can also be quickly solved.

<span class="mw-page-title-main">Clique problem</span> Task of computing complete subgraphs

In computer science, the clique problem is the computational problem of finding cliques in a graph. It has several different formulations depending on which cliques, and what information about the cliques, should be found. Common formulations of the clique problem include finding a maximum clique, finding a maximum weight clique in a weighted graph, listing all maximal cliques, and solving the decision problem of testing whether a graph contains a clique larger than a given size.

<span class="mw-page-title-main">Time complexity</span> Estimate of time taken for running an algorithm

In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

In computer science, 2-satisfiability, 2-SAT or just 2SAT is a computational problem of assigning values to variables, each of which has two possible values, in order to satisfy a system of constraints on pairs of variables. It is a special case of the general Boolean satisfiability problem, which can involve constraints on more than two variables, and of constraint satisfaction problems, which can allow more than two choices for the value of each variable. But in contrast to those more general problems, which are NP-complete, 2-satisfiability can be solved in polynomial time.

<span class="mw-page-title-main">Vertex cover</span> Subset of a graphs vertices, including at least one endpoint of every edge

In graph theory, a vertex cover of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.

In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured as a function of the number of bits in the input. This appears to have been first demonstrated in Gurevich, Stockmeyer & Vishkin (1984). The first systematic work on parameterized complexity was done by Downey & Fellows (1999).

In computational complexity theory, the Cook–Levin theorem, also known as Cook's theorem, states that the Boolean satisfiability problem is NP-complete. That is, it is in NP, and any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the Boolean satisfiability problem.

<span class="mw-page-title-main">Russell Impagliazzo</span> American computer scientist

Russell Graham Impagliazzo is a professor of computer science at the University of California, San Diego, specializing in computational complexity theory.

In complexity theory, the Karp–Lipton theorem states that if the Boolean satisfiability problem (SAT) can be solved by Boolean circuits with a polynomial number of logic gates, then

MAX-3SAT is a problem in the computational complexity subfield of computer science. It generalises the Boolean satisfiability problem (SAT) which is a decision problem considered in complexity theory. It is defined as:

In computational complexity, problems that are in the complexity class NP but are neither in the class P nor NP-complete are called NP-intermediate, and the class of such problems is called NPI. Ladner's theorem, shown in 1975 by Richard E. Ladner, is a result asserting that, if P ≠ NP, then NPI is not empty; that is, NP contains problems that are neither in P nor NP-complete. Since it is also true that if NPI problems exist, then P ≠ NP, it follows that P = NP if and only if NPI is empty.

In computational complexity theory, a computational hardness assumption is the hypothesis that a particular problem cannot be solved efficiently. It is not known how to prove (unconditional) hardness for essentially any useful problem. Instead, computer scientists rely on reductions to formally relate the hardness of a new or complicated problem to a computational hardness assumption about a problem that is better-understood.

In computer science, a kernelization is a technique for designing efficient algorithms that achieve their efficiency by a preprocessing stage in which inputs to the algorithm are replaced by a smaller input, called a "kernel". The result of solving the problem on the kernel should either be the same as on the original input, or it should be easy to transform the output on the kernel to the desired output for the original problem.

MAXEkSAT is a problem in computational complexity theory that is a maximization version of the Boolean satisfiability problem 3SAT. In MAXEkSAT, each clause has exactly k literals, each with distinct variables, and is in conjunctive normal form. These are called k-CNF formulas. The problem is to determine the maximum number of clauses that can be satisfied by a truth assignment to the variables in the clauses.

<span class="mw-page-title-main">NP-completeness</span> Complexity class

In computational complexity theory, a problem is NP-complete when:

  1. It is a decision problem, meaning that for any input to the problem, the output is either "yes" or "no".
  2. When the answer is "yes", this can be demonstrated through the existence of a short solution.
  3. The correctness of each solution can be verified quickly and a brute-force search algorithm can find a solution by trying all possible solutions.
  4. The problem can be used to simulate every other problem for which we can verify quickly that a solution is correct. In this sense, NP-complete problems are the hardest of the problems to which solutions can be verified quickly. If we could find solutions of some NP-complete problem quickly, we could quickly find the solutions of every other problem to which a given solution can be easily verified.

In computer science, the Sharp Satisfiability Problem is the problem of counting the number of interpretations that satisfy a given Boolean formula, introduced by Valiant in 1979. In other words, it asks in how many ways the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. For example, the formula is satisfiable by three distinct boolean value assignments of the variables, namely, for any of the assignments, , and, we have

In theoretical computer science, the term isolation lemma refers to randomized algorithms that reduce the number of solutions to a problem to one, should a solution exist. This is achieved by constructing random constraints such that, with non-negligible probability, exactly one solution satisfies these additional constraints if the solution space is not empty. Isolation lemmas have important applications in computer science, such as the Valiant–Vazirani theorem and Toda's theorem in computational complexity theory.

In computational complexity theory, a branch of computer science, the Max/min CSP/Ones classification theorems state necessary and sufficient conditions that determine the complexity classes of problems about satisfying a subset S of boolean relations. They are similar to Schaefer's dichotomy theorem, which classifies the complexity of satisfying finite sets of relations; however, the Max/min CSP/Ones classification theorems give information about the complexity of approximating an optimal solution to a problem defined by S.

In computational complexity theory, a fine-grained reduction is a transformation from one computational problem to another, used to relate the difficulty of improving the time bounds for the two problems. Intuitively, it provides a method for solving one problem efficiently by using the solution to the other problem as a subroutine. If problem can be solved in time and problem can be solved in time , then the existence of an -reduction from problem to problem implies that any significant speedup for problem would also lead to a speedup for problem .