Schaefer's dichotomy theorem

Last updated

In computational complexity theory, a branch of computer science, Schaefer's dichotomy theorem, proved by Thomas Jerome Schaefer, states necessary and sufficient conditions under which a finite set S of relations over the Boolean domain yields polynomial-time or NP-complete problems when the relations of S are used to constrain some of the propositional variables. [1] It is called a dichotomy theorem because the complexity of the problem defined by S is either in P or is NP-complete, as opposed to one of the classes of intermediate complexity that is known to exist (assuming P ≠ NP) by Ladner's theorem.

Contents

Special cases of Schaefer's dichotomy theorem include the NP-completeness of SAT (the Boolean satisfiability problem) and its two popular variants 1-in-3 SAT and not-all-equal 3SAT (often denoted by NAE-3SAT). In fact, for these two variants of SAT, Schaefer's dichotomy theorem shows that their monotone versions (where negations of variables are not allowed) are also NP-complete.

Original presentation

Schaefer defines a decision problem that he calls the Generalized Satisfiability problem for S (denoted by SAT(S)), where is a finite set of relations over the binary domain . An instance of the problem is an S-formula, i.e. a conjunction of constraints of the form where and the are propositional variables. The problem is to determine whether the given formula is satisfiable, in other words if the variables can be assigned values such that they satisfy all the constraints as given by the relations from S.

Schaefer identifies six classes of sets of Boolean relations for which SAT(S) is in P and proves that all other sets of relations generate an NP-complete problem. A finite set of relations S over the Boolean domain defines a polynomial-time computable satisfiability problem if any one of the following conditions holds:

  1. all relations that are not constantly false are true when all its arguments are true;
  2. all relations that are not constantly false are true when all its arguments are false;
  3. all relations are equivalent to a conjunction of binary clauses;
  4. all relations are equivalent to a conjunction of Horn clauses;
  5. all relations are equivalent to a conjunction of dual-Horn clauses;
  6. all relations are equivalent to a conjunction of affine formulae. [2]

Otherwise, the problem SAT(S) is NP-complete.

Modern presentation

A modern, streamlined presentation of Schaefer's theorem is given in an expository paper by Hubie Chen. [3] [4] In modern terms, the problem SAT(S) is viewed as a constraint satisfaction problem over the Boolean domain. In this area, it is standard to denote the set of relations by Γ and the decision problem defined by Γ as CSP(Γ).

This modern understanding uses algebra, in particular, universal algebra. For Schaefer's dichotomy theorem, the most important concept in universal algebra is that of a polymorphism. An operation is a polymorphism of a relation if, for any choice of m tuples from R, it holds that the tuple obtained from these m tuples by applying f coordinate-wise, i.e. , is in R. That is, an operation f is a polymorphism of R if R is closed under f: applying f to any tuples in R yields another tuple inside R. A set of relations Γ is said to have a polymorphism f if every relation in Γ has f as a polymorphism. This definition allows for the algebraic formulation of Schaefer's dichotomy theorem.

Let Γ be a finite constraint language over the Boolean domain. The problem CSP(Γ) is decidable in polynomial time if Γ has one of the following six operations as a polymorphism:

  1. the constant unary operation 1;
  2. the constant unary operation 0;
  3. the binary AND operation ∧;
  4. the binary OR operation ∨;
  5. the ternary majority operation
  6. the ternary minority operation

Otherwise, the problem CSP(Γ) is NP-complete.

In this formulation, it is easy to check if any of the tractability conditions hold.

Properties of Polymorphisms

Given a set Γ of relations, there is a surprisingly close connection between its polymorphisms and the computational complexity of CSP(Γ).

A relation R is called primitive positive definable, or short pp-definable, from a set Γ of relations if R(v1, ... , vk) ⇔ ∃x1 ... xm. C holds for some conjunction C of constraints from Γ and equations over the variables {v1,...,vk, x1,...,xm}. For example, if Γ consists of the ternary relation nae(x,y,z) holding if x,y,z are not all equal, and R(x,y,z) is xyz, then R can be pp-defined by R(x,y,z) ⇔ ∃a. nae(0,x,a) ∧ nae(y,za); this reduction has been used to prove that NAE-3SAT is NP-complete. The set of all relations that are pp-definable from Γ is denoted by ≪Γ≫. If Γ' ⊆ ≪Γ≫ for some finite constraint sets Γ and Γ', then CSP(Γ') reduces to CSP(Γ). [5]

Given a set Γ of relations, Pol(Γ) denotes the set of polymorphisms of Γ. Conversely, if O is a set of operations, then Inv(O) denotes the set of relations having all operations in O as a polymorphism. Pol and Inv together form an antitone Galois connection. For any finite set Γ of relations over a finite domain, ≪Γ≫ = Inv(Pol(Γ)) holds, that is, the set of relations pp-definable from Γ can be derived from the polymorphisms of Γ. [6] Moreover, if Pol(Γ) ⊆ Pol(Γ') for two finite relation sets Γ and Γ', then Γ' ⊆ ≪Γ≫ and CSP(Γ') reduces to CSP(Γ). As a consequence, two relation sets having the same polymorphisms lead to the same computational complexity. [7]

Generalizations

The analysis was later fine-tuned: CSP(Γ) is either solvable in co-NLOGTIME, L-complete, NL-complete, ⊕L-complete, P-complete or NP-complete, and given Γ, one can decide in polynomial time which of these cases holds. [8]

Schaefer's dichotomy theorem has also been generalized to use propositional logic of graphs instead of Boolean logic. [9]

If the problem is to count the number of solutions, which is denoted by #CSP(Γ), then there is a similar result for the binary domain by Creignou and Hermann. [10] Specifically, a finite set of relations S over the Boolean domain defines a polynomial time computable satisfiability problem if every relation in S is equivalent to a conjunction of affine formulae. [2]

For larger domains, a necessary condition for polynomial-time satisfiability was given by Bulatov and Dalmau. [11] Let Γ be a finite constraint language over the Boolean domain. If the problem #CSP(Γ) is computable in polynomial time, then Γ has a Mal'tsev operation as a polymorphism. Otherwise, the problem #CSP(Γ) is #P-complete. A Mal'tsev operation m is a ternary operation that satisfies An example of a Mal'tsev operation is the Minority operation given in the modern, algebraic formulation of Schaefer's dichotomy theorem above. Thus, when Γ has the Minority operation as a polymorphism, it is not only possible to decide CSP(Γ) in polynomial time, but to compute #CSP(Γ) in polynomial time. There are a total of 4 Mal'tsev operations on Boolean variables, determined by the values of and . An example of a less symmetric one is given by . On other domains, such as groups, examples of Mal'tsev operations include and For larger domains, even for a domain of size three, the existence of a Mal'tsev polymorphism for Γ is an insufficient condition for the tractability of #CSP(Γ). However, the absence of a Mal'tsev polymorphism for Γ implies the #P-hardness of #CSP(Γ).

See also

Related Research Articles

In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make (a AND NOT b) = TRUE. In contrast, "a AND NOT a" is unsatisfiable.

The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved.

Constraint satisfaction problems (CSPs) are mathematical questions defined as a set of objects whose state must satisfy a number of constraints or limitations. CSPs represent the entities in a problem as a homogeneous collection of finite constraints over variables, which is solved by constraint satisfaction methods. CSPs are the subject of research in both artificial intelligence and operations research, since the regularity in their formulation provides a common basis to analyze and solve problems of many seemingly unrelated families. CSPs often exhibit high complexity, requiring a combination of heuristics and combinatorial search methods to be solved in a reasonable time. Constraint programming (CP) is the field of research that specifically focuses on tackling these kinds of problems. Additionally, the Boolean satisfiability problem (SAT), satisfiability modulo theories (SMT), mixed integer programming (MIP) and answer set programming (ASP) are all fields of research focusing on the resolution of particular forms of the constraint satisfaction problem.

In computational complexity theory, the polynomial hierarchy is a hierarchy of complexity classes that generalize the classes NP and co-NP. Each class in the hierarchy is contained within PSPACE. The hierarchy can be defined using oracle machines or alternating Turing machines. It is a resource-bounded counterpart to the arithmetical hierarchy and analytical hierarchy from mathematical logic. The union of the classes in the hierarchy is denoted PH.

In computational complexity theory, the Cook–Levin theorem, also known as Cook's theorem, states that the Boolean satisfiability problem is NP-complete. That is, it is in NP, and any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the Boolean satisfiability problem.

In computational complexity theory, a function problem is a computational problem where a single output is expected for every input, but the output is more complex than that of a decision problem. For function problems, the output is not simply 'yes' or 'no'.

<span class="mw-page-title-main">Graph homomorphism</span> A structure-preserving correspondence between node-link graphs

In the mathematical field of graph theory, a graph homomorphism is a mapping between two graphs that respects their structure. More concretely, it is a function between the vertex sets of two graphs that maps adjacent vertices to adjacent vertices.

In computational complexity theory, an alternating Turing machine (ATM) is a non-deterministic Turing machine (NTM) with a rule for accepting computations that generalizes the rules used in the definition of the complexity classes NP and co-NP. The concept of an ATM was set forth by Chandra and Stockmeyer and independently by Kozen in 1976, with a joint journal publication in 1981.

In computational complexity theory, the class APX is the set of NP optimization problems that allow polynomial-time approximation algorithms with approximation ratio bounded by a constant. In simple terms, problems in this class have efficient algorithms that can find an answer within some fixed multiplicative factor of the optimal answer.

In complexity theory, the Karp–Lipton theorem states that if the Boolean satisfiability problem (SAT) can be solved by Boolean circuits with a polynomial number of logic gates, then

MAX-3SAT is a problem in the computational complexity subfield of computer science. It generalises the Boolean satisfiability problem (SAT) which is a decision problem considered in complexity theory. It is defined as:

In computational complexity theory, the maximum satisfiability problem (MAX-SAT) is the problem of determining the maximum number of clauses, of a given Boolean formula in conjunctive normal form, that can be made true by an assignment of truth values to the variables of the formula. It is a generalization of the Boolean satisfiability problem, which asks whether there exists a truth assignment that makes all clauses true.

The complexity of constraint satisfaction is the application of computational complexity theory to constraint satisfaction. It has mainly been studied for discriminating between tractable and intractable classes of constraint satisfaction problems on finite domains.

In computational complexity, problems that are in the complexity class NP but are neither in the class P nor NP-complete are called NP-intermediate, and the class of such problems is called NPI. Ladner's theorem, shown in 1975 by Richard E. Ladner, is a result asserting that, if P ≠ NP, then NPI is not empty; that is, NP contains problems that are neither in P nor NP-complete. Since it is also true that if NPI problems exist, then P ≠ NP, it follows that P = NP if and only if NPI is empty.

In computational complexity theory, the language TQBF is a formal language consisting of the true quantified Boolean formulas. A (fully) quantified Boolean formula is a formula in quantified propositional logic where every variable is quantified, using either existential or universal quantifiers, at the beginning of the sentence. Such a formula is equivalent to either true or false. If such a formula evaluates to true, then that formula is in the language TQBF. It is also known as QSAT.

<span class="mw-page-title-main">NP-completeness</span> Complexity class

In computational complexity theory, a problem is NP-complete when:

  1. It is a decision problem, meaning that for any input to the problem, the output is either "yes" or "no".
  2. When the answer is "yes", this can be demonstrated through the existence of a short solution.
  3. The correctness of each solution can be verified quickly and a brute-force search algorithm can find a solution by trying all possible solutions.
  4. The problem can be used to simulate every other problem for which we can verify quickly that a solution is correct. In this sense, NP-complete problems are the hardest of the problems to which solutions can be verified quickly. If we could find solutions of some NP-complete problem quickly, we could quickly find the solutions of every other problem to which a given solution can be easily verified.

In mathematical logic, a formula is satisfiable if it is true under some assignment of values to its variables. For example, the formula is satisfiable because it is true when and , while the formula is not satisfiable over the integers. The dual concept to satisfiability is validity; a formula is valid if every assignment of values to its variables makes the formula true. For example, is valid over the integers, but is not.

In computer science, the Sharp Satisfiability Problem is the problem of counting the number of interpretations that satisfy a given Boolean formula, introduced by Valiant in 1979. In other words, it asks in how many ways the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. For example, the formula is satisfiable by three distinct boolean value assignments of the variables, namely, for any of the assignments, , and, we have

In computational complexity theory, a branch of computer science, the Max/min CSP/Ones classification theorems state necessary and sufficient conditions that determine the complexity classes of problems about satisfying a subset S of boolean relations. They are similar to Schaefer's dichotomy theorem, which classifies the complexity of satisfying finite sets of relations; however, the Max/min CSP/Ones classification theorems give information about the complexity of approximating an optimal solution to a problem defined by S.

In computational complexity theory, NP/poly is a complexity class, a non-uniform analogue of the class NP of problems solvable in polynomial time by a non-deterministic Turing machine. It is the non-deterministic complexity class corresponding to the deterministic class P/poly.

References

  1. Schaefer, Thomas J. (1978). "The Complexity of Satisfiability Problems". STOC 1978. pp. 216–226. doi:10.1145/800133.804350.
  2. 1 2 Schaefer (1978, p.218 left) defines an affine formula to be of the form x1 ⊕ ... ⊕ xn = c, where each xi is a variable, c is a constant, i.e. true or false, and "⊕" denotes XOR, i.e. addition in a Boolean ring.
  3. Chen, Hubie (December 2009). "A Rendezvous of Logic, Complexity, and Algebra". ACM Computing Surveys . 42 (1): 1–32. arXiv: cs/0611018 . doi:10.1145/1592451.1592453. S2CID   11975818.
  4. Chen, Hubie (December 2006). "A Rendezvous of Logic, Complexity, and Algebra". ACM SIGACT News (Logic Column). 37 (4): 85–114. arXiv: cs/0611018 . doi:10.1145/1189056.1189076. S2CID   14130916.
  5. Chen (2006), p.8, Proposition 3.9; Chen uses polynomial-time many-one reduction
  6. Chen (2006), p.9, Theorem 3.13
  7. Chen (2006), p.11, Theorem 3.15
  8. Allender, Eric; Bauland, Michael; Immerman, Neil; Schnoor, Henning; Vollmer, Heribert (June 2009). "The complexity of satisfiability problems: Refining Schaefer's theorem" (PDF). Journal of Computer and System Sciences . 75 (4): 245–254. doi: 10.1016/j.jcss.2008.11.001 . Retrieved 2013-09-19.
  9. Bodirsky, Manuel; Pinsker, Michael (2015). "Schaefer's Theorem for Graphs". J. ACM . 62 (3): 19:1–19:52. arXiv: 1011.2894 . doi:10.1145/2764899. S2CID   750401.
  10. Creignou, Nadia; Hermann, Miki (1996). "Complexity of generalized satisfiability counting problems". Information and Computation . 125 (1): 1–12. doi: 10.1006/inco.1996.0016 . ISSN   0890-5401.
  11. Bulatov, Andrei A.; Dalmau, Víctor (1 May 2007). "Towards a dichotomy theorem for the counting constraint satisfaction problem". Information and Computation. 205 (5): 651–678. doi:10.1016/j.ic.2006.09.005. hdl: 10230/36327 . ISSN   0890-5401.