This article includes a list of general references, but it lacks sufficient corresponding inline citations .(August 2011) |
The complexity of constraint satisfaction is the application of computational complexity theory to constraint satisfaction. It has mainly been studied for discriminating between tractable and intractable classes of constraint satisfaction problems on finite domains.
Solving a constraint satisfaction problem on a finite domain is an NP-complete problem in general. Research has shown a number of polynomial-time subcases, mostly obtained by restricting either the allowed domains or constraints or the way constraints can be placed over the variables. Research has also established a relationship between the constraint satisfaction problem and problems in other areas such as finite model theory and databases.
Establishing whether a constraint satisfaction problem on a finite domain has solutions is an NP-complete problem in general. This is an easy consequence of a number of other NP-complete problems being expressible as constraint satisfaction problems. Such other problems include propositional satisfiability and three-colorability.
Tractability can be obtained by considering specific classes of constraint satisfaction problems. As an example, if the domain is binary and all constraints are binary, establishing satisfiability is a polynomial-time problem because this problem is equivalent to 2-SAT, which is a polynomial-time problem.
One line of research used a correspondence between constraint satisfaction problem and the problem of establishing the existence of a homomorphism between two relational structures. This correspondence has been used to link constraint satisfaction with topics traditionally related to database theory.
A considered research problem is about the existence of dichotomies among sets of restrictions. This is the question of whether a set of restrictions contains only polynomial-time restrictions and NP-complete restrictions. For relational restrictions (see below) this question was settled in the positive for Boolean domains by Schaefer's dichotomy theorem [1] and for any finite domain by Andrei Bulatov [2] and Dmitriy Zhuk, [3] independently, in 2017.
Tractable subcases of the general constraint satisfaction problem can be obtained by placing suitable restrictions on the problems. Various kinds of restrictions have been considered.
Tractability can be obtained by restricting the possible domains or constraints. In particular, two kinds of restrictions have been considered:
More precisely, a relational restriction specifies a constraint language, which is a domain and a set of relations over this domain. A constraint satisfaction problem meets this restriction if it has exactly this domain and the relation of each constraint is in the given set of relations. In other words, a relational restriction bounds the domain and the set of satisfying values of each constraint, but not how the constraints are placed over variables. This is instead done by structural restrictions. Structural restriction can be checked by looking only at the scopes of constraints (their variables), ignoring their relations (their set of satisfying values).
A constraint language is tractable if there exists a polynomial algorithm solving all problems based on the language, that is, using the domain and relations specified in the domain. An example of a tractable constraint language is that of binary domains and binary constraints. Formally, this restriction corresponds to allowing only domains of size 2 and only constraints whose relation is a binary relation. While the second fact implies that the scopes of the constraints are binary, this is not a structural restriction because it does not forbid any constraint to be placed on an arbitrary pair of variables. Incidentally, the problem becomes NP-complete if either restriction is lifted: binary constraints and ternary domains can express the graph 3-coloring problem, while ternary constraints and binary domains can express 3-SAT; these two problems are both NP-complete.
An example of a tractable class defined in terms of a structural restriction is that of binary acyclic problems. Given a constraint satisfaction problem with only binary constraints, its associated graph has a vertex for every variable and an edge for every constraint; two vertices are joined if they are in a constraint. If the graph of a problem is acyclic, the problem is called acyclic as well. The problem of satisfiability on the class of binary acyclic problem is tractable. This is a structural restriction because it does not place any limit to the domain or to the specific values that satisfy a constraints; rather, it restricts the way constraints are placed over variables.
While relational and structural restrictions are the ones mostly used to derive tractable classes of constraint satisfaction, there are some tractable classes that cannot be defined by relational restrictions only or structural restrictions only. The tractable class defined in terms of row convexity cannot be defined only in terms of the relations or only in terms of the structure, as row convexity depends both on the relations and on the order of variables (and therefore cannot be checked by looking only at each constraint in turn).
The subcase obtained by restricting to a finite constraint language is called a non-uniform problem. These problems are mostly considered when expressing constraint satisfaction in terms of the homomorphism problem, as explained below. Uniform problems were also defined in the settings of homomorphism problems; a uniform problem can be defined as the union of a (possibly infinite) collection of non-uniform problems. A uniform problem made of an infinite set of non-uniform problems can be intractable even if all these non-uniform problems are tractable.
Some considered restrictions are based on the tractability of the constraint satisfaction problem where the constraints are all binary and form a tree over the variables. This is a structural restriction, as it can be checked by looking only at the scopes of the constraints, ignoring domains and relations.
This restriction is based on the primal graph of the problem, which is a graph whose vertices are the variables of the problem and the edges represent the presence of a constraint between two variables. Tractability can however also be obtained by placing the condition of being a tree to the primal graph of problems that are reformulations of the original one.
Constraint satisfaction problems can be reformulated in terms of other problems, leading to equivalent conditions to tractability. The most-used reformulation is that in terms of the homomorphism problem.
A link between constraint satisfaction and database theory has been provided in the form of a correspondence between the problem of constraint satisfiability and the problem of checking whether there exists a homomorphism between two relational structures. A relational structure is a mathematical representation of a relational database: it is a set of values and a set of relations over these values. Formally, , where each is a relation over , that is, a set of tuples of values of .
A relational structure is different from a constraint satisfaction problem because a constraint is a relation and a tuple of variables. Also different is the way in which they are used: for a constraint satisfaction problem, finding a satisfying assignment is the main problem; for a relation structure, the main problem is finding the answer to a query.
The constraint satisfaction problem is however related to the problem of establishing the existence of a homomorphism between two relational structures. A homomorphism is a function from the values of the first relation to the values of the second that, when applied to all values of a relation of the first structure, turns it into a subset of the corresponding relation of the second structure. Formally, is a homomorphism from to if it is a function from to such that, if then .
A direct correspondence between the constraint satisfaction problem and the homomorphism problem can be established. For a given constraint satisfaction problem, one can build a pair of relational structures, the first encoding the variables and the signatures of constraints, the second encoding the domains and the relations of the constraints. Satisfiability of the constraint satisfaction problem corresponds to finding a value for every variable such that replacing a value in a signature makes it a tuple in the relation of the constraint. This is possible exactly if this evaluation is a homomorphism between the two relational structures.
The inverse correspondence is the opposite one: given two relational structures, one encodes the values of the first in the variables of a constraint satisfaction problem, and the values of the second in the domain of the same problem. For every tuple of every relation of the first structure, there is a constraint having as values the correspondent relation of the second structure. This way, a homomorphism corresponds to mapping every scope of every constraint (every tuple of every relation of the first structure) into a tuple in the relation of the constraint (a tuple in the corresponding relation of the second structure).
A non-uniform constraint satisfaction problem is a restriction where the second structure of the homomorphism problem is fixed. In other words, every relational structure defines a non-uniform problem, that of telling whether a relation structure is homomorphic to it. A similar restriction can be placed on the first structure; for any fixed first structure, the homomorphism problem is tractable, because then there are only a polynomial number of functions from the first structure to the second. A uniform constraint satisfaction problem is an arbitrary restriction to the sets of structures for the first and second structure of the homomorphism problem.
Since the homomorphism problem is equivalent to conjunctive query evaluation and conjunctive query containment, these two problems are equivalent to constraint satisfaction as well.
Every constraint can be viewed as a table in a database, where the variables are interpreted as attribute names and the relation is the set of records in the table. The solutions of a constraint satisfaction problem are the result of an inner join of the tables representing its constraints; therefore, the problem of existence of solutions can be reformulated as the problem of checking whether the result of an inner join of a number of tables is nonempty.
Some constraint languages (or non-uniform problems) are known to correspond to problems solvable in polynomial time, and some others are known to express NP-complete problems. However, it is possible that some constraint languages are neither. It is known by Ladner's theorem that if P is not equal to NP, then there exist problems in NP that are neither polynomial-time nor NP-hard. For constraint problems with a fixed constraint language and no structural restrictions, such intermediate problems do not exist as proven by Andrei Bulatov [2] and Dmitriy Zhuk, [3] independently, in 2017. If no Ladner languages are expressible as constraint satisfaction problems, the set of all constraint languages can be divided exactly into those defining polynomial-time and those defining NP-complete problems; that is, this set exhibits a dichotomy.
Some particular cases of the Bulatov and Zhuk result were already known. The most well known such result is Schaefer's dichotomy theorem, which proves the existence of a dichotomy in the set of constraint languages on a binary domain. More precisely, it proves that a relation restriction on a binary domain is tractable if all its relations belong to one of six classes, and is NP-complete otherwise. Bulatov proved a dichotomy theorem for domains of three elements. [4]
Another dichotomy theorem for constraint languages is the Hell–Nesetril theorem, which shows a dichotomy for problems on binary constraints with a single fixed symmetric relation. In terms of the homomorphism problem, every such problem is equivalent to the existence of a homomorphism from a relational structure to a given fixed undirected graph (an undirected graph can be regarded as a relational structure with a single binary symmetric relation). The Hell–Nesetril theorem proves that every such problem is either polynomial-time or NP-complete. More precisely, the problem is polynomial-time if the graph is 2-colorable, that is, it is bipartite, and is NP-complete otherwise.
Some complexity results prove that some restrictions are polynomial without proving that all other possible restrictions of the same kind are NP-hard.
A sufficient condition for tractability is related to expressibility in Datalog. A Boolean Datalog query gives a truth value to a set of literals over a given alphabet, each literal being an expression of the form ; as a result, a Boolean Datalog query expresses a set of sets of literals, as it can be considered semantically equivalent to the set of all sets of literals that it evaluates to true.
On the other hand, a non-uniform problem can be seen as a way for expressing a similar set. For a given non-uniform problem, the set of relations that can be used in constraints is fixed; as a result, one can give unique names to them. An instance of this non-uniform problem can be then written as a set of literals of the form . Among these instances/sets of literals, some are satisfiable and some are not; whether a set of literals is satisfiable depends on the relations, which are specified by the non-uniform problem. In the other way around, a non-uniform problem tells which sets of literals represent satisfiable instances and which ones represent unsatisfiable instances. Once relations are named, a non-uniform problem expresses a set of sets of literals: those associated to satisfiable (or unsatisfiable) instances.
A sufficient condition of tractability is that a non-uniform problem is tractable if the set of its unsatisfiable instances can be expressed by a Boolean Datalog query. In other words, if the set of sets of literals that represent unsatisfiable instances of the non-uniform problem is also the set of sets of literals that satisfy a Boolean Datalog query, then the non-uniform problem is tractable.
Satisfiability can sometimes be established by enforcing a form of local consistency and then checking the existence of an empty domain or constraint relation. This is in general a correct but incomplete unsatisfiability algorithm: a problem may be unsatisfiable even if no empty domain or constraint relation is produced. For some forms of local consistency, this algorithm may also require exponential time. However, for some problems and for some kinds of local consistency, it is correct and polynomial-time.
The following conditions exploit the primal graph of the problem, which has a vertex for each variable and an edge between two nodes if the corresponding variables are in a constraint. The following are conditions on binary constraint satisfaction problems where enforcing local consistency is tractable and allows establishing satisfiability:
A condition that extends the last one holds for non-binary constraint satisfaction problems. Namely, for all problems for which there exists an ordering that makes the primal graph having induced width bounded by a constant i, enforcing strong directional i-consistency is tractable and allows establishing satisfiability.
Constraint satisfaction problems composed of binary constraints only can be viewed as graphs, where the vertices are variables and the edges represent the presence of a constraint between two variables. This graph is called the Gaifman graph or primal constraint graph (or simply primal graph) of the problem.
If the primal graph of a problem is acyclic, establishing satisfiability of the problem is a tractable problem. This is a structural restriction, as it can be checked by looking only at the scopes of the constraints, disregarding their relations and the domain. An acyclic graph is a forest, but connectedness is usually assumed; as a result, the condition that is usually considered is that primal graphs are trees.
This property of tree-like constraint satisfaction problems is exploited by decomposition methods, which convert problems into equivalent ones that only contain binary constraints arranged as a tree. The variables of these problems correspond to sets of variables of the original problem; the domain of such a variable is obtained by considering some of the constraints of the original problem whose scope is contained in the corresponding original set of variables; constraints of these new problems represent equality of variables that are contained in two sets.
If the graph of one such equivalent problem is a tree, the problem can be solved efficiently. On the other hand, producing one such equivalent problem may be not be efficient because of two factors: the need to determine the combined effects of a group of constraints over a set of variables, and the need to store all tuples of values that satisfy a given group of constraints.
A necessary condition for the tractability of a constraint language based on the universal gadget has been proved. The universal gadget is a particular constraint satisfaction problem that was initially defined for the sake of expressing new relations by projection.
A relation that is not present in a constraint language may be "simulated" by constraints using the relations in the language. In particular, a new relation can be created by placing a set of constraints and using only some of their variables. If all other constraints use only these variables, this set of constraints forces these variable to only assume specific values, practically simulating a new relation.
Every constraint satisfaction problem and subset of its variables defines a relation, which is composed by all tuples of values of the variables that can be extended to the other variables to form a solution. Technically, this relation is obtained by projecting the relation having the solutions as rows over the considered variables.
The universal gadget is based on the observation that every relation that contains -tuples can be defined by projecting a relation that contains all possible columns of elements from the domain. As an example, the following tables shows such a projection:
a b c d e f g h b d --------------- --- 1 1 1 1 0 0 0 0 -> 1 1 1 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0 1 0 0 0
If the table on the left is the set of solutions of a constraint satisfaction problem, its variables and are constrained to the values of the table to the right. As a result, the constraint satisfaction problem can be used to set a constraint whose relation is the table on the right, which may not be in the constraint language.
As a result, if a constraint satisfaction problem has the table on the left as its set of solutions, every relation can be expressed by projecting over a suitable set of variables. A way for trying to obtain this table as the set of solution is to place every possible constraint that is not violated by the required solutions.
As an example, if the language contains the binary relation representing the Boolean disjunction (a relation containing all tuples of two elements that contains at least a 1), this relation is placed as a constraint on and , because their values in the table above are , again, and . Since all these values satisfy the constraint, the constraint is placed. On the other hand, a constraint with this relation is not placed on and , since the restriction of the table above to these two variables contains as a third row, and this evaluation violates that constraint.
The universal gadget of order is the constraint satisfaction problem containing all constraints that can be placed in order to obtain the table above. The solutions of the universal gadget include the rows of this table, but can contain other rows. If the solutions are exactly the rows of the table, every relation can be expressed by projecting on a subset of the variables. However, even if the solutions include other rows, some relations can still be expressed. A property of the universal gadget is that it is able to express, by projection, every relation that can be expressed by projection from an arbitrary constraint satisfaction problem based on the same language. More precisely, the universal gadget of order expresses all relations of rows that can be expressed in the constraint language.
Given a specific relation, its expressibility in the language can be checked by considering an arbitrary list of variables whose columns in the table above (the "ideal" solutions to the universal gadget) form that relation. The relation can be expressed in the language if and only if the solutions of the universal gadget coincides with the relation when projected over such a list of variables. In other words, one can check expressibility by selecting variables "as if" the solutions of the universal gadget were like in the table, and then check whether the restriction of the "real" solutions is actually the same as the relation. In the example above, the expressibility of the relation in the table on the right can be checked by looking whether the solutions of the universal gadget, when restricted to the variables and , are exactly the rows of this table.
A necessary condition for tractability can be expressed in terms of the universal gadget. The solutions of such a gadget can be tabulated as follows:
a b c d e f g h --------------- 1 1 1 1 0 0 0 0 1 1 0 0 1 1 0 0 (solutions that exist by definition) 1 0 1 0 1 0 1 0 --------------- .... 1 0 0 1 1 1 0 0 (other solutions are possible) ....
This table is made of two parts. The first part contains the solutions that exist by definition of this problem; the second part (that may be empty) contains all other solutions. Since the columns of the table are by definition associated to the possible -tuples of values of the domain, every solution can be viewed as a function from a -tuple of elements to a single element.
The function corresponding to a solution can be calculated from the first part of the table above and the solution. As an example, for the last solution marked in the table, this function can be determined for arguments as follows: first, these three values are the first part of the row "c" in the table; the value of the function is the value of the solution in the same column, that is, 0.
A necessary condition for tractability is the existence of a solution for a universal gadget of some order that is part of some classes of functions. This result however only holds for reduced languages, which are defined below.
Squashing functions are functions used to reduce the size of domain of constraint languages. A squashing function is defined in terms of a partition of the domain and a representative element for each set in the partition. The squashing function maps all elements of a set in the partition to the representative element of that set. For such a function being a squashing function it is also necessary that applying the function to all elements of a tuple of a relation in the language produces another tuple in the relation. The partition is assumed to contain at least a set of size greater than one.
Formally, given a partition of the domain containing at least a set of size greater than one, a squashing function is a function such that for every in the same partition, and for every tuple , it holds .
For constraint problems on a constraint language has a squashing function, the domain can be reduced via the squashing function. Indeed, every element in a set in the partition can be replaced with the result of applying the squashing function to it, as this result is guaranteed to satisfy at least all constraints that were satisfied by the element. As a result, all non-representative elements can be removed from the constraint language.
Constraint languages for which no squashing function exist are called reduced languages; equivalently, these are languages on which all reductions via squashing functions have been applied.
This section needs expansion. You can help by adding to it. (February 2010) |
The necessary condition for tractability based on the universal gadget holds for reduced languages. Such a language is tractable if the universal gadget has a solution that, when viewed as a function in the way specified above, is either a constant function, a majority function, an idempotent binary function, an affine function, or a semi-projection.
The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data is represented in terms of tuples, grouped into relations. A database organized in terms of the relational model is a relational database.
In computational complexity theory, a decision problem is P-complete if it is in P and every problem in P can be reduced to it by an appropriate reduction.
Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state the constraints on the feasible solutions for a set of decision variables. Constraints differ from the common primitives of imperative programming languages in that they do not specify a step or sequence of steps to execute, but rather the properties of a solution to be found. In addition to constraints, users also need to specify a method to solve these constraints. This typically draws upon standard methods like chronological backtracking and constraint propagation, but may use customized code like a problem-specific branching heuristic.
Constraint satisfaction problems (CSPs) are mathematical questions defined as a set of objects whose state must satisfy a number of constraints or limitations. CSPs represent the entities in a problem as a homogeneous collection of finite constraints over variables, which is solved by constraint satisfaction methods. CSPs are the subject of research in both artificial intelligence and operations research, since the regularity in their formulation provides a common basis to analyze and solve problems of many seemingly unrelated families. CSPs often exhibit high complexity, requiring a combination of heuristics and combinatorial search methods to be solved in a reasonable time. Constraint programming (CP) is the field of research that specifically focuses on tackling these kinds of problems. Additionally, the Boolean satisfiability problem (SAT), satisfiability modulo theories (SMT), mixed integer programming (MIP) and answer set programming (ASP) are all fields of research focusing on the resolution of particular forms of the constraint satisfaction problem.
In computer science, 2-satisfiability, 2-SAT or just 2SAT is a computational problem of assigning values to variables, each of which has two possible values, in order to satisfy a system of constraints on pairs of variables. It is a special case of the general Boolean satisfiability problem, which can involve constraints on more than two variables, and of constraint satisfaction problems, which can allow more than two choices for the value of each variable. But in contrast to those more general problems, which are NP-complete, 2-satisfiability can be solved in polynomial time.
In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to multiple parameters of the input or output. The complexity of a problem is then measured as a function of those parameters. This allows the classification of NP-hard problems on a finer scale than in the classical setting, where the complexity of a problem is only measured as a function of the number of bits in the input. This appears to have been first demonstrated in Gurevich, Stockmeyer & Vishkin (1984). The first systematic work on parameterized complexity was done by Downey & Fellows (1999).
In the mathematical field of graph theory, a graph homomorphism is a mapping between two graphs that respects their structure. More concretely, it is a function between the vertex sets of two graphs that maps adjacent vertices to adjacent vertices.
In artificial intelligence and operations research, constraint satisfaction is the process of finding a solution through a set of constraints that impose conditions that the variables must satisfy. A solution is therefore an assignment of values to the variables that satisfies all constraints—that is, a point in the feasible region.
The hidden transformation reformulates a constraint satisfaction problem in such a way all constraints have at most two variables. The new problem is satisfiable if and only if the original problem was, and solutions can be converted easily from one problem to the other.
In constraint satisfaction, local consistency conditions are properties of constraint satisfaction problems related to the consistency of subsets of variables or constraints. They can be used to reduce the search space and make the problem easier to solve. Various kinds of local consistency conditions are leveraged, including node consistency, arc consistency, and path consistency.
In universal algebra and in model theory, a structure consists of a set along with a collection of finitary operations and relations that are defined on it.
The dual problem is a reformulation of a constraint satisfaction problem expressing each constraint of the original problem as a variable. Dual problems only contain binary constraints, and are therefore solvable by algorithms tailored for such problems. The join graphs and join trees of a constraint satisfaction problem are graphs representing its dual problem or a problem obtained from the dual problem removing some redundant constraints.
Distributed constraint optimization is the distributed analogue to constraint optimization. A DCOP is a problem in which a group of agents must distributedly choose values for a set of variables such that the cost of a set of constraints over the variables is minimized.
In computational complexity theory, a branch of computer science, Schaefer's dichotomy theorem, proved by Thomas Jerome Schaefer, states necessary and sufficient conditions under which a finite set S of relations over the Boolean domain yields polynomial-time or NP-complete problems when the relations of S are used to constrain some of the propositional variables. It is called a dichotomy theorem because the complexity of the problem defined by S is either in P or is NP-complete, as opposed to one of the classes of intermediate complexity that is known to exist by Ladner's theorem.
In constraint satisfaction, a decomposition method translates a constraint satisfaction problem into another constraint satisfaction problem that is binary and acyclic. Decomposition methods work by grouping variables into sets, and solving a subproblem for each set. These translations are done because solving binary acyclic problems is a tractable problem.
In mathematics, a constraint is a condition of an optimization problem that the solution must satisfy. There are several types of constraints—primarily equality constraints, inequality constraints, and integer constraints. The set of candidate solutions that satisfy all constraints is called the feasible set.
In computational complexity theory, a gadget is a subunit of a problem instance that simulates the behavior of one of the fundamental units of a different computational problem. Gadgets are typically used to construct reductions from one computational problem to another, as part of proofs of NP-completeness or other types of computational hardness. The component design technique is a method for constructing reductions by using gadgets.
In database theory, a conjunctive query is a restricted form of first-order queries using the logical conjunction operator. Many first-order queries can be written as conjunctive queries. In particular, a large part of queries issued on relational databases can be expressed in this way. Conjunctive queries also have a number of desirable theoretical properties that larger classes of queries do not share.
In theoretical computer science, the circuit satisfiability problem is the decision problem of determining whether a given Boolean circuit has an assignment of its inputs that makes the output true. In other words, it asks whether the inputs to a given Boolean circuit can be consistently set to 1 or 0 such that the circuit outputs 1. If that is the case, the circuit is called satisfiable. Otherwise, the circuit is called unsatisfiable. In the figure to the right, the left circuit can be satisfied by setting both inputs to be 1, but the right circuit is unsatisfiable.
In computer science, an interchangeability algorithm is a technique used to more efficiently solve constraint satisfaction problems (CSP). A CSP is a mathematical problem in which objects, represented by variables, are subject to constraints on the values of those variables; the goal in a CSP is to assign values to the variables that are consistent with the constraints. If two variables A and B in a CSP may be swapped for each other without changing the nature of the problem or its solutions, then A and B are interchangeable variables. Interchangeable variables represent a symmetry of the CSP and by exploiting that symmetry, the search space for solutions to a CSP problem may be reduced. For example, if solutions with A=1 and B=2 have been tried, then by interchange symmetry, solutions with B=1 and A=2 need not be investigated.