Conflict-driven clause learning

Last updated

In computer science, conflict-driven clause learning (CDCL) is an algorithm for solving the Boolean satisfiability problem (SAT). Given a Boolean formula, the SAT problem asks for an assignment of variables so that the entire formula evaluates to true. The internal workings of CDCL SAT solvers were inspired by DPLL solvers. The main difference between CDCL and DPLL is that CDCL's backjumping is non-chronological.

Contents

Conflict-driven clause learning was proposed by Marques-Silva and Karem A. Sakallah (1996, 1999) [1] [2] and Bayardo and Schrag (1997). [3]

Background

Boolean satisfiability problem

The satisfiability problem consists in finding a satisfying assignment for a given formula in conjunctive normal form (CNF).

An example of such a formula is:

( (not A) or (not C) )   and   (B or C),

or, using a common notation: [4]

where A,B,C are Boolean variables, , , , and are literals, and and are clauses.

A satisfying assignment for this formula is e.g.:

since it makes the first clause true (since is true) as well as the second one (since is true).

This examples uses three variables (A, B, C), and there are two possible assignments (True and False) for each of them. So one has possibilities. In this small example, one can use brute-force search to try all possible assignments and check if they satisfy the formula. But in realistic applications with millions of variables and clauses brute force search is impractical. The responsibility of a SAT solver is to find a satisfying assignment efficiently and quickly by applying different heuristics for complex CNF formulas.

Unit clause rule (unit propagation)

If a clause has all but one of its literals or variables evaluated at False, then the free literal must be True in order for the clause to be True. For example, if the below unsatisfied clause is evaluated with and we must have in order for the clause to be true.

The iterated application of the unit clause rule is referred to as unit propagation or Boolean constraint propagation (BCP).

Resolution

Consider two clauses and . The clause , obtained by merging the two clauses and removing both and , is called the resolvent of the two clauses.

Algorithm

Conflict-driven clause learning works as follows.

  1. Select a variable and assign True or False. This is called decision state. Remember the assignment.
  2. Apply Boolean constraint propagation (unit propagation).
  3. Build the implication graph.
  4. If there is any conflict
    1. Find the cut in the implication graph that led to the conflict
    2. Derive a new clause which is the negation of the assignments that led to the conflict
    3. Non-chronologically backtrack ("back jump") to the appropriate decision level, where the first-assigned variable involved in the conflict was assigned
  5. Otherwise continue from step 1 until all variable values are assigned.

Example

A visual example of CDCL algorithm: [4]

Completeness

DPLL is a sound and complete algorithm for SAT. CDCL SAT solvers implement DPLL, but can learn new clauses and backtrack non-chronologically. Clause learning with conflict analysis affects neither soundness nor completeness. Conflict analysis identifies new clauses using the resolution operation. Therefore, each learnt clause can be inferred from the original clauses and other learnt clauses by a sequence of resolution steps. If cN is the new learnt clause, then ϕ is satisfiable if and only if ϕ ∪ {cN} is also satisfiable. Moreover, the modified backtracking step also does not affect soundness or completeness, since backtracking information is obtained from each new learnt clause. [5]

Applications

The main application of CDCL algorithm is in different SAT solvers including:

The CDCL algorithm has made SAT solvers so powerful that they are being used effectively in several[ citation needed ] real world application areas like AI planning, bioinformatics, software test pattern generation, software package dependencies, hardware and software model checking, and cryptography.

Related algorithms to CDCL are the Davis–Putnam algorithm and DPLL algorithm. The DP algorithm uses resolution refutation and it has potential memory access problem.[ citation needed ] Whereas the DPLL algorithm is OK for randomly generated instances, it is bad for instances generated in practical applications. CDCL is a more powerful approach to solve such problems in that applying CDCL provides less state space search in comparison to DPLL.

Works cited

  1. J.P. Marques-Silva; Karem A. Sakallah (November 1996). "GRASP-A New Search Algorithm for Satisfiability". Digest of IEEE International Conference on Computer-Aided Design (ICCAD). pp. 220–227. CiteSeerX   10.1.1.49.2075 . doi:10.1109/ICCAD.1996.569607. ISBN   978-0-8186-7597-3.
  2. J.P. Marques-Silva; Karem A. Sakallah (May 1999). "GRASP: A Search Algorithm for Propositional Satisfiability" (PDF). IEEE Transactions on Computers. 48 (5): 506–521. doi:10.1109/12.769433. Archived from the original (PDF) on 2016-03-04. Retrieved 2014-11-29.
  3. Roberto J. Bayardo Jr.; Robert C. Schrag (1997). "Using CSP look-back techniques to solve real world SAT instances" (PDF). Proc. 14th Nat. Conf. on Artificial Intelligence (AAAI). pp. 203–208.
  4. 1 2 In the pictures below, "" is used to denote "or", multiplication to denote "and", and a postfix "" to denote "not".
  5. Marques-Silva, Joao; Lynce, Ines; Malik, Sharad (February 2009). Handbook of Satisfiability (PDF). IOS Press. p. 138. ISBN   978-1-60750-376-7.
  6. "Glucose's home page".

Related Research Articles

In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make (a AND NOT b) = TRUE. In contrast, "a AND NOT a" is unsatisfiable.

In boolean logic, a disjunctive normal form (DNF) is a canonical normal form of a logical formula consisting of a disjunction of conjunctions; it can also be described as an OR of ANDs, a sum of products, or — in philosophical logic — a cluster concept. As a normal form, it is useful in automated theorem proving.

In Boolean logic, a formula is in conjunctive normal form (CNF) or clausal normal form if it is a conjunction of one or more clauses, where a clause is a disjunction of literals; otherwise put, it is a product of sums or an AND of ORs.

In computer science, 2-satisfiability, 2-SAT or just 2SAT is a computational problem of assigning values to variables, each of which has two possible values, in order to satisfy a system of constraints on pairs of variables. It is a special case of the general Boolean satisfiability problem, which can involve constraints on more than two variables, and of constraint satisfaction problems, which can allow more than two choices for the value of each variable. But in contrast to those more general problems, which are NP-complete, 2-satisfiability can be solved in polynomial time.

In computational complexity theory, the Cook–Levin theorem, also known as Cook's theorem, states that the Boolean satisfiability problem is NP-complete. That is, it is in NP, and any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the Boolean satisfiability problem.

In formal logic, Horn-satisfiability, or HORNSAT, is the problem of deciding whether a given set of propositional Horn clauses is satisfiable or not. Horn-satisfiability and Horn clauses are named after Alfred Horn.

In mathematical logic and automated theorem proving, resolution is a rule of inference leading to a refutation-complete theorem-proving technique for sentences in propositional logic and first-order logic. For propositional logic, systematically applying the resolution rule acts as a decision procedure for formula unsatisfiability, solving the Boolean satisfiability problem. For first-order logic, resolution can be used as the basis for a semi-algorithm for the unsatisfiability problem of first-order logic, providing a more practical method than one following from Gödel's completeness theorem.

Unit propagation (UP) or boolean constraint propagation (BCP) or the one-literal rule (OLR) is a procedure of automated theorem proving that can simplify a set of clauses.

<span class="mw-page-title-main">DPLL algorithm</span> Type of search algorithm

In logic and computer science, the Davis–Putnam–Logemann–Loveland (DPLL) algorithm is a complete, backtracking-based search algorithm for deciding the satisfiability of propositional logic formulae in conjunctive normal form, i.e. for solving the CNF-SAT problem.

In computational complexity theory, the maximum satisfiability problem (MAX-SAT) is the problem of determining the maximum number of clauses, of a given Boolean formula in conjunctive normal form, that can be made true by an assignment of truth values to the variables of the formula. It is a generalization of the Boolean satisfiability problem, which asks whether there exists a truth assignment that makes all clauses true.

In mathematical logic, a tautology is a formula that is true regardless of the interpretation of its component terms, with only the logical constants having a fixed meaning. For example, a formula that states, "the ball is green or the ball is not green," is always true, regardless of what a ball is and regardless of its colour. Tautology is usually, though not always, used to refer to valid formulas of propositional logic.

In computer science and mathematical logic, satisfiability modulo theories (SMT) is the problem of determining whether a mathematical formula is satisfiable. It generalizes the Boolean satisfiability problem (SAT) to more complex formulas involving real numbers, integers, and/or various data structures such as lists, arrays, bit vectors, and strings. The name is derived from the fact that these expressions are interpreted within ("modulo") a certain formal theory in first-order logic with equality. SMT solvers are tools that aim to solve the SMT problem for a practical subset of inputs. SMT solvers such as Z3 and cvc5 have been used as a building block for a wide range of applications across computer science, including in automated theorem proving, program analysis, program verification, and software testing.

<span class="mw-page-title-main">Implication graph</span> Directed graph representing a Boolean expression

In mathematical logic and graph theory, an implication graph is a skew-symmetric, directed graph G = (V, E) composed of vertex set V and directed edge set E. Each vertex in V represents the truth status of a Boolean literal, and each directed edge from vertex u to vertex v represents the material implication "If the literal u is true then the literal v is also true". Implication graphs were originally used for analyzing complex Boolean expressions.

In computer science and formal methods, a SAT solver is a computer program which aims to solve the Boolean satisfiability problem. On input a formula over Boolean variables, such as "(x or y) and (x or not y)", a SAT solver outputs whether the formula is satisfiable, meaning that there are possible values of x and y which make the formula true, or unsatisfiable, meaning that there are no such values of x and y. In this case, the formula is satisfiable when x is true, so the solver should return "satisfiable". Since the introduction of algorithms for SAT in the 1960s, modern SAT solvers have grown into complex software artifacts involving a large number of heuristics and program optimizations to work efficiently.

In computational complexity theory, the language TQBF is a formal language consisting of the true quantified Boolean formulas. A (fully) quantified Boolean formula is a formula in quantified propositional logic where every variable is quantified, using either existential or universal quantifiers, at the beginning of the sentence. Such a formula is equivalent to either true or false. If such a formula evaluates to true, then that formula is in the language TQBF. It is also known as QSAT.

In computer science, the Sharp Satisfiability Problem is the problem of counting the number of interpretations that satisfy a given Boolean formula, introduced by Valiant in 1979. In other words, it asks in how many ways the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. For example, the formula is satisfiable by three distinct boolean value assignments of the variables, namely, for any of the assignments, , and, we have

In theoretical computer science, the circuit satisfiability problem is the decision problem of determining whether a given Boolean circuit has an assignment of its inputs that makes the output true. In other words, it asks whether the inputs to a given Boolean circuit can be consistently set to 1 or 0 such that the circuit outputs 1. If that is the case, the circuit is called satisfiable. Otherwise, the circuit is called unsatisfiable. In the figure to the right, the left circuit can be satisfied by setting both inputs to be 1, but the right circuit is unsatisfiable.

The Tseytin transformation, alternatively written Tseitin transformation, takes as input an arbitrary combinatorial logic circuit and produces an equisatisfiable boolean formula in conjunctive normal form (CNF). The length of the formula is linear in the size of the circuit. Input vectors that make the circuit output "true" are in 1-to-1 correspondence with assignments that satisfy the formula. This reduces the problem of circuit satisfiability on any circuit to the satisfiability problem on 3-CNF formulas. It was discovered by the Russian scientist Grigori Tseitin.

In computer science, there are certain classes of algorithms (heuristics) that solves types of the Boolean satisfiability problem despite there being no known efficient algorithm in the general case.

<span class="mw-page-title-main">Planar SAT</span> Boolean satisfiability problem restricted to a planar incidence graph

In computer science, the planar 3-satisfiability problem (abbreviated PLANAR 3SAT or PL3SAT) is an extension of the classical Boolean 3-satisfiability problem to a planar incidence graph. In other words, it asks whether the variables of a given Boolean formula—whose incidence graph consisting of variables and clauses can be embedded on a plane—can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make (a AND NOT b) = TRUE. In contrast, "a AND NOT a" is unsatisfiable.

References