# Unit propagation

Last updated

Unit propagation (UP) or Boolean Constraint propagation (BCP) or the one-literal rule (OLR) is a procedure of automated theorem proving that can simplify a set of (usually propositional) clauses.

## Definition

The procedure is based on unit clauses, i.e. clauses that are composed of a single literal. Because each clause needs to be satisfied, we know that this literal must be true. If a set of clauses contains the unit clause $l$ , the other clauses are simplified by the application of the two following rules:

1. every clause (other than the unit clause itself) containing $l$ is removed (the clause is satisfied if $l$ is);
2. in every clause that contains $\neg l$ this literal is deleted ($\neg l$ can not contribute to it being satisfied).

The application of these two rules lead to a new set of clauses that is equivalent to the old one.

For example, the following set of clauses can be simplified by unit propagation because it contains the unit clause $a$ .

$\{a\vee b,\neg a\vee c,\neg c\vee d,a\}$ Since $a\vee b$ contains the literal $a$ , this clause can be removed altogether. Since $\neg a\vee c$ contains the negation of the literal in the unit clause, this literal can be removed from the clause. The unit clause $a$ is not removed; this would make the resulting set not equivalent to the original one; this clause can be removed if already stored in some other form (see section "Using a partial model"). The effect of unit propagation can be summarized as follows.

 $\{$ $a\vee b,$ $\neg a\vee c,$ $\neg c\vee d,$ $a$ $\}$ (removed) ($\neg a$ deleted) (unchanged) (unchanged) $\{$ $c,$ $\neg c\vee d,$ $a$ $\}$ The resulting set of clauses $\{c,\neg c\vee d,a\}$ is equivalent to the above one. The new unit clause $c$ that results from unit propagation can be used for a further application of unit propagation, which would transform $\neg c\vee d$ into $d$ .

## Unit propagation and resolution

The second rule of unit propagation can be seen as a restricted form of resolution, in which one of the two resolvents must always be a unit clause. As for resolution, unit propagation is a correct inference rule, in that it never produces a new clause that was not entailed by the old ones. The differences between unit propagation and resolution are:

1. resolution is a complete refutation procedure while unit propagation is not; in other words, even if a set of clauses is contradictory, unit propagation may not generate an inconsistency;
2. the two clauses that are resolved cannot in general be removed after the generated clause is added to the set; on the contrary, the non-unit clause involved in a unit propagation can be removed when its simplification is added to the set;
3. resolution does not in general include the first rule used in unit propagation.

Resolution calculi that include subsumption can model rule one by subsumption and rule two by a unit resolution step, followed by subsumption.

Unit propagation, applied repeatedly as new unit clauses are generated, is a complete satisfiability algorithm for sets of propositional Horn clauses; it also generates a minimal model for the set if satisfiable: see Horn-satisfiability.

## Using a partial model

The unit clauses that are present in a set of clauses or can be derived from it can be stored in form of a partial model (this partial model may also contain other literals, depending on the application). In this case, unit propagation is performed based on the literals of the partial model, and unit clauses are removed if their literal is in the model. In the example above, the unit clause $a$ would be added to the partial model; the simplification of the set of clauses would then proceed as above with the difference that the unit clause $a$ is now removed from the set. The resulting set of clauses is equivalent to the original one under the assumption of validity of the literals in the partial model.

## Complexity

The direct implementation of unit propagation takes time quadratic in the total size of the set to check, which is defined to be the sum of the size of all clauses, where the size of each clause is the number of literals it contains.

Unit propagation can however be done in linear time by storing, for each variable, the list of clauses in which each literal is contained. For example, the set above can be represented by numbering each clause as follows:

$\{1:a\vee b,2:\neg a\vee c,3:\neg c\vee d,4:a\}$ and then storing, for each variable, the list of clauses containing the variable or its negation:

$a:1\ 2\ 4$ $b:1$ $c:2\ 3$ $d:3$ This simple data structure can be built in time linear in the size of the set, and allows finding all clauses containing a variable very easily. Unit propagation of a literal can be performed efficiently by scanning only the list of clauses containing the variable of the literal. More precisely, the total running time for doing unit propagation for all unit clauses is linear in the size of the set of clauses.

## Related Research Articles

In logic and computer science, the Boolean satisfiability problem is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable. For example, the formula "a AND NOT b" is satisfiable because one can find the values a = TRUE and b = FALSE, which make = TRUE. In contrast, "a AND NOT a" is unsatisfiable.

Inductive logic programming (ILP) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesised logic program which entails all the positive and none of the negative examples.

In Boolean logic, a formula is in conjunctive normal form (CNF) or clausal normal form if it is a conjunction of one or more clauses, where a clause is a disjunction of literals; otherwise put, it is a product of sums or an AND of ORs. As a canonical normal form, it is useful in automated theorem proving and circuit theory.

In mathematical logic and logic programming, a Horn clause is a logical formula of a particular rule-like form which gives it useful properties for use in logic programming, formal specification, and model theory. Horn clauses are named for the logician Alfred Horn, who first pointed out their significance in 1951.

In computer science, 2-satisfiability, 2-SAT or just 2SAT is a computational problem of assigning values to variables, each of which has two possible values, in order to satisfy a system of constraints on pairs of variables. It is a special case of the general Boolean satisfiability problem, which can involve constraints on more than two variables, and of constraint satisfaction problems, which can allow more than two choices for the value of each variable. But in contrast to those more general problems, which are NP-complete, 2-satisfiability can be solved in polynomial time. In proof theory, the semantic tableau is a decision procedure for sentential and related logics, and a proof procedure for formulae of first-order logic. An analytic tableau is a tree structure computed for a logical formula, having at each node a subformula of the original formula to be proved or refuted. Computation constructs this tree and uses it to prove or refute the whole formula. The tableau method can also determine the satisfiability of finite sets of formulas of various logics. It is the most popular proof procedure for modal logics.

In formal logic, Horn-satisfiability, or HORNSAT, is the problem of deciding whether a given set of propositional Horn clauses is satisfiable or not. Horn-satisfiability and Horn clauses are named after Alfred Horn.

In mathematical logic and automated theorem proving, resolution is a rule of inference leading to a refutation complete theorem-proving technique for sentences in propositional logic and first-order logic. For propositional logic, systematically applying the resolution rule acts as a decision procedure for formula unsatisfiability, solving the Boolean satisfiability problem. For first-order logic, resolution can be used as the basis for a semi-algorithm for the unsatisfiability problem of first-order logic, providing a more practical method than one following from Gödel's completeness theorem.

The Davis–Putnam algorithm was developed by Martin Davis and Hilary Putnam for checking the validity of a first-order logic formula using a resolution-based decision procedure for propositional logic. Since the set of valid first-order formulas is recursively enumerable but not recursive, there exists no general algorithm to solve this problem. Therefore, the Davis–Putnam algorithm only terminates on valid formulas. Today, the term "Davis–Putnam algorithm" is often used synonymously with the resolution-based propositional decision procedure that is actually only one of the steps of the original algorithm. In logic and computer science, the Davis–Putnam–Logemann–Loveland (DPLL) algorithm is a complete, backtracking-based search algorithm for deciding the satisfiability of propositional logic formulae in conjunctive normal form, i.e. for solving the CNF-SAT problem.

Constraint logic programming is a form of constraint programming, in which logic programming is extended to include concepts from constraint satisfaction. A constraint logic program is a logic program that contains constraints in the body of clauses. An example of a clause including a constraint is `A(X,Y):-X+Y>0,B(X),C(Y)`. In this clause, `X+Y>0` is a constraint; `A(X,Y)`, `B(X)`, and `C(Y)` are literals as in regular logic programming. This clause states one condition under which the statement `A(X,Y)` holds: `X+Y` is greater than zero and both `B(X)` and `C(Y)` are true.

Interval scheduling is a class of problems in computer science, particularly in the area of algorithm design. The problems consider a set of tasks. Each task is represented by an interval describing the time in which it needs to be executed. For instance, task A might run from 2:00 to 5:00, task B might run from 4:00 to 10:00 and task C might run from 9:00 to 11:00. A subset of intervals is compatible if no two intervals overlap. For example, the subset {A,C} is compatible, as is the subset {B}; but neither {A,B} nor {B,C} are compatible subsets, because the corresponding intervals within each subset overlap. In mathematical logic, an implication graph is a skew-symmetric directed graph G = composed of vertex set V and directed edge set E. Each vertex in V represents the truth status of a Boolean literal, and each directed edge from vertex u to vertex v represents the material implication "If the literal u is true then the literal v is also true". Implication graphs were originally used for analyzing complex Boolean expressions.

SLD resolution is the basic inference rule used in logic programming. It is a refinement of resolution, which is both sound and refutation complete for Horn clauses.

In computational complexity theory, SNP is a complexity class containing a limited subset of NP based on its logical characterization in terms of graph-theoretical properties. It forms the basis for the definition of the class MaxSNP of optimization problems.

MAXEkSAT is a problem in computational complexity theory that is a maximization version of the Boolean satisfiability problem 3SAT. In MAXEkSAT, each clause has exactly k literals, each with distinct variables, and is in conjunctive normal form. These are called k-CNF formulas. The problem is to determine the maximum number of clauses that can be satisfied by a truth assignment to the variables in the clauses.

The Tseytin transformation, alternatively written Tseitin transformation, takes as input an arbitrary combinatorial logic circuit and produces a boolean formula in conjunctive normal form (CNF), which can be solved by a CNF-SAT solver. The length of the formula is linear in the size of the circuit. Input vectors that make the circuit output "true" are in 1-to-1 correspondence with assignments that satisfy the formula. This reduces the problem of circuit satisfiability on any circuit to the satisfiability problem on 3-CNF formulas.

In proof theory, an area of mathematical logic, proof compression is the problem of algorithmically compressing formal proofs. The developed algorithms can be used to improve the proofs generated by automated theorem proving tools such as SAT solvers, SMT-solvers, first-order theorem provers and proof assistants.

In computer science, conflict-driven clause learning (CDCL) is an algorithm for solving the Boolean satisfiability problem (SAT). Given a Boolean formula, the SAT problem asks for an assignment of variables so that the entire formula evaluates to true. The internal workings of CDCL SAT solvers were inspired by DPLL solvers.

Given a Boolean expression with variables, finding an assignment of the variables such that is true is called the Boolean satisfiability problem, frequently abbreviated SAT, and is seen as the canonical NP-complete problem.

• Dowling, William F.; Gallier, Jean H. (1984), "Linear-time algorithms for testing the satisfiability of propositional Horn formulae", Journal of Logic Programming, 1 (3): 267–284, doi:, MR   0770156 .
• H. Zhang and M. Stickel (1996). An efficient algorithm for unit-propagation. In Proceedings of the Fourth International Symposium on Artificial Intelligence and Mathematics.