Structure-Guided Automated Reasoning
Abstract
Algorithmic meta-theorems state that problems definable in a fixed logic can be solved efficiently on structures with certain properties. An example is Courcelle’s Theorem, which states that all problems expressible in monadic second-order logic can be solved efficiently on structures of small treewidth. Such theorems are usually proven by algorithms for the model-checking problem of the logic, which is often complex and rarely leads to highly efficient solutions. Alternatively, we can solve the model-checking problem by grounding the given logic to propositional logic, for which dedicated solvers are available. Such encodings will, however, usually not preserve the input’s treewidth.
This paper investigates whether all problems definable in monadic second-order logic can efficiently be encoded into sat such that the input’s treewidth bounds the treewidth of the resulting formula. We answer this in the affirmative and, hence, provide an alternative proof of Courcelle’s Theorem. Our technique can naturally be extended: There are treewidth-aware reductions from the optimization version of Courcelle’s Theorem to maxsat and from the counting version of the theorem to #sat. By using encodings to sat, we obtain, ignoring polynomial factors, the same running time for the model-checking problem as we would with dedicated algorithms. Another immediate consequence is a treewidth-preserving reduction from the model-checking problem of monadic second-order logic to integer linear programming (ilp). We complement our upper bounds with new lower bounds based on ; and we show that the block size of the input’s formula and the treewidth of the input’s structure are tightly linked.
Finally, we present various side results needed to prove the main theorems: A treewidth-preserving cardinality constraints, treewidth-preserving encodings from cnfs into dnfs, and a treewidth-aware quantifier elimination scheme for qbf implying a treewidth-preserving reduction from qsat to sat. We also present a reduction from projected model counting to #sat that increases the treewidth by at most a factor of , yielding a algorithm for projected model counting that beats the currently best running time of .
Keywords and phrases:
automated reasoning, treewidth, satisfiability, max-sat, sharp-sat, monadic second-order logic, fixed-parameter tractabilityCopyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Theory of computation Constraint and logic programming ; Theory of computation Finite Model Theory ; Theory of computation Fixed parameter tractabilityAcknowledgements:
Part of the research was carried out while Hecher was visiting the Simons Institute for the Theory of Computing at UC Berkeley. Author names are stated alphabetically.Funding:
This research was funded by the Austrian Science Fund (FWF), grants J 4656 and P 32830, the Society for Research Funding in Lower Austria (GFF, Gesellschaft für Forschungsförderung NÖ) grant ExzF-0004, as well as the Vienna Science and Technology Fund (WWTF) grant ICT19-065.Editors:
Olaf Beyersdorff, Michał Pilipczuk, Elaine Pimentel, and Nguyễn Kim ThắngSeries and Publisher:

1 Introduction
Many tools from the automated reasoning quiver can be implemented efficiently if a graphical representation of the given formula with good structural properties is given. The textbook example is the satisfiability problem (sat), which can be solved in time on formulas whose primal graph has treewidth . (The primal graph contains a vertex for every variable of the formula and connects them if they appear together in a clause. Its treewidth intuitively measures how close it is to being a tree.) The result extends to the maximum satisfiability problem (maxsat), in which the clauses of the formula have weights and the goal is to minimize the weights of falsified clauses, and to the model counting problem (#sat), in which the goal is to compute the number of satisfying assignments. In this article, we will use the notation to describe a tower of twos of height with at the top, and as shorthand to hide polynomial factors, e.g., :
Fact 1 (folklore, see for instance [1, 4, 5, 14, 24, 29, 30]).
One can solve sat, maxsat, and #sat in time if a width- tree decomposition is given.
It is worth to take some time to inspect the details of Fact 1. The hidden polynomial factor is not the subject of this paper (as indicated by the notation), but can be made as small as [10, 26]. Our focus will be the value on top of the tower, which in Fact 1 is simply “”. Under the exponential-time hypothesis (), this is best possible.
The natural extension of the satisfiability problem to higher logic is the validity problem of fully quantified Boolean formulas (qsat). While it is well-known that qsat is fixed-parameter tractable (i.e., it is in ) with respect to treewidth [11], the dependencies on the treewidth is less sharp than in Fact 1. The height of the tower depends on the quantifier alternation of the formula, while the top value has the form due to the management of nested tables in the involved dynamic program.
In contrast to Fact 1, there is a big-oh on top of the tower in Fact 2. The higher order version of the model counting problem is the projected model counting problem (pmc), in which we need to count the number of models that are not identical on a given set of variables.
Fact 3 ([15]).
One can solve pmc in time if a width- tree decomposition is given.
The fine art of automated reasoning is descriptive complexity, which studies the complexity of problems in terms of the complexity of a description of these problems; independent of any abstract machine model [21, 25]. A prominent example is Courcelle’s Theorem that states that the problems that can be expressed in monadic second-order logic can be solved efficiently on instances of bounded treewidth [12]. Differently phrased, the theorem states that the model-checking problem for mso logic (mc(mso)) is fixed-parameter tractable (the parameter is the size of the formula and the treewidth of the structure):
Fact 4 ([12]).
One can solve mc(mso) in time if a width- tree decomposition is given.
For instance, the 3-coloring problem (Can we color the vertices of a graph with three colors such that adjacent vertices obtain different colors?) can be described by the sentence:
The sentence can be read aloud as: There are three colors red, blue, and green () such that for all vertices and () we have that (i) each vertex has at least one color (), and (ii), if and are connected by an edge () then they do not have the same color (). The model-checking problem mc(mso) obtains as input a relational structure (say a graph like or ) and an mso sentence (as the one from above) and asks whether is a model of , denoted by . In our example we have and . Using Fact 4, we can conclude from that the 3-coloring problem parameterized by the treewidth lies in .
Instead of utilizing Fact 4, another reasonable approach is to ground the mso sentence to a propositional formula and to then apply Fact 1. Formally, this means to reduce the model checking problem mc(mso) to sat, i.e., given a relational structure and an mso sentence , we need to produce, in polynomial time, a propositional formula such that iff . The naïve way of doing so is by generating an indicator variable for every set variable and every element in the universe of . Then we replace every first-order -quantifier by a “big-or” and -quantifier by a “big-and”:
The emerging question now is whether an automated translation such as the one we just sketched preserves treewidth in the following sense: Given a relational structure of treewidth and an mso sentence , can we mechanically derive a propositional formula with iff and for some function computable ? Consider for instance the following graph shown on the left (it is “almost a tree” and has treewidth ) and the primal graph of obtained using the just sketched transformation on the right. In this example, the tree-like structure is preserved, as the treewidth gets increased by a factor of and is at most :
We recap this finding as the following observation: The automated grounding process from mc(mso) to sat implies a reduction from the 3-coloring problem parameterized by the input’s treewidth to sat. We can, thus, derive that the 3-coloring problem can be solved in time using Fact 1 – without actually utilizing Courcelle’s Theorem!
For a second example consider the optimization and counting version of the dominating set problem: Given a graph , the task is either to find a minimum-size set of vertices such that every vertex is in or adjacent to vertex in , or to count the number of such sets. Optimization and counting problems can be modeled in descriptive complexity by “moving” an existential second-order quantifier (“guessing” the solution) out of the sentence and making it a free variable. The task is either to find a set of minimum size such that the given structure together with this set is a model of the formula, or to count the number of such sets. For instance, the following formula describes that is a dominating set:
We will also say that the formula Fagin-defines the property that is a dominating set. The problem #fd(mso) asks, given a relational structure and an mso formula with a free-set variable , how many subsets of the universe of satisfy . The optimization problem fd(mso) gets as additional input an integer and asks whether there is such a with . The reduction from mc(mso) to sat can be extended to a reduction from fd(mso) to maxsat and from #fd(mso) to pmc. In order to ground fd(mso), we add new indicator variables for the free-variable and every element of (as we did for the second-order quantifiers). For fd(mso), we additionally add a soft clause for each of these variables – implying that we seek a model that minimizes . We may now again ask: If we mechanically ground on a structure of bounded treewidth to a propositional formula , what can we say about the treewidth of ? Unfortunately, not so much. Even if the input has treewidth , the primal graph of may become a clique (of treewidth ):
It follows that we cannot derive an fpt-algorithm for the dominating set problem or its counting version by reasoning about , while we can conclude the fact from using appropriate versions of Courcelle’s Theorem. To summarize, we can naturally describe model-checking, optimization, and counting problems using monadic second-order logic. Using Courcelle’s Theorem, we can solve all of these problems in fpt-time on structures of bounded treewidth. Alternatively, we may ground the mso formulas to propositional logic and solve the problems using Fact 1. The produced encodings sometimes preserve the input’s structure (as for 3-coloring) and, thus, themselves serve as proof that the problems lie in . However, the input’s structure can also get eradicated, as we observed for the dominating set problem. The present paper is concerned with the question whether there is a unifying grounding procedure that maps Fagin-defined mso properties to propositional logic while preserving the input’s treewidth.
Contribution I: Faster Structure-guided Reasoning.
Before we develop a unifying, structure-aware grounding process from the model-checking problem of monadic second order logic to propositional logic, we first improve both of the underlying results. In particular, we remove the logarithmic dependencies on in top of the tower of Fact 2 and, thus, provide the first major improvement on qbf upper bounds with respect to treewidth since 20 years:
Theorem 1 (QBF Theorem).
One can solve qsat in time if a width- tree decomposition is given.
This bound matches the eth lower bound for qsat:
Fact 5 ([16]).
Unless fails, qsat cannot be solved in time .
We will prove Theorem 1 fully in the spirit of an automated reasoning paper by an encoding into sat. In particular, we will not need any pre-requirements other than Fact 1. With a similar encoding scheme, we will also slightly improve on Fact 3:
Theorem 2 (PMC Theorem).
One can solve pmc in time if a width- tree decomposition is given.
Fact 6 ([15]).
Unless fails, pmc cannot be solved in time .
Contribution II: A SAT Version of Courcelle’s Theorem.
We answer the main question of the introduction in the affirmative and provide a unifying, structure-aware encoding scheme from properties Fagin-defined with monadic second-order logic to variants of sat:
Theorem 3 (A SAT Version of Courcelle’s Theorem).
Assuming that the mso formulas on the left side are in prenex normal form and that a width- tree decomposition is given, there are encodings from …
-
1.
mc(mso) to sat of size ;
-
2.
fd(mso) to maxsat of size ;
-
3.
#fd(mso) to #sat of size .
All encodings of size have a treewidth of and can be computed in linear time with respect to their size.
In conjunction with Fact 1, the theorem implies Courcelle’s Theorem with sharp bounds on the values on top of the tower:
Corollary 4.
One can solve mc(mso) in time , and fd(mso) and #fd(mso) in time if a width- tree decomposition is given.
Contribution III: ETH Lower Bounds for the Encoding Size.
Given that we can encode mso definable properties into sat while preserving the input’s treewidth, we may ask next whether we can improve on the size of the encodings. While it is well-known that incarnations of Courcelle’s Theorem have to depend on the input’s treewidth and the formula’s size in a non-elementary way [3] (and hence, the encodings have to be huge at some point as well), these insights do not give us precise bounds on achievable encoding sizes.
Theorem 5 (ETH Lower Bound).
Under , there is no sat encoding for mc(mso) of size that can be computed in this time.
We can make the lower bound a bit more precise in the following sense: The value at the top of the tower actually does not just depend on the treewidth , but on the product of the treewidth and the block size of the sentence . The block size of a formula is the maximum number of consecutive quantifiers of the same type.
Theorem 6 (Trade-off Theorem).
Under , there is no sat encoding for mc(mso) of size that can be computed within this time.
1.1 Related Work
The concept of treewidth was discovered multiple times. The name was coined in the work by Robertson and Seymour [28], while the concept was studied by Arnborg and Proskurowski [2] under the name partial -trees simultaneously. However, treewidth was discovered even earlier by Bertelè and Brioschi [6], and independently by Halin [19]. Courcelle’s Theorem was proven in a series of articles by Bruno Courcelle [12], see also the textbook by Courcelle and Engelfriet for a detailed introduction [13]. The expressive power of monadic second-order logic was studied before, prominently by Büchi who showed that mso over strings characterizes the regular languages [9]. Related to our treewidth-aware reduction from mc(mso) to sat is the work by Gottlob, Pichler, and Wei, who solve mc(mso) using monadic Datalog [18]; and the work of Bliem, Pichler, and Woltran, who solve it using asp [8].
1.2 Structure of this Article
We provide preliminaries in the next section, prove Theorem 1 and 2 in Section 3, and establish a sat version of Courcelle’s Theorem in Section 4. The technical details of the latter can be found in the technical report version of this article. We extend the result to Fagin-definable properties in Section 5 and provide corresponding lower bounds in Section 6. We conclude and provide pointers for further research in the last section, which also contains an overview table of this article’s results. Due to lack of space, most proofs are only avilable in the technical report and are replaced by a proof sketch within the main text. The corresponding positions are clearly marked with a “”.
2 Preliminaries: Background in Logic and Structural Graph Theory
We use the notation of Knuth [23] and consider propositional formulas in conjunctive normal form (cnfs) like as set of sets . We denote the sets of variables, literals, and clauses of as , , and . A (partial) assignment is a subset such that for all , that is, a set of literals that does not contain both polarities of any variable. We use to denote partial assignments. The formula conditioned under a partial assignment is denoted by and obtained by removing all clauses from that contain a literal and by removing all literals with from the remaining clauses. A assignment is satisfying for a cnf if , and it is contradicting if . A dnf is a disjunction of conjunctions, i.e., a set of terms. We use the same notations as for cnfs, however, in we delete terms that contain a literal that appears negated in and remove the literals in from the remaining terms. Hence, is satisfying if , and contradicting if .
The model counting problem asks to compute the number of satisfying assignments of a cnf and is denoted by #sat. In projected model counting (pmc) we count the number of models that are not identical on a given set of variables. In the maximum satisfiability problem (maxsat) we partition the clauses of into a set of hard clauses and a set of weighted soft clauses, i.e., every clause comes with a weight . The formula is then called a wcnf and the goal is to find under all assignments with the one that maximizes . In a fully quantified Boolean formula (a qbf, also called a second-order propositional sentence) all variables are bounded by existential or universal quantifiers. Throughout the paper we assume that qbfs are in prenex normal form, meaning that all quantifers appear in the front of a quantifier-free formula called the matrix. As is customary, we assume that the matrix is a cnf if the last (i.e., most inner) quantifier is existential, and a dnf otherwise. A qbf is valid if it evaluates to true (see Chapter 29–31 in [7]). Define qsat to be the problem of deciding whether a given qbf is valid.
2.1 Descriptive Complexity
A vocabulary is a finite set of relational symbols of arity . A (finite, relational) -structure is a tuple with universe and interpretations . The size of is . We denote the set of all -structures by – e.g., is the set of directed graphs.
Let be a vocabulary and be an infinite repertoire of first-order variables. The first-order language is inductively defined, where the atomic formulas are the strings and for relational symbols . If then so are , , and . A variable that appears next to is called quantified and free otherwise. We denote a formula with if are precisely the free variables in . A formula without free variables is called a sentence. As customary, we extend the language of first-order logic by the usual abbreviations, e.g., and . To increase readability, we will use other lowercase Latin letters for variables and drop unnecessary braces by using the usual operator precedence instead. Furthermore, we use the dot notation in which we place a “” instead of an opening brace and silently close it at the latest syntactically correct position. A -structure is a model of a sentence , denoted by , if it evaluates to true under the semantics of quantified propositional logic while interpreting equality and relational symbols as specified by the structure. For instance, over describes the set of undirected graphs, and we have and .
We obtain the language of second-order logic by allowing quantification over relational variables of arbitrary arity, which we will denote by uppercase Latin letters. A relational variable is said to be monadic if its arity is one. A monadic second-order formula is one in which all quantified relational variables are monadic. The set of all such formulas is denoted by mso. The model checking problem for a vocabulary is the set that contains all pairs of -structures and mso sentences with . Whenever is not relevant (meaning that a result holds for all fixed ), we will refer to the problem as . We note that in the literature there is often a distinction between - and -logic, which describes the way the input is encoded [20]. Since we allow arbitrary relations, we do not have to make this distinction.
2.2 Treewidth and Tree Decompositions
While we consider graphs as relational structures as discussed in the previous section, we also use common graph-theoretic terminology and denote with and the vertex and edge set of . Unless stated otherwise, graphs in this paper are undirected and we use the natural set notations and write, for instance, . The degree of a vertex is the number of its neighbors. A tree decomposition of is a pair in which is a tree (a connected graph without cycles) and a function with the following two properties:
-
1.
for every the set is non-empty and connected in ;
-
2.
for every there is at least one node with .
The width of a tree decomposition is the maximum size of its bags minus one, i.e., . The treewidth of a graph is the minimum width any tree decomposition of must have. We do not require additional properties of tree decompositions, but we assume that is rooted at a and, thus, that nodes may have a and . Without loss of generality, we may also assume .
Example 7.
The treewidth of the Big Dipper constellation (as graph shown on the left) is at most two, as proven by the tree decomposition on the right:
2.3 Treewidth of Propositional Formulas and Relational Structures
The definition of treewidth can be lifted to other objects by associating a graph to them. The most common graph for cnfs (or dnfs) is the primal graph , which is the graph on vertex set that connects two vertices by an edge if the corresponding variables appear together in a clause. We then define and refer to a tree decomposition of as one of . Note that other graphical representations lead to other definitions of the treewidth of propositional formulas. A comprehensive listing can be found in the Handbook of Satisfiability [7, Chapter 17]. A labeled tree decomposition extends a tree decomposition with a mapping (i.e., a mapping from the nodes of to a subset of the clauses (or terms) of ) such that for every clause (or term) there is exactly one with that contains all variables appearing in . It is easy to transform a tree decomposition into a labeled one by traversing the tree once’s and by duplicating some bags. Hence, we will assume throughout this article that all tree decompositions are labeled.
A similar approach can be used to define tree decompositions of arbitrary structures: The primal graph of a structure , in this context also called the Gaifman graph, has as vertex set the universe of , i.e., , and contains an edge iff and appear together in some tuple of . As before, we define . One can alternatively define the concept of tree decompositions directly over relational structures, which leads to the same definition [17].
3 New Upper Bounds for Second-Order Propositional Logic
Central to our reductions are treewidth-preserving encodings from qsat to sat and from pmc to #sat. These encoding establishes new proofs of Chen’s Theorem [11] and the theorem by Fichte et al. [15], and improve the dependencies on in the tower of Fact 2 and 3.
3.1 Treewidth-Aware Encodings from QSAT to SAT
We use a quantifier elimination scheme that eliminates the most-inner quantifier block at the cost of introducing new variables while increasing the treewidth by a factor of . Let first be the given qbf, in which is a dnf. Let further be the given labeled width- tree decomposition of . We describe an encoding into a qbf, in which the last quantifier block gets replaced by new variables in .
We have to encode the fact that for an assignment on all assignments to satisfy , i. e., at least one term in For that end, we introduce auxiliary variables for every term and any partial assignment of the variables in that also appear in the bag that contains . More precisely, let be the node in with and let be an assignment of the variables of the bag that are quantified by . We introduce the variable that indicates that this assignment satisfies :
// may satisfy | (1) | |||
// falsifies | (2) |
We have to track whether can be satisfied by a local assignment . For every and every we introduce a variable that indicates that can be extended to a satisfying assignment for the subtree rooted at . Furthermore, we create variables for that propagate the information about satisfiability along the tree decomposition. That is, is set to true if there is an assignment that can be extended to a satisfying assignment and that is compatible with :
// Either there is a term satisfing the bag or we can propagate: | |||
(3) | |||
// Propagate satisfiability: | |||
(4) |
Finally, since , we need to ensure that for all possible assignments of there is at least one term that gets satisfied. Since satisfiability gets propagated to the root of the tree decomposition by the aforementioned constraint, we can enforce this property with:
The following lemma observes the correctness of the construction, and the subsequent lemma handles the case .
Lemma 8 ().
There is an algorithm that, given a qbf and a width- tree decomposition of , outputs in time a qbf and a width- tree decomposition of such that is valid iff is valid.
Lemma 9 ().
There is an algorithm that, given a qbf and a width- tree decomposition of , outputs in time a qbf and a width- tree decomposition of such that is valid iff is valid.
Sketch of Proof.
The case (in which is a cnf) works similarly: The result follows by negating the inverse, where the roles of cnf and dnf are switched, and universal and existential quantification are switched as well.
Proof of Theorem 1.
The theorem follows by exhaustively applying Lemma 8 and Lemma 9 until a cnf is reached. The price for removing one alternation are new variables and an increase of the treewidth by a factor of . Hence, after removing one quantifier block we have a treewidth of , after two we have , after three we then have ; and so on. We can bound all the intermediate “” by adding a “” on top of the tower, leading to a bound on the treewidth of In fact, we can bound the top of the tower even tighter by observing and guessing as a fix point. Inserting yields and . Consequently, we can bound the treeewidth of the encoding by and the size by .
3.2 Treewidth-Aware Encodings from PMC to #SAT
Recall that the input for pmc is a cnf and a set . The task is to count the assignments that can be extended to models of . We can also think of a formula with free variables and existential quantified variables ( is quantifier-free), for which we want to count the assignments to that make the formula satisfiable. The idea is to rewrite and to use a similar encoding as in the proof of Lemma 9 to remove the second quantifier.
In detail, we add a variable for every clause and every assignment of the corresponding bag . The semantic of this variable is that the clause is satisfiable under the partial assignment . We further add the propagation variables and for all , , and . The former indicates that the assignment can be extended to a satisfying assignment of the subtree rooted at ; the later propagates partial solutions from children to parents within the tree decomposition:
// may satisfy : | |||
(1) |
// satisfies : | |||
(2) |
// Either there is a clause satisfying the bag or we can propagate: | |||
(3) |
// Propagate satisfiability: | |||
(4) |
Observe that the constraints (1)–(4) contain no variable from (we removed them by locally speaking about ) and, furthermore, constraints (1), (3), and (4) are pure propagations, which leave no degree of freedom on the auxiliary variables. Hence, models of these constraint only have freedom in the variables in within constraint (2). We are left with the task to count only models that actually satisfy the input formula, which we achieve with:
Lemma 10 ().
There is an algorithm that, given a cnf , a set , and a width- tree decomposition of , outputs in time a cnf and a width- tree decomposition of such that the projected model count of on equals .
Proof of Theorem 2.
4 A SAT Version of Courcelle’s Theorem
We demonstrate the power of treewidth-aware encodings by providing an alternative proof of Courcelle’s theorem. We prove the main part of Theorem 3 in the following form:
Lemma 11.
There is an algorithm that, given a relational structure , a width- tree decomposition of , and an mso sentence in prenex normal form, produces in time a propositional formula and tree decomposition of of width such that .
The lemma assumes that the sentence is in prenex normal form with a quantifier-free part in cnf, i.e., with and () being second-order (first-order) variables. The requirement that the second-order quantifers appear before the first-order ones is for sake of presentation, the encoding works as is if the quantifiers are mixed. The main part of the proof is a treewidth-aware encoding from mc(mso) into qsat; which is then translated to sat using Theorem 1.
4.1 Auxiliary Encodings
Let be a propositional formula and be an arbitrary set of literals. A cardinality constraint with ensures that { at most, exactly, at least } literals of get assigned to true. Classic encodings of cardinality constraints increase the treewidth of by quite a lot. For instance, the naive encoding for completes into a clique. We encode a cardinality constraint without increasing the treewidth by distributing a sequential unary counter:
Lemma 12 ().
For every we can, given a cnf , a set , and a width- tree decomposition of , encode such that .
Sketch of Proof.
We add variables to every bag of the tree decomposition, which count the number of literals set to true in the subtree rooted at . The semantics of the sequential counter encoding [31] is then implemented along the edges of the decomposition. To cover the new constraints, we can add the auxiliary variables of the (at most two) children of to the bag of as well, resulting in an overall increase of the treewidth by .
The second auxiliary encoding is a treewidth-preserving conversion from cnfs to dnfs.
Lemma 13.
There is a polynomial-time algorithm that, given a cnf and a width- tree decomposition of , produce a dnf and a width- tree decomposition of such that for any , iff is a tautology ( is unsatisfiable).
Sketch of Proof.
For every clause we add a variable that is true iff is satisfied. Satisfiability is encoded along the tree by variables indicating that is satisfied in the subtree rooted at via
4.2 Indicator Variables for the Quantifiers
To prove Lemma 11 we construct a qbf for a given mso sentence , structure , and tree decomposition of . We first define the primary variables of , i.e., the prefix of (primary here refers to the fact that we will also need some auxiliary variables later). For every second-order quantifier or we introduce, as we did in the introduction, an indicator variable for every element with the semantic that is true iff . These variables are either existentially or universally quantified, depending on the second-order quantifier. If there are multiple quantifiers (say ), the order in which the variables are quantified is the same as the order of the second-order quantifiers. For first-order quantifiers or we do the same construction, i.e., we add variables for all with the semantics that is true iff was assigned to . Of course, of these variables we have to set exactly one to true, which we enforce by adding using Lemma 12.
4.3 Evaluation of Atoms
The last ingredient of our qbf encoding is the evaluation of the atoms in the mso sentence . An atom is for a relational symbol from the vocabulary of arity , containment in a second-order variable , equality , and the negation of the aforementioned. For every atom that appears in we introduce variables and for all that indicate that is true in bag or somewhere in the subtree rooted at , respectively. Note that the same atom can occur multiple times in , for instance in
there are two atoms . However, since is in prenex normal form (and, thus, variables cannot be rebound), these always evaluate in exactly the same way. Hence, it is sufficient to consider the set of atoms, which we denote by . We can propagate information about the atoms along the tree decomposition with:
This encoding introduces two variables per atom per bag (namely and ), which increases the treewidth by at most . To synchronize with the two children and , we add and to , yielding a total treewidth of at most .
An easy atom to evaluate is , since if and are equal (i.e., they both got assigned to the same element ), we can conclude this fact within a bag that contains :
For every and every quantifier (or ), we add the propositional variable to all bags containing . We increase the treewidth by at most the quantifier rank and, in return, cover constraints as the above trivially. Similarly, if there is a second-order variable and a first-order variable , the atom can be evaluated locally in every bag:
We have to evaluate atoms corresponding to relational symbols of the vocabulary. For each such symbol of arity we encode:
Here “” is an atom in which is a relational symbol and are quantified first-order variables. In the inner “big-or” we consider all in , i.e., elements that are in the relation . Then “” is a variable that describes that gets assigned to . Note that all tuples in appear together in at least one bag of the tree decomposition and, hence, there is at least one bag for which can be evaluated to true. The propagation ensures that, for every , the variable will be true iff is true. Since the quantifier-free part of is a cnf , we can encode it by replacing every occurrence of in with .
4.4 The Full Encoding in one Figure
For the readers convenience, we compiled the encoding into Figure 1. Combining the insights of the last sections proves Lemma 11, but if the inner-most quantifier is universal, existentially projecting the encoding variables would produce a qbf with one more block. This can, however, be circumvent using Lemma 13. We formally prove that “combining the insights” indeed leads to a sound proof of Lemma 11 in the technical report.
5 Fagin Definability via Automated Reasoning
In this section we prove the remaining two items of Theorem 3, i.e., a treewidth-aware encoding of the optimization version of Courcelle’s Theorem to maxsat; and a #sat encoding of the counting version of the theorem. The general approach is as follows: We obtain a mso formula with a free set variable as input (rather than a mso sentence as in Lemma 11). The objective of the model-checking problems adds requirements to this variable (for fd(mso) we seek a of minimum size such that ; for we want to count the number of sets with ). The “trick” is to rewrite as and apply Lemma 11 to in order to obtain a propositional formula . Observe that the quantifier alternation of may be one larger than the one of .
Lemma 14 ().
There is an algorithm that, given a structure with weights for , a width- tree decomposition of , and an mso formula in prenex normal form, produces in time a wcnf and a tree decomposition of width of such that the maximum weight of any model of equals the maximum value of under with .
Sketch of Proof.
Consider such that the clauses in are hard and the added clauses are soft. A model maximizing the soft clauses will minimize the number of variables set to true, i.e., corresponds to a minimum-size set with .
Sketch of Proof.
6 Lower Bounds for the Encoding Size of Model Checking Problems
We companion our sat encodings for mc(mso) with lower bounds on the achievable encoding size under . The first lower bound (Theorem 5) is obtained by an encoding from qsat into mc(mso) that implies that sat encodings of mc(mso) lead to faster qsat algorithms.
Lemma 16 ().
There is a polynomial-time algorithm that, given a qsat sentence , outputs a structure and an mso sentence with and such that iff evaluates to true.
Sketch of Proof.
The structure is the incidence graph of (the graph containing a node for every variable and every clause that connects variables to the clauses containing them) with some additional labels. The sentence uses second-order quantifier to guess the assignment of , and one additional -block to evaluate it.
Proof of Theorem 5.
6.1 An Encoding for Compressing Treewidth
For qsat one can “move” complexity from the quantifier rank of the formula to its treewidth and vice versa [16]. By Lemma 16, this means that any reduction from qsat to mc(mso) may produce an instance with small treewidth or quantifier alternation while increasing the other. We show that one can also decrease the treewidth by increasing the block size.
Lemma 17 ().
For every there is a polynomial-time algorithm that, on input of a cnf and a width- tree decomposition of , outputs a constant-size mso sentence with and , and a structure with such that .
Sketch of Proof.
The idea of the proof is to (i) encode the input’s formula as an incidence graph over which we reason with an mso sentence; we then (ii) replace this structure by its tree decomposition with additional “sync edges”; and finally we (iii) contract vertices in the tree decomposition to lower the treewidth, while we encode statements like (for a set variable ) by defining new set variables and by interpreting as “the th vertex contracted to is in ”.
Proof of Theorem 6.
We obtain the Trade-off Theorem by combining the proof strategy of Lemma 17 with the reduction from qsat to mc(mso) of Lemma 16. The result is a polynomial-time algorithm for every that, on input of a qbf and a width- tree decomposition of , outputs a constant-size mso sentence with and , and a structure with such that is valid iff .
It is out of the scope of this article, but worth mentioning, that the proofs of Lemma 17 and Theorem 6 can be generalized to the following finite-model theoretic result:
Proposition 18.
For every there is a polynomial-time algorithm that, given a relational structure , a width- tree decomposition of , and an mso sentence , outputs a structure and a sentence such that:
-
1.
;
-
2.
;
-
3.
.
7 Conclusion and Further Research
We studied structure-guided automated reasoning, where we utilize the input’s structure in propositional encodings. The scientific question we asked was whether we can encode every mso definable problem on structures of bounded treewidth into sat formulas of bounded treewidth. We proved this in the affirmative, implying an alternative proof of Courcelle’s Theorem. The most valuable aspects are, in our opinion, the simplicity of the proof (it is “just” an encoding into propositional logic) and the potential advantages in practice for formulas of small quantifier alternation (sat solvers are known to perform well on instances of small treewidth, even if they do not actively apply techniques such as dynamic programming). Another advantage is the surprisingly simple generalization to the optimization and counting version of Courcelle’s Theorem – we can directly “plug in” maxsat or #sat and obtain the corresponding results. As a byproduct, we also obtain new proofs showing (purely as encodings into propositional logic) that qsat parameterized by the input’s treewidth plus quantifier alternation is fixed-parameter tractable (improving a complex dynamic program with nested tables) and that pmc parameterized by treewidth is fixed-parameter tractable (improving a multi-pass dynamic program). Table 1 provides an overview of the encodings presented within this article.
Encoding… | ||||
---|---|---|---|---|
From | To | Treewidth | Size | Reference |
qsat | sat | Theorem 1 | ||
pmc | #sat | Theorem 2 | ||
sat | Lemma 12 | |||
cnf | dnf | Lemma 13 | ||
mc(mso) | sat | Lemma 11 | ||
fd(mso) | maxsat | Lemma 14 | ||
#fd(mso) | #sat | Lemma 15 | ||
sat | mc(mso) | Lemma 17 | ||
qsat | mc(mso) | Theorem 6 |
Our encodings are exponentially smaller than the best known running time for mc(mso), i.e., when we solve the instances using Fact 1, we obtain the same runtime. We complemented this finding with new -based lower bounds. Further research will be concerned with closing the remaining gap in the height of the tower between the lower and upper bounds. We show in an upcoming paper that the terms “” in Theorem 3 and “” in Theorem 5 can be replaced by “” on guarded formulas, i.e., formulas in which there are only two first-order quantifers that are only allowed to quantify edges. Here, refers to the quantifier alternation of the second-order quantifers only. Hence, on such guarded formulas (e.g., on all examples in the introduction), the bounds are tight. Another task that remains for further research is to evaluate the encodings in practice. This would also be interesting for the auxiliary encodings, e.g., can a treewidth-aware cardinality constraint compete with classical cardinality constraint?
References
- [1] Michael Alekhnovich and Alexander A. Razborov. Satisfiability, Branch-Width and Tseitin Tautologies. Comput. Complex., 20(4):649–678, 2011. doi:10.1007/S00037-011-0033-1.
- [2] Stefan Arnborg and Andrzej Proskurowski. Problems on Graphs with Bounded Decomposability. Bull. EATCS, 25:7–10, 1985.
- [3] Albert Atserias and Sergi Oliva. Bounded-width QBF is PSPACE-complete. Journal of Computer and System Sciences, 80(7):1415–1429, 2014. doi:10.1016/j.jcss.2014.04.014.
- [4] Fahiem Bacchus, Shannon Dalmao, and Toniann Pitassi. Algorithms and Complexity Results for #SAT and Bayesian Inference. In 44th Symposium on Foundations of Computer Science (FOCS 2003), 11-14 October 2003, Cambridge, MA, USA, Proceedings, pages 340–351, 2003. doi:10.1109/SFCS.2003.1238208.
- [5] Max Bannach, Malte Skambath, and Till Tantau. On the Parallel Parameterized Complexity of MaxSAT Variants. In 25th International Conference on Theory and Applications of Satisfiability Testing, SAT 2022, August 2-5, 2022, Haifa, Israel, pages 19:1–19:19, 2022. doi:10.4230/LIPIcs.SAT.2022.19.
- [6] Umberto Bertelè and Francesco Brioschi. On Non-serial Dynamic Programming. J. Comb. Theory, Ser. A, 14(2):137–148, 1973. doi:10.1016/0097-3165(73)90016-2.
- [7] Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh, editors. Handbook of Satisfiability, Second Edition. IOS Press, 2021. doi:10.3233/FAIA336.
- [8] Bernhard Bliem, Reinhard Pichler, and Stefan Woltran. Declarative Dynamic Programming as an Alternative Realization of Courcelle’s Theorem. In Parameterized and Exact Computation – 8th International Symposium, IPEC 2013, Sophia Antipolis, France, September 4-6, 2013, Revised Selected Papers, pages 28–40, 2013. doi:10.1007/978-3-319-03898-8_4.
- [9] J. Richard Büchi. Weak Second-Order Arithmetic and Finite Automata. Zeitschrift für mathematische Logik und Grundlagen der Mathematik, 6(1-6):66–92, 1960.
- [10] Florent Capelli and Stefan Mengel. Tractable QBF by Knowledge Compilation. In 36th International Symposium on Theoretical Aspects of Computer Science, STACS 2019, March 13-16, 2019, Berlin, Germany, pages 18:1–18:16, 2019. doi:10.4230/LIPICS.STACS.2019.18.
- [11] Hubie Chen. Quantified Constraint Satisfaction and Bounded Treewidth. In Proceedings of the 16th Eureopean Conference on Artificial Intelligence, ECAI’2004, including Prestigious Applicants of Intelligent Systems, PAIS 2004, Valencia, Spain, August 22-27, 2004, pages 161–165, 2004.
- [12] Bruno Courcelle. The Monadic Second-Order Logic of Graphs. I. Recognizable Sets of Finite Graphs. Inf. Comput., 85(1):12–75, 1990. doi:10.1016/0890-5401(90)90043-H.
- [13] Bruno Courcelle and Joost Engelfriet. Graph Structure and Monadic Second-Order Logic – A Language-Theoretic Approach, volume 138 of Encyclopedia of mathematics and its applications. Cambridge University Press, 2012.
- [14] Adnan Darwiche. Decomposable Negation Normal Form. J. ACM, 48(4):608–647, 2001. doi:10.1145/502090.502091.
- [15] Johannes Klaus Fichte, Markus Hecher, Michael Morak, Patrick Thier, and Stefan Woltran. Solving Projected Model Counting by Utilizing Treewidth and its Limits. Artif. Intell., 314:103810, 2023. doi:10.1016/j.artint.2022.103810.
- [16] Johannes Klaus Fichte, Markus Hecher, and Andreas Pfandler. Lower Bounds for QBFs of Bounded Treewidth. In LICS ’20: 35th Annual ACM/IEEE Symposium on Logic in Computer Science, Saarbrücken, Germany, July 8-11, 2020, pages 410–424, 2020. doi:10.1145/3373718.3394756.
- [17] Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2006. doi:10.1007/3-540-29953-X.
- [18] Georg Gottlob, Reinhard Pichler, and Fang Wei. Monadic Datalog Over Finite Structures of Bounded Treewidth. ACM Trans. Comput. Log., 12(1):3:1–3:48, 2010. doi:10.1145/1838552.1838555.
- [19] Rudolf Halin. S-functions For Graphs. Journal of geometry, 8(1):171–186, 1976.
- [20] Petr Hlinený, Sang-il Oum, Detlef Seese, and Georg Gottlob. Width Parameters Beyond Tree-width and their Applications. Comput. J., 51(3):326–362, 2008. doi:10.1093/comjnl/bxm052.
- [21] Neil Immerman. Descriptive Complexity. Graduate texts in computer science. Springer, 1999. doi:10.1007/978-1-4612-0539-5.
- [22] Bart M. P. Jansen and Stefan Kratsch. A Structural Approach to Kernels for ILPs: Treewidth and Total Unimodularity. In Algorithms – ESA 2015 – 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings, pages 779–791, 2015. doi:10.1007/978-3-662-48350-3_65.
- [23] Donald E. Knuth. The Art of Computer Programming, volume 4, Fascicle 6. Addison-Wesley, 2016.
- [24] Tuukka Korhonen and Matti Järvisalo. Integrating Tree Decompositions into Decision Heuristics of Propositional Model Counters (Short Paper). In 27th International Conference on Principles and Practice of Constraint Programming, CP 2021, Montpellier, France (Virtual Conference), October 25-29, 2021, pages 8:1–8:11, 2021. doi:10.4230/LIPICS.CP.2021.8.
- [25] Stephan Kreutzer. Algorithmic meta-theorems. In Finite and Algorithmic Model Theory, pages 177–270. Cambridge University Press, 2011.
- [26] Michael Lampis, Stefan Mengel, and Valia Mitsou. QBF as an Alternative to Courcelle’s Theorem. In Theory and Applications of Satisfiability Testing – SAT 2018 – 21st International Conference, SAT 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 9-12, 2018, Proceedings, pages 235–252, 2018. doi:10.1007/978-3-319-94144-8_15.
- [27] Ruiming Li, Dian Zhou, and Donglei Du. Satisfiability and Integer Pprogramming as Complementary Tools. In Proceedings of the 2004 Conference on Asia South Pacific Design Automation: Electronic Design and Solution Fair 2004, Yokohama, Japan, January 27-30, 2004, pages 879–882, 2004. doi:10.1109/ASPDAC.2004.178.
- [28] Neil Robertson and Paul D. Seymour. Graph Minors. II. Algorithmic Aspects of Tree-width. Journal of Algorithms, 7(3):309–322, 1986. doi:10.1016/0196-6774(86)90023-4.
- [29] Sigve Hortemo Sæther, Jan Arne Telle, and Martin Vatshelle. Solving #SAT and MAXSAT by Dynamic Programming. J. Artif. Intell. Res., 54:59–82, 2015. doi:10.1613/jair.4831.
- [30] Marko Samer and Stefan Szeider. Algorithms for Propositional Model Counting. J. Discrete Algorithms, 8(1):50–64, 2010. doi:10.1016/j.jda.2009.06.002.
- [31] Carsten Sinz. Towards an Optimal CNF Encoding of Boolean Cardinality Constraints. In Principles and Practice of Constraint Programming – CP 2005, 11th International Conference, CP 2005, Sitges, Spain, October 1-5, 2005, Proceedings, pages 827–831, 2005. doi:10.1007/11564751_73.