Symbolic Conflict Analysis in Pseudo-Boolean Optimization
Abstract
In the the last two decades, a lot of effort has been devoted to the development of satisfiability-checking tools for a variety of SAT-related problems. However, most of these tools lack optimization capabilities. That is, instead of finding any solution, one is sometimes interested in a solution that is best according to some criterion.
Pseudo-Boolean solvers can be used to deal with optimization by successively solving a series of problems that contain an additional pseudo-Boolean constraint expressing that a better solution is required. A key point for the success of this simple approach is that lemmas that are learned for one problem can be reused for subsequent ones.
In this paper we go one step further and show how, by using a simple symbolic conflict analysis procedure, not only can lemmas be reused between problems but also strengthened, thus further pruning the search space traversal. In addition, we show how this technique automatically allows one to infer upper bounds in maximization problems, thus giving an estimation of how far the solver is from finding an optimal solution. Experimental results with our PB solver reveal that (i) this technique is indeed effective in practice, providing important speedups in problems where several solutions are found and (ii) on problems with very few solutions, where the impact of our technique is limited, its overhead is negligible.
Keywords and phrases:
SAT, Pseudo-Boolean Optimization, Conflict AnalysisCopyright and License:
2012 ACM Subject Classification:
Theory of computationSupplementary Material:
Software (Source Code): https://github.com/dearzhaorui/symbolic-conflict-analysis
archived at
swh:1:dir:f5894275ca84f010f2626a78762667102cefb9f4
Funding:
All authors are supported by grant PID2021-122830OB-C43, funded by MCIN/AEI/ 10.13039/501100011033 and by “ERDF: A way of making Europe”. Second and third authors are supported by Barcelogic through research grants C-11423 and C-11422, respectively.Event:
28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)Editors:
Jeremias Berg and Jakob NordströmSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
SAT solvers are nowadays routinely used in an increasing number of applications areas. Despite success stories initially emerged in the area of verification [4], they are now common in security [10, 35], cryptography [34] and even mathematics [18]. In parallel to these practical developments, theoretical works have shown that CDCL [23], the most successful procedure for SAT, cannot solve certain type of problems (e.g. the pigeon-hole principle [17]) in polynomial time [2, 28].
Pseudo-Boolean (PB) solving, also known as 0-1 Integer Linear Programming, has established itself as a promising alternative to SAT. First of all, 0-1 linear constraints are more general than clauses, and allow one to encode problems in a more compact way. Secondly, cutting planes [7], the underlying proof system of CDCL-based PB solvers [31] is exponentially more powerful than resolution [30], the proof system which CDCL-based SAT solvers rely on. Thus, PB solvers are, at least from the theoretical point of view, exponentially more powerful than SAT solvers.
Last, but not least, sometimes satisfiability checking is not enough. In several applications [16, 1, 26], one wants to find an optimal solution according to some criterion. Unlike what happens with SAT solvers, it is very easy to turn a PB solver into a PB optimizer, which finds a solution to a set of 0-1 linear constraints that maximizes a given objective 0-1 linear function. The solver is initially run on the set of constraints, ignoring the objective function, in order to obtain a first solution. If the objective function value for this solution is , a constraint expressing that only solutions with objective function value strictly larger than is added and the solver is executed again. The process is repeated until the solver is not able to find further solutions, concluding that the last solution found is optimal.
A noteworthy aspect of the previous approach, known as Linear Search, is that any lemma learned by the solver when looking for a solution larger than can still be used at any future point when a solution larger than is sought. This well-known fact [25] is crucial for the performance of this method.
Contributions.
In this paper, we go one step further and show how constraints learned when looking for a solution larger than can not only be reused, but also automatically strengthened so that a stronger version of them is available during the search for solutions larger than . This is accomplished by considering as a symbolic variable and adapting conflict analysis to take care of this feature. As a by-product of this symbolic way of treating constraints, one can automatically extract upper bounds from them, thus giving an estimation of how far the solver is from reaching an optimal solution. Experimental results show that the overhead of this symbolic procedure is negligible, and that its ability to further prune the search space results in important runtime improvements.
Related Work.
Pseudo-Boolean optimization is a well-studied problem for which three main methods exist: Linear Search, Core-guided Search [13, 24, 9], and Implicit Hitting Set approaches [33, 32]. They are considered to be complementary in the type of instances for which they work best and solvers might even use combinations of them (e.g. RoundingSat combines Linear with Core-guided Search). Linear Search, which is the topic of this paper, has also been applied to the MaxSAT problem. Two prominent contributions in this direction are Pacose [27] and MaxCDCL solver [21, 22], which combines a branch and bound search with a lower-bounding procedure to prune the search space. As far as we know, no previous work uses a symbolic treatment of the objective function and no other linear-search method can compute upper bounds with no additional cost. Hence, we believe the community will benefit from the introduction of this novel technique and will develop additional ways to further exploit it.
This paper is organized as follows. After some preliminaries in Section 2, we revisit the conflict analysis procedure of PB solvers in Section 3. Section 4 presents the main contributions of this paper: the symbolic conflict analysis procedure and its additional ability to compute upper bounds. Experimental evidence of their positive impact on a solver and a careful analysis of the data collected is reported in Section 5. We conclude in Section 6, where we also outline future research directions.
2 Preliminaries
Pseudo-Boolean Constraints.
Let be a set of propositional variables. A literal is either a variable () or the negation of one (). We will assume that . A (pseudo-Boolean or PB) constraint is a 0-1 linear inequality where the ’s are literals and, without loss of generality, the ’s (coefficients) and (degree) are positive integers. When all coefficients are we say that the constraint is a cardinality constraint and if, in addition, is also 1 we say that it is a . A formula is a set of constraints.
Satisfiability and Logical Consequence.
An assignment is a set of non-contradictory literals. It is total if for any either or , and partial otherwise. A literal is true in if , is false if and is undefined otherwise. Given a constraint of the form , an assignment satisfies it if , and falsifies it if no extension of can satisfy it. If we define , it can be seen that falsifies if and only if . Note the slack expression sums the coefficients of all non-false literals, and that is the maximum value that the left hand side of the constraint can reach. If even that does not exceed , no extension of will do. An assignment that satisfies all constraints of a formula is called a model. A formula is a logical consequence of if any model of is also a model of .
Inference Rules.
In order to determine the satisfiability of a set of constraints, one can use the cutting planes proof system, which consists of axioms for all literals , and the following two inference rules:
Linear combination:
Division:
It is well known [19] that a set of constraints is unsatisfiable if and only if can be derived using these rules. We note that in the application of linear combination we implicitly assume that any constraint is simplified by using that fact that . For example, the constraint can be rewritten as and is hence equivalent to . Another well-known rule, which is very useful in PB solving, limits coefficients to be at most equal to the degree:
Saturation:
Unit Propagation.
Given a constraint and an assignment , we say that unit propagates under if is undefined in , but is true in any total assignment extending that satisfies . The latter is equivalent to checking whether , i.e., if we do not set to true, the constraint becomes falsified. Given a formula and an assignment , unit propagation of under is the outcome of applying the following two rules until a fixpoint is reached: (i) if falsifies a constraint , a conflict is found with conflicting constraint and we stop, (ii) if unit propagates some literal due to constraint , extend with reason .
Conflict-Driven Pseudo-Boolean Solving.
A generalization of the well-known CDCL [23] algorithm for SAT can be applied to the pseudo-Boolean case [31]. The algorithm starts with an empty assignment and proceeds as follows: (1) Apply unit propagation, possibly extending . (2) If a conflict is found, a conflict analysis procedure uses the aforementioned inference rules to derive a constraint (called lemma) that can be safely added to the formula. If is the constraint , the formula is unsatisfiable, otherwise it is guaranteed that after removing some literals from in a last-in first-out way (backjumping), allows some literal to be unit propagated. Hence, we go to step 1. (3) If no conflict is found, and is total, it is a model of the formula. Otherwise, an undefined literal (decision literal) is added to and we go to step 1. If we see as a sequence, any literal appearing in the sequence between the -th decision literal (included) and before the -th one is said to belong to decision level .
Pseudo-Boolean Optimization.
Given a set of PB-constraint and an objective function , the Pseudo-Boolean Optimization problem consists of finding, among all assignments that satisfy , one that maximizes the value of the objective function. A very simple approach for this problem is to first run a PB solver to determine the satisfiability of . If a solution is found, for which the objective function has value , a constraint is added to and the same solver is launched again. Eventually, the solver will report the unsatisfiability of the augmented set of constraints, and the last found solution is an optimal one. This is sometimes known as the Linear Search approach to PB optimization.
3 Conflict Analysis in CDCL-Based Pseudo-Boolean Solvers
In CDCL-based PB solvers, when unit propagation finds a conflicting constraint, a procedure is launched whose ultimate goal is to derive a new constraint that (i) is a logical consequence of the current formula and hence can safely be added to it, and (ii) allows the solver to backjump to some previous decision level and propagate a literal at that point. This mimics conflict analysis in SAT solvers, where the 1-UIP scheme [36] has established itself as the dominant approach and, after twenty years, only small variants have been added to it (e.g. [12]).
Conflict analysis in PB solvers is a more complex task than in SAT and one can find different variants in state-of-the-art CDCL-based PB solvers. However, we believe that Algorithm 1 is an adequate abstraction of most of them. For the purpose of this paper, this simplified presentation is detailed enough so that we can introduce our procedure in Section 4. The overall idea of Algorithm 1 is that a series of linear combination steps are applied between the conflicting constraint and the reasons for certain literals in the trail until we obtain a constraint that propagates some literal at some previous decision level. This is essentially what conflict analysis does in CDCL-based SAT solvers, where resolution can be seen as a concrete case of linear combination.
There are three differences we would like to remark. The first difference can be found in the loop in lines 4-6. In a SAT solver, this loop looks for the topmost literal whose negation appears in . But the actual property we want about is that the addition of to is the one that caused to be false. In a PB constraint, the literal with this property need not be the topmost in whose negation appears in . For example, if , the conflicting constraint already became false when was added to , although the topmost literal whose negation appears in the constraint is .
The second, and probably the most important difference, is that we want the application of a linear combination to preserve the fact that the resulting constraint is false in . This property, which trivially holds in SAT solvers, is not true when using linear combinations in general.
Example 1.
Let us consider the two constraints and . Assume that we decide on and later on . The second constraint propagates and and hence we have the assignment . But now the first constraint is conflicting. After the loop in lines 4-6, is and, if line 8 were not present, and hence the linear combination111Note that in Algorithm 1 one can use the least common multiple to compute smaller but we omit this detail in the presentation. would be (note that the right hand side is , where the negative numbers come from canceling 25 units of variable and 10 units of variable ). One can now check that the derived constraint is no longer false in the new assignment .
To avoid this problem, which is well-known in the PB [6, 11, 3] and the ILP communities [20, 29], function replaces the constraints by weaker versions, that is, logical consequences. Probably the easiest solution is to replace the reason of a literal by the clause that expresses the implication. In the previous example we could convert the reason for , which was , to , expressing that if both and are false, then needs to be true. Now and and the linear combination that consists of adding 5 times the new reason clause with the conflicting constraint is , which is now false in the assignment . Other possibilities for weakening the reason constraint exist [6, 11, 3] but they are out of the scope of this paper.
The last difference in the procedure refers to the termination of the procedure. In SAT solving, conflict analysis terminates when the 1-UIP is found, which guarantees that the current propagates at some previous decision level. In PB solving, that constraint might be propagating even before the 1-UIP is found and hence it makes sense to check, after every linear combination step, whether propagates at some previous level. In order to alleviate the computational effort of this check, we use the fact that if and do not have any contradictory literal other than , then if does not propagate at some previous decision level, neither does the constraint resulting of the linear combination between and computed in line 11 of Algorithm 1. We prove this in the following.
Lemma 2.
Let and be two PB constraints such that their parts and have no contradictory literal and let be the linear combination . For any assignment that assigns some value to , if is false in then either or are also false in .
Proof.
Given a constraint , we denote by the (S)um of the coefficients of all literals in that are (N)on-(F)alse in . It holds that .
Let us assume that neither nor are false in and reach a contradiction. Since is false in , we have that . Since we know that is of the form , with no contradictory literals between and , we have that . Note that the last equality holds because and have no contradictory literals, otherwise cancellation between those literals could make it invalid. All in all, we can conclude that .
Now, since is not false, we have that and hence and also and . Similarly . If we add these last two inequalities, and considering that because is assigned in , we have that , which contradicts the last inequality of the previous paragraph.
Corollary 3.
Let and be two PB constraints such that their parts and have no contradictory literal and let be the linear combination . For any assignment that assigns some value to , if unit propagation of under propagates a literal , then either or also propagate under .
Proof.
Apply Lemma 2 with assignment and use the fact that a constraint is false in if and only if propagates a literal under .
We conclude this section with an example showing the execution of a CDCL-based PB solver that uses a conflict analysis method based on Algorithm 1. This allow us to show how our procedure of Section 4 is able to improve it. For simplicity, we have chosen an example for which the application of is never necessary.
Example 4.
First solution.
Since no constraint is propagating, the solver decides on a literal, say , which propagates and due to . Next decision is , which propagates and due to . Next decision is , which propagates and due to . All constraints are satisfied and we have found a solution for which the objective function has value .
Second solution.
Since we have to look for a better solution, constraint is added. The solver makes the same decisions and propagations as before, but the assignment , where a superscript indicates a decision and a subscript represents that constraint is the reason for that propagation, now falsifies constraint . Conflict analysis is started, popping literal from , making be 5 and the loop in lines terminate. Since is the reason for , and , the linear combination is , which is . This constraint propagates at decision level 2, and hence the solver backjumps to produce assignment . No other constraint propagates and the solver now decides on , which propagates due to and is a solution with value .
Third solution.
Since a better solution is needed we replace by . The solver restarts and makes the same decisions and propagations to produce . However, now propagates and, after that, propagates and , which produces a solution with value .
Fourth solution.
now becomes . The solver restarts and proceeds as before to reach . Now propagates and and the resulting assignment falsifies constraint The loop at lines pops from , making to be . Since the coefficient of in the reason of is and the coefficient of in is the linear combination computed is , which is . Since this constraint does not propagate at any previous decision level, one more iteration of conflict analysis is required. The loop at lines 4-6 pops literals and , after which the is 3 and the loop terminates. The procedure will now try to eliminate from by performing an appropriate linear combination with the reason for , which is . This linear combination is , which results in . This constraint is learned since it allows the solver to backjump to decision level 1 and propagate and . The assignment is now . No other constraint propagates and the solver decides on , which propagates due to . Next decision is on , which propagates due to . The current assignment is a solution with value .
Next step.
now becomes . Let us now remark that the solver, apart from the original constraints, only has the lemmas:
The solver restarts and, since no constraint propagates at decision level zero, it will decide on . However, as we will show in the next section, this can be done much better.
4 Symbolic Conflict Analysis in CDCL-Based Pseudo-Boolean Solvers
This section starts with the presentation of our procedure. Then, in Section 4.2 we explain how to obtain upper bounds and how to use it inside a binary-search solving procedure. Next, we explain in Section 4.3 how the procedure interacts with other inference rules commonly used in PB solvers.
4.1 The Symbolic Conflict Analysis Procedure
Let us consider again the maximization problem in Example 4. After every solution found, constraint , which states that only better solutions are accepted, is strengthened. However, we know that lemma was obtained via the linear combination , and lemma was . What we show in this section is how, after every solution is found, not only can we strengthen but also strengthen every lemma that was derived by using . Moreover, we can do it in a simple and efficient way that only requires minor modifications to the solver. Let us introduce our symbolic conflict analysis procedure with the following example.
Example 5.
Let us consider the same maximization problem as in Example 4.
First solution.
We proceed as before in order to find the first solution, with value .
Second solution.
Now, instead of adding constraint , we modify its degree so that it contains a symbolic representation of it. We consider a symbolic variable and instead add constraint . The symbolic part intuitively expresses that the objective function has to be at least 1 unit larger than the value of the best solution found so far, which is symbolically represented by a variable . In terms of propagation, this symbolic expression will be ignored in all constraints.
Hence, we have the same constraints as in the previous example and the solver constructs . Our symbolic conflict analysis is started, proceeding as usual except when performing linear combinations, where the symbolic part will be taken into account. As before, we have to compute . Note that input constraints like also have a symbolic representation of the degree. However, in those cases it will be equal to the degree. That is, the linear combination is:
A simple arithmetic manipulation, also on the symbolic part, results in the final learned lemma . Since the symbolic representation is ignored when propagating, the solver proceeds as in the previous example to obtain solution with cost 5.
Third solution.
This is where our procedure makes the solver behave differently. Since we are looking for a solution of cost larger than 5, we strengthen to become but we can also automatically strengthen to become . More explicitly, the new degree of a constraint is equal to symbolic part where variable is replaced by the value of the best solution found so far. We can observe that allows for more propagations than its initial non-strengthened version. However none of these propagations apply here and the solver finds the same solution as before, with value .
Fourth solution.
Constraints and are strengthened again to become now and . By making the same decisions, the solver finds again the same conflict and the first linear combination to be computed is again , which gives .
This constraint does not propagate at any previous decision level and one additional iteration is needed. The following linear combination is , which results in . It propagates and at decision level 1 and the solver proceeds identically to obtain a solution of value .
Fifth solution.
Constraints , and are strengthened again and become
These constraints are now much stronger than in the previous example due to symbolic conflict analysis. This clearly pays off now: propagates and at decision level zero, something that was not possible in Example 4, where our symbolic conflict analysis procedure was not used. The solver now decides on , which propagates and due to . Since we now have the strengthened version of , we can propagate . Unit propagation ends up finding a solution with value .
Sixth solution.
Constraints , and are strengthened again and become now:
After the propagation of at decision level zero, the solver decides on and unit propagation produces the assignment with value . This now falsifies and conflict analysis is triggered. So is popped from , and the linear combination is performed, giving . Since it does not propagate at decision level zero, conflict analysis continues. Then and are popped and the linear combination is performed, resulting in , which propagates at decision level zero. The solver backjumps to . Since nothing else propagates, the solver decides on and unit propagation results in solution with value 11.
Seventh solution.
Constraints , , and are strengthened again and become now:
These strengthenings allow the solver to compute the optimal solution only by unit propagation , which has value 12.
Summing up, there are only three differences with respect to the underlying algorithm of a CDCL-based PB solver. First of all, PB constraints now have a symbolic representation of their degree. Note that this representation will always be of the form , where . We cannot guarantee that they are natural numbers because of the use of the division rule that we will explain later. This symbolic representation is updated accordingly when the linear combination rule is applied. The second difference is that, when the first solution is found (with value ) a new constraint with symbolic part and degree is added. The third difference is that when a new solution is found with value , the degree of all constraints is updated. More concretely, if a constraint has as symbolic part its degree will become . The following theorem guarantees the correctness of our procedure.
Theorem 6.
Let us consider a set of PB constraints and a PB objective function to maximize. If the symbolic constraint can be derived by a series of symbolic linear combination steps from and the symbolic constraint , then can be derived by a series of linear combination steps from and .
4.2 Additional Benefits of Symbolic Conflict Analysis
In the previous section we showed the main benefit of symbolic conflict analysis: the strengthened constraints have stronger unit propagation capabilities and hence allow for a more efficient traversal of the search space. We now report on additional benefits that we obtain by using our symbolic conflict analysis procedure.
Computation of Upper Bounds.
Somewhat surprisingly, a by-product of the symbolic procedure we have introduced is that we can easily compute upper bounds on the value of the objective function we want to maximize.
Again, the idea is best illustrated by revisiting Example 5. Let us consider constraint in its original form . Assume that a solution with value is found. Our procedure would automatically update the degree of and it would become, removing its symbolic part for the sake of this reasoning, . The left-hand side of this inequality can at most be (the sum of its coefficients). This means that if (i.e. ) this constraint cannot be satisfied and hence the solver would conclude that there is no solution strictly larger than . Hence the best objective can at most be . Thus, after second solution (with value 5) is found in Example 5, we can guarantee that the optimal solution is in the interval .
Theorem 7.
Let us consider a PB maximization problem consisting of a set of PB constraints and an objective function . If we can derive by a series of symbolic linear combination steps a symbolic constraint from and , then the value of the optimal solution to is at most .
Proof.
Let be the value of the optimal solution to . Let us consider the set of constraints , which we know is satisfiable. Theorem 6 and the premises of this theorem guarantee that a series of linear combination steps exists that allow us to derive from the constraint . Since linear combination produces logical consequences, the latter constraint must be satisfiable. Hence must be such that , which is the same as . Since must be an integer value, we can take the floor of the right-hand side to obtain the desired bound.
In Example 5, the previous theorem allows the symbolic conflict analysis procedure to produce an upper bound of due constraint , and a better upper bound of due to .
An interesting question is whether the derivation of upper bounds can trigger an early termination of the procedure because the best solution found matches the computed upper bound. We now prove that, in this situation, the constraint from which the upper bound is computed becomes trivially unsatisfiable after the corresponding strengthening step.
Theorem 8.
Let us consider the constraint from which the upper bound is derived. If a solution with value is obtained, the corresponding strenghtened version of is unsatisfiable.
Proof.
Due to theorem 7, we know that . If we find a solution with value we can strengthen to obtain . We can manipulate the degree of the constraint as follows:
for some . Hence, the strengthened constraint is which is clearly unsatisfiable.
Use in a Binary-Search Approach.
The standard CDCL-based PB procedure we revisited in Example 4 strengthens the degree of the objective function every time a solution is found in order to find a better one. As mentioned before, this is sometimes known as Linear Search. However, we can use the well-known binary-search approach in which we always have an interval that includes the value of the optimal solution. If the optimal has been found. Otherwise we pick and recursively solve two problems, the first one in which a solution is sought in , and a second problem in which a solution is sought in .
One of the main drawbacks of this approach is that constraints learned when solving the problem cannot be reused for the problem, because they might have been derived by using the constraint that expresses that a solution of value at least is needed, which is a not a valid constraint in the problem.
However, this problem can easily be overcome in our symbolic conflict analysis procedure. First of all, we can easily identify the constraints that depend on : the ones with a non-zero coefficient on in the symbolic part. But, even more importantly, we can even reuse those ones for solving the problem by substituting by in the symbolic part and changing the degree of the constraint to that value.
4.3 Interaction with Other Inference Rules
As we have mentioned, our presentation in Algorithm 1 of the conflict analysis procedure used by CDCL-based PB solvers omitted some details in order to (i) ease its understanding and (ii) do not bias it towards any concrete solver. One of the ingredients that our presentation abstracts away is the use of other rules apart from linear combination. In particular, as explained in [15], the use of division and saturation allows solvers to process some problems exponentially faster. Apart from this important effect, they also allow the solver to deal with smaller coefficients in the constraints, which may become very large after a few linear combinations are applied. The recurrent use of infinite-precision arithmetic in a solver might have remarkable negative effects on performance.
The Division Rule.
This rule can be smoothly adapted to be used in our symbolic conflict analysis procedure. Whenever a constraint is divided by a positive natural number to obtain its symbolic part is also divided by the same constant to become . This is the reason why coefficients in the symbolic part might be rational numbers. When a new solution with value is found and the constraint needs to be strengthened, its degree will become the ceiling .
The Saturation Rule.
Another common rule in PB solvers is saturation, which forces all coefficients of a constraint to be at most equal to its degree. In general, the use of this rule is incompatible with our symbolic procedure. This is easy to see if we consider again Example 5. The first lemma that is learned is . At this point, we could apply saturation to obtain . However, after the fourth solution (with value 9) is found, the procedure would update the degree of this constraint to be , which would make the constraint unsatisfiable. Hence, the solver would wrongly conclude that the value of the optimal solution is .
We want to remark that saturation is still applicable to all constraints where has zero coefficient on the symbolic part, i.e., the ones that do not depend on the objective function. Regarding the rest of the constraints, there are a few possibilities: (i) apply saturation but remove the symbolic part or (ii) do not apply saturation, or (iii) apply saturation but store the original coefficients of the constraints in order to be used later when a strengthening step is performed.
5 Experimental Evaluation
In order to empirically evaluate the impact of our symbolic conflict analysis procedure, we have run experiments on the set of benchmarks that were used in the OPT-LIN (optimization problems with linear constraints) track of the 2024 Pseudo-Boolean Competition222https://www.cril.univ-artois.fr/PB24/. Since our solver does not accept big integers on the constraints, we limit our analysis to the 385 benchmarks that only use 32-bit integers. All experiments333Additional material can be found in https://github.com/dearzhaorui/symbolic-conflict-analysis. were done on 3.3Ghz 16GB Intel Xeon E-2124 machines, setting a time limit of one hour per benchmark. We report our findings in what follows.
Potential and Overhead of Symbolic Conflict Analysis.
After every new solution is found, except for the first one, our symbolic conflict analysis procedure performs a strengthening step: it changes the degree of all symbolic constraints. Hence, having a systematic low number of strengthening steps on all benchmarks would indicate that our procedure is rarely applicable. In order to confirm that this pessimistic scenario is not common we computed, for all 385 benchmarks, how many such steps are done within one hour. This information is summarized in the leftmost cumulative plot of Figure 1.
A point in the plot means that there exist of the benchmarks for which at least strengthening steps are applied. For example, around 50% of the benchmarks have at least 8 steps, one third of them has at least 15 steps, and 20% of the benchmarks have at least 24 steps. Hence, the important message one has to infer from the plot is that it is very common that a non-negligible number of strengthening steps are used. There are some instances, left out of the plot for readability, with an unusually large number of steps: 4 benchmarks with between 100 and 150 steps, 2 benchmarks between 151 and 200, and 2 benchmarks with more than 201, being 255 the maximum one. There are also extreme cases on the opposite direction: for about 18% of the benchmarks no strengthening step was applied. This is not a huge number but we believe it is large enough so that it is not negligible: doing expensive computations to later realize that our technique is not applicable at all would definitely slow down the solver on around 20% of the instances.
Fortunately, the rightmost scatter plot of Figure 1 reveals how surprisingly small the overhead of symbolic conflict analysis is. For precisely assessing this overhead, we implemented our symbolic procedure on top of our solver: all constraints have a symbolic part; linear combination, saturation and division are applied taking into account this symbolic part but, after a new solution is found, the strengthened degree of all symbolic constraints is computed but not updated. That is, we pay all the overhead of our procedure but, since the constraints are unchanged, the solver has the same search behavior as its baseline version. The scatter plot compares the runtime in seconds of this modified system and our original solver. It is entirely obvious that the overhead caused by the symbolic procedure is negligible. We want to remark that this is the case independently of the package we use for infinite-precision rational numbers on the symbolic part: we tried GMP [14], Boost [5] and a custom package developed in our research group and no differences were observed.


Runtime Improvement.
Let us now show that implementing symbolic conflict analysis on top of our solver produces important performance gains. The CDF plot on the left of Figure 2 shows how the addition of the symbolic conflict analysis procedure makes the solver able to solve more problems within any given time limit. For a time limit of 1 hour the solver is able to solve 6 more benchmarks up to optimality. The scatter plot on the right contains two parallel green lines delimiting speedups of 2x. We can see that despite there are concrete instances for which the symbolic solver is slower, they are mostly easy instances solved within 30 seconds, and the slowdown is never larger than 2x. On the other hand, there are several benchmarks for which the symbolic solver outperforms the baseline by very large speedup factors.


Finally, we want to compare our system with the state of the art. For this purpose we chose the latest version of RoundingSat [11, 9, 8]. Note that, in the optimization linear constraints category of the 2024 Pseudo-Boolean Competition, 8 of the 10 best performing solvers used RoundingSat, either as an oracle or combined with additional techniques. We ran RoundingSat in linear-search mode. Despite its full version, which combines linear and core-guided search is more powerful, the purpose of this comparison is limited to linear search. The CDF plot on Figure 3 shows how our original solver is slower than RoundingSat. This might be due to maturity of the implementation, different search strategies and, more importantly, concrete details in conflict analysis related to different weakening procedures, different uses of saturation and division, and termination criteria of the main loop as we explained in Section 3. However, the addition of symbolic conflict analysis bridges this gap and makes them behave more similarly. As we will mention, it is part of our future work to examine how symbolic conflict analysis can improve the linear-search component of other solvers, and RoundingSat is definitely a candidate.
In the optimization linear constraints category, 8 of the 10 best performing solvers did use RoundingSAT. That is, it was either used as an oracle or combined with additional techniques. Hence, it is at the core of almost every solver, except for SCIP.
Percentage of Symbolic Lemmas and Impact of Saturation.
The next question we want to address is to determine how many lemmas have a non-constant symbolic part, that is, with . The left histogram in Figure 4 summarizes this information. For each of the 312 out of the overall 385 benchmarks that apply some strengthening step, whenever this step is performed we compute the percentage of PB constraints that have a non-constant symbolic part. Note that in our solver clauses have a particular treatment and they are not considered as PB constraints in this computation. When the execution finishes, we compute the average over all strengthening steps of these percentages. The histogram essentially represents this average. More concretely, the first bar indicates that for 34 benchmarks this average is zero. The second bar shows that for 65 benchmarks this average is in the interval ; the third bar over the 10 label means that the average is in (5,10] for 13 benchmarks, and so on. As expected from the previous results on performance, the percentage of constraints with non-constant symbolic part is remarkable, making the strengthening step a powerful one.
Another question we wanted to answer is to discover the reasons why certain lemmas are non-symbolic (i.e. in ). One obvious reason is that it might happen that the constraint bounding the objective function has not been used in the derivation of this lemma. Another possibility is the interaction between saturation and our symbolic procedure, as we explained in Section 4.3. In our implementation, whenever saturation is to be applied on a constraint with non-constant symbolic part, we apply it and remove its symbolic information. If this happens too often, it could hinder the power of our symbolic reasoning. The right histogram in Figure 4 answers this question in a positive way: this situation is not common at all. For example, the bar with height 15 over the 20 label means that there are 15 instances for which, if we only consider non-symbolic constraints, the percentage of them that are non-symbolic due to an application of saturation is in the interval . As we can see, the main reason for a constraint not being symbolic is not the application of saturation.


Upper Bounding.
An interesting feature of our procedure is the computation of upper bounds. The number of times that our procedure improves the upper bound is significant for several benchmarks. In the left plot of Figure 5 we can see, for three concrete instances, the evolution of the upper and the lower bound. Since these instances are minimization problems, finding a new solution improves the upper bound, whereas our new bounding techniques improve the lower bound. We can see that both bounds are improved several times during the execution of the system. However, we want to note that there are of course instances for which the production of new bounds is not as frequent. This is observed in the right plot of Figure 5. We have considered all 87 benchmarks for which our solver timed out in one hour and such that after the last solution found, our bounding technique is able to reduce the size of the interval . The plot displays the number of instances for which the size of the last interval produced by the solver, divided by the size of the interval after the last solution was found is equal to a certain number. A small number implies that our bounding technique is able to produce a significant reduction on the interval size since the last found solution. As we can see, although for half of the benchmarks only small reductions are achieved, for the other half it does have a beneficial effect, with different magnitudes depending on the benchmark. This shows that, even when the solver is not able to find further solutions, the bounding technique still makes progress in the estimation of the optimum.


Binary Search.
It is well known that in PB optimization, binary search is generally outperformed by linear search. Unfortunately, according to our experiments, this is still the case even if one is able to reuse all lemmas from all problems, as we can do thanks to our symbolic procedure.
6 Conclusions and Future Work
We have introduced a novel symbolic conflict analysis procedure for PB optimization. Despite being easy to implement and having no overhead, experimental results have shown that it is able to significantly improve the performance of our linear-search based solver. In addition, it has several benefits such as the computation of upper bounds and its use in binary search.
As future work, we plan to evaluate the impact of our technique on other PB solvers. Finding ways to improve our bound computation is also on our agenda, as well as developing proof-logging techniques to verify the correctness of the results given by this technique.
References
- [1] Elvira Albert, Pablo Gordillo, Alejandro Hernández-Cerezo, and Albert Rubio. A max-smt superoptimizer for EVM handling memory and storage. In Dana Fisman and Grigore Rosu, editors, Tools and Algorithms for the Construction and Analysis of Systems – 28th International Conference, TACAS 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part I, volume 13243 of Lecture Notes in Computer Science, pages 201–219. Springer, 2022. doi:10.1007/978-3-030-99524-9_11.
- [2] Albert Atserias, Johannes Klaus Fichte, and Marc Thurley. Clause-learning algorithms with many restarts and bounded-width resolution. J. Artif. Intell. Res., 40:353–373, 2011. doi:10.1613/jair.3152.
- [3] Daniel Le Berre, Pierre Marquis, and Romain Wallon. On weakening strategies for PB solvers. In Luca Pulina and Martina Seidl, editors, Theory and Applications of Satisfiability Testing – SAT 2020 – 23rd International Conference, Alghero, Italy, July 3-10, 2020, Proceedings, volume 12178 of Lecture Notes in Computer Science, pages 322–331. Springer, 2020. doi:10.1007/978-3-030-51825-7_23.
- [4] Armin Biere and Daniel Kröning. Sat-based model checking. In Edmund M. Clarke, Thomas A. Henzinger, Helmut Veith, and Roderick Bloem, editors, Handbook of Model Checking, pages 277–303. Springer, 2018. doi:10.1007/978-3-319-10575-8_10.
- [5] The Boost C++ libraries. Available at https://www.boost.org/.
- [6] Donald Chai and Andreas Kuehlmann. A fast pseudo-boolean constraint solver. IEEE Trans. on CAD of Integrated Circuits and Systems, 24(3):305–317, 2005. doi:10.1109/TCAD.2004.842808.
- [7] W. Cook, C. Coullard, and Gy. Turan. On the complexity of cutting-plane proofs. Discrete Applied Mathematics, 18:25–38, 1987. doi:10.1016/0166-218X(87)90039-4.
- [8] Jo Devriendt, Ambros M. Gleixner, and Jakob Nordström. Learn to relax: Integrating 0-1 integer linear programming with pseudo-boolean conflict-driven search. Constraints An Int. J., 26(1):26–55, 2021. doi:10.1007/S10601-020-09318-X.
- [9] Jo Devriendt, Stephan Gocht, Emir Demirovic, Jakob Nordström, and Peter J. Stuckey. Cutting to the core of pseudo-boolean optimization: Combining core-guided search with cutting planes reasoning. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, pages 3750–3758. AAAI Press, 2021. doi:10.1609/AAAI.V35I5.16492.
- [10] Julian Dolby, Mandana Vaziri, and Frank Tip. Finding Bugs Efficiently With a SAT Solver. In Proceedings of the 6th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 195–204, 2007. doi:10.1145/1287624.1287653.
- [11] Jan Elffers and Jakob Nordström. Divide and conquer: Towards faster pseudo-boolean solving. In Jérôme Lang, editor, Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, pages 1291–1299. ijcai.org, 2018. doi:10.24963/ijcai.2018/180.
- [12] Nick Feng and Fahiem Bacchus. Clause size reduction with all-uip learning. In Luca Pulina and Martina Seidl, editors, Theory and Applications of Satisfiability Testing – SAT 2020 – 23rd International Conference, Alghero, Italy, July 3-10, 2020, Proceedings, volume 12178 of Lecture Notes in Computer Science, pages 28–45. Springer, 2020. doi:10.1007/978-3-030-51825-7_3.
- [13] Zhaohui Fu and Sharad Malik. On solving the partial MAX-SAT problem. In Armin Biere and Carla P. Gomes, editors, Theory and Applications of Satisfiability Testing – SAT 2006, 9th International Conference, Seattle, WA, USA, August 12-15, 2006, Proceedings, volume 4121 of Lecture Notes in Computer Science, pages 252–265. Springer, 2006. doi:10.1007/11814948_25.
- [14] The GNU MP Bignum Library. Available at https://gmplib.org/.
- [15] Stephan Gocht, Jakob Nordström, and Amir Yehudayoff. On division versus saturation in pseudo-boolean solving. In Sarit Kraus, editor, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, pages 1711–1718. ijcai.org, 2019. doi:10.24963/IJCAI.2019/237.
- [16] Ana Graça, João Marques-Silva, Inês Lynce, and Arlindo L. Oliveira. Efficient haplotype inference with combined CP and OR techniques. In Laurent Perron and Michael A. Trick, editors, Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, 5th International Conference, CPAIOR 2008, Paris, France, May 20-23, 2008, Proceedings, volume 5015 of Lecture Notes in Computer Science, pages 308–312. Springer, 2008. doi:10.1007/978-3-540-68155-7_28.
- [17] Armin Haken. The intractability of resolution. Theoretical Computer Science, 39(2 & 3):297–308, 1985. doi:10.1016/0304-3975(85)90144-6.
- [18] Marijn J. H. Heule, Oliver Kullmann, and Victor W. Marek. Solving and verifying the boolean pythagorean triples problem via cube-and-conquer. In Nadia Creignou and Daniel Le Berre, editors, Theory and Applications of Satisfiability Testing – SAT 2016 – 19th International Conference, Bordeaux, France, July 5-8, 2016, Proceedings, volume 9710 of Lecture Notes in Computer Science, pages 228–245. Springer, 2016. doi:10.1007/978-3-319-40970-2_15.
- [19] John N Hooker. Generalized resolution and cutting planes. Annals of Operations Research, 12:217–239, 1988.
- [20] Dejan Jovanovic and Leonardo Mendonça de Moura. Cutting to the chase – solving linear integer arithmetic. J. Autom. Reasoning, 51(1):79–108, 2013. doi:10.1007/s10817-013-9281-x.
- [21] Chu-Min Li, Zhenxing Xu, Jordi Coll, Felip Manyà, Djamal Habet, and Kun He. Combining clause learning and branch and bound for maxsat. In Laurent D. Michel, editor, 27th International Conference on Principles and Practice of Constraint Programming, CP 2021, Montpellier, France (Virtual Conference), October 25-29, 2021, volume 210 of LIPIcs, pages 38:1–38:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.CP.2021.38.
- [22] Chu-Min Li, Zhenxing Xu, Jordi Coll, Felip Manyà, Djamal Habet, and Kun He. Boosting branch-and-bound maxsat solvers with clause learning. AI Commun., 35(2):131–151, 2022. doi:10.3233/AIC-210178.
- [23] João Marques-Silva, Inês Lynce, and Sharad Malik. Conflict-driven clause learning SAT solvers. In Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh, editors, Handbook of Satisfiability – Second Edition, volume 336 of Frontiers in Artificial Intelligence and Applications, pages 133–182. IOS Press, 2021. doi:10.3233/FAIA200987.
- [24] António Morgado, Federico Heras, Mark H. Liffiton, Jordi Planes, and João Marques-Silva. Iterative and core-guided maxsat solving: A survey and assessment. Constraints An Int. J., 18(4):478–534, 2013. doi:10.1007/S10601-013-9146-2.
- [25] Robert Nieuwenhuis and Albert Oliveras. On SAT Modulo Theories and Optimization Problems. In Proceedings of SAT ’06, volume 4121 of Lecture Notes in Computer Science, pages 156–169. Springer, 2006. doi:10.1007/11814948_18.
- [26] Robert Nieuwenhuis, Albert Oliveras, Enric Rodríguez-Carbonell, and Emma Rollon. Employee scheduling with sat-based pseudo-boolean constraint solving. IEEE Access, 9:142095–142104, 2021. doi:10.1109/ACCESS.2021.3120597.
- [27] Tobias Paxian and Bend Becker. Pacose: An Iterative SAT-based MaxSAT Solver. In Jeremias Berg, Matti Järvisalo, Ruben Martins, Andreas Niskanen, and Tobias Paxian, editors, MaxSAT Evaluation 2024 : Solver and Benchmark Descriptions, Department of Computer Science Series of Publications B, page 26. University of Helsinki, 2024. URL: http://hdl.handle.net/10138/584878.
- [28] Knot Pipatsrisawat and Adnan Darwiche. On the power of clause-learning SAT solvers as resolution engines. Artif. Intell., 175(2):512–525, 2011. doi:10.1016/j.artint.2010.10.002.
- [29] Albert Oliveras Robert Nieuwenhuis and Enric Rodríguez-Carbonell. Intsat: integer linear programming by conflict-driven constraint learning. Optimization Methods and Software, 2023. doi:10.1080/10556788.2023.2246167.
- [30] J.A. Robinson. A Machine-Oriented Logic Based on the Resolution Principle. Journal of the ACM, JACM, 12(1):23–41, 1965. doi:10.1145/321250.321253.
- [31] Olivier Roussel and Vasco M. Manquinho. Pseudo-boolean and cardinality constraints. In A. Biere, M. Heule, H. van Maaren, and T. Walsh, editors, Handbook of Satisfiability, volume 185 of Frontiers in AI and Applications, pages 695–733. IOS Press, 2009. doi:10.3233/978-1-58603-929-5-695.
- [32] Pavel Smirnov, Jeremias Berg, and Matti Järvisalo. Pseudo-boolean optimization by implicit hitting sets. In Laurent D. Michel, editor, 27th International Conference on Principles and Practice of Constraint Programming, CP 2021, Montpellier, France (Virtual Conference), October 25-29, 2021, volume 210 of LIPIcs, pages 51:1–51:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.CP.2021.51.
- [33] Pavel Smirnov, Jeremias Berg, and Matti Järvisalo. Improvements to the implicit hitting set approach to pseudo-boolean optimization. In Kuldeep S. Meel and Ofer Strichman, editors, 25th International Conference on Theory and Applications of Satisfiability Testing, SAT 2022, August 2-5, 2022, Haifa, Israel, volume 236 of LIPIcs, pages 13:1–13:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.SAT.2022.13.
- [34] Mate Soos, Karsten Nohl, and Claude Castelluccia. Extending SAT solvers to cryptographic problems. In Oliver Kullmann, editor, Theory and Applications of Satisfiability Testing – SAT 2009, 12th International Conference, SAT 2009, Swansea, UK, June 30 – July 3, 2009. Proceedings, volume 5584 of Lecture Notes in Computer Science, pages 244–257. Springer, 2009. doi:10.1007/978-3-642-02777-2_24.
- [35] Yichen Xie and Alexander Aiken. Saturn: A SAT-Based Tool for Bug Detection. In Proceedings of the 17th International Conference on Computer Aided Verification, CAV 2005, pages 139–143, 2005. doi:10.1007/11513988_13.
- [36] L. Zhang, C. F. Madigan, M. W. Moskewicz, and S. Malik. Efficient Conflict Driven Learning in a Boolean Satisfiability Solver. In International Conference on Computer-Aided Design, ICCAD ’01, pages 279–285, 2001.
