Bit-Precise Reasoning with Parametric Bit-Vectors
Abstract
The SMT-LIB theory of bit-vectors is restricted to bit-vectors of fixed width. However, several important applications can benefit from reasoning about bit-vectors of symbolic widths, i.e., parametric bit-vectors. Recent work has introduced an approach for solving formulas over parametric bit-vectors, via an eager translation to quantified integer arithmetic with uninterpreted functions. The approach was shown to be successful for several applications, including the bit-width independent verification of compiler optimizations, invertibility conditions, and rewrite rules. We extend and improve that approach in several aspects. Theoretically, we improve expressiveness by defining a new theory of parametric bit-vectors that supports more operators and allows reasoning about the bit-widths themselves. Algorithmically, we introduce a lazy algorithm that avoids the use of uninterpreted functions and quantified axioms for them. Empirically, we show a significant improvement by implementing and evaluating our approach, and comparing it experimentally to the previous one.
Keywords and phrases:
Satisfiability Modulo Theories, Bit-precise Reasoning, Parametric Bit-vectorsCopyright and License:
2012 ACM Subject Classification:
Theory of computation Automated reasoning ; Software and its engineering Formal methodsFunding:
This research was supported in part by the Stanford Center for Automated Reasoning, the Stanford Center for Blockchain Research, Defense Advanced Research Projects Agency (DARPA) contract FA875024-2-1001, Israel Science Foundation (ISF) grant number 619/21, National Science Foundation (NSF) grant number 2303489, and a gift from Amazon Web Services.Event:
28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)Editors:
Jeremias Berg and Jakob NordströmSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Bit-precise reasoning as provided by Satisfiability Modulo Theories (SMT) for the theory of fixed-size bit-vectors [3] is a key requirement for a wide range of verification applications. In particular, it allows verification engines to reason about machine integers and hardware registers and supports a variety of operators (including arithmetic, bitwise, and word operators such as concatenation and extraction). It is also very useful for other theories, such as floating-point arithmetic and non-linear integer arithmetic, where reasoning over certain fragments of those theories can be reduced to reasoning in the theory of fixed-size bit-vectors [7, 17].
One of the biggest limitations of this theory is already evident in its name: it only allows reasoning about bit-vectors whose size (or, equivalently, width) is fixed. Thus, when declaring a bit-vector sort in the SMT-LIB language, its width must be specified as a numeral. This limitation poses a serious expressiveness issue: when a verification tool verifies a property using this theory of bit-vectors, the verification result is only valid for the explicitly specified bit-width(s). In the context of software verification, this means that some programs are proven correct in the presence of, say, 32-bit integer variables, but if their type is changed to 64-bit integers, the verification process has to be repeated. Similar issues occur during the development of specialized solvers for this theory, where much effort is dedicated to the design of rewrite rules and the generation of auxiliary lemmas [23, 34, 28, 31]. Before adding new rules or lemmas in the solver, it is valuable to be able to verify that they are correct, and their correctness is usually independent of bit-width. Being able to do so fast and automatically using SMT solvers (as opposed to following the lengthier process of developing interactive proofs [12, 19]) is highly beneficial for the development process. However, state-of-the-art SMT solvers do not currently support reasoning over bit-vectors of parametric size. Thus, it is common to verify the correctness of such rewrite rules and lemmas only for fixed bit-widths, from 1 up to some reasonably large value. This does not yield a proof of correctness in general although it increases confidence that the rules and lemmas are also correct for arbitrary bit-widths [28, 31].
Early work on bit-precise reasoning supported reasoning about parametric bit-vectors, but only for a small fragment of the theory (see, e.g., [35, 5, 15, 16]). In all cases, however, the supported fragment is not expressive enough for modern verification efforts.
More recently, an approach for solving formulas over parametric bit-vectors via an eager translation to quantified non-linear integer arithmetic with uninterpreted functions was proposed by Niemetz et al. [30, 29]. In that work, bitwise operators and the exponentiation function with base , which is required for translating arithmetic bit-vector operators to the integers, are axiomatized by means of uninterpreted functions and quantifiers. In particular, [30] introduced a theory of parametric bit-vectors where bit-widths are implicit, and so it is not possible to reason about then. The translation was implemented outside of an SMT solver and evaluated on three case studies: bit-width independent verification conditions for compiler optimizations [20], invertibility conditions [28], and rewrite rules [34]. In addition, since all three case studies only require reasoning about a single parametric bit-width, the implementation of the translation targets a fragment of the SMT-LIB theory of bit-vectors that does not involve multiple bit-widths. As a result, the possibility of verifying, e.g., compiler optimization reductions was limited, as it excluded optimizations involving bit-vector concatenations, extractions and extensions, all of which involve multiple bit-widths.
Later work [39] presented a translation of fixed-size bit-vectors to quantifier-free non-linear integer arithmetic with uninterpreted functions (coined int-blasting, generalizing earlier approaches, see, e.g., [38, 8, 6]) with the goal of improving the scalability of reasoning over large bit-vectors. Unlike [30], the approach of [39] is lazy, does not introduce any quantifiers and does not support reasoning over parametric bit-vectors. Further, it is integrated in cvc5 [1]. While not in general competitive with state-of-the-art bit-vector solvers, int-blasting showed promising performance improvements on problems involving large bit-widths arising from smart contract verification.
In this paper, we propose a procedure for reasoning about parametric bit-vectors based on the eager approach presented in [30]. Our technique extends and combines this eager approach with lazy techniques introduced for reasoning about fixed size bit-vectors via a reduction to the integers in [39]. In particular, we make the following contributions:
-
(i)
We introduce a theory of parametric bit-vectors with symbolic bit-widths as part of its signature. Our theory definition simplifies yet strengthens the theory of parametric bit-vectors introduced in [30] while enabling the reasoning over bit-widths.
-
(ii)
We present a lazy algorithm for determining satisfiability of parametric bit-vectors constraints that does not rely on quantifiers (as opposed to [30]) and generalizes the lazy handling of bitwise and from [39] to parametric bit-vectors. Additionally, instead of eagerly axiomatizing exponentiation with base (as in [30]), we handle this operator lazily.
-
(iii)
We provide an implementation of a solver for parametric bit-vectors. The implementation is generic: it supports reasoning over arbitrary parametric bit-vector formulas (which goes beyond the prototype implementation presented in [30]). In particular, it fully supports reasoning over bit-widths, constraints on bit-vectors with multiple widths, and operators combining or returning bit-vectors of different widths, such as extraction, concatenation and extension.
-
(iv)
We evaluate our approach on a large and diverse set of benchmarks and show that our technique significantly improves over the eager approach introduced in [30].
For brevity and simplicity, and similarly to previous work [30, 39], our formal presentation focuses on quantifier-free formulas. We stress however, that our implementation fully supports quantifiers.111In fact, one of the benchmark sets in our experimental evaluation contains quantified formulas.
The remainder of this paper is organized as follows. In Section 2, we provide the necessary background on first-order logic. In Section 3, we present our definition for a theory of parametric bit-vectors. In Section 4, we describe our new solver for parametric bit-vectors. We provide a detailed evaluation of the solver in Section 5 and conclude with directions for future research in Section 6.
2 Preliminaries
We briefly review notions of many-sorted first-order theories with equality (see [13, 37] for more details). Let be a set of sort symbols, and for every sort , let be an infinite set of variables of sort . We assume that sets are pairwise disjoint and define as the union of sets . A signature consists of a set of sort symbols and a set of function symbols. Arities of function symbols are defined in the usual many-sorted way. Constants are treated as nullary functions. We assume that includes a Boolean sort and the Boolean constants (true) and (false). Function symbols whose return sort is are also called predicate symbols. We assume that for each sort , a binary equality predicate symbol is included in the signature. We will write just when is clear or not important.
We assume the usual definitions of well-sorted terms, literals, and formulas, and refer to them as -terms, -literals, and -formulas, respectively. We include the ternary if-then-else operator of arity for each , and omit when it is clear from the context.
A -interpretation maps: each to a distinct non-empty set of values (the domain of in ); each to an element ; and each to a total function if , and to an element in if . We use the usual inductive definition of the satisfiability relation between -interpretations and -formulas.
A theory is a pair , where is a signature and is a non-empty class of -interpretations that is closed under variable reassignment, i.e., if interpretation only differs from an in how it interprets variables, then also . Each interpretation interprets as the two-element set and as the identity relation over each sort . A -formula is -satisfiable (resp. -unsatisfiable) if it is satisfied by some (resp. no) interpretation in ; it is -valid if it is satisfied by all interpretations in .
| Symbol | SMT-LIB Syntax | Arity |
|---|---|---|
| =, distinct | ||
| 0, 1, 2, … | ||
| +, -, *, div, mod | ||
| <=, >=, <, > |
The theory of integer arithmetic is defined as in the SMT-LIB 2 standard [2, 4]. Its signature is shown in Table 1 and includes a single sort , function and predicate symbols , and a constant symbol for every integer value. Given a set of function symbols disjoint from those of , the signature is obtained from by the addition of the symbols in . The theory consists of all -interpretations whose reduct to is a -interpretation (i.e., the symbols in are freely interpreted). We may remove set braces when listing the elements of , e.g., by writing instead of .
3 A Theory of Parametric Bit-vectors
The SMT-LIB theory of bit-vectors does not support parametric bit-vectors. Neimetz et al. [30] define a theory of parametric bit-vectors based on auxiliary functions (outside of the signature of the theory). These functions provide an implicit mapping from bit-vector variables to their symbolic bit-width and from symbolic parametric bit-vector constants to -terms. A notion of admissibility is introduced to exclude mappings that assign non-standard interpretations (e.g., interpretations with zero or negative bit-widths). Furthermore, a parametric bit-vector formula is viewed as a class of fixed-size bit-vector formulas, i.e., a class of instances with concrete values for the bit-widths. This allows for a notion of well-sortedness based on the well-sortedness of these instances. As a consequence of relying on auxiliary mappings for symbolic bit-widths and constants, however, it is not possible to reason about bit-widths within that theory.
In the following, we introduce a new formal definition of the theory of parametric bit-vectors . As in [30], we have a single sort for bit-vectors of parametric size. Compared to that work, however, our definition of does not rely on meta-level functions to maintain well-sortedness constraints. Instead, we make symbolic bit-widths of parametric bit-vector terms explicit in the signature via the new operator and introduce an explicit representation of parametric bit-vector constants via the conversion operator . This enables reasoning about both parametric bit-vectors and their bit-widths within the theory.
3.1 Theory Definition
| Symbol | SMT-LIB Syntax | Arity |
| + the following operators | ||
| =, distinct | ||
| , , , | bvult, bvugt, bvslt, bvsgt | |
| , , , | bvule, bvuge, bvsle, bvsge | |
| , | bvnot, bvneg | |
| , , | bvand, bvor, bvxor | |
| , , | bvshl, bvlshr, bvashr | |
| , | bvadd, bvsub | |
| , , | bvmul, bvurem, bvudiv | |
| pextract | ||
| concat | ||
| pzero_extend | ||
| psign_extend | ||
| bvsize | ||
| int_to_pbv | ||
| + the following operators | ||
| piand | ||
| pow2 | ||
We define the theory of parametric bit-vectors as the pair , where the signature is the extension of described in Table 2. In addition to the integer sort , its set of sort symbols includes a single, new sort for bit-vectors of parametric size. The set of function and predicate symbols of consists of parametric variants of a strict subset of the fixed-size bit-vector operators defined in SMT-LIB 2. This set of parametric bit-vector operators, however, is complete in the sense that it suffices to express parametric variants of the remaining bit-vector operators from SMT-LIB 2. Additionally, we introduce two new function symbols, of arity and of arity , in the signature.
In all the interpretations of , the domain of is the set of integer numbers, and the domain of is the set of all bit-vectors of all possible positive widths. The operators in are interpreted in the same way as in . The new symbol is interpreted as the function that maps each bit-vector to its bit-width expressed as an integer. The new symbol is interpreted as the function that maps any two integers and to the bit-vector of size that represents in binary notation if , and to an arbitrary bit-vector otherwise. It can model constants of parametric bit-widths, e.g., represents the bit-vector that consists of zeros when . Operator is interpreted as the function that maps any two bit-vectors (of any two possible widths) to the bit-vector consisting of their concatenation. Operator denotes the function that maps a bit-vector , an integer , and an integer , in that order, to the bit-vector of width that ranges from the -th bit of to the -th bit of (inclusive) if , and is interpreted arbitrarily otherwise. For a bit-vector , we refer to the -th bit of as , which abbreviates . We interpret as the least significant bit (LSB) and as the most significant bit (MSB). Operators and take an integer and a bit-vector argument and are interpreted as in the SMT-LIB theory of fixed-size bit-vectors whenever the first argument (the number of added bits) is non-negative: the former extends the second argument with leading s, while the latter extends it with its most significant bit. If the number of added bits is negative, the interpretation is arbitrary. Operators and correspond to one’s and two’s complement. All other operators of are interpreted according to the following principle: if both operands have the same (positive) bit-width, then their interpretation is the same as in the theory of fixed size bit-vectors. Otherwise, the interpretation is arbitrary.
Remark 1.
Recall that the signature of the theory of fixed-size bit-vectors has a unique sort for each bit-width . A more natural choice for a theory of parametric bit-vectors might then be the generalization of to sorts of the form where is an arbitrary integer term. However, this introduces the complication that syntactically distinct sort terms with equivalent integer arguments, such as and or and , would have to denote the same domain (bit-vectors of 4 bits) or domain family (bit-vectors of even bit-width). This exceeds the expressiveness of many-sorted logic where, in effect, distinct sorts denote disjoint domains. For this reason, we define a single sort to represent bit-vectors of all bit-widths. As a consequence, in our signature , terms can be well-sorted that do not match well-sorted terms in . For example, the term is well-sorted in our case even when and denote two bit-vectors of different bit-widths, something that is not permitted in the language of . Thus, we establish restrictions on the interpretations of (described in Section 3.2) to have semantics that coincide with the semantics of the theory of fixed-size bit-vectors. In the example above, the restriction will be that and can only be interpreted as bit-vectors of the same width.
Note that parametric bit-vectors are expected to be supported more faithfully in the upcoming Version 3 of SMT-LIB, which is based on a higher-order logic with dependent types.
3.2 Satisfiability and Admissible Satisfiability
The way it is defined, theory contains non-standard interpretations as it has to account for terms like and formulas like that are not well sorted in the theory of fixed-size bit-vectors. To address this laxness of ’s sort system, we therefore define a notion of admissible -satisfiability that excludes such spurious interpretations. This definition relies on function Adm, defined in Algorithm 1. Function Adm collects width constraints while traversing over a given -term. It is defined via a function Bw as given in Algorithm 2, which constructs an integer term representing the symbolic bit-width of a parametric bit-vector term.
Definition 2.
Let be a -formula and a -interpretation. Let Adm be a function as defined in Algorithm 1. Interpretation is admissible w.r.t. if . Formula is admissibly -satisfiable if it is satisfied by a -interpretation that is admissible w.r.t. .
Thus, a formula is admissibly -satisfiable only if it is satisfied by an admissible -interpretation . We require that an admissible interpretation assign bit-widths to -terms in while satisfying the admissibility condition described by the function Adm defined in Algorithm 1.
Example 3.
Consider a formula given as . The admissibility condition is determined as . An admissible -interpretation satisfying is given by , and . Thus, is admissibly -satisfiable. Now, consider formula given as . Its admissibility condition is defined as (after simplifications), which is -unsatisfiable. That is, there exists no admissible -interpretation for and, thus, is not admissibly -satisfiable. However, is -satisfiable: consider a -interpretation , given by . Then, is the bit-vector . Since and do not have the same bit-width, may interpret arbitrarily, and so we can set to be again the bit-vector . We then get that satisfies .
Note that our notion of admissibility corresponds to the notion of admissibility introduced by Niemetz et al. [30]. However, its definition differs in one key aspect. In [30], admissibility was enforced by using auxiliary functions and by instantiating parametric bit-vector formulas for all possible fixed bit-widths via universal quantification over the parametric bit-widths. Bit-widths and mappings from -constants to -terms were implicit in the signature, whereas in our definition of , we explicitly include operators and . This allows us to explicitly define admissibility conditions and to reason about symbolic bit-widths in -formulas.
4 A Solver for Parametric Bit-vectors
In this section, we describe a solver for admissible -satisfiability. Its main component is the translation of -formulas to formulas over a signature . As defined in Table 2, signature extends with two new function symbols, and .
We define a theory over , where symbols and are interpreted arbitrarily. Additionally, for the purpose of the translation, we consider a theory over in which the interpretation of and is fixed and defined as follows. The term is interpreted as the -th power of 2 whenever is non-negative, and as otherwise. The interpretation of is defined as follows. If and , then is the integer that corresponds to performing a bitwise conjunction on the unsigned bit-vector representations of and . That is, let be the unsigned bit-vector representations of and of width , then the interpretation of is defined as . If and , then the interpretation of is the same as . If , then its interpretation is .
Remark 4.
Notice that is defined over the integers. Thus, we always interpret a negative exponent as 0, as this is consistent with truncating the integer division in towards zero. Further, notice that defines a special case for bit-width . Our translation will never produce -formulas that are satisfiable due to these corner cases because it ensures that: (1) the exponents of are integer terms representing bit-widths and (2) integer terms that represent bit-widths are always positive. In fact, for our purposes, the special cases mentioned above could also be interpreted arbitrarily. We choose a fixed interpretation since it is easier to implement.
Starting with a -formula, we first simplify the formula by applying rewrite rules , as described in Section 4.1. The rewritten -formula is then translated to a -formula using function Trans, as described in Section 4.2. Finally, the -satisfiability of the resulting -formula is determined via procedure Solve , described in Section 4.3.
4.1 The Parametric Bit-Vector Rewriter
State-of-the-art SMT solvers rely heavily on simplification techniques that are applies as a preprocessing step, before the main solver process begins. One such technique is based on term rewriting. For the theory of fixed-size bit-vectors (for which the dominant solving technique is bit-blasting, an eager reduction to propositional logic), SMT solvers implement hundreds of rewrite rules. However, rewrite rules for that target bit-blasting are not necessarily beneficial for alternative solving procedures. In particular, they may not be useful for our procedure, which relies on a translation to integer arithmetic. Thus, some care is required in selecting rewrite rules for our procedure.
We implemented a set of rewrite rules for based on rewrite rules for as implemented in the SMT solver cvc5 [1]. Rewrite rules implemented in cvc5 are documented in the domain-specific language RARE [33], which allows for easy adoption and lifting to . We selected -rewrite rules for lifting to based on the following principles: a rewrite rule should not introduce a bitwise operator; the translation of the rewritten formula to the integers should not have more occurrences of operators and than the original one. These principles are based on the fact that state-of-the-art solvers for integer arithmetic do not support bitwise operations over integers, and also offer very limited support for exponentiation. Further, is non-linear and one of the most expensive operations for integer arithmetic procedures. Interestingly, following these guiding principles, some -rewrite rules are useful for when applied in the opposite direction.
The rewrite rules that we have implemented are presented in Appendix A.
Example 5.
Consider a -rule that rewrites to . This is useful for bit-blasting since it reduces the size of the bit-level representation. However, for our translation to integer arithmetic, extractions are expensive as they introduce , and terms. Thus, we did not lift this rule to but included its right-to-left variant in instead.
4.2 The Translation Function Trans
After obtaining the rewritten -formula , we translate it into a -formula such that is admissibly -satisfiable iff is -satisfiable. The translation function Trans is given in Algorithms 3 and 4. It is based on the one in [30], with changes underlined.
Prior to the actual conversion step, -formula is augmented, conjunctively, with the admissibility constraints , constructed as described in Algorithm 1, as well as the range and size constraints . The latter are defined over integer constants that represent either parametric bit-vector constants or symbolic bit-widths, as described in Algorithm 4. The latter traverses the formula and adds constraints for the range of the integer terms representing bit-vector terms, according to their original bit-widths.
The augmented -formula is then converted via function Conv into an equisatisfiable -formula as defined in Algorithm 3.
We assume a one-to-one mapping from variables of sort to variables of sort , as well as a one-to-one mapping from -terms to variables of sort , such that their images are disjoint. Intuitively, translates variables of sort to corresponding variables of sort while translates terms of the form to fresh integer variables representing the size of the parametric bit-vector term . Compared to our previous work [30], our new translation handles the new operators and , improves the encodings of the operators , , and to eliminate applications of , and adds support for the operators , , and . Notice that the encoding of is not explicitly described, as that operator can be expressed in terms of and .
Following Zohar et al. [39], we do not handle and natively. Instead, we eliminate them by means of , and , based on the following identities derived from [18]: and . This elimination is embedded in the translation function and improves upon the original elimination in [39] by avoiding the introduction of additional operations.
Example 6.
Our translation function Trans makes heavy use of and operations. In practice, some of these applications can be safely eliminated to optimize the encoding. For example, consider the term , where . The result of Conv on this term is , where . Instead, we directly translate it to .
4.3 The -Solver
After translating a -formula to a -formula, we check if it is -satisfiable with a dedicated CEGAR-style procedure Solve , which iteratively refines over-approximations of and as given in Algorithm 5. This procedure takes an approach similar to the incremental linearization of non-linear arithmetic constraints [9, 10]. It relies on a set of quantifier-free lemmas which fully specify the semantics of operators and . The set is constructed by instantiating the (implicitly) universally quantified lemma schemas , shown in Table 3, and , shown in Table 4, for all -terms and -terms in . The instantiations of the lemmas also depend on the values of variables in the current interpretation . Note that when computing the instantiation for -lemmas, we also include -terms occurring in lemmas for .
The procedure further assumes the availability of a subprocedure Solve to determine the -satisfiability of a set of -formulas . Recall that in , symbols and are uninterpreted and thus, is an over-approximation of in . If Solve determines the unsatisfiability of , we conclude with “unsat”. If it determines its satisfiability, we conclude with “sat” only if the resulting interpretation also satisfies all lemmas in . Otherwise, we refine by adding to it all lemmas that are not satisfied by .
The set in Table 3 includes 4 lemma schemas from [30], which describe basic properties of the operator. In addition, we consider the following three new lemmas. Lemma considers negative inputs, for which is defined to be interpreted as in . Lemma strengthens a lemma from recent work proposing a CEGAR-style approach for a new SMT theory for integer arithmetic with exponentiation [14]. From there, we get that when , then is a lower bound for . We notice that when , we have as a tighter lower bound. In order to avoid non-linear multiplications, we linearize this bound via concrete values . We further include instances of with , to block concrete model values.
The set of lemma schemas in Table 4 includes 8 lemmas that were introduced in [30] and capture basic properties of the operator. In addition, we consider 4 lemmas that did not appear in [30]. Lemma captures the corner case when the bit-width is non-positive. Lemmas and capture the cases where one of the arguments is even or . Note that due to lemma , it is sufficient to consider only one variant of each lemma. Lemma is based on the fact that for a fixed bit-width , is defined as , with defined as as in Table 4. We observe that for a bit-width with and , the equation holds – consider, for example, . Lemma is a generalization of this observation. It is instantiated with .
Example 7.
Consider the formula . Its translation will be , with the addition of the appropriate admissibility and range constraints. Then lemma will be instantiated in set as . Lemma will be instantiated as . In contrast, the approach of [30] introduces the universal closure of such lemmas, instead of specific instantiations.
Note that procedure Solve resembles the lazy approach of [39], with several key differences. First, since we are dealing with parametric bit-widths, our algorithm does not necessarily terminate, as the set of possible bit-widths is unbounded. Second, our algorithm not only lazily handles operator , but also . Operator did not occur when int-blasting in [39] since there, all exponents were concrete constants. Thus, exponentiation was simply evaluated. Finally, our sets of lemmas for and expand both the axioms from [30] and the lemmas from [39].
| Name | Lemma Schema | Source |
| [30] | ||
| [30] | ||
| [30] | ||
| new | ||
| new | ||
| new | ||
| , | ||
| [30] | ||
| Name | Lemma Schema | Source |
| [30] | ||
| [30] | ||
| [30] | ||
| [30] | ||
| [30] | ||
| [30] | ||
| new | ||
| new | ||
| new | ||
| new | ||
| , | ||
| [30] | ||
| [30] | ||
Putting all components described in this section together, we obtain an incomplete but sound procedure for admissible -satisfiability.
Theorem 8.
Let be a -formula and let . If terminates, then is admissibly -satisfiable if and only if returns “sat”.
4.4 Proof of Theorem 8
The correctness argument for Theorem 8 is similar to those from [30, 39]: assuming a -interpretation that satisfies a -formula , while also being admissible w.r.t. , we can construct a -interpretation that satisfies the translation of , as well as the relevant lemmas from Tables 3 and 4, and vice versa. Compared to [30], the auxiliary admissibility condition that was assumed is now replaced by the satisfaction of . What is left to show is:
-
1.
The new translation of is correct.
-
2.
The eliminations of and are correct.
-
3.
The treatment of extraction and extensions is correct.
-
4.
The rules of the rewriter are -valid.
- 5.
We prove the first item in Section 4.4.1 and the second in Section 4.4.2. The third follows directly from the SMT-LIB standard. For the fourth, notice that we only use rewrite rules that are already used in cvc5 and are bit-width independent. For their correctness proofs, see, e.g., [19]. For the fifth, most lemmas are taken from [30]. Of the new lemmas that we add, correctness is trivial for lemmas , , , and . We prove the correctness of the remaining lemmas in Section 4.4.3.
4.4.1 Correctness of Translation of
It suffices to show that in every -interpretation that satisfies the formulas , , and , the expressions and are interpreted the same. This holds, as every such interpretation also satisfies .
4.4.2 Correctness of Elimination of | and
We start with the following lemma:
Lemma 9.
Suppose , and . Then, and .
Proof.
In this proof we treat and as arithmetic operators with a fixed interpretation as given to them by all -interpretations. This saves us from superscripting interpretations to these symbols, which is not needed since their interpretations are fixed. We consider the required inequalities separately:
-
:
-
1.
By the semantics of , we have . Since , we get . Therefore, .
-
2.
Next, we show . Since , we have: and , where for every , . By the semantics of , we have . Notice that we can also have .
Now, we clearly have that equals , which is: . This is equal to . Clearly, this must be at most , which equals , which must be strictly smaller than .
-
1.
-
:
-
1.
By the semantics of , we have and . Therefore, , and so .
-
2.
For the other direction, since for all , we have: . By the inequality item, .
-
1.
We now justify the elimination of operator and , relying on the following equations from [18]:
For , we get get: . Translating the right-hand side of this equation to integer arithmetic with , we get , where is the bit-width of and . This can be simplified to . In order to show an equality to , which is our actual translation, it is left to show that , as stated in the first item of Lemma 9.
The justification for the elimination of operator is similar. Combining and , we get that and thus . Translating the right-hand side of this equation to integer arithmetic with , we get , which can be simplified to . To prove that this is the same as , it suffices to prove that , as stated in the second item of Lemma 9.
4.4.3 Correctness of Lemmas
The following lemmas prove the validity of our lemmas and .
Lemma 10.
Suppose . Then .
Proof.
We first prove that by induction on . If then this trivially holds. Assume that this holds for some . Then, . By the induction hypothesis, . Also, when . Hence we get . Let and suppose . Then, and , and so .
Lemma 11.
If then .
Proof.
We always have . Since and , we have that for every , and thus .
5 Evaluation
We have implemented our approach by extending two tools: the generic SMT solver API smt-switch [21] and the SMT solver cvc5 [1]. In smt-switch, we extended the SMT-LIB parser with support for parsing -formulas, and implemented our rewriter (Section 4.1) and the translation Trans (Algorithm 3). In cvc5, we implemented our procedure Solve (Algorithm 5) by extending cvc5’s non-linear arithmetic module with two subsolvers to lazily handle and . The subsolver for generalizes the -subsolver in [39], which was only able to reason over with fixed bit-widths, by allowing reasoning over symbolic bit-widths.
Note that the handling of -formulas and their translation is entirely implemented in smt-switch. Externalizing the -translation pipeline enables the use of our approach in combination with other back-end solvers that support reasoning over and operators.
5.1 Benchmarks
The SMT-LIB standard [2, 4] does not yet define a theory for parametric bit-vectors. Consequently, benchmarks encoding problems over parametric bit-vectors are not yet part of the SMT-LIB benchmark library. Thus, we evaluate our techniques utilizing a wide range of benchmark sets, originating from various sources, as detailed below.
alive (200 benchmarks).
Set alive consists of verification conditions for compiler optimizations that are generated by Alive [20]. The evaluation of [30] included 180 such conditions. Here, we include 20 additional conditions that were not supported by the implementation of [30]. Among the new conditions, 17 involve multiple bit-widths and 3 include terms of the form , representing a bit-vector of length whose unsigned integer value is .
ic (180 benchmarks).
Set ic consists of benchmarks to verify the correctness of the invertibility conditions formalized in Niemetz et al. [27], which utilizes them for quantifier instantiation. They are also instrumental to the local search procedure of [27]. The evaluation of [30] included correctness checks for 160 out of 180 conditions. Here, we add the remaining 20 checks that encode invertibility conditions over concatenation, which were not supported by the implementation of [30].
rewrite (1500 benchmarks).
syrew (1500 benchmarks).
Set syrew consists of bit-width independent versions of the equivalence checks of -terms from [31], which were enumerated by the SyGuS engine of cvc5. There, these checks were instantiated for large bit-widths to evaluate the performance of -solvers.
lemmas (70 benchmarks).
Set lemmas, also originating from [31], consists of 70 refinement lemmas describing properties of arithmetic bit-vector operators for a CEGAR-style procedure for implemented in the SMT solver Bitwuzla [25]. These lemmas were verified to be correct up to bit-width 256 [31] but are required to be correct for arbitrarily large bit-widths. This benchmark set encodes bit-width independent correctness checks of these lemmas. Note that two benchmarks involve multiple bit-widths, and thus cannot be handled by the implementation of [30].
icfb (46 benchmarks).
Set icfb originates from a local search procedure [24] that generalizes the technique from [27] and defines invertibility conditions (and conditions for a weaker notion of invertibility called consistency) over ternary bit-vectors. These conditions were verified to be correct up to bit-width 65 but are required to hold for any bit-width. This benchmark set consists of bit-width independent correctness checks of the conditions, of which 16 benchmarks involve multiple bit-widths. Note that some conditions cannot be encoded to due to the occurrence of bit-width dependent functions that require counting, e.g., the counting of leading zeroes, and were thus excluded from this set. Further note that in the original publication [24], one of the invertibility conditions for operator was given incorrectly (the implementation and verification of the rule used the correct definition) [26]. We include both versions for this condition.
mut (9442 benchmarks).
The majority of the benchmarks in the sets above are unsatisfiable. For a more thorough evaluation on satisfiable benchmarks, we created a benchmark set mut by mutating the benchmarks from the other sets. This realistically mimics the introduction of bugs in verification conditions, e.g., by using a bitwise disjunction instead of conjunction. Let be a pair of -terms or function symbols with the same arity, and let be a -formula in which occurs. A mutation of is obtained from by replacing the first occurrence of by . We generated mutations for pairs and their permutations , and filtered out duplicates.
| Feature | Baseline | Eager | Pbv |
|---|---|---|---|
| Target Theory | |||
| Multiple Bit-widths | ✕ | ||
| Lazy | ✕ | ✕ | |
| Lazy | ✕ | ✕ | |
| -elimination | ✕ | ||
| -elimination | ✕ | ||
| without | ✕ | ||
| New lemmas for | ✕ | ||
| New lemmas for | ✕ | ||
| No redundant axioms | ✕ | ||
| ✕ | |||
| -reduction | ✕ |
5.2 Experimental Setup
For our experimental evaluation, we consider three configurations: configuration Baseline, which corresponds to the best configuration of [30]; configuration Pbv, which implements our techniques as described in Section 4; and configuration Eager, which corresponds to extending configuration Baseline with our rewriter and optimizations as detailed in Table 5, while still handling operators and eagerly. Recall that the implementation of [30] utilized translations that were tailored to each considered benchmark set, which does not allow a faithful comparison to our Pbv-configurations. Thus, we reimplemented the best configuration comb from [30] as configuration Baseline in our tool. This not only allows us to evaluate Baseline on a larger benchmark set, but also ensures a fair comparison since all configurations use the same underlying cvc5 version as well as the same infrastructure for parsing, manipulating and solving formulas.
Table 5 outlines the differences between the configurations. The features considered for each configuration are listed in the first column. Target Theory specifies the underlying theory (e.g., whether and are uninterpreted or not). Accordingly, Lazy and Lazy determine whether these operators are handled lazily or eagerly. For uninterpreted functions, we use quantified axiomatizations, and No redundant axioms determines whether these axiomatizations are always added to the input formula, regardless of whether the axiomatized operator occurs in the formula. Multiple Bit-widths corresponds to supporting multiple bit-widths in the same formula. -elimination and -elimination denote the elimination of and in the translation (as described in Section 4.2). The usage of an optimization to the translation of , which eliminates a operation (as shown in Algorithm 3), is determined by without . New lemmas for and New lemmas for correspond to the usage of the new lemmas from Tables 3 and 4. determines whether rewriter from Section 4.1 is enabled, and -reduction determines whether some optimizations that eliminate nested in arithmetic operators are applied, as mentioned in Section 4.2.
Configurations Baseline and Eager rely on an eager translation. They introduce uninterpreted functions that are axiomatized by means of quantified formulas that are added as preamble to each translation. In contrast, configuration Pbv employs a translation to theory , which does not rely on quantifiers and handles operators and lazily. Further, configuration Baseline does not support multiple bit-widths in the same formula. Configuration Eager, on the other hand, incorporates several optimizations over Baseline, including the new lemmas, that are also incorporated in Pbv. Notice that operators and are handled natively in configuration Baseline, while configurations Eager and Pbv use elimination rules for them. In Appendix B, we include an extended set of results with 14 more configurations.
Remark 12.
Even though our evaluation includes benchmark sets from the evaluation in [30], the numbers are not directly comparable due to the following reasons. Each benchmark in set ic describes an equivalence, which was split into two implications in [30]. Additionally, conditional inverses were introduced there, and were incorporated into the benchmarks with the main goal of proving invertibility conditions as correct. Benchmark set rewrite was solved in a semi-automated and iterative manner: whenever a rewrite candidate was proven correct, it was included as an axiom in the benchmarks to prove. In this evaluation, we refrain from such interventions, as our goal is to measure how efficient our approach is in a fully automated setting. Further, the translation of [30] utilized quantifier-specific optimizations and included instantiation patterns as well as concrete instantiations for some of the axioms. However, such patterns are often tailored to a specific solving procedure, and concrete instantiations are only sound for unsatisfiable benchmarks (which was the focus of [30]). In configuration Baseline, we do not use any of these optimizations.
We ran our experiments on a cluster of 22 machines equipped with Intel Xeon Gold 6348 CPUs. For each solver and benchmark pair, we used a CPU time limit of 60 seconds and a memory limit of 8GB. Preliminary experiments showed that the difference in results with higher time limits were not significant. Also, no configuration exceeded the memory limit of 8GB during solving.
5.3 Results
Table 6 summarizes the results for each benchmark set described in Section 5.1, over the configurations introduced in Section 5.2, as well as the virtual best solver over these configurations in column VBS.
| Benchmarks | # | Baseline | Eager | Pbv | VBS |
|---|---|---|---|---|---|
| alive | 200 | 71 | 93 | 107 | 124 |
| ic | 180 | 43 | 58 | 77 | 81 |
| rewrite | 2006 | 658 | 1221 | 1331 | 1382 |
| syrew | 1500 | 558 | 720 | 912 | 951 |
| lemmas | 70 | 12 | 14 | 23 | 23 |
| icfb | 46 | 1 | 9 | 12 | 12 |
| mut | 9441 | 669 | 1084 | 4863 | 4915 |
| total | 13443 | 2012 | 3199 | 7325 | 7488 |
| sat | 0 | 0 | 3641 | 3641 | |
| unsat | 2012 | 3199 | 3684 | 3847 | |
| unique | 24 | 57 | 4222 | 4303 | |
| time [seconds] | 709k | 634k | 375k | 355k |
Overall, configuration Pbv significantly outperforms Baseline and Eager in number of solved instances and runtime. Comparing Pbv against Eager indicates that using a lazily handling and is superior to the quantifier-based eager approach. Comparing Eager against Baseline shows that the optimizations discussed in Section 4 are beneficial for the eager translation as well. On the 1863 instances commonly solved by all three configurations, Eager is almost times faster than Baseline while Pbv is more than times faster than Eager. On the 3060 instances commonly solved by Eager and Pbv, configuration Pbv is more than times faster. Notice that configurations Baseline and Eager do not solve any satisfiable benchmarks, due to the usage of quantified axiomatizations. In fact, the presence of these axiomatizations already prevents both Baseline and Eager from reporting that the formula is satisfiable.
Our benchmark sets include in total benchmarks that involve multiple bit-widths, which Baseline does not support. Configuration Pbv solves instances, of which are satisfiable. Configuration Eager solves instances, all of which are unsatisfiable.
Simplifying the -formula via rewriting as described in Section 4.1 is overall beneficial for both the Eager and the Pbv configurations but has a bigger impact on Eager. Without rewriting, Eager solves 474 fewer instances, while Pbv only loses 72 instances.
Proving the correctness of the invertibility conditions from [28] for arbitrary bit-widths (benchmark set ic) has previously been attempted with two different approaches: (i) the eager approach described in [30] and (ii) a bit-width independent formalization in Coq [12]. Out of the 81 invertibility conditions proven across our three configurations, 11 were not proven by (i) and (ii). Of these 11, 8 involve multiple bit-widths which neither (i) nor (ii) can handle. The remaining 3 conditions involve operators that are not supported by (ii) (signed inequalities, multiplication and division).
For benchmarks in sets lemmas and icfb, we provide the first bit-width independent verification. The lemmas and conditions in these sets were only proven correct for a range of fixed bit-widths [31, 24]. Configuration Pbv was able to confirm that the incorrect condition is indeed incorrect. However, it was unable to prove the corrected condition within the given time limit.
6 Conclusion and Further Research
We have presented a new theory of parametric bit-vectors and proposed a dedicated, lazy procedure for solving -formulas. We have shown experimentally that our techniques are a significant improvement over previous techniques in a wide range of benchmarks.
In future work, we plan to explore proof production for parametric bit-vectors, based on the algorithms presented in this paper. This will enable the integration of the proposed solver into proof assistants like Coq [11], Lean [22] or Isabelle/HOL [32], in order to increase automation in these tools for proofs involving parametric bit-vectors.
For our evaluation, we compiled sets of benchmarks for various challenging applications. These benchmarks are, however, relatively small in terms of formula size. It would be interesting to evaluate the scalability of our approach with respect to formula size. However, obtaining appropriate benchmarks for this purpose is currently challenging.222For example, parameterizing bit-vector benchmarks from SMT-LIB (which are exclusively over fixed-size bit-vectors) while preserving their semantics is not straightforward. Generating and experimenting with such benchmarks is left for future work.
References
- [1] Haniel Barbosa, Clark W. Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres Nötzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar. cvc5: A versatile and industrial-strength SMT solver. In TACAS (1), volume 13243 of Lecture Notes in Computer Science, pages 415–442. Springer, 2022. doi:10.1007/978-3-030-99524-9_24.
- [2] Clark Barrett, Pascal Fontaine, and Cesare Tinelli. The Satisfiability Modulo Theories Library (SMT-LIB). www.SMT-LIB.org, 2023.
- [3] Clark Barrett, Pascal Fontaine, and Cesare Tinelli. The SMT-LIB Standard: Version 2.7. Technical report, Department of Computer Science, The University of Iowa, 2025. Available at www.SMT-LIB.org.
- [4] Clark Barrett, A. Stump, and Cesare Tinelli. The SMT-LIB standard - version 2.0. In Proceedings of the 8th international workshop on satisfiability modulo theories, Edinburgh, Scotland,(SMT ’10), 2010.
- [5] Nikolaj S. Bjørner and Mark C. Pichora. Deciding fixed and non-fixed size bit-vectors. In TACAS, volume 1384 of Lecture Notes in Computer Science, pages 376–392. Springer, 1998. doi:10.1007/BFB0054184.
- [6] Marco Bozzano, Roberto Bruttomesso, Alessandro Cimatti, Anders Franzén, Ziyad Hanna, Zurab Khasidashvili, Amit Palti, and Roberto Sebastiani. Encoding RTL constructs for MathSAT: a preliminary report. Electronic Notes in Theoretical Computer Science, 144(2):3–14, 2006. doi:10.1016/J.ENTCS.2005.12.001.
- [7] Martin Brain, Florian Schanda, and Youcheng Sun. Building better bit-blasting for floating-point problems. In Tomás Vojnar and Lijun Zhang, editors, Tools and Algorithms for the Construction and Analysis of Systems - 25th International Conference, TACAS 2019, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019, Prague, Czech Republic, April 6-11, 2019, Proceedings, Part I, volume 11427 of Lecture Notes in Computer Science, pages 79–98. Springer, 2019. doi:10.1007/978-3-030-17462-0_5.
- [8] Raik Brinkmann and Rolf Drechsler. RTL-datapath verification using integer linear programming. In Proceedings of ASP-DAC/VLSI Design 2002. 7th Asia and South Pacific Design Automation Conference and 15h International Conference on VLSI Design, pages 741–746. IEEE, 2002. doi:10.1109/ASPDAC.2002.995022.
- [9] Alessandro Cimatti, Alberto Griggio, Ahmed Irfan, Marco Roveri, and Roberto Sebastiani. Invariant checking of NRA transition systems via incremental reduction to LRA with EUF. In Tools and Algorithms for the Construction and Analysis of Systems: 23rd International Conference, TACAS 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, Part I 23, pages 58–75. Springer, 2017. doi:10.1007/978-3-662-54577-5_4.
- [10] Alessandro Cimatti, Alberto Griggio, Ahmed Irfan, Marco Roveri, and Roberto Sebastiani. Satisfiability modulo transcendental functions via incremental linearization. In Automated Deduction–CADE 26: 26th International Conference on Automated Deduction, Gothenburg, Sweden, August 6–11, 2017, Proceedings, pages 95–113. Springer, 2017. doi:10.1007/978-3-319-63046-5_7.
- [11] The Coq development team. The coq proof assistant reference manual version 8.9, 2019. URL: https://coq.inria.fr/distrib/current/refman/.
- [12] Burak Ekici, Arjun Viswanathan, Yoni Zohar, Cesare Tinelli, and Clark W. Barrett. Formal verification of bit-vector invertibility conditions in Coq. In Uli Sattler and Martin Suda, editors, Frontiers of Combining Systems - 14th International Symposium, FroCoS 2023, Prague, Czech Republic, September 20-22, 2023, Proceedings, volume 14279 of Lecture Notes in Computer Science, pages 41–59. Springer, 2023. doi:10.1007/978-3-031-43369-6_3.
- [13] Herbert Enderton. A mathematical introduction to logic. Elsevier, 2001.
- [14] Florian Frohn and Jürgen Giesl. Satisfiability modulo exponential integer arithmetic. In IJCAR (1), volume 14739 of Lecture Notes in Computer Science, pages 344–365. Springer, 2024. doi:10.1007/978-3-031-63498-7_21.
- [15] Aarti Gupta and Allan L. Fisher. Parametric circuit representation using inductive boolean functions. In CAV, volume 697 of Lecture Notes in Computer Science, pages 15–28. Springer, 1993. doi:10.1007/3-540-56922-7_3.
- [16] Aarti Gupta and Allan L. Fisher. Representation and symbolic manipulation of linearly inductive boolean functions. In ICCAD, pages 192–199. IEEE Computer Society / ACM, 1993. doi:10.1109/ICCAD.1993.580055.
- [17] Fuqi Jia, Rui Han, Pei Huang, Minghao Liu, Feifei Ma, and Jian Zhang. Improving bit-blasting for nonlinear integer constraints. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2023, pages 14–25, New York, NY, USA, 2023. Association for Computing Machinery. doi:10.1145/3597926.3598034.
- [18] Henry S. Warren Jr. Hacker’s Delight, Second Edition. Pearson Education, 2013.
- [19] Hanna Lachnitt, Mathias Fleury, Leni Aniva, Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark W. Barrett, and Cesare Tinelli. IsaRare: Automatic verification of SMT rewrites in Isabelle/HOL. In Bernd Finkbeiner and Laura Kovács, editors, Tools and Algorithms for the Construction and Analysis of Systems - 30th International Conference, TACAS 2024, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2024, Luxembourg City, Luxembourg, April 6-11, 2024, Proceedings, Part I, volume 14570 of Lecture Notes in Computer Science, pages 311–330. Springer, 2024. doi:10.1007/978-3-031-57246-3_17.
- [20] Nuno P. Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. Provably correct peephole optimizations with alive. In PLDI, pages 22–32. ACM, 2015. doi:10.1145/2737924.2737965.
- [21] Makai Mann, Amalee Wilson, Yoni Zohar, Lindsey Stuntz, Ahmed Irfan, Kristopher Brown, Caleb Donovick, Allison Guman, Cesare Tinelli, and Clark W. Barrett. Smt-Switch: A solver-agnostic C++ API for SMT solving. In SAT, volume 12831 of Lecture Notes in Computer Science, pages 377–386. Springer, 2021. doi:10.1007/978-3-030-80223-3_26.
- [22] Leonardo de Moura and Sebastian Ullrich. The Lean 4 theorem prover and programming language. In André Platzer and Geoff Sutcliffe, editors, Automated Deduction – CADE 28, pages 625–635, Cham, 2021. Springer International Publishing. doi:10.1007/978-3-030-79876-5_37.
- [23] Alexander Nadel. Bit-vector rewriting with automatic rule generation. In Armin Biere and Roderick Bloem, editors, Computer Aided Verification - 26th International Conference, CAV 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 18-22, 2014. Proceedings, volume 8559 of Lecture Notes in Computer Science, pages 663–679. Springer, 2014. doi:10.1007/978-3-319-08867-9_44.
- [24] Aina Niemetz and Mathias Preiner. Ternary propagation-based local search for more bit-precise reasoning. In FMCAD, pages 214–224. IEEE, 2020. doi:10.34727/2020/ISBN.978-3-85448-042-6_29.
- [25] Aina Niemetz and Mathias Preiner. Bitwuzla. In Constantin Enea and Akash Lal, editors, Computer Aided Verification - 35th International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part II, volume 13965 of Lecture Notes in Computer Science, pages 3–17. Springer, 2023. doi:10.1007/978-3-031-37703-7_1.
- [26] Aina Niemetz and Mathias Preiner. Correction: Ternary Propagation-Based Local Search for more Bit-Precise Reasoning. https://bitwuzla.github.io/data/fmcad2020/fmcad2020-correction.pdf, 2025.
- [27] Aina Niemetz, Mathias Preiner, and Armin Biere. Propagation based local search for bit-precise reasoning. Formal Methods Syst. Des., 51(3):608–636, 2017. doi:10.1007/S10703-017-0295-6.
- [28] Aina Niemetz, Mathias Preiner, Andrew Reynolds, Clark W. Barrett, and Cesare Tinelli. On solving quantified bit-vector constraints using invertibility conditions. Formal Methods Syst. Des., 57(1):87–115, 2021. doi:10.1007/S10703-020-00359-9.
- [29] Aina Niemetz, Mathias Preiner, Andrew Reynolds, Yoni Zohar, Clark W. Barrett, and Cesare Tinelli. Towards bit-width-independent proofs in SMT solvers. In Pascal Fontaine, editor, Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings, volume 11716 of Lecture Notes in Computer Science, pages 366–384. Springer, 2019. doi:10.1007/978-3-030-29436-6_22.
- [30] Aina Niemetz, Mathias Preiner, Andrew Reynolds, Yoni Zohar, Clark W. Barrett, and Cesare Tinelli. Towards satisfiability modulo parametric bit-vectors. J. Autom. Reason., 65(7):1001–1025, 2021. doi:10.1007/S10817-021-09598-9.
- [31] Aina Niemetz, Mathias Preiner, and Yoni Zohar. Scalable bit-blasting with abstractions. In Arie Gurfinkel and Vijay Ganesh, editors, Computer Aided Verification - 36th International Conference, CAV 2024, Montreal, QC, Canada, July 24-27, 2024, Proceedings, Part I, volume 14681 of Lecture Notes in Computer Science, pages 178–200. Springer, 2024. doi:10.1007/978-3-031-65627-9_9.
- [32] Tobias Nipkow, Lawrence C Paulson, and Markus Wenzel. Isabelle/HOL: a proof assistant for higher-order logic, volume 2283 of Lecture Notes in Computer Science. Springer Science & Business Media, 2002. doi:10.1007/3-540-45949-9.
- [33] Andres Nötzli, Haniel Barbosa, Aina Niemetz, Mathias Preiner, Andrew Reynolds, Clark W Barrett, and Cesare Tinelli. Reconstructing fine-grained proofs of rewrites using a domain-specific language. In FMCAD, pages 65–74, 2022. doi:10.34727/2022/ISBN.978-3-85448-053-2_12.
- [34] Andres Nötzli, Andrew Reynolds, Haniel Barbosa, Aina Niemetz, Mathias Preiner, Clark W. Barrett, and Cesare Tinelli. Syntax-guided rewrite rule enumeration for SMT solvers. In SAT, volume 11628 of Lecture Notes in Computer Science, pages 279–297. Springer, 2019. doi:10.1007/978-3-030-24258-9_20.
- [35] Mark Christopher Pichora. Automated Reasoning about Hardware Data Types Using Bit-Vectors of Symbolic Lengths. PhD thesis, University of Toronto, CAN, 2003. AAINQ84686.
- [36] Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark W. Barrett, and Cesare Tinelli. cvc4sy: Smart and fast term enumeration for syntax-guided synthesis. In CAV (2), volume 11562 of Lecture Notes in Computer Science, pages 74–83. Springer, 2019. doi:10.1007/978-3-030-25543-5_5.
- [37] Cesare Tinelli and Calogero G. Zarba. Combining decision procedures for sorted theories. In Jóse Júlio Alferes and João Leite, editors, Logics in Artificial Intelligence, pages 641–653, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg. doi:10.1007/978-3-540-30227-8_53.
- [38] Zhihong Zeng, Priyank Kalla, and Maciej Ciesielski. LPSAT: a unified approach to RTL satisfiability. In Proceedings Design, Automation and Test in Europe. Conference and Exhibition 2001, pages 398–402. IEEE, 2001. doi:10.1109/DATE.2001.915055.
- [39] Yoni Zohar, Ahmed Irfan, Makai Mann, Aina Niemetz, Andres Nötzli, Mathias Preiner, Andrew Reynolds, Clark W. Barrett, and Cesare Tinelli. Bit-precise reasoning via int-blasting. In VMCAI, volume 13182 of Lecture Notes in Computer Science, pages 496–518. Springer, 2022. doi:10.1007/978-3-030-94583-1_24.
Appendix A Rewrite Rules of the Parametric Bit-Vector Rewriter
The following table lists the rewrite rules implemented in our rewriter . We denote with the bitwidth of , use a lexicographic ordering between terms, and normalize commutative operations (, , etc.) according to this ordering. For example, if , then is rewritten to itself and is rewritten to .
| Rule Name | Term | Rewritten Term |
|---|---|---|
| bv-concat-extract-merge | (concat (extract k (+ j 1) s) (extract j i s) ) | (extract k i s) |
| bv-extract-extract | (extract l k (extract j i x)) | (extract (+ i l) (+ i k) x) |
| bv-extract-whole | (extract 0 x) | x |
| bv-add-zero | (bvadd x 0) | x |
| bv-reverse-extract-and | (bvand (extract j i x) (extract j i y)) | (extract j i (bvand x y)) |
| bv-xor-simplify-2 | (bvxor x (bvnot x)) | (bvnot 0) |
| bv-or-zero | (bvor x (int_to_pbv 0)) | x |
| bv-mul-one | (bvmul x (int_to_pbv 1)) | x |
| bv-mul-zero | (bvmul x (int_to_pbv 0)) | (int_to_pbv 0) |
| bv-zero-extend-eliminate | (zero-extend 0 x) | x |
| bv-sign-extend-eliminate | (sign-extend 0 x) | x |
| bv-not-neq | (= x (bvnot x)) | false |
| bv-neg-sub | (bvneg (bvsub x y)) | (bvsub y x) |
| bv-neg-idemp | (bvneg (bvneg x)) | x |
| bv-ugt-eliminate | (bvugt x y) | (bvult y x) |
| bv-uge-eliminate | (bvuge x y) | (bvule y x) |
| bv-sgt-eliminate | (bvsgt x y) | (bvslt y x) |
| bv-sge-eliminate | (bvsge x y) | (bvsle y x) |
| bv-shl-by-const-0 | (bvshl x (int_to_pbv 0)) | x |
| bv-shl-by-const-2 | (bvshl x (int_to_pbv )) | (int_to_pbv 0) |
| bv-lshr-by-const-0 | (bvlshr x (int_to_pbv 0)) | x |
| bv-lshr-by-const-2 | (bvlshr x (int_to_pbv )) | (int_to_pbv 0) |
| bv-ashr-by-const-0 | (bvashr x (int_to_pbv 0)) | x |
| bv-bitwise-idemp-2 | (bvor x x) | x |
| bv-or-one | (bvor x (bvnot x)) | (bvnot (int_to_pbv 0)) |
| bv-xor-duplicate | (bvxor x x) | (int_to_pbv 0) |
| bv-xor-zero | (bvxor x (int_to_pbv 0)) | x |
| bv-bitwise-not-or | (bvor x (bvnot x)) | (bvnot (int_to_pbv 0)) |
| bv-xor-not | (bvxor (bvnot x) (bvnot y)) | (bvxor x y) |
| bv-not-idemp | (bvnot (bvnot x)) | x |
| bv-ult-zero-1 | (bvult (int_to_pbv 0) x) | (not (= x (int_to_pbv 0))) |
| bv-ult-zero-2 | (bvult x (int_to_pbv 0)) | false |
| bv-ult-self | (bvult x x) | false |
| bv-lt-self | (bvslt x x) | false |
| bv-ule-self | (bvule x x) | true |
| bv-ule-zero | (bvule x (int_to_pbv 0)) | (= x (int_to_pbv 0)) |
| bv-zero-ule | (bvule (int_to_pbv 0) x) | true |
| bv-sle-self | (bvsle x x) | true |
| bv-ule-max | (bvule x (bvnot x)) | true |
| bv-udiv-zero | (bvudiv x (int_to_pbv 0)) | (bvnot (int_to_pbv 0)) |
| bv-udiv-one | (bvudiv x (int_to_pbv 1)) | x |
| bv-urem-one | (bvurem x (int_to_pbv 1)) | (int_to_pbv 0) |
| bv-urem-self | (bvurem x x) | (int_to_pbv 0) |
| bv-shl-zero | (bvshl (int_to_pbv 0) x) | (int_to_pbv 0) |
| bv-lshr-zero | (bvlshr (int_to_pbv 0) x) | (int_to_pbv 0) |
| bv-ashr-zero | (bvashr (int_to_pbv 0) x) | (int_to_pbv 0) |
| bv-ult-one | (bvult x (int_to_pbv 1)) | (= x (int_to_pbv 0)) |
Appendix B More Detailed Evaluation
In addition to the experimental evaluation presented above, we have performed an in-depth ablation study of the various features listed in Table 5. In this section, we describe the results for 14 configurations we evaluated in addition to the three described in Table 5. These additional configurations are defined as described in Tables 7 and 8.
Configuration or-e is obtained from Baseline by turning on the elimination of , and similarly for xor-e. The right-most 6 configurations of Table 8 differ by turning on or off the bit-vector rewriter , as well as the arithmetic optimizations that eliminate operators. In particular, Eager - is obtained from Eager by turning both of these off.
The results of the additional configurations are shown in Tables 9 and 10. Table 9 focuses on configurations that are based on Baseline. Note that the elimination of in (configuration sh-m-e) has a greater effect than the elimination of and , since our benchmarks include more occurrences of . Table 10 includes configurations that evaluate the effect of the bit-vector rewriter . Overall, enabling the rewriter improves performance, especially for the eager configurations.
| Feature | or-e | xor-e | sh-m-e | all-e | pow2++ | piand++ |
|---|---|---|---|---|---|---|
| Target Theory | ||||||
| Multiple Bit-widths | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
| Lazy | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
| Lazy | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
| -elimination | ✕ | ✕ | ✕ | ✕ | ||
| -elimination | ✕ | ✕ | ✕ | ✕ | ||
| without | ✕ | ✕ | ✕ | ✕ | ||
| New lemmas for | ✕ | ✕ | ✕ | ✕ | ✕ | |
| New lemmas for | ✕ | ✕ | ✕ | ✕ | ✕ | |
| No redundant axioms | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
| ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | |
| -reduction | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
| Feature | pow2-l | piand-l | Eager - | Eager | Eager- | Pbv | Pbv - | Pbv- |
| Target Theory | ||||||||
| Multiple Bit-widths | ✕ | ✕ | ||||||
| Lazy | ✕ | ✕ | ✕ | ✕ | ||||
| Lazy | ✕ | ✕ | ✕ | ✕ | ||||
| -elimination | ✕ | ✕ | ||||||
| -elimination | ✕ | ✕ | ||||||
| without | ✕ | ✕ | ||||||
| New lemmas for | ✕ | |||||||
| New lemmas for | ✕ | |||||||
| No redundant axioms | ✕ | ✕ | ||||||
| ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | |||
| -reduction | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ |
| Benchmarks | # | or-e | xor-e | sh-m-e | all-e | pow2++ | piand++ | VBS |
|---|---|---|---|---|---|---|---|---|
| alive | 200 | 71 | 71 | 71 | 71 | 71 | 71 | 72 |
| ic | 180 | 44 | 44 | 48 | 49 | 43 | 43 | 50 |
| rewrite | 2006 | 692 | 653 | 724 | 758 | 669 | 633 | 809 |
| syrew | 1500 | 561 | 556 | 604 | 603 | 570 | 551 | 625 |
| lemmas | 70 | 12 | 12 | 13 | 13 | 12 | 12 | 13 |
| icfb | 46 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| mut | 9441 | 692 | 663 | 774 | 794 | 669 | 634 | 843 |
| total | 13443 | 2071 | 2000 | 2235 | 2289 | 2033 | 1945 | 2413 |
| sat | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| unsat | 2071 | 2000 | 2235 | 2289 | 2033 | 1955 | 2413 |
| Benchmarks | # | pow2-l | piand-l | Eager - | Eager | Eager- | Pbv | Pbv - | Pbv- | VBS |
|---|---|---|---|---|---|---|---|---|---|---|
| alive | 200 | 63 | 74 | 87 | 95 | 85 | 105 | 97 | 100 | 125 |
| ic | 180 | 55 | 45 | 58 | 59 | 59 | 75 | 75 | 77 | 83 |
| rewrite | 2006 | 681 | 797 | 732 | 925 | 1068 | 1296 | 1148 | 1189 | 1381 |
| syrew | 1500 | 580 | 593 | 608 | 721 | 606 | 911 | 796 | 798 | 956 |
| lemmas | 70 | 17 | 13 | 14 | 14 | 14 | 23 | 23 | 23 | 25 |
| icfb | 46 | 1 | 1 | 9 | 9 | 9 | 11 | 11 | 12 | 12 |
| mut | 9441 | 732 | 733 | 885 | 902 | 1088 | 4832 | 4840 | 4881 | 5138 |
| total | 13443 | 2129 | 2256 | 2393 | 2725 | 2929 | 7253 | 6990 | 7080 | 7720 |
| sat | 0 | 0 | 0 | 0 | 0 | 3652 | 3657 | 3651 | 3841 | |
| unsat | 2129 | 2256 | 2393 | 2725 | 2929 | 3601 | 3333 | 3429 | 3879 |
