Super-Critical Trade-Offs in Resolution over Parities via Lifting
Abstract
Razborov [24] exhibited the following surprisingly strong trade-off phenomenon in propositional proof complexity: for a parameter , there exist -CNF formulas over variables, having resolution refutations of width , but every tree-like refutation of width needs size . We extend this result to tree-like Resolution over parities, commonly denoted by Res(⊕), with parameters essentially unchanged.
To obtain our result, we extend the lifting theorem of Chattopadhyay, Mande, Sanyal and Sherif [11] to handle tree-like affine DAGs. We introduce additional ideas from linear algebra to handle forget nodes along long paths.
Keywords and phrases: Proof complexity, Lifting, Resolution over parities
Funding: Arkadev Chattopadhyay: Funded by the Department of Atomic Energy, Government of India, under project no. RTI4001, and a Google India Research Award.
2012 ACM Subject Classification: Theory of computation → Proof complexity
Acknowledgements: We would like to thank an anonymous reviewer for pointing out the short and elegant proof of Theorem 10 that we include here. Our original proof was complicated.
Editors: Srikanth Srinivasan
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Understanding trade-offs among complexity measures in a computational model is a well-known and interesting theme, with many published results (for example, time-space trade-offs [15, 16, 26], rounds-communication trade-offs [21, 10, 2] and space-size trade-offs in propositional proof complexity [6, 3, 5]). Typically, these trade-offs show that in various models of computation, simultaneous optimization of two complexity measures, like space and time, or rounds and total communication, or space and width (in refuting CNF formulas), is not always possible. In particular, trying to optimize one complexity measure necessarily leads to a huge blow-up in the other measure. For instance, in Yao’s 2-party model of communication, the Greater-Than function can be computed in round. It can also be computed using randomized protocols of communication cost . But every -round protocol requires communication cost . On the other hand, every function has a protocol of cost . In all of the trade-off results cited above, the general story was that trying to optimize the use of one resource led to the cost with respect to the other resource shooting up to the cost needed by a naive/generic algorithm.
In 2016, Razborov [24] exhibited formulas for which a very different and extreme kind of trade-off holds in the propositional proof system of resolution. Although these unsatisfiable -CNF formulas on variables have refutations of width , every one of their tree-like refutations of width less than has size . That is, despite the fact that every -variable formula has a generic tree-like refutation of size , these exhibited formulas, which do have refutations of small width, require super-critical tree-like refutation size whenever the width is mildly restricted. Moreover, the super-critical size is in fact exponentially larger than the generic upper bound. Razborov remarked that such a phenomenon seemed extremely rare in the known body of trade-off results in the computational complexity literature. In concluding his work, he urged finding more instances of such trade-offs. In response, follow-up works have appeared. They can be classified into two types: those that continue to focus on resolution, and those that consider more powerful proof systems. Examples of the former include work by Berkholz and Nordström [8], who showed super-critical trade-offs between width and space. A recent work of Berkholz, Lichter and Vinall-Smeeth [7] proves super-critical trade-offs between narrow resolution width and tree-like size for refuting the isomorphism of two graphs.
The second type of work answers Razborov’s call by finding such trade-offs in stronger proof systems. This includes the recent work of Fleming, Pitassi and Robere [14], who first showed that the argument of Razborov extends to general resolution DAGs. They then used it, along with appropriate lifting theorems, to prove trade-offs between size and depth for DAG-like Resolution, Res, and cutting planes. In very recent progress in the area, de Rezende, Fleming, Jannet, Nordström, and Pang [12], and Göös, Maystre, Risse, and Sokolov [18] independently showed super-critical trade-offs not only for various proof systems but also the first super-critical depth-size trade-offs for monotone circuits. Our work also falls in this second type, as we study tree-like resolution over parities, which generalizes tree-like resolution.
We exhibit super-critical trade-offs for width and tree-like size/depth in the style of Razborov for resolution over parities, denoted by Res(⊕). This system, introduced by Itsykson and Sokolov [19, 20], is one of the simplest generalizations of resolution for which obtaining super-polynomial lower bounds on the size of refutations is a current, well-known challenge. Very recent works (see [13, 9]) managed to obtain exponential lower bounds on the size of regular proofs in this system.
Our work here concerns tree-like proofs. Lower bounds for them were obtained by Itsykson and Sokolov [20] themselves. More recently, two independent works, one by Beame and Koroth [4] and the other by Chattopadhyay, Mande, Sanyal and Sherif [11], proved lifting theorems that yielded a systematic way of lifting tree-like resolution width complexity to strong lower bounds on the size of tree-like proofs for formulas lifted with constant-size gadgets. In this paper, we extend the lifting theorem of Chattopadhyay et al. [11] in the following manner. Their result was applicable to parity decision trees (duals of tree-like proofs) that only had usual nodes where the algorithm queried (correspondingly, the proof resolved on) a linear form. We call such nodes query nodes. On the other hand, we want to deal here with width-bounded proofs that could be much deeper than , the total number of variables of the formula. This corresponds to parity decision trees whose height is much larger than , and therefore there are necessarily nodes that forget. The affine space corresponding to such a forget node is strictly contained in the affine space corresponding to ’s only child node . Alternatively, in the bottom-up view of the corresponding proof, the linear clause at is strictly weakened to get the linear clause at . Dealing with such nodes, so that the width of the (ordinary) clauses in the extracted resolution proof never exceeds the corresponding width of the linear clauses, is the main technical contribution of this work. Thus, we establish a depth-to-size lifting result from tree-like Res(⊕) of arbitrary depth to tree-like resolution, which also preserves the width of the refutation.
Theorem 1.
Let be a lift of a contradiction by an appropriate gadget . Suppose there is a tree-like refutation for with size and width . Then, there is a tree-like resolution refutation for with depth at most and width at most .
Remark 2.
We point out the precise difference between our Theorem 1 and the earlier lifting theorem of Chattopadhyay et al. [11]. The earlier theorem, given a tree-like refutation of in of size and width , would have extracted a tree-like refutation of in ordinary resolution of depth , with no guarantees on the width of this refutation. In fact, the width could get as large as the depth of the extracted refutation, i.e., . In super-critical trade-offs, which are our chief interest here, the width of the given refutation of could be exponentially smaller than . This renders the earlier lifting theorem unusable for demonstrating such trade-offs.
Applying Theorem 1 to the trade-off by Razborov [24], we immediately obtain an analogous trade-off in the proof system.
Theorem 3.
Let be any parameter and let be an arbitrary constant. Then, there exists a -CNF contradiction over variables such that there is a resolution refutation for with width at most , but for every tree-like refutation for with , we have the bound .
The contradiction from the previous theorem is a lift of the contradiction constructed by Razborov [24] by an appropriate gadget of constant size. A caveat of (as Razborov also noted) is that the number of clauses of is . Naturally, this caveat is inherited by our contradiction . This issue was addressed in very recent work of de Rezende et al. [12]. They provided a contradiction such that the size of its tree-like resolution refutations of bounded width is super-exponential not just in the number of variables but also in the size of the formula. However, the bound on the width of the tree-like resolution for which super-critical size is needed is much stricter than in Razborov’s result. They showed the super-critical trade-off only for tree-like resolution of width smaller than , where is the width of the resolution refutation of . Since our gadget has size 3, we can only guarantee a resolution refutation of width at most for the lifted formula . Thus, we cannot lift their super-critical trade-off to , as it is extremely sensitive to width. Razborov’s result [24], on the other hand, is more robust, making it possible to lift it by our Theorem 1.
If one were able to construct another formula, improving the state of the art in super-critical trade-offs between width and size of tree-like resolution, that is not so sensitive to the width of the refutations, this improvement could be combined with our simulation theorem (Theorem 1) and lifted to tree-like .
Relation to Other Recent Works
The proof system has been an active area of research. Recently, Efremenko, Garlík, and Itsykson [13] showed that the binary pigeonhole principle formula requires an exponential-size refutation within the so-called bottom-regular . The bottom-regular is a fragment of that contains both the tree-like and regular resolution proof systems. Furthermore, Bhattacharya, Chattopadhyay, and Dvořák [9] showed that bottom-regular cannot polynomially simulate even ordinary, but DAG-like, resolution. This separation was very recently improved quantitatively by Alekseev and Itsykson [1].
Furthermore, Alekseev and Itsykson [1] established a width-to-width lifting from resolution to . They proved this contrapositively: if there is no resolution refutation of a contradiction with width , then there is no width- refutation of a lift of by an appropriate gadget . They utilized a game interpretation of resolution and to prove their lifting theorem. While their proof is quite short, it is unclear whether their technique can be adapted to prove the depth-to-size lifting theorem that we need in order to show the trade-off in (our Theorem 3). In particular, their theorem seems incomparable to the depth-to-size lifting of Chattopadhyay et al. [11]. On the other hand, since any refutation can be expanded into a tree-like refutation (with a possible exponential blow-up in size), our lifting theorem (Theorem 1) immediately implies the width-to-width lifting theorem of Alekseev and Itsykson (though our proof seems more involved). Hence, our Theorem 1 effectively provides a common generalization of the width-to-width lifting of Alekseev and Itsykson [1] and the depth-to-size lifting of Chattopadhyay et al. [11]. Moreover, we use a completely different technique than Alekseev and Itsykson [1]. Specifically, we establish our lifting theorem directly by constructing a tree-like resolution refutation for a contradiction simulating a tree-like refutation for . To achieve this, we use some ideas from linear algebra that, to our knowledge, have not been previously utilized in the context of lifting.
Overview of Our Ideas
Overall, the ideas behind the proof of Theorem 1 are inspired by the work of Chattopadhyay et al. [11]. However, they did not consider refutations of limited width. Thus, they only needed to process query nodes to prove their lifting theorem. In contrast, our setting also involves forget nodes, where a linear equation from the span of previously queried equations is forgotten. It turns out that processing forget nodes is non-trivial. In particular, an affine space can be viewed in two ways: the first is via the (linear) space of constraints, which can be thought of as the dual view. The primal view is that of the set of vectors lying in the space, represented by a basis of the underlying vector space and a shift vector. Previously, in [11], the dual view was very effectively used for depth-to-size lifting in the absence of forget nodes. This is because a query node naturally adds a constraint to the dual space. On the other hand, a forget node increases the dimension of the affine space. This new space is not conveniently representable with respect to the basis maintained for the dual space of constraints just before the “forgetting” happens. Here the primal view seems more helpful, as any basis of a space can be extended to a basis of a space whenever . The main tool we use is a characterization, via Theorem 10, of the constraint space of in terms of the constraint space of , where . The proof of this turns out to be simple. (Our original proof was complicated; the much simpler proof we present here was pointed out to us by an anonymous referee.) With more ideas, including a new notion of strongly stifled gadgets that extends the earlier notion of stifling introduced in [11], Section 7 yields the process of dealing with forget nodes.
2 Tree-like Proofs and Decision Trees
A proof in a propositional proof system starts from a set of clauses , called axioms, that is purportedly unsatisfiable. It derives the empty clause from the axioms using inference rules. The main inference rule in standard resolution, called the resolution rule, derives a clause from clauses and (i.e., we resolve on the variable ). If we can derive the empty clause from the original set , then this proves that the set is unsatisfiable.
Resolution over parities (Res(⊕)) is a generalization of standard resolution that uses linear clauses (disjunctions of linear equations in ) to express the lines of a proof. It consists of two rules:
- Resolution Rule: From linear clauses and derive a linear clause .
- Weakening Rule: From a linear clause derive a linear clause that is semantically implied by (i.e., any assignment satisfying also satisfies ).
The length of a resolution (or ) refutation of a formula is the number of applications of the rules above in order to refute the formula. The width of a resolution (or ) refutation is the maximum width of any (linear) clause that is used in the proof. A (linear) resolution proof is tree-like if the resolution rule is applied in a tree-like fashion. The depth of a tree-like proof is the depth of the underlying tree (i.e., the length of the longest path from the root to a leaf).
We can replace the general resolution rule with a canonical one:
- Canonical Resolution Rule: From linear clauses and derive a linear clause .
Using the canonical resolution rule instead of the general one does not make the proof system substantially weaker. If we want to apply the resolution rule to the clauses and , we can apply the weakening rule to both of them to get linear clauses and , and then apply the canonical resolution rule to derive the clause . Thus, from a tree-like refutation of a contradiction , we can derive an equivalent tree-like refutation that uses only the canonical resolution rule (and the weakening rule) with , , and , as for each application of the resolution rule in we add two applications of the weakening rule in , and the width might increase by 1, since we introduce the clauses and while only the clause was present in .
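The conversion just described can be displayed schematically. Here $C_1, C_2$ stand for the side linear clauses and $f$ for the resolved linear form; this notation is ours, chosen only to mirror the description above:

```latex
\[
  \underbrace{\frac{C_1 \lor (f=0)}{C_1 \lor C_2 \lor (f=0)}}_{\text{weakening}}
  \qquad
  \underbrace{\frac{C_2 \lor (f=1)}{C_1 \lor C_2 \lor (f=1)}}_{\text{weakening}}
  \qquad\Longrightarrow\qquad
  \frac{C_1 \lor C_2 \lor (f=0) \qquad C_1 \lor C_2 \lor (f=1)}
       {C_1 \lor C_2}
  \;\;\text{(canonical rule)}
\]
```

Each application of the general rule thus costs two extra weakenings, and the width grows by at most one, matching the bounds stated above.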
It is known that a tree-like resolution (or ) proof for an unsatisfiable set of clauses corresponds to a (parity) decision tree for a search problem defined as follows. For a given assignment of the variables of , one needs to find a clause in that is not satisfied by (at least one exists as the set is unsatisfiable). The correspondence holds even for general (not only tree-like) proofs (see for example Garg et al. [17], who credit it to earlier work of Razborov [23] that was simplified by Pudlák [22] and Sokolov [25]), but in this paper, we are interested only in tree-like proofs.
Let , where is a set of possible outputs. A forgetting parity decision tree (FPDT) computing is a tree such that each node has at most two children and the following conditions hold:
- Each node is associated with an affine space .
- Every node with two children and is called a query node. There is a linear query such that and , or vice versa. We say that is the query at .
- Every node with exactly one child is called a forget node. It holds that .
- Each leaf is labeled by such that for all , it holds that .
- For the root , .
The size of an FPDT is the number of nodes of , and the width of an FPDT is the largest integer such that there exists an affine space of co-dimension at least associated with some node of . The depth of is denoted . Note that there are no forget nodes in a standard parity decision tree. Thus, for such trees, the width is exactly the depth of the tree. This no longer holds in our model, because we may “forget” some linear queries that have been made earlier.
A forgetting decision tree (FDT) is defined similarly to an FPDT, but cubes, instead of affine spaces, are associated with the nodes. Consequently, the width of an FDT is the maximum number such that there exists a cube of width at least associated with some node of , and queries of single variables replace the linear queries at nodes.
The correspondence between F(P)DTs and tree-like resolution (or ) proofs using only the canonical resolution rule is the following. We represent a (linear) resolution proof as a tree whose nodes are associated with (linear) clauses. The leaves are associated with clauses of and the root is associated with the empty clause. Each node with two children corresponds to an application of the canonical resolution rule, and each node with exactly one child corresponds to an application of the weakening rule. To get an F(P)DT for , we just negate the clauses associated with the nodes. Thus, each node is associated with a cube (or an affine space in the case of /FPDT). Moreover, the cube (or affine space) associated with a node of an F(P)DT contains exactly the falsifying assignments of the (linear) clause associated with the corresponding node of the tree-like refutation that corresponds to . It is clear that the width and the depth of such a decision tree are exactly the same as the width and the depth of the corresponding tree-like refutation, and the length equals the number of inner nodes of (as the inner nodes of correspond to the applications of the resolution rule).
We say an FPDT is canonical if for each forget node of and its only child , it holds that . We say an FPDT is succinct if the parent of each forget node is a query node. Note that any FPDT can be transformed into an equivalent canonical (or succinct) FPDT by expanding forget nodes into paths of forget nodes (or contracting paths of forget nodes to single vertices).
Consider an FPDT and its succinct form . Note that the number of query nodes of and is the same, and analogously the number of query nodes on a root-leaf path in equals the number of query nodes on the corresponding path in . Thus, for an FPDT we define its query size and query depth to be the number of query nodes of and the maximum number of query nodes on a root-leaf path of , respectively.
Observation 4.
Let be a refutation of a contradiction . Then, there is a canonical FPDT computing with and .
Proof.
As discussed above, we first modify into a refutation that uses only the canonical resolution rule (and the weakening rule). By this modification, we have , and the number of applications of the resolution rule in is exactly the number of applications of the canonical resolution rule in . From , we derive an FPDT computing , which we finally modify into an equivalent canonical FPDT . The width of and is the same, and the query size of is exactly the number of applications of the canonical resolution rule in . The modification of into the canonical FPDT changes neither the width nor the query size of the tree. Thus, we have and .
Observation 5.
Let be a succinct FPDT computing and be the tree-like refutation of corresponding to . Then, .
Proof.
As mentioned above, the length of equals the number of inner nodes of . The tree has at most query nodes. Since is succinct, the number of forget nodes is at most twice the number of query nodes, as each query node may have at most two forget nodes as its children, and no child of a forget node is itself a forget node. Thus, .
3 Lifting of Relations and Formulas
Let be a Boolean function. For a relation we define its lift as
where .
For a CNF over variables , let be the following lift of over the variables . For any clause of , let be the set of variables of , and let be the unique falsifying assignment of . Then,
where is the following literal:
Now, the clauses of are .
Observation 6.
For a clause , an assignment of falsifies if and only if , i.e., falsifies .
4 Notation
We use the following notation. For a vector we use both and to denote the -th entry of ; similarly, for a matrix we use and to denote the entry in the -th row and -th column. For an ordered set of indices , we denote by the subvector of given by , i.e., . As an abbreviation, we use to denote the vector without the -th entry, i.e., .
5 Stifling
In this section, we extend the notion of stifling introduced by Chattopadhyay et al. [11]. Let be a Boolean function. For and , we say a partial assignment is an -stifling pattern for if assigns a value to every bit except the -th, and for any such that , we have . In words, assigns a value to all but the -th bit, and when we extend to a full assignment , it holds that no matter how we set the value of the -th bit.
Definition 7.
A Boolean function is strongly stifled if there is a collection where each is an -stifling pattern for and
The collection is called a converting collection of stifling patterns of .
Chattopadhyay et al. [11] defined a stifled function (namely, 1-stifled) as a function such that for each and , there is an -stifling pattern for . In this work, we require not only the existence of the stifling patterns but the stronger property that we can convert the stifling patterns into each other. More formally, consider an -stifling pattern from the collection (from the definition above). Let an adversary give us a set of coordinates . Then, we are able to pick a coordinate such that the stifling pattern is equal to on all coordinates in .
By a simple verification we can show that indexing of two bits and majority of 3 bits are strongly stifled functions, where and if and only if .
Observation 8.
The functions and are strongly stifled.
Further, the notion of being strongly stifled is strictly stronger than the original notion of being stifled, because the inner product function of 2-bit vectors is stifled [11] but not strongly stifled, where .
Observation 9.
The function is not strongly stifled.
For more details, see the appendix.
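The plain stifling property can be verified by brute force. The sketch below is our own code: the concrete convention for the selector bit of IND is an assumption, and the converting-collection condition of Definition 7 is deliberately not checked, so this only certifies the weaker (1-)stifled property for all three gadgets mentioned above.

```python
import itertools

def maj3(x):          # majority of 3 bits
    return 1 if sum(x) >= 2 else 0

def ind(x):           # indexing: selector bit x[0] picks one of the two data bits
    return x[1 + x[0]]   # assumed convention for IND

def ip2(x):           # inner product of 2-bit vectors
    return (x[0] * x[1] + x[2] * x[3]) % 2

def stifling_patterns(f, n, i, b):
    """All (i,b)-stifling patterns: assignments to the bits other than i
    that force f to output b no matter how bit i is set."""
    pats = []
    for bits in itertools.product((0, 1), repeat=n - 1):
        if all(f(list(bits[:i]) + [v] + list(bits[i:])) == b for v in (0, 1)):
            pats.append(bits)
    return pats

def is_stifled(f, n):
    """f is (1-)stifled if an (i,b)-stifling pattern exists for every i and b."""
    return all(stifling_patterns(f, n, i, b)
               for i in range(n) for b in (0, 1))

print(is_stifled(maj3, 3), is_stifled(ind, 3), is_stifled(ip2, 4))
```

All three checks succeed: each function is stifled. Observations 8 and 9 concern the stronger converting-collection property, which separates MAJ and IND from the inner product and which this sketch does not test.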
6 Linear Algebraic Tools
Let be an affine space over a field . A constraint representation of is a system of linear equations where and for some such that . The columns of correspond to the variables of the system and the rows of correspond to the constraints. We say a constraint contains a variable if . A matrix is in echelon form if there are columns such that for all it holds that
Thus, the submatrix of induced by the columns is the identity matrix . The variables corresponding to the columns are the dependent variables of the system and the remaining variables are independent. The -th entry of the -th row of is called the pivot of the -th row. We say a constraint representation is in echelon form if the matrix is in echelon form.
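To make the echelon-form convention concrete, here is a minimal GF(2) row-reduction sketch (our own code, not from the paper): after reduction, each pivot column contains a single 1, so the submatrix on the pivot columns is an identity matrix and the pivot variables are exactly the dependent ones.

```python
def echelon_gf2(M):
    """Bring a 0/1 matrix to reduced echelon form over GF(2).
    Returns the nonzero rows and the list of pivot columns."""
    rows = [row[:] for row in M]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        # find a row at position >= r with a 1 in column c
        pr = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        for i in range(len(rows)):       # clear column c in every other row
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows[:r], pivots

R, piv = echelon_gf2([[1, 1, 0, 1], [0, 1, 1, 0], [1, 0, 1, 1]])
print(R, piv)
```

In this toy system the third row is the sum of the first two, so only two independent constraints survive, with pivot columns 0 and 1; the remaining columns correspond to independent variables.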
Let be a matrix and be a non-zero vector. We define a matrix , where arises from by adding the -th row to all rows such that and then deleting the row . Analogously, we define the operation for a constraint representation of an affine space , where we treat the vector as the last column of the matrix . It turns out that is the only operation needed to get a constraint representation of a super-space, as shown in the following theorem, which will be the key technical tool for processing forget nodes while proving our main lifting theorem.
Theorem 10.
Let be two affine spaces such that . Let be a constraint representation in the echelon form of such that . Then, there is a non-zero vector such that the following is true: for every with , is a constraint representation of in echelon form.
We call the vector given by Theorem 10 a forgetting vector, because it allows us to forget one constraint in the representation of to get a representation of . Note that the dimension of equals the number of equations in the system . We say contains a constraint if .
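The forget operation can be illustrated over GF(2). In the sketch below (our own code; the row-selection condition of the definition, partly lost in this rendering, is read as "all rows j with v_j = 1"), every row of the new system is a linear combination of old rows, so the solution space can only grow; with the toy forgetting vector chosen here, it grows by exactly one dimension and, as Theorem 10 promises, the result does not depend on which selected row is deleted.

```python
import itertools

def phi(M, v, i):
    """Add row i to every other row j with v[j] = 1, then delete row i.
    M is an augmented system [A | b] over GF(2); v is the forgetting vector."""
    assert v[i] == 1
    rows = [row[:] for row in M]
    for j in range(len(rows)):
        if j != i and v[j]:
            rows[j] = [a ^ b for a, b in zip(rows[j], rows[i])]
    del rows[i]
    return rows

def solutions(M, n):
    """Brute-force solution set of the augmented system M over GF(2)^n."""
    return {x for x in itertools.product((0, 1), repeat=n)
            if all(sum(a * xi for a, xi in zip(row[:-1], x)) % 2 == row[-1]
                   for row in M)}

# Constraints x1 = 1, x2 = 0, x3 = 1; forget the combination row0 + row1.
M = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 1]]
v = [1, 1, 0]
print(solutions(M, 3), solutions(phi(M, v, 0), 3))
```

Deleting row 0 or row 1 (the two coordinates where v is 1) yields the same enlarged affine space: the single point (1, 0, 1) grows to the line {(1, 0, 1), (0, 1, 1)}.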
Theorem 10 follows from the following well-known lemma, which we will prove for completeness. Let be two vector spaces such that .
Lemma 11.
For any , it holds that .
Proof.
Since is in but not in , we have that by the dimensions of and . Since as well, we have that for an appropriate vector . It follows that .
Corollary 12.
Let be a basis of . Let . Let , and for let
Then, is a basis of .
Proof.
The vectors ’s are linearly independent as the ’s are linearly independent. By definition and Lemma 11, all the ’s are in . Since , the vectors ’s have to form a basis of .
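The formula of Lemma 11 is not fully rendered above, but by its proof it amounts to the following fact: when a subspace U has co-dimension 1 in W, the set W \ U is a single coset of U, so the sum of any two of its vectors lands back in U. A brute-force check over GF(2) on a toy pair of spaces (the names U and W and the example are ours):

```python
import itertools

def span(gens, n):
    """All GF(2)-linear combinations of the generators."""
    vecs = set()
    for coeffs in itertools.product((0, 1), repeat=len(gens)):
        vecs.add(tuple(sum(c * g[k] for c, g in zip(coeffs, gens)) % 2
                       for k in range(n)))
    return vecs

U = span([(1, 1, 0)], 3)                 # dimension 1
W = span([(1, 1, 0), (0, 0, 1)], 3)      # dimension 2, so U has co-dimension 1 in W
outside = W - U
# the sum of any two vectors of W \ U lies in U
assert all(tuple((a + b) % 2 for a, b in zip(w1, w2)) in U
           for w1 in outside for w2 in outside)
print(sorted(outside))
```

This is exactly the mechanism behind Corollary 12: adding one fixed vector of W \ U to every basis vector that falls outside U pushes it into U while preserving linear independence.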
7 Simulation
In this section, we prove our lifting theorem.
Theorem 13 (Theorem 1 stated for F(P)DT).
Let be a relation and be a canonical FPDT computing , where is a strongly stifled gadget. Then, there is an FDT computing such that and .
Algorithm
We prove Theorem 13 by simulation. On an input , the constructed FDT simulates the given FPDT on an input (for ) with , by traversing a path from the root of to a leaf. The main loop of the simulation is quite simple. We start at the root of and in each iteration we process the current node of and pick a new node. Sometimes, during the processing of a node of , we query or forget a bit of . When we reach a leaf of , we just output the value of . The main loop is summarized in Algorithm 1.
Let be the current node of we have just encountered. We maintain a constraint representation in echelon form of the affine space . We store the queried (and not forgotten) bits of in a partial assignment . Let be the set of all possible extensions of and the number of fixed bits of . Thus, is a cube and is its width. Our goal is that any is represented in (i.e., there is such that ), and that is at most the co-dimension of .
An input of is divided into blocks , each of size , and each such block corresponds to exactly one entry of . Formally, . During the simulation of , we divide the blocks into two groups – free and fixed. Fixed blocks correspond to the entries of that were queried and were not forgotten – i.e., the entries fixed by . The other blocks are free.
For each fixed bit of , we have a unique constraint of such that the pivot of the constraint is in the block . The constraint is called the primary constraint of . The dependent variables of primary constraints are called marked variables. We will keep the invariant that each marked variable is in a different block, i.e., each fixed block contains a unique marked variable. The other (non-marked) variables of the fixed blocks are called stifling variables. We will assign values to the stifling variables according to appropriate stifling patterns, in such a way that a fixed block is always mapped to , independently of the value of the marked variable of . Further, we will assign a value to the marked variable in a block in such a way that the assignment satisfies the corresponding primary constraint. Since each marked variable is in a unique primary constraint, we will show that such an assignment of all marked variables always exists.
The constraints that are not primary for any block are called secondary. It will turn out that all secondary constraints contain variables only from fixed blocks. All variables of free blocks are called free. Thus, the matrix has the following form (after rearranging columns):
Let be a converting collection of stifling patterns of , given by the assumption. Let be the following partial assignment:
Note that the partial assignment assigns values exactly to the stifling variables of . We will keep the invariant that if we set all stifling variables according to , all secondary constraints of will be satisfied. This will help us to ensure that each is represented in .
At the beginning of our simulation, we are at the root of . Since , the matrix is empty and all variables are free, because we have not queried any entry of yet. Also, the patterns and do not contain any fixed bit, i.e., they are equal to , or , respectively.
Algorithm 1 (main simulation loop). Input: an input for the FDT and an FPDT with the root computing the lifted relation. After initialization at the root, the simulation repeatedly processes the current node as described below until a leaf is reached.
During the simulation, we will maintain the following invariants. This will help us to ensure that each is represented in .
- Invariant 1: The system of equations is a constraint representation of in the echelon form.
- Invariant 3: All variables of all free blocks are independent.
- Invariant 5: For each fixed bit of , there is a unique constraint of such that the pivot (i.e., the marked variable) of the constraint is in the block .
- Invariant 7: The partial assignment assigns values to all stifling variables and any extension of to a full assignment satisfies all secondary constraints.
Note that Invariant 7 implies that secondary constraints of contain only variables of fixed blocks. We will show that these invariants hold for any node of at the moment when we check whether is a leaf – i.e., at Line 4 of Algorithm 1. Clearly, the invariants hold for the root of . Now, we describe how we process a node (depending on whether is a query node or a forget node). We suppose the invariants hold for . During the processing, we pick an appropriate child of and make it the new current node. Subsequently, we argue why the invariants hold for .
We remark that the query node processing is a careful adaptation of the node processing given by Chattopadhyay et al. [11]. All new machinery (strongly stifled function and obtaining a constraint representation of a super-space by Theorem 10) is used only for the processing of forget nodes.
Query Nodes
When is a query node, introduces a new parity query , and if , the computation of proceeds to the left child of , otherwise to the right child . Our goal is to pick an appropriate child of and create the system representing satisfying all our requirements. Let us start with a system , where
with being a parameter equal to 0 or 1. We fix the value of when we pick the appropriate child of as the new node. The system surely represents the space ; however, it might not satisfy our requirements (for example, the matrix might not be in echelon form). Note that the matrix does not depend on the value of . We perform one more pivoting step of Gaussian elimination to get the system into echelon form, i.e.,
1. We zero out all coefficients in corresponding to the dependent variables in , to get a new constraint , where is a function of . We call the new constraint the reduced form of the constraint .
2. We pick one of the remaining variables contained in as a new dependent variable, we pick an appropriate child of , and we set the value of (and ) accordingly.
3. We zero out all coefficients corresponding to in all original constraints from the system to get the new system .
It is clear that the new system is a constraint representation of (i.e., Invariant 1 will hold for ). The crucial part is to pick a new dependent variable in Step 2 of the executed Gaussian elimination. Note that the reduced constraint does not contain any marked variable, as all marked variables are dependent and thus are zeroed out from in Step 1 of the executed pivoting step. There are two cases to consider.
Case 1.
The new constraint contains only variables of the fixed blocks. Then, the new constraint becomes a secondary constraint, and the new dependent variable can be any variable of . Since the constraint contains only variables of fixed blocks (but no marked variable), we can assign a value to all variables of given by . Thus, there is such that for any extension of , it holds that . Then, we pick the appropriate child of that gives us the right value of (and ) such that the new constraint holds for any extension of . This ensures that Invariant 7 holds for .
We did not query any new bit of in this case. It follows that the partial assignment and the sets of fixed and free blocks are unchanged. The set of primary constraints is unchanged as well. Further, the set of pivots of is not changed by the pivoting step of the Gaussian elimination. Thus, Invariant 5 holds for . Invariant 3 holds because the constraint does not contain any free variable of and thus the new dependent variable cannot be from a free block.
Case 2.
The new constraint contains at least one variable of a free block . In this case, we can pick the new vertex as an arbitrary child of . Let be the subtree of rooted at a node of . We compare the query sizes of the subtrees and and pick to be the root of the subtree with the smaller query size, i.e., , where is the other child of .
We query and update the partial assignment by the value of to get . The block becomes a fixed block. The new constraint becomes the primary constraint of and the variable becomes the pivot of , i.e., becomes a marked variable. Since the set of pivots of is not changed, Invariant 5 holds for . Since the only new dependent variable is , Invariant 3 holds as well.
The partial assignment differs only at the block from (, and ). Since the block was free in , no secondary constraint of contains any variable of the block . Thus, no secondary constraints of were changed by the pivoting step in this case. The new constraint is primary. Thus, no secondary constraint of contains any variable of the block either. Therefore, any extension of still satisfies all secondary constraints of and Invariant 7 holds for .
See Algorithm 2 for a summary of the query node processing.
Forget Nodes
In the case when is a forget node, the node has a single child and . We have the constraint representation of maintained by our simulation for and (where is the co-dimension of ). For processing the forget node, we introduce a classification of stifling variables. The variables of fixed blocks that are contained in the secondary constraints are called dangerous. (Our notion of dangerous variables is not to be confused with the notion of dangerous constraints used by Efremenko et al. [13].) Note that the marked variables cannot be dangerous. The remaining variables of fixed blocks (i.e., non-marked and non-dangerous) are called safe. Thus, with this new classification, the matrix has the following form:
Let be a forgetting vector given by an application of Theorem 10 to the spaces and . The new system is obtained after applying for a right choice of (the function is defined in Section 6). By Theorem 10, the system is a constraint representation of in echelon form, i.e., Invariant 1 holds for . Let be the number of primary constraints in , i.e., w.l.o.g., the constraints are primary and the constraints are secondary. We consider two cases.
Case 1.
There is an such that . Then, fix one such and take the system . We do not query or forget any bit of , thus and . To create , we only added the secondary constraint to some rows of and then deleted it. Thus, the set of variables that appear in secondary constraints cannot grow and, therefore, the secondary constraints are still satisfied by the assignment . Therefore, Invariant 7 holds for .
The set of primary constraints is not changed. The operation does not change the set of marked variables, as the secondary constraint does not contain any pivot of the primary constraints. Thus, Invariant 5 holds for . The set of fixed blocks does not change and there is no new dependent variable. Thus, Invariant 3 holds as well.
Case 2.
For all , it holds that . Then, we fix some such that . Note that such an exists as is a non-zero vector. Again, let . Since has only zeroes at the coordinates corresponding to the secondary constraints, the secondary constraints are not changed by the operation. As the constraint is deleted and it was a primary one, one marked variable (the pivot of the constraint ) becomes independent and safe. Let be the block containing the variable , i.e., the constraint of is the primary constraint for . We consider two sub-cases.
Sub-case 2.1.
The other variables of are safe in as well, i.e., they are not in any secondary constraint. Thus, the whole block contains only independent and safe variables of . We forget the bit and make the block free. The set of other primary constraints (different from ) may change their form, but their pivots are not changed. Hence, Invariant 5 holds for . There is no new dependent variable. Thus, Invariant 3 holds as well.
We get the partial assignment by simply setting the variable free. The partial assignment differs from only at the block (, and ). Further, the secondary constraints of do not contain any variable of the block by the assumption. Thus, Invariant 7 holds for .
Sub-case 2.2.
There is a dangerous variable in the block , i.e., there is a secondary constraint of that contains a variable of . In this case, we use the strong stifling property of . Let be the set of indices of all dangerous variables of in . Let be the index of the variable in (i.e., the previously marked variable in ). Note that , because the variable is safe. Thus, by the definition of strong stifling, there is a such that (note that ). Let be the -th variable in the block and let be a secondary constraint that contains (such a constraint exists by the assumption).
We again run the pivoting step for , i.e., we zero out all coefficients corresponding to in all other constraints of by adding the constraint to all other constraints containing . We denote the final system of constraints by . Note that is still a constraint representation of , as it arises from only by row operations.
The constraint is now the only constraint containing the variable , and becomes a dependent variable. Thus, we make the constraint a primary constraint for and we mark the variable . The primary constraint for was changed from the constraint of to the constraint of , and the marked variable in the block was changed from to . The set of other primary constraints and their pivots were not changed. Thus, Invariant 5 holds for .
We do not change the assignment , thus the sets of free and fixed blocks are the same. The only change in the set of dependent variables was done in the block (that remains a fixed block), thus Invariant 3 holds for .
The secondary constraints of were not changed by the operation executed at the beginning of this case (as for all secondary constraints ). Since is a secondary constraint of , the secondary constraints of contain only variables of fixed blocks. However, we change the marked variable in the block . Thus, the partial assignment differs from at the block (, and , where and are the indices of and in the block ). We need to be sure that still gives a solution to the secondary constraints of . Note that the secondary constraints of might still contain variables from the block .
By pivoting and making the constraint primary, the variable is not in any secondary constraint of . Since was a secondary constraint of , it cannot happen that a safe variable in would become a dangerous one in (i.e., by the pivoting of ). In other words, the set of variables of the secondary constraints of is a subset of the set of variables of the secondary constraints of . Thus, the set still contains all dangerous variables of in . Since by the assumption, any extension satisfies all secondary constraints of and Invariant 7 holds for .
A summary of the forget node processing is in Algorithm 3.
Proof of Theorem 13
Theorem 13 follows from the following lemma.
Lemma 14.
Suppose the simulation is at Line 4 of Algorithm 1, i.e., it checks whether the current node is a leaf. Then,
-
1.
.
-
2.
For any , there is such that .
Proof of Item 1.
By Invariant 1, the co-dimension of is exactly the number of equations in the system . By Invariant 5, the number of bits fixed by is exactly the number of primary constraints in . Thus, .
Proof of Item 2.
Let . We will find a solution to the system such that . Thus, by Invariant 1, .
First, we set the variables of free blocks. Let be a free block. By Invariant 3, all variables of are independent. We set the variables of in such a way that the block is mapped to by the gadget .
Now, we set the values of the stifling variables according to . By Invariant 7, all secondary constraints are satisfied by any extension of . Recall that for a fixed block , where is the index of the marked variable of and . Since is a -stifling pattern, the block will always be mapped to by , no matter how we set the marked variables. Thus, the constructed solution will be mapped onto . By Invariant 5, each primary constraint contains a unique marked variable. Thus, we can set a value for each marked variable in such a way that the primary constraint containing is satisfied.
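The last step above is a simple back-substitution: since each primary constraint owns a unique marked variable occurring in no other constraint, the constraints can be satisfied one by one. A minimal sketch, with our own names (`rows`, `rhs`, `pivots` are illustrative, not the paper's notation):

```python
def set_marked(rows, rhs, pivots, x):
    """Complete a partial GF(2) solution.

    pivots[k] is the unique marked variable of constraint k and occurs
    in no other constraint; all non-pivot entries of x are assumed to be
    fixed already.  Each marked variable is set so that its constraint
    <rows[k], x> = rhs[k] (mod 2) holds.
    """
    for k, p in enumerate(pivots):
        s = 0
        for j, coef in enumerate(rows[k]):
            if j != p:
                s ^= coef & x[j]
        x[p] = s ^ rhs[k]  # makes constraint k hold
    return x
```

Because no pivot appears in another constraint, fixing one marked variable never disturbs a constraint that was already satisfied.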
Proof of Theorem 13.
By Item 1 of Lemma 14, the width of a cube at the time of checking whether a vertex is a leaf is at most the co-dimension of . Thus, the width of the constructed FDT is at most the width of .
Now, we bound the query depth of . Consider a root-leaf path of and let be the number of queries made on . Note that any time we query a bit of (Line 9 of Algorithm 2), we also pick the subtree with the smaller query size (Line 8 of Algorithm 2). Thus, each query of at least halves the query size of . Thus, .
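The halving argument can be spelled out in one line (writing $\mathrm{qs}(\cdot)$ for the query size, our notation for the elided symbol): if each of the $q$ queries on the path at least halves the query size of the remaining subtree, and the query size at the leaf is at least $1$, then

```latex
1 \;\le\; \mathrm{qs}(T_{\mathrm{leaf}}) \;\le\; \frac{\mathrm{qs}(T)}{2^{q}}
\qquad\Longrightarrow\qquad
q \;\le\; \log_2 \mathrm{qs}(T).
```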
It remains to prove that the constructed FDT is correct. Let be a leaf of that is reached during the simulation and let be the output of . Since computes , it holds that for all we have . Note that the processing phase (Lines 5-8 of Algorithm 1) is not executed for any leaf. Thus, the assertion of Lemma 14 holds for the leaf even at the time of output – Line 9 of Algorithm 1. Therefore, at the end of the simulation, for any there is such that . Since , it holds that , and the constructed FDT indeed outputs a correct answer.
8 Application
Razborov [24] showed the following trade-off between width and size of tree-like resolution.
Theorem 15 (Theorem 3.1, Razborov [24]).
Let be any parameter and let be an arbitrary constant. Then, there exists a -CNF contradiction over variables such that there is a resolution refutation for with width at most , but for any tree-like resolution refutation for with , we have the bound .
By our simulation, given by Theorem 13, we can lift the trade-off (given by the previous theorem) to tree-like and prove Theorem 3.
Theorem 3. [Restated, see original statement.]
Let be any parameter and let be an arbitrary constant. Then, there exists a -CNF contradiction over variables such that there is a resolution refutation for with width at most , but for every tree-like refutation for with , we have the bound .
Proof.
Let be a strongly stifled gadget – such functions exist, as observed in Section 5. Let , and let be a -CNF contradiction given by Theorem 15. We set , which is a -CNF contradiction. Since there is a resolution refutation for with width at most , there is a resolution refutation for with width at most .
Now, let be a tree-like refutation for with . By Observation 4, let be a canonical FPDT corresponding to that computes . Thus, we have and . We change to compute . Let be a leaf of outputting a clause of . The clause has to appear in a set of clauses for a clause of . We change the output of to be the clause instead of . By Observation 6, the tree now computes .
By Theorem 13, there is an FDT computing with and . Let be the resolution refutation for corresponding to the succinct form of . Thus, and (by Observation 5). Since
| (1) |
we have that by Theorem 15. The last inequality in (1) holds if , which holds as we suppose that . Putting everything together, we have
References
- [1] Yaroslav Alekseev and Dmitry Itsykson. Lifting to regular resolution over parities via games. Electron. Colloquium Comput. Complex., TR24-128, 2024. URL: https://eccc.weizmann.ac.il/report/2024/128.
- [2] Sepehr Assadi, Gillat Kol, and Zhijun Zhang. Rounds vs communication tradeoffs for maximal independent sets. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 - November 3, 2022, pages 1193–1204. IEEE, 2022. doi:10.1109/FOCS54457.2022.00115.
- [3] Paul Beame, Christopher Beck, and Russell Impagliazzo. Time-space tradeoffs in resolution: superpolynomial lower bounds for superlinear space. In Howard J. Karloff and Toniann Pitassi, editors, Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, May 19 - 22, 2012, pages 213–232. ACM, 2012. doi:10.1145/2213977.2213999.
- [4] Paul Beame and Sajin Koroth. On disperser/lifting properties of the index and inner-product functions. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 14:1–14:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITCS.2023.14.
- [5] Chris Beck, Jakob Nordström, and Bangsheng Tang. Some trade-off results for polynomial calculus: extended abstract. In Dan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors, Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 813–822. ACM, 2013. doi:10.1145/2488608.2488711.
- [6] Eli Ben-Sasson. Size space tradeoffs for resolution. In John H. Reif, editor, Proceedings on 34th Annual ACM Symposium on Theory of Computing, May 19-21, 2002, Montréal, Québec, Canada, pages 457–464. ACM, 2002. doi:10.1145/509907.509975.
- [7] Christoph Berkholz, Moritz Lichter, and Harry Vinall-Smeeth. Supercritical size-width tree-like resolution trade-offs for graph isomorphism, 2024. doi:10.48550/arXiv.2407.17947.
- [8] Christoph Berkholz and Jakob Nordström. Supercritical space-width trade-offs for resolution. SIAM J. Comput., 49(1):98–118, 2020. doi:10.1137/16M1109072.
- [9] Sreejata Kishor Bhattacharya, Arkadev Chattopadhyay, and Pavel Dvořák. Exponential Separation Between Powers of Regular and General Resolution over Parities. In Rahul Santhanam, editor, 39th Computational Complexity Conference (CCC 2024), volume 300 of Leibniz International Proceedings in Informatics (LIPIcs), pages 23:1–23:32, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2024.23.
- [10] Mark Braverman and Rotem Oshman. A rounds vs. communication tradeoff for multi-party set disjointness. In Chris Umans, editor, 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 144–155. IEEE Computer Society, 2017. doi:10.1109/FOCS.2017.22.
- [11] Arkadev Chattopadhyay, Nikhil S. Mande, Swagato Sanyal, and Suhail Sherif. Lifting to parity decision trees via stifling. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 33:1–33:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITCS.2023.33.
- [12] Susanna F. de Rezende, Noah Fleming, Duri Andrea Janett, Jakob Nordström, and Shuo Pang. Truly supercritical trade-offs for resolution, cutting planes, monotone circuits, and weisfeiler-leman, 2024. doi:10.48550/arXiv.2411.14267.
- [13] Klim Efremenko, Michal Garlík, and Dmitry Itsykson. Lower bounds for regular resolution over parities. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 640–651. ACM, 2024. doi:10.1145/3618260.3649652.
- [14] Noah Fleming, Toniann Pitassi, and Robert Robere. Extremely Deep Proofs. In Mark Braverman, editor, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022), volume 215 of Leibniz International Proceedings in Informatics (LIPIcs), pages 70:1–70:23, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ITCS.2022.70.
- [15] Lance Fortnow. Time-space tradeoffs for satisfiability. J. Comput. Syst. Sci., 60(2):337–353, 2000. doi:10.1006/jcss.1999.1671.
- [16] Lance Fortnow, Richard J. Lipton, Dieter van Melkebeek, and Anastasios Viglas. Time-space lower bounds for satisfiability. J. ACM, 52(6):835–865, 2005. doi:10.1145/1101821.1101822.
- [17] Ankit Garg, Mika Göös, Pritish Kamath, and Dmitry Sokolov. Monotone circuit lower bounds from resolution. Theory Comput., 16:1–30, 2020. Preliminary version in STOC 2018. doi:10.4086/TOC.2020.V016A013.
- [18] Mika Göös, Gilbert Maystre, Kilian Risse, and Dmitry Sokolov. Supercritical tradeoffs for monotone circuits, 2024. doi:10.48550/arXiv.2411.14268.
- [19] Dmitry Itsykson and Dmitry Sokolov. Lower bounds for splittings by linear combinations. In Erzsébet Csuhaj-Varjú, Martin Dietzfelbinger, and Zoltán Ésik, editors, Mathematical Foundations of Computer Science 2014 - 39th International Symposium, MFCS 2014, Budapest, Hungary, August 25-29, 2014. Proceedings, Part II, volume 8635 of Lecture Notes in Computer Science, pages 372–383. Springer, 2014. doi:10.1007/978-3-662-44465-8_32.
- [20] Dmitry Itsykson and Dmitry Sokolov. Resolution over linear equations modulo two. Ann. Pure Appl. Log., 171(1), 2020. doi:10.1016/j.apal.2019.102722.
- [21] Noam Nisan and Avi Wigderson. Rounds in communication complexity revisited. SIAM J. Comput., 22(1):211–219, 1993. doi:10.1137/0222016.
- [22] Pavel Pudlák. On extracting computations from propositional proofs (a survey). In Kamal Lodaya and Meena Mahajan, editors, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2010, December 15-18, 2010, Chennai, India, volume 8 of LIPIcs, pages 30–41. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2010. doi:10.4230/LIPIcs.FSTTCS.2010.30.
- [23] Alexander A. Razborov. Unprovability of lower bounds on circuit size in certain fragments of bounded-arithmetic. Izvestiya. Math., 59(1):205–227, 1995.
- [24] Alexander A. Razborov. A new kind of tradeoffs in propositional proof complexity. J. ACM, 63(2):16:1–16:14, 2016. doi:10.1145/2858790.
- [25] Dmitry Sokolov. Dag-like communication and its applications. In Pascal Weil, editor, Computer Science – Theory and Applications – 12th International Computer Science Symposium in Russia, CSR 2017, Kazan, Russia, June 8-12, 2017, Proceedings, volume 10304 of Lecture Notes in Computer Science, pages 294–307. Springer, 2017. doi:10.1007/978-3-319-58747-9_26.
- [26] R. Ryan Williams. Time-space tradeoffs for counting NP solutions modulo integers. Comput. Complex., 17(2):179–219, 2008. doi:10.1007/s00037-008-0248-y.
Appendix A Appendix
In this section, we show that the functions and are strongly stifled and is not strongly stifled.
Observation 8. [Restated, see original statement.]
The functions and are strongly stifled.
Proof.
We present collections of -stifling patterns and for and , respectively. It is straightforward to verify that these collections are converting collections of stifling patterns for and .
| 0 | 1 | |
|---|---|---|
| 1 | ||
| 2 | ||
| 3 |
| 0 | 1 | |
|---|---|---|
| 1 | ||
| 2 | ||
| 3 |
Observation 9. [Restated, see original statement.]
The function is not strongly stifled.
Proof.
The only -stifling pattern for is . Similarly, the only - and -stifling patterns for are , and , respectively. Now, let . There is no such that , as required for a strongly stifled function.