Sensitivity and Query Complexity Under Uncertainty
Abstract
In this paper, we study the query complexity of Boolean functions in the presence of uncertainty, motivated by parallel computation with an unlimited number of processors where inputs are allowed to be unknown. We allow each query to produce three results: zero, one, or unknown. The output could also be: zero, one, or unknown, with the constraint that we should output “unknown” only when we cannot determine the answer from the revealed input bits. Such an extension of a Boolean function is called its hazard-free extension.
- We prove an analogue of Huang’s celebrated sensitivity theorem [Annals of Mathematics, 2019] in our model of query complexity with uncertainty.
- We show that the deterministic query complexity of the hazard-free extension of a Boolean function is at most quadratic in its randomized query complexity and quartic in its quantum query complexity, improving upon the best-known bounds in the Boolean world.
- We exhibit an exponential gap between the smallest depth (size) of decision trees computing a Boolean function, and those computing its hazard-free extension.
- We present general methods to convert decision trees for Boolean functions to those for their hazard-free counterparts, and show optimality of this construction. We also parameterize this result by the maximum number of unknown values in the input.
- We show lower bounds on size complexity of decision trees for hazard-free extensions of Boolean functions in terms of the number of prime implicants and prime implicates of the underlying Boolean function.
Keywords and phrases: CREW-PRAM, query complexity, decision trees, sensitivity, hazard-free extensions
Funding: Deepu Benson: Funded by SERB SRG Project SRG/2021/001339.
2012 ACM Subject Classification: Theory of computation → Oracles and decision trees
Editors: Paweł Gawrychowski, Filip Mazowiecki, and Michał Skrzypczak
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
A single-tape Turing Machine needs at least $n$ steps in the worst case to compute any $n$-bit function that depends on all of its inputs. One way to achieve faster computation is to use multiple processors in parallel. Parallel computation is modeled using Parallel Random Access Machines (PRAM), originally defined by Fortune and Wyllie [10], in which there are multiple processors and multiple memory cells shared among all processors. In a single time step, each processor can read a fixed number of memory cells, execute one instruction, and write to a fixed number of memory cells. In the real world, access to shared memory has to be mediated using some mechanism to avoid read-write conflicts. A popular mechanism to achieve this is called Concurrent-Read Exclusive-Write (CREW), in which concurrent reads of the same memory cell are allowed, but each cell can only be written to by at most one processor in a single time step. An algorithm that violates this restriction is invalid.
A fundamental problem in such a model of computation is to determine the number of processors and the amount of time required for computing Boolean functions. A Boolean function is pre-determined (the number of input bits is fixed, making this model non-uniform) and the input bits are presented in shared memory locations. The processors have to write the output bit to a designated shared memory location. For example, consider the Boolean disjunction (logical OR) of all input bits, which outputs 1 if and only if there is at least one $1$ among the inputs. There is a simple divide-and-conquer algorithm to compute the OR of $n$ input bits in $O(\log n)$ time using $O(n)$ processors that exploits the fact that OR is associative. This is essentially a CREW-PRAM algorithm to compute the OR of $n$ bits in $O(\log n)$-time. Note that each of the processors in the above algorithm computes a trivial function at each step, namely the OR of two bits. Can we do better if we are allowed to use more complex computations at each step?
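To make the round structure concrete, here is a minimal sequential simulation (ours, not from the sources cited) of this divide-and-conquer strategy; each iteration of the while loop corresponds to one synchronous CREW-PRAM step in which processor $i$ combines two adjacent cells.

```python
# Sequential simulation (illustrative only) of the CREW-PRAM divide-and-conquer
# OR: each round halves the number of active cells, so ceil(log2 n) rounds
# suffice, with one (trivial) two-bit OR per processor per round.
def pram_or(bits):
    cells = list(bits)
    while len(cells) > 1:
        # One PRAM step: processor i reads cells[2i] and cells[2i+1]
        # (concurrent reads are fine) and writes its own output cell
        # (exclusive write).
        cells = [cells[i] | (cells[i + 1] if i + 1 < len(cells) else 0)
                 for i in range(0, len(cells), 2)]
    return cells[0]

print(pram_or([0, 0, 1, 0, 0, 0, 0, 0]))  # 1, after 3 rounds for n = 8
```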
Cook and Dwork [7], and Cook, Dwork, and Reischuk [8] answered this question by showing that any CREW-PRAM algorithm that computes the logical OR of $n$ bits needs $\Omega(\log n)$ time, irrespective of the functions computed by the processors at each step. The reason their lower bound is independent of the functions allowed at each processor is that their lower bound really applies to the number of accesses made to the shared memory. If we only care about analyzing the number of memory accesses by an algorithm running on an all-powerful processor, a neat way to think of the computation at each processor is to model it as a two-player interactive game: a querier who is all-powerful and an oracle. They want to compute a Boolean function $f$, known beforehand, on an input $x \in \{0,1\}^n$. However, the input is only known to the oracle. The only interaction allowed is where the querier asks a query $i \in [n]$, and the oracle replies with $x_i$. The query complexity of $f$ is defined as the maximum number of queries required to find the answer, where the maximum is taken over all inputs $x$. The querier’s aim is to minimize the number of queries required to determine the value of the function in the worst case. This is exactly the computation model studied in an exciting area of computer science simply called “query complexity”.
The technique used in [7] and [8] involves defining a measure that is equivalent to what is now commonly known as the sensitivity of a Boolean function, denoted $s(f)$. More precisely, they show that the time needed by CREW-PRAM algorithms, irrespective of the instruction set, to compute a Boolean function is asymptotically lower bounded by the logarithm of the sensitivity of the function.
The question whether a function can be computed in time at most the logarithm of the function’s sensitivity, thereby characterizing the CREW-PRAM time complexity in terms of sensitivity, remained open. Nisan [20] introduced a generalization of sensitivity called block sensitivity, denoted $bs(f)$, which is lower bounded by sensitivity, and proved that CREW-PRAM time complexity is asymptotically the same as the logarithm of the block sensitivity of a function. Nisan further related decision tree depth complexity, denoted $D(f)$, to block sensitivity, providing another characterization of CREW-PRAM time complexity as the logarithm of decision tree depth. However, the block sensitivity of a function can be potentially much higher than its sensitivity, as shown by Rubinstein [22] with an explicit function that witnesses a quadratic gap between these two measures. After research spanning almost three decades, which saw extensive study of relationships between the above parameters and various other parameters (see, for example, [1] and the references therein), such as certificate complexity (denoted $C(f)$) and degree (denoted $\deg(f)$), Huang [13], in a breakthrough result, proved that sensitivity and block sensitivity are polynomially related. This is the celebrated sensitivity theorem:
Theorem 1 (Sensitivity theorem for Boolean functions [13]).
For all Boolean functions $f$, $bs(f) = O(s(f)^4)$.
By earlier results [20, 21, 3], Huang’s result implies that the measures of sensitivity, block sensitivity, certificate complexity, degree, approximate degree, deterministic/randomized/quantum query complexity are all polynomially equivalent for Boolean functions. This strong connection makes the study of query complexity essentially equivalent to studying the time complexity of CREW-PRAMs. Thus henceforth, we shall predominantly use terminology from query complexity, and also express our results in the context of query complexity.
In this work, we initiate a systematic study to understand the effect of allowing uncertainty among the inputs in the setting of CREW-PRAM. Allowing uncertainty in this setting is easier to understand in the equivalent query model: When an input bit is queried by the querier, the oracle can reply with “uncertain”. If it is possible for the function value to be determined in the presence of such uncertainties, then we would like the querier to output such a value. In what follows, we make the setting more formal.
To model uncertainty we use a classic three-valued logic, namely Kleene’s strong logic of indeterminacy [17], usually denoted K3. The logic K3 has three truth values: $0$, $1$, and $u$. The behaviour of the third value with respect to the basic Boolean primitives – conjunction ($\wedge$), disjunction ($\vee$), and negation ($\neg$) – is given in Table 1.
Table 1: The K3 truth tables for conjunction, disjunction, and negation.

| $\wedge$ | 0 | u | 1 |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| u | 0 | u | u |
| 1 | 0 | u | 1 |

| $\vee$ | 0 | u | 1 |
|---|---|---|---|
| 0 | 0 | u | 1 |
| u | u | u | 1 |
| 1 | 1 | 1 | 1 |

| $x$ | 0 | u | 1 |
|---|---|---|---|
| $\neg x$ | 1 | u | 0 |
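The following small Python sketch (ours) implements the three connectives exactly as in Table 1, encoding the unknown value as the string `"u"`; on $\{0,1\}$-valued inputs they reduce to the usual Boolean operations.

```python
# Kleene's K3 connectives from Table 1, with the unknown value encoded as "u".
def k3_and(a, b):
    if a == 0 or b == 0:               # a 0 forces the output, even against u
        return 0
    return "u" if "u" in (a, b) else 1

def k3_or(a, b):
    if a == 1 or b == 1:               # a 1 forces the output, even against u
        return 1
    return "u" if "u" in (a, b) else 0

def k3_not(a):
    return "u" if a == "u" else 1 - a

print(k3_and(0, "u"), k3_or("u", "u"), k3_not("u"))  # prints: 0 u u
```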
The logic K3 has been used in several other contexts where there is a need to represent and work with unknowns and hence has found several important wide-ranging applications in computer science. For instance, in relational database theory, SQL implements K3 and uses $u$ to represent a NULL value [19]. Perhaps the oldest use of K3 is in modeling hazards that occur in real-world combinational circuits. Recently there have been a series of results studying constructions and complexity of hazard-free circuits [14, 15, 16]. Here $u$ was used to represent metastability, or an unstable voltage, that can resolve to $0$ or $1$ at a later time.
One way to interpret the basic operations $\wedge$, $\vee$, and $\neg$ in K3 is as follows: for a bit $b \in \{0,1\}$, if the value of the operation is guaranteed to be $b$ regardless of all $\{0,1\}$-settings to the $u$-valued inputs, then the output is $b$. Otherwise, the output is $u$. This interpretation generalizes in a nice way to $n$-variate Boolean functions. In literature, this extension is typically called the hazard-free extension of $f$ (see, for instance, [14]), and is an important concept studied in circuits and switching network theory since the 1950s. The interested reader can refer to [14], and the references therein, for history and applications of this particular way of extending $f$ to K3. We define this extension formally below.
For a string $x \in \{0,1,u\}^n$, define the resolutions of $x$ as follows:
$$\mathrm{Res}(x) = \{y \in \{0,1\}^n : y_i = x_i \text{ for all } i \text{ with } x_i \neq u\}.$$
That is, $\mathrm{Res}(x)$ denotes the set of all strings in $\{0,1\}^n$ that are consistent with the $\{0,1\}$-valued bits of $x$. The hazard-free extension of a Boolean function $f$ is defined as follows:
Definition 2 (Hazard-free Extensions).
For a Boolean function $f : \{0,1\}^n \to \{0,1\}$, we define its hazard-free extension $f_u : \{0,1,u\}^n \to \{0,1,u\}$ as follows. For an input $x \in \{0,1,u\}^n$,
$$f_u(x) = \begin{cases} b & \text{if } f(y) = b \text{ for all } y \in \mathrm{Res}(x), \text{ where } b \in \{0,1\}, \\ u & \text{otherwise.} \end{cases}$$
To understand the motivation behind this definition, consider the instability partial order defined on $\{0,1,u\}$ by the relations $u \prec 0$ and $u \prec 1$. The elements $0$ and $1$ are incomparable. This partial order captures the intuition that $u$ is less certain than $0$ and $1$. This partial order can be extended naturally to elements of $\{0,1,u\}^n$ as $x \preceq y$ iff $x_i \preceq y_i$ for all $i \in [n]$. A function $g : \{0,1,u\}^n \to \{0,1,u\}$ is natural if for all $x, y$ such that $x \preceq y$, we have $g(x) \preceq g(y)$, and $g(x) \in \{0,1\}$ when $x \in \{0,1\}^n$. Intuitively, this property says that the function cannot produce a less certain output on a more certain input, and if there is no uncertainty in the input, there should be no uncertainty in the output. A natural function $g$ extends a Boolean function $f$ if $g(x) = f(x)$ for all $x \in \{0,1\}^n$. There could be many natural functions that extend a Boolean function. Consider two natural functions $g$ and $h$ that extend $f$. We say $g \preceq h$ if $g(x) \preceq h(x)$ for all $x \in \{0,1,u\}^n$. This says that the output of $h$ is at least as certain as the output of $g$. An alternative definition for the hazard-free extension of a function $f$ is as follows: it is the unique natural function $f_u$ extending $f$ such that $g \preceq f_u$ for all natural functions $g$ that extend $f$. That is, the hazard-free extension of a Boolean function is the best we could hope to compute in the presence of uncertainties in the inputs to $f$.
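Definition 2 can be turned directly into a brute-force evaluator (exponential in the number of $u$'s); the sketch below is ours for illustration, with helper names (`resolutions`, `hazard_free`) that are not from the paper.

```python
# Brute-force evaluation of the hazard-free extension per Definition 2:
# enumerate all resolutions of x and check whether f is constant on them.
from itertools import product

def resolutions(x):
    """All strings in {0,1}^n consistent with the 0/1-valued bits of x."""
    choices = [(0, 1) if xi == "u" else (xi,) for xi in x]
    return [list(y) for y in product(*choices)]

def hazard_free(f, x):
    values = {f(y) for y in resolutions(x)}
    return values.pop() if len(values) == 1 else "u"

# Example: OR on 3 bits. A single 1 determines the output despite the u's.
f_or = lambda y: int(any(y))
print(hazard_free(f_or, [0, "u", 1]))  # 1
print(hazard_free(f_or, [0, "u", 0]))  # u
```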
We note here that even though we use the term “hazard-free”, there is a fundamental difference between our model of computation and the ones studied in results such as [14, 15, 16]. For hazard-free circuits, the value $u$ represents an unstable voltage, and the gates in a circuit are fundamentally unable to detect it. That is, there is no circuit that can output $1$ when an input is $u$ and $0$ otherwise. However, in our setting, the value $u$ is simply another symbol, just like $0$ or $1$. So we can indeed detect/read a $u$ value. The restriction that we have to compute the hazard-free extension is a semantic one in this paper, whereas Boolean circuits can only compute natural functions. In other words, we are interested in query complexity of hazard-free extensions of Boolean functions per se, and we have no notion of metastability in our computation model.
There is a rich body of work on the query complexity of Boolean functions that has established best-possible polynomial relationships among various models such as deterministic, randomized, and quantum models of query complexity, and their interplay with analytical measures such as block sensitivity and certificate complexity (see, for example, [1] and the references therein). We study these relationships in the presence of uncertainty, in particular for hazard-free extensions of Boolean functions. A main goal is to characterize query complexity (equivalently CREW-PRAM time complexity) of hazard-free extensions of Boolean functions using these parameters.
1.1 Our Results
In this subsection, we discuss the results presented in this paper. The organization of the paper follows the structure of this subsection.
1.1.1 Sensitivity Theorem in the Presence of Uncertainty
We prove the sensitivity theorem for Boolean functions in the presence of uncertainties. We define analogues of query complexity, sensitivity, block sensitivity, and certificate complexity called $u$-query complexity (denoted $D_u(f)$), $u$-sensitivity (denoted $s_u(f)$), $u$-block sensitivity (denoted $bs_u(f)$), and $u$-certificate complexity (denoted $C_u(f)$), respectively. We show that analogues of Cook, Dwork, and Reischuk’s [8] lower bound and Nisan’s [20] upper bound hold in the presence of uncertainties by adapting their proofs suitably (see the full version [4] for details). Therefore, we can now focus on proving that these parameters are polynomially equivalent in the presence of uncertainties. Huang’s proof of the sensitivity theorem for Boolean functions crucially uses the parameter called the degree of a Boolean function. It is unclear how to define an analogue of degree in our setting. However, it turns out that a more classical parameter, the maximum of prime implicant size and prime implicate size, suffices.
Our proof of the sensitivity theorem in the presence of uncertainty is much simpler and more straightforward than the proof of the sensitivity theorem in the classical setting. It raises the question of whether we can find a simpler proof of the classical sensitivity theorem by generalizing our proof to handle an arbitrary upper bound on the number of uncertain values in the input. Note that the classical setting assumes that this number is 0, and we prove the sensitivity theorem by assuming that this number is $n$, the number of input bits.
Recall that a literal is an input variable or its negation. An implicant (implicate) of a Boolean function $f$ is a subset of literals such that $f$ is $1$ (is $0$) on any input that has all literals in the subset set to $1$ (set to $0$, respectively). A prime implicant (prime implicate) is an implicant (implicate) such that no proper subset of it is an implicant (implicate, respectively), i.e., the implicant (implicate) is minimal (w.r.t. set inclusion). The size of a prime implicant or prime implicate is the size of the set. Prime implicants and prime implicates of a Boolean function are widely studied in electronic circuit design and Boolean function analysis.
Theorem 3 (Sensitivity theorem for hazard-free extensions of Boolean functions).
Let $f$ be a Boolean function and let $p$ and $q$ be the sizes of a largest prime implicant and a largest prime implicate of $f$, respectively. Then, the parameters $p$, $q$, $s_u(f)$, $bs_u(f)$, and $C_u(f)$ are linearly equivalent.
We note here that while Huang [13] showed $bs(f) = O(s(f)^4)$ for all Boolean $f$, our result shows that, in the presence of uncertainty, block sensitivity and sensitivity are in fact linearly related to each other.
1.1.2 Relationships
Clearly, the query complexity of the hazard-free extension of a Boolean function cannot be smaller than that of the function itself. Can the query complexity of the hazard-free extension of a Boolean function be much more than the query complexity of the function itself? For monotone functions, we show that the answer is no. In fact, this also holds for randomized query complexity (denoted by $R(f)$ and $R_u(f)$) and quantum query complexity (denoted by $Q(f)$ and $Q_u(f)$).
Lemma 4.
Let $f$ be a monotone Boolean function. Then we have $D_u(f) = O(D(f))$, $R_u(f) = O(R(f))$, and $Q_u(f) = O(Q(f))$.
A natural question to ask is whether the model we are considering is non-degenerate. In other words, do all hazard-free extensions of Boolean functions have large query complexity? Using Lemma 4, we can show that there are functions that are easy to compute even in the presence of uncertainties. The following monotone variant of the address function (defined below), due to Wegener [24], is sufficient.
Definition 5 ([24]).
For an even integer $k$, define the monotone address function $\mathrm{MAF}_k$ as follows: The function is defined on $k + \binom{k}{k/2}$ (which is $n$) variables, where the latter $\binom{k}{k/2}$ variables are indexed by all $k$-bit strings of Hamming weight exactly $k/2$. For $(x, y) \in \{0,1\}^k \times \{0,1\}^{\binom{k}{k/2}}$, define
$$\mathrm{MAF}_k(x, y) = \begin{cases} 0 & \text{if } |x| < k/2, \\ y_x & \text{if } |x| = k/2, \\ 1 & \text{if } |x| > k/2, \end{cases}$$
where $|x|$ denotes the Hamming weight of $x$.
It is easy to see that this function is monotone and has a query complexity of $O(\log n)$. We also exhibit a non-monotone, non-degenerate $n$-variate function such that its hazard-free extension has query complexity $O(\log n)$ (see the full version [4] for the details).
For general functions, we show that uncertainty can blow up query complexity exponentially. The Boolean function $\mathrm{MUX}_k$ defined by
$$\mathrm{MUX}_k(s_1, \ldots, s_k, x_0, \ldots, x_{2^k - 1}) = x_{\mathrm{bin}(s)}, \text{ where } \mathrm{bin}(s) \text{ is the number with binary representation } s_1 \cdots s_k,$$
is a function on $k + 2^k$ inputs that depends on all its inputs and has query complexity $k + 1$. The inputs $s_1, \ldots, s_k$ are called the selector bits and the inputs $x_0, \ldots, x_{2^k - 1}$ are called data bits. It is easy to show that any function that depends on all its input bits must have at least logarithmic (in the number of input bits) query complexity. Therefore, $\mathrm{MUX}_k$ is one of the easiest functions to compute in the query complexity model. We prove that its hazard-free extension is one of the hardest functions in the query complexity model.
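As a quick illustration (ours, reusing the brute-force `hazard_free` evaluator sketched earlier), note how unknown selector bits force the hazard-free extension to inspect every data bit:

```python
# MUX_k and the behaviour of its hazard-free extension on unknown selectors.
def mux(k):
    def f(y):
        s, data = y[:k], y[k:]
        addr = int("".join(map(str, s)), 2)   # selector bits read as binary
        return data[addr]
    return f

f = mux(2)                                    # n = 2 + 4 = 6 input bits
print(f([1, 0, 0, 1, 0, 0]))                  # selector 10 -> data[2] = 0
# With all selectors unknown, the output is known only if all data bits agree:
print(hazard_free(f, ["u", "u", 1, 1, 1, 1]))  # 1
print(hazard_free(f, ["u", "u", 1, 1, 0, 1]))  # u
```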
Theorem 6.
For all $k$, the hazard-free extension of $\mathrm{MUX}_k$ is evasive, i.e., $D_u(\mathrm{MUX}_k) = k + 2^k$.
We also show the following relationships between deterministic, randomized, and quantum query complexities of hazard-free extensions of Boolean functions.
Theorem 7.
For all Boolean functions $f$, we have $D_u(f) = O(R_u(f)^2)$ and $D_u(f) = O(Q_u(f)^4)$.
We remark here that the deterministic-randomized relationship above is better than the best-known cubic relationship in the Boolean world, while the quartic deterministic-quantum relationship above matches the best-known relationship in the Boolean world (see [1]). The key reason we are able to obtain better relationships is that we show that sensitivity, block sensitivity, and certificate complexity are all linearly related in the presence of uncertainty (Theorem 3). This is not the case in the classical Boolean setting. Regarding best possible separations: while a linear relationship between deterministic and randomized query complexities in the presence of uncertainty remains open, a quadratic deterministic-quantum separation follows from Theorem 6 (or from the OR function, which has maximal deterministic query complexity, but Grover’s search algorithm [12] offers a quadratic quantum speedup).
It is natural to model query algorithms using decision trees. A decision tree represents the strategy of the querier using a tree structure. Each internal node of the tree is labeled by an input variable and has two outgoing edges, labeled $0$ and $1$. Computation starts at the root and queries the variable labeling the current node. Then, the outgoing edge labeled by the answer is taken to reach the next node. The leaves of the tree are labeled with $0$ or $1$ and represent the answer determined by the querier. A decision tree is said to compute a Boolean function $f$ if the querier can correctly answer the value of $f(x)$ for every possible input $x$ by following the decision tree from the root to a leaf. The depth of the decision tree represents the worst-case time. The depth of a smallest-depth decision tree that computes $f$ is called the decision tree depth complexity of $f$, and it is the same as the deterministic query complexity of $f$.
1.1.3 Decision Tree Size
Decision trees have played an important role in machine learning. The size of a decision tree is an important measure, as large decision trees often suffer from over-fitting. It has long been known that functions that admit small-size decision trees are efficiently PAC-learnable [9]. Moreover, the class of decision trees of large size is not efficiently PAC-learnable, as their VC-dimension is directly proportional to their size. Various ideas to prune decision trees and reduce their size while maintaining reasonable empirical error (error on the training set) are used in practice. For a detailed treatment of the role of decision tree size in learning, the interested reader may refer to [23, Chapter 18]. We denote the decision tree size of a Boolean function $f$ by $\mathrm{size}(f)$ and the decision tree size of its hazard-free extension by $\mathrm{size}_u(f)$. We show that for the $\mathrm{MUX}$ function, despite the exponential blow-up in depth from Theorem 6, the size blow-up is only polynomial.
Theorem 8.
$\mathrm{size}_u(\mathrm{MUX}_k) = O(\mathrm{size}(\mathrm{MUX}_k)^2)$.
In contrast, we show that there are functions for which the size blow-up is exponential. In particular, we exhibit a function that has linear-size Boolean decision trees, while its hazard-free extension needs exponential-size decision trees.
Theorem 9.
There is a Boolean function $f$ on $n$ variables with $\mathrm{size}(f) = O(n)$ such that $\mathrm{size}_u(f) = 2^{\Omega(n)}$.
We also show that there are hazard-free extensions of Boolean functions that require trees of size $3^n/\mathrm{poly}(n)$ (see the full version [4] for the details). Notice that a ternary tree of depth $n$ can have at most $3^n$ leaves. This lower bound is only smaller than this worst case by a polynomial factor.
We also show how to construct decision trees for hazard-free extensions of Boolean functions from a decision tree for the underlying Boolean function.
Theorem 10.
For any Boolean function $f$, we have $D(f) \leq D_u(f) \leq 2^{D(f)} - 1$.
The tightness of the first inequality is witnessed by the $\mathrm{PARITY}_n$ function and that of the second inequality is witnessed by the $\mathrm{MUX}$ function.
We also show that, in the case of hazard-free extensions too, sensitivity plays an important role in the function learning problem. The problem is as follows: we are provided a few input-output pairs and a guarantee that the function is from some family. The goal is to learn the function from as few samples as possible. It is known that a Boolean function with sensitivity $s$ is completely specified by its values on a Hamming ball of radius $2s$ [11]. We prove an analogue for hazard-free extensions of Boolean functions. We refer the reader to the full version [4] for details.
Theorem 11.
A hazard-free extension $f_u$ that has $s_u(f) = s$ is specified by its values on any Hamming ball of radius $2s$ in $\{0,1,u\}^n$.
1.1.4 Limited Uncertainty
We study computation in the presence of only a limited amount of uncertainty by introducing a parameter $k$ that limits the number of bits for which the oracle can respond $u$. For metastability-containing electronic circuits, it is known that assuming only limited metastability allows constructing circuits that are significantly smaller [14]. We show a similar effect on decision tree size and query complexity when uncertainty is limited. See the full version [4] for details.
Theorem 12.
Let $T$ be a Boolean decision tree of size $s$ and depth $d$ for $f$. Then, there exists a decision tree of size at most $s^{2^k}$ and depth at most $2^k \cdot d$ for $f_u$, provided that the input is guaranteed to have at most $k$ positions with value $u$.
For settings in which $k$ is a small constant, observe that the decision tree size is polynomial in the size of the Boolean decision tree. If $k$ is considered a parameter, observe that the depth remains fixed-parameter tractable in the language of parameterized complexity theory.
2 Preliminaries
Definition 13 ($u$-query complexity).
Let $f : \{0,1\}^n \to \{0,1\}$ be a Boolean function, and let $f_u$ be its hazard-free extension. A deterministic decision tree $T$, also called a deterministic query algorithm, computing $f_u$, is a ternary tree whose leaf nodes are labeled by elements of $\{0,1,u\}$; each internal node is labeled by a variable $x_i$, where $i \in [n]$, and has three outgoing edges, labeled $0$, $1$, and $u$. On an input $x \in \{0,1,u\}^n$, the tree’s computation proceeds from the root down to a leaf as follows: from a node labeled $x_i$, we take the outgoing edge labeled by the value of $x_i$, until we reach a leaf. The label of the leaf is the output of the tree $T$.
We say that $T$ computes $f_u$ if $T(x) = f_u(x)$ for all $x \in \{0,1,u\}^n$. The deterministic $u$-query complexity of $f$, denoted $D_u(f)$, is defined as
$$D_u(f) = \min_{T} \operatorname{depth}(T),$$
where the minimization is over all deterministic decision trees $T$ that compute $f_u$.
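A minimal representation of such ternary trees (ours, for illustration) and their evaluation:

```python
# Ternary decision trees as in Definition 13: an internal node is a dict with
# a variable index and three children keyed by 0, 1, and "u"; a leaf is just
# its label in {0, 1, "u"}.
def evaluate(tree, x):
    while isinstance(tree, dict):              # walk down until a leaf
        tree = tree["children"][x[tree["var"]]]
    return tree

# A depth-2 tree computing the hazard-free AND of two bits:
def and_node(b):                               # queries x_1, given x_0 = b
    return {"var": 1, "children": {0: 0, 1: b, "u": 0 if b == 0 else "u"}}

tree = {"var": 0, "children": {0: 0, 1: and_node(1), "u": and_node("u")}}
print(evaluate(tree, [1, "u"]))   # u
print(evaluate(tree, ["u", 0]))   # 0
```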
Definition 14 ($u$-sensitivity).
Let $f : \{0,1\}^n \to \{0,1\}$ be a Boolean function, and let $f_u$ be its hazard-free extension. For an $x \in \{0,1,u\}^n$, we define the $u$-sensitivity of $f_u$ at $x$ as
$$s_u(f, x) = |\{i \in [n] : \exists b \in \{0,1,u\} \text{ such that } f_u(x^{i \to b}) \neq f_u(x)\}|,$$
where $x^{i \to b}$ denotes $x$ with its $i$'th coordinate set to $b$.
The elements of this set are called the $u$-sensitive bits of $x$ for $f_u$. The $u$-sensitivity of $f$, denoted $s_u(f)$, is $\max_{x \in \{0,1,u\}^n} s_u(f, x)$.
Definition 15 ($u$-block sensitivity).
Let $f : \{0,1\}^n \to \{0,1\}$ be a Boolean function and let $f_u$ be its hazard-free extension. For $x \in \{0,1,u\}^n$, the $u$-block sensitivity $bs_u(f, x)$ of $f_u$ at $x$ is defined as the maximum $b$ such that there are disjoint subsets $B_1, \ldots, B_b$ of $[n]$ and, for each $i \in [b]$, there is a $y^{(i)} \in \{0,1,u\}^n$ such that $f_u(y^{(i)}) \neq f_u(x)$ and $y^{(i)}$ differs from $x$ at exactly the positions in $B_i$. Each $B_i$ is called a $u$-sensitive block of $f_u$ on input $x$. The $u$-block sensitivity of $f$, denoted $bs_u(f)$, is then defined as the maximum $u$-block sensitivity of $f_u$ taken over all $x \in \{0,1,u\}^n$.
For $a \in \{0,1,u\}$, we use $bs_u^a(f)$ to denote the maximum $u$-block sensitivity of $f_u$ at $x$, over all $x$ with $f_u(x) = a$. For a string $x$ and a set $B$ that is a sensitive block of $f_u$ at $x$, we abuse notation and use $x^B$ to denote an arbitrary but fixed string that satisfies $f_u(x^B) \neq f_u(x)$ and differs from $x$ exactly at the positions in $B$.
We now formally define certificate complexity for hazard-free extensions of Boolean functions. We first define a partial assignment as follows: By a partial assignment on $n$ bits, we mean a string $\rho \in \{0,1,u,*\}^n$, representing partial knowledge of a string in $\{0,1,u\}^n$, where the $*$-entries are yet to be determined. We say a string $x \in \{0,1,u\}^n$ is consistent with a partial assignment $\rho$ if $x_i = \rho_i$ for all $i$ with $\rho_i \neq *$.
Definition 16 ($u$-certificate complexity).
Let $f : \{0,1\}^n \to \{0,1\}$ and $x \in \{0,1,u\}^n$. A partial assignment $\rho$ is called a certificate for $f_u$ at $x$ if
- $x$ is consistent with $\rho$, and
- $f_u(y) = f_u(x)$ for all $y \in \{0,1,u\}^n$ consistent with $\rho$.
The size of this certificate is $|\{i : \rho_i \neq *\}|$. The domain of $\rho$ is said to be $\{i : \rho_i \neq *\}$. The certificate complexity of $f_u$ at $x$, denoted $C_u(f, x)$, is the minimum size of a certificate for $f_u$ at $x$. The $u$-certificate complexity of $f$, denoted $C_u(f)$, is the maximum value of $C_u(f, x)$ over all $x \in \{0,1,u\}^n$.
In other words, a certificate for $f_u$ at $x$ is a set of variables of $x$ that, if revealed, guarantees that the output of $f_u$ on all strings consistent with the revealed variables equals $f_u(x)$. For $b \in \{0,1,u\}$, we call a partial assignment a $b$-certificate if all strings consistent with it evaluate to $b$ under $f_u$.
Definition 17 (Randomized $u$-query complexity).
A randomized decision tree is a distribution over deterministic decision trees. We say a randomized decision tree computes $f_u$ with error $1/3$ if for all $x \in \{0,1,u\}^n$, the probability that it outputs $f_u(x)$ is at least $2/3$. The depth of a randomized decision tree is the maximum depth of a deterministic decision tree in its support. Define the randomized $u$-query complexity of $f$ as
$$R_u(f) = \min_{\mathcal{T}} \operatorname{depth}(\mathcal{T}),$$
where the minimization is over all randomized decision trees $\mathcal{T}$ that compute $f_u$ to error at most $1/3$.
We refer the reader to [6] for basics of quantum query complexity.
Definition 18 (Quantum $u$-query Complexity).
A quantum query algorithm $\mathcal{A}$ for $f_u$ begins in a fixed initial state $|\psi_0\rangle$ in a finite-dimensional Hilbert space, applies a sequence of unitaries $U_0, O_x, U_1, O_x, \ldots, U_T$, and performs a measurement. Here, the initial state $|\psi_0\rangle$ and the unitaries $U_0, U_1, \ldots, U_T$ are independent of the input. The unitary $O_x$ represents the “query” operation, and does the following for each basis state: it maps $|i, b, w\rangle$ to $|i, (b + x_i) \bmod 3, w\rangle$ for all $i \in [n]$ (here $u$ is interpreted as $2$, and the last register represents workspace that is not affected by the application of a query oracle).
The algorithm then performs a 3-outcome measurement on a designated output qutrit and outputs the observed value.
We say that $\mathcal{A}$ is a bounded-error quantum query algorithm computing $f_u$ if for all $x \in \{0,1,u\}^n$ the probability that $f_u(x)$ is output is at least $2/3$. The (bounded-error) quantum $u$-query complexity of $f$, denoted by $Q_u(f)$, is the least number of queries required for a quantum query algorithm to compute $f_u$ with error probability at most $1/3$.
3 Sensitivity Theorem in the Presence of Uncertainty
We first show that sensitivity, block sensitivity, certificate complexity, and size of the largest prime implicant/prime implicate are all asymptotically the same. This is a more formal and more precise restatement of Theorem 3.
Theorem 19.
Let $f$ be a Boolean function. Let $p$ be the size of a largest prime implicant of $f$, and $q$ be the size of a largest prime implicate of $f$. Then, we have:
$$\max\{p, q\} \leq s_u(f) \leq bs_u(f) \leq C_u(f) \leq p + q - 1.$$
Proof.
The inequalities $s_u(f) \leq bs_u(f) \leq C_u(f)$ follow from the definitions, as for their Boolean counterparts (see, for example, [6, Proposition 1]).
To show the first inequality, we crucially use our three-valued domain. Let $P$ be a largest prime implicant of $f$, of size $p$. Define the input $x \in \{0,1,u\}^n$ to be $1$ in those positions where $P$ contains the corresponding positive literal, $0$ where $P$ has the corresponding negated literal, and $u$ everywhere else. We claim that each index in the domain of $P$ is sensitive for $f_u$ at $x$. To see this, first observe that $f_u(x) = 1$ since $P$ being an implicant means $f$ is $1$ on all resolutions of $x$. Let $x_i$ be a positive literal in $P$. Then, observe that if setting the $i$'th bit of $x$ to a $u$ does not change the value of $f_u$, then changing the $i$'th bit to $0$ would also not change the value of $f_u$. This means $P \setminus \{x_i\}$ would also be an implicant of $f$, contradicting the fact that $P$ was a prime implicant. The case when $x_i$ appears as a negated literal in $P$ is similar. Thus we have $s_u(f) \geq p$. A similar argument can be made for prime implicates as well, showing that $s_u(f) \geq q$. This proves the first inequality in the theorem.
To prove the last inequality, let $x$ be any input to $f_u$. Observe that if $f_u(x) = 0$, then the prover can pick an implicate (of size at most $q$) that has all literals set to $0$ in $x$, and reveal those values to the verifier. If $f_u(x) = 1$, then the prover reveals an implicant (of size at most $p$) that is set to $1$ in $x$. If $f_u(x) = u$, then there must exist inputs $y, z \in \mathrm{Res}(x)$ such that $f(y) = 0$ and $f(z) = 1$. Since both $y$ and $z$ are resolutions of $x$, it must hold that $y_i = z_i = x_i$ for all $i$ where $x_i \neq u$. I.e., $y$ and $z$ differ from $x$ only in positions where $x$ is $u$. Hence, a prime implicant that is set to $1$ in $z$ must contain a position that is $u$ in $x$. Similarly, a prime implicate that is set to $0$ in $y$ must contain a position that is $u$ in $x$. Thus, there exists a prime implicant and a prime implicate of $f$ both of which contain a literal assigned $u$ in $x$. The prover reveals the bits in such a prime implicant and prime implicate. Since every prime implicant and every prime implicate have at least one common variable, the prover reveals only $p + q - 1$ values. Why should this convince the verifier? Since $y$, $z$, and $x$ coincide on positions where $x$ has $0$ or $1$, it is possible to set the $u$'s in the revealed prime implicant to values that make the function output $1$. Similarly, it is possible to make the function output $0$ by filling in the values to the positions in the prime implicate revealed. Thus, the verifier can check that there are indeed two valid resolutions that give different outputs.
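As a small sanity check of the first inequality (ours, reusing the brute-force `hazard_free` evaluator sketched in the introduction): for $\mathrm{OR}_3$ the largest prime implicate is $\{x_1, x_2, x_3\}$ of size $q = 3$, and the input constructed from it in the proof is $(0,0,0)$, where every position is $u$-sensitive.

```python
# Checking s_u(f) >= q on OR_3: changing any bit of (0, 0, 0) to "u" (or 1)
# flips the output of the hazard-free extension, so all 3 positions count.
def u_sensitivity_at(f, x):
    base = hazard_free(f, x)
    count = 0
    for i in range(len(x)):
        for b in (0, 1, "u"):
            if b != x[i] and hazard_free(f, x[:i] + [b] + x[i+1:]) != base:
                count += 1
                break
    return count

print(u_sensitivity_at(lambda y: int(any(y)), [0, 0, 0]))  # 3
```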
Remark 20.
Notice that Theorem 19 shows that in our setting, the parameters sensitivity, block sensitivity, and certificate complexity are equivalent (up to a multiplicative constant) to the size of a largest prime implicant/prime implicate. In the Boolean world, for certificate complexity we get a tighter characterization in terms of CNF/DNF width, which is analogous to prime implicate/prime implicant size here. On the other hand, in the Boolean world, there is a quadratic separation between sensitivity and block sensitivity [22].
We now proceed to show that, similar to their Boolean counterparts, the deterministic $u$-query complexity is polynomially upper bounded by $u$-sensitivity, and deterministic, randomized, and quantum $u$-query complexities are all polynomially related to each other. We do this in two parts:
- (i) $D_u(f) = O(s_u(f)^2)$.
- (ii) $R_u(f) = \Omega(bs_u(f))$, $Q_u(f) = \Omega(\sqrt{bs_u(f)})$.
We start by showing $D_u(f) \leq bs_u(f) \cdot C_u(f)$. Algorithm 1 is a deterministic query algorithm that achieves this bound, as shown in Theorem 24; a high-level sketch is given below. Following this, we derive (ii) in Lemma 27. The final relationships among the three $u$-query complexities are presented in Theorem 28.
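Since Algorithm 1 itself appears in the full version [4], the following is only our own high-level sketch of the strategy analyzed below; the helper routines passed in as parameters are hypothetical and would be instantiated via the certificate machinery of Definition 16.

```python
# High-level sketch of the certificate-querying strategy behind Algorithm 1:
# repeatedly pick a u-input consistent with the answers so far and reveal a
# minimal u-certificate for it; stop as soon as the revealed bits force the
# output. Helper names are ours, not the paper's.
def query_algorithm(oracle, n, bs_u, pick_u_input, u_certificate, forced_value):
    """pick_u_input(rho): some x with f_u(x) = u consistent with rho (Obs. 21);
    u_certificate(x): domain of a minimal u-certificate of f_u at x;
    forced_value(rho): b in {0, 1, 'u'} if rho contains a b-certificate, else None."""
    rho = ["*"] * n                        # partial assignment of revealed bits
    for _ in range(bs_u):                  # bs_u(f) iterations of the for loop
        x = pick_u_input(rho)
        for i in u_certificate(x):         # at most C_u(f) queries per iteration
            if rho[i] == "*":
                rho[i] = oracle(i)
        b = forced_value(rho)
        if b is not None:                  # an if condition fires: output b
            return b
    return "u"   # Line 15: by Claim 23, only u-inputs remain consistent with rho
```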
We will need the following observations to prove correctness of the algorithm:
Observation 21.
Let $\rho$ be a partial assignment that does not contain a $b$-certificate of $f_u$ for any $b \in \{0,1\}$. Then there exists an $x \in \{0,1,u\}^n$ with $f_u(x) = u$ that is consistent with $\rho$.
Proof.
Since $\rho$ does not already contain a $b$-certificate of $f_u$ for any $b \in \{0,1\}$, this means that for each $b \in \{0,1\}$, there exists a setting to the $*$-variables in the partial assignment that yields a $b$-input to $f$. Specifically, it must be the case that there exists some assignment to the undetermined $*$'s that sets an implicant of $f$ to $1$, and some assignment that sets an implicate of $f$ to $0$. Define the string $x$ to be the same as $\rho$ except with $*$'s replaced with $u$. It can be observed that $f_u(x) = u$.
Observation 22.
Let $x \in \{0,1,u\}^n$ with $f_u(x) \in \{0,1\}$. Let $\rho$ be a minimal certificate of $f_u$ at $x$, and let its domain be $S$. Then for all $i \in S$, we have $\rho_i \neq u$.
Proof.
Assume towards a contradiction that there is an index $i$ in a minimal certificate $\rho$ for $f_u$ at $x$ with $\rho_i = u$. By the definition of a certificate, and since $f_u(x) \in \{0,1\}$, the partial assignment $\rho'$ obtained by removing $i$ from the domain of $\rho$ is such that all strings $y$ consistent with $\rho'$ satisfy $f_u(y) = f_u(x)$ (indeed, the resolutions of a string with $y_i \in \{0,1\}$ form a subset of the resolutions obtained when $y_i = u$). This shows that $\rho'$ is also a certificate for $f_u$ at $x$, contradicting the minimality of $\rho$.
Claim 23.
Suppose Algorithm 1 reaches Line 15, and let $\rho$ be the partial assignment constructed by the algorithm at that point. Then every input with $f_u$-value in $\{0,1\}$ is inconsistent with $\rho$.
Proof.
Let $t$ denote $bs_u(f)$, the number of iterations of the for loop on Line 4. Assume the algorithm reached Line 15, and let $\rho$ be the partial assignment constructed by the algorithm when it reaches Line 15.
Suppose, for the sake of contradiction, that there exists an input $z$ with $f_u(z) = 0$ that is consistent with $\rho$. Since the algorithm reached Line 15, it must be the case that the partial assignment constructed does not contain a $b$-certificate for any $b \in \{0,1\}$, because otherwise one of the if conditions between Lines 9 and 11 would have terminated the algorithm. Then using Observation 21, there must also exist a $u$-input consistent with $\rho$. Hence every time Line 5 was executed, there was indeed an $x$ with $f_u(x) = u$ consistent with the answers received so far. Suppose the $u$-certificates used during the run of the for loop on Line 4 were $\alpha_1, \ldots, \alpha_t$, and their respective domains were $S_1, \ldots, S_t$.
The fact that neither of the if conditions fired means that every time the oracle was queried, the replies differed from the $u$-certificate being queried in at least one index each time. Let $B_j \subseteq S_j$ be the set of positions in $S_j$ where $\rho$ differs from $\alpha_j$. By the observations above, each $B_j$ is non-empty. Observe that since $\alpha_{j+1}$ is chosen to be consistent with the answers received so far, it must be the case that $\alpha_{j+1}$ and $\rho$ agree on all positions in $S_1 \cup \cdots \cup S_j$. Hence $B_{j+1}$ is disjoint from $B_1 \cup \cdots \cup B_j$. With the same reasoning, we can conclude that $B_j$ is disjoint from every $B_{j'}$ where $j' \neq j$.
Observe that for each $j \in [t]$, there is a setting to the bits in $B_j$ such that the revealed string becomes consistent with a $u$-input: simply take the setting of these bits from $\alpha_j$, which is a $u$-certificate.
Since $z$ is consistent with $\rho$, it agrees with $\rho$ on all positions where $\rho$ is not $*$. This means the previous observation holds for $z$ too. That is, for each $j \in [t]$, there exists a string $z^{(j)}$ with $f_u(z^{(j)}) = u \neq f_u(z)$ that differs from $z$ at exactly the positions in $B_j$. Recall that $f_u(z) = 0$, and hence the sets $B_1, \ldots, B_t$ form a collection of disjoint sensitive blocks for $f_u$ at $z$. Further, since the algorithm has not found a $0$-certificate (or $1$-certificate) yet, it must be the case that there is some $u$-input consistent with $\rho$ by Observation 21. This means there is yet another disjoint block (supported on positions outside the domain of $\rho$) that is sensitive for $f_u$ at $z$. But this is a contradiction since the maximum, over all inputs of $f_u$, number of disjoint sensitive blocks is $bs_u(f) = t$.
A nearly identical proof can be used to show that every $1$-input is inconsistent with $\rho$.
Theorem 24.
Algorithm 1 correctly computes $f_u$, and makes at most $bs_u(f) \cdot C_u(f)$ queries. Thus $D_u(f) \leq bs_u(f) \cdot C_u(f) = O(s_u(f)^2)$.
Proof.
If the algorithm outputs a value in $\{0,1\}$, then it must have passed the corresponding if condition (either in Line 11, or in Line 13) and is trivially correct. If the algorithm outputs $u$, then either the if condition on Line 9 must have passed, or Line 15 must have been reached. In the former case, the correctness of the algorithm is trivial. In the latter case, from Claim 23, we conclude that every $0$-input and every $1$-input of $f_u$ must be inconsistent with the partial assignment $\rho$ constructed by the algorithm when it arrives at Line 15. This means that every $y \in \{0,1,u\}^n$ that is consistent with $\rho$ (in particular, the unknown input $x$) must satisfy $f_u(y) = u$, which concludes the proof of correctness.
The for loop runs for $bs_u(f)$ iterations, and at most $C_u(f)$ bits are queried in each iteration.
We can now conclude that deterministic, randomized, and quantum -query complexities are all polynomially related. We use Yao’s minimax principle [25] and the adversary method for lower bounds on quantum query complexity due to Ambainis [2]. We state them below for convenience.
Lemma 25 (Yao’s minimax principle).
For finite sets $X$ and $Y$ and a function $h : X \to Y$, we have $R(h) \geq c$ if and only if there exists a distribution $\mu$ on $X$ such that $D_\mu(h) \geq c$. Here, $D_\mu(h)$ is the minimum depth of a deterministic decision tree that computes $h$ to error at most $1/3$ when inputs are drawn from the distribution $\mu$.
Theorem 26 ([2, Theorem 5.1]).
Let $\Sigma$ and $Y$ be finite sets, let $n$ be a positive integer, and let $h : \Sigma^n \to Y$ be a function. Let $A, B \subseteq \Sigma^n$ be two sets of inputs such that $h(a) \neq h(b)$ for all $a \in A$ and $b \in B$. Let $R \subseteq A \times B$ be such that
- for every $a \in A$, there are at least $m$ different $b \in B$ with $(a, b) \in R$,
- for every $b \in B$, there are at least $m'$ different $a \in A$ with $(a, b) \in R$,
- for every $a \in A$ and $i \in [n]$, there are at most $\ell$ different $b \in B$ such that $(a, b) \in R$ and $a_i \neq b_i$,
- for every $b \in B$ and $i \in [n]$, there are at most $\ell'$ different $a \in A$ such that $(a, b) \in R$ and $a_i \neq b_i$.
Then $Q(h) = \Omega\left(\sqrt{\frac{m \cdot m'}{\ell \cdot \ell'}}\right)$.
We lower bound randomized and quantum $u$-query complexities polynomially by $u$-block sensitivity.
Lemma 27.
Let $f$ be a Boolean function. Then, $R_u(f) = \Omega(bs_u(f))$ and $Q_u(f) = \Omega(\sqrt{bs_u(f)})$.
Proof.
We use Lemma 25 for the randomized lower bound. Let $z \in \{0,1,u\}^n$ be such that $bs_u(f, z) = bs_u(f) =: b$, with corresponding sensitive blocks $B_1, \ldots, B_b$. Define a distribution $\mu$ on $\{0,1,u\}^n$ as follows: $\mu(z) = 1/2$, and $\mu(z^{B_i}) = \frac{1}{2b}$ for each $i \in [b]$.
Towards a contradiction, let $T$ be a deterministic decision tree of cost less than $b/6$ that computes $f_u$ to error at most $1/3$ under the input distribution $\mu$. Let $v$ denote the leaf of $T$ reached by the input $z$. There are now two cases:
- If the output at $v$ is not equal to $f_u(z)$, then $T$ errs on $z$, which contributes an error of $1/2$ under $\mu$, which is a contradiction.
- If the output at $v$ equals $f_u(z)$, since the number of queries on this path is less than $b/6$, there must exist at least $5b/6$ many blocks $B_i$ such that no variable of $B_i$ is read on input $z$ (observe that in this case, $z^{B_i}$ also reaches the leaf $v$). Since $f_u(z^{B_i}) \neq f_u(z)$, this means $T$ makes an error on each of these $z^{B_i}$'s, contributing a total error of at least $\frac{5b/6}{2b} = \frac{5}{12} > \frac{1}{3}$ under $\mu$, which is a contradiction.
This concludes the randomized lower bound.
For the quantum lower bound we use Theorem 26. Define $A = \{z\}$ and $B = \{z^{B_1}, \ldots, z^{B_b}\}$, and take the relation $R = A \times B$. From Theorem 26 we have $m = b$ and $m' = 1$, and $\ell = \ell' = 1$ (since each index appears in at most 1 block, as the blocks are disjoint). Theorem 26 then implies $Q_u(f) = \Omega\left(\sqrt{\frac{m \cdot m'}{\ell \cdot \ell'}}\right) = \Omega(\sqrt{bs_u(f)})$.
Theorem 28.
For all Boolean functions $f$, we have $D_u(f) = O(R_u(f)^2)$ and $D_u(f) = O(Q_u(f)^4)$.
Proof.
By Theorem 24, $D_u(f) \leq bs_u(f) \cdot C_u(f)$, and by Theorem 19 this is $O(bs_u(f)^2)$. Lemma 27 now gives $D_u(f) = O(R_u(f)^2)$ and $D_u(f) = O(Q_u(f)^4)$.
4 Relationships
We start by proving that for monotone functions, the presence of uncertainty does not make computation significantly harder. Similar relationships are known for monotone combinational circuits w.r.t. containing metastability [14].
Proof of Lemma 4.
Any query algorithm for $f_u$ also computes $f$ with at most the same cost, and hence $D(f) \leq D_u(f)$. Similarly, we have $R(f) \leq R_u(f)$ and $Q(f) \leq Q_u(f)$.
We start with a best (deterministic/randomized/quantum) query algorithm $\mathcal{A}$ for $f$, and an oracle holding an input $x \in \{0,1,u\}^n$ for $f_u$. Now, for $b \in \{0,1\}$, we define $\mathcal{A}_b$ to be the same algorithm as $\mathcal{A}$, but whenever $\mathcal{A}$ queries the $i$'th bit of its input, it performs the following operation instead:
1. Query the $i$'th bit of $x$; denote the outcome by $a$.
2. If $a \neq u$, return $a$.
3. If $a = u$, return $b$.
In the case of quantum query complexity, note that this operation can indeed be implemented quantumly making $2$ queries to $O_x$. The initial query performs the instructions described above, and the second query uncomputes the values from the tuple we don’t need for the remaining part of the computation. Note here that by definition, the returned bit is a fixed function of $x_i$ and $b$, and thus a second oracle call suffices to erase the copy of $x_i$, which is what we need to implement for the uncomputation operations.
Let $U \subseteq [n]$ be the positions that have $u$ in $x$. Recall that $x^{U \to 0}$ (and $x^{U \to 1}$) is the input that has all the $u$'s in $x$ replaced by $0$s (by $1$s, respectively). Run $\mathcal{A}_0$ and $\mathcal{A}_1$ (possibly repeated constantly many times each to boost the correctness probability) to determine the values of $f(x^{U \to 0})$ and $f(x^{U \to 1})$ with high probability. If $f(x^{U \to 0}) = 1$, then we output $1$. Else if $f(x^{U \to 1}) = 0$, we output $0$. Else we have $f(x^{U \to 0}) = 0$ and $f(x^{U \to 1}) = 1$, and we output $u$.
Correctness.
First observe that for $b \in \{0,1\}$, all answers to queries in the algorithm $\mathcal{A}_b$ are consistent with the input $x^{U \to b}$. Next observe that in the poset (equivalently “subcube”) formed by the resolutions of $x$, the inputs $x^{U \to 0}$ and $x^{U \to 1}$ form the bottom and top elements respectively. Since $f$ is monotone, if $f(x^{U \to 0}) = 1$, we can conclude that $f$ is $1$ on all resolutions of $x$. Similarly, when $f(x^{U \to 1}) = 0$, it must be the case that $f$ is $0$ on all inputs in the poset. The remaining case is when $f(x^{U \to 0}) = 0$ and $f(x^{U \to 1}) = 1$. In this case, the inputs $x^{U \to 0}$ and $x^{U \to 1}$ are themselves resolutions of $x$ with different evaluations of $f$, and hence the algorithm correctly outputs $u$.
By standard boosting of the success probability using a Chernoff bound, by repeating $\mathcal{A}_0$ ($\mathcal{A}_1$) constantly many times and taking a majority vote of the answers, we can ensure that the correctness probability of $\mathcal{A}_0$ ($\mathcal{A}_1$) is large enough, say at least 0.9. Thus, the algorithm described above has correctness probability at least $0.81 > 2/3$, and its cost is at most a constant times the cost of $\mathcal{A}$.
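For the deterministic case, the reduction just described amounts to the following few lines (a sketch of ours; `boolean_alg` stands for any query algorithm for $f$ that accesses its input only through the supplied query function):

```python
# Lemma 4's reduction: answer u-queries with 0 in one run and with 1 in a
# second run, so the two runs evaluate f on the bottom and top resolutions.
def monotone_hazard_free(boolean_alg, x):
    run = lambda b: boolean_alg(lambda i: b if x[i] == "u" else x[i])
    lo, hi = run(0), run(1)    # f(x with u -> 0) and f(x with u -> 1)
    if lo == 1:                # monotone: f is 1 on every resolution of x
        return 1
    if hi == 0:                # monotone: f is 0 on every resolution of x
        return 0
    return "u"                 # the two extreme resolutions disagree

# Usage with a simple query algorithm for OR_3:
or_alg = lambda query: int(any(query(i) for i in range(3)))
print(monotone_hazard_free(or_alg, [0, "u", 0]))  # u
print(monotone_hazard_free(or_alg, [0, "u", 1]))  # 1
```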
It is easy to prove using Theorem 19 that the $u$-query complexity of $\mathrm{MUX}_k$ is exponentially larger than its query complexity. We claim that $s_u(\mathrm{MUX}_k) \geq 2^k$. Consider the input where all selector bits are $u$ and all data bits are $1$. Flipping any data bit to zero changes the output from $1$ to $u$. In the following theorem, we prove a stronger statement. A function is said to be evasive if its query complexity is the maximum possible, i.e., equal to the number of input bits. We first prove Theorem 6, i.e., the hazard-free extension of $\mathrm{MUX}_k$ is evasive for all $k$. In fact, we prove a more general theorem. We consider decision trees for the hazard-free extension of $\mathrm{MUX}_k$ that are guaranteed to produce the correct output when the number of unstable values in the input is at most $j$, for an arbitrary $j$. We defer the proof of the following theorem to the full version [4].
Theorem 29.
Let $0 \leq j \leq k + 2^k$. Any optimal-depth decision tree that correctly computes the hazard-free extension of $\mathrm{MUX}_k$ on inputs with at most $j$ unstable values has depth exactly $k + 2^{\min\{j, k\}}$.
5 Decision Tree Size and Limited Uncertainty
We show that despite requiring exponentially more depth in the presence of uncertainty, we can compute the hazard-free extension of the $\mathrm{MUX}$ function using a decision tree that is only quadratically larger in size than the Boolean one (Theorem 8).
However, there are functions that require exponentially larger decision tree size in the presence of uncertainty, such as the function from Theorem 9. We defer the proofs of these statements to the full version [4]. The size lower bound technique used here for both functions involves constructing a set of inputs that must all lead to different leaves; hence, any decision tree that computes the hazard-free extension correctly requires size at least as large as the size of this input set. The upper bound is shown by an explicit construction.
We present a general construction of decision trees for hazard-free extensions of functions from a decision tree of the underlying Boolean function (Theorem 10) in the full version [4] of this paper. The main idea is to construct the $u$-subtree of the root node in the new decision tree from the $0$- and $1$-subtrees that have been constructed recursively. The $u$-subtree construction can be viewed as a product construction where we replace each leaf in a copy of the $0$-subtree with a modified copy of the $1$-subtree. This product construction works because for every input that reaches the $u$-subtree, the output value can be determined from the output values when that bit is $0$ and when it is $1$, which is exactly what the $0$- and $1$-subtrees compute.
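Reusing the dict-based tree representation from Section 2, the construction can be sketched as follows (ours; it favors clarity over the size bookkeeping done in the actual proof):

```python
# Product construction for the u-subtree: convert the 0- and 1-subtrees
# recursively, then splice a relabeled copy of the converted 1-subtree into
# every leaf a of the converted 0-subtree (leaf b becomes a if a == b, else u).
def to_hazard_free(tree):
    if not isinstance(tree, dict):                  # a leaf needs no work
        return tree
    t0 = to_hazard_free(tree["children"][0])
    t1 = to_hazard_free(tree["children"][1])
    return {"var": tree["var"],
            "children": {0: t0, 1: t1, "u": product(t0, t1)}}

def product(t0, t1):
    if not isinstance(t0, dict):                    # leaf a: splice in t1
        return relabel(t1, t0)
    return {"var": t0["var"],
            "children": {e: product(c, t1) for e, c in t0["children"].items()}}

def relabel(tree, a):
    if not isinstance(tree, dict):
        return tree if tree == a else "u"
    return {"var": tree["var"],
            "children": {e: relabel(c, a) for e, c in tree["children"].items()}}
```

For instance, applying `to_hazard_free` to the two-node Boolean tree for $x_0 \wedge x_1$ yields a ternary tree that agrees with the hazard-free AND tree shown in Section 2.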
We also prove in Theorem 12 that the complexity increases only gradually along with the increase in the amount of uncertainty in the inputs. More specifically, we prove that if the inputs are guaranteed to have at most $k$ bits with value $u$, without any guarantee on where they occur, the exponential blow-up in query complexity is contained to the parameter $k$ (see the full version [4] for a proof). This proof also makes use of the product construction mentioned above.
6 Discussion and Open Problems
In this paper we initiated a study of the query complexity of Boolean functions in the presence of uncertainty, by considering a natural generalization of Boolean functions to Kleene’s strong three-valued logic of indeterminacy.
We showed that an analogue of the celebrated sensitivity theorem [13] holds in the presence of uncertainty too. While Huang showed a fourth-power relationship between sensitivity and block sensitivity in the Boolean world, we were able to show that these measures are linearly related in the presence of uncertainty. The proof of the sensitivity theorem in our setting is considerably different from, and easier than, the proof of the sensitivity theorem in the Boolean world. We can parameterize $u$-sensitivity and $u$-query complexity by restricting our attention to inputs that have at most $k$ unstable bits. The setting $k = 0$ gives us the Boolean sensitivity theorem and the setting $k = n$, our sensitivity theorem. Can we unify these two proofs using this parameterization? That is, is there a single proof for the sensitivity theorem that works for all $k$?
We showed using $u$-analogues of block sensitivity, sensitivity, and certificate complexity that for every Boolean function $f$, its deterministic, randomized, and quantum $u$-query complexities are polynomially related to each other. An interesting research direction would be to determine the tightest possible separations between all of these measures. It is interesting to note that our quadratic relationship between deterministic and randomized $u$-query complexity improves upon the best-known cubic relationship in the usual query models. Moreover, our quartic deterministic-quantum relationship matches the best-known relationship in the Boolean world [1]. More generally, it would be interesting to see best-known relationships between combinatorial measures of Boolean functions in this model, and see how they compare to the usual query model (see, for instance, [1, Table 1]). A linear relationship between deterministic and randomized query complexities in the presence of uncertainty remains open, but a quadratic deterministic-quantum separation follows from Theorem 6 or from the OR function (via Grover’s search algorithm [12]).
While we studied an important extension of Boolean functions to a specific three-valued logic that has been extensively studied in various contexts, an interesting future direction is to consider query complexities of Boolean functions on other interesting logics. Our definition of $f_u$ dictates that $f_u(x) = b$ for $b \in \{0,1\}$ iff $f(y) = b$ for all $y \in \mathrm{Res}(x)$, and $f_u(x) = u$ otherwise. A natural variant is to define a 0-1 valued function that outputs $b$ iff a majority of the values $f(y)$ over all $y \in \mathrm{Res}(x)$ equals $b$. It is not hard to show that the complexity of this variant is bounded from below by the usual query complexity of Majority on $n$ variables, which is $\Theta(n)$ in the deterministic, randomized, and quantum query models.
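For concreteness, the variant just described can be prototyped by a one-line change to the brute-force evaluator from the introduction (ours; how ties between $0$ and $1$ are broken would have to be fixed by the definition):

```python
# Majority variant: output the majority value of f over all resolutions of x,
# in contrast to hazard_free, which demands unanimity.
def majority_extension(f, x):
    vals = [f(y) for y in resolutions(x)]
    return int(2 * sum(vals) > len(vals))   # ties broken towards 0 here

f_or = lambda y: int(any(y))
print(majority_extension(f_or, ["u", "u", "u"]))  # 1: 7 of 8 resolutions are 1
```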
References
- [1] Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal. Degree vs. approximate degree and quantum implications of Huang’s sensitivity theorem. In Proceedings of 53rd Annual Symposium on Theory of Computing (STOC 2021), pages 1330–1342, 2021. doi:10.1145/3406325.3451047.
- [2] Andris Ambainis. Quantum Lower Bounds by Quantum Arguments. J. Comput. Syst. Sci., 64(4):750–767, 2002. doi:10.1006/JCSS.2002.1826.
- [3] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, and Ronald de Wolf. Quantum Lower Bounds by Polynomials. J. ACM, 48(4):778–797, 2001. doi:10.1145/502090.502097.
- [4] Deepu Benson, Balagopal Komarath, Nikhil Mande, Sai Soumya Nalli, Jayalal Sarma, and Karteek Sreenivasaiah. Sensitivity and Query Complexity under Uncertainty, 2025. arXiv:2507.00148.
- [5] Deepu Benson, Balagopal Komarath, Jayalal Sarma, and Nalli Sai Soumya. Hazard-free decision trees. CoRR, abs/2501.00831, 2025. doi:10.48550/arXiv.2501.00831.
- [6] Harry Buhrman and Ronald de Wolf. Complexity measures and decision tree complexity: a survey. Theor. Comput. Sci., 288(1):21–43, 2002. doi:10.1016/S0304-3975(01)00144-X.
- [7] Stephen Cook and Cynthia Dwork. Bounds on the time for parallel RAM’s to compute simple functions. In Proceedings of the 14th Annual Symposium on Theory of Computing (STOC 1982), pages 231–233, 1982. doi:10.1145/800070.802196.
- [8] Stephen Cook, Cynthia Dwork, and Rüdiger Reischuk. Upper and Lower Time Bounds for Parallel Random Access Machines without Simultaneous Writes. SIAM Journal on Computing, 15(1):87–97, 1986. doi:10.1137/0215006.
- [9] Andrzej Ehrenfeucht and David Haussler. Learning Decision Trees from Random Examples. Inf. Comput., 82(3):231–246, 1989. doi:10.1016/0890-5401(89)90001-1.
- [10] Steven Fortune and James Wyllie. Parallelism in Random Access Machines. In Proceedings of the 10th Annual Symposium on Theory of Computing (STOC 1978), pages 114–118, 1978. doi:10.1145/800133.804339.
- [11] Parikshit Gopalan, Noam Nisan, Rocco A. Servedio, Kunal Talwar, and Avi Wigderson. Smooth Boolean Functions Are Easy: Efficient Algorithms for Low-Sensitivity Functions. In Proceedings of the Conference on Innovations in Theoretical Computer Science (ITCS 2016), pages 59–70, 2016. doi:10.1145/2840728.2840738.
- [12] Lov K Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the 28th Annual Symposium on Theory of Computing (STOC 1996), pages 212–219, 1996. doi:10.1145/237814.237866.
- [13] Hao Huang. Induced subgraphs of hypercubes and a proof of the Sensitivity Conjecture. Annals of Mathematics, 190(3):949–955, 2019. doi:10.4007/annals.2019.190.3.6.
- [14] Christian Ikenmeyer, Balagopal Komarath, Christoph Lenzen, Vladimir Lysikov, Andrey Mokhov, and Karteek Sreenivasaiah. On the Complexity of Hazard-free Circuits. Journal of the ACM (JACM), 66(4):1–20, 2019. doi:10.1145/3320123.
- [15] Christian Ikenmeyer, Balagopal Komarath, and Nitin Saurabh. Karchmer-Wigderson Games for Hazard-Free Computation. In Proceedings of 14th Innovations in Theoretical Computer Science Conference (ITCS 2023), volume 251, pages 74:1–74:25, 2023. doi:10.4230/LIPIcs.ITCS.2023.74.
- [16] Stasys Jukna. Notes on Hazard-Free Circuits. SIAM J. Discret. Math., 35(2):770–787, 2021. doi:10.1137/20M1355240.
- [17] Stephen Cole Kleene. Introduction to Metamathematics. P. Noordhoff N.V., Groningen, 1952.
- [18] Nikhil S. Mande and Karteek Sreenivasaiah. Query Complexity with Unknowns. CoRR, abs/2412.06395, 2024. doi:10.48550/arXiv.2412.06395.
- [19] Ron van der Meyden. Logical Approaches to Incomplete Information: A Survey. In Logics for Databases and Information Systems, pages 307–356. Kluwer, 1998.
- [20] Noam Nisan. CREW PRAMs and Decision Trees. SIAM J. Comput., 20(6):999–1007, 1991. doi:10.1137/0220062.
- [21] Noam Nisan and Mario Szegedy. On the Degree of Boolean Functions as Real Polynomials. Comput. Complex., 4:301–313, 1994. doi:10.1007/BF01263419.
- [22] David Rubinstein. Sensitivity vs. Block Sensitivity of Boolean functions. Combinatorica, 15(2):297–299, 1995. doi:10.1007/BF01200762.
- [23] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning - From Theory to Algorithms. Cambridge University Press, 2014. URL: http://www.cambridge.org/de/academic/subjects/computer-science/pattern-recognition-and-machine-learning/understanding-machine-learning-theory-algorithms.
- [24] Ingo Wegener. The Critical Complexity of All (Monotone) Boolean Functions and Monotone Graph Properties. Inf. Control., 67(1-3):212–222, 1985. doi:10.1016/S0019-9958(85)80036-X.
- [25] Andrew Chi-Chih Yao. Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symposium on Foundations of Computer Science (SFCS 1977), pages 222–227. IEEE Computer Society, 1977.
