
Randomized Lifting to Semi-Structured Communication Complexity via Linear Diversity

Vladimir Podolskii, Tufts University, Medford, MA, USA
Alexander Shekhovtsov, Moscow Institute of Physics and Technology, Russia
Abstract

We study query-to-communication lifting. The major open problem in this area is to prove a lifting theorem for gadgets of constant size. The recent paper [2] introduces semi-structured communication complexity, in which one of the players can only send parities of their input bits. It shows that for any $m \ge 4$ the deterministic decision tree complexity of a function $f$ can be lifted to the so-called semi-structured communication complexity of $f \circ \mathrm{Ind}_m$, where $\mathrm{Ind}_m$ is the Indexing gadget.

As our main contribution, we extend these results to the randomized setting. Our results also apply to a substantially larger set of gadgets. More specifically, we introduce a new complexity measure of gadgets, linear diversity. For all gadgets $g$ with non-trivial linear diversity we show that the randomized decision tree complexity of $f$ lifts to the randomized semi-structured communication complexity of $f \circ g$. In particular, this gives tight lifting results for the Indexing gadget $\mathrm{Ind}_m$, the Inner Product gadget $\mathrm{IP}_m$ for all $m \ge 2$, and for the Majority gadget $\mathrm{MAJ}_m$ for all $m \ge 4$. We prove the same results for the deterministic case.

From our result it immediately follows that deterministic/randomized decision tree complexity lifts to deterministic/randomized parity decision tree complexity. For the randomized case this is the first result of this type. For the deterministic case, our result improves the bound in [6] for the Inner Product gadget.

To obtain our results we introduce a new secret sets approach to the simulation of semi-structured communication protocols by decision trees. It allows us to simulate (restricted classes of) communication protocols on the truly uniform distribution of inputs.

Keywords and phrases:
communication complexity, decision trees, lifting
Copyright and License:
© Vladimir Podolskii and Alexander Shekhovtsov; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Communication complexity; Theory of computation → Oracles and decision trees
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2024/199/
Editors:
Raghu Meka

1 Introduction

In recent years numerous results emerged that lift the complexity of a function in a weak model of computation to the complexity of a modified version of the function in a stronger model of computation [10, 13, 5, 11, 4, 14]. These results proved to be extremely useful for solving open problems in various areas of computational complexity [17, 18, 16, 9, 8]. In results of this type we start with a function $f\colon \{0,1\}^n \to \{0,1\}$ that is hard for a weak computation model (like decision trees), and for a gadget $g\colon \{0,1\}^m \to \{0,1\}$ we consider the function $f \circ g^n\colon \{0,1\}^{nm} \to \{0,1\}$ in which we substitute each variable of $f$ by an output of $g$ applied to fresh variables. The goal is to show that the resulting function $f \circ g^n$ is hard for a strong computation model (like communication complexity or Boolean formula complexity). We would like the result to hold for as simple a gadget $g$ as possible.

In this paper we are mostly interested in lifting from decision tree complexity to communication complexity, which is sometimes called query-to-communication lifting. This particular type of lifting has seen numerous results [17, 10, 12, 5, 11, 4]. In these papers results of this type were obtained and gradually improved for both the deterministic and randomized cases and for a gradually increasing set of possible gadgets. The size of the gadget is a parameter that is important for applications. The smallest known gadget size for both the deterministic and randomized cases is logarithmic. For the deterministic case, the results for gadgets of logarithmic size were obtained in [5] and [20]. For the randomized case, the first result was obtained in [11] with a gadget of polynomial size. The paper [4] reduced the gadget size for the randomized case to logarithmic. Obtaining lifting results with gadgets of constant size remains a major open problem.

One possible approach to this problem is to address lifting to restricted models of communication, or even to simpler computational models, and to try to obtain lifting with a constant-size gadget in this setting. Some progress in this direction was obtained in the recent independent papers [2, 6].

The paper [6] shows lifting from deterministic decision tree complexity $D^{dt}(f)$ to deterministic parity decision tree complexity $D^{\oplus dt}(f \circ g)$ for a wide range of gadgets $g$, including gadgets of constant size. More specifically, for each gadget $g$ they introduce a stifling complexity measure $k$ and show that $D^{\oplus dt}(f \circ g) = \Omega(k \cdot D^{dt}(f))$. In particular, from their result it follows that $D^{\oplus dt}(f \circ \mathrm{Ind}_m) = \Omega(\log m \cdot D^{dt}(f))$ and $D^{\oplus dt}(f \circ \mathrm{IP}_m) = \Omega(D^{dt}(f))$ for any positive $m$, where $\mathrm{Ind}_m$ and $\mathrm{IP}_m$ are the Indexing and Inner Product gadgets that are among the most standard in this field (see Subsection 2.3 for the definitions of these functions).

The paper [2] introduces semi-structured communication complexity. In this model one of the players is allowed to send only parities of their input bits. This model is restricted compared to the regular communication complexity model, but is more powerful (up to a factor of 2) than parity decision trees, as the players can easily simulate a parity decision tree with a semi-structured communication protocol. The paper [2] shows lifting from deterministic decision trees to semi-structured communication protocols with the $\mathrm{Ind}_k$ gadget for any $k \ge 4$.

Our results

We show that for a wide range of gadgets (including constant size) lifting is possible from randomized decision trees to randomized semi-structured communication complexity.

More specifically, we introduce a complexity measure of gadgets, linear diversity. Informally, it is equal to the number of distinct (up to negation) non-constant linear functions (over $\mathbb{F}_2$) in Bob's variables that we can obtain by fixing Alice's variables. We observe that the linear diversity of $\mathrm{Ind}_m$ is $m$ and the linear diversity of $\mathrm{IP}_m$ is $2^m - 1$.
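As a concrete illustration (ours, not notation from the paper), the linear diversity of a small gadget can be computed by brute force, encoding Bob's inputs as bit masks:

```python
def linear_diversity(g, alice_domain, m):
    """Count the distinct (up to negation) non-constant F_2-linear functions
    among the rows g(x, .), with Bob's input y ranging over m-bit masks."""
    found = set()
    for x in alice_domain:
        row = [g(x, y) for y in range(2 ** m)]
        c = row[0]            # the row is (negated) linear iff row XOR c is linear
        S = 0                 # candidate parity set: bit j set iff row[e_j] != row[0]
        for j in range(m):
            if row[1 << j] ^ c:
                S |= 1 << j
        if S and all(row[y] ^ c == bin(y & S).count("1") % 2 for y in range(2 ** m)):
            found.add(S)      # ignoring c makes the count "up to negation"
    return len(found)

# Ind_m has linear diversity m; IP_m has linear diversity 2^m - 1:
ind = lambda i, y: (y >> i) & 1
ip = lambda x, y: bin(x & y).count("1") % 2
m = 3
assert linear_diversity(ind, range(m), m) == m
assert linear_diversity(ip, range(2 ** m), m) == 2 ** m - 1
```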

We show that for any function (or relation) $f$, any $k \ge 2$, and any gadget $g$ with linear diversity $k$, the randomized semi-structured communication complexity of $f \circ g$ is at least $\Omega(\log k \cdot R^{dt}(f))$, where $R^{dt}(f)$ denotes the minimal depth of a randomized decision tree computing $f$. In particular, our result applies to gadgets of constant size. Our result gives tight bounds for both the $\mathrm{Ind}_m$ and $\mathrm{IP}_m$ gadgets. When lifting to randomized parity decision trees, our result also gives tight bounds for the $\mathrm{MAJ}_m$ gadget, using a trick described in Subsection 3.4.

Similarly to [6], we extend our result to the same lower bound for the logarithm of the size of randomized semi-structured communication protocols, and to a version of communication complexity in which Bob is allowed to send indicator functions of affine subspaces of his input bits.

Although our technique (see below) is designed specifically for the randomized case, we translate our results to the deterministic case as well. Compared to [2], the deterministic version of our results applies to a much wider range of gadgets.

As an immediate corollary, we have the same results (both randomized and deterministic) for lifting to parity decision trees. Compared to [6], the deterministic version of our result for parity decision trees uses the linear diversity complexity measure instead of stifling. We discuss the comparison between these two measures below.

Our results can be used to obtain lower bounds on the randomized parity decision tree complexity of Boolean functions. We provide a couple of examples of bounds we can obtain.

Linear diversity vs. stifling

A function $g\colon \{0,1\}^m \to \{0,1\}$ is $k$-stifled if for any subset $S \subseteq [m]$ of $k$ input variables and any bit $b$, we can fix all the other variables in such a way that $g$ evaluates to $b$ no matter what the values of the variables in $S$ are.

The binary logarithm of the linear diversity measure is larger than the stifling measure at least for some functions. A notable example is the $\mathrm{IP}_m$ function, a common gadget in lifting results: its linear diversity is maximal, while its stifling is just 1.
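For comparison, the following brute-force check (again our own illustration, with the variable order chosen by us) verifies stifling directly from the definition; it confirms that $\mathrm{IP}_2$, viewed as a function of all four of its bits, is 1-stifled but not 2-stifled:

```python
from itertools import combinations, product

def is_k_stifled(g, m, k):
    """Check k-stifling by brute force: for every k-subset S of variables and
    every target bit b, some fixing of the remaining variables forces g to b."""
    for S in combinations(range(m), k):
        for b in (0, 1):
            forced = False
            for rest in product((0, 1), repeat=m - k):
                def point(free):
                    y, it = [0] * m, iter(rest)
                    for j in range(m):
                        y[j] = free[S.index(j)] if j in S else next(it)
                    return y
                if all(g(point(free)) == b for free in product((0, 1), repeat=k)):
                    forced = True
                    break
            if not forced:
                return False
    return True

# IP_2 on the variable order (x1, y1, x2, y2):
ip2 = lambda v: (v[0] & v[1]) ^ (v[2] & v[3])
assert is_k_stifled(ip2, 4, 1) and not is_k_stifled(ip2, 4, 2)
```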

Our techniques

The proof of our results builds on the so-called simulation argument in the style of [11, 4]. In this argument, given a randomized communication protocol $\mathbf{\Pi}$ of cost $d$ computing $f \circ g^n$, we build a randomized decision tree $\mathbf{T}$ of depth $O(d/\log k)$ computing $f$. The tree $\mathbf{T}$ simulates $\mathbf{\Pi}$, querying the necessary information. Next we describe the simulation argument and then explain the new ideas.

For convenience we introduce the following notation:

  • The gadget: it is convenient to consider gadgets of the form $g\colon [k] \times \{0,1\}^m \to \{0,1\}$, where $[k]$ corresponds to the inputs of Alice, $\{0,1\}^m$ corresponds to the inputs of Bob, and the linear diversity of $g$ is $k$; that is, for convenience, we assume that for any fixed first input of $g$ the resulting function of the second input is linear.

  • The collection of all gadgets: $G := g^n\colon [k]^n \times (\{0,1\}^m)^n \to \{0,1\}^n$.

  • The input to the $i$-th gadget: $(x_i, y_i) \in [k] \times \{0,1\}^m$.

  • The whole inputs of Alice and Bob: $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$.

Simulation argument.

The main idea is that $\mathbf{T}$ on input $z \in \{0,1\}^n$ simulates $\mathbf{\Pi}$ on a random input $(x,y)$ distributed uniformly on $G^{-1}(z)$. Since $(f \circ G)(x,y) = f(z)$, $\mathbf{T}(z)$ will output $f(z)$ as long as the simulation of $\mathbf{\Pi}$ is performed correctly.

The main difficulty in this approach is to simulate $\mathbf{\Pi}$ on $(x,y) \in G^{-1}(z)$ without knowing $z$. To overcome this problem, it was shown in [11, 4] that the uniform distribution on $G^{-1}(z)$ does not differ substantially (from the perspective of the players) from the uniform distribution on all inputs. Thus, the tree $\mathbf{T}$ can instead simulate $\mathbf{\Pi}$ on $(x,y)$ distributed uniformly on $[k]^n \times (\{0,1\}^m)^n$, until $\mathbf{\Pi}$ reveals too much information about some block of inputs $(x_i, y_i)$. Once this happens for some $i$, the tree $\mathbf{T}$ queries $z_i$ and proceeds with the simulation knowing the correct distribution of the pair $(x_i, y_i)$.

However, in this approach the simulation becomes approximate. To be able to assume that the uniform distribution on $G^{-1}(z)$ does not differ too much from the uniform distribution on all inputs, we need the gadget size to be at least logarithmic in $n$ (this is needed to bound the error probability of the simulation). Thus, it is not clear how to use this approach for gadgets of smaller size.

The key idea of our approach is to simulate $\mathbf{\Pi}$ precisely on the uniform distribution on $G^{-1}(z)$. This allows us to work with gadgets of constant size. To address the issue of simulating with unknown $z$, we introduce the key ingredient of our argument, the secret sets technique.

Secret sets.

Let $S_i$ be some subset of $\{i\} \times [m]$. Assume for now that $z_i$ equals the XOR of the variables of $y$ in the positions of $S_i$. Then we can show that as long as $S_i$ is not in the linear span of the linear functions sent by Bob in $\mathbf{\Pi}$, the uniform distribution on $g^{-1}(z_i)$ is indistinguishable (by the players) from the uniform distribution on the whole $i$-th block of the input. This allows us to simulate $\mathbf{\Pi}$ as if $z_i$ were known. Once $S_i$ falls into the linear span of the functions sent by Bob, we just query $z_i$ and proceed with the simulation. To prove our bound, we must show that Bob needs to send many (about $\log_2 k$) messages in $\mathbf{\Pi}$ on average to force us to query $z_i$. Recall that $S_i$ is actually random. The intuition is that Bob must send roughly $\log_2 k$ linear functions for their linear span to capture a random $S_i$.

There are two key ingredients needed to actually prove that the secret set technique forces Bob to send many messages. Below is a brief description of them.

Narrowing Bob’s messages to blocks.

To each of Bob's messages we assign a block which will account for it. For each message assigned to the block $\{i\} \times [m]$, we remove the part of it that lies outside of $\{i\} \times [m]$. Denote by $L_i$ the set of truncated messages assigned to the $i$-th block.

Each $L_i$ will admit the following property: $S_i$ does not lie in the linear span of the messages sent by Bob as long as it does not lie in the linear span of $L_i$. At some point the property can become violated after Bob sends a message, but then we make Alice and Bob send additional messages to restore the property. More precisely, we pick a linear combination of the messages assigned to the $i$-th block that annuls $S_i$. We subtract $S_i$ from this linear combination, taken over the untruncated messages, and process this new message of Bob recursively.

Entropy.

We use the idea of fixed/unfixed blocks that was also used in previous lifting theorems utilizing entropy. At the beginning, we consider all $S_i$ to be unfixed. Note that, initially, the binary entropy of $S_i$ is $\log_2 k$, since $x_i$ is uniformly distributed on $[k]$. We want to show that $S_i$ rarely lies in the linear span of $L_i$ when a new element is added to $L_i$. The case when the entropy of $S_i$ is sufficiently larger than $|L_i|$ is favorable for us, since in this case $S_i$ does not lie in the linear span of $L_i$ with high probability. When the entropy of $S_i$ goes below that level, we fix $S_i$ (Alice sends $x_i$). Since Alice and Bob must send many messages to decrease the entropy of $S_i$ below the desired level or to increase the size of $L_i$, the number of fixed $S_i$ is small.

As another feature of our approach, we would like to mention that, unlike in the previous papers, our simulation algorithm has a very simple description: we fix Alice's input randomly and maintain the linear space generated by Bob's messages to decide when to query the $z_i$'s. Correctness of the algorithm easily follows from its description, and the most technical part of the proof shifts to the complexity analysis.

For the deterministic version of our result we again use the simulation argument in the style of [11, 4]; more specifically, our general strategy is very similar to that of [2] (with the necessary generalization to a wide range of gadgets).

Organization

The rest of the paper is organized as follows. In Section 2 we give the necessary preliminary information and introduce key notions and notation. In Section 3 we give a formulation of our results. In Subsection 3.5 we show some applications of our results. In Section 4 we introduce additional notation that is used in the proofs. In Subsection 5.1 we begin the proof of our main result and, as a first step, reformulate the theorem in a form that is convenient for the proof. In Subsection 5.2 we describe the procedure by which a decision tree simulates a communication protocol. In Subsection 5.3 we prove the correctness of the simulation. In Subsection 5.4 we prove the bound on the number of queries in the simulation (this is the most technically heavy part of the proof).

Proofs of the remaining theorems are omitted due to space constraints and are provided in the full version of the paper. The proofs of our results on lifting to the size of communication protocols and to subspace queries are achieved by a slight modification of the proof for the depth complexity. The proofs of our results for the deterministic case are similar to the ones for the randomized setting, with the necessary technical modifications.

2 Preliminaries

2.1 Standard Notation

Here we describe some notation that we use. We denote $[n] := \{1, 2, \dots, n\}$ for a non-negative integer $n$. Addition mod 2 is denoted by XOR or $\oplus$. Sometimes we view $\{0,1\}^n$ as the vector space $\mathbb{F}_2^n$.

2.2 Computational Models

For functions of the form $f\colon \{0,1\}^n \times \{0,1\}^m \to \{0,1\}$ we consider semi-structured communication protocols, introduced in [2]. In these protocols Bob is only allowed to send XORs of subsets of his input bits (and Alice is not restricted). A randomized semi-structured protocol is just a distribution over deterministic semi-structured protocols. We say that such a protocol computes the function $f$ correctly if on every input the probability of the correct output is at least $2/3$. The complexity $R^{cc}_{\oplus}(f)$ of $f$ in this model is the minimal depth of a protocol computing $f$.

We also consider a subspace-query model: communication protocols in which Bob is only allowed to send indicators of whether his input lies in a given affine subspace of $\mathbb{F}_2^m$. We denote by $sR^{cc}_{\oplus}(f)$ the minimum complexity of a randomized protocol computing $f$, where the protocol is taken from this restricted class.

Additionally, let us introduce the size-complexity of a protocol. A deterministic communication protocol can be represented by a tree in which at every node either Alice or Bob sends a message. We define the size-complexity of the protocol to be the number of leaves in this tree. Denote by $\mathrm{size}R^{cc}_{\oplus}(f)$ the minimum size-complexity of a randomized semi-structured protocol computing $f$.

We denote by $D^{cc}_{\oplus}(f)$, $\mathrm{size}D^{cc}_{\oplus}(f)$ and $sD^{cc}_{\oplus}(f)$ the deterministic versions of these complexity measures.

We can define the semi-structured complexity measures for relations $f \subseteq (\mathcal{X} \times \{0,1\}^m) \times \mathcal{Z}$ completely analogously. Here a deterministic protocol $\Pi$ is said to compute $f$ if for any $(x,y) \in \mathcal{X} \times \{0,1\}^m$ it outputs some $z$ such that $(x,y,z) \in f$, if such a $z$ exists (the protocol can give any output if there is no such $z$).

For a function $f\colon \{0,1\}^n \to \{0,1\}$ we denote by $D^{dt}(f)$ the minimal depth of a decision tree computing $f$. We denote by $D^{\oplus dt}(f)$ the minimal depth of a parity decision tree computing $f$ (at each step such a decision tree can query the XOR of a subset of the input bits). Analogously to semi-structured communication complexity, we denote by $D^{\oplus dt}(f)$, $\mathrm{size}D^{\oplus dt}(f)$, $sD^{\oplus dt}(f)$, $R^{\oplus dt}(f)$, $\mathrm{size}R^{\oplus dt}(f)$ and $sR^{\oplus dt}(f)$ the complexity measures defined by deterministic and randomized parity decision trees.

There is the following standard connection between parity decision trees and communication complexity protocols.

Lemma 1.

For any function $f\colon \{0,1\}^n \times \{0,1\}^m \to \{0,1\}$ we have
$$D^{cc}_{\oplus}(f) \le 2 D^{\oplus dt}(f), \qquad R^{cc}_{\oplus}(f) \le 2 R^{\oplus dt}(f).$$
Proof.

Given a (randomized) parity decision tree for $f$, Alice and Bob can use it to compute $f$ by a (randomized) communication protocol. For this they simulate the queries one by one, each player computing the XOR of their portion of the queried set and sending the resulting bit to the other. Simulation of each query requires two bits of communication.
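A minimal sketch of this simulation (the splitting of the queried set between the players and all names are ours):

```python
def simulate_parity_query(x, y, S_alice, S_bob):
    """One parity query of a parity decision tree, simulated by two one-bit
    messages: Alice XORs her part of the queried set, Bob XORs his."""
    a = 0
    for i in S_alice:          # the bit Alice sends
        a ^= x[i]
    b = 0
    for j in S_bob:            # the bit Bob sends
        b ^= y[j]
    return a ^ b               # both players now know the queried parity

# query x_1 xor x_3 xor y_2 (0-indexed: x[0], x[2], y[1]):
assert simulate_parity_query([1, 0, 1, 1], [0, 1, 1, 0], [0, 2], [1]) == 1
```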

2.3 Gadgets

We first define a class of gadgets that is of use for us.

Definition 2 (Family of linear functions).

The gadget $g\colon [k] \times \{0,1\}^m \to \{0,1\}$ is a family of linear functions of order $k$ if the following holds:

  1. for all $i \in [k]$, $g(i, \cdot)$ is a non-trivial linear function of the second argument (that is, it is an XOR of a nonempty subset of its inputs);

  2. for all $i, j \in [k]$ with $i \neq j$, $g(i, \cdot) \neq g(j, \cdot)$.

For convenience, from now on we will use the notation $g_i(y) := g(i, y)$.

We will use the following notion of gadget reduction.

Definition 3 (Gadget reduction).

Consider two gadgets $g\colon \mathcal{X} \times \{0,1\}^m \to \{0,1\}$ and $h\colon \mathcal{Y} \times \{0,1\}^n \to \{0,1\}$. Then $g$ is reducible to $h$ if there are mappings $\varphi\colon \mathcal{X} \to \mathcal{Y}$ and $\psi\colon \{0,1\}^m \to \{0,1\}^n$ such that

  1. for all $x \in \mathcal{X}$ and $y \in \{0,1\}^m$: $g(x,y) = h(\varphi(x), \psi(y))$;

  2. $\psi$ is linear, that is, for all $y_1, y_2 \in \{0,1\}^m$: $\psi(y_1 \oplus y_2) = \psi(y_1) \oplus \psi(y_2)$.

Note that the gadget reduction relation is transitive.

Gadget reduction is useful for us due to the following lemma.

Lemma 4.

Assume that a gadget $g$ reduces to a gadget $h$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$R^{cc}_{\oplus}(f \circ g^n) \le R^{cc}_{\oplus}(f \circ h^n), \qquad \mathrm{size}R^{cc}_{\oplus}(f \circ g^n) \le \mathrm{size}R^{cc}_{\oplus}(f \circ h^n), \qquad sR^{cc}_{\oplus}(f \circ g^n) \le sR^{cc}_{\oplus}(f \circ h^n).$$

The same inequalities are true for the deterministic complexities.

Proof.

If Alice and Bob would like to compute $f \circ g^n$, they can just apply the mappings $\varphi\colon \mathcal{X} \to \mathcal{Y}$ and $\psi\colon \{0,1\}^m \to \{0,1\}^n$ to their inputs individually and use the protocol for $f \circ h^n$. Since $\psi$ is linear, Bob can simulate the protocol for $f \circ h^n$ sending only XORs of his input bits. To see this, denote by $e_i := 0^{i-1} 1 0^{m-i}$ the $i$-th standard basis vector; then $\psi$ is uniquely determined by $\psi(e_1), \dots, \psi(e_m)$. Let $x$ be Bob's input, and let $y = \psi(x)$. An arbitrary parity message about $y$ can be represented as $\langle y, y' \rangle$ for some $y'$, where $\langle \cdot, \cdot \rangle$ is the dot product modulo 2. Observe that $\langle y, y' \rangle = \langle \psi(x), y' \rangle = \bigoplus_i x_i \langle \psi(e_i), y' \rangle$. In other words, to compute $\langle y, y' \rangle$ it is enough to take the XOR of all those $x_i$ for which $\langle \psi(e_i), y' \rangle$ equals one. This means that Bob can translate parity messages about $y$ into parity messages about $x$.
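The translation is easy to implement and test; in the following sketch (ours) vectors are bit masks and $\psi$ is given by its values on the standard basis:

```python
import random

def dot(u, v):                             # <u, v> over F_2, vectors as masks
    return bin(u & v).count("1") % 2

def translate_parity(psi_on_basis, yprime):
    """Return the mask T with <psi(x), y'> = <x, T> for every x."""
    T = 0
    for i, psi_ei in enumerate(psi_on_basis):
        if dot(psi_ei, yprime):            # <psi(e_i), y'> = 1
            T |= 1 << i
    return T

m, n = 4, 6
psi_on_basis = [random.randrange(2 ** n) for _ in range(m)]
def psi(x):                                # the linear map itself
    out = 0
    for i in range(m):
        if (x >> i) & 1:
            out ^= psi_on_basis[i]
    return out

for _ in range(100):
    x, yp = random.randrange(2 ** m), random.randrange(2 ** n)
    assert dot(psi(x), yp) == dot(x, translate_parity(psi_on_basis, yp))
```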

Next we define the main complexity measure for the gadgets.

Definition 5 (Linear diversity).

We define the linear diversity of a function $h\colon \mathcal{Y} \times \{0,1\}^n \to \{0,1\}$ to be the maximal $k$ such that there is a family of linear functions $g\colon [k] \times \{0,1\}^m \to \{0,1\}$ of order $k$ that reduces to $h$.

Next we introduce a couple of standard gadgets that will be important for us.

Definition 6 (Index function).

Let $\mathrm{Ind}_m\colon [m] \times \{0,1\}^m \to \{0,1\}$ be the function that on input $(i, y)$ outputs $y_i$.

Definition 7 (Inner product function).

Let $\mathrm{IP}_m\colon \{0,1\}^m \times \{0,1\}^m \to \{0,1\}$ be the function that on input $(x, y)$ outputs $\bigoplus_{i=1}^{m} (x_i \wedge y_i)$.

Lemma 8.

$\mathrm{Ind}_m$ is a family of linear functions of order $m$. $\mathrm{IP}_m$ has linear diversity $2^m - 1$.

Proof.

The statement of the lemma is almost immediate. For $\mathrm{Ind}_m$, note that for any $i \in [m]$ the output of $\mathrm{Ind}_m$ is $y_i$, which is a non-trivial linear function. For $\mathrm{IP}_m$, note that for any fixed $x \in \{0,1\}^m$ the output of $\mathrm{IP}_m$ is $\bigoplus_{i : x_i = 1} y_i$, which is a non-trivial linear function for each $x \neq 0$. The reduction from a family of linear functions is trivial ($\psi$ is the identity function and $\varphi$ sends an integer to its binary representation).

Lemma 9.

With probability approaching 1 as $m$ tends to infinity, a random gadget $g\colon [2^{2^m}] \times \{0,1\}^m \to \{0,1\}$ has linear diversity at least $2^m/2$. (The values of $g$ on each input are chosen randomly and independently.)

The proof of this lemma can be found in the full version of the paper.

3 Results Statement

3.1 Randomized Semi-Structured protocols

Theorem 10.

Let $g\colon [k] \times \{0,1\}^m \to \{0,1\}$ be a family of linear functions of order $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$R^{cc}_{\oplus}(f \circ g^n) = \Theta(\log_2 k \cdot R^{dt}(f)).$$

We note that the big-$O$ part is trivial: Alice and Bob can simulate a decision tree for $f$, spending $O(\log_2 k)$ bits of communication per tree query to compute the function $g$ on the corresponding inputs (Alice sends $x_i$ and Bob replies with $g(x_i, y_i)$).

We prove the following stronger versions of Theorem 10.

Theorem 11.

Let $g\colon [k] \times \{0,1\}^m \to \{0,1\}$ be a family of linear functions of order $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$\log_2 \mathrm{size}R^{cc}_{\oplus}(f \circ g^n) = \Theta(\log_2 k \cdot R^{dt}(f)).$$
Theorem 12.

Let $g\colon [k] \times \{0,1\}^m \to \{0,1\}$ be a family of linear functions of order $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$sR^{cc}_{\oplus}(f \circ g^n) = \Theta(\log_2 k \cdot R^{dt}(f)).$$

For both theorems the big-$O$ part follows since $\log_2 \mathrm{size}R^{cc}_{\oplus}(h) \le R^{cc}_{\oplus}(h)$ and $sR^{cc}_{\oplus}(h) \le R^{cc}_{\oplus}(h)$ for any function $h$.

As a corollary of these theorems and Lemma 4 we immediately obtain the following.

Corollary 13.

Let $h\colon \mathcal{Y} \times \{0,1\}^n \to \{0,1\}$ have linear diversity at least $k$ for some $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$R^{cc}_{\oplus}(f \circ h^n) = \Omega(\log_2 k \cdot R^{dt}(f)).$$

The same result is true for $\log_2 \mathrm{size}R^{cc}_{\oplus}(f \circ h^n)$ and $sR^{cc}_{\oplus}(f \circ h^n)$.

3.2 Deterministic Semi-structured protocols

We translate our results to the deterministic case as well.

Theorem 14.

Let $g\colon [k] \times \{0,1\}^m \to \{0,1\}$ be a family of linear functions of order $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$D^{cc}_{\oplus}(f \circ g^n) = \Theta(\log_2 k \cdot D^{dt}(f)),$$
$$\log_2 \mathrm{size}D^{cc}_{\oplus}(f \circ g^n) = \Theta(\log_2 k \cdot D^{dt}(f)),$$
$$sD^{cc}_{\oplus}(f \circ g^n) = \Theta(\log_2 k \cdot D^{dt}(f)).$$
Corollary 15.

Let $h\colon \mathcal{Y} \times \{0,1\}^n \to \{0,1\}$ have linear diversity at least $k$ for some $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$D^{cc}_{\oplus}(f \circ h^n) = \Omega(\log_2 k \cdot D^{dt}(f)).$$

The same result is true for $\log_2 \mathrm{size}D^{cc}_{\oplus}(f \circ h^n)$ and $sD^{cc}_{\oplus}(f \circ h^n)$.

3.3 Parity decision trees

Finally, we prove that our results imply the same results for parity decision trees instead of semi-structured communication protocols.

Theorem 16.

Let $h\colon \mathcal{Y} \times \{0,1\}^n \to \{0,1\}$ have linear diversity at least $k$ for some $k \ge 2$. Then for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ we have
$$R^{\oplus dt}(f \circ h^n) = \Omega(\log_2 k \cdot R^{dt}(f)).$$

The same is true for $\log_2 \mathrm{size}R^{\oplus dt}(f \circ h^n)$ and $sR^{\oplus dt}(f \circ h^n)$. The same results are also true for the deterministic complexities.

The part of this theorem concerning $R^{\oplus dt}$ is a direct consequence of Lemma 1, Theorem 10, and Lemma 4. The same applies to $D^{\oplus dt}$.

The parts of this theorem about $\log_2 \mathrm{size}R^{\oplus dt}$ and $sR^{\oplus dt}$ do not follow from the previous theorems that easily, and we prove them in the full version of the paper.

3.4 More powerful gadget reduction

In this section, we describe a more general version of gadget reduction than the one in Definition 3. We apply this reduction to the MAJ gadget, obtaining tight bounds for lifting to parity decision trees.

More specifically, we can consider the domain $\{0,1\}^m$ of a gadget as the vector space $\mathbb{F}_2^m$ and the input variables $x_1, \dots, x_m$ as coordinates in the standard basis. The idea is that we can switch to another basis of this vector space, consider the new coordinates as variables, and apply the reduction of Definition 3 after that. Since the transformation to the new variables is linear, the new variables can be expressed as XORs of the old variables and vice versa. Thus, parity decision trees in the new and the old variables can simulate each other (this does not hold in the communication complexity setting, since switching to a new basis mixes up the variables belonging to Alice and Bob).
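A quick numerical sanity check of this simulation (ours, numpy-based): under a change of variables $y = Ax$ over $\mathbb{F}_2$, a parity query $\langle s, y \rangle$ on the new variables is the parity query $\langle A^{\top} s, x \rangle$ on the old ones.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.integers(0, 2, size=(n, n))   # change-of-basis matrix over F_2
                                      # (invertible in the intended application;
                                      # invertibility is not checked here)
x = rng.integers(0, 2, size=n)        # old variables
y = (A @ x) % 2                       # new variables
for _ in range(100):
    s = rng.integers(0, 2, size=n)    # a parity query in the new variables
    assert (s @ y) % 2 == (((A.T @ s) % 2) @ x) % 2
```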

We illustrate this idea with the MAJ gadget. Recall that $\mathrm{MAJ}_m$ is the function that returns 1 iff at least $m/2$ of its $m$ variables are equal to 1.

Lemma 17.

For any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$ and $m \ge 4$, $R^{\oplus dt}(f \circ \mathrm{MAJ}_m^n) = \Omega(m \cdot R^{dt}(f))$.

This method can be applied to gadgets other than MAJ. We formulate this approach in a theorem.

Theorem 18.

Let $r\colon \{0,1\}^m \to \{0,1\}$, $h\colon \mathcal{X} \times \{0,1\}^l \to \{0,1\}$, and $g\colon \mathcal{Y} \to \{0,1\}$ be functions such that the linear diversity of $h$ is at least $k \ge 2$. Suppose that $r \circ h^m$ reduces to $g$ after a change of basis. Then, for any relation $f \subseteq \{0,1\}^n \times \mathcal{Z}$,
$$R^{\oplus dt}(f \circ g^n) = \Omega(R^{dt}(f \circ r^n) \cdot \log k).$$

By reduction we mean Definition 3. Here $g$ depends on a variable vector $x$, and we consider some linear change of basis $x \mapsto y$. We also decide on some way to split the variables $y$ between Alice and Bob before applying Definition 3.

Proofs of these results are provided in the full version of the paper.

3.5 Applications

A more detailed description of applications can be found in the full version of the paper.

Recursive majority function

Let $\mathrm{MAJ}_3^1 := \mathrm{MAJ}_3$ be the majority function that returns 1 if at least two of its three input bits equal 1, and otherwise returns 0.

For $k > 1$, define $\mathrm{MAJ}_3^k$ to be $\mathrm{MAJ}_3$ in which each argument is substituted by $\mathrm{MAJ}_3^{k-1}$. Thus, $\mathrm{MAJ}_3^k$ takes $3^k$ inputs.

In [15], it is shown that

$$\Omega(2.57143^k) \le R^{dt}(\mathrm{MAJ}_3^k) \le O(2.64944^k).$$

It is easily observed that $\mathrm{MAJ}_3^2$ has linear diversity at least 2. Using it as a gadget, our lifting theorem implies that $R^{\oplus dt}(\mathrm{MAJ}_3^k) = \Omega(R^{dt}(\mathrm{MAJ}_3^k))$. Thus, parity queries do not help in computing $\mathrm{MAJ}_3^k$ compared to single-variable queries.

The same approach can be applied to formulas having the form of a complete binary AND-OR tree. In other words, this function is obtained by repeated iteration of $f(x) = (x_1 \wedge x_2) \vee (x_3 \wedge x_4)$. It was shown in [19] that $R^{dt}$ of this function is at least $n^{0.7537}$, where $n$ is the number of inputs. Since $f$ can be used as a gadget, we translate this result to parity decision trees.

Quantum Complexity

Our lifting theorem can be applied to exhibit a separation between randomized parity decision tree complexity and bounded-error quantum complexity. Let $k \le \log n$. It was shown in [1] that there is a $k/2$ versus $\Omega(n^{1-1/k})$ separation between the quantum and randomized query complexities. The separation was shown for a partial function $f$ called $k$-fold Forrelation.

Using our result, we can lift this separation to randomized parity query complexity. Consider the function $f \circ \mathrm{Ind}_2^n$. On the one hand, its quantum complexity is the same as the quantum complexity of $f$ (up to a constant factor), since a quantum protocol can extract the inputs to $f$ from $\mathrm{Ind}_2$ with a constant number of additional queries. On the other hand, by Theorem 10, the randomized parity query complexity of $f \circ \mathrm{Ind}_2^n$ is no less than the randomized query complexity of $f$. Hence, we obtain the same separation even if our query model can ask parity queries.

An alternative method to obtain a separation between randomized parity query and quantum complexity was given in [3]; it also relies on the result of [1]. The authors used Fourier analysis and properties of $k$-fold Forrelation to derive their result. In contrast, our lifting theorem can lift any function $f$ that exhibits the separation.

4 Notation for proving main theorems

In this section we introduce notation and facts about entropy that will be used in the proofs of the main theorems in the subsequent sections. Recall that in a semi-structured communication protocol Bob can send only parities of his input. In the lifting setting, Bob is given $nm$ Boolean variables: for each of the $n$ variables of the initial function there are $m$ gadget variables. To analyze Bob's messages, we introduce the following definitions.

4.1 Notation

Definition 19 (XOR of a subset).

For a set $S \subseteq [m]$ and $y \in \{0,1\}^m$, we define
$$y_S := \bigoplus_{j \in S} y_j.$$

Analogously, we define $y_S$ for $y \in (\{0,1\}^m)^n$ and $S \subseteq [n] \times [m]$.

If $y \in (\{0,1\}^m)^n$ is Bob's input, then each of his messages can be represented as $y_S$ for some $S \subseteq [n] \times [m]$.

Definition 20 (Subsets of [n]×[m] as a linear space).

We view subsets of $[n] \times [m]$ as vectors in a linear space over $\mathbb{F}_2$ (where addition $+$ corresponds to symmetric difference).

With this notation, for any $S, T \subseteq [n] \times [m]$ and $y \in (\{0,1\}^m)^n$ we have $y_{S+T} = y_S \oplus y_T$.
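In code, this identification is just bitwise XOR on masks; a tiny check (ours):

```python
import random

def parity(y, S):                  # y_S: XOR of the bits of y selected by mask S
    return bin(y & S).count("1") % 2

nm = 12                            # [n] x [m] flattened into nm bit positions
y = random.getrandbits(nm)
S, T = random.getrandbits(nm), random.getrandbits(nm)
# set addition over F_2 is symmetric difference, i.e. XOR of masks:
assert parity(y, S ^ T) == parity(y, S) ^ parity(y, T)
```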

Definition 21 (Linear Order on [n]×[m]).

We introduce the lexicographic linear order on $[n] \times [m]$: a pair $(i,j) \in [n] \times [m]$ is said to be lower than $(i',j')$ if $i < i'$, or $i = i'$ and $j < j'$.

Definition 22 (Principal variable of a non-empty subset).

For a non-empty set $S \subseteq [n] \times [m]$, denote by $p(S)$ its lowest element.

Definition 23 (Block).

We refer to $\{i\} \times [m]$ as the $i$-th block. A non-empty subset $S \subseteq [n] \times [m]$ touches the $i$-th block if $p(S) \in \{i\} \times [m]$.

Definition 24 (Bernoulli distribution).

Denote by $\mathrm{Bern}(p)$ the distribution of a random variable that takes value 0 with probability $1-p$ and value 1 with probability $p$.

Definition 25 (Entropy).

For a random variable $x$, denote by $H(x)$ its binary entropy. If $X$ is a set, then we define its entropy as $H(X) := \log_2 |X|$. If $p$ is a number, then $H(p)$ denotes the binary entropy of the distribution $\mathrm{Bern}(p)$.

Recall that the binary entropy of a random variable taking $n$ values with non-zero probabilities $p_1, \dots, p_n$ equals
$$\sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}.$$
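For concreteness, a two-line implementation (ours); note the value $H(1/100) \approx 0.081 < 1/3$, which is used in the proof of Lemma 31 below.

```python
from math import log2

def H(ps):
    """Shannon entropy, in bits, of a distribution given by its non-zero
    probabilities."""
    return sum(p * log2(1 / p) for p in ps)

assert H([0.5, 0.5]) == 1.0        # one uniformly random bit
assert H([0.01, 0.99]) < 1 / 3     # the bound used in Lemma 31
```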

4.2 Entropy theorems

We state well-known results that will be used for proving our main theorems.

Lemma 26 (Gibbs’ inequality [7]).

Let $0 \le p \le 1$ and $0 < q < 1$. Then
$$H(p) \le p \log_2 \frac{1}{q} + (1-p) \log_2 \frac{1}{1-q}.$$
Lemma 27 (Fano’s Inequality [7]).

Let $X, Y$ be random variables and let $\varepsilon = P(X \neq Y) \le 1/2$. Then
$$H(X \mid Y) \le H(\varepsilon) + \varepsilon \log_2(|\mathcal{X}| - 1),$$

where $\mathcal{X}$ denotes the support of $X$.

5 Proof of Theorem 10

5.1 Reformulation of Theorem 10

As we noted in Section 3, the big-$O$ direction is simple. Thus, it remains to prove the lower bound.

Let $\mathbf{\Pi}$ be a randomized semi-structured protocol computing $f \circ g^n$, and denote by $d$ the depth of $\mathbf{\Pi}$. Our goal is to construct a randomized tree $\mathbf{T}$ of depth $O(d/\log_2 k)$ that computes $f(z)$ for any given $z$.

The idea is to simulate $\mathbf{\Pi}$ on a uniformly random pair $(x, y)$ satisfying $g^n(x,y) = z$. For convenience, denote $P_z = \{(x,y) \mid g^n(x,y) = z\}$. For any pair $(x,y) \in P_z$ we have $\mathbf{\Pi}(x,y) = f(z)$ with probability at least $2/3$. Thus, if we sample $(x,y) \sim U(P_z)$, the output $\mathbf{\Pi}(x,y)$ will also be equal to $f(z)$ with probability at least $2/3$.

We use the following general strategy for $\mathbf{T}$: sample $(x,y) \sim U(P_z)$, sample $\Pi \sim \mathbf{\Pi}$ (recall that $\mathbf{\Pi}$ is a distribution over deterministic protocols), and output $\Pi(x,y)$. Note that the choice of the pair $(x,y)$ is independent of the choice of $\Pi$, and thus it does not matter in which order we sample $\Pi$ and $(x,y)$. As a result, what is left to do is to simulate a given deterministic semi-structured protocol $\Pi$ on a random pair $(x,y) \sim U(P_z)$.

We are going to prove the following intermediate theorem.

Theorem 28.

Let $\Pi$ be a deterministic semi-structured protocol with inputs from $[k]^n \times (\{0,1\}^m)^n$, and let $d$ be the depth of $\Pi$. Then there exists a randomized decision tree $\mathbf{T}$ which, on input $z \in \{0,1\}^n$, outputs a random variable that is distributed as $\Pi(x,y)$ for $(x,y) \sim U(P_z)$. The expected number of queries to $z$ made by $\mathbf{T}$ is $O(d/\log_2 k)$, where the expectation is taken over $(x,y)$ for a fixed $z$.

Before proceeding to the proof of Theorem 28 we show how to prove Theorem 10 based on Theorem 28.

Proof of Theorem 10.

The computation of $f$ on input $z$ proceeds as follows. We choose a random $\Pi \sim \mathbf{\Pi}$ and run the tree $\mathbf{T}$ provided by Theorem 28 on input $z$. By definition, $\mathbf{T}(z)$ has the same distribution as $\mathbf{\Pi}(x,y)$ for $(x,y) \sim U(P_z)$. Thus, $\mathbf{T}$ computes $f(z)$ with probability at least $2/3$.

The average number of queries made by $\mathbf{T}$ is at most $O(d/\log_2 k)$. To achieve this number of queries in the worst case, we halt $\mathbf{T}$ if it makes 10 times more queries than the expected number. By Markov's inequality, this happens with probability at most $1/10$, so the probability of a correct answer is still a constant greater than $1/2$, and it can be increased to $2/3$ by the standard error-reduction argument.

Now we proceed to the proof of Theorem 28. For this we need to describe how $\mathbf{T}$ simulates $\Pi$. In the subsequent subsections we will heavily use the notation introduced in Section 4.

5.2 Simulation algorithm for $\mathbf{T}$

In this section we describe the algorithm for 𝐓.

$\mathbf{T}$ starts by sampling a random $a \sim U([k]^n)$ and assuming that $x = a$.

After that, $\mathbf{T}$ starts the simulation of $\Pi$. Since $x$ is fixed ($\mathbf{T}$ knows $x$, while for $\Pi$ it is a random variable), Alice's messages are easily simulated. Next, we describe how $\mathbf{T}$ simulates Bob's parity messages on the (unknown) variables $y$.

Recall that we can view $g$ as a family of linear functions $\{g_1, \dots, g_k\}$ of order $k$ (one function for each value of $x$). Since $g_1, \dots, g_k$ are linear functions, we can represent them as $g_j(y) = y_{G_j}$ for some sets $G_j \subseteq [m]$.

Let $S_i := \{i\} \times G_{x_i}$. Note that all the $S_i$'s are linearly independent, since they lie in distinct blocks. From now on we call $S_1, \dots, S_n$ the secret sets.

Consider an arbitrary step of the simulation and assume Bob has already sent the parities of his input on the sets $Q_1, \dots, Q_t \subseteq [n] \times [m]$ and is now supposed to send the parity of $y$ on the set $Q_{t+1} \subseteq [n] \times [m]$. Note that we can assume $Q_{t+1}$ to be linearly independent of $Q_1, \dots, Q_t$: otherwise, the message $Q_{t+1}$ does not reveal any new information and can be omitted from the protocol. There are two cases:

  1. There is a linear combination of the $Q_i$'s that includes $Q_{t+1}$ and equals some linear combination of the secret sets:
$$Q_{i_1} + Q_{i_2} + \dots + Q_{i_k} + Q_{t+1} = S_{j_1} + \dots + S_{j_l}.$$

    In this case, the value $y_{Q_{t+1}}$ is uniquely determined, since
$$y_{Q_{t+1}} = y_{Q_{i_1}} \oplus y_{Q_{i_2}} \oplus \dots \oplus y_{Q_{i_k}} \oplus z_{j_1} \oplus \dots \oplus z_{j_l}.$$

    The parities of $y$ on the sets $Q_{i_1}, \dots, Q_{i_k}$ are already known from the previous messages of Bob. Therefore, $\mathbf{T}$ queries $z_{j_1}, \dots, z_{j_l}$ (those that it has not queried already) and computes the value $y_{Q_{t+1}}$ that Bob sends at this vertex.

  2. Such a linear combination does not exist. In this case, $\mathbf{T}$ sends a random $\mathrm{Bern}(1/2)$ bit as the XOR of $y$ on the set $Q_{t+1}$. In terms of the protocol $\Pi$, this corresponds to proceeding to one of the two children with equal probabilities.

Thus, we have described how to simulate each of Bob's messages. Once $\mathbf{T}$ reaches a leaf of $\Pi$, it outputs the value written in that leaf.
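The linear algebra behind this case distinction is elementary; below is a sketch (ours, with sets over $[n] \times [m]$ encoded as bit masks). Since $Q_{t+1}$ is independent of $Q_1, \dots, Q_t$, Case 1 occurs exactly when $Q_{t+1}$ lies in the span of $\{Q_1, \dots, Q_t, S_1, \dots, S_n\}$, which we keep in row-echelon form over $\mathbb{F}_2$.

```python
def reduce(v, basis):
    """Reduce mask v against an echelon basis {pivot: row}; the result is 0
    iff v lies in the span of the basis."""
    for piv in sorted(basis, reverse=True):
        if (v >> piv) & 1:
            v ^= basis[piv]
    return v

def insert(v, basis):
    v = reduce(v, basis)
    if v:
        basis[v.bit_length() - 1] = v   # pivot = highest set bit

basis = {}
for S in (0b000011, 0b101000):          # secret sets in distinct blocks (n=2, m=3)
    insert(S, basis)

for Q in (0b000110, 0b101101):          # Bob's messages, independent among themselves
    if reduce(Q, basis) == 0:
        print(f"{Q:06b}: Case 1 - parity is determined; query the needed z_i's")
    else:
        insert(Q, basis)
        print(f"{Q:06b}: Case 2 - answer with a fresh Bern(1/2) bit")
# here 101101 = 000110 + 000011 + 101000, so the second message is determined
```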

5.3 Correctness of the simulation of $\Pi$ done by $\mathbf{T}$

Next, we need to show that $\mathbf{T}(z)$ indeed simulates $\Pi$ on a random pair $(x,y) \sim U(P_z)$, and that $\mathbf{T}(z)$ makes $O(d/\log_2 k)$ queries on average. We start with the correctness of the simulation.

It is easy to see that for any $z$ the projection of $U(P_z)$ onto $x$ is the uniform distribution $U([k]^n)$; therefore, fixing $x = a \sim U([k]^n)$ correctly corresponds to the distribution of $(x,y)$ on which we would like to simulate the protocol.

Note that once $x$ is fixed, the only constraints on $y$ are of the form $g_j(y_i) = z_i$, where $j = x_i$. In particular, any variable in the $i$-th block of $y$ not included in $\{i\} \times G_{x_i}$ has the $\mathrm{Bern}(1/2)$ distribution.

Next, we justify why the algorithm can send a $\mathrm{Bern}(1/2)$ bit as the XOR over $Q_{t+1}$ if no linear combination exists (Case 2 above). Indeed, the parities of $y$ on $Q_1, \dots, Q_t, S_1, \dots, S_n$ define an affine subspace of our linear space. Since $Q_{t+1}$ is linearly independent of $Q_1, \dots, Q_t, S_1, \dots, S_n$, each of the conditions $y_{Q_{t+1}} = 0$ and $y_{Q_{t+1}} = 1$ is satisfied by exactly half of the points of this affine subspace.
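This half-split claim is easy to verify by brute force on small examples (sketch ours):

```python
from itertools import product

def y_S(y, S):
    return sum(y[j] for j in S) % 2

m = 5
constraints = [({0, 1}, 1), ({2, 3}, 0)]   # independent parity constraints y_Q = b
R = {1, 2, 4}                               # a set outside the span of the Q's
sols = [y for y in product((0, 1), repeat=m)
        if all(y_S(y, Q) == b for Q, b in constraints)]
# exactly half of the points of the affine subspace satisfy y_R = 1:
assert 2 * sum(y_S(y, R) for y in sols) == len(sols)
```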

If there exists a linear combination of the $Q_i$'s that includes $Q_{t+1}$ and equals a linear combination of the $S_i$'s (Case 1 above), then, given the previous messages and the value of $z$, there is only one possible value for $y_{Q_{t+1}}$, which $\mathbf{T}$ indeed sends in our simulation.

Thus, the distribution of $\mathbf{T}(z)$ matches that of $\Pi(x,y)$ for $(x,y) \sim U(P_z)$.

5.4 Upper bound on the average number of queries made by $\mathbf{T}$

Next we show that $\mathbf{T}(z)$ makes $O(d/\log_2 k)$ queries to the variables of $z$ on average. For this we introduce some complexity measures and study their behaviour during the execution of the protocol. To make the analysis cleaner, we first modify the protocol $\Pi$ by adding more messages to it. This does not change the output of the protocol, but it simplifies the exposition of the analysis. In the next subsubsection we describe the modified protocol; then we discuss its connection to $\mathbf{T}$. Finally, in Subsubsection 5.4.3, we derive the upper bound on the number of queries made by $\mathbf{T}$.

Since we need to show the upper bound on the number of queries of $\mathbf{T}(z)$ for every $z$, we fix $z$ for the whole argument.

5.4.1 Description of the modified protocol $\overline{\Pi}_z$

Here we describe $\overline{\Pi}_z$, a refined version of $\Pi$. Note that this protocol depends on $z$, which is fixed throughout the whole argument. We will be interested only in the behaviour of the protocol on inputs from $P_z$.

We modify the protocol in two ways. First, we modify the messages sent by Bob into equivalent ones. This does not actually change the information transmitted by Bob; this modification is needed only for the analysis of the protocol. Second, at some moments we add additional messages sent by the players. The information transmitted in these messages does not affect the subsequent messages of the protocol. One can think of this in the following way: at some points the players pause the execution of the protocol, exchange some extra information, and then resume the execution as if nothing happened. These extra messages are needed just to introduce new intermediate vertices in the communication tree that will allow us to analyse the complexity of $\mathbf{T}$ in a cleaner way.

The pseudocode of $\overline{\Pi}_z$ is provided in Figure 1. Next we describe the protocol and introduce some important notation related to it.

First of all, observe that if Bob sends $b_1, b_2$ as parities on the sets $Q_1, Q_2 \subseteq [n] \times [m]$ respectively, this is equivalent to sending $b_1, b_1 \oplus b_2$ as parities on $Q_1, Q_1 + Q_2$ respectively. More generally, applying an invertible linear transformation to Bob's messages does not change the information transmitted.

Recall that by $S_i = \{i\} \times G_{x_i}$ we denote the secret sets. Note that $S_i$ depends on $x$ and thus is initially known only to Alice. Initially, all $S_i$ are unrevealed, and over the course of the simulation they will gradually become revealed. We also introduce a separate notion of $S_i$ being fixed. Initially, all $S_i$ are unfixed, and they will change status to fixed over the course of the simulation. A revealed $S_i$ will also be fixed, but not necessarily vice versa.

Suppose that at the current moment of simulating $\Pi$, Bob has sent parities on the sets $Q_1, \dots, Q_t \subseteq [n] \times [m]$, and now he wants to send the parity on the set $Q \subseteq [n] \times [m]$. We maintain the principal variable invariant: all $p(Q_i)$ are distinct and, if $i < j$, then $p(Q_i)$ is lower than $p(Q_j)$. The variables $p(Q_i)$ will be referred to as principal. In particular, this invariant implies that $Q_1, \dots, Q_t$ are linearly independent.

Define the procedure sift, which transforms $Q$ into an equivalent message: sift iterates over $i = 1, \dots, t$ and replaces $Q$ with $Q + Q_i$ whenever $p(Q_i) \in Q$. Note that after this procedure $p(Q_i) \notin Q$ for every $i$.
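In the bit-mask encoding used above, sift is a few lines (sketch ours; here the principal variable $p(S)$ is the lowest set bit of the mask):

```python
def principal(S):
    return (S & -S).bit_length() - 1        # index of the lowest set bit of S

def sift(Q, Qs):
    """Replace Q by an equivalent message containing no principal variable of
    the earlier messages Qs (assumed to satisfy the invariant)."""
    for Qi in Qs:
        if (Q >> principal(Qi)) & 1:
            Q ^= Qi                          # Q <- Q + Q_i
    return Q

Qs = [0b0011, 0b0110]                        # principal variables 0 and 1
assert sift(0b0101, Qs) == 0                 # 0101 = 0011 + 0110: no new information
assert principal(sift(0b1001, Qs)) not in (0, 1)
```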

We run the procedure sift on $Q$. If $Q$ turns into the empty set, then Bob's message does not provide any new information; in this case we finish its processing and move on with the simulation of $\Pi$. If the resulting $Q$ is not empty, then it contains a principal variable. Since after sift $Q$ does not contain the principal variables of the previous messages, the principal variable of $Q$ does not coincide with the principal variable of any previous message. We insert a copy of $Q$ into the sequence $Q_1, \dots, Q_t$ in such a way that the principal variable invariant is maintained; now the length of the sequence is $t+1$. Assume that $p(Q)$ is in the $i$-th input block. Let $L_i := \{Q_j \cap (\{i\} \times [m]) \mid p(Q_j) \in \{i\} \times [m]\}$. In other words, we consider all sets whose principal variables are in the $i$-th block and intersect them with the $i$-th block. Note that the sets in $L_i$ are linearly independent. Denote by $\mathcal{L}_i$ the linear span of $L_i$ with the zero vector (the empty set) removed.

At this point, we apply the second modification to the protocol. If the binary entropy of $S_i$ is less than $\frac{1}{10} \log_2 k + \log_2 |\mathcal{L}_i| + \frac{1}{3}$, Alice sends $S_i$, and $S_i$ is considered fixed. Here the protocol views $S_i$ as a random variable induced by the distribution $(x,y) \sim U(P_z)$ conditioned on the information about $(x,y)$ that the protocol has learnt so far. Note that we fixed $z$ in advance, and we assume that the inputs given to the players are indeed in $P_z$ (we are not interested in the behaviour of the protocol on other inputs). As a result, both players can compute the entropy of $S_i$ and compare it to $\frac{1}{10} \log_2 k + \log_2 |\mathcal{L}_i| + \frac{1}{3}$ without communication.

After that, Alice sends a message indicating whether $S_i$ lies in $\mathcal{L}_i$. (Note that this might be redundant if we have just communicated the whole $S_i$. We choose to keep this step even when it is redundant, to simplify the pseudocode in Figure 1; in the analysis below the redundancy of these messages is reflected in Observation 35.) If this is not the case, or if $S_i$ has already been revealed at one of the previous steps, we finish processing the message $Q$ and proceed with the simulation of the protocol $\Pi$. If $S_i$ lies in $\mathcal{L}_i$, then we consider $S_i$ to be revealed. In this case Alice sends $S_i$, and we fix $S_i$ if it was not fixed before. Bob sends $y_{S_i}$. Next, let $Q_{j_1}, \dots, Q_{j_l}$ be the sets whose linear combination, when intersected with the $i$-th block, equals the secret set $S_i$. The set $Q$ must be present among them: otherwise, $S_i$ would have been revealed at an earlier step. Without loss of generality we can assume that $Q_{j_1} = Q$. We update $Q \leftarrow Q + Q_{j_2} + \dots + Q_{j_l} + S_i$ and perform the same procedure with the updated $Q$, starting with sift (note that the players can compute $y_Q$ for the new $Q$ without additional communication, since $y_{S_i}$ is known). Note that during this iteration we removed from $Q$ all elements in the $i$-th block without introducing anything into $Q$ in the preceding blocks (all sets in the combination had their principal variables in the $i$-th block). As a result, the updated $Q$ lies within blocks higher than the $i$-th one, and thus the whole procedure of updating $Q$ eventually terminates.

Refined protocol $\overline{\Pi}_z$ on input $(x,y) \sim U(P_z)$:

 1: Initialize: $v$ = root of $\Pi$; $Q_1, \dots, Q_t \subseteq [n] \times [m]$ – the sets whose parities Bob sends, initially $t = 0$
 2: while $v$ is not a leaf   [invariant: $p(Q_i)$ is lower than $p(Q_j)$ when $i < j$]
 3:   Let $v_0$, $v_1$ be the children of $v$
 4:   if Bob speaks at $v$ then
 5:     Let $Q \subseteq [n] \times [m]$ be the set whose parity Bob sends at $v$
 6:     Let $b = y_Q$
 7:       Bob sends $b$ and we update $v \leftarrow v_b$   (B1)
 8:
 9:     for $i = 1..t$ do
10:       if $p(Q_i) \in Q$ then
11:         $Q \leftarrow Q + Q_i$
12:       end if
13:     end for
14:     if $Q \neq \emptyset$ then
15:       Insert a copy of $Q$ into $Q_1, \dots, Q_t$ so that the invariant holds
16:       Let $i \in [n]$ be the block containing $p(Q)$
17:       // $L_i = \{Q_j \cap (\{i\} \times [m]) \mid p(Q_j) \in \{i\} \times [m]\}$
18:       // $\mathcal{L}_i$ denotes the linear span of $L_i$ without the null vector
19:       // $S_i = \{i\} \times G_{x_i}$ – the secret set in the $i$-th block
20:       if $H(S_i) < \frac{1}{10} \log_2 k + \log_2 |\mathcal{L}_i| + \frac{1}{3}$ then
21:         Alice sends $S_i$; $S_i$ is fixed now   (A3)
22:       end if
23:       Alice communicates whether $S_i$ is in $\mathcal{L}_i$   (A2)
24:       if $S_i \in \mathcal{L}_i$ and $S_i$ is not revealed then
25:         Alice sends $S_i$; $S_i$ is fixed now   (A3)
26:         $S_i$ is considered to be revealed
27:         Bob sends $y_{S_i}$   (B2)
28:         Find distinct $j_1, \dots, j_l$ s.t. $Q = Q_{j_1}$ and $(Q_{j_1} + \dots + Q_{j_l}) \cap (\{i\} \times [m]) = S_i$
29:         $Q \leftarrow Q + Q_{j_2} + \dots + Q_{j_l} + S_i$
30:         Go to line 8
31:       end if
32:     end if
33:
34:   else Alice speaks at $v$
35:     Let $b$ be the bit she sends
36:       Alice sends $b$ and we update $v \leftarrow v_b$   (A1)
37:   end if
38: end while
39: return the value of the leaf $v$

Figure 1: The modified (deterministic) protocol $\overline{\Pi}_z$. The original protocol $\Pi$ can be recovered by ignoring lines 8–31 and the red text. Lines 8–31 are used to maintain the invariant and to prove the estimate on the number of revealed sets. The classification (A1), (A2), (A3), (B1), (B2) of the actions made by Alice and Bob is used in Subsubsection 5.4.3. Note that Alice can send more than one bit of information in a message of type (A3).

5.4.2 Connection between $\mathbf{T}$ and $\overline{\Pi}_z$

The protocol $\overline{\Pi}_z$ is useful for upper bounding the complexity of $\mathbf{T}$ due to the following lemmas.

Lemma 29.

The following is an invariant of the while-loop in $\overline{\Pi}_z$: if a linear combination of $Q_1, \dots, Q_t$ equals a linear combination of $S_1, \dots, S_n$, then all the $S_i$'s in this linear combination are revealed.

Proof.

First note that for every $i$ such that $S_i$ is unrevealed, $S_i$ does not lie in $\mathcal{L}_i$: otherwise, at lines 24–26 of the algorithm, $S_i$ would have been revealed.

Assume that the invariant is violated. Let the violating linear combination be $Q_{j_1} + \dots + Q_{j_l} = I$, where $j_1 < \dots < j_l$. If there are several linear combinations that violate the invariant, choose the one with the greatest $j_1$. Let $i$ be the block of the variable $p(Q_{j_1})$. Then $I \cap (\{i\} \times [m])$ must be equal to $S_i$, since $I$ is a linear combination of secret sets that touches the $i$-th block; moreover, $I \cap (\{i\} \times [m])$ lies in $\mathcal{L}_i$. It follows that $S_i$ must be revealed, since it lies in $\mathcal{L}_i$. At lines 28–29 of the algorithm, it is ensured that the linear combination $Q_{j_1} + \dots + Q_{j_l} + S_i = I + S_i$ is added to the set of $Q$'s. Now, $I + S_i$ is contained in blocks strictly higher than the $i$-th block and, by assumption, is equal to a linear combination of $S_1, \dots, S_n$ containing an unrevealed secret set. Therefore, we obtain a contradiction with the maximality of $j_1$.

Next we establish the connection between $\mathbf{T}$ and $\overline{\Pi}_z$. By construction, $\mathbf{T}(z)$ simulates $\Pi$ on a random input $(x,y) \sim U(P_z)$. Fix $r$ to be the outcome of the random bits of the tree $\mathbf{T}$. Then $\mathbf{T}_r(z)$ simulates $\Pi$ on some pair $(x_r, y_r) \in P_z$.

We show the following.

Lemma 30.

The number of queries to $z$ made by $\mathbf{T}_r(z)$ is at most the number of revealed sets in the protocol $\overline{\Pi}_z$ on input $(x_r, y_r)$.

Proof.

By the definition of $\mathbf{T}$, it queries $z_i$ only if there is some linear combination of $S_1, \dots, S_n$ containing $S_i$ that equals a linear combination of $Q_1, \dots, Q_t$. By Lemma 29, we get that $z_i$ is queried by $\mathbf{T}$ only if $S_i$ is revealed. Hence, the statement of the lemma follows.

The lemma holds for every $r$. In particular, averaging over $r$, we get that the expected number of queries made by $\mathbf{T}$ is bounded by the expected number of revealed sets in the protocol $\overline{\Pi}_z$ on a random input $(x,y) \sim U(P_z)$. Therefore, it remains to obtain an upper bound on the expected number of revealed sets.

5.4.3 Upper bound on the expected number of revealed sets in $\overline{\Pi}_z$

For each vertex $v$ of the protocol $\overline{\Pi}_z$, define
$$P_{z,v} = \{(x,y) \mid \overline{\Pi}_z \text{ reaches vertex } v \text{ on input } (x,y) \text{ and } g^n(x,y) = z\}.$$

In other words, if $X_v \times Y_v$ is the rectangle corresponding to the vertex $v$ in the protocol $\overline{\Pi}_z$, then $P_{z,v} = (X_v \times Y_v) \cap (g^n)^{-1}(z)$.

Essentially, if the protocol $\overline{\Pi}_z$ has reached the vertex $v$, then $(x,y)$ can be any element of the set $P_{z,v}$. Now consider the uniform distribution $U(P_z)$ and run $\overline{\Pi}_z$ on a random pair $(x,y)$ from this distribution. Then $U(P_{z,v})$ is precisely the conditional distribution of the pair $(x,y)$ given that $\overline{\Pi}_z$ has reached $v$. In what follows, we consider $(x,y) \sim U(P_{z,v})$. We will be interested in the entropy of the projection of this distribution onto $x$; hence we introduce the following notation:
$$H^v = H(x), \qquad H_i^v = H(x_i).$$

Let $S_i$ be the secret set in the $i$-th block. The set $S_i$ depends on $x_i$ and, moreover, there is a one-to-one correspondence between the $S_i$'s and the $x_i$'s. Therefore, their entropies are equal: $H(S_i) = H(x_i) = H_i^v$. Denote by $L_i^v$ the set $L_i$ corresponding to the vertex $v$, and by $\mathcal{L}_i^v$ its linear span with the zero vector removed. Note that $|\mathcal{L}_i^v| = 2^{|L_i^v|} - 1$.

Using Fano’s inequality, we can show the following.

Lemma 31.

Let $v$ be a vertex of the protocol $\overline{\Pi}_z$ in which Alice sends a message revealing whether $S_i$ lies in the linear span $\mathcal{L}_i^v$ (this is a message of type (A2) in Figure 1). If $H_i^v \ge \log_2 |\mathcal{L}_i^v| + \frac{1}{10} \log_2 k + \frac{1}{3}$, then $S_i$ does not lie in $\mathcal{L}_i^v$ with probability at least $\frac{1}{100}$.

Proof.

Define $Y$ to be equal to $S_i$ if $S_i \in \mathcal{L}_i^v$, and to be equal to some fixed element of $\mathcal{L}_i^v$ otherwise. Note that with this definition, $P(S_i \notin \mathcal{L}_i^v) = P(S_i \neq Y)$. Denote this probability by $\varepsilon$. Then, by Fano's inequality, we have
$$H(S_i) - \log_2 |\mathcal{L}_i^v| \le H(S_i) - H(Y) \le H(S_i \mid Y) \le H(\varepsilon) + \varepsilon \log_2 k.$$

Assume that $\varepsilon < 1/100$. Then
$$H(S_i) < \frac{1}{10} \log_2 k + \log_2 |\mathcal{L}_i^v| + H(1/100).$$

Since $H(1/100) < 1/3$, we arrive at a contradiction with the assumption of the lemma.

Now let us show that if at vertex $v$ Alice or Bob sends a one-bit message, then the entropy $H(x)$ does not decrease significantly on average.

Lemma 32.

Let $I_v\colon P_{z,v} \to \{0,1\}$ denote the bit of information that Alice or Bob sends at vertex $v$. Then
$$H(x \mid I_v) \ge H(x) - H(I_v) \ge H(x) - 1.$$
Proof.

We have that
$$H(x \mid I_v) + H(I_v) = H(x, I_v) \ge H(x),$$

and hence
$$H(x \mid I_v) \ge H(x) - H(I_v) \ge H(x) - 1.$$

In other words, the entropy $H$ of the distribution of $x$ drops by no more than 1 on average at each step of the protocol. Next, we introduce the deficiency of the distribution. Let
$$U_i^v = \begin{cases} \log_2 k, & \text{if } S_i \text{ is not fixed}; \\ 0, & \text{if } S_i \text{ is fixed}, \end{cases}$$

and let the deficiency be
$$D^v = \left(\sum_{i=1}^{n} U_i^v\right) - H^v.$$

Note that $H(S_i) = 0$ if $S_i$ is fixed. Thus, $D^v \ge 0$ for any $v$, as $\sum_{i=1}^{n} U_i^v \ge \sum_{i=1}^{n} H_i^v \ge H^v$. We will omit the superscript $v$ when the vertex is clear from the context.

Now we consider $(x,y) \sim U(P_z)$ and introduce some random variables counting the various types of messages of $\overline{\Pi}_z$ on the input $(x,y)$.

For this we consider a finer classification of the messages sent by Alice, compared to the one in Figure 1:

(A1) A message that Alice sends in $\Pi$.

(A2') A message "$S_i \in \mathcal{L}_i$?" such that $H_i \ge \log_2 |\mathcal{L}_i| + 1/3 + \frac{1}{10} \log_2 k$ before this message is sent. In other words, in these messages $S_i$ was not previously fixed.

(A2'') A message "$S_i \in \mathcal{L}_i$?" such that $H_i < \log_2 |\mathcal{L}_i| + 1/3 + \frac{1}{10} \log_2 k$ before this message is sent. In other words, in these messages $S_i$ was previously fixed.

(A3') A message in which Alice sends the exact value of $S_i$, where $S_i$ was not fixed before this message.

(A3'') Same as (A3'), but $S_i$ was fixed before this message.

For Bob we use the same classification of messages as in Figure 1:

(B1) A message that Bob sends in $\Pi$.

(B2) A message in which Bob communicates the value $y_{S_i}$.

Let $A_1, A_2', A_2'', A_3', A_3'', B_1, B_2$ denote the numbers of messages of the corresponding types sent by the protocol during the whole computation. All of these are random variables depending on $(x,y)$, which we draw randomly: $(x,y) \sim U(P_z)$.

The following inequality holds.

Observation 33.

$\mathbb{E} A_2' \le 100 \, \mathbb{E} B_1$.

Proof.

When Alice sends a message of type (A2'), by Lemma 31 the secret set $S_i$ does not lie in $\mathcal{L}_i$ with probability at least $\frac{1}{100}$. If it does not lie in $\mathcal{L}_i$, the processing of Bob's message is over. Thus, for each message of type (B1) there are no more than 100 messages of type (A2') on average.

Observation 34.

At the end of the computation, the number of fixed $S_i$ is greater than or equal to $A_3'$. In particular, there are at least $A_3'$ coordinates $i$ such that $U_i$ decreased from $\log_2 k$ to 0. During each of these decreases of the $U_i$'s, $D$ decreases by at least $\max\{0, \frac{9}{10} \log_2 k - 1/3 - \log_2 |\mathcal{L}_i|\}$ on average.

Proof.

By definition, $A_3'$ is precisely equal to the number of fixed secret sets $S_i$. When the status of $S_i$ changes from unfixed to fixed, $U_i$ drops from $\log_2 k$ to 0. Since $U_i$ changes to 0 only when $H_i$ is less than $\log_2 |\mathcal{L}_i| + 1/3 + \frac{1}{10} \log_2 k$ (as ensured by the if-statements on lines 20 and 24), sending $S_i$ transmits no more than $\log_2 |\mathcal{L}_i| + 1/3 + \frac{1}{10} \log_2 k$ bits of information. Thus, we get the required decrease of $D$ by at least $\max\{0, \frac{9}{10} \log_2 k - 1/3 - \log_2 |\mathcal{L}_i|\}$ on average.

Observation 35.

Messages of types (B2), (A2''), and (A3'') do not affect $D$.

Proof.

A message of type (B2) does not change $D$, since pairs $(x,y) \in P_z$ have the property that $g(x_i, y_i) = z_i$; thus the message $y_{S_i}$ does not decrease the entropy of the distribution.

Just before a message of type (A2'') is sent, by definition we have $H_i < \log_2 |\mathcal{L}_i| + 1/3 + \frac{1}{10} \log_2 k$. But lines 20–21 ensure that if $H_i$ is lower than this threshold, then $S_i$ is fixed. Thus, $H_i$ is not only lower than $\log_2 |\mathcal{L}_i| + 1/3 + \frac{1}{10} \log_2 k$, but actually equals 0. Hence, (A2'') does not influence $D$.

By definition, $S_i$ is fixed before a message of type (A3'') is sent. Thus, $H_i = 0$ and sending $S_i$ does not reveal any information.

Observation 36.

At the end of the protocol's execution, $\sum_i |L_i| \le B_1 + B_2$.

Proof.

A new set $Q$ is generated and added to the list of $Q_i$'s either when a message of type (B1) is sent, or after a message of type (B2) is sent.

Let us show how the desired bound follows from these lemmas. Note that if $S_i$ is revealed, then it must be fixed. As a consequence, $B_2 \le A_3'$, since $B_2$ equals the number of revealed sets and $A_3'$ equals the number of fixed ones. Our goal is to upper bound the number of revealed sets; thus, it suffices to upper bound $\mathbb{E} A_3'$ by $O(\mathbb{E}(A_1 + B_1)/\log_2 k)$, as $A_1 + B_1$ is exactly the number of messages sent by the protocol $\Pi$.

Lemma 37.

$\mathbb{E} A_3' = O(\mathbb{E}(A_1 + B_1)/\log_2 k)$.

Proof.

By Lemma 32, messages of types (A1) and (B1) increase $D$ by no more than 1 on average. The same is true for messages of type (A2'). As a result, by Observations 34 and 35,
$$\mathbb{E}\left(-A_3'\left(\tfrac{9}{10}\log_2 k - \tfrac{1}{3}\right) + \sum_i \log_2 |\mathcal{L}_i| + A_1 + B_1 + A_2'\right) \ge 0,$$

since $D$ is always greater than or equal to 0. Applying Observation 33 and rearranging, we get
$$\mathbb{E}\left(\sum_i \log_2 |\mathcal{L}_i| + 101 B_1 + A_1\right) \ge \mathbb{E} A_3' \left(\tfrac{9}{10}\log_2 k - \tfrac{1}{3}\right).$$

Now we consider two cases.

  1. $k \ge 3$.

    Note that $\sum_i \log_2 |\mathcal{L}_i| \le \sum_i |L_i| \le B_1 + B_2$. Using the fact that $B_2 \le A_3'$ and the inequality above, we obtain
$$\mathbb{E}(102 B_1 + A_1) \ge \mathbb{E} A_3' \left(\tfrac{9}{10}\log_2 k - \tfrac{4}{3}\right).$$

    Since $\frac{9}{10}\log_2 k > \frac{4}{3}$ for $k \ge 3$, we get $\mathbb{E} A_3' = O\left(\mathbb{E}\frac{A_1 + B_1}{\log_2 k}\right)$.

  2. $k = 2$.

    Note that if a message of type (A3') is sent when $|L_i| \le 1$, then $D$ decreases by at least $1 - (1/10 + 1/3) \ge \frac{1}{2}$ on average. The number of $i$'s such that $|L_i| \ge 2$ does not exceed $\frac{B_1 + B_2}{2}$ by Observation 36. As a result,
$$\mathbb{E}\left(-\tfrac{1}{2}\left(A_3' - \tfrac{B_1 + B_2}{2}\right) + A_1 + B_1 + A_2'\right) \ge 0.$$

    Using $B_2 \le A_3'$, we get
$$\mathbb{E}\left(\tfrac{5}{4} B_1 + 100 B_1 + A_1\right) \ge \tfrac{1}{4} \mathbb{E} A_3'.$$

    Thus, we again obtain $\mathbb{E} A_3' = O(\mathbb{E}(A_1 + B_1))$.

References

  • [1] Nikhil Bansal and Makrand Sinha. k-forrelation optimally separates quantum and classical query complexity. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2021, pages 1303–1316, New York, NY, USA, 2021. Association for Computing Machinery. doi:10.1145/3406325.3451040.
  • [2] Paul Beame and Sajin Koroth. On disperser/lifting properties of the index and inner-product functions. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 14:1–14:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITCS.2023.14.
  • [3] Eric Blais, Li-Yang Tan, and Andrew Wan. An inequality for the Fourier spectrum of parity decision trees. CoRR, abs/1506.01055, 2015. arXiv:1506.01055.
  • [4] Arkadev Chattopadhyay, Yuval Filmus, Sajin Koroth, Or Meir, and Toniann Pitassi. Query-to-communication lifting using low-discrepancy gadgets. SIAM J. Comput., 50(1):171–210, 2021. doi:10.1137/19M1310153.
  • [5] Arkadev Chattopadhyay, Michal Koucký, Bruno Loff, and Sagnik Mukhopadhyay. Simulation theorems via pseudo-random properties. Comput. Complex., 28(4):617–659, 2019. doi:10.1007/S00037-019-00190-7.
  • [6] Arkadev Chattopadhyay, Nikhil S. Mande, Swagato Sanyal, and Suhail Sherif. Lifting to parity decision trees via stifling. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 33:1–33:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITCS.2023.33.
  • [7] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory 2nd Edition (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, July 2006.
  • [8] Ankit Garg, Mika Göös, Pritish Kamath, and Dmitry Sokolov. Monotone circuit lower bounds from resolution. Theory Comput., 16:1–30, 2020. doi:10.4086/TOC.2020.V016A013.
  • [9] Mika Göös and Toniann Pitassi. Communication lower bounds via critical block sensitivity. SIAM J. Comput., 47(5):1778–1806, 2018. doi:10.1137/16M1082007.
  • [10] Mika Göös, Toniann Pitassi, and Thomas Watson. Deterministic communication vs. partition number. SIAM J. Comput., 47(6):2435–2450, 2018. doi:10.1137/16M1059369.
  • [11] Mika Göös, Toniann Pitassi, and Thomas Watson. Query-to-communication lifting for BPP. SIAM J. Comput., 49(4), 2020. doi:10.1137/17M115339X.
  • [12] Hamed Hatami, Kaave Hosseini, and Shachar Lovett. Structure of protocols for XOR functions. SIAM J. Comput., 47(1):208–217, 2018. doi:10.1137/17M1136869.
  • [13] Bruno Loff and Sagnik Mukhopadhyay. Lifting theorems for equality. In Rolf Niedermeier and Christophe Paul, editors, 36th International Symposium on Theoretical Aspects of Computer Science, STACS 2019, March 13-16, 2019, Berlin, Germany, volume 126 of LIPIcs, pages 50:1–50:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.STACS.2019.50.
  • [14] Shachar Lovett, Raghu Meka, Ian Mertz, Toniann Pitassi, and Jiapeng Zhang. Lifting with sunflowers. In Mark Braverman, editor, 13th Innovations in Theoretical Computer Science Conference, ITCS 2022, January 31 - February 3, 2022, Berkeley, CA, USA, volume 215 of LIPIcs, pages 104:1–104:24. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.ITCS.2022.104.
  • [15] Frédéric Magniez, Ashwin Nayak, Miklos Santha, and David Xiao. Improved bounds for the randomized decision tree complexity of recursive majority. In Luca Aceto, Monika Henzinger, and Jiří Sgall, editors, Automata, Languages and Programming, pages 317–329, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg. doi:10.1007/978-3-642-22006-7_27.
  • [16] Toniann Pitassi and Robert Robere. Strongly exponential lower bounds for monotone computation. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 1246–1255. ACM, 2017. doi:10.1145/3055399.3055478.
  • [17] Ran Raz and Pierre McKenzie. Separation of the monotone NC hierarchy. Comb., 19(3):403–435, 1999. doi:10.1007/S004930050062.
  • [18] Robert Robere, Toniann Pitassi, Benjamin Rossman, and Stephen A. Cook. Exponential lower bounds for monotone span programs. In Irit Dinur, editor, IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, New Brunswick, New Jersey, USA, pages 406–415. IEEE Computer Society, 2016. doi:10.1109/FOCS.2016.51.
  • [19] Miklos Santha. On the Monte Carlo Boolean decision tree complexity of read-once formulae. In Proceedings of the Sixth Annual Structure in Complexity Theory Conference, Chicago, Illinois, USA, June 30 - July 3, 1991, pages 180–187. IEEE Computer Society, 1991. doi:10.1109/SCT.1991.160259.
  • [20] Xiaodi Wu, Penghui Yao, and Henry S. Yuen. Raz-McKenzie simulation with the inner product gadget. Electron. Colloquium Comput. Complex., TR17-010, 2017. URL: https://eccc.weizmann.ac.il/report/2017/010.