Abstract 1 Introduction 2 Preliminaries 3 An Infinitely-Often Algebrization Barrier for 𝖯𝗈𝗌𝗍𝖡𝖯𝖤 References

New Algebrization Barriers to Circuit Lower Bounds via Communication Complexity of Missing-String

Lijie Chen ORCID University of California, Berkeley, CA, USA Yang Hu ORCID Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China Hanlin Ren ORCID Institute for Advanced Study, Princeton, NJ, USA
Abstract

The algebrization barrier, proposed by Aaronson and Wigderson (STOC ’08, ToCT ’09), captures the limitations of many complexity-theoretic techniques based on arithmetization. Notably, several circuit lower bounds that overcome the relativization barrier (Buhrman–Fortnow–Thierauf, CCC ’98; Vinodchandran, TCS ’05; Santhanam, STOC ’07, SICOMP ’09) remain subject to the algebrization barrier.

In this work, we establish several new algebrization barriers to circuit lower bounds by studying the communication complexity of the following problem, called XOR-Missing-String: For m<2n/2, Alice gets a list of m strings x1,,xm{0,1}n, Bob gets a list of m strings y1,,ym{0,1}n, and the goal is to output a string s{0,1}n that is not equal to xiyj for any i,j[m].

  1. 1.

    We construct an oracle A1 and its multilinear extension A1~ such that 𝖯𝗈𝗌𝗍𝖡𝖯𝖤A1~ has linear-size A1-oracle circuits on infinitely many input lengths. That is, proving 𝖯𝗈𝗌𝗍𝖡𝖯𝖤i.o.-𝖲𝖨𝖹𝖤[O(n)] requires non-algebrizing techniques. This barrier follows from a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication lower bound for XOR-Missing-String. This is in contrast to the well-known algebrizing lower bound 𝖬𝖠𝖤(𝖯𝗈𝗌𝗍𝖡𝖯𝖤)𝖯/poly.

  2. 2.

    We construct an oracle A2 and its multilinear extension A2~ such that 𝖡𝖯𝖤A2~ has linear-size A2-oracle circuits on all input lengths. Previously, a similar barrier was demonstrated by Aaronson and Wigderson, but in their result, A2~ is only a multiquadratic extension of A2. Our results show that communication complexity is more useful than previously thought for proving algebrization barriers, as Aaronson and Wigderson wrote that communication-based barriers were “more contrived”. This serves as an example of how XOR-Missing-String forms new connections between communication lower bounds and algebrization barriers.

  3. 3.

    Finally, we study algebrization barriers to circuit lower bounds for 𝖬𝖠𝖤. Buhrman, Fortnow, and Thierauf proved a sub-half-exponential circuit lower bound for 𝖬𝖠𝖤 via algebrizing techniques. Toward understanding whether the half-exponential bound can be improved, we define a natural subclass of 𝖬𝖠𝖤 that includes their hard 𝖬𝖠𝖤 language, and prove the following result: For every super-half-exponential function h(n), we construct an oracle A3 and its multilinear extension A3~ such that this natural subclass of 𝖬𝖠𝖤A3~ has h(n)-size A3-oracle circuits on all input lengths. This suggests that half-exponential might be the correct barrier for 𝖬𝖠𝖤 circuit lower bounds w.r.t. algebrizing techniques.

Keywords and phrases:
circuit lower bound, algebrization barrier, missing string, communication complexity
Copyright and License:
[Uncaptioned image] © Lijie Chen, Yang Hu, and Hanlin Ren; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Complexity classes
; Theory of computation Circuit complexity ; Theory of computation Communication complexity
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2025/184/
Editor:
Shubhangi Saraf

1 Introduction

Proving unconditional circuit lower bounds is one of the major challenges in theoretical computer science, with the holy grail of proving 𝖭𝖯𝖯/poly. Unfortunately, our progress toward this goal is barely satisfactory, as it is even open to prove a super-polynomial size lower bound for huge, exponential-time classes such as 𝖭𝖤𝖷𝖯. Only for even larger complexity classes such as Σ2𝖤𝖷𝖯 [27], which are in (or beyond) the second level of exponential-time hierarchy, are super-polynomial lower bounds known.

Our lack of progress in proving circuit lower bounds is partially explained by a series of barrier results such as relativization [10], natural proofs [37], and algebrization [3]. Among these barriers, relativization and algebrization are particularly relevant to lower bounds against unrestricted circuits for large complexity classes. For example, Wilson [44] constructed an oracle world where 𝖯𝖭𝖯 has linear-size circuits, which explains our inability to prove fixed-polynomial size lower bounds for 𝖯𝖭𝖯.

Missing-String: a duality-based approach to relativizing circuit lower bounds.

Previous relativization barriers for circuit lower bounds are proved in an ad hoc fashion, which involves carefully analyzing the interaction between the oracle and the machines to be diagonalized [44, 12, 2]. A recent paper by Vyas and Williams [43] introduced the Missing-String problem as a systematic approach to such relativization barriers:

Problem 1 (Missing-String(n,m)).

Let n,m be such that m<2n. Given (query access to) a list of length-n strings x1,x2,,xm, output a length-n string that is not in this list.

The query complexity of Missing-String captures relativizing circuit lower bounds in the following sense: relativization barriers to proving 𝒞-𝖤𝖷𝖯𝖯/poly are essentially 𝒞dt lower bounds for Missing-String. (Here, 𝒞-𝖤𝖷𝖯 is the exponential-time version of 𝒞 and 𝒞dt means the decision-tree version of 𝒞.) This connection was made explicit in [43], who demonstrated an equivalence between relativization barriers to exponential lower bounds for Σ2𝖤 and the non-existence of small depth-3 circuits for Missing-String.111𝖤=𝖣𝖳𝖨𝖬𝖤[2O(n)] denotes single-exponential time and 𝖤𝖷𝖯=𝖣𝖳𝖨𝖬𝖤[2nO(1)] denotes exponential time; classes such as Σ2𝖤 and Σ2𝖤𝖷𝖯 are defined analogously. Exponential time and single-exponential time are basically interchangeable in the context of super-polynomial lower bounds by a padding argument.

Intriguingly, Missing-String connects circuit lower bounds with circuit upper bounds and provides a duality-based approach to both: Decision tree upper bounds for Missing-String imply relativizing circuit lower bounds, and decision tree lower bounds for Missing-String imply relativized worlds with circuit upper bounds (i.e., relativization barrier to circuit lower bounds). Besides providing a clean and systematic method for relativization barriers, this “algorithmic” perspective has indeed made progress in circuit lower bounds: By designing algorithms for the Range Avoidance problem [30, 32, 38], which is the “white-box” version of Missing-String,222In the Range Avoidance problem, we are given the description of a circuit C:{0,1}n{0,1}n+1 and our goal is to output a string y{0,1}n+1 that is not in the range of C. It is easy to see that relativizing algorithms for Range Avoidance are equivalent to decision trees for Missing-String. recent work [15, 33] proved an exponential-size, relativizing lower bound for the complexity class Σ2𝖤 (which also implies a quasi-polynomial size depth-3 circuit upper bound for Missing-String, settling the question in [43]).

The quest of algebrization.

However, the relativization barrier does not capture many circuit lower bounds that follow from nonrelativizing results such as 𝖨𝖯=𝖯𝖲𝖯𝖠𝖢𝖤 [35, 40] and 𝖬𝖨𝖯=𝖭𝖤𝖷𝖯 [8, 9], whose proofs are based on nonrelativizing proof techniques such as arithmetization. This includes circuit lower bounds for 𝖯𝖯 [42] and 𝖬𝖠 [12, 39], both of which are provably nonrelativizing [12, 2].

To shed light on the limitations of such proof techniques, Aaronson and Wigderson [3] proposed the algebrization barrier: a lower bound statement 𝒞𝒟 algebrizes if 𝒞A~𝒟A for all oracles A and all low-degree extensions A~ of A. Aaronson and Wigderson showed that many arithmetization-based results indeed algebrize; in particular, 𝖬𝖠𝖤A~𝖯A/poly for every oracle A and low-degree extension A~ of A [3, Theorem 3.17]. In fact, all super-polynomial lower bounds against general circuits that we are aware of are algebrizing. Thus, algebrization is a more suitable framework than relativization for capturing the limitations of our current techniques.

Unfortunately, our understanding of the algebrization barrier remains primitive. For instance, several aspects of the algebrizing lower bound 𝖬𝖠𝖤𝖯/poly [12] remain poorly understood even with respect to algebrizing techniques:

  • This lower bound only holds infinitely-often. That is, [12] only exhibited a language L𝖬𝖠𝖤 where there are infinitely many input lengths on which L has high circuit complexity. Recently, the relativizing infinitely-often lower bound for Σ2𝖤 was improved to almost everywhere by [33], and it is tempting to ask whether the same improvement can be made with respect to this lower bound. Does 𝖬𝖠𝖤 require large circuit complexity on every input length? If so, can we show this via algebrizing proof techniques?

  • This lower bound is only sub-half-exponential. Roughly speaking, a function h is half-exponential if h(h(n))2n. Half-exponential bounds appear naturally in many win-win analyses in complexity theory [36], and [12] is no exception – it is only known how to prove a size-h(n) lower bound for 𝖬𝖠𝖤 when h is smaller than half-exponential. The recent “iterative win-win paradigm” [17, 15, 16] provides new techniques for overcoming the half-exponential barrier, which has indeed improved the circuit lower bound for Σ2𝖤𝖷𝖯 to a near-maximum (2n/n) one [15]. Can we use similar techniques to prove an exponential size lower bound for 𝖬𝖠𝖤? If so, can we show this via algebrizing proof techniques?

Our lack of understanding on algebrization barriers raises the following question:

Is there a duality-based approach to algebrization barriers?

​​In particular, can we establish algebrization barriers via lower bounds for Missing-String? It follows from [3] that communication lower bounds for Missing-String imply algebrization barriers to circuit lower bounds.333In fact, lower bounds for Missing-String in the algebraic query model suffices. However, we find the algebraic query model somewhat counter-intuitive to work with (for example, the input to the query algorithms consists of a Missing-String instance along with its low-degree extension). The “transfer principle” [3, Section 4.3] allows us to simulate such query algorithms by communication protocols, hence we choose to study communication complexity. Hence, a naïve attempt is to prove communication lower bounds for Missing-String and translate them into algebrization barriers. Unfortunately, it turns out that Missing-String admits an efficient deterministic communication protocol by a simple binary search.444Suppose that m2n/10, Alice has inputs x1,x2,,xm{0,1}n and Bob has inputs y1,y2,,ym{0,1}n. For each string s{0,1}n, it costs only O(logm) bits of communication to obtain the value f(s):=|{i:xis}|+|{i:yis}|. Hence, we can use binary search to find two (lexicographically) adjacent strings pred(s) and s such that f(pred(s))=f(s) by communicating O(nlogm) bits. Clearly, s is not in the set {x1,,xm,y1,,ym}.

The main conceptual contribution of this paper is the following communication problem:

Problem 2 (XOR-Missing-String(n,m)).

Let n,m be such that m<2n/2. Alice receives m strings x1,x2,,xm{0,1}n and Bob receives y1,y2,,ym{0,1}n. Their goal is to communicate with each other and output an n-bit string s such that s is not equal to xiyj for every 1i,jm. Here denotes the bit-wise XOR operation over strings.

We show that XOR-Missing-String is hard for various communication complexity models, and transfer our communication complexity lower bounds to algebrization barriers. We now discuss our results in more detail.

1.1 Barriers to 𝐩𝐫-𝗣𝗼𝘀𝘁𝗕𝗣𝗘 Circuit Lower Bounds

Our first result is an algebrization barrier to proving pr-𝖯𝗈𝗌𝗍𝖡𝖯𝖤i.o.-𝖲𝖨𝖹𝖤[O(n)], i.e., an almost-everywhere circuit lower bound for pr-𝖯𝗈𝗌𝗍𝖡𝖯𝖤. Here, pr-𝖯𝗈𝗌𝗍𝖡𝖯𝖤 is the class of promise problems decided by a probabilistic exponential-time machine with postselection [23]: conditioned on some event (which may happen with exponentially small probability), the machine outputs the correct answer with high probability. Postselection is a powerful computational resource: 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 contains 𝖬𝖠 (thus 𝖭𝖯) [23] and 𝖯𝗈𝗌𝗍𝖡𝖰𝖯 (polynomial-time quantum computation with postselection) equals 𝖯𝖯 [1].555For example, a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 machine can solve 𝖭𝖯-complete problems as follows. With exponentially small probability, output “No” and halt. If this does not happen, choose an 𝖭𝖯 witness uniformly at random and “kill yourself” if this witness is invalid (that is, we postselect on the event that our witness is valid). If the algorithm survives, it outputs “Yes.” Conditioned on survival, with high probability, our algorithm outputs “Yes” on Yes instances and outputs “No” on No instances.

Theorem 3.

There exists an oracle A1 and its multilinear extension A1~ such that

pr-𝖯𝗈𝗌𝗍𝖡𝖯𝖤A1~i.o.-𝖲𝖨𝖹𝖤A1[O(n)].

In particular, this also implies

pr-𝖬𝖠𝖤A1~i.o.-𝖲𝖨𝖹𝖤A1[O(n)].

The proof of pr-𝖬𝖠𝖤𝖯/poly is algebrizing [12, 39, 3]. Our result implies that improving this circuit lower bound to almost-everywhere requires non-algebrizing techniques.

​​​Theorem 3 follows from a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication lower bound for XOR-Missing-String:

Theorem 4.

Let n1 and 20nm<2n/2 be integers. Any 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol that solves XOR-Missing-String(n,m) with error 25n must have communication complexity Ω(m).

Assume that nm2o(n). A trivial protocol for XOR-Missing-String is to output a uniformly random n-bit string, which has communication complexity O(n) and error m2/2n. Thus, Theorem 4 states that even if postselection is allowed, if we want to beat the error bound of this trivial protocol (2n), then the amount of communication needed is close to the maximum (Ω(m)).

Moreover, since the error of any pseudodeterministic666A randomized protocol for a search problem is pseudodeterministic [20] if there is a canonical output that is correct and outputted with probability 2/3. The trivial protocol for XOR-Missing-String that outputs a random guess is not pseudodeterministic, since running it two times using different randomness is likely to yield different answers. communication protocol can be reduced by a 2k factor by repeating the protocol Θ(k) times and taking the majority answer, Theorem 4 implies that any pseudodeterministic 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 protocol that solves XOR-Missing-String(n,m) (with correct probability, say, 2/3) must have communication complexity Ω(m/n). This stands in stark contrast to the case without pseudodeterministic constraints, as the trivial protocol is correct with probability 1m2/2n2/3. We also remark that lower bounds against pseudodeterministic 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocols suffice for constructing an oracle A such that 𝖯𝗈𝗌𝗍𝖡𝖯𝖤A~ has small A-oracle circuits; however, we will need the full power of Theorem 4 against every low-error protocol to prove that the promise version of 𝖯𝗈𝗌𝗍𝖡𝖯𝖤A~ has small A-oracle circuits.

1.2 Barriers to 𝖡𝖯𝖤 Circuit Lower Bounds

Our second result is an algebrization barrier to proving 𝖡𝖯𝖤𝖲𝖨𝖹𝖤[O(n)], i.e., an infinitely-often circuit lower bound for 𝖡𝖯𝖤:

Theorem 5.

There exists an oracle A2 and its multilinear extension A2~ such that

𝖡𝖯𝖤A2~𝖲𝖨𝖹𝖤A2[O(n)].

Previously, Aaronson and Wigderson [3] constructed an oracle A and its multiquadratic extension A~ such that 𝖡𝖯𝖤𝖷𝖯A~𝖯A/poly. Their proof works with the algebraic query model directly, and it is unclear how to prove the same result for multilinear extensions using their techniques.

Besides demonstrating the power of our communication- and duality-based approach, an algebrization barrier w.r.t. multilinear extension also aligns better with the notion of affine relativization [7]. A statement 𝒞𝒟 is said to affine relativize if 𝒞A~𝒟A~ for every A~ that is the multilinear extension of some oracle A. The crucial difference from algebrization (in the sense of [3]) is that the left-hand side is also required to relativize with the multilinear extension. Affine relativization is arguably a cleaner and more robust notion than algebrization.777For example, algebrization is not necessarily closed under modus ponens: Although it requires non-algebrizing techniques to prove, say, 𝖭𝖤𝖷𝖯𝖯/poly, it might still be possible to exhibit a complexity class 𝒞 and prove both 𝒞𝖭𝖤𝖷𝖯 and 𝒞𝖯/poly via algebrizing techniques. In contrast, affine relativization is clearly closed under modus ponens. It is important to only take multilinear extensions, as [7] was only able to show that, e.g., 𝖯𝖲𝖯𝖠𝖢𝖤A~𝖨𝖯A~ for every multilinear oracle A~; it is unclear if such a statement holds when A~ is merely multiquadratic. (This is also why Aaronson and Wigderson [3] choose to relativize 𝖨𝖯 with the low-degree extension of A but to relativize 𝖯𝖲𝖯𝖠𝖢𝖤 with A itself.) In this regard, Theorem 5 also shows that 𝖡𝖯𝖤𝖯/poly is not affine relativizing (since 𝖡𝖯𝖤A~𝖲𝖨𝖹𝖤A[O(n)]𝖲𝖨𝖹𝖤A~[O(n)]).

To obtain Theorem 5, we prove a lower bound for XOR-Missing-String against multiple pseudodeterministic 𝖡𝖯𝖯 protocols simultaneously in the following model.

  • There are t communication protocols Q1,Q2,,Qt and t inputs (X1,Y1), (X2,Y2), , (Xt,Yt).

  • The goal of protocol Qi is to solve the input (Xi,Yi). However, each protocol is given the inputs to other protocols as well. More precisely, in each Qi, Alice gets (X1,X2,,Xt) as inputs and Bob gets (Y1,Y2,,Yt) as inputs.888We remark that in the case of XOR-Missing-String, each Xi and Yi is already a list of strings, which means Alice and Bob get t lists of strings each.

  • We say that the protocols succeed if at least one of the protocols Qi outputs a correct answer.

  • For proving algebrization barriers, each Qi and (Xi,Yi) should have a different size and correspond to different input lengths of the oracle, but we omit this detail in the informal description here. We refer the reader to the full version of our paper for details.

The point of this model is that the protocols are allowed to look at other protocols’ inputs and to perform win-win analyses: the correctness of Qi may rely on the failure of some other protocol Qi.999If the possibility of such win-win analyses sounds surprising, it may help to compare it with the following classic puzzle. There are N prisoners, each assigned a (not necessarily distinct) number between 0 and N1. Each prisoner can see others’ numbers but not their own. Each prisoner must guess their own number simultaneously. If at least one prisoner guesses correctly, they are all set free; if none of them do, they are all executed. There is a simple solution that guarantees the prisoners’ freedom. The i-th prisoner (0iN1) assumes that the total sum of all numbers is congruent to i modulo N, and then infers their own number as (isum of other prisoners’ numbers)modN. Exactly one assumption will be correct, so that prisoner will guess their own number correctly. Indeed, this scenario models a variety of win-win analyses in complexity theory; see e.g., [15, Section 1.4.2] and [34, Section 5.2]. Circuit lower bounds for 𝖬𝖠 [12, 39] can also be modeled as win-win algebraic query algorithms for Missing-String; see Appendix A of the full version.

We show that efficient pseudodeterministic protocols require large communication complexity to solve XOR-Missing-String, even if they receive multiple instances and are allowed to perform win-win analyses in the above sense.

Theorem 6 (Informal).

In the above model, suppose that each (Xi,Yi) is an instance of XOR-Missing-String(ni,mi) but the communication complexity of each Qi is “much less” than mi. Then there exists a sequence of inputs {(Xi,Yi)}i[t] such that every Qi fails to solve its corresponding instance (Xi,Yi) pseudodeterministically.

1.3 Barriers to 𝖬𝖠𝖤 Circuit Lower Bounds

Finally, we present algebrization barriers to improving the half-exponential lower bound for 𝖬𝖠𝖤 [12]. While we are unable to fully resolve the algebrizing circuit complexity of 𝖬𝖠𝖤, we show that for a natural subclass of 𝖬𝖠𝖤 which includes the hard language in [12], non-algebrizing techniques are required to go beyond half-exponential bounds.

Recall that a standard algorithm MA~ in 𝖬𝖠𝖤A~ defines a hard language if, relative to this particular oracle A~, MA~ satisfies the 𝖬𝖠 promise and defines a language without small A-oracle circuits; we do not care about the behavior of MB~ for other oracles B~. In contrast, we say that an 𝖬𝖠𝖤 machine M is a robust machine for defining a hard language (or simply “is robust”), if for any oracle B and its multilinear extension B~ such that MB~ satisfies the 𝖬𝖠 promise, the language computed by MB~ does not have small B-oracle circuits. The use of interactive proofs in [12] naturally leads to hard languages defined by robust 𝖬𝖠 machines; see appendix A of the full version for details.101010In fact, the definition of robust 𝖬𝖠 resembles closer to the class 𝖬𝖠𝖼𝗈𝖬𝖠 compared to 𝖬𝖠 itself. This is natural since single-valued 𝖥𝖬𝖠-constructions of hard truth tables (see, e.g., [15, Section 1.2.1]) give rise to circuit lower bounds for 𝖬𝖠𝖼𝗈𝖬𝖠, and indeed the lower bound proved in [12] is for hard languages in 𝖬𝖠𝖤𝖼𝗈𝖬𝖠𝖤. For ease of notation we use “robust 𝖬𝖠” (Rob-𝖬𝖠) instead of “robust 𝖬𝖠𝖼𝗈𝖬𝖠” (Rob-𝖬𝖠𝖼𝗈𝖬𝖠).

We use Rob-𝖬𝖠A (resp. Rob-𝖬𝖠𝖤A) to denote the class of languages computed by robust 𝖬𝖠 (resp. 𝖬𝖠𝖤) algorithms with access to oracle A. Our main result is that proving a super-half-exponential bound for languages in 𝖤 or robust 𝖬𝖠𝖤 would require non-algebrizing techniques:

Theorem 7 (Informal).

There exists an oracle A3 and its multilinear extension A3~ such that both 𝖤A3~ and Rob-𝖬𝖠𝖤A3~ admit half-exponential size A3-oracle circuits.

Why study robust 𝗠𝗔𝗘?

We present two reasons for studying the notion of robust 𝖬𝖠𝖤.

Firstly, Theorem 7 is tight in the sense that, in the two cases of the win-win argument in [12], the hard language is either in 𝖤 (i.e., deterministic exponential time), or in robust 𝖬𝖠𝖤. Therefore, a slight adaptation of [12] proves that for every oracle A and its multilinear extension A~, 𝖤A~Rob-𝖬𝖠𝖤A~ does not have sub-half-exponential size A-oracle circuits.

Secondly, while the classes 𝖤 and Robh(n)-𝖬𝖠𝖤 may initially appear arbitrary, they in fact capture two fundamental types of failures for algorithms in 𝖬𝖠𝖤𝖼𝗈𝖬𝖠𝖤:

  • For MRobh(n)-𝖬𝖠𝖤, if MA~ fails to achieve high circuit complexity, it must be due to an input x on which MA~ is semantically incorrect. That is, the corresponding truth table entry is undefined. We call this a failure due to no positive.

  • In contrast, for M𝖤, the value MA~(x) is always well-defined for all x, regardless of the oracle A. If MA~ has low circuit complexity, it must be because its truth table is easy for A-oracle circuits. We call this a failure due to a false positive.

In general, for an algorithm M𝖬𝖠𝖤𝖼𝗈𝖬𝖠𝖤, failure to achieve high circuit complexity can be attributed to both types, depending on the oracle. Thus, 𝖤 and Robh(n)-𝖬𝖠𝖤 represent two extreme cases within 𝖬𝖠𝖤𝖼𝗈𝖬𝖠𝖤.

Our proof of Theorem 7 entails techniques that can force the two types of failures to occur. In this sense, the proof says something fundamental about 𝖬𝖠𝖤𝖼𝗈𝖬𝖠𝖤. However, our current methods are limited to handling algorithms that exhibit only a single failure mode. For more general algorithms in 𝖬𝖠𝖤𝖼𝗈𝖬𝖠𝖤 that can have mixed types of failures, our understanding remains incomplete.

Proof of Theorem 7 via a communication lower bound.

To prove algebrization barriers to circuit lower bounds for robust 𝖬𝖠𝖤, we need to understand the robust 𝖬𝖠 communication complexity of XOR-Missing-String. Here, an 𝖬𝖠 protocol P is called robust if on every input (X,Y), Merlin (i.e., the prover) cannot convince the protocol to output a non-solution for (X,Y) except with very small probability. Underlying Theorem 7 is the following lower bound stating that efficient 𝖬𝖠 protocols for XOR-Missing-String that are robust can only be correct on a tiny fraction of inputs:

Lemma 8.

Let n1,1m<2n/2 be parameters. Let P be a robust 𝖥𝖬𝖠 communication protocol of complexity C attempting to solve XOR-Missing-String(n,m). Then the fraction of inputs that P solves is at most

2Ω(m/C)+O(C+n).

Intuitively, since there is no efficient way to verify whether a string is a solution, any protocol that solves a large number of instances of XOR-Missing-String is essentially guessing the answer (similar to the naïve protocol that succeeds with probability 12O(n)), which means that it must make mistakes. Since robust protocols are not allowed to output false positives, they can only be correct on a tiny fraction of inputs.

Although Lemma 8 is useful for analyzing the behaviors of robust 𝖬𝖠 protocols, it does not directly imply any half-exponential bounds. Indeed, we need to carefully diagonalize against both 𝖤A~ and Robh(n)-𝖬𝖠𝖤A~ simultaneously, which leads to a half-exponential bound. We discuss this in more detail in Subsection 1.4.

1.4 Technical Overview

Next, we briefly overview our proof techniques. In fact, the engine behind all of our communication lower bounds is the following observation about XOR-Missing-String:

Lemma 9 (Informal version of Lemma 17).

For every large enough rectangle R=𝒳×𝒴 and every answer s, there is a large enough subrectangle R=𝒳×𝒴 (𝒳𝒳 and 𝒴𝒴) such that s is a wrong answer for the XOR-Missing-String problem on every input (X,Y)R.

With this lemma in hand, it is not hard to prove Theorem 4. We characterize a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol as a distribution over labeled rectangles (using approximate majority covers [29]), where for each input (X,Y), the output of the protocol is given by the label of a randomly selected rectangle that contains (X,Y). Using Lemma 9, for every large enough labeled rectangle, there exists a large subrectangle in which the label is never a solution to XOR-Missing-String. Lemma 9 thus translates to a lower bound for the success probability of large labeled rectangles (i.e., the probability that the label is a solution, over a random input in the rectangle). This then translates to a lower bound for (weighted) average success probability over a random input, because the distribution consists mostly of large rectangles when the protocol is efficient. This lower bound implies that efficient protocols cannot have very high success probability.

However, more work needs to be done to obtain our other communication lower bounds and algebrization barriers.

BPP communication lower bounds.

We prove our pseudodeterministic 𝖡𝖯𝖯 communication lower bound in the “win-win model” (Theorem 6) via a reduction to the 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication lower bound (Theorem 4). We first use an induction on the number of protocols; assume towards a contradiction that Theorem 6 holds for any sequence of t1 protocols but does not hold for some sequence of t protocols Q1,,Qt. Then the following 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol solves XOR-Missing-String efficiently: Given an input (X,Y), guess t1 inputs {(Xi,Yi)}i[t1] uniformly at random, set (Xt,Yt):=(X,Y), and postselect on the event that all of the first t1 protocols fail. That is, for every 1it1, Qi fails to solve (Xi,Yi) given {(Xi,Yi)}i[t]. By our induction hypothesis, this event happens with probability strictly greater than 0. If this happens, then Qt solves (Xt,Yt) correctly. This contradicts Theorem 4.

For clarity, we have made a crucial simplification in the above description: To obtain algebrization barriers, we also need to prove lower bounds against list-solvers for XOR-Missing-String, which are communication protocols that output a small set of answers such that one of the answers is correct. Fortunately, it is not hard to adapt the proof of Theorem 4 to prove a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication lower bound for list solvers; see the full version for details.

𝗠𝗔 communication lower bounds against robust protocols.

A robust protocol can be described as a collection of randomized verifiers Vw,π, where Vw,π(X,Y)=1 means that when the input is (X,Y) and π is given as proof, the protocol accepts w as output. In the formal proof, we apply error reduction on the verifiers, so that when w is not a solution to (X,Y), Vw,π(X,Y)=1 with extremely small probability.

Thus, if a protocol has large success probability, there must exist some answer string w and some proof π, such that over a random input (X,Y), Pr[Vw,π(X,Y)=1] is large. We interpret this Vw,π as a distribution over labeled rectangles, and apply Lemma 9 to show that conditioned on Vw,π(X,Y)=1, the verifier makes many mistakes (i.e., Pr[Vw,π(X,Y)=1w is not a solution to (X,Y)] is large). This implies that the protocol must make mistakes on some (X,Y), so it cannot be robust.

The half-exponential barrier for robust 𝗠𝗔𝗘.

We formulate the problem in terms of refuting communication protocols, where we need to refute a finite set of protocols (deterministic or robust) on each input length. More formally, to construct an algebrized world with circuit upper bound h(n), our oracle encodes one instance (Xi,Yi) of XOR-Missing-String(2i,2h(i)) for each input length i. Each protocol being diagonalized is dedicated to solving the one instance that corresponds to its input length (it can nevertheless see the other instances).

We employ an inductive approach to construct the oracle. At step i, we maintain a rectangle111111We interpret our oracles as the concatenation of Alice’s and Bob’s inputs, so it is reasonable to talk about a “rectangle” of oracles. Ri of oracles, such that the rectangle is relatively large, and any protocol that works on input length at most i fails to solve its instance on any oracle ARi.

The goal of every step is then to slightly shrink the rectangle Ri1 into Ri, so that the protocols working on length i are refuted. To achieve this, we employ a combination of two different strategies:

  • Refuting deterministic protocols: Given a deterministic protocol P working on length i and a large enough rectangle Ri, we can always find a large subrectangle RRi such that P outputs the same answer on every oracle in R (this is by definition of deterministic protocols). By Lemma 9, there exists a large subrectangle R′′R, such that for any oracle AR′′, P does not solve (Xi,Yi) on A. We then update RiR′′ to refute P.

  • Refuting robust protocols: Given a robust protocol Q working on length i and a large enough rectangle Ri, a random oracle from Ri refutes Q with high probability. This is true as long as the density of Ri is sufficiently larger than the success probability of Q on a uniformly random input (as upper bounded in Lemma 8).

On their own, both strategies work extremely well: The first strategy is very similar to the proof of 𝖭𝖤𝖷𝖯A~𝖯A/poly by [3], and can be used to obtain much better circuit upper bounds than half-exponential. For the second strategy, since robust protocols are very weak, a randomly selected oracle would refute all robust protocols with high probability.

However, it is unclear how to combine the two strategies. On the one hand, after refuting a deterministic protocol P by the first strategy, the resulting rectangle Ri may be too small for the second strategy to work. On the other hand, although the set of oracles RRi that refutes a robust protocol Q is large, it may not be a rectangle, so we cannot use it to refute the next deterministic protocol.

To integrate these two strategies, we refute each robust protocol at a carefully chosen time step. Suppose that we refute a robust protocol Qi working on length i after all deterministic protocols working on lengths <k are refuted, where k=k(i) is some function on i. Also, recall that we want to prove an algebrized circuit upper bound of size h(n).

  • After refuting the deterministic protocols working on lengths <k, we end up with a rectangle that contains roughly a 22O(k) fraction of all oracles. This is because the deterministic protocols working on lengths <k run in time 2O(k). In contrast, Lemma 8 implies that Qi only solves a 22h(i) fraction of oracles (since it attempts to solve an instance of XOR-Missing-String(2i,2h(i))). Therefore, if 22O(k)22h(i), then we can refute Qi.

  • On the other hand, after refuting Qi, we still need to refute the deterministic protocols working on length k. Since these protocols attempt to solve XOR-Missing- String(2k,2h(k)), to apply Lemma 9, this requires our current rectangle Rk to have size at least roughly 22h(k). Since Qi is equivalent to a deterministic protocol running in time 22O(i) (by enumerating the random bits and Merlin’s proof), after refuting it, we still have a rectangle of density 222O(i). We thus need that 222O(i)22h(k).

Overall, we require that kh(i) and 2ih(k). This implies that h must be super-half-exponential.

1.5 Related Works

Prior work on super-polynomial circuit lower bounds.

Kannan’s seminal work [27] proved a super-polynomial size lower bound for Σ2𝖤𝖷𝖯; since then, a sequence of work has proved circuit lower bounds for various complexity classes such as 𝖹𝖯𝖤𝖷𝖯𝖭𝖯 [31, 11], 𝖲2𝖤𝖷𝖯 [14, 13], 𝖯𝖤𝖷𝖯 [42, 2], 𝖬𝖠𝖤𝖷𝖯 [12, 39], 𝖹𝖯𝖤𝖷𝖯MCSP [26, 24], and more [41, 16]. Such lower bounds are usually proved by Karp–Lipton theorems [28, 9, 18]: If a large uniform class (usually 𝖯𝖧 or 𝖤𝖷𝖯) admits polynomial-size circuits, then it collapses to a smaller uniform class (such as Σ2𝖯). As explained in [36] and [15, Section 1.4.1], such a strategy naturally yields a half-exponential bound. Recently, [15, 33] proved near-maximum (2n/n) circuit lower bounds for the classes Σ2𝖤𝖷𝖯, 𝖹𝖯𝖤𝖷𝖯𝖭𝖯, and 𝖲2𝖤𝖷𝖯, improving upon previous half-exponential bounds. Near-maximum circuit lower bounds are also known for several classes with subexponential amount of advice bits, such as 𝖡𝖯𝖤𝖷𝖯MCSP/2nε [24] and 𝖠𝖬𝖤𝖷𝖯/2nε [16].

Algebrization.

The interactive proof results such as 𝖨𝖯=𝖯𝖲𝖯𝖠𝖢𝖤 [35, 40] and 𝖬𝖨𝖯=𝖭𝖤𝖷𝖯 [8] generated much excitement among complexity theorists, as they are the first “truly compelling” ([4]) non-relativizing results in complexity theory. There has been much discussion on the extent to which these results are non-relativizing, and how to adapt the relativization barrier to accommodate these results [6, 19]. Arguably, the most influential work in this direction is that of Aaronson and Wigderson [3] on the algebrization barrier, which nicely captures interactive proof results and arithmetization-based techniques. However, Aaronson and Wigderson’s algebrization barrier has its own subtleties (such as not being closed under modus ponens, see footnote 7), which leads to several proposed refinements of its definition [25, 7]. We also mention the bounded relativization barrier, recently proposed by Hirahara, Lu, and Ren [24], which attempts to capture the interactive proof results entirely within the original Baker–Gill–Solovay [10] framework of relativization.

1.6 Open Problems

We think the most important open problem left in our paper is to study the algebrizing circuit complexity of 𝖬𝖠𝖤. Can we strengthen Theorem 7 to hold for all of 𝖬𝖠𝖤 instead of just “robust” 𝖬𝖠𝖤 algorithms? If not, can we prove an exponential size lower bound for 𝖬𝖠𝖤 (potentially with one bit of advice) using algebrizing techniques?

Another open question is to prove a 𝖯𝖯 communication lower bound for XOR-Missing- String, which would imply an algebrization barrier to proving almost-everywhere circuit lower bounds for 𝖯𝖤𝖷𝖯. Such circuit lower bounds are known in the infinitely-often regime [12, 42].

Finally, our work did not examine the regime of fixed-polynomial circuit lower bounds. Using algebrizing techniques, Santhanam [39] proved that for every constant k1, 𝖬𝖠/1𝖲𝖨𝖹𝖤[nk]. Aaronson and Wigderson [3] speculated that it might be possible to eliminate the advice bit and prove 𝖬𝖠𝖲𝖨𝖹𝖤[nk] using “tried-and-true arithmetization methods,” but there has been no progress in this direction. Is there an algebrizing barrier to proving 𝖬𝖠𝖲𝖨𝖹𝖤[nk]?

2 Preliminaries

Definition 10.

A search problem f over domain 𝒳×𝒴 and range 𝒪 is defined using a relation (𝒳×𝒴)×𝒪. For any input (X,Y)𝒳×𝒴, the valid solutions for f on (X,Y) are those s such that (X,Y,s) (we say that 𝒔 is a solution to f(X,Y)). We say that f is a total problem, if for any (X,Y), there is at least one valid solution.

2.1 Complexity Classes

We assume familiarity with basic complexity classes such as 𝖯, 𝖭𝖯, 𝖡𝖯𝖯 and their exponential-time versions 𝖤𝖷𝖯, 𝖭𝖤𝖷𝖯, 𝖡𝖯𝖤𝖷𝖯. The reader is encouraged to consult standard textbooks [5, 21] or the Complexity Zoo121212https://complexityzoo.net/, accessed Nov 28, 2025. for their definition.

Definition 11 (𝖯𝗈𝗌𝗍𝖡𝖯𝖯).

A promise problem Π=(Πyes,Πno) is in pr-𝖯𝗈𝗌𝗍𝖡𝖯𝖯 if there exist two polynomial-time algorithms A,B, taking an input x{0,1}n and randomness r{0,1}poly(n) (they take the same randomness), such that the following holds for every xΠyesΠno.

  • If xΠyes, then Prr[A(x,r)=1B(x,r)=1]2/3.

  • If xΠno, then Prr[A(x,r)=1B(x,r)=1]1/3.

  • Prr[B(x,r)=1]>0.

We say that the algorithm postselects on the event that B(x,r)=1.

Definition 12 (𝖬𝖠𝖼𝗈𝖬𝖠).

A promise problem Π=(Πyes,Πno) is in pr-(𝖬𝖠𝖼𝗈𝖬𝖠) if there exist two polynomial-time algorithms (verifiers) V0,V1, taking an input x{0,1}n, a proof π{0,1}poly(n) and randomness r{0,1}poly(n) (they take different randomness), such that the following holds for every xΠyesΠno.

  • If xΠyes, then V1 accepts x and V0 rejects x.

  • If xΠno, then V0 accepts x and V1 rejects x.

Here, for k=0,1, we say that

  • Vk accepts x, if there exists a proof π such that Prr[Vk(x,π,r)=1]2/3.

  • Vk rejects x, if for any proof π, we have that Prr[Vk(x,π,r)=1]1/3.

2.2 Algebrization

In this section, we present some definitions regarding algebrization barriers. The definitions are in accordance with [3], with slight modifications.

Definition 13 (Oracle).

An oracle A is a collection of Boolean functions Am:{0,1}m{0,1}, one for each m. Given a complexity class 𝒞, by 𝒞A we mean the class of languages decidable by a 𝒞 machine that can query Am for any m of its choice.

We say that a separation 𝒞𝒟 does not algebrize, if there exist an oracle A and a low-degree extension A~ of A, such that 𝒞A~𝒟A. Throughout this paper, we only consider multilinear extensions, which are the simplest class of low-degree extensions.

Definition 14 (Multilinear Extension over Finite Fields).

Let Am:{0,1}m{0,1} be a Boolean function, and let 𝔽 be a finite field. The (unique) multilinear extension of Am over 𝔽 is the linear function A~m,𝔽:𝔽m𝔽 that agrees with Am on {0,1}m. Given an oracle A=(Am), the multilinear extension A~ of A is the collection of multilinear extensions A~m,𝔽, one for each positive integer m and finite field 𝔽.

Given a complexity class 𝒞, by 𝒞A~ we mean the class of languages decidable by a 𝒞 machine that can query A~m,𝔽 for any m,𝔽 of its choice. Moreover, we assume that each query to A~m,𝔽 takes O(mlog|𝔽|) time.

An important property of multilinear extensions is that algorithms having access to A~ can be simulated by communication protocols where Alice and Bob are each given half of A as input. This reduces proving algebrization barriers to proving communication lower bounds. Formally, we have the following theorem, which is a simple corollary of [3, Theorem 4.11].

Theorem 15.

Let A be an oracle, and let A0 (resp. A1) be the subfunction of A obtained by restricting the first bit to 0 (resp. 1). Let M be a deterministic algorithm that has oracle access to A~ and runs in time T(|x|) when given x as input. There exists a deterministic communication protocol P in which Alice (resp. Bob) is given x and the function A0 (resp. A1) as input, such that P uses O(T(|x|)3) bits of communication, and agrees with M(x) on any input x and any oracle A.

We remark that the proof of [3, Theorem 4.11] can be generalized to produce randomized communication protocols from randomized algorithms, 𝖬𝖠 communication protocols from 𝖬𝖠 algorithms, and so on.

3 An Infinitely-Often Algebrization Barrier for 𝖯𝗈𝗌𝗍𝖡𝖯𝖤

In this section, we prove the following algebrization barrier:

See 3

3.1 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 Communication Lower Bounds for XOR-Missing-String

Theorem 3 follows from the following communication lower bound for XOR-Missing-String:

See 4

In this paper, 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocols are defined as follows:131313There are many different definitions in the literature (see e.g., [22]). Here, we choose a simple definition that is best suited for proving the barrier.

Definition 16 (𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocols for search problems).

Let f be a search problem over domain 𝒳×𝒴 and range 𝒪. A 𝗣𝗼𝘀𝘁𝗕𝗣𝗣 communication protocol Π for f is defined as follows: Let k be the length of the public randomness used by Π, and let {Πr}r{0,1}k be a set of deterministic communication protocols, each having communication complexity c. On input (X,Y)𝒳×𝒴, protocol Π first samples a uniformly random string r{0,1}k, then simulates Πr, whose output can be or any value from 𝒪. The communication complexity of Π is defined to be k+c.

We say that Π solves f with error ϵ, if for any input (X,Y)𝒳×𝒴,

Prr[Πr(X,Y) is a solution to f(X,Y)|Πr(X,Y)]1ϵ.

We say that Π is pseudodeterministic, if for any input (X,Y)𝒳×𝒴, there exists some answer s𝒪, such that

Pr[Πr(X,Y)=s|Πr(X,Y)]2/3.

The main technical observation for our lower bound is that for any large enough rectangle R, no single answer s{0,1}n will be correct for every input (X,Y)R. In fact, there is always a 2Θ(n) fraction of inputs in R on which s is not a valid solution.

To use this result in the lower bound, note that a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol can be seen as a distribution of labeled rectangles (i.e., approximate majority covers, as defined in Subsubsection 3.1.2). If a protocol has small communication complexity, then it must contain many large rectangles, so we can apply the aforementioned lower bound on individual rectangles, which implies that the protocol must have error 2O(n).

3.1.1 A Lower Bound for One Rectangle

In this section, we prove the main technical lemma for the communication lower bound. It is helpful to draw an analogy from the decision tree version of Missing-String: Suppose that the input is a list of m n-bit strings. Whenever the decision tree only queries the input on m1 bits, there always exists a string in the input list that is not queried. In this case, regardless of the decision tree’s answer, the adversary can arbitrarily modify the value of that string and make the decision tree fail.

Here, we show that the XOR-Missing-String problem has a similar property: If, conditioned on the communication history, the rectangle formed by the possible inputs is large (this corresponds to the decision tree making a small number of queries), then regardless of the protocol’s answer, there always exists a significant portion of the rectangle, on which the protocol fails.

Lemma 17.

Let n1,1m<2n/2 be integers. Let R{0,1}nm×{0,1}nm be a rectangle (i.e., R is of the form 𝒳×𝒴, where 𝒳,𝒴{0,1}nm) of size at least 22nmm+2. Let s be any n-bit string. Then there exists some R, which is a subrectangle of R and has size at least 22n2|R|, such that for any (X,Y)R, s is not a solution to XOR-Missing-String on (X,Y).

Proof.

Without loss of generality, we only have to prove the lemma for the case where s=0n. When s0n, we consider a different rectangle Rs, in which every input string x for Alice in every input list X𝒳 is replaced by xs. By the definition of XOR-Missing-String, the lemma holds for R and s if and only if it holds for Rs and 0n.

To prove the lemma, we need to show that there exists a large enough subrectangle R in R, in which 0n is never a solution to XOR-Missing-String. Recall that R is equal to 𝒳×𝒴, where 𝒳 is a set of Alice’s inputs and 𝒴 is a set of Bob’s inputs. For any string s{0,1}n, let 𝒳s={X𝒳,sX} denote the subset of 𝒳 that contains the string s; 𝒴s is defined similarly.

For any s{0,1}n, consider the subrectangle 𝒳s×𝒴sR. Since any input list (Alice’s or Bob’s) in the subrectangle always contains s, 0n is never a solution in 𝒳s×𝒴s.

In the remaining, we show that there exists some s such that 𝒳s×𝒴s is large enough (i.e., has size 22n2|R|) using a counting argument. If we can show this, then the lemma follows by letting R=𝒳s×𝒴s.

Let cxs=|𝒳s| denote the number of occurrences of string s in 𝒳. Note that, if some input X𝒳 contains multiple copies of s, X is only counted once in cxs. Let x1,,x2n be all the n-bit strings, sorted in non-increasing order of cx. Similarly define cys and y1,,y2n. We have the following lower bound on the number of occurrences of xi:

Claim 18.

For any 1i2n, cxxi is at least

|𝒳|(i1)m2ni+1.
Proof.

Let k=iicxxi, i.e., the sum of occurrences of xi.

Consider the inputs X in 𝒳 that only contain strings from x1,,xi1. The number of such X is at most (i1)m. Therefore, at least |𝒳|(i1)m inputs in 𝒳 contain at least one string from xi. Each of the |𝒳|(i1)m inputs contribute at least one to the sum k. That is,

k=iicxxi|𝒳|(i1)m.

Since xi occurs at least as frequently as any string in xi+1,,x2n, we have that kcxxi(2ni+1). Therefore,

cxxi k2ni+1|𝒳|(i1)m2ni+1.

The same bound also applies to cyyi. Fix any 1i2n and consider the strings x1,,xi and y1,,y2ni+1. By the pigeonhole principle, there must exist some string s that appears in both {x1,,xi} and {y1,,y2ni+1}. Using Claim 18, we can show that 𝒳s×𝒴s is large, that is,

|𝒳s×𝒴s| =cxscys
cxxicyy2ni+1 (by definition of the lists x,y)
|𝒳|(i1)m2ni+1|𝒴|(2ni)mi. (by Claim 18)

Let A=|𝒳|,B=|𝒴|. It now remains to show that, for any A,B such that 1A,B2nm and AB22nmm+2, there exists some 1i2n such that the rectangle chosen above is large enough. That is,

A(i1)m2ni+1B(2ni)mi22n2AB. (1)

Let i1 be the largest integer such that 2(i1)mA. Note that 22nm>A, so we must have i2n. In this case, we have that 2(2ni)mB, since if not, then

2(i)m2(2ni)m>AB (2(i)m>A by definition of i)
(i(2ni))m>AB/422nmm
i(2ni)>22n1,

which is impossible since i(2ni)22(n1) for any i. Therefore, by plugging i into (1), we have that

A(i1)m2ni+1B(2ni)mi
A/22ni+1B/2i (2(i1)mA and 2(2ni)mB)
A/22nB/22n=22n2AB.

3.1.2 Reducing 𝗣𝗼𝘀𝘁𝗕𝗣𝗣 Communication to Approximate Majority Covers

To use Lemma 17 in the lower bound proof, we use the characterization of 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocols by distributions of labeled rectangles called approximate majority covers [29].

Definition 19 (approximate majority covers).

Let f be a search problem over the domain 𝒳×𝒴 and range 𝒪. An approximate majority cover for f is a (multi) set of labeled rectangles 𝒜𝒞={(R,s)}, where R𝒳×𝒴 is a rectangle, i.e., R is of the form A×B for some A𝒳,B𝒴, and s𝒪 is the label of R. The size of 𝒜𝒞 is defined as the number of labeled rectangles.

We say that the approximate majority cover solves f with error ϵ, if for any (X,Y)𝒳×𝒴, we have that

Pr(R,s)𝒜𝒞[s is a solution to f(X,Y)|(X,Y)R]1ϵ.

That is, when we randomly sample from 𝒜𝒞 a labeled rectangle (R,s) that contains (X,Y), the label s is a solution with probability 2/3.

Claim 20.

Let f be a total search problem over domain 𝒳×𝒴 and range 𝒪. If there exists a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol Π that solves f with error ϵ and has complexity C, then there exists an approximate majority cover that solves f with error ϵ and has size 2C.

Proof.

Assume that Π uses kC bits of randomness. That is, Π is the uniform distribution over 2k deterministic communication protocols {Πr}, where each protocol Πr has complexity Ck.

Construct an approximate majority cover 𝒜𝒞 as follows: Initially, 𝒜𝒞 is empty. Each deterministic protocol Πr partitions the input space into at most 2Ck pairwise disjoint rectangles, where the answers of Πr in each rectangle are the same. For each such rectangle R of Πr, suppose the answer of Πr on R is v, we add (R,v) to 𝒜𝒞 if and only if v.

It is clear that the size of 𝒜𝒞 is at most 2C. Moreover, for any (X,Y)𝒳×𝒴, there is a one-to-one correspondence between the rectangles in 𝒜𝒞 containing (X,Y) and the protocols Πr for which Πr(X,Y). Therefore, the value of Π(X,Y) is distributed the same as the label of a uniformly random rectangle in 𝒜𝒞 that contains (X,Y).

3.1.3 Putting Everything Together

With the components in place, we can now prove Theorem 4.

See 4

Proof.

​​​Let Π be a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol that solves XOR-Missing-String(n,m) with error 25n, and let C denote the complexity of Π. We convert Π into an approximate majority cover 𝒜𝒞 using Claim 20. ​​We have that 𝒜𝒞 solves XOR-Missing-String(n,m) with error 25n and has size 2O(C). We now show that the size of 𝒜𝒞 is at least 2Ω(m), which proves the lemma.

Let S=(R,s)𝒜𝒞|R| denote the total size of the rectangles. Since 𝒜𝒞 is correct, each input (X,Y){0,1}nm×{0,1}nm must be contained in at least one rectangle, which means that S22nm.

Let

Ccorrect (X,Y){0,1}nm×{0,1}nm(R,s)𝒜𝒞,R(X,Y)
[s is a solution to XOR-Missing-String on (X,Y)].

Since 𝒜𝒞 solves XOR-Missing-String(n,m) with error 25n, we have that

Ccorrect
(X,Y){0,1}nm×{0,1}nm(125n)(R,s)𝒜𝒞,R(X,Y)1 (each (X,Y) is correct w.h.p.)
= (125n)(R,s)𝒜𝒞(X,Y)R1
= (125n)S.

On the other hand,

Ccorrect
= (R,s)𝒜𝒞(X,Y)R[s is a solution to XOR-Missing-String on (X,Y)]
= (R,s)𝒜𝒞|R|22nmm+2(X,Y)R[s is a solution to XOR-Missing-String on (X,Y)]
+(R,s)𝒜𝒞|R|<22nmm+2(X,Y)R[s is a solution to XOR-Missing-String on (X,Y)].

It follows from Lemma 17 that the first summand is at most

(R,s)𝒜𝒞|R|22nmm+2(122n2)|R|(124n)S.

Let |𝒜𝒞| denote the size of 𝒜𝒞, then the second summand is at most

(R,s)𝒜𝒞|R|<22nmm+2|R| |𝒜𝒞|22nmm+2.
|𝒜𝒞|S/2m2. (since S22nm)

If |𝒜𝒞|2m/2, then the second summand is at most 2m/2+2S, which is at most 28nS since m20n. Summing everything together, we have that

Ccorrect(124n+28n)S,

which contradicts the previous bound of Ccorrect(125n)S. Hence it must be the case that |𝒜𝒞|>2m/2.

Lower bounds for pseudodeterministic protocols.

A consequence of Theorem 4 is that it also applies to pseudodeterministic 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocols for XOR-Missing-String with constant error (say 1/3), as the error probability for such protocols can always be reduced to as small as possible by running it multiple times and outputting the majority answer. In contrast, the error probability of an arbitrary (non-pseudodeterministic) protocol cannot be amplified in general, since it is unclear how to verify whether a string is a valid solution to XOR-Missing-String.

Claim 21.

Let n1,1m<2n/2 be integers. Let Π be a pseudodeterministic 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol that solves XOR-Missing-String(n,m) with error 1/3 and has communication complexity C. Then for any k1, there exists a pseudodeterministic 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol Πk that solves XOR-Missing-String(n,m) with error 2Θ(k) and has communication complexity O(Ck).

By setting k=Θ(n) in the above lemma and applying Theorem 4, we have:

Corollary 22.

Let n1 and 20nm<2n/2 be integers. Any pseudodeterministic 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol that solves XOR-Missing-String(n,m) with error probability 1/3 requires communication complexity Ω(m/n).

3.2 Proving the Barrier

See 3

Proof of Theorem 3.

Let M1,M2, be a syntactic enumeration of 𝖯𝗈𝗌𝗍𝖡𝖯𝖤 algorithms each running in time 2n (that is, each Mi may or may not satisfy the 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 promise). If we can prove a barrier for all such algorithms, then we can also prove a barrier for all algorithms running in time 2O(n) by a padding argument. We assume that each machine Mi appears infinitely many times in the list (Mi)i.

Let n1 be a large enough constant and set ni=4ni1 for every i>1. Let mi=ni20 for every i1. Recall that logarithms are always base 2. The oracle A that we construct will have the following structure: On input strings whose lengths are not of the form 2logmi, the value of A will always be zero. For every i1, the truth table of A on input length 2logmi will encode an instance (Xi,Yi) of XOR-Missing-String(ni,mi). More specifically, the truth table of A{0,1}2logmi restricted to inputs whose first bit is 0 (resp. 1) will be equal to Xi (resp. Yi) padded with zeros. Note that the length of (Xi,Yi) is 2mini22logmi.

For every i1, the instance (Xi,Yi) is designed to diagonalize against MiA~. More precisely, we want the following property to hold: let 𝒢𝒪𝒪𝒟i be the set of inputs x{0,1}logni such that the execution of MiA~(x) satisfies the semantics of 𝖯𝗈𝗌𝗍𝖡𝖯𝖯, then there exists a non-solution α{0,1}ni of the XOR-Missing-String instance (Xi,Yi) such that αx=MiA~(x) for every x𝒢𝒪𝒪𝒟i. Note that α=(Xi)a(Yi)b for some a,b[mi], hence by hardwiring a and b we can construct an A-oracle circuit of size O(logmi)O(logni) whose truth table is equal to α, and it follows that the pr-𝖯𝗈𝗌𝗍𝖡𝖯𝖤 problem defined by MiA~ on input length logni can be computed by linear-size A-oracle circuits. Therefore, it suffices to satisfy the aforementioned property.

Note that, when considering the truth table of MiA~ on input length logni, the algorithm MiA~ can only access A~ on inputs of length ni. Therefore, we only need to show that MiA~ni fails to solve (Xi,Yi). Since 2logmi+1>ni, this implies that the algorithm only sees (Xi,Yi).

We construct our oracle inductively. Suppose that we have constructed (Xj,Yj) for every j<i, and we need to construct (Xi,Yi). Let Pi be the following 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol that tries to solve XOR-Missing-String(ni,mi). Given an instance (X,Y), define an oracle A such that (X<i,Y<i) are the previously fixed values and (Xi,Yi)=(X,Y). Then, Alice and Bob output the “truth table” of MiA~ni on inputs of length logni, denoted as 𝐭𝐭{0,1}ni: for every x{0,1}logni, Alice and Bob simulate the computation of MiA~ni(x) using (the 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 version of) Theorem 15 repeatedly for (1000ni+1) times, and output the majority outcome as the x-th bit of 𝐭𝐭. (We use boldface to emphasize that 𝐭𝐭 is a random variable). Let 𝒢𝒪𝒪𝒟i(X,Y) denote the set of inputs x{0,1}logni such that the computation of MiA~ni(x) satisfies 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 promise, and let fi:𝒢𝒪𝒪𝒟i(X,Y){0,1} denote the promise problem defined by MiA~ni:

𝒢𝒪𝒪𝒟ib(X,Y)={x{0,1}logni:Pr[MiA~ni(x)=b|MiA~ni(x)]2/3},
𝒢𝒪𝒪𝒟i(X,Y)=𝒢𝒪𝒪𝒟i0(X,Y)𝒢𝒪𝒪𝒟i1(X,Y),
fi(x)={0if x𝒢𝒪𝒪𝒟i0(X,Y);1if x𝒢𝒪𝒪𝒟i1(X,Y).

We say that 𝐭𝐭 is consistent with fi if for every x𝒢𝒪𝒪𝒟i(X,Y), 𝐭𝐭x=fi(x). Since the computation of MiA~ni(x) is repeated (1000ni+1) times, the probability that 𝐭𝐭 is consistent with fi is at least 1210ni. On the other hand, note that the behavior of Pi is the same as some 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 algorithm that has oracle access to A~ and runs in time O(ni3) (since it considers ni possible inputs, simulates Mi for O(ni) times on each of them, and each simulation takes O(ni) time), hence Theorem 15 implies that Pi can be implemented by a 𝖯𝗈𝗌𝗍𝖡𝖯𝖯 communication protocol of complexity O(ni9)=o(mi). It follows from Theorem 4 that Pi cannot solve XOR-Missing-String(ni,mi) with success probability more than 125ni; in other words, there exists some (X,Y) such that 𝐭𝐭 is a non-solution of (X,Y) w.p. at least 25ni. By a union bound, there is a non-zero probability that both of the following hold simultaneously: 𝐭𝐭 is consistent with fi and 𝐭𝐭 is a non-solution of (X,Y). This means that there is a non-solution α{0,1}ni of (X,Y) such that for every x𝒢𝒪𝒪𝒟i(X,Y), MiA~ni(x)=αx. Setting (Xi,Yi)=(X,Y), this establishes the property we want for i, and we can continue our construction for i+1.

Due to space constraints, we omit the almost-everywhere algebrization barrier for 𝖡𝖯𝖤 (Theorem 5) and the half-exponential barrier for Rob-𝖬𝖠𝖤 (Theorem 7) and refer the interested reader to the full version of our paper.

References

  • [1] Scott Aaronson. Quantum computing, postselection, and probabilistic polynomial-time. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 461(2063):3473–3482, 2005. doi:10.1098/rspa.2005.1546.
  • [2] Scott Aaronson. Oracles are subtle but not malicious. In Conference on Computational Complexity (CCC), pages 340–354. IEEE Computer Society, 2006. doi:10.1109/CCC.2006.32.
  • [3] Scott Aaronson and Avi Wigderson. Algebrization: A new barrier in complexity theory. ACM Trans. Comput. Theory, 1(1):2:1–2:54, 2009. doi:10.1145/1490270.1490272.
  • [4] Eric Allender. Oracles versus proof techniques that do not relativize. In SIGAL International Symposium on Algorithms, volume 450 of Lecture Notes in Computer Science, pages 39–52. Springer, 1990. doi:10.1007/3-540-52921-7_54.
  • [5] Sanjeev Arora and Boaz Barak. Computational Complexity - A Modern Approach. Cambridge University Press, 2009. doi:10.1017/CBO9780511804090.
  • [6] Sanjeev Arora, Russell Impagliazzo, and Umesh Vazirani. Relativizing versus nonrelativizing techniques: the role of local checkability. Manuscript, 1992. URL: https://people.eecs.berkeley.edu/˜vazirani/pubs/relativizing.pdf.
  • [7] Barış Aydınlıoğlu and Eric Bach. Affine relativization: Unifying the algebrization and relativization barriers. ACM Trans. Comput. Theory, 10(1):1:1–1:67, 2018. doi:10.1145/3170704.
  • [8] László Babai, Lance Fortnow, and Carsten Lund. Non-deterministic exponential time has two-prover interactive protocols. Comput. Complex., 1:3–40, 1991. doi:10.1007/BF01200056.
  • [9] László Babai, Lance Fortnow, Noam Nisan, and Avi Wigderson. 𝖡𝖯𝖯 has subexponential time simulations unless 𝖤𝖷𝖯𝖳𝖨𝖬𝖤 has publishable proofs. Computational Complexity, 3:307–318, 1993. doi:10.1007/BF01275486.
  • [10] Theodore P. Baker, John Gill, and Robert Solovay. Relativizations of the 𝖯=?𝖭𝖯 question. SIAM J. Comput., 4(4):431–442, 1975. doi:10.1137/0204037.
  • [11] Nader H. Bshouty, Richard Cleve, Ricard Gavaldà, Sampath Kannan, and Christino Tamon. Oracles and queries that are sufficient for exact learning. J. Comput. Syst. Sci., 52(3):421–433, 1996. doi:10.1006/jcss.1996.0032.
  • [12] Harry Buhrman, Lance Fortnow, and Thomas Thierauf. Nonrelativizing separations. In CCC, pages 8–12, 1998. doi:10.1109/CCC.1998.694585.
  • [13] Jin-yi Cai. 𝖲2p𝖹𝖯𝖯𝖭𝖯. J. Comput. Syst. Sci., 73(1):25–35, 2007. doi:10.1016/j.jcss.2003.07.015.
  • [14] Jin-yi Cai, Venkatesan T. Chakaravarthy, Lane A. Hemaspaandra, and Mitsunori Ogihara. Competing provers yield improved Karp–Lipton collapse results. Inf. Comput., 198(1):1–23, 2005. doi:10.1016/j.ic.2005.01.002.
  • [15] Lijie Chen, Shuichi Hirahara, and Hanlin Ren. Symmetric exponential time requires near-maximum circuit size. In STOC, pages 1990–1999. ACM, 2024. doi:10.1145/3618260.3649624.
  • [16] Lijie Chen, Jiatu Li, and Jingxun Liang. Maximum circuit lower bounds for exponential-time arthur merlin. In STOC, pages 1348–1358. ACM, 2025. doi:10.1145/3717823.3718224.
  • [17] Lijie Chen, Zhenjian Lu, Igor C. Oliveira, Hanlin Ren, and Rahul Santhanam. Polynomial-time pseudodeterministic construction of primes. In FOCS, pages 1261–1270. IEEE, 2023. doi:10.1109/FOCS57990.2023.00074.
  • [18] Lijie Chen, Dylan M. McKay, Cody D. Murray, and R. Ryan Williams. Relations and equivalences between circuit lower bounds and Karp–Lipton theorems. In CCC, volume 137 of LIPIcs, pages 30:1–30:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.CCC.2019.30.
  • [19] Lance Fortnow. The role of relativization in complexity theory. Bull. EATCS, 52:229–243, 1994.
  • [20] Eran Gat and Shafi Goldwasser. Probabilistic search algorithms with unique answers and their cryptographic applications. Electronic Colloquium on Computational Complexity (ECCC), 18:136, 2011. URL: https://eccc.weizmann.ac.il/report/2011/136.
  • [21] Oded Goldreich. Computational complexity - a conceptual perspective. Cambridge University Press, 2008. doi:10.1017/CBO9780511804106.
  • [22] Mika Göös, Toniann Pitassi, and Thomas Watson. The landscape of communication complexity classes. Comput. Complex., 27(2):245–304, June 2018. doi:10.1007/s00037-018-0166-6.
  • [23] Yenjo Han, Lane A. Hemaspaandra, and Thomas Thierauf. Threshold computation and cryptographic security. SIAM J. Comput., 26(1):59–78, 1997. doi:10.1137/S0097539792240467.
  • [24] Shuichi Hirahara, Zhenjian Lu, and Hanlin Ren. Bounded relativization. In CCC, volume 264 of LIPIcs, pages 6:1–6:45. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.CCC.2023.6.
  • [25] Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova. An axiomatic approach to algebrization. In STOC, pages 695–704. ACM, 2009. doi:10.1145/1536414.1536509.
  • [26] Russell Impagliazzo, Valentine Kabanets, and Ilya Volkovich. The power of natural properties as oracles. In Computational Complexity Conference (CCC), volume 102 of LIPIcs, pages 7:1–7:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.CCC.2018.7.
  • [27] Ravi Kannan. Circuit-size lower bounds and non-reducibility to sparse sets. Inf. Control., 55(1-3):40–56, 1982. doi:10.1016/S0019-9958(82)90382-5.
  • [28] Richard M. Karp and Richard J. Lipton. Some connections between nonuniform and uniform complexity classes. In Symposium on Theory of Computing (STOC), pages 302–309. ACM, 1980. doi:10.1145/800141.804678.
  • [29] H. Klauck. Rectangle size bounds and threshold covers in communication complexity. In 18th IEEE Annual Conference on Computational Complexity, 2003. Proceedings., pages 118–134, 2003. doi:10.1109/CCC.2003.1214415.
  • [30] Robert Kleinberg, Oliver Korten, Daniel Mitropolsky, and Christos Papadimitriou. Total functions in the polynomial hierarchy. In ITCS, volume 185 of LIPIcs, pages 44:1–44:18, 2021. doi:10.4230/LIPIcs.ITCS.2021.44.
  • [31] Johannes Köbler and Osamu Watanabe. New collapse consequences of 𝖭𝖯 having small circuits. SIAM J. Comput., 28(1):311–324, 1998. doi:10.1137/S0097539795296206.
  • [32] Oliver Korten. The hardest explicit construction. In Symposium on Foundations of Computer Science (FOCS), pages 433–444. IEEE, 2021. doi:10.1109/FOCS52979.2021.00051.
  • [33] Zeyong Li. Symmetric exponential time requires near-maximum circuit size: Simplified, truly uniform. In STOC, pages 2000–2007. ACM, 2024. doi:10.1145/3618260.3649615.
  • [34] Zhenjian Lu, Igor C. Oliveira, Hanlin Ren, and Rahul Santhanam. On the complexity of avoiding heavy elements. In FOCS, pages 2403–2412. IEEE, 2024. doi:10.1109/FOCS61266.2024.00140.
  • [35] Carsten Lund, Lance Fortnow, Howard J. Karloff, and Noam Nisan. Algebraic methods for interactive proof systems. J. ACM, 39(4):859–868, 1992. doi:10.1145/146585.146605.
  • [36] Peter Bro Miltersen, N. V. Vinodchandran, and Osamu Watanabe. Super-polynomial versus half-exponential circuit size in the exponential hierarchy. In COCOON, volume 1627 of Lecture Notes in Computer Science, pages 210–220. Springer, 1999. doi:10.1007/3-540-48686-0_21.
  • [37] Alexander A. Razborov and Steven Rudich. Natural proofs. J. Comput. Syst. Sci., 55(1):24–35, 1997. doi:10.1006/JCSS.1997.1494.
  • [38] Hanlin Ren, Rahul Santhanam, and Zhikun Wang. On the range avoidance problem for circuits. In FOCS, pages 640–650. IEEE, 2022. doi:10.1109/FOCS54457.2022.00067.
  • [39] Rahul Santhanam. Circuit lower bounds for Merlin–Arthur classes. SIAM J. Comput., 39(3):1038–1061, 2009. doi:10.1137/070702680.
  • [40] Adi Shamir. 𝖨𝖯=𝖯𝖲𝖯𝖠𝖢𝖤. J. ACM, 39(4):869–877, 1992. doi:10.1145/146585.146609.
  • [41] Larry J. Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity of a small problem in logic. J. ACM, 49(6):753–784, 2002. doi:10.1145/602220.602223.
  • [42] N. V. Vinodchandran. A note on the circuit complexity of 𝖯𝖯. Theor. Comput. Sci., 347(1-2):415–418, 2005. doi:10.1016/j.tcs.2005.07.032.
  • [43] Nikhil Vyas and R. Ryan Williams. On oracles and algorithmic methods for proving lower bounds. In Innovations in Theoretical Computer Science Conference (ITCS), volume 251 of LIPIcs, pages 99:1–99:26. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ITCS.2023.99.
  • [44] Christopher B. Wilson. Relativized circuit complexity. J. Comput. Syst. Sci., 31(2):169–181, 1985. doi:10.1016/0022-0000(85)90040-6.