
Communication Complexity of Equality and Error-Correcting Codes

Dale Jacobs, Tufts University, Medford, MA, USA; John Jeang, Tufts University, Medford, MA, USA; Vladimir Podolskii, Tufts University, Medford, MA, USA; Morgan Prior, Tufts University, Medford, MA, USA; Ilya Volkovich, Boston College, MA, USA
Abstract

We study the public-coin randomized communication complexity of the equality function. The communication complexity of this function is known to be low when the error probability is constant and the players have access to many random bits. The complexity grows, however, if the allowed error probability and the amount of randomness are restricted.

We show that public-coin randomized protocols for equality and error-correcting codes are essentially the same object. That is, given a protocol for equality, we can construct a code, and vice versa.

We substantially extend the protocol-implies-code direction: any protocol computing a function with a large fooling set can be converted into an error-correcting code. As a corollary, we show that among functions with a fooling set of size s, equality on log s bits has the least randomized communication complexity, regardless of the restrictions on the error probability and the amount of randomness.

Finally, we use the connection to error-correcting codes to analyze the randomized communication complexity of equality for varying restrictions on the error probability and the amount of randomness. In most cases, we provide tight bounds. We pinpoint the setting in which tight bounds are still unknown.

Keywords and phrases:
communication complexity, randomized communication complexity, error-correcting codes
Copyright and License:
© Dale Jacobs, John Jeang, Vladimir Podolskii, Morgan Prior, and Ilya Volkovich; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Communication complexity; Theory of computation → Error-correcting codes; Theory of computation → Randomness, geometry and discrete structures
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2025/068/ [9]
Editors:
C. Aiswarya, Ruta Mehta, and Subhajit Roy

1 Introduction

In the standard model of communication complexity, two players, Alice and Bob, are tasked with computing the value of a fixed function f:{0,1}n×{0,1}n→{0,1} on some given input (x,y). However, Alice only knows x and Bob only knows y, so they must communicate to compute f(x,y). Communication complexity studies the amount of communication between players in terms of the number of bits exchanged. In recent decades, communication complexity has become one of the central areas in computational complexity, with numerous applications to other areas, including circuit complexity, proof complexity, online algorithms, and data structures [11, 19, 22, 10].

The equality function EQ:{0,1}n×{0,1}n→{0,1} is one of the most fundamental functions in communication complexity. On input (x,y), this function outputs 1 iff x=y. This was the first function for which an exponential separation between deterministic and randomized communication complexity was shown (see [23]). Later, EQ played a key role in many results in communication complexity, including direct sum results for the randomized setting [7], protocols for threshold functions [17], lifting theorems [14, 6], and communication complexity with an equality oracle [5, 18].

It is well known that the randomized communication complexity of EQ is low for constant error probability. We can consider both the private-coin and public-coin models, which we sometimes refer to as the private randomness setting and the public randomness setting, respectively. In the private randomness setting, there is an O(log n) protocol, usually attributed to Rabin and Yao (unpublished, see [11, 20]). For the case of public randomness, there is an O(1) protocol. The general idea of this protocol is that the players first use their shared randomness to sample a random hash function h. Then Alice sends Bob h(x), and Bob compares it to h(y). The most standard protocol of this type is for the players to pick a random vector r∈{0,1}n and compare the inner products ⟨x,r⟩ and ⟨y,r⟩ over 𝔽2. When x and y are not equal, the compared inner products differ with probability 1/2.
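As an illustration, the standard inner-product protocol can be simulated in a few lines (a minimal sketch; the shared vector r stands in for the public randomness, and the `rounds` parameter is our hypothetical knob for the error probability):

```python
import random

def eq_protocol(x: str, y: str, rounds: int = 20) -> bool:
    """Simulate the standard public-coin EQ protocol on bit strings x, y.

    In each round both players compute the inner product of their input with
    a shared random vector r over F_2; a mismatch proves x != y.  When x != y
    a single round detects the difference with probability 1/2, so the
    false-positive probability after `rounds` rounds is at most 2**-rounds.
    """
    assert len(x) == len(y)
    for _ in range(rounds):
        r = [random.randint(0, 1) for _ in range(len(x))]  # public randomness
        ip_x = sum(int(a) & b for a, b in zip(x, r)) % 2   # Alice sends 1 bit
        ip_y = sum(int(a) & b for a, b in zip(y, r)) % 2
        if ip_x != ip_y:
            return False  # inner products differ, so certainly x != y
    return True  # equal, or a false positive (probability <= 2**-rounds)
```

Note that Alice's total communication is `rounds` bits, independent of the input length n.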

Although the public-coin randomized complexity of EQ is O(1) for constant error probability and unlimited access to randomness, the complexity can increase when we impose stronger restrictions on these parameters. The error probability for EQ (as well as for any other function) can be reduced to any ε at the cost of a log(1/ε) factor in complexity by a standard error reduction procedure. Note that the EQ protocol above uses a large amount of randomness, which grows as the error probability is reduced. The number of random bits, which we denote by m, can be substantially reduced using the idea from Newman’s result on the connection between communication complexity in the public-coin and private-coin models [16, 13]. Another known construction of randomized protocols for EQ with restricted randomness is through error-correcting codes. In such a protocol, the players privately encode their inputs via an error-correcting code and compare them in a random coordinate (see Section 2.4 or [19, Chapter 3]).

As for lower bounds, Canetti and Goldreich [4] obtained general lower bounds in the setting with restricted randomness (by a reduction to the deterministic case; see Fact 14). For EQ, their result gives a lower bound of Ω(n/(ε2m)). They also provided examples of matching upper bounds for the case of constant error probability. Moran et al. [15] (see also [13]) gave a lower bound on communication complexity in the private-coin model in the small-error case for functions with large fooling sets. In particular, for EQ, their bound implies a lower bound of Ω(log n + log(1/ε)) in the private randomness setting. For public randomness, this implies a lower bound of Ω(log n + log(1/ε) − m). (Note that a public-coin protocol can be converted into a private-coin one by having one of the players generate the random bits and send them to the other player.) Despite these results, the entire landscape of communication complexity of EQ for varying ε and m is not fully clear.

Our Results

As mentioned above, error-correcting codes give communication protocols for EQ. In our first result, we observe that the converse is also true: communication protocols for EQ give rise to error-correcting codes. Thus, error-correcting codes and protocols for EQ are essentially the same object.

Theorem 1 (Informal version of Theorem 30 and Corollary 31).

Given a protocol for EQ, we can construct an error-correcting code with equivalent parameters (up to a constant factor), and vice versa. Furthermore, this connection is precise for one-sided error protocols; in this case, protocols and error-correcting codes are in one-to-one correspondence.

We obtain the following corollary:

Corollary 2 (Informal version of Corollary 32).

Any randomized public-coin protocol for EQ can be converted into a one-sided false-positive error protocol with no loss in parameters.

In particular, any communication protocol for EQ with ε1 false positive error and ε0 false negative error can be converted into a one-sided error protocol with ε0+ε1 false positive error (the communication complexity and the number of random bits used remain the same).

Next, we show that one direction of the result above generalizes to arbitrary functions in the following way:

Theorem 3 (Informal version of Theorem 33).

Any randomized public-coin protocol for a function with a large fooling set can be converted into an error-correcting code with the same parameters as for EQ.

As a consequence, we obtain the following corollary:

Corollary 4 (Informal version of Corollary 34).

Among all functions with a fooling set of size s, EQ on log s bits has the least randomized public-coin communication complexity for all values of error probability and number of random bits.

In other words, this corollary says that of all functions, EQ provides the largest separation between fooling set size and public-coin randomized communication complexity.

We next use the established connection between the communication complexity of EQ and error-correcting codes to provide a detailed analysis of the communication complexity k of EQ on n-bit inputs for varying amounts of random bits m and error ε. We start by translating bounds for error-correcting codes into lower bounds for communication complexity. In particular, based on the Plotkin bound for codes, we obtain the following lower bound on the randomized communication complexity of EQ in the public-coin model (up to a constant factor):

Theorem 5 (Informal version of Theorem 35).

Any randomized public-coin protocol for EQ satisfies the following lower bound for the communication complexity k:

k ≥ max(min(n, n/(ε2m)), log(1/ε)).

Furthermore, this lower bound is tight for many regimes of the parameters m,ε.

It is noted in [1] that the Plotkin bound is the best known bound for codes over a large alphabet. We confirm this by comparing it with the Singleton, Elias-Bassalygo, Griesmer, and MRRW bounds. We observe that, in fact, the combination of the Singleton and Elias-Bassalygo bounds yields the full lower bound in the theorem above (up to a constant factor).

Interestingly, the two known lower bounds for EQ mentioned above have natural interpretations in terms of codes. The result of [15] corresponds to the Elias-Bassalygo bound (after translating their result to the public randomness setting), and the result of [4] matches the Singleton bound.

Next, we further analyze the landscape of communication complexity of EQ. Depending on the values of the parameters m and ε, the expression in the theorem above can resolve (up to constant factors) to n, n/(ε2m), or log(1/ε). In the first case, the trivial deterministic protocol for EQ (see Section 2.2) is clearly optimal. In the second case, we show that for a wide range of parameters, the protocol for EQ obtained from Reed-Solomon codes is optimal. However, this code does not apply in a certain range of parameters, specifically when log(1/ε) < n/(ε2m) < m. In this case, we devise a protocol that is suboptimal, but close to the lower bound (specifically, we iterate the Reed-Solomon code with itself). Finally, in the third case, we show that the standard protocol combined with an improvement of Newman’s theorem (see Fact 15) for reducing the number of random bits gives an upper bound that is tight up to additive lower-order terms. Previously, this was known only for the case of constant ε [4].

Other related work

Fleischer et al. [8] studied the relation between the number of random bits and the zero-error communication complexity of the list-non-disjointness function. Ball et al. [2] gave a lower bound on randomized communication complexity with defective randomness in terms of the size of the fooling set. The paper [15] analyzes communication protocols by looking at the statistical distance between the runs of the protocol on distinct inputs from the fooling set. The difference from our approach is that we consider the Hamming distance between the vectors of outcomes instead.

One could apply round elimination techniques (e.g., [3]) to transform an arbitrary protocol for EQ into a one-way protocol, thus obtaining an alternative proof of Corollary 2 (or Corollary 32). This allows for an alternative proof of the translation from protocols to codes in Theorem 1: we can prove the translation for one-way protocols and use round elimination to extend the result to general protocols. However, since the proof of this direction of the theorem does not differ much between one-way and general communication protocols, we decided to give a direct proof for the general case.

The rest of the paper is organized as follows. In Section 2, we give the necessary definitions and state known results. In Section 3, we prove our results on the connection between communication complexity and error-correcting codes. In Section 4, we use the connection established in Section 3 to deduce lower bounds on the communication complexity of equality for various amounts of allowed error and randomness (the analysis of suboptimal bounds on the codes is omitted and can be found in the full version of the paper [9]). In Section 5, we analyze the landscape of communication complexity of equality by comparing our lower bounds against upper bounds for various regimes of parameters.

2 Preliminaries

We make frequent use of the following fact:

Fact 6.

For two non-negative functions a(n), b(n) such that a(n) = b(n) + log b(n), we have

b(n) = a(n) − log a(n) + O(1).
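A quick numerical sanity check of Fact 6 (a sketch; we test the O(1) term as an absolute gap of at most 1 on a few sample values):

```python
import math

# If a = b + log b, then b should equal a - log a up to an additive constant:
# from b = a - log b and b <= a <= 2b (for b >= 2), log b differs from log a
# by at most 1, so |b - (a - log a)| <= 1.
for b in [2.0, 10.0, 100.0, 1024.0, 1e6]:
    a = b + math.log2(b)
    assert abs(b - (a - math.log2(a))) <= 1.0
```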

2.1 Notation and Conventions

All logarithms are base 2 unless otherwise specified. We use the notation f(i) to denote the function f composed with itself i times.

For brevity, unless specified otherwise, when we refer to the “complexity” of a function, we are referring to the randomized communication complexity in the public-coin model. Similarly, when we refer to a “protocol”, we mean a randomized communication protocol in the public-coin model. By the “naive protocol” for EQ, we mean the deterministic protocol in which Alice sends her entire input, and by the “standard protocol” for EQ, we mean the standard randomized protocol used in the public-coin model (see Chapter 3 of [11]). Finally, when we refer to a “code”, we are referring to an error-correcting code.

We use m to denote the amount (in bits) of public randomness in a communication protocol, and we use ε to denote the allowed error. When protocols have two-sided error, we use ε1 to denote false positive error and ε0 to denote false negative error of a protocol. That is, ε1 is an upper bound on the probability that the protocol outputs 1 on an input where the function outputs 0. The false negative error, ε0, is an upper bound on the probability that the protocol outputs 0 on an input where the function outputs 1. For convenience, we often use k to denote public-coin randomized communication complexity. That is, we let k denote CCpub. We use the notation [N,K,D] code to refer to a code with block length N, message length K, and distance D (see Section 2.3 for definitions of these parameters).

2.2 Communication Complexity

In the standard model of communication complexity, two players, Alice and Bob, wish to compute the value of a function f:{0,1}n×{0,1}n→{0,1} on some given input (x,y). The challenge is that Alice only knows x∈{0,1}n and Bob only knows y∈{0,1}n, and thus the players need to communicate (i.e., send bits) to compute f(x,y).

Definition 7 (protocol, value, cost, one-way).

A communication protocol Π is a binary tree where each internal node v is labeled either by a function av:{0,1}n→{0,1} or by a function bv:{0,1}n→{0,1} (denoting Alice’s or Bob’s message, respectively) and each leaf node is labeled with an element in {0,1}.

The value of the protocol Π on input (x,y) is the label reached by traversing the tree according to the following rules: start at the root, and for each internal node v labeled by av (resp bv), move left if av(x)=0 (resp bv(y)=0) and right otherwise. The cost of Π on input (x,y) is the length of the path taken on input (x,y). The cost of Π is the height of the tree. We call a protocol one-way if the entire protocol consists of just one message.

Definition 8 (deterministic communication complexity).

The deterministic communication complexity CC(f) of f:{0,1}n×{0,1}n→{0,1} is the minimum cost of Π, over all protocols Π that compute f.

Fact 9 (deterministic communication complexity of EQn).

For x,y∈{0,1}n, the equality function, EQn(x,y), is defined to be 1 if x=y and 0 otherwise. We have that CC(EQn)=n+1.

The lower bound on CC(EQ) can be proved using the fooling set method (see [11]). We define a fooling set below:

Definition 10 (fooling set [11]).

A fooling set for f:{0,1}n×{0,1}n→{0,1} is a set F⊆{0,1}n×{0,1}n together with a value b∈{0,1} such that

  1. For every (x,y)∈F, f(x,y)=b.

  2. For all distinct pairs (x1,y1),(x2,y2)∈F, either f(x1,y2)≠b or f(x2,y1)≠b.

While Definition 7 defines protocols for deterministic communication complexity, this paper focuses on randomized communication complexity, wherein Alice and Bob have a source of public randomness. Formally, there is a random string of bits r∈{0,1}m available to both players. Where in the deterministic setting we had internal nodes in the protocol tree labeled with av(x) and bv(y), we now have internal nodes labeled with av(x,r) and bv(y,r); that is, Alice and Bob’s communication can depend on r. We can also view this model as a distribution {Πr}r on deterministic protocols:

Definition 11 (randomized public-coin protocol [11]).

A (randomized) public-coin protocol is a probability distribution over deterministic protocols. The success probability of a public-coin protocol on input (x,y) is the probability that a deterministic protocol Πr, drawn according to this distribution, computes f(x,y) correctly.

Definition 12 (randomized communication complexity [11]).

The randomized communication complexity CCεpub(f) is the minimum cost of a public-coin protocol that computes f with an error of at most ε (for every input (x,y)).

Fact 13 (randomized communication complexity of EQ).

The randomized communication complexity (in the public randomness model) of EQ with constant error ε is O(1).

A proof of Fact 13 can be found in [11].

Randomized protocols can be converted to deterministic protocols, with some overhead on the communication complexity, by repeating the randomized procedure and taking a majority vote. The following upper bound is proved in [4]:

Fact 14 ([4]).

Let f:{0,1}n×{0,1}n→{0,1} be a function computable via a randomized communication protocol Π using k bits of communication, m bits of randomness, and with error probability ε. Then the deterministic communication complexity CC(f) is bounded by

CC(f) ≤ k(2ε·2m + 1). (1)
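The mechanism behind bound (1) can be sketched as follows (a toy sketch under the assumption that the randomized protocol errs on at most ε·2m of the 2m seeds for every input; `protocol` and the hash-style instantiation in the usage note are hypothetical stand-ins):

```python
def majority_derandomize(protocol, m: int, eps: float):
    """Build a deterministic evaluator from a randomized one.

    protocol(x, y, seed) -> 0/1 is assumed to err on at most eps * 2**m of
    the 2**m seeds for every input pair.  Any fixed list of
    t = 2 * floor(eps * 2**m) + 1 distinct seeds then contains strictly more
    correct seeds than erring ones, so a majority vote is always right.
    With k bits per run and t runs, the total cost matches k * (2*eps*2^m + 1).
    """
    t = 2 * int(eps * (1 << m)) + 1
    assert t <= (1 << m)  # need enough distinct seeds

    def deterministic(x, y):
        votes = sum(protocol(x, y, seed) for seed in range(t))
        return 1 if 2 * votes > t else 0

    return deterministic
```

For example, the toy hash protocol `lambda x, y, s: int((x * s) % 257 == (y * s) % 257)` on inputs in {0,…,256} errs only at seed s = 0, i.e., on 1 of the 2^8 seeds.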

The following fact improves upon Newman’s Theorem [16], and states that a public-coin protocol can be converted into a private-coin protocol with only logarithmic additive overhead in the complexity. Similarly, the amount of randomness used by a public-coin protocol can be reduced by incurring a small amount of extra error.

Fact 15 ([13]).

Let f:{0,1}n×{0,1}n→{0,1} be a function. Then CCε(1+δ)priv(f) ≤ CCεpub(f) + log(n/ε) + log(6/δ2). Furthermore, any public-coin protocol for f which uses m random bits and has error ε can be converted into a protocol using log n + log(1/ε) + log(6/δ2) random bits with error ε(1+δ), at no penalty to the communication complexity.

Interestingly, this improvement over Newman’s original theorem is crucial for obtaining tight bounds in one of the cases of our analysis in Section 5.3.

See [11] for more on private and public-coin settings in communication complexity.

2.3 Error-Correcting Codes and Bounds

Definition 16 (code, block length, codewords).

A code C of block length N over alphabet Σ is a subset C⊆ΣN. The elements c∈C are called codewords.

Definition 17 (distance, relative distance).

The distance D of a code C⊆ΣN is minc≠c′∈C Δ(c,c′), where Δ denotes the Hamming distance function. In other words, the distance of a code is the minimum Hamming distance between any pair of distinct codewords. The relative distance δ of C is the ratio of distance to block length, D/N.

Definition 18 (message length).

The message length (sometimes called dimension) K of a code C over alphabet Σ with |Σ|=q is K=logq|C|.

We call a code with block length N, message length K, and distance D an [N,K,D] code.

Definition 19 (rate).

The rate of a code with block length N and message length K is R ≔ K/N.

High distance and high rate are desirable properties for codes; however, there is a trade-off between these two parameters. The bounds stated below offer some insight into this trade-off.

Fact 20 (Singleton Bound).

Let C be an [N,K,D] code with alphabet size q. Then

K ≤ N − D + 1.

Equivalently, we can express the Singleton Bound in terms of rate and relative distance:

R ≤ 1 − δ + 1/N.

The Singleton Bound is tight when q ≥ N. That is, there exist codes for which K = N − D + 1 under this condition. We call such codes MDS (Maximum Distance Separable) codes. See Section 2.3 below.

Definition 21 (q-ary entropy).

The q-ary entropy function Hq(δ), a generalization of binary entropy, is defined as follows:

Hq(δ) ≔ δ·logq((q−1)/δ) + (1−δ)·logq(1/(1−δ)).
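For concreteness, the q-ary entropy of Definition 21 can be computed directly (a small sketch; the function name `H_q` is ours):

```python
import math

def H_q(q: int, delta: float) -> float:
    """q-ary entropy: delta*log_q((q-1)/delta) + (1-delta)*log_q(1/(1-delta))."""
    if delta == 0.0:
        return 0.0  # limiting value as delta -> 0
    logq = lambda z: math.log(z, q)
    return delta * logq((q - 1) / delta) + (1 - delta) * logq(1 / (1 - delta))
```

Sanity checks: H_2(1/2) = 1 (binary entropy at 1/2), and more generally H_q((q−1)/q) = 1.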

As the block length N of a code tends to infinity, we have the relationship Volq(δN,N) ≈ qN·Hq(δ), where Volq(δN,N) is the volume of the Hamming ball of radius δN in the space of strings of length N over an alphabet of size q. Hence Hq appears in the asymptotic statement of various bounds for error-correcting codes, such as the Elias-Bassalygo Bound below:

Fact 22 (Elias-Bassalygo Bound).

Let C be an [N,K,D] code with alphabet size q. Then

R ≔ K/N ≤ 1 − Hq[(1 − 1/q)(1 − √(1 − (q/(q−1))δ))].
Fact 23 (Griesmer Bound).

For a linear [N,K,D] code with alphabet size q, we have:

N ≥ ∑i=0K−1 ⌈D/qi⌉.

For more information on linear codes, see [21].

Fact 24 (Plotkin Bound).

The Plotkin bound strengthens the Singleton bound. For a code C with message length K, block length N, relative distance δ, and alphabet size q, we have:

K/N + δ·(q/(q−1)) ≤ 1 + 1/N.
Fact 25 (MRRW Bound for large alphabet [1]).

Let C be an [N,K,D] code with alphabet size q and 0 < δ < (q−1)/q. Then

R ≔ K/N ≤ Hq[(q − 1 − (q−2)δ − 2√((q−1)δ(1−δ)))/q].

Our upper bounds use protocols based on Reed-Solomon codes, which we define below. For more information about these codes, see [21].

Reed-Solomon codes are a family of codes which achieve the Singleton bound (i.e., for which D = N − K + 1).

Formally, Reed-Solomon codes are defined as follows:

Definition 26 (Reed-Solomon codes).

Let q ≥ N ≥ K. Let α1,…,αN ∈ 𝔽q be distinct. The Reed-Solomon code over 𝔽q with evaluation points α=(α1,…,αN) is the [N,K,N−K+1] code given by:

RSq(N,K−1) = {(f(α1),f(α2),…,f(αN)) : f ∈ 𝔽q[X], deg(f) ≤ K−1}.
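A Reed-Solomon encoder can be sketched in a few lines (a toy sketch, assuming q prime so that arithmetic mod q is field arithmetic; `rs_encode` is our name):

```python
def rs_encode(message, q, alphas):
    """Evaluate the polynomial whose coefficient vector is `message`
    (degree <= K-1) at the distinct points `alphas` in F_q (q assumed prime)."""
    alphas = list(alphas)
    assert len(set(alphas)) == len(alphas) <= q  # distinct evaluation points
    return [sum(c * pow(a, i, q) for i, c in enumerate(message)) % q
            for a in alphas]
```

Two distinct messages of length K yield codewords that agree on at most K−1 points (their difference is a nonzero polynomial of degree at most K−1), so the distance is at least N−K+1, matching the Singleton bound.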

2.4 Codes Give Communication Protocols for 𝐄𝐐

It is known (see [19, beginning of Chapter 3]) that codes give communication protocols for EQ. In particular, we have the following fact:

Fact 27.

For every [N,K,D] code with alphabet Σ of size q, there exists a communication protocol, Π, computing EQn, such that n ≤ K·log q, using m public random bits and log q bits of communication, with error (N−D)/2m. Moreover, this protocol is one-way and has one-sided false-positive error.
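The protocol of Fact 27 can be sketched as follows (a minimal simulation; `encode` is any code's encoder, and `random.randrange` stands in for the shared random bits selecting a coordinate):

```python
import random

def eq_from_code(x, y, encode) -> bool:
    """One-way EQ protocol from a code: both players privately encode their
    inputs and compare a single shared-random coordinate.

    Alice sends only c_x[i] (log q bits).  If x == y the players always
    agree; if x != y they agree on at most N - D coordinates, giving
    one-sided false-positive error at most (N - D) / N when i ranges over
    all N positions.
    """
    c_x, c_y = encode(x), encode(y)
    i = random.randrange(len(c_x))  # coordinate chosen via public randomness
    return c_x[i] == c_y[i]
```

For instance, with the toy distance-N code `encode = lambda v: [(v * s) % 257 for s in range(1, 257)]` on messages in {0,…,256}, distinct codewords never agree, so the protocol is exact.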

3 Communication Complexity vs. Error-Correcting Codes

In this section, we show that randomized communication protocols for EQ give rise to error-correcting codes. In fact, we prove a generalization of this statement which applies to functions that are close to EQ in some sense, which we call “EQ-like” functions.

Definition 28 (EQ-like).

We say that a function f:{0,1}n×{0,1}n→{0,1} is EQ-like if for all x∈{0,1}n, f(x,x)=1, and for all pairs (x,y) such that x≠y, either f(x,y)=0 or f(y,x)=0.

Note that EQ is a particular example of an EQ-like function, where for all pairs (x,y) such that x≠y, f(x,y)=0 and f(y,x)=0. Another example of an EQ-like function is the well-known Greater-Than function, defined as follows: GT(x,y)=𝟙[int(x) ≥ int(y)], where int(x) is the integer with binary representation x.

3.1 Protocols for 𝐄𝐐 Imply Codes

Consider a protocol for f. Alice holds x, Bob holds y, and there is public randomness r. Let T(x,y,r) be the transcript of Alice and Bob’s communication. That is,

T(x,y,r)=(a1,b1,a2,b2,,aR,bR),

where each ai is a function of all previous messages, x, and their shared randomness r, and each bi is a function of all previous messages, y, and r. Without loss of generality, at the end of the protocol, Alice outputs c(T(x,y,r),x,r). Let ε1 and ε0 denote the false positive and false negative error of the protocol, respectively. That is,

∀(x,y) such that f(x,y)=0: Prr[c(T(x,y,r),x,r)=1] ≤ ε1,
∀(x,y) such that f(x,y)=1: Prr[c(T(x,y,r),x,r)=1] ≥ 1−ε0.

We show that the transcript T of Alice and Bob’s messages forms a code. An encoding of x∈{0,1}n is simply the concatenation of T(x,x,r) for all choices of r.

Lemma 29.

Let f be EQ-like, and let T(x,y,r) be the transcript of Alice and Bob’s communication in a randomized protocol for f as given above. Then for x≠y∈{0,1}n:

Prr[T(x,x,r)=T(y,y,r)] ≤ ε0+ε1.
Proof.

Let x≠y∈{0,1}n. Without loss of generality, suppose f(x,y)=0. (If instead f(y,x)=0, we replace T(x,y,r) with T(y,x,r) in the first two lines of the displayed derivation and the proof holds.) Then we have
Prr[c(T(x,y,r),x,r)=1] + Prr[c(T(y,y,r),y,r)=0]
≥ (Prr[c(T(x,y,r),x,r)=1 ∣ T(x,x,r)=T(y,y,r)] + Prr[c(T(y,y,r),y,r)=0 ∣ T(x,x,r)=T(y,y,r)]) · Prr[T(x,x,r)=T(y,y,r)]
= (Prr[c(T(y,y,r),y,r)=1 ∣ T(x,x,r)=T(y,y,r)] + Prr[c(T(y,y,r),y,r)=0 ∣ T(x,x,r)=T(y,y,r)]) · Prr[T(x,x,r)=T(y,y,r)]
= 1 · Prr[T(x,x,r)=T(y,y,r)],

where the substitution T(x,y,r)=T(y,y,r) follows from the fact that T(x,x,r)=T(y,y,r) and that the communication tree partitions the inputs into combinatorial rectangles. Hence,

Prr[T(x,x,r)=T(y,y,r)] ≤ Prr[c(T(x,y,r),x,r)=1] + Prr[c(T(y,y,r),y,r)=0] ≤ ε0+ε1.

Using Lemma 29, we obtain the following theorem:

Theorem 30.

Let f:{0,1}n×{0,1}n→{0,1} be EQ-like, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of public randomness, and has two-sided error probabilities ε0 and ε1. Then there exists an error-correcting code with message length n, codeword length 2m, relative distance 1−(ε0+ε1), and alphabet size 2k.

Proof.

The proof follows directly from Lemma 29. Concretely, the codeword for x∈{0,1}n is given by the concatenation of T(x,x,r) for all 2m possible values of randomness r∈{0,1}m. Each symbol in the codeword corresponds to T(x,x,r) for one choice of r, and since each transcript uses k bits of communication, there are 2k possible symbols in the code alphabet. Finally, by Lemma 29, the probability that two different codewords agree on any given coordinate is bounded above by ε0+ε1. Thus the relative distance of the code is at least 1−(ε0+ε1).
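The construction in this proof can also be sketched programmatically (a toy sketch; `transcript(x, y, r)` is a hypothetical stand-in for the protocol's transcript function T):

```python
from itertools import product

def protocol_to_code(transcript, n: int, m: int) -> dict:
    """Theorem 30's construction: the codeword of x is the tuple of diagonal
    transcripts T(x, x, r) over all 2**m random strings r.

    Block length 2**m; each symbol is a full transcript, so the alphabet has
    size at most 2**k for a k-bit protocol.
    """
    strings = lambda length: ("".join(b) for b in product("01", repeat=length))
    return {x: tuple(transcript(x, x, r) for r in strings(m))
            for x in strings(n)}
```

As a check, instantiate it with the one-round inner-product transcript (Alice's bit, then Bob's bit) for n = m = 3. The protocol has ε1 = 1/2 and ε0 = 0, and indeed every pair of the 8 codewords ends up at Hamming distance exactly (1 − 1/2)·2^3 = 4.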

If we consider a protocol with one-sided error, that is, a protocol with either ε0=0 or ε1=0, we immediately obtain the following corollary:

Corollary 31 (One-sided error).

Let f:{0,1}n×{0,1}n→{0,1} be EQ-like, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of public randomness, and has one-sided error probability ε. Then there exists a code with message length n, codeword length 2m, relative distance 1−ε, and alphabet size 2k.

We remark that since codes give protocols (Fact 27), this corollary shows that one-way protocols for EQ with one-sided false-positive error are in one-to-one correspondence with error-correcting codes. That is, we can translate from protocols to codes and vice versa with no loss in parameters. Another consequence of Corollary 31 is the following corollary:

Corollary 32.

Any randomized protocol for EQ can be converted to a one-way protocol with one-sided false-positive error with no loss in parameters.

The following table summarizes the relationship between the communication complexity of a protocol for EQ and the obtained code:

Table 1: Translation Between CC and Codes.
Communication Complexity Code
size of input = n # of messages = 2n
# of random bits = m length of code = 2m
error = ε relative distance = 1−ε
complexity = k alphabet size = 2k

3.2 Codes from Functions Based on Fooling Sets

We have seen that protocols for EQ-like functions give codes. We can actually make a more general statement which applies to all functions. In particular, for any function f, a protocol for f gives a code whose parameters depend on the parameters of the protocol and the size of the largest fooling set for f.

Theorem 33.

Let f:{0,1}n×{0,1}n→{0,1} be any function with fooling set F, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of randomness, and has error probabilities ε0, ε1. Then there exists a code with block length 2m, message length log|F|, relative distance 1−(ε0+ε1), and alphabet size 2k.

Proof.

To prove the theorem, we show how to construct an error-correcting code from the function f with fooling set F. Let X and Y denote the sets of x-inputs and y-inputs, respectively, which appear in elements of the fooling set F, and define f′:{0,1}log|F|×{0,1}log|F|→{0,1} to be f restricted to the inputs X×Y. Without loss of generality, assume that for (x,y)∈F, f(x,y)=1. By the definition of a fooling set (see Section 2.2), f′ is EQ-like, up to a permutation on one half of the input. Therefore, a protocol for computing f gives a protocol for f′, which by Lemma 29 gives an error-correcting code for messages of length log|F|.

As a consequence of Theorem 33, we obtain the following corollary, which says that in some sense, EQ represents the largest gap between fooling set size and randomized communication complexity in the public-coin model.

Corollary 34.

Let f:{0,1}n×{0,1}n→{0,1} be any function with fooling set F, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of randomness, and has error probabilities ε0, ε1. Then there exists a protocol for EQ on log|F| bits which has communication complexity k, uses m bits of randomness, and has one-sided false-positive error probability ε0+ε1.

This follows because, given a protocol for f, we can construct a code with message length log|F|, from which we obtain a protocol for EQ on log|F| bits.

Informally, Corollary 34 says that among all functions with a fooling set of size s, EQ on log s bits has the smallest public-coin randomized complexity for all values of ε and m. For protocols with two-sided error, this interpretation is true up to constant factors. For protocols with one-sided error, it is precise.

4 Communication Lower Bounds Implied by Code Bounds

The equivalence between communication protocols for EQ and codes allows us to view bounds on codes as bounds on communication complexity. In particular, since we showed that the existence of protocols implies the existence of codes, the nonexistence of codes implies the nonexistence of protocols. Using this connection, we can translate known bounds for codes (Section 2.3) into bounds on communication complexity using the correspondence given by Table 1. (Note that these lower bounds also hold for functions with large fooling sets.) Below we describe the bound obtained from the Plotkin bound, which turns out to be the best known bound for large alphabet size.

The Plotkin bound (Fact 24) can be expressed in terms of communication protocols as:

n/(k2m) + (1−ε)·(2k/(2k−1)) ≤ 1 + 1/2m. (2)

In terms of the error rate ε, this implies

ε ≥ ((n−k)/(k2m))·((2k−1)/2k) + 1/2k. (3)

From this we get the system

ε ≥ ((n−k)/(k2m))·((2k−1)/2k),
ε ≥ 1/2k.

For any k we have (2k−1)/2k ≥ 1/2; thus, solving for k, we get the following (up to constant factors):

k ≥ n/(ε2m + 1),
k ≥ log(1/ε).

Depending on m and ε, either 1 or ε2m is the dominant term in the denominator of the first inequality. Based on these two cases, this bound can be equivalently stated (up to constant factors) as k ≥ min(n/(ε2m), n).

Combining the two bounds, we get the following theorem.

Theorem 35.

For any randomized public-coin protocol for EQ on n bits with complexity k, m bits of randomness, and error probability ε,

k ≥ max(min(n, n/(ε2m)), log(1/ε)) (4)

up to constant factors.
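The three regimes of Theorem 35 can be explored numerically (a small helper of our own, ignoring the hidden constant factors):

```python
import math

def eq_lower_bound(n: int, m: int, eps: float) -> float:
    """Evaluate max(min(n, n / (eps * 2^m)), log2(1/eps)) from Theorem 35."""
    return max(min(n, n / (eps * 2**m)), math.log2(1 / eps))
```

For example, with n = 1000: when ε ≤ 1/2m the bound degenerates to n (only the deterministic protocol remains); for moderate m and ε the term n/(ε2m) dominates; and for large m only log(1/ε) survives.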

It was noted (without details) in [1] that the Plotkin bound is the best known bound for error-correcting codes over a large alphabet. For the sake of completeness, we provide a detailed analysis of the communication complexity bounds we can obtain from the Singleton, Elias-Bassalygo, Griesmer, and MRRW bounds; this analysis can be found in the full version of our paper [9]. We also note that the combination of the Singleton and Elias-Bassalygo bounds matches the Plotkin bound up to a constant factor (see [9]).

Note that Theorem 35 gives a different lower bound (one of $n$, $\frac{n}{\varepsilon 2^m}$, or $\log\frac{1}{\varepsilon}$) for different regimes of the parameters $n$, $m$, and $\varepsilon$. In Section 5, we examine the relationship between $n$, $m$, and $\varepsilon$ necessary for achieving each of the three possible lower bounds. We also analyze when these bounds are tight by giving matching upper bounds for most regimes of the parameters. This analysis is summarized in the table below.

Table 2: Bounds on complexity of EQ for various regimes of parameters.

| Description of the Regime | Lower Bound | Upper Bound |
|---|---|---|
| $\frac{1}{2^m} \ge \varepsilon \ge \frac{1}{2^n}$ | $n$ | $n$ |
| $n \ge \frac{n}{\varepsilon 2^m} \ge \log\frac{1}{\varepsilon}$ | $\frac{n}{\varepsilon 2^m}$ | $\frac{n}{\varepsilon 2^m}$, if $m < \frac{n}{\varepsilon 2^m}$; $\quad\frac{n}{2^m}$, if $\log n - \log^{(c)} n > m > \frac{n}{\varepsilon 2^m}$ and constant $\varepsilon$ and $c$; $\quad\frac{n}{\varepsilon^c 2^m}$, if $\varepsilon = \frac{1}{\log^{(c)} n}$ and $c$ is constant |
| $\log n + \log\frac{1}{\varepsilon} - \log\log\frac{1}{\varepsilon} < m$ | $\log\frac{1}{\varepsilon}$ | $\log\frac{1}{\varepsilon}$, if $m > \log n + \log\frac{1}{\varepsilon} + \Theta(1)$ |

5 Analyzing the Landscape of Communication Complexity of 𝐄𝐐

In this section, we analyze the various regimes of the parameters that give rise to different lower bounds on communication complexity, and we examine when these bounds are tight by giving matching upper bounds. We consider three cases based on which of the three terms the maximum in Theorem 35 evaluates to.

5.1 Case 1

In this case, we have

$$\begin{cases} n \le \dfrac{n}{\varepsilon 2^m},\\[1.5ex] n \ge \log\dfrac{1}{\varepsilon},\end{cases}$$

or, after rearranging, $\frac{1}{2^n} \le \varepsilon \le \frac{1}{2^m}$. In this case, $\max\!\left(\min\!\left(n, \frac{n}{\varepsilon 2^m}\right), \log\frac{1}{\varepsilon}\right) = n$, giving the lower bound $k \ge n$. Clearly, the naive protocol for EQ gives us the matching upper bound.

 Remark 36.

This lower bound is actually easy to see: for $\varepsilon < \frac{1}{2^m}$, only a deterministic protocol is possible, since with $m$ random bits the error probability of a protocol is always a multiple of $\frac{1}{2^m}$ (the granularity of the error probability).

5.2 Case 2

In this case, we have

$$\begin{cases} n \ge \dfrac{n}{\varepsilon 2^m},\\[1.5ex] \dfrac{n}{\varepsilon 2^m} \ge \log\dfrac{1}{\varepsilon}.\end{cases}$$

The lower bound in this case is $k \ge \frac{n}{\varepsilon 2^m}$.

 Remark 37.

Note that this case shows that we cannot study the value of $m$ up to a constant factor: the bound depends on $2^m$, so multiplicative changes in $m$ may affect the bound dramatically. This is why we never allow extra multiplicative constant factors in $m$.

5.2.1 Upper Bound with Reed-Solomon Protocol

For the upper bound in this case, we can use the communication protocol derived from Reed-Solomon codes (Section 2.3). In this protocol, Alice and Bob encode their inputs using a Reed-Solomon code and check the value of a single random index in a subset S of their respective codewords. For some values of parameters, the communication complexity of this protocol matches the lower bound given by the Singleton bound.

Lemma 38.

For any $m \le k$, there exists a public-coin randomized communication protocol that computes EQ using $k$ bits of communication, $m$ bits of randomness, and error rate $\varepsilon = O\!\left(\frac{n}{k 2^m}\right)$.

Proof.

Let $q$ be the largest prime number such that $q < 2^k$. To compute EQ, we use an $\mathrm{RS}_q(q, n/k)$ code to get an error rate of

$$\varepsilon = \frac{\lceil n/k \rceil}{2^m} = \frac{n}{k 2^m} + \Theta\!\left(\frac{1}{2^m}\right), \tag{5}$$

but we know that $k \le n+1$ (the naive protocol for EQ gives this upper bound), giving $\varepsilon = O\!\left(\frac{n}{k 2^m}\right)$. Note that this Reed-Solomon protocol is only defined for $m \le k$. For this protocol, $k = \Theta\!\left(\frac{n}{\varepsilon 2^m}\right)$. Thus, we have a protocol only in the regime

$$m \le \Theta\!\left(\frac{n}{\varepsilon 2^m}\right).$$

Hence, this protocol does not give an upper bound when

$$\log\frac{1}{\varepsilon} \;\le\; \frac{n}{\varepsilon 2^m} \;\le\; m. \tag{6}$$

Example parameters in this regime are $m = \log n$ and $\varepsilon = 1/\log\log n$. The lower bound in this case is $\frac{n}{\varepsilon 2^m} = \log\log n$. We partially address this issue in the next section.
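As an illustration, the protocol of Lemma 38 can be simulated directly. The sketch below is our own toy version, not the exact construction of Section 2.3: the $n$-bit inputs are read as base-$q$ digit strings, i.e. as polynomials over $\mathbb{F}_q$ of degree less than $\lceil n/\log_2 q\rceil$, and the players compare a single evaluation at a shared random point drawn from a set of $2^m$ points. Distinct polynomials agree on fewer than $\lceil n/\log_2 q\rceil$ points, which gives the stated error rate $O\!\left(\frac{n}{k 2^m}\right)$.

```python
import math
import random

def to_base_q(z: int, q: int, length: int) -> list:
    """The base-q digits of z: the coefficients of its polynomial over F_q."""
    digits = []
    for _ in range(length):
        digits.append(z % q)
        z //= q
    return digits

def eval_poly(coeffs, r: int, q: int) -> int:
    """Evaluate the polynomial with the given coefficients at r, over F_q."""
    acc = 0
    for c in reversed(coeffs):  # Horner's rule
        acc = (acc * r + c) % q
    return acc

def rs_eq(x: int, y: int, n: int, q: int, m: int, rng: random.Random) -> bool:
    """One round of the Reed-Solomon EQ protocol (toy sketch of Lemma 38).

    q is a prime with q < 2**k and 2**m <= q (assumed, not checked here).
    Alice sends one field symbol, about k bits; Bob accepts iff it matches
    his own evaluation at the shared random point.
    """
    length = math.ceil(n / math.log2(q))   # polynomial degree < length ~ n/k
    r = rng.randrange(2 ** m)              # shared random evaluation point
    return eval_poly(to_base_q(x, q, length), r, q) == \
           eval_poly(to_base_q(y, q, length), r, q)
```

For instance, with $q = 251$ (the largest prime below $2^8$), $n = 32$, and $m = 7$, distinct inputs are accepted for fewer than $\lceil 32/\log_2 251\rceil = 5$ of the $128$ evaluation points.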

5.2.2 Upper bound with iterated Reed-Solomon

The Reed-Solomon code allows us to perform the following transformation of parameters: $n \to \left(m,\, \varepsilon,\, \frac{n}{\varepsilon 2^m}\right)$. This notation means that when we need to send a message of length $n$, we can use $m$ random bits, introduce an error $\varepsilon$, and reduce the problem to sending a message of size $\frac{n}{\varepsilon 2^m}$. This protocol gives a tight upper bound, but it has the restriction $m \le \frac{n}{\varepsilon 2^m}$. For the range of parameters

$$\log\frac{1}{\varepsilon} \;<\; \frac{n}{\varepsilon 2^m} \;<\; m,$$

we do not have a tight upper bound.

Restating in terms of m and using Fact 6, Reed-Solomon codes give tight upper bounds for the regime of parameters satisfying

$$m \;\le\; \log n + \log\frac{1}{\varepsilon} - \log\!\left(\log n + \log\frac{1}{\varepsilon}\right). \tag{7}$$

Thus, the following range of the lower bound is not matched with a tight upper bound:

$$\log n + \log\frac{1}{\varepsilon} - \log\!\left(\log n + \log\frac{1}{\varepsilon}\right) \;<\; m \;<\; \log n + \log\frac{1}{\varepsilon} - \log\log\frac{1}{\varepsilon}. \tag{8}$$

Our plan is to apply the Reed-Solomon code iteratively. That is, in the first step, we apply the code with some parameters:

$$n \to \left(m_1,\, \varepsilon_1,\, \frac{n}{\varepsilon_1 2^{m_1}}\right),$$

but then instead of sending $\frac{n}{\varepsilon_1 2^{m_1}}$ bits, we apply the Reed-Solomon code again (with some new values of the parameters).

We note that the idea of applying codes iteratively is well known in coding theory under the name of concatenated codes [21]. In particular, it was previously observed, in communication complexity language, that applying Reed-Solomon codes iteratively helps to improve the multiplicative constant and lower-order terms in the communication complexity of the EQ function in the regime of constant error probability and unlimited randomness [12]. However, we need a more detailed analysis.
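To make the iteration concrete, here is a small sketch (our illustration, with hypothetical parameter choices): each round reduces EQ on $n_i$-bit inputs to EQ on the two evaluations, which are roughly $\log_2 q_i$-bit strings; only the output of the final round is actually communicated, and the errors of the rounds add up.

```python
import math
import random

def reduce_eq(x: int, y: int, n: int, q: int, m: int, rng: random.Random):
    """One Reed-Solomon step: reduces EQ on n-bit inputs to EQ on the two
    evaluations (one symbol of F_q each).  If x == y the symbols agree; if
    x != y they differ except with probability < ceil(n/log2(q)) / 2**m."""
    length = math.ceil(n / math.log2(q))
    r = rng.randrange(2 ** m)  # shared random point; assumes 2**m <= q
    def evaluate(z: int) -> int:
        digits = []
        for _ in range(length):      # base-q digits of z = coefficients
            digits.append(z % q)
            z //= q
        acc = 0
        for c in reversed(digits):   # Horner's rule over F_q
            acc = (acc * r + c) % q
        return acc
    return evaluate(x), evaluate(y)

def iterated_rs_eq(x: int, y: int, n: int, steps, rng: random.Random) -> bool:
    """Apply reduce_eq once per (q, m) pair in `steps`, feeding each round's
    symbols in as the next round's inputs; compare only the final pair."""
    for q, m in steps:
        x, y = reduce_eq(x, y, n, q, m, rng)
        n = math.ceil(math.log2(q))  # size of the new EQ instance
    return x == y
```

With, say, `steps = [(65521, 10), (251, 7)]` on $64$-bit inputs, the first round reduces EQ on $64$ bits to EQ on $16$-bit symbols with error below $5/2^{10}$, and the second round finishes with error below $3/2^{7}$; equal inputs are always accepted.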

For a nonnegative real $x$, we introduce the function $f(x) = \log\!\left(\log\frac{1}{\varepsilon} + x\right)$. We now describe the values of parameters that we can cover with $c$ iterations of the Reed-Solomon protocol. The proof of the following lemma can be found in Section A.1.

Lemma 39.

For any integer constant $c$, the communication protocol derived from $c$ iterations of Reed-Solomon has complexity $\frac{n}{\varepsilon^c 2^m}$ and is valid for all

$$m \;\le\; \log n + c\log\frac{1}{\varepsilon} - f^{(c)}(\log n) \tag{9}$$

(up to an additive constant), where $f^{(c)}$ is the function $f$ applied to itself $c$ times.

For example, for constant $\varepsilon$ the lower bound we get is still tight up to a constant factor. It is not hard to see that in this case the function $f^{(c)}(\log n)$ is equal to $\log^{(c+1)} n$ up to an additive constant, and thus the range of parameters for which we do not have an upper bound reduces to

$$\log n - \log^{(c+1)} n \;<\; m \;<\; \log n.$$

For non-constant $\varepsilon$, our upper bound is not tight. To analyze the interval in which it applies, observe that $f(x) = \log\!\left(\log\frac{1}{\varepsilon} + x\right)$ is equal to $\log\log\frac{1}{\varepsilon}$ (up to an additive constant $1$) for $\log\frac{1}{\varepsilon} \ge x$, and is equal to $\log x$ (up to an additive constant $1$) for $\log\frac{1}{\varepsilon} \le x$.

For example, for $\varepsilon = \frac{1}{\log^{(t)} n}$ with constant $t$, we get $\log\frac{1}{\varepsilon} = \log^{(t+1)} n$. After $t+1$ iterations we have $f^{(t+1)}(\log n) = \log\log\frac{1}{\varepsilon}$ (up to an additive constant $1$). Substituting this into (9), we get that this upper bound covers the whole regime (8).

5.3 Case 3

This case applies when $\log\frac{1}{\varepsilon} \ge n$ or $\log\frac{1}{\varepsilon} \ge \frac{n}{\varepsilon 2^m}$. The lower bound in this case is $k \ge \log\frac{1}{\varepsilon}$.

Note that the first of the subcases here corresponds to $\varepsilon \le \frac{1}{2^n}$, in which case the communication complexity bound is $n$. We know that this bound is tight.

The remaining case to consider is $\log\frac{1}{\varepsilon} \ge \frac{n}{\varepsilon 2^m}$, which after rearranging gives $2^m \ge n \cdot \frac{1}{\varepsilon} \cdot \frac{1}{\log\frac{1}{\varepsilon}}$. This is equivalent to

$$m \;\ge\; \log n + \log\frac{1}{\varepsilon} - \log\log\frac{1}{\varepsilon}.$$

As for the upper bound, we use the standard randomized protocol and apply Fact 15. The standard protocol gives us the parameters

$$\left(n\log\frac{1}{\varepsilon},\; \varepsilon,\; \log\frac{1}{\varepsilon}\right).$$

Applying Fact 15 with error parameter δ gives

$$\left(\log n + \log\frac{1}{\varepsilon} + \log\frac{6}{\delta^2},\; \varepsilon(1+\delta),\; \log\frac{1}{\varepsilon}\right).$$

Setting $\delta = \Theta(1)$ yields

$$\left(\log n + \log\frac{1}{\varepsilon} + \Theta(1),\; \Theta(\varepsilon),\; \log\frac{1}{\varepsilon}\right),$$

and this bound achieves the optimal $m$ up to an additive factor in lower-order terms, and the optimal $\varepsilon$ up to a constant multiplicative factor.
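The "standard randomized protocol" here is the usual random-inner-product test; the following sketch (our illustration) shows where the parameters $\left(n\log\frac{1}{\varepsilon}, \varepsilon, \log\frac{1}{\varepsilon}\right)$ come from: $t = \lceil\log_2\frac{1}{\varepsilon}\rceil$ independent tests, each consuming $n$ public random bits and one bit of communication, and each catching $x \ne y$ with probability exactly $\frac{1}{2}$.

```python
import math
import random

def standard_eq(x: int, y: int, n: int, eps: float, rng: random.Random) -> bool:
    """Standard public-coin EQ protocol: t = ceil(log2(1/eps)) independent
    random parity tests.  Uses m = t*n public random bits and t bits of
    communication.  For x != y, each test draws a uniform n-bit string r,
    and the parities <x,r> and <y,r> over GF(2) differ with probability
    exactly 1/2 (since x XOR y != 0), so the error is at most 2**(-t) <= eps."""
    t = math.ceil(math.log2(1 / eps))
    for _ in range(t):
        r = rng.getrandbits(n)  # n fresh public random bits per test
        # inner products over GF(2): popcount of the AND, mod 2
        if bin(x & r).count("1") % 2 != bin(y & r).count("1") % 2:
            return False  # a differing parity bit certifies x != y
    return True
```

Sharing the $t \cdot n$ random bits explicitly is what Remark 40 below refers to, after Fact 15 first compresses the randomness to $\log n + \log\frac{1}{\varepsilon} + \Theta(1)$ bits.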

 Remark 40.

We observe that this public-coin protocol implies a private-coin protocol, in which the randomness is simply shared at the start of the protocol at the cost of $\log n + \log\frac{1}{\varepsilon} + \Theta(1)$ bits of communication. The resulting private-coin protocol has $\Theta(\varepsilon)$ error and uses $\log\frac{n}{\varepsilon^2} + \Theta(1)$ bits of communication, essentially matching the lower bound of $\log\frac{n}{\varepsilon^2} - \log\log\frac{1}{\varepsilon}$ observed in [13]. This indicates that in this range of parameters, the optimal private-coin protocol is to simply share the private randomness and then simulate the standard public-coin protocol.

6 Discussion & Future Directions

We identify an equivalence between randomized public-coin protocols for equality and error-correcting codes, and generalize one direction of this equivalence to functions with large fooling sets. We use this connection to study the randomized communication complexity of EQ for various regimes of the error parameter $\varepsilon$ and the randomness parameter $m$. We derive lower bounds for protocols from lower bounds for codes, and analyze when these lower bounds are tight. An immediate question is whether we can tighten our bounds in the regime in which they are not tight.

One way to view our result on fooling sets is that the maximal separation between the fooling set size and randomized communication complexity with public randomness is achieved by the equality function (for all regimes of parameters). A natural future direction is to explore whether similar separations can be found between randomized communication complexity and other complexity measures, such as deterministic communication complexity, rectangle size, or rank. Partial progress in this direction was obtained by Canetti and Goldreich [4].

References

  • [1] Matti J. Aaltonen. Linear programming bounds for tree codes (corresp.). IEEE Trans. Inf. Theory, 25(1):85–90, 1979. doi:10.1109/TIT.1979.1056004.
  • [2] Marshall Ball, Oded Goldreich, and Tal Malkin. Communication complexity with defective randomness. In Valentine Kabanets, editor, 36th Computational Complexity Conference, CCC 2021, July 20-23, 2021, Toronto, Ontario, Canada (Virtual Conference), volume 200 of LIPIcs, pages 14:1–14:10. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.CCC.2021.14.
  • [3] Jop Briët, Harry Buhrman, Debbie W. Leung, Teresa Piovesan, and Florian Speelman. Round elimination in exact communication complexity. In Salman Beigi and Robert König, editors, 10th Conference on the Theory of Quantum Computation, Communication and Cryptography, TQC 2015, May 20-22, 2015, Brussels, Belgium, volume 44 of LIPIcs, pages 206–225. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPICS.TQC.2015.206.
  • [4] Ran Canetti and Oded Goldreich. Bounds on tradeoffs between randomness and communication complexity. Comput. Complex., 3:141–167, 1993. doi:10.1007/BF01200118.
  • [5] Arkadev Chattopadhyay, Shachar Lovett, and Marc Vinyals. Equality alone does not simulate randomness. In Amir Shpilka, editor, 34th Computational Complexity Conference, CCC 2019, July 18-20, 2019, New Brunswick, NJ, USA, volume 137 of LIPIcs, pages 14:1–14:11. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.CCC.2019.14.
  • [6] Susanna F. de Rezende, Or Meir, Jakob Nordström, Toniann Pitassi, Robert Robere, and Marc Vinyals. Lifting with simple gadgets and applications to circuit and proof complexity. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 24–30. IEEE, 2020. doi:10.1109/FOCS46700.2020.00011.
  • [7] Tomás Feder, Eyal Kushilevitz, Moni Naor, and Noam Nisan. Amortized communication complexity. SIAM J. Comput., 24(4):736–750, 1995. doi:10.1137/S0097539792235864.
  • [8] Rudolf Fleischer, Hermann Jung, and Kurt Mehlhorn. A communication-randomness tradeoff for two-processor systems. Inf. Comput., 116(2):155–161, 1995. doi:10.1006/INCO.1995.1011.
  • [9] Dale Jacobs, John Jeang, Vladimir Podolskii, Morgan Prior, and Ilya Volkovich. Communication complexity of equality and error correcting codes. Technical Report TR25-068, Electronic Colloquium on Computational Complexity (ECCC), 2025. URL: https://eccc.weizmann.ac.il/report/2025/068/.
  • [10] Stasys Jukna. Boolean Function Complexity - Advances and Frontiers, volume 27 of Algorithms and combinatorics. Springer, 2012. doi:10.1007/978-3-642-24508-4.
  • [11] Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, 1997.
  • [12] Olivier Lalonde. Entanglement-assisted communication complexity and nonlocal games. Master’s Thesis, Université de Montréal, 2023. URL: https://umontreal.scholaris.ca/items/d507c06f-33dc-49f5-ab91-31c2bd60c6bc.
  • [13] Olivier Lalonde, Nikhil S Mande, and Ronald de Wolf. Tight bounds for the randomized and quantum communication complexities of equality with small error. arXiv preprint arXiv:2107.11806, 2021.
  • [14] Bruno Loff and Sagnik Mukhopadhyay. Lifting theorems for equality. In Rolf Niedermeier and Christophe Paul, editors, 36th International Symposium on Theoretical Aspects of Computer Science, STACS 2019, March 13-16, 2019, Berlin, Germany, volume 126 of LIPIcs, pages 50:1–50:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.STACS.2019.50.
  • [15] Shay Moran, Makrand Sinha, and Amir Yehudayoff. Fooling pairs in randomized communication complexity. In Jukka Suomela, editor, Structural Information and Communication Complexity - 23rd International Colloquium, SIROCCO 2016, Helsinki, Finland, July 19-21, 2016, Revised Selected Papers, volume 9988 of Lecture Notes in Computer Science, pages 49–59, 2016. doi:10.1007/978-3-319-48314-6_4.
  • [16] Ilan Newman. Private vs. common random bits in communication complexity. Information Processing Letters, 39(2):67–71, 1991. doi:10.1016/0020-0190(91)90157-D.
  • [17] Noam Nisan. The communication complexity of threshold gates. Combinatorics, Paul Erdős Is Eighty, pages 301–315, 1993.
  • [18] Toniann Pitassi, Morgan Shirley, and Adi Shraibman. The strength of equality oracles in communication. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 89:1–89:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITCS.2023.89.
  • [19] Anup Rao and Amir Yehudayoff. Communication Complexity: and Applications. Cambridge University Press, 2020.
  • [20] Alexander A. Razborov. Communication Complexity, pages 97–117. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011. doi:10.1007/978-3-642-19533-4_8.
  • [21] Ron M. Roth. Introduction to coding theory. Cambridge University Press, 2006.
  • [22] Tim Roughgarden. Communication complexity (for algorithm designers). Found. Trends Theor. Comput. Sci., 11(3-4):217–404, 2016. doi:10.1561/0400000076.
  • [23] Andrew Chi-Chih Yao. Some complexity questions related to distributive computing (preliminary report). In Michael J. Fischer, Richard A. DeMillo, Nancy A. Lynch, Walter A. Burkhard, and Alfred V. Aho, editors, Proceedings of the 11th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1979, Atlanta, Georgia, USA, pages 209–213. ACM, 1979. doi:10.1145/800135.804414.

Appendix A Appendix

A.1 Proof of Lemma 39

We start by illustrating the case of two iterations. In the first step, we apply the Reed-Solomon code with some parameters:

$$n \to \left(m_1,\, \varepsilon_1,\, \frac{n}{\varepsilon_1 2^{m_1}}\right),$$

and then instead of sending $\frac{n}{\varepsilon_1 2^{m_1}}$ bits, we apply the Reed-Solomon code (with new values of the parameters) to this message again:

$$n_2 = \frac{n}{\varepsilon_1 2^{m_1}} \to \left(m_2,\, \varepsilon_2,\, \frac{n_2}{\varepsilon_2 2^{m_2}}\right).$$

Notice that in both iterations we use random bits, but we only communicate according to the result of the final iteration. The resulting procedure for two iterations gives us the following parameters

$$n \to \left(m = m_1 + m_2,\; \varepsilon \le \varepsilon_1 + \varepsilon_2,\; \frac{n}{\varepsilon_1 \varepsilon_2 2^{m_1 + m_2}}\right),$$

and has the following restrictions

$$m_1 \le \frac{n}{\varepsilon_1 2^{m_1}}, \qquad m_2 \le \frac{n}{\varepsilon_1 \varepsilon_2 2^{m_1 + m_2}}.$$

Observe that we are interested in $\varepsilon$ and the communication complexity only up to a constant factor. Further, notice that the errors only add from one iteration to the next. Finally, allowing a higher error rate in a given iteration never increases the amount of randomness or communication required. Therefore, we can assume that in each iteration we allow the same error $\varepsilon$; otherwise, we could simply increase the allowed error in the iteration with the smaller error rate.

With this assumption, on an arbitrary iteration i we will have parameters

$$\left(m_i,\; \varepsilon,\; \frac{n}{\varepsilon^i 2^{m_1 + \dots + m_i}}\right)$$

with the restriction

$$m_i \;\le\; \frac{n}{\varepsilon^i 2^{m_1 + \dots + m_i}}. \tag{10}$$

The total values of parameters after c iterations (we assume that c is a constant) are

$$n \to \left(m = m_1 + \dots + m_c,\; \varepsilon,\; \frac{n}{\varepsilon^c 2^{m_1 + \dots + m_c}}\right).$$

Recall that our protocol using a single iteration of Reed-Solomon works only for $m < \frac{n}{\varepsilon 2^m}$. Thus our goal is to develop a protocol that is valid for larger values of $m$. Notice that only the final restriction directly bounds the total number of random bits used in the protocol:

$$m_c \;\le\; \frac{n}{\varepsilon^c 2^{m_1 + \dots + m_c}} = \frac{n}{\varepsilon^c 2^m}. \tag{11}$$

Rearranging this expression, we get

$$2^m \;\le\; \frac{n}{\varepsilon^c m_c}.$$

We can see that minimizing the randomness used in the last iteration, $m_c$, while keeping $m$ the same, maximizes the range in which this protocol is valid. Naturally, to minimize $m_c$ we need to maximize the randomness used in all previous iterations. Thus, to optimize the parameters, we need to fix

$$m_i = \frac{n}{\varepsilon^i 2^{m_1 + \dots + m_i}}$$

on intermediate iterations.

We prove the following stronger version of Lemma 39.

Lemma 41.

Recall that $f(x) = \log\!\left(\log\frac{1}{\varepsilon} + x\right)$. For any integer constant $c$, the communication protocol derived from $c$ iterations of Reed-Solomon has complexity $\frac{n}{\varepsilon^c 2^m}$ and is valid for all

$$m \;\le\; \log n + c\log\frac{1}{\varepsilon} - f^{(c)}(\log n) \tag{12}$$

(up to an additive constant).

Moreover, if all restrictions (10) are saturated, the inequality (12) is saturated as well.

Proof.

We prove this by induction on c. The base case corresponds to the standard Reed-Solomon protocol, which we already covered.

For the induction step, we do one extra step of the iteration and obtain the restriction

$$m_{c+1} \;\le\; \frac{n}{\varepsilon^{c+1} 2^{m_1 + \dots + m_{c+1}}}.$$

Since for the previous iterations we need to maximize $m_1, \dots, m_c$, we fix them so that all restrictions (10) are saturated. By the inductive hypothesis, (12) is then saturated as well. Substituting the value of $m_1 + \dots + m_c$ from this equality into the inequality above yields

$$m_{c+1} \;\le\; \frac{2^{f^{(c)}(\log n)}}{\varepsilon\, 2^{m_{c+1}}}.$$

Rearranging this inequality and using Fact 6, we obtain

$$m_{c+1} \;\le\; \log\frac{1}{\varepsilon} + f^{(c)}(\log n) - \log\!\left(\log\frac{1}{\varepsilon} + f^{(c)}(\log n)\right) = \log\frac{1}{\varepsilon} + f^{(c)}(\log n) - f^{(c+1)}(\log n).$$

Adding this to the expression for $m_1 + \dots + m_c$ gives the desired inequality.
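Explicitly, combining the saturated form of (12) for $c$ iterations with the bound on $m_{c+1}$:

```latex
m \;=\; \sum_{i=1}^{c+1} m_i
\;\le\; \underbrace{\Bigl(\log n + c\log\tfrac{1}{\varepsilon} - f^{(c)}(\log n)\Bigr)}_{m_1 + \dots + m_c}
\;+\; \underbrace{\Bigl(\log\tfrac{1}{\varepsilon} + f^{(c)}(\log n) - f^{(c+1)}(\log n)\Bigr)}_{m_{c+1}}
\;=\; \log n + (c+1)\log\tfrac{1}{\varepsilon} - f^{(c+1)}(\log n).
```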

Moreover, it is easy to see that if (10) is saturated for $i = c+1$ as well, the resulting inequality is saturated.