
Communication Complexity of Equality and Error-Correcting Codes

Dale Jacobs, Tufts University, Medford, MA, USA; John Jeang, Tufts University, Medford, MA, USA; Vladimir Podolskii, Tufts University, Medford, MA, USA; Morgan Prior, Tufts University, Medford, MA, USA; Ilya Volkovich, Boston College, MA, USA
Abstract

We study the public-coin randomized communication complexity of the equality function. The communication complexity of this function is known to be low when the error probability is constant and the players have access to many random bits. The complexity grows, however, if the allowed error probability and the amount of randomness are restricted.

We show that public-coin randomized protocols for equality and error-correcting codes are essentially the same object. That is, given a protocol for equality, we can construct a code, and vice versa.

We substantially extend the protocol-implies-code direction: any protocol computing a function with a large fooling set can be converted into an error-correcting code. As a corollary, we show that among functions with a fooling set of size s, equality on log s bits has the least randomized communication complexity, regardless of the restrictions on the error probability and the amount of randomness.

Finally, we use the connection to error-correcting codes to analyze the randomized communication complexity of equality for varying restrictions on the error probability and the amount of randomness. In most cases, we provide tight bounds. We pinpoint the setting in which tight bounds are still unknown.

Keywords and phrases:
communication complexity, randomized communication complexity, error-correcting codes
Copyright and License:
© Dale Jacobs, John Jeang, Vladimir Podolskii, Morgan Prior, and Ilya Volkovich; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Communication complexity; Theory of computation → Error-correcting codes; Theory of computation → Randomness, geometry and discrete structures
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2025/068/ [9]
Editors:
C. Aiswarya, Ruta Mehta, and Subhajit Roy

1 Introduction

In the standard model of communication complexity, two players, Alice and Bob, are tasked with computing the value of a fixed function f:{0,1}n×{0,1}n→{0,1} on some given input (x,y). However, Alice only knows x and Bob only knows y, so they must communicate to compute f(x,y). Communication complexity studies the amount of communication between players in terms of the number of bits exchanged. In recent decades, communication complexity has become one of the central areas in computational complexity, with numerous applications to other areas, including circuit complexity, proof complexity, online algorithms, and data structures [11, 19, 22, 10].

The equality function EQ:{0,1}n×{0,1}n→{0,1} is one of the most fundamental functions in communication complexity. On input (x,y), this function outputs 1 iff x=y. This was the first function for which an exponential separation between deterministic and randomized communication complexity was shown (see [23]). Later, EQ played a key role in many results in communication complexity, including direct sum results for the randomized setting [7], protocols for threshold functions [17], lifting theorems [14, 6], and communication complexity with an equality oracle [5, 18].

It is well known that the randomized communication complexity of EQ is low for constant error probability. We can consider both the private-coin and public-coin models, which we sometimes refer to as the private randomness setting and the public randomness setting, respectively. In the private randomness setting, there is an O(log n) protocol, usually attributed to Rabin and Yao (unpublished, see [11, 20]). For the case of public randomness, there is an O(1) protocol. The general idea of this protocol is that the players first use their shared randomness to sample a random hash function h. Then Alice sends Bob h(x), and Bob compares it to h(y). The most standard protocol of this type is for the players to pick a random vector r∈{0,1}n and compare the inner products ⟨x,r⟩ and ⟨y,r⟩ over 𝔽2. When x and y are not equal, the compared inner products differ with probability 1/2.
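As an illustration, the standard inner-product protocol can be simulated in a few lines (a minimal sketch; the shared vector r stands in for the public randomness, and the `rounds` parameter is our hypothetical knob for the error probability):

```python
import random

def eq_protocol(x: str, y: str, rounds: int = 20) -> bool:
    """Simulate the standard public-coin EQ protocol on bit strings x, y.

    In each round both players compute the inner product of their input with
    a shared random vector r over F_2; a mismatch proves x != y.  When x != y
    a single round detects the difference with probability 1/2, so the
    false-positive probability after `rounds` rounds is at most 2**-rounds.
    """
    assert len(x) == len(y)
    for _ in range(rounds):
        r = [random.randint(0, 1) for _ in range(len(x))]  # public randomness
        ip_x = sum(int(a) & b for a, b in zip(x, r)) % 2   # Alice sends 1 bit
        ip_y = sum(int(a) & b for a, b in zip(y, r)) % 2
        if ip_x != ip_y:
            return False  # inner products differ, so certainly x != y
    return True  # equal, or a false positive (probability <= 2**-rounds)
```

Note that Alice's total communication is `rounds` bits, independent of the input length n.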

Although the public-coin randomized complexity of EQ is O(1) for constant error probability and unlimited access to randomness, the complexity can increase when we impose stronger restrictions on these parameters. The error probability for EQ (as well as for any other function) can be reduced to any ε at the cost of a log(1/ε) factor in complexity by a standard error reduction procedure. Note that the EQ protocol above uses a large amount of randomness, which grows as the error probability is reduced. The number of random bits, which we denote by m, can be substantially reduced using the idea from Newman’s result on the connection between communication complexity in the public-coin and private-coin models [16, 13]. Another known construction of randomized protocols for EQ with restricted randomness is through error-correcting codes. In such a protocol, the players privately encode their inputs via an error-correcting code and compare them in a random coordinate (see Section 2.4 or [19, Chapter 3]).

As for lower bounds, Canetti and Goldreich [4] obtained general lower bounds in the setting with restricted randomness (by a reduction to the deterministic case; see Fact 14). For EQ, their result gives a lower bound of Ω(n/(ε2m)). They also provided examples of matching upper bounds for the case of constant error probability. Moran et al. [15] (see also [13]) gave a lower bound on communication complexity in the private-coin model in the small-error case for functions with large fooling sets. In particular, for EQ, their bound implies a lower bound of Ω(log n + log(1/ε)) in the private randomness setting. For public randomness, this implies a lower bound of Ω(log n + log(1/ε) − m). (Note that a public-coin protocol can be converted into a private-coin one by having one of the players generate the random bits and send them to the other player.) Despite these results, the entire landscape of communication complexity of EQ for varying ε and m is not fully clear.

Our Results

As mentioned above, error-correcting codes give communication protocols for EQ. In our first result, we observe that the converse is also true: communication protocols for EQ give rise to error-correcting codes. Thus, error-correcting codes and protocols for EQ are essentially the same object.

Theorem 1 (Informal version of Theorem 30 and Corollary 31).

Given a protocol for EQ, we can construct an error-correcting code with equivalent parameters (up to a constant factor), and vice versa. Furthermore, this connection is precise for one-sided error protocols; in this case, protocols and error-correcting codes are in one-to-one correspondence.

We obtain the following corollary:

Corollary 2 (Informal version of Corollary 32).

Any randomized public-coin protocol for EQ can be converted into a one-sided false-positive error protocol with no loss in parameters.

In particular, any communication protocol for EQ with ε1 false positive error and ε0 false negative error can be converted into a one-sided error protocol with ε0+ε1 false positive error (the communication complexity and the number of random bits used remain the same).

Next, we show that one direction of the result above generalizes to arbitrary functions in the following way:

Theorem 3 (Informal version of Theorem 33).

Any randomized public-coin protocol for a function with a large fooling set can be converted into an error-correcting code with the same parameters as for EQ.

As a consequence, we obtain the following corollary:

Corollary 4 (Informal version of Corollary 34).

Among all functions with a fooling set of size s, EQ on log s bits has the least randomized public-coin communication complexity for all values of error probability and number of random bits.

In other words, this corollary says that of all functions, EQ provides the largest separation between fooling set size and public-coin randomized communication complexity.

We next use the established connection between the communication complexity of EQ and error-correcting codes to provide a detailed analysis of the communication complexity k of EQ on n-bit inputs for varying amounts of random bits m and error ε. We start by translating bounds for error-correcting codes into lower bounds for communication complexity. In particular, based on the Plotkin bound for codes, we obtain the following lower bound on the randomized communication complexity of EQ in the public-coin model (up to a constant factor):

Theorem 5 (Informal version of Theorem 35).

Any randomized public-coin protocol for EQ satisfies the following lower bound for the communication complexity k:

k ≥ max(min(n, n/(ε2m)), log(1/ε)).

Furthermore, this lower bound is tight for many regimes of the parameters m,ε.

It is noted in [1] that the Plotkin bound is the best known bound for codes over a large alphabet. We confirm this by comparing it with the Singleton, Elias-Bassalygo, Griesmer, and MRRW bounds. We observe that, in fact, the combination of the Singleton and Elias-Bassalygo bounds yields the full lower bound in the theorem above (up to a constant factor).

Interestingly, the two known lower bounds for EQ mentioned above have natural interpretations in terms of codes. The result of [15] corresponds to the Elias-Bassalygo bound (after translating their result to the public randomness setting), and the result of [4] matches the Singleton bound.

Next, we further analyze the landscape of communication complexity of EQ. Depending on the values of the parameters m and ε, the expression in the theorem above can resolve (up to constant factors) to n, n/(ε2m), or log(1/ε). In the first case, the trivial deterministic protocol for EQ (see Section 2.2) is clearly optimal. In the second case, we show that for a wide range of parameters, the protocol for EQ obtained from Reed-Solomon codes is optimal. However, this code does not apply in a certain range of parameters, specifically when log(1/ε) < n/(ε2m) < m. In this case, we devise a protocol that is suboptimal, but close to the lower bound (specifically, we iterate the Reed-Solomon code with itself). Finally, in the third case, we show that the standard protocol combined with an improvement of Newman’s theorem (see Fact 15) for reducing the number of random bits gives an upper bound that is tight up to additive lower-order terms. Previously, this was known only for the case of constant ε [4].

Other related work

Fleischer et al. [8] studied the relation between the number of random bits and the zero-error communication complexity of the list-non-disjointness function. Ball et al. [2] gave a lower bound on randomized communication complexity with defective randomness in terms of the size of the fooling set. The paper [15] analyzes communication protocols by looking at the statistical distance between the runs of the protocol on distinct inputs from the fooling set. The difference from our approach is that we consider the Hamming distance between the vectors of outcomes instead.

One could apply round elimination techniques (e.g., [3]) to transform an arbitrary protocol for EQ into a one-way protocol, thus obtaining an alternative proof of Corollary 2 (or Corollary 32). This allows for an alternative proof of the translation from protocols to codes in Theorem 1: we can prove the translation for one-way protocols and use round elimination to extend the result to general protocols. However, since the proof of this direction of the theorem does not differ much between one-way and general communication protocols, we decided to give a direct proof for the general case.

The rest of the paper is organized as follows. In Section 2, we give the necessary definitions and state known results. In Section 3, we prove our results on the connection between communication complexity and error-correcting codes. In Section 4, we use the connection established in Section 3 to deduce lower bounds on the communication complexity of equality for various amounts of allowed error and randomness (the analysis of suboptimal bounds on the codes is omitted and can be found in the full version of the paper [9]). In Section 5, we analyze the landscape of communication complexity of equality by comparing our lower bounds against upper bounds for various regimes of parameters.

2 Preliminaries

We make frequent use of the following fact:

Fact 6.

For two non-negative functions a(n), b(n) such that a(n) = b(n) + log b(n), we have

b(n) = a(n) − log a(n) + O(1).
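A quick numerical sanity check of Fact 6 (a sketch; we test the O(1) term as an absolute gap of at most 1 on a few sample values):

```python
import math

# If a = b + log b, then b should equal a - log a up to an additive constant:
# from b = a - log b and b <= a <= 2b (for b >= 2), log b differs from log a
# by at most 1, so |b - (a - log a)| <= 1.
for b in [2.0, 10.0, 100.0, 1024.0, 1e6]:
    a = b + math.log2(b)
    assert abs(b - (a - math.log2(a))) <= 1.0
```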

2.1 Notation and Conventions

All logarithms are base 2 unless otherwise specified. We use the notation f(i) to denote the function f composed with itself i times.

For brevity, unless specified otherwise, when we refer to the “complexity” of a function, we are referring to the randomized communication complexity in the public-coin model. Similarly, when we refer to a “protocol”, we mean a randomized communication protocol in the public-coin model. By the “naive protocol” for EQ, we mean the deterministic protocol in which Alice sends her entire input, and by the “standard protocol” for EQ, we mean the standard randomized protocol used in the public-coin model (see Chapter 3 of [11]). Finally, when we refer to a “code”, we are referring to an error-correcting code.

We use m to denote the amount (in bits) of public randomness in a communication protocol, and we use ε to denote the allowed error. When protocols have two-sided error, we use ε1 to denote false positive error and ε0 to denote false negative error of a protocol. That is, ε1 is an upper bound on the probability that the protocol outputs 1 on an input where the function outputs 0. The false negative error, ε0, is an upper bound on the probability that the protocol outputs 0 on an input where the function outputs 1. For convenience, we often use k to denote public-coin randomized communication complexity. That is, we let k denote CCpub. We use the notation [N,K,D] code to refer to a code with block length N, message length K, and distance D (see Section 2.3 for definitions of these parameters).

2.2 Communication Complexity

In the standard model of communication complexity, two players, Alice and Bob, wish to compute the value of a function f:{0,1}n×{0,1}n→{0,1} on some given input (x,y). The challenge is that Alice only knows x∈{0,1}n and Bob only knows y∈{0,1}n, and thus the players need to communicate (i.e., send bits) to compute f(x,y).

Definition 7 (protocol, value, cost, one-way).

A communication protocol Π is a binary tree where each internal node v is labeled either by a function av:{0,1}n→{0,1} or by a function bv:{0,1}n→{0,1} (denoting Alice’s or Bob’s message, respectively) and each leaf node is labeled with an element in {0,1}.

The value of the protocol Π on input (x,y) is the label reached by traversing the tree according to the following rules: start at the root, and for each internal node v labeled by av (resp bv), move left if av(x)=0 (resp bv(y)=0) and right otherwise. The cost of Π on input (x,y) is the length of the path taken on input (x,y). The cost of Π is the height of the tree. We call a protocol one-way if the entire protocol consists of just one message.

Definition 8 (deterministic communication complexity).

The deterministic communication complexity CC(f) of f:{0,1}n×{0,1}n→{0,1} is the minimum cost of Π, over all protocols Π that compute f.

Fact 9 (deterministic communication complexity of EQn).

For x,y∈{0,1}n, the equality function, EQn(x,y), is defined to be 1 if x=y and 0 otherwise. We have that CC(EQn)=n+1.

The lower bound on CC(EQ) can be proved using the fooling set method (see [11]). We define a fooling set below:

Definition 10 (fooling set [11]).

A fooling set for f:{0,1}n×{0,1}n→{0,1} is a set F⊆{0,1}n×{0,1}n together with a value b∈{0,1} such that

  1. For every (x,y)∈F, f(x,y)=b.

  2. For all distinct pairs (x1,y1),(x2,y2)∈F, either f(x1,y2)≠b or f(x2,y1)≠b.

While Definition 7 defines protocols for deterministic communication complexity, this paper focuses on randomized communication complexity, wherein Alice and Bob have a source of public randomness. Formally, there is a random string of bits r∈{0,1}m available to both players. Where in the deterministic setting we had internal nodes in the protocol tree labeled with av(x) and bv(y), we now have internal nodes labeled with av(x,r) and bv(y,r); that is, Alice and Bob’s communication can depend on r. We can also view this model as a distribution {Πr}r on deterministic protocols:

Definition 11 (randomized public-coin protocol [11]).

A (randomized) public-coin protocol is a probability distribution over deterministic protocols. The success probability of a public-coin protocol on input (x,y) is the probability that a deterministic protocol Πr, drawn according to this distribution, computes f(x,y) correctly.

Definition 12 (randomized communication complexity [11]).

The randomized communication complexity CCεpub(f) is the minimum cost of a public-coin protocol that computes f with an error of at most ε (for every input (x,y)).

Fact 13 (randomized communication complexity of EQ).

The randomized communication complexity (in the public randomness model) of EQ with constant error ε is O(1).

A proof of Fact 13 can be found in [11].

Randomized protocols can be converted to deterministic protocols, with some overhead on the communication complexity, by repeating the randomized procedure and taking a majority vote. The following upper bound is proved in [4]:

Fact 14 ([4]).

Let f:{0,1}n×{0,1}n→{0,1} be a function computable via a randomized communication protocol Π using k bits of communication, m bits of randomness, and with error probability ε. Then the deterministic communication complexity CC(f) is bounded by

CC(f) ≤ k(2ε·2m + 1). (1)
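The mechanism behind bound (1) can be sketched as follows (a toy sketch under the assumption that the randomized protocol errs on at most ε·2m of the 2m seeds for every input; `protocol` and the hash-style instantiation in the usage note are hypothetical stand-ins):

```python
def majority_derandomize(protocol, m: int, eps: float):
    """Build a deterministic evaluator from a randomized one.

    protocol(x, y, seed) -> 0/1 is assumed to err on at most eps * 2**m of
    the 2**m seeds for every input pair.  Any fixed list of
    t = 2 * floor(eps * 2**m) + 1 distinct seeds then contains strictly more
    correct seeds than erring ones, so a majority vote is always right.
    With k bits per run and t runs, the total cost matches k * (2*eps*2^m + 1).
    """
    t = 2 * int(eps * (1 << m)) + 1
    assert t <= (1 << m)  # need enough distinct seeds

    def deterministic(x, y):
        votes = sum(protocol(x, y, seed) for seed in range(t))
        return 1 if 2 * votes > t else 0

    return deterministic
```

For example, the toy hash protocol `lambda x, y, s: int((x * s) % 257 == (y * s) % 257)` on inputs in {0,…,256} errs only at seed s = 0, i.e., on 1 of the 2^8 seeds.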

The following fact improves upon Newman’s Theorem [16], and states that a public-coin protocol can be converted into a private-coin protocol with only logarithmic additive overhead in the complexity. Similarly, the amount of randomness used by a public-coin protocol can be reduced by incurring a small amount of extra error.

Fact 15 ([13]).

Let f:{0,1}n×{0,1}n→{0,1} be a function. Then CCε(1+δ)priv(f) ≤ CCεpub(f) + log(n/ε) + log(6/δ2). Furthermore, any public-coin protocol for f which uses m random bits and has error ε can be converted into a protocol using log n + log(1/ε) + log(6/δ2) random bits with error ε(1+δ), at no penalty to the communication complexity.

Interestingly, this improvement over Newman’s original theorem is crucial for obtaining tight bounds in one of the cases of our analysis in Section 5.3.

See [11] for more on private and public-coin settings in communication complexity.

2.3 Error-Correcting Codes and Bounds

Definition 16 (code, block length, codewords).

A code C of block length N over alphabet Σ is a subset C⊆ΣN. The elements c∈C are called codewords.

Definition 17 (distance, relative distance).

The distance D of a code C⊆ΣN is minc≠c′∈C Δ(c,c′), where Δ denotes the Hamming distance function. In other words, the distance of a code is the minimum Hamming distance between any pair of distinct codewords. The relative distance δ of C is the ratio of distance to block length, D/N.

Definition 18 (message length).

The message length (sometimes called dimension) K of a code C over alphabet Σ with |Σ|=q is K=logq|C|.

We call a code with block length N, message length K, and distance D an [N,K,D] code.

Definition 19 (rate).

The rate of a code with block length N and message length K is R ≔ K/N.

High distance and high rate are desirable properties for codes; however, there is a trade-off between these two parameters. The bounds stated below offer some insight into this trade-off.

Fact 20 (Singleton Bound).

Let C be an [N,K,D] code with alphabet size q. Then

K ≤ N − D + 1.

Equivalently, we can express the Singleton Bound in terms of rate and relative distance:

R ≤ 1 − δ + 1/N.

The Singleton Bound is tight when q ≥ N. That is, there exist codes for which K = N − D + 1 under this condition. We call such codes MDS (Maximum Distance Separable) codes. See Section 2.3 below.

Definition 21 (q-ary entropy).

The q-ary entropy function Hq(δ), a generalization of binary entropy, is defined as follows:

Hq(δ) ≔ δ·logq((q−1)/δ) + (1−δ)·logq(1/(1−δ)).
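For concreteness, the q-ary entropy of Definition 21 can be computed directly (a small sketch; the function name `H_q` is ours):

```python
import math

def H_q(q: int, delta: float) -> float:
    """q-ary entropy: delta*log_q((q-1)/delta) + (1-delta)*log_q(1/(1-delta))."""
    if delta == 0.0:
        return 0.0  # limiting value as delta -> 0
    logq = lambda z: math.log(z, q)
    return delta * logq((q - 1) / delta) + (1 - delta) * logq(1 / (1 - delta))
```

Sanity checks: H_2(1/2) = 1 (binary entropy at 1/2), and more generally H_q((q−1)/q) = 1.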

As the block length N of a code tends to infinity, we have the relationship Volq(δN,N) ≈ qN·Hq(δ), where Volq(δN,N) is the volume of the Hamming ball of radius δN in the space of strings of length N over an alphabet of size q. Hence Hq appears in the asymptotic statement of various bounds for error-correcting codes, such as the Elias-Bassalygo Bound below:

Fact 22 (Elias-Bassalygo Bound).

Let C be an [N,K,D] code with alphabet size q. Then

R ≔ K/N ≤ 1 − Hq[(1 − 1/q)(1 − √(1 − (q/(q−1))δ))].
Fact 23 (Griesmer Bound).

For a linear [N,K,D] code with alphabet size q, we have:

N ≥ ∑i=0K−1 ⌈D/qi⌉.

For more information on linear codes, see [21].

Fact 24 (Plotkin Bound).

The Plotkin bound strengthens the Singleton bound. For a code C with message length K, block length N, relative distance δ, and alphabet size q, we have:

K/N + δ·(q/(q−1)) ≤ 1 + 1/N.
Fact 25 (MRRW Bound for large alphabet [1]).

Let C be an [N,K,D] code with alphabet size q and 0 < δ < (q−1)/q. Then

R ≔ K/N ≤ Hq[(q − 1 − (q−2)δ − 2√((q−1)δ(1−δ)))/q].

Our upper bounds use protocols based on Reed-Solomon codes, which we define below. For more information about these codes, see [21].

Reed-Solomon codes are a family of codes which achieve the Singleton bound (i.e., for which D = N − K + 1).

Formally, Reed-Solomon codes are defined as follows:

Definition 26 (Reed-Solomon codes).

Let q ≥ N ≥ K. Let α1,…,αN ∈ 𝔽q be distinct. The Reed-Solomon code over 𝔽q with evaluation points α=(α1,…,αN) is the [N,K,N−K+1] code given by:

RSq(N,K−1) = {(f(α1),f(α2),…,f(αN)) : f ∈ 𝔽q[X], deg(f) ≤ K−1}.
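A Reed-Solomon encoder can be sketched in a few lines (a toy sketch, assuming q prime so that arithmetic mod q is field arithmetic; `rs_encode` is our name):

```python
def rs_encode(message, q, alphas):
    """Evaluate the polynomial whose coefficient vector is `message`
    (degree <= K-1) at the distinct points `alphas` in F_q (q assumed prime)."""
    alphas = list(alphas)
    assert len(set(alphas)) == len(alphas) <= q  # distinct evaluation points
    return [sum(c * pow(a, i, q) for i, c in enumerate(message)) % q
            for a in alphas]
```

Two distinct messages of length K yield codewords that agree on at most K−1 points (their difference is a nonzero polynomial of degree at most K−1), so the distance is at least N−K+1, matching the Singleton bound.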

2.4 Codes Give Communication Protocols for 𝐄𝐐

It is known (see [19, beginning of Chapter 3]) that codes give communication protocols for EQ. In particular, we have the following fact:

Fact 27.

For every [N,K,D] code with alphabet Σ of size q, there exists a communication protocol, Π, computing EQn, such that n ≤ K·log q, using m public random bits and log q bits of communication, with error (N−D)/2m. Moreover, this protocol is one-way and has one-sided false-positive error.
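The protocol of Fact 27 can be sketched as follows (a minimal simulation; `encode` is any code's encoder, and `random.randrange` stands in for the shared random bits selecting a coordinate):

```python
import random

def eq_from_code(x, y, encode) -> bool:
    """One-way EQ protocol from a code: both players privately encode their
    inputs and compare a single shared-random coordinate.

    Alice sends only c_x[i] (log q bits).  If x == y the players always
    agree; if x != y they agree on at most N - D coordinates, giving
    one-sided false-positive error at most (N - D) / N when i ranges over
    all N positions.
    """
    c_x, c_y = encode(x), encode(y)
    i = random.randrange(len(c_x))  # coordinate chosen via public randomness
    return c_x[i] == c_y[i]
```

For instance, with the toy distance-N code `encode = lambda v: [(v * s) % 257 for s in range(1, 257)]` on messages in {0,…,256}, distinct codewords never agree, so the protocol is exact.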

3 Communication Complexity vs. Error-Correcting Codes

In this section, we show that randomized communication protocols for EQ give rise to error-correcting codes. In fact, we prove a generalization of this statement which applies to functions that are close to EQ in some sense, which we call “EQ-like” functions.

Definition 28 (EQ-like).

We say that a function f:{0,1}n×{0,1}n→{0,1} is EQ-like if for all x∈{0,1}n, f(x,x)=1, and for all pairs (x,y) such that x≠y, either f(x,y)=0 or f(y,x)=0.

Note that EQ is a particular example of an EQ-like function, where for all pairs (x,y) such that x≠y, f(x,y)=0 and f(y,x)=0. Another example of an EQ-like function is the well-known Greater-Than function, defined as follows: GT(x,y)=𝟙[int(x) ≥ int(y)], where int(x) is the integer with binary representation x.

3.1 Protocols for 𝐄𝐐 Imply Codes

Consider a protocol for f. Alice holds x, Bob holds y, and there is public randomness r. Let T(x,y,r) be the transcript of Alice and Bob’s communication. That is,

T(x,y,r)=(a1,b1,a2,b2,,aR,bR),

where each ai is a function of all previous messages, x, and their shared randomness r, and each bi is a function of all previous messages, y, and r. Without loss of generality, at the end of the protocol, Alice outputs c(T(x,y,r),x,r). Let ε1 and ε0 denote the false positive and false negative error of the protocol, respectively. That is,

∀(x,y) such that f(x,y)=0: Prr[c(T(x,y,r),x,r)=1] ≤ ε1,
∀(x,y) such that f(x,y)=1: Prr[c(T(x,y,r),x,r)=1] ≥ 1−ε0.

We show that the transcript T of Alice and Bob’s messages forms a code. An encoding of x∈{0,1}n is simply the concatenation of T(x,x,r) for all choices of r.

Lemma 29.

Let f be EQ-like, and let T(x,y,r) be the transcript of Alice and Bob’s communication in a randomized protocol for f as given above. Then for x≠y∈{0,1}n:

Prr[T(x,x,r)=T(y,y,r)] ≤ ε0+ε1.
Proof.

Let x≠y∈{0,1}n. Without loss of generality, suppose f(x,y)=0. (If instead f(y,x)=0, we replace T(x,y,r) with T(y,x,r) in the first two lines of the displayed derivation and the proof holds.) Then we have
Prr[c(T(x,y,r),x,r)=1] + Prr[c(T(y,y,r),y,r)=0]
≥ (Prr[c(T(x,y,r),x,r)=1 ∣ T(x,x,r)=T(y,y,r)] + Prr[c(T(y,y,r),y,r)=0 ∣ T(x,x,r)=T(y,y,r)]) · Prr[T(x,x,r)=T(y,y,r)]
= (Prr[c(T(y,y,r),y,r)=1 ∣ T(x,x,r)=T(y,y,r)] + Prr[c(T(y,y,r),y,r)=0 ∣ T(x,x,r)=T(y,y,r)]) · Prr[T(x,x,r)=T(y,y,r)]
= 1 · Prr[T(x,x,r)=T(y,y,r)],

where the substitution T(x,y,r)=T(y,y,r) follows from the fact that T(x,x,r)=T(y,y,r) and that the communication tree partitions the inputs into combinatorial rectangles. Hence,

Prr[T(x,x,r)=T(y,y,r)] ≤ Prr[c(T(x,y,r),x,r)=1] + Prr[c(T(y,y,r),y,r)=0] ≤ ε0+ε1.

Using Lemma 29, we obtain the following theorem:

Theorem 30.

Let f:{0,1}n×{0,1}n→{0,1} be EQ-like, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of public randomness, and has two-sided error probabilities ε0 and ε1. Then there exists an error-correcting code with message length n, codeword length 2m, relative distance 1−(ε0+ε1), and alphabet size 2k.

Proof.

The proof follows directly from Lemma 29. Concretely, the codeword for x∈{0,1}n is given by the concatenation of T(x,x,r) for all 2m possible values of randomness r∈{0,1}m. Each symbol in the codeword corresponds to T(x,x,r) for one choice of r, and since each transcript uses k bits of communication, there are 2k possible symbols in the code alphabet. Finally, by Lemma 29, the probability that two different codewords agree on any given coordinate is bounded above by ε0+ε1. Thus the relative distance of the code is at least 1−(ε0+ε1).
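The construction in this proof can also be sketched programmatically (a toy sketch; `transcript(x, y, r)` is a hypothetical stand-in for the protocol's transcript function T):

```python
from itertools import product

def protocol_to_code(transcript, n: int, m: int) -> dict:
    """Theorem 30's construction: the codeword of x is the tuple of diagonal
    transcripts T(x, x, r) over all 2**m random strings r.

    Block length 2**m; each symbol is a full transcript, so the alphabet has
    size at most 2**k for a k-bit protocol.
    """
    strings = lambda length: ("".join(b) for b in product("01", repeat=length))
    return {x: tuple(transcript(x, x, r) for r in strings(m))
            for x in strings(n)}
```

As a check, instantiate it with the one-round inner-product transcript (Alice's bit, then Bob's bit) for n = m = 3. The protocol has ε1 = 1/2 and ε0 = 0, and indeed every pair of the 8 codewords ends up at Hamming distance exactly (1 − 1/2)·2^3 = 4.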

If we consider a protocol with one-sided error, that is, a protocol with either ε0=0 or ε1=0, we immediately obtain the following corollary:

Corollary 31 (One-sided error).

Let f:{0,1}n×{0,1}n→{0,1} be EQ-like, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of public randomness, and has one-sided error probability ε. Then there exists a code with message length n, codeword length 2m, relative distance 1−ε, and alphabet size 2k.

We remark that since codes give protocols (Fact 27), this corollary shows that one-way protocols for EQ with one-sided false-positive error are in one-to-one correspondence with error-correcting codes. That is, we can translate from protocols to codes and vice versa with no loss in parameters. Another consequence of Corollary 31 is the following corollary:

Corollary 32.

Any randomized protocol for EQ can be converted to a one-way protocol with one-sided false-positive error with no loss in parameters.

The following table summarizes the relationship between the communication complexity of a protocol for EQ and the obtained code:

Table 1: Translation Between CC and Codes.
Communication Complexity Code
size of input = n # of messages = 2n
# of random bits = m length of code = 2m
error = ε relative distance = 1−ε
complexity = k alphabet size = 2k

3.2 Codes from Functions Based on Fooling Sets

We have seen that protocols for EQ-like functions give codes. We can actually make a more general statement which applies to all functions. In particular, for any function f, a protocol for f gives a code whose parameters depend on the parameters of the protocol and the size of the largest fooling set for f.

Theorem 33.

Let f:{0,1}n×{0,1}n→{0,1} be any function with fooling set F, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of randomness, and has error probabilities ε0, ε1. Then there exists a code with block length 2m, message length log|F|, relative distance 1−(ε0+ε1), and alphabet size 2k.

Proof.

To prove the theorem, we show how to construct an error-correcting code from the function f with fooling set F. Let X and Y denote the sets of x-inputs and y-inputs, respectively, which appear in elements of the fooling set F, and define f′:{0,1}log|F|×{0,1}log|F|→{0,1} to be f restricted to the inputs X×Y. Without loss of generality, assume that for (x,y)∈F, f(x,y)=1. By the definition of a fooling set (see Section 2.2), f′ is EQ-like, up to a permutation on one half of the input. Therefore, a protocol for computing f gives a protocol for f′, which by Lemma 29 gives an error-correcting code for messages of length log|F|.

As a consequence of Theorem 33, we obtain the following corollary, which says that in some sense, EQ represents the largest gap between fooling set size and randomized communication complexity in the public-coin model.

Corollary 34.

Let f:{0,1}n×{0,1}n→{0,1} be any function with fooling set F, and suppose there exists a randomized communication protocol for f which has communication complexity k, uses m bits of randomness, and has error probabilities ε0, ε1. Then there exists a protocol for EQ on log|F| bits which has communication complexity k, uses m bits of randomness, and has one-sided false-positive error probability ε0+ε1.

This follows because, given a protocol for f, we can construct a code with message length log|F|, from which we obtain a protocol for EQ on log|F| bits.

Informally, Corollary 34 says that among all functions with a fooling set of size s, EQ on log s bits has the smallest public-coin randomized complexity for all values of ε and m. For protocols with two-sided error, this interpretation is true up to constant factors. For protocols with one-sided error, it is precise.

4 Communication Lower Bounds Implied by Code Bounds

The equivalence between communication protocols for EQ and codes allows us to view bounds on codes as bounds on communication complexity. In particular, since we showed that the existence of protocols implies the existence of codes, the nonexistence of codes implies the nonexistence of protocols. Using this connection, we can translate known bounds for codes (Section 2.3) into bounds on communication complexity using the correspondence given by Table 1. (Note that these lower bounds also hold for functions with large fooling sets.) Below we describe the bound obtained from the Plotkin bound, which turns out to be the best known bound for large alphabet size.

The Plotkin bound (Fact 24) can be expressed in terms of communication protocols as:

n/(k2m) + (1−ε)·(2k/(2k−1)) ≤ 1 + 1/2m. (2)

In terms of the error rate ε, this implies

ε ≥ ((n−k)/(k2m))·((2k−1)/2k) + 1/2k. (3)

From this we get the system

ε ≥ ((n−k)/(k2m))·((2k−1)/2k),
ε ≥ 1/2k.

For any k we have (2k−1)/2k ≥ 1/2; thus, solving for k, we get the following (up to constant factors):

k ≥ n/(ε2m + 1),
k ≥ log(1/ε).

Depending on m and ε, either 1 or ε2m is the dominant term in the denominator of the first inequality. Based on these two cases, this bound can be equivalently stated (up to constant factors) as k ≥ min(n/(ε2m), n).

Combining the two bounds, we get the following theorem.

Theorem 35.

For any randomized public-coin protocol for EQ on n bits with complexity k, m bits of randomness, and error probability ε,

k ≥ max(min(n, n/(ε2m)), log(1/ε)) (4)

up to constant factors.
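The three regimes of Theorem 35 can be explored numerically (a small helper of our own, ignoring the hidden constant factors):

```python
import math

def eq_lower_bound(n: int, m: int, eps: float) -> float:
    """Evaluate max(min(n, n / (eps * 2^m)), log2(1/eps)) from Theorem 35."""
    return max(min(n, n / (eps * 2**m)), math.log2(1 / eps))
```

For example, with n = 1000: when ε ≤ 1/2m the bound degenerates to n (only the deterministic protocol remains); for moderate m and ε the term n/(ε2m) dominates; and for large m only log(1/ε) survives.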

It was noted (without details) in [1] that the Plotkin bound is the best known bound for error-correcting codes over a large alphabet. For the sake of completeness, we provide a detailed analysis of the communication complexity bounds we can obtain from the Singleton, Elias-Bassalygo, Griesmer, and MRRW bounds; this analysis can be found in the full version of our paper [9]. We also note that the combination of the Singleton and Elias-Bassalygo bounds matches the Plotkin bound up to a constant factor (see [9]).

Note that Theorem 35 gives a different lower bound (one of $n$, $\frac{n}{\varepsilon 2^m}$, or $\log\frac{1}{\varepsilon}$) for different regimes of the parameters $n$, $m$, and $\varepsilon$. In Section 5, we examine the relationship between $n$, $m$, and $\varepsilon$ necessary for achieving each of the three possible lower bounds. We also analyze when these bounds are tight by giving matching upper bounds for most regimes of the parameters. This analysis is summarized in the table below.

Table 2: Bounds on complexity of EQ for various regimes of parameters.

| Description of the Regime | Lower Bound | Upper Bound |
|---|---|---|
| $\frac{1}{2^m} \ge \varepsilon \ge \frac{1}{2^n}$ | $n$ | $n$ |
| $n \ge \frac{n}{\varepsilon 2^m} \ge \log\frac{1}{\varepsilon}$ | $\frac{n}{\varepsilon 2^m}$ | $\frac{n}{\varepsilon 2^m}$, if $m < \frac{n}{\varepsilon 2^m}$; $\quad\frac{n}{2^m}$, if $\log n - \log^{(c)} n > m > \frac{n}{\varepsilon 2^m}$ and constant $\varepsilon$ and $c$; $\quad\frac{n}{\varepsilon^c 2^m}$, if $\varepsilon = \frac{1}{\log^{(c)} n}$ and $c$ is constant |
| $\log n + \log\frac{1}{\varepsilon} - \log\log\frac{1}{\varepsilon} < m$ | $\log\frac{1}{\varepsilon}$ | $\log\frac{1}{\varepsilon}$, if $m > \log n + \log\frac{1}{\varepsilon} + \Theta(1)$ |

5 Analyzing the Landscape of Communication Complexity of 𝐄𝐐

In this section, we analyze the various regimes of the parameters that give rise to different lower bounds on communication complexity, and we examine when these bounds are tight by giving matching upper bounds. We consider three cases based on which of the three terms the maximum in Theorem 35 evaluates to.

5.1 Case 1

In this case, we have

$$\begin{cases} n \le \dfrac{n}{\varepsilon 2^m},\\[1.5ex] n \ge \log\dfrac{1}{\varepsilon},\end{cases}$$

or, after rearranging, $\frac{1}{2^n} \le \varepsilon \le \frac{1}{2^m}$. In this case, $\max\!\left(\min\!\left(n, \frac{n}{\varepsilon 2^m}\right), \log\frac{1}{\varepsilon}\right) = n$, giving the lower bound $k \ge n$. Clearly, the naive protocol for EQ gives us the matching upper bound.

 Remark 36.

This lower bound is actually easy to see: for $\varepsilon < \frac{1}{2^m}$, only a deterministic protocol is possible, since with $m$ random bits the error probability of a protocol is always a multiple of $\frac{1}{2^m}$ (the granularity of the error probability).

5.2 Case 2

In this case, we have

$$\begin{cases} n \ge \dfrac{n}{\varepsilon 2^m},\\[1.5ex] \dfrac{n}{\varepsilon 2^m} \ge \log\dfrac{1}{\varepsilon}.\end{cases}$$

The lower bound in this case is $k \ge \frac{n}{\varepsilon 2^m}$.

 Remark 37.

Note that this case shows that we cannot study the value of $m$ up to a constant factor: the bound depends on $2^m$, so multiplicative changes in $m$ may affect the bound dramatically. This is why we never allow extra multiplicative constant factors in $m$.

5.2.1 Upper Bound with Reed-Solomon Protocol

For the upper bound in this case, we can use the communication protocol derived from Reed-Solomon codes (Section 2.3). In this protocol, Alice and Bob encode their inputs using a Reed-Solomon code and check the value of a single random index in a subset S of their respective codewords. For some values of parameters, the communication complexity of this protocol matches the lower bound given by the Singleton bound.

Lemma 38.

For any $m \le k$, there exists a public-coin randomized communication protocol that computes EQ using $k$ bits of communication, $m$ bits of randomness, and error rate $\varepsilon = O\!\left(\frac{n}{k 2^m}\right)$.

Proof.

Let $q$ be the largest prime number such that $q < 2^k$. To compute EQ, we use an $\mathrm{RS}_q(q, n/k)$ code to get an error rate of

$$\varepsilon = \frac{\lceil n/k \rceil}{2^m} = \frac{n}{k 2^m} + \Theta\!\left(\frac{1}{2^m}\right), \tag{5}$$

but we know that $k \le n+1$ (the naive protocol for EQ gives this upper bound), giving $\varepsilon = O\!\left(\frac{n}{k 2^m}\right)$. Note that this Reed-Solomon protocol is only defined for $m \le k$. For this protocol, $k = \Theta\!\left(\frac{n}{\varepsilon 2^m}\right)$. Thus, we have a protocol only in the regime

$$m \le \Theta\!\left(\frac{n}{\varepsilon 2^m}\right).$$

Hence, this protocol does not give an upper bound when

$$\log\frac{1}{\varepsilon} \;\le\; \frac{n}{\varepsilon 2^m} \;\le\; m. \tag{6}$$

Example parameters in this regime are $m = \log n$ and $\varepsilon = 1/\log\log n$. The lower bound in this case is $\frac{n}{\varepsilon 2^m} = \log\log n$. We partially address this issue in the next section.
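As an illustration, the protocol of Lemma 38 can be simulated directly. The sketch below is our own toy version, not the exact construction of Section 2.3: the $n$-bit inputs are read as base-$q$ digit strings, i.e. as polynomials over $\mathbb{F}_q$ of degree less than $\lceil n/\log_2 q\rceil$, and the players compare a single evaluation at a shared random point drawn from a set of $2^m$ points. Distinct polynomials agree on fewer than $\lceil n/\log_2 q\rceil$ points, which gives the stated error rate $O\!\left(\frac{n}{k 2^m}\right)$.

```python
import math
import random

def to_base_q(z: int, q: int, length: int) -> list:
    """The base-q digits of z: the coefficients of its polynomial over F_q."""
    digits = []
    for _ in range(length):
        digits.append(z % q)
        z //= q
    return digits

def eval_poly(coeffs, r: int, q: int) -> int:
    """Evaluate the polynomial with the given coefficients at r, over F_q."""
    acc = 0
    for c in reversed(coeffs):  # Horner's rule
        acc = (acc * r + c) % q
    return acc

def rs_eq(x: int, y: int, n: int, q: int, m: int, rng: random.Random) -> bool:
    """One round of the Reed-Solomon EQ protocol (toy sketch of Lemma 38).

    q is a prime with q < 2**k and 2**m <= q (assumed, not checked here).
    Alice sends one field symbol, about k bits; Bob accepts iff it matches
    his own evaluation at the shared random point.
    """
    length = math.ceil(n / math.log2(q))   # polynomial degree < length ~ n/k
    r = rng.randrange(2 ** m)              # shared random evaluation point
    return eval_poly(to_base_q(x, q, length), r, q) == \
           eval_poly(to_base_q(y, q, length), r, q)
```

For instance, with $q = 251$ (the largest prime below $2^8$), $n = 32$, and $m = 7$, distinct inputs are accepted for fewer than $\lceil 32/\log_2 251\rceil = 5$ of the $128$ evaluation points.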

5.2.2 Upper bound with iterated Reed-Solomon

The Reed-Solomon code allows us to perform the following transformation of parameters: $n \to \left(m,\, \varepsilon,\, \frac{n}{\varepsilon 2^m}\right)$. This notation means that when we need to send a message of length $n$, we can use $m$ random bits, introduce an error $\varepsilon$, and reduce the problem to sending a message of size $\frac{n}{\varepsilon 2^m}$. This protocol gives a tight upper bound, but it has the restriction $m \le \frac{n}{\varepsilon 2^m}$. For the range of parameters

$$\log\frac{1}{\varepsilon} \;<\; \frac{n}{\varepsilon 2^m} \;<\; m,$$

we do not have a tight upper bound.

Restating in terms of m and using Fact 6, Reed-Solomon codes give tight upper bounds for the regime of parameters satisfying

$$m \;\le\; \log n + \log\frac{1}{\varepsilon} - \log\!\left(\log n + \log\frac{1}{\varepsilon}\right). \tag{7}$$

Thus, the following range of the lower bound is not matched with a tight upper bound:

$$\log n + \log\frac{1}{\varepsilon} - \log\!\left(\log n + \log\frac{1}{\varepsilon}\right) \;<\; m \;<\; \log n + \log\frac{1}{\varepsilon} - \log\log\frac{1}{\varepsilon}. \tag{8}$$

Our plan is to apply the Reed-Solomon code iteratively. That is, in the first step, we apply the code with some parameters:

$$n \to \left(m_1,\, \varepsilon_1,\, \frac{n}{\varepsilon_1 2^{m_1}}\right),$$

but then instead of sending $\frac{n}{\varepsilon_1 2^{m_1}}$ bits, we apply the Reed-Solomon code again (with some new values of the parameters).

We note that the idea of applying codes iteratively is well known in coding theory under the name of concatenated codes [21]. In particular, it was previously observed, in communication complexity language, that applying Reed-Solomon codes iteratively helps to improve the multiplicative constant and lower-order terms in the communication complexity of the EQ function in the regime of constant error probability and unlimited randomness [12]. However, we need a more detailed analysis.
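To make the iteration concrete, here is a small sketch (our illustration, with hypothetical parameter choices): each round reduces EQ on $n_i$-bit inputs to EQ on the two evaluations, which are roughly $\log_2 q_i$-bit strings; only the output of the final round is actually communicated, and the errors of the rounds add up.

```python
import math
import random

def reduce_eq(x: int, y: int, n: int, q: int, m: int, rng: random.Random):
    """One Reed-Solomon step: reduces EQ on n-bit inputs to EQ on the two
    evaluations (one symbol of F_q each).  If x == y the symbols agree; if
    x != y they differ except with probability < ceil(n/log2(q)) / 2**m."""
    length = math.ceil(n / math.log2(q))
    r = rng.randrange(2 ** m)  # shared random point; assumes 2**m <= q
    def evaluate(z: int) -> int:
        digits = []
        for _ in range(length):      # base-q digits of z = coefficients
            digits.append(z % q)
            z //= q
        acc = 0
        for c in reversed(digits):   # Horner's rule over F_q
            acc = (acc * r + c) % q
        return acc
    return evaluate(x), evaluate(y)

def iterated_rs_eq(x: int, y: int, n: int, steps, rng: random.Random) -> bool:
    """Apply reduce_eq once per (q, m) pair in `steps`, feeding each round's
    symbols in as the next round's inputs; compare only the final pair."""
    for q, m in steps:
        x, y = reduce_eq(x, y, n, q, m, rng)
        n = math.ceil(math.log2(q))  # size of the new EQ instance
    return x == y
```

With, say, `steps = [(65521, 10), (251, 7)]` on $64$-bit inputs, the first round reduces EQ on $64$ bits to EQ on $16$-bit symbols with error below $5/2^{10}$, and the second round finishes with error below $3/2^{7}$; equal inputs are always accepted.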

For a nonnegative real $x$, we introduce the function $f(x) = \log\!\left(\log\frac{1}{\varepsilon} + x\right)$. We now describe the values of parameters that we can cover with $c$ iterations of the Reed-Solomon protocol. The proof of the following lemma can be found in Section A.1.

Lemma 39.

For any integer constant $c$, the communication protocol derived from $c$ iterations of Reed-Solomon has complexity $\frac{n}{\varepsilon^c 2^m}$ and is valid for all

$$m \;\le\; \log n + c\log\frac{1}{\varepsilon} - f^{(c)}(\log n) \tag{9}$$

(up to an additive constant), where $f^{(c)}$ is the function $f$ applied to itself $c$ times.

For example, for constant $\varepsilon$ the lower bound we get is still tight up to a constant factor. It is not hard to see that in this case the function $f^{(c)}(\log n)$ is equal to $\log^{(c+1)} n$ up to an additive constant, and thus the range of parameters for which we do not have an upper bound reduces to

$$\log n - \log^{(c+1)} n \;<\; m \;<\; \log n.$$

For non-constant $\varepsilon$, our upper bound is not tight. To analyze the interval in which it applies, observe that $f(x) = \log\!\left(\log\frac{1}{\varepsilon} + x\right)$ is equal to $\log\log\frac{1}{\varepsilon}$ (up to an additive constant $1$) for $\log\frac{1}{\varepsilon} \ge x$, and is equal to $\log x$ (up to an additive constant $1$) for $\log\frac{1}{\varepsilon} \le x$.

For example, for $\varepsilon = \frac{1}{\log^{(t)} n}$ with constant $t$, we get $\log\frac{1}{\varepsilon} = \log^{(t+1)} n$. After $t+1$ iterations we have $f^{(t+1)}(\log n) = \log\log\frac{1}{\varepsilon}$ (up to an additive constant $1$). Substituting this into (9), we get that this upper bound covers the whole regime (8).

5.3 Case 3

This case applies when $\log\frac{1}{\varepsilon} \ge n$ or $\log\frac{1}{\varepsilon} \ge \frac{n}{\varepsilon 2^m}$. The lower bound in this case is $k \ge \log\frac{1}{\varepsilon}$.

Note that the first of the subcases here corresponds to $\varepsilon \le \frac{1}{2^n}$, in which case the communication complexity bound is $n$. We know that this bound is tight.

The remaining case to consider is $\log\frac{1}{\varepsilon} \ge \frac{n}{\varepsilon 2^m}$, which after rearranging gives $2^m \ge n \cdot \frac{1}{\varepsilon} \cdot \frac{1}{\log\frac{1}{\varepsilon}}$. This is equivalent to

$$m \;\ge\; \log n + \log\frac{1}{\varepsilon} - \log\log\frac{1}{\varepsilon}.$$

As for the upper bound, we use the standard randomized protocol and apply Fact 15. The standard protocol gives us the parameters

$$\left(n\log\frac{1}{\varepsilon},\; \varepsilon,\; \log\frac{1}{\varepsilon}\right).$$

Applying Fact 15 with error parameter δ gives

$$\left(\log n + \log\frac{1}{\varepsilon} + \log\frac{6}{\delta^2},\; \varepsilon(1+\delta),\; \log\frac{1}{\varepsilon}\right).$$

Setting $\delta = \Theta(1)$ yields

$$\left(\log n + \log\frac{1}{\varepsilon} + \Theta(1),\; \Theta(\varepsilon),\; \log\frac{1}{\varepsilon}\right),$$

and this bound achieves the optimal $m$ up to an additive factor in lower-order terms, and the optimal $\varepsilon$ up to a constant multiplicative factor.
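The "standard randomized protocol" here is the usual random-inner-product test; the following sketch (our illustration) shows where the parameters $\left(n\log\frac{1}{\varepsilon}, \varepsilon, \log\frac{1}{\varepsilon}\right)$ come from: $t = \lceil\log_2\frac{1}{\varepsilon}\rceil$ independent tests, each consuming $n$ public random bits and one bit of communication, and each catching $x \ne y$ with probability exactly $\frac{1}{2}$.

```python
import math
import random

def standard_eq(x: int, y: int, n: int, eps: float, rng: random.Random) -> bool:
    """Standard public-coin EQ protocol: t = ceil(log2(1/eps)) independent
    random parity tests.  Uses m = t*n public random bits and t bits of
    communication.  For x != y, each test draws a uniform n-bit string r,
    and the parities <x,r> and <y,r> over GF(2) differ with probability
    exactly 1/2 (since x XOR y != 0), so the error is at most 2**(-t) <= eps."""
    t = math.ceil(math.log2(1 / eps))
    for _ in range(t):
        r = rng.getrandbits(n)  # n fresh public random bits per test
        # inner products over GF(2): popcount of the AND, mod 2
        if bin(x & r).count("1") % 2 != bin(y & r).count("1") % 2:
            return False  # a differing parity bit certifies x != y
    return True
```

Sharing the $t \cdot n$ random bits explicitly is what Remark 40 below refers to, after Fact 15 first compresses the randomness to $\log n + \log\frac{1}{\varepsilon} + \Theta(1)$ bits.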

 Remark 40.

We observe that this public-coin protocol implies a private-coin protocol, in which the randomness is simply shared at the start of the protocol at the cost of $\log n + \log\frac{1}{\varepsilon} + \Theta(1)$ bits of communication. The resulting private-coin protocol has $\Theta(\varepsilon)$ error and uses $\log\frac{n}{\varepsilon^2} + \Theta(1)$ bits of communication, essentially matching the lower bound of $\log\frac{n}{\varepsilon^2} - \log\log\frac{1}{\varepsilon}$ observed in [13]. This indicates that in this range of parameters, the optimal private-coin protocol is to simply share the private randomness and then simulate the standard public-coin protocol.

6 Discussion & Future Directions

We identify an equivalence between randomized public-coin protocols for equality and error-correcting codes, and generalize one direction of this equivalence to functions with large fooling sets. We use this connection to study the randomized communication complexity of EQ for various regimes of the error parameter $\varepsilon$ and the randomness parameter $m$. We derive lower bounds for protocols from lower bounds for codes, and analyze when these lower bounds are tight. An immediate question is whether we can tighten our bounds in the regime in which they are not tight.

One way to view our result on fooling sets is that the maximal separation between the fooling set size and randomized communication complexity with public randomness is achieved by the equality function (for all regimes of parameters). A natural future direction is to explore whether similar separations can be found between randomized communication complexity and other complexity measures, such as deterministic communication complexity, rectangle size, or rank. Partial progress in this direction was obtained by Canetti and Goldreich [4].

References

  • [1] Matti J. Aaltonen. Linear programming bounds for tree codes (corresp.). IEEE Trans. Inf. Theory, 25(1):85–90, 1979. doi:10.1109/TIT.1979.1056004.
  • [2] Marshall Ball, Oded Goldreich, and Tal Malkin. Communication complexity with defective randomness. In Valentine Kabanets, editor, 36th Computational Complexity Conference, CCC 2021, July 20-23, 2021, Toronto, Ontario, Canada (Virtual Conference), volume 200 of LIPIcs, pages 14:1–14:10. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.CCC.2021.14.
  • [3] Jop Briët, Harry Buhrman, Debbie W. Leung, Teresa Piovesan, and Florian Speelman. Round elimination in exact communication complexity. In Salman Beigi and Robert König, editors, 10th Conference on the Theory of Quantum Computation, Communication and Cryptography, TQC 2015, May 20-22, 2015, Brussels, Belgium, volume 44 of LIPIcs, pages 206–225. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPICS.TQC.2015.206.
  • [4] Ran Canetti and Oded Goldreich. Bounds on tradeoffs between randomness and communication complexity. Comput. Complex., 3:141–167, 1993. doi:10.1007/BF01200118.
  • [5] Arkadev Chattopadhyay, Shachar Lovett, and Marc Vinyals. Equality alone does not simulate randomness. In Amir Shpilka, editor, 34th Computational Complexity Conference, CCC 2019, July 18-20, 2019, New Brunswick, NJ, USA, volume 137 of LIPIcs, pages 14:1–14:11. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.CCC.2019.14.
  • [6] Susanna F. de Rezende, Or Meir, Jakob Nordström, Toniann Pitassi, Robert Robere, and Marc Vinyals. Lifting with simple gadgets and applications to circuit and proof complexity. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 24–30. IEEE, 2020. doi:10.1109/FOCS46700.2020.00011.
  • [7] Tomás Feder, Eyal Kushilevitz, Moni Naor, and Noam Nisan. Amortized communication complexity. SIAM J. Comput., 24(4):736–750, 1995. doi:10.1137/S0097539792235864.
  • [8] Rudolf Fleischer, Hermann Jung, and Kurt Mehlhorn. A communication-randomness tradeoff for two-processor systems. Inf. Comput., 116(2):155–161, 1995. doi:10.1006/INCO.1995.1011.
  • [9] Dale Jacobs, John Jeang, Vladimir Podolskii, Morgan Prior, and Ilya Volkovich. Communication complexity of equality and error correcting codes. Technical Report TR25-068, Electronic Colloquium on Computational Complexity (ECCC), 2025. URL: https://eccc.weizmann.ac.il/report/2025/068/.
  • [10] Stasys Jukna. Boolean Function Complexity - Advances and Frontiers, volume 27 of Algorithms and combinatorics. Springer, 2012. doi:10.1007/978-3-642-24508-4.
  • [11] Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, 1997.
  • [12] Olivier Lalonde. Entanglement-assisted communication complexity and nonlocal games. Master’s Thesis, Université de Montréal, 2023. URL: https://umontreal.scholaris.ca/items/d507c06f-33dc-49f5-ab91-31c2bd60c6bc.
  • [13] Olivier Lalonde, Nikhil S Mande, and Ronald de Wolf. Tight bounds for the randomized and quantum communication complexities of equality with small error. arXiv preprint arXiv:2107.11806, 2021.
  • [14] Bruno Loff and Sagnik Mukhopadhyay. Lifting theorems for equality. In Rolf Niedermeier and Christophe Paul, editors, 36th International Symposium on Theoretical Aspects of Computer Science, STACS 2019, March 13-16, 2019, Berlin, Germany, volume 126 of LIPIcs, pages 50:1–50:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.STACS.2019.50.
  • [15] Shay Moran, Makrand Sinha, and Amir Yehudayoff. Fooling pairs in randomized communication complexity. In Jukka Suomela, editor, Structural Information and Communication Complexity - 23rd International Colloquium, SIROCCO 2016, Helsinki, Finland, July 19-21, 2016, Revised Selected Papers, volume 9988 of Lecture Notes in Computer Science, pages 49–59, 2016. doi:10.1007/978-3-319-48314-6_4.
  • [16] Ilan Newman. Private vs. common random bits in communication complexity. Information Processing Letters, 39(2):67–71, 1991. doi:10.1016/0020-0190(91)90157-D.
  • [17] Noam Nisan. The communication complexity of threshold gates. Combinatorics, Paul Erdős Is Eighty, pages 301–315, 1993.
  • [18] Toniann Pitassi, Morgan Shirley, and Adi Shraibman. The strength of equality oracles in communication. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 89:1–89:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITCS.2023.89.
  • [19] Anup Rao and Amir Yehudayoff. Communication Complexity: and Applications. Cambridge University Press, 2020.
  • [20] Alexander A. Razborov. Communication Complexity, pages 97–117. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011. doi:10.1007/978-3-642-19533-4_8.
  • [21] Ron M. Roth. Introduction to coding theory. Cambridge University Press, 2006.
  • [22] Tim Roughgarden. Communication complexity (for algorithm designers). Found. Trends Theor. Comput. Sci., 11(3-4):217–404, 2016. doi:10.1561/0400000076.
  • [23] Andrew Chi-Chih Yao. Some complexity questions related to distributive computing (preliminary report). In Michael J. Fischer, Richard A. DeMillo, Nancy A. Lynch, Walter A. Burkhard, and Alfred V. Aho, editors, Proceedings of the 11th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1979, Atlanta, Georgia, USA, pages 209–213. ACM, 1979. doi:10.1145/800135.804414.

Appendix A Appendix

A.1 Proof of Lemma 39

We start by illustrating the case of two iterations. In the first step, we apply the Reed-Solomon code with some parameters:

$$n \to \left(m_1,\, \varepsilon_1,\, \frac{n}{\varepsilon_1 2^{m_1}}\right),$$

and then instead of sending $\frac{n}{\varepsilon_1 2^{m_1}}$ bits, we apply the Reed-Solomon code (with new values of the parameters) to this message again:

$$n_2 = \frac{n}{\varepsilon_1 2^{m_1}} \to \left(m_2,\, \varepsilon_2,\, \frac{n_2}{\varepsilon_2 2^{m_2}}\right).$$

Notice that in both iterations we use random bits, but we only communicate according to the result of the final iteration. The resulting procedure for two iterations gives us the following parameters

$$n \to \left(m = m_1 + m_2,\; \varepsilon \le \varepsilon_1 + \varepsilon_2,\; \frac{n}{\varepsilon_1 \varepsilon_2 2^{m_1 + m_2}}\right),$$

and has the following restrictions

$$m_1 \le \frac{n}{\varepsilon_1 2^{m_1}}, \qquad m_2 \le \frac{n}{\varepsilon_1 \varepsilon_2 2^{m_1 + m_2}}.$$

Observe that we are interested in $\varepsilon$ and the communication complexity only up to a constant factor. Further, notice that the errors only add from one iteration to the next. Finally, allowing a higher error rate in a given iteration never increases the amount of randomness or communication required. Therefore, we can assume that in each iteration we allow the same error $\varepsilon$; otherwise, we could simply increase the allowed error in the iteration with the smaller error rate.

With this assumption, on an arbitrary iteration i we will have parameters

$$\left(m_i,\; \varepsilon,\; \frac{n}{\varepsilon^i 2^{m_1 + \dots + m_i}}\right)$$

with the restriction

$$m_i \;\le\; \frac{n}{\varepsilon^i 2^{m_1 + \dots + m_i}}. \tag{10}$$

The total values of parameters after c iterations (we assume that c is a constant) are

$$n \to \left(m = m_1 + \dots + m_c,\; \varepsilon,\; \frac{n}{\varepsilon^c 2^{m_1 + \dots + m_c}}\right).$$

Recall that our protocol using a single iteration of Reed-Solomon works only for $m < \frac{n}{\varepsilon 2^m}$. Thus our goal is to develop a protocol that is valid for larger values of $m$. Notice that only the final restriction directly bounds the total number of random bits used in the protocol:

$$m_c \;\le\; \frac{n}{\varepsilon^c 2^{m_1 + \dots + m_c}} = \frac{n}{\varepsilon^c 2^m}. \tag{11}$$

Rearranging this expression, we get

$$2^m \;\le\; \frac{n}{\varepsilon^c m_c}.$$

We can see that minimizing the randomness used in the last iteration, $m_c$, while keeping $m$ the same, maximizes the range in which this protocol is valid. Naturally, to minimize $m_c$ we need to maximize the randomness used in all previous iterations. Thus, to optimize the parameters, we need to fix

$$m_i = \frac{n}{\varepsilon^i 2^{m_1 + \dots + m_i}}$$

on intermediate iterations.

We prove the following stronger version of Lemma 39.

Lemma 41.

Recall that $f(x) = \log\!\left(\log\frac{1}{\varepsilon} + x\right)$. For any integer constant $c$, the communication protocol derived from $c$ iterations of Reed-Solomon has complexity $\frac{n}{\varepsilon^c 2^m}$ and is valid for all

$$m \;\le\; \log n + c\log\frac{1}{\varepsilon} - f^{(c)}(\log n) \tag{12}$$

(up to an additive constant).

Moreover, if all restrictions (10) are saturated, the inequality (12) is saturated as well.

Proof.

We prove this by induction on c. The base case corresponds to the standard Reed-Solomon protocol, which we already covered.

For the induction step, we do one extra step of the iteration and obtain the restriction

$$m_{c+1} \;\le\; \frac{n}{\varepsilon^{c+1} 2^{m_1 + \dots + m_{c+1}}}.$$

Since for the previous iterations we need to maximize $m_1, \dots, m_c$, we fix them so that all restrictions (10) are saturated. By the inductive hypothesis, (12) is then saturated as well. Substituting the value of $m_1 + \dots + m_c$ from this equality into the inequality above yields

$$m_{c+1} \;\le\; \frac{2^{f^{(c)}(\log n)}}{\varepsilon\, 2^{m_{c+1}}}.$$

Rearranging this inequality and using Fact 6, we obtain

$$m_{c+1} \;\le\; \log\frac{1}{\varepsilon} + f^{(c)}(\log n) - \log\!\left(\log\frac{1}{\varepsilon} + f^{(c)}(\log n)\right) = \log\frac{1}{\varepsilon} + f^{(c)}(\log n) - f^{(c+1)}(\log n).$$

Adding this to the expression for $m_1 + \dots + m_c$ gives the desired inequality.
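Explicitly, combining the saturated form of (12) for $c$ iterations with the bound on $m_{c+1}$:

```latex
m \;=\; \sum_{i=1}^{c+1} m_i
\;\le\; \underbrace{\Bigl(\log n + c\log\tfrac{1}{\varepsilon} - f^{(c)}(\log n)\Bigr)}_{m_1 + \dots + m_c}
\;+\; \underbrace{\Bigl(\log\tfrac{1}{\varepsilon} + f^{(c)}(\log n) - f^{(c+1)}(\log n)\Bigr)}_{m_{c+1}}
\;=\; \log n + (c+1)\log\tfrac{1}{\varepsilon} - f^{(c+1)}(\log n).
```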

Moreover, it is easy to see that if (10) is saturated for $i = c+1$ as well, the resulting inequality is saturated.