Optimal Short-Circuit Resilient Formulas

We consider fault-tolerant boolean formulas in which the output of a faulty gate is short-circuited to one of the gate's inputs. A recent result by Kalai et al. (FOCS 2012) converts any boolean formula into a resilient formula of polynomial size that works correctly if less than a fraction $1/6$ of the gates (on every input-to-output path) are faulty. We improve the result of Kalai et al., and show how to efficiently fortify any boolean formula against a fraction $1/5$ of short-circuit gates per path, with only a polynomial blowup in size. We additionally show that it is impossible to obtain formulas with higher resilience and sub-exponential growth in size. Towards our results, we consider interactive coding schemes when noiseless feedback is present; these produce resilient boolean formulas via a Karchmer-Wigderson relation. We develop a coding scheme that resists up to a fraction $1/5$ of corrupted transmissions in each direction of the interactive channel. We further show that such a level of noise is maximal for coding schemes with sub-exponential blowup in communication. Our coding scheme takes a surprising inspiration from Blockchain technology.

computes the same function that $F$ computes, as long as at most a fraction of $1/6 - \varepsilon$ of the gates in any input-to-output path in $F'$ suffer from short-circuit noise. Kalai et al. explicitly leave open the question of finding the optimal fraction of faulty gates for a resilient formula $F'$. In this work we show that a fraction of $1/5$ is a tight bound on the tolerable fraction of faulty gates per input-to-output path, subject to the condition that the increase in the size of the formula is sub-exponential. Namely, we show how to convert any formula to a resilient version that tolerates up to a fraction of $1/5 - \varepsilon$ of short-circuited gates per path.
Theorem 1.1 (Main, informal). For any $\varepsilon > 0$, any formula $F$ can be efficiently converted into a formula $F'$ of size $|F'| = \mathrm{poly}_\varepsilon(|F|)$ that computes the same function as $F$ even when up to a $1/5 - \varepsilon$ fraction of the gates in any of its input-to-output paths are short-circuited.
We also show that our bound is tight. Namely, for an arbitrary formula $F$, it is impossible to make a resilient version (of sub-exponential size in $|F|$) that tolerates a fraction $1/5$ (or more) of short-circuited gates per path.
Theorem 1.2 (Converse). There exists a formula $F$ for computing some function $f$, such that no formula $F'$ of size $|F'| = o(\exp(|F|))$ that computes $f$ is resilient to a fraction of $1/5$ of short-circuit noise in any of its input-to-output paths.
Similar to the work of Kalai et al. [KLR12], a major ingredient in our result is a transformation, known as the Karchmer-Wigderson transformation (hereinafter, the KW-transformation) [KW90], between a formula that computes a boolean function f , and a two-party interactive communication protocol for a task related to f , which we denote the KW-game for f , or KW f for short. Similarly, a reverse KW-transformation converts protocols back to formulas; see below and Section 6.1 for more details on the KW-transformation. The work of Kalai et al. adapts the KW-transformation to a noisy setting in which the formula may suffer from short-circuit noise, and the protocol may suffer from channel noise. The "attack plan" in [KLR12] for making a given formula F resilient to short-circuit noise is (i) apply the KW-transformation to obtain an interactive protocol π; (ii) convert π to a noise-resilient protocol π ′ that tolerates up to a δ-fraction of noise; (iii) apply the (reverse) KW-transformation on π ′ to obtain a formula F ′ . The analysis of [KLR12] shows that the obtained F ′ is resilient to a δ/2 fraction of noise in any of its input-to-output paths.
The interactive protocols $\pi, \pi'$ are defined in a setting where the parties have access to a noiseless feedback channel: the sender learns whether or not its transmission arrived correctly at the other side. Building upon recent progress in the field of coding for interactive protocols (see, e.g., [Gel17]), Kalai et al. [KLR12] constructed a coding scheme for interactive protocols (with noiseless feedback) that features resilience of $\delta = 1/3 - \varepsilon$ for any $\varepsilon > 0$; this gives their result. Note that a resilience of $\delta = 1/3$ is maximal for interactive protocols in that setting [EGH16], which implies that new techniques must be introduced in order to improve the result by [KLR12].
The loss in resilience witnessed in step (iii) stems from the fact that short-circuit noise affects formulas in a "one-sided" manner: a short-circuit of an AND gate can only turn the output from 0 to 1, while a short-circuit in an OR gate can only turn the output from 1 to 0. The noisy AND gates are thus decoupled from the noisy OR gates: if the output of the circuit is 0, any amount of short-circuited OR gates will keep the output 0, while if the output is 1, any amount of short-circuited AND gates will keep the output 1 (see Lemma 6.3). Informally speaking, this decoupling reduces by half the resilience of circuits generated by the KW-transformation. Assume the formula $F'$ obtained from the above process is resilient to a $\delta'$-fraction of noise. Then $F'$ is correct if on a specific input-to-output path (a) at most a $\delta'$-fraction of the AND gates are short-circuited, but also if (b) at most a $\delta'$-fraction of the OR gates are short-circuited. Since the noise is decoupled, from (a) and (b) we get that $F'$ outputs the correct value even when a $2\delta'$-fraction of the gates on that input-to-output path are noisy. Yet, the resilience of $F'$ originates from the resilience of $\pi'$ (step (iii) above). The KW-transformation limits the resilience of $F'$ by the resilience of $\pi'$, i.e., $2\delta' \le \delta$, leading to a factor 2 loss.
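The one-sidedness of short-circuit noise can be checked exhaustively on a single gate. The following is a minimal sketch, assuming two-input gates and a hypothetical helper `gate` whose `short` argument selects which input a faulty gate forwards; these names are illustrative and do not come from the paper.

```python
from itertools import product

def gate(op, a, b, short=None):
    """Evaluate a 2-input gate. If `short` is 0 or 1, the gate is
    short-circuited and simply forwards that input instead of computing op."""
    if short is not None:
        return (a, b)[short]
    return a & b if op == "AND" else a | b

def possible_flips(op):
    """Collect every (correct output, noisy output) pair that differs."""
    flips = set()
    for a, b, s in product((0, 1), repeat=3):
        good, bad = gate(op, a, b), gate(op, a, b, short=s)
        if good != bad:
            flips.add((good, bad))
    return flips

and_flips = possible_flips("AND")  # a faulty AND can only turn 0 into 1
or_flips = possible_flips("OR")    # a faulty OR can only turn 1 into 0
```

Enumerating all input pairs and both short-circuit choices shows that each gate type can err in one direction only, which is the decoupling used in the argument above.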
We revisit the above line of thought and make a more careful noise analysis. Instead of bounding the total fraction of noise by some δ, we consider the case where the noise from Alice to Bob is bounded by some α while the noise in the other direction is bounded by some β. A similar approach used by Braverman and Efremenko [BE17] yields interactive protocols (without noiseless feedback) with maximal resilience. In more detail, assume that the protocol π communicates n symbols overall. We define an (α, β)-corruption as any noise that corrupts up to αn symbols sent by Alice and up to βn symbols sent by Bob. We emphasize that the noise fraction on Alice's transmissions is higher than α, since Alice speaks less than n symbols overall; the global noise fraction in this case is α + β.
This distinction may be delicate but is instrumental. The KW-transformation translates a protocol of length n that is resilient to (α, β)-corruptions into a formula which is resilient to up to αn short-circuited AND gates in addition to up to βn short-circuited OR gates. When α = β the obtained formula is resilient to up to an α-fraction of short-circuited gates in any input-to-output path, avoiding the factor 2 loss in resilience.

Technique overview
Achievability: Coding schemes for noisy channels with noiseless feedback. We obtain resilient formulas by employing the approach of [KLR12] described above. In order to increase the noise resilience to its optimal level, we develop a novel coding scheme which is resilient to $(1/5 - \varepsilon, 1/5 - \varepsilon)$-corruptions, assuming noiseless feedback.
The mechanism of our coding scheme resembles, in a sense, the Blockchain technology [Nak08]. Given a protocol π 0 that assumes reliable channels, the parties simulate π 0 message by message. These messages may arrive at the other side correctly or not; however, a noiseless feedback channel allows each party to learn which of its messages made it through. With this knowledge, the party tries to create a "chain" of correct messages. Each message contains a pointer to the last message that was not corrupted by the channel. As time goes by, the chain grows and grows, and indicates the entire correct communication of that party. An appealing feature of this mechanism is the fact that whenever a transmission arrives correctly at the other side, the receiver learns all the correct transmissions so far. On the other hand, the receiver never knows whether a single received transmission (and the chain linked to it) is indeed correct.
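The chaining idea can be sketched in a few lines. This is a simplified illustration, not the paper's coding scheme: it assumes a single sender whose payloads are fixed in advance, a large alphabet in which a pointer is a round index, and a hypothetical `forge` callback standing in for the adversary.

```python
def send_all(payloads, corrupted, forge):
    """Simulate one party's transmissions.
    payloads:  symbols the party wants to deliver, one per round
    corrupted: set of round indices hit by the adversary
    forge:     round -> (payload, pointer) chosen by the adversary
    Returns the (payload, pointer) pairs the receiver sees."""
    received, last_good = [], None
    for t, payload in enumerate(payloads):
        if t in corrupted:
            # Noiseless feedback tells the sender this round was corrupted,
            # so last_good is not advanced.
            received.append(forge(t))
        else:
            # Point back to the last round that arrived uncorrupted.
            received.append((payload, last_good))
            last_good = t
    return received

def follow_chain(received, start):
    """Follow pointers backwards from round `start`; return the chain
    as (round, payload) pairs in chronological order."""
    chain, t = [], start
    while t is not None:
        payload, pointer = received[t]
        chain.append((t, payload))
        if pointer is not None and not (0 <= pointer < t):
            break  # a forged pointer that does not point backwards
        t = pointer
    return chain[::-1]

received = send_all(list("abcdef"), {1, 4}, lambda t: ("X", None))
good_chain = follow_chain(received, 5)   # start from an uncorrupted round
bad_chain = follow_chain(received, 4)    # start from a corrupted round
```

Starting from any uncorrupted round, the receiver recovers exactly the sender's uncorrupted transmissions so far. Starting from a corrupted round yields whatever chain the adversary chose, and the receiver cannot tell the two cases apart from a single message.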
The adversarial noise may corrupt up to $(1/5 - \varepsilon)n$ of the messages sent by each party. We think of the adversary as one trying to construct a different, corrupt, chain. Due to its limited budget, at the end of the coding scheme one of two things may happen. Either the correct chain is the longest, or the longest chain contains in its prefix a sufficient amount of uncorrupted transmissions.
Indeed, if the adversary tries to create its own chain, its length is bounded by $(1/5 - \varepsilon)n$, while the correct chain is of length $2n/5$ at the least.³ On the other hand, the adversary can create a longer chain which forks off the correct chain. As a simple example, consider the case where a party sends $\approx 2n/5$ messages which go through uncorrupted. Now, the adversary starts corrupting the transmissions and extends the correct chain with $(1/5 - \varepsilon)n$ corrupt messages.⁴ The corrupt forked chain is of length $2n/5 + (1/5 - \varepsilon)n$ and may be longer than the correct chain. However, in this case, the information contained in the uncorrupted prefix of the corrupt forked chain is sufficient to simulate the entire transcript of $\pi_0$.
Another essential part of our coding scheme is its ability to alter the order of speaking according to the observed noise.⁵ Most previous work follows the following intuition. If a party's transmissions were corrupted, then the information contained in these transmissions still needs to reach the other side. Therefore, the coding scheme should allow that party to speak more times. In this work we take the opposite approach: the more a party is corrupted in the first part of the protocol, the less it speaks in the later part. The intuition here is that if the adversary has already wasted its budget on some party, it cannot corrupt much of the subsequent transmissions of that party. A similar approach appears in [AGS16].

³ The order of speaking in the coding scheme depends on the noise. Therefore, it is not necessary that a party speaks half of the times; see discussion below.
⁴ This attack assumes that there are n/5 additional rounds where the same party speaks. This assumption is usually false and serves only for this intuitive (yet unrealistic) example.
⁵ Protocols that change their length or order of speaking as a function of the observed noise are called adaptive [GHS14,AGS16]. Since these decisions are noise-dependent, the parties may disagree on the identity of the speaker in each round, e.g., both parties may decide to speak in a given round, etc. We emphasize that due to the noiseless feedback there is always a consensus regarding whose turn it is to speak next. Hence, while our scheme has a non-predetermined order of speaking, the scheme is non-adaptive by the terminology of [EGH16]; see discussion in [EGH16] and in Section 6 of [Gel17].
One hurdle we face in constructing our coding scheme derives from the need to communicate pointers to previous messages using a small (constant-size) alphabet. Towards this end, we first show a coding scheme that works with a large alphabet that is capable of pointing back to any previous transmission. Next, we employ a variable-length code, replacing each pointer with a large number of messages over a constant-size alphabet. We prove that this coding does not harm the resilience, leading to a coding scheme with a constant-size alphabet and optimal resilience to $(1/5 - \varepsilon, 1/5 - \varepsilon)$-corruptions.
Converse: Impossibility Bound. The converse proof consists of two parts. First, we show that for certain functions, any protocol resilient to $(1/5, 1/5)$-corruptions must have an exponential blowup in the communication. In the second part, we show a (noisy) KW-transformation from formulas to protocols. Together, we obtain an upper bound on the noise of formulas. Indeed, assuming that there is a "shallow" formula that is resilient to $(1/5, 1/5)$-corruptions, converting it into a protocol yields a "short" protocol with resilience to $(1/5, 1/5)$-corruptions. The existence of such a protocol contradicts the bound of the first part. The bound on the resilience of protocols follows a natural technique of confusing a party between two possible inputs. We demonstrate that a $(1/5, 1/5)$-corruption suffices in making one party (say, Alice) observe exactly the same transcript whether Bob holds $y$ or $y'$. Choosing $x, y, y'$ such that the output of the protocol differs between $(x, y)$ and $(x, y')$ leads to Alice erring on at least one of the two instances.
This idea does not work if the protocol is allowed to communicate a lot of information. To illustrate this point, assume $f : \Sigma^n \times \Sigma^n \to \Sigma^z$ defined over a channel with alphabet $\Sigma$. Consider a protocol where the parties send their inputs to the other side encoded via a standard Shannon error-correcting code of length $n' = O(n)$ symbols, with distance $1 - \varepsilon$ for some small constant $\varepsilon > 0$. The protocol communicates $2n'$ symbols overall, and a valid $(1/5, 1/5)$-corruption may corrupt up to $2n'/5$ symbols of each one of the codewords. However, this does not suffice to invalidate the decoding of either of the codewords, since an error-correcting code with distance $\approx 1$ is capable of correcting up to $\approx n'/2$ corrupted symbols. On the other hand, once we limit the communication of the protocol, even moderately, to around $n$ symbols, the above encoding is not applicable anymore. Quite informally, our lower bound follows the intuition described below. We show the existence of a function $f$ such that for any protocol that computes $f$ in $r$ rounds (where $r$ is restricted as mentioned above), the following properties hold for one of the parties (stated below, without loss of generality, for Alice). There are inputs $x, x', y, y'$ such that (1) $f(x, y) \ne f(x', y) \ne f(x', y')$ and (2) Alice speaks at most $r/5$ times during the first $2r/5$ rounds. Further, (3) when Alice holds $x$, the protocol communicates exactly the same messages during its first $2r/5$ rounds, whether Bob holds $y$ or $y'$ (assuming no channel noise is present).
When we bound the protocol to these conditions, a $(1/5, 1/5)$-corruption is strong enough to make the transcript identical from Alice's point of view on $(x', y)$ and $(x', y')$, implying the protocol cannot be resilient to such an attack. In more detail, we now describe an attack and assume Bob speaks at most $2r/5$ times beyond round number $2r/5$, given the attack. (If Bob speaks more, then an equivalent attack will be able to confuse Bob rather than Alice.) The attack changes the first $2r/5$ rounds as if Alice holds $x$ rather than $x'$; this amounts to corrupting at most $r/5$ transmissions by Alice due to property (2). Bob behaves the same regardless of his input due to property (3). From round $2r/5$ and beyond, the attack corrupts Bob's messages so that the next $r/5$ symbols Bob sends are consistent with $y$ and the following $r/5$ symbols Bob communicates are consistent with $y'$. Since Bob speaks less than $2r/5$ times (given the above noise), the attack corrupts at most $r/5$ of Bob's transmissions after round $2r/5$.

Unfortunately, while the above shows that some functions $f$ cannot be computed in a resilient manner, this argument cannot be applied towards a lower bound on resilient formulas. The reason is that the $KW_f$ task is not a function, but rather a relation: multiple outputs may be valid for a single input. The attack on protocols described earlier shows that a $(1/5, 1/5)$-corruption drives the protocol to produce a different output from that in the noiseless instance. However, it is possible that a resilient protocol gives a different but correct output. Therefore, we need to extend the above argument so it applies to computations of arbitrary relations. Specifically, we consider the parity function on $n$ bits and its related KW-game. We show the existence of inputs that satisfy conditions (2) and (3) above, while requiring that the outputs of different inputs be disjoint; i.e., any possible output of $(x', y)$ is invalid for $(x, y)$ and for $(x', y')$.
The last part of the converse proof requires developing a KW-transformation from formulas to protocols, in a noise-resilience preserving manner. Let us begin with some background on the (standard) KW-transformation (see Section 6.1 for a formal description). The KW-game (or rather a slight adaptation we need for our purposes) is as follows. For a boolean function $f$ on $\{0, 1\}^n$, Alice gets an input $x$ such that $f(x) = 0$ and Bob gets an input $y$ such that $f(y) = 1$; their goal is to output a literal function $\ell(z)$ (i.e., one of the $2n$ functions of the form $\ell(z) = z_i$ or $\ell(z) = \neg z_i$) such that $\ell(x) = 0$ and $\ell(y) = 1$.
Let F be a boolean formula for f , consisting of ∨ and ∧ gates, and where all the negations are pushed to the input layer (i.e., F is a monotone formula of the literals z i , ¬z i ). The conversion of F to a protocol π for the KW f game is as follows. View the formula as the protocol tree, with the literals at the bottom of the tree being the output literal function. Assign each ∧-node to Alice, and each ∨-node to Bob.
The invariant maintained throughout the execution of the protocol is that if the protocol reaches a node v, then the value of v in F is 0 when evaluated on x, and 1 when evaluated on y. This invariant holds for the output gate of the formula, which is where the communication protocol begins. Next, each time that the protocol is at node v and it is Alice's turn to speak (thus v is an ∧-gate in F ), Alice sends the identity of a child which evaluates to 0 on x. Note that assuming the invariant holds for v, Alice can send the identity of such a child (since at least one of the inputs to an AND gate that outputs a 0 also evaluates to 0), while this child must evaluate to 1 on y assuming v evaluates to 1 on y. By maintaining this invariant, Alice and Bob arrive at the bottom, where they reach a literal evaluating to 0 on x and 1 on y. Note that there is some room for arbitrary decision making: if more than one child of v evaluates to 0 on x, Alice is free to choose any such child; the protocol will be valid for any such choice.
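The noiseless conversion from a formula to a KW-game protocol can be sketched as a walk down the formula. The encoding below is an assumption made for illustration: formulas are nested tuples `("and", left, right)`, `("or", left, right)`, or `("lit", i, neg)`, and Bob's move at an ∨-gate (choosing a child that evaluates to 1 on y) is the symmetric counterpart of Alice's move described above.

```python
def evaluate(node, z):
    """Evaluate a formula given as nested tuples on the bit tuple z."""
    if node[0] == "lit":
        _, i, neg = node
        return z[i] ^ neg          # neg = 1 encodes the literal "not z_i"
    _, left, right = node
    l, r = evaluate(left, z), evaluate(right, z)
    return l & r if node[0] == "and" else l | r

def kw_protocol(formula, x, y):
    """Walk from the output gate to a literal, keeping the invariant that
    the current node evaluates to 0 on x and to 1 on y."""
    node, transcript = formula, []
    assert evaluate(node, x) == 0 and evaluate(node, y) == 1
    while node[0] != "lit":
        children = node[1:]
        if node[0] == "and":   # Alice speaks: any child that is 0 on x
            speaker = "Alice"
            choice = next(i for i, c in enumerate(children) if evaluate(c, x) == 0)
        else:                  # Bob speaks: any child that is 1 on y
            speaker = "Bob"
            choice = next(i for i, c in enumerate(children) if evaluate(c, y) == 1)
        transcript.append((speaker, choice))
        node = children[choice]
    return node, transcript

# Example: parity of two bits, (z0 and not z1) or (not z0 and z1).
xor2 = ("or",
        ("and", ("lit", 0, 0), ("lit", 1, 1)),
        ("and", ("lit", 0, 1), ("lit", 1, 0)))
x, y = (0, 0), (1, 0)              # f(x) = 0, f(y) = 1
leaf, transcript = kw_protocol(xor2, x, y)
```

Here `next(...)` picks the first valid child, which is one of the arbitrary choices the text allows; any other valid child would also keep the invariant.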
In this work we extend the above standard KW-transformation to the noisy regime. Namely, we wish to convert a resilient formula into an interactive protocol π while keeping the protocol resilient to a similar level of channel noise. We note that the extension we need is completely different from what is found in previous uses of the KW-transformation. Indeed, for the achievability bound, a KW-transformation is used in both steps (i) and (iii) in the above outline of [KLR12]. However, the instance used in step (i) assumes there is no noise, while the instance in step (iii) works in the other direction, i.e., it transforms (resilient) protocols to (resilient) formulas.
Similar to the standard transformation, our noisy KW-transformation starts by constructing a protocol tree based on the formula's structure, where every ∧-gate is assigned to Alice and every ∨-gate to Bob. The main difference is in the decision making of how to proceed when reaching a node v. The goal is to keep the invariant that the gate v in F evaluates to 0 on x and to 1 on y, even when noise is present.
When only one of v's descendants evaluates to 0 on x in F , Alice has no choice but to choose that child. However, when more than a single descendant evaluates to 0 on x, Alice's decision is less obvious. Moreover, this decision may affect the resilience of the protocol: it is possible that noise causes one of the descendants to evaluate to 1 on that given x.
We observe, however, that one of v's children evaluates to 0 on x given all the noise patterns F is resilient against. The other children may still evaluate to 1 sometimes, as a function of the specific noise. Once we identify this special child that always evaluates to 0, Alice can safely choose it and maintain the invariant (and the correctness of the protocol), regardless of future noise. In more detail, we prove that if such a special child did not exist and all descendants could evaluate to both 0 and 1 as a function of the noise, then we could construct a noise pattern E * that would make all descendants evaluate to 1 on x simultaneously. Hence, assuming the noise is E * , the node v would evaluate to 1 on x, and consequently F (x) = 1. At the same time, we show that F is resilient to the noise E * , so F (x) = 0 assuming the noise is E * , and we reach a contradiction.

Other related work
The field of interactive coding schemes [Gel17] started with the seminal line of work by Schulman [Sch92,RS94,Sch96]. Commonly, the goal is to compile interactive protocols into a noise-resilient version that has (1) good noise resilience; (2) a good rate; and (3) high probability of success. Computational efficiency is another desired goal. Numerous works achieve these goals, either fully or partially [BR14, GMS14, BKN14, FGOS15, BE17, GH14, KR13, Hae14, GHK+18], where the exact parameters depend on the communication and noise model.
Most related to this work are coding schemes in the setting where a noiseless feedback channel is present. Pankratov [Pan13] gave the first interactive coding scheme that assumes noiseless feedback. The scheme of [Pan13] aims to maximize its rate, assuming all communication passes over a binary symmetric channel (BSC) with flipping parameter $\varepsilon$ (i.e., a channel that communicates bits, where every bit is flipped with probability $\varepsilon$, independently of other bits). Pankratov's scheme achieves a rate of $1 - O(\sqrt{\varepsilon})$ when $\varepsilon \to 0$.
Gelles and Haeupler [GH17] improved the rate in that setting to $1 - O(\varepsilon \log 1/\varepsilon)$, which is the current state of the art. For the regime of high noise, Efremenko, Gelles, and Haeupler [EGH16] provided coding schemes with maximal noise resilience, assuming noiseless feedback. They showed that the maximal resilience depends on the channel's alphabet size and on whether or not the order of speaking is noise-dependent. Specifically, they developed coding schemes with a noise-independent order of speaking and a constant rate that are resilient to $1/4 - \varepsilon$ and $1/6 - \varepsilon$ fractions of noise with a ternary and binary alphabet, respectively. When the order of speaking may depend on the noise, the resilience increases to $1/3 - \varepsilon$ for any alphabet size. They showed that these noise levels are optimal and that no general coding scheme can resist higher levels of noise.

There has been a tremendous amount of work on coding for noisy channels with noiseless feedback in the one-way (non-interactive) communication setting, starting with the works of Shannon, Horstein, and Berlekamp [Sha56,Hor63,Ber64]. It is known that the presence of feedback does not change the channel's capacity; however, it improves the error exponent. The maximal noise-resilience in this setting is also known. Recently, Haeupler, Kamath, and Velingker [HKV15] considered deterministic and randomized codes that assume a partial presence of feedback.

Organization
The first half of our paper considers interactive coding protocols over noisy channels with noiseless feedback. Section 3 proves that any interactive coding scheme that is resilient to $(1/5, 1/5)$-corruptions must exhibit a zero rate. Sections 4-5 describe our constant-rate coding scheme that is resilient to $(1/5 - \varepsilon, 1/5 - \varepsilon)$-corruptions. First, Section 4 describes a scheme with a large alphabet (polynomial in the length of the protocol). Then, Section 5 shows how to reduce the alphabet to a constant size.
The second half of the paper (Section 6) considers noise-resilient circuits. First, in Section 6.1 we recall the notions of formulas, short-circuit noise and the (noiseless) KW-transformation. In Section 6.2 we present our noise-preserving KW-transformation and show how to convert a resilient formula into a resilient protocol. This reduction (along with the impossibility from Section 3) proves the converse theorem, showing that the resilience we obtain for formulas is maximal. In Section 6.3 we provide the other direction, a noise-resilient transformation from protocols to formulas (following [KLR12]). Employing the coding scheme of Section 5 we give an efficient method that compiles any formula into an optimal resilient version.

Preliminaries
Notations For integers $i \le j$ we denote by $[i, j]$ the set $\{i, i+1, \ldots, j\}$ and by $[i]$ the set $\{1, \ldots, i\}$. We let $\Sigma$ be some finite set. For a string $s \in \Sigma^*$ and two indices $x, y \in \{1, \ldots, |s|\}$, $x < y$, we let $s[x, y] = s_x s_{x+1} \cdots s_y$. We will treat $\emptyset$ as the empty word, i.e., for any $a \in \Sigma^*$ we have $a \bullet \emptyset = \emptyset \bullet a = a$, where $\bullet$ stands for concatenation. For bits $a, b \in \{0, 1\}$ we denote $a \oplus b = a + b \bmod 2$, $a \wedge b = a \cdot b$, and $\bar{b} = 1 - b$. For two bitstrings of the same length $x, y \in \{0, 1\}^n$ we denote by $\langle x, y \rangle = \sum_i (x_i \cdot y_i)$ their inner product (mod 2) as vectors over $GF(2)$. We denote by $\bar{x} = \bar{x}_1 \bar{x}_2 \cdots \bar{x}_n$ the bit-wise complement of $x$. All logarithms are taken to base 2, unless the base is explicitly written.
Interactive Protocols In the interactive setting we have two parties, Alice and Bob, who receive private inputs x ∈ X and y ∈ Y , respectively. Their goal is to compute some predefined function f (x, y) : X ×Y → Z by sending messages to each other. A protocol describes for each party the next message to send, given its input and the communication received so far. We assume the parties send symbols from a fixed alphabet Σ. The protocol also determines when the communication ends and the output value (as a function of the input and received communication).
Formally, an interactive protocol $\pi$ can be seen as a $|\Sigma|$-ary tree (also referred to as the protocol tree), where each node $v$ is assigned either to Alice or to Bob. For any node $v$ assigned to Alice there exists a mapping $a_v : X \to \Sigma$ that gives the next symbol Alice should send, given her input. Similarly, for each one of Bob's nodes we set a mapping $b_v : Y \to \Sigma$. Each leaf is labeled with an element of $Z$. The output of the protocol on input $(x, y)$ is the element at the leaf reached by starting at the root node, and traversing down the tree, where, at each internal node $v$ owned by Alice (resp., Bob), if $a_v(x) = i$ (resp., $b_v(y) = i$) the protocol advances to the $i$-th child of $v$. For convenience, we denote Alice's nodes by the set $V_a$ and Bob's nodes by the set $V_b$. We may assume that all the nodes in a given protocol tree are reachable by some input $(x, y) \in X \times Y$ (otherwise, we can prune that branch without affecting the behavior of the protocol). Note that the order of speaking in $\pi$ does not necessarily alternate and it is possible that the same party is the sender in consecutive rounds. For any given transcript $T$, we denote by $\pi(\cdot \mid T)$ the instance of $\pi$ assuming the history $T$. Specifically, assuming Alice is the sender in the next round (assuming the history so far is $T$), then the next communicated symbol is $\pi(x \mid T)$.
The length of a protocol, denoted |π|, is the length of the longest root-to-leaf path in the protocol tree, or equivalently, it is the maximal number of symbols the protocol communicates in any possible instantiation. In the following we assume that all instances have the same length |π|. The communication complexity of the protocol is CC(π) = |π| log |Σ|.
When $\Sigma$ is constant (independent of the input size), we have $CC(\pi) = O(|\pi|)$. If, by round $t$, Alice is the sender in $t_A$ rounds and Bob is the sender in $t_B = t - t_A$ rounds, we denote their respective communication complexity until round $t$ by $CC^{\le t}_A(\pi) = t_A \log |\Sigma|$ and $CC^{\le t}_B(\pi) = t_B \log |\Sigma|$.
Transmission Noise with Feedback We will assume the communication channel may be noisy, that is, the received symbol may mismatch with the sent symbol. All the protocols considered in this work assume the setting of noiseless feedback: the sender always learns the symbol that the other side received (whether corrupted or not). The receiver, however, does not know whether the symbol it received is indeed the one sent to it. A noise pattern is defined as $E \in \{0, 1, \ldots, |\Sigma| - 1, *\}^{|V_a \cup V_b|}$. For any node $v$, $E_v$ denotes the symbol that the receiver gets for the transmission that is done when the protocol reaches the node $v$. Specifically, say $v$ is an Alice-owned node; then if $E_v = *$, Bob receives the symbol sent by Alice; otherwise, $E_v \ne *$ and Bob receives the symbol $E_v$. Note that due to the feedback, Alice learns that her transmission was corrupted as well as the symbol that Bob received, and the protocol descends to the node dictated by $E_v$. We denote by $\pi_E$ the protocol $\pi$ when the noise is dictated by $E$; we sometimes write $\pi_0$ for a run of the protocol with no transmission noise, i.e., with the pattern $E = *^{|V_a \cup V_b|}$.
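A run of a protocol tree under a noise pattern can be sketched as follows. The dictionary-based tree encoding, the node names, and the function `run` are assumptions made for illustration; the noise pattern is given as a partial map from node names to received symbols, with missing entries treated as `*`.

```python
STAR = "*"  # marks "no corruption at this node"

def run(tree, x, y, noise):
    """Execute a protocol tree under a noise pattern.
    tree:  node -> (owner, next_symbol_fn, children) or ("leaf", output)
    noise: node -> received symbol, or STAR / absent for no corruption
    Returns the output and the number of changed symbols per party."""
    v = "root"
    corrupted = {"A": 0, "B": 0}
    while tree[v][0] != "leaf":
        owner, next_symbol, children = tree[v]
        sent = next_symbol(x if owner == "A" else y)
        received = noise.get(v, STAR)
        if received == STAR:
            received = sent
        if received != sent:
            corrupted[owner] += 1   # the sender learns this via feedback
        v = children[received]      # both parties descend by the received symbol
    return tree[v][1], corrupted

# Toy protocol over a binary alphabet: Alice sends x, Bob sends y, output x AND y.
tree = {
    "root": ("A", lambda x: x, ["b0", "b1"]),
    "b0": ("B", lambda y: y, ["zero", "zero"]),
    "b1": ("B", lambda y: y, ["zero", "one"]),
    "zero": ("leaf", 0),
    "one": ("leaf", 1),
}
clean = run(tree, 1, 1, {})            # the noiseless run
noisy = run(tree, 1, 1, {"root": 0})   # Alice's symbol is flipped to 0
```

The toy protocol has no redundancy, so a single corrupted transmission already changes its output; in the terminology of the text it is not resilient to that noise pattern.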
We say that a protocol is resilient to a noise pattern E if for any (x, y) ∈ X × Y it holds that π E outputs the same value as π 0 . While it is common to limit the noise to a constant fraction of the transmissions, in this work we take a more careful look at the noise, and consider the exact way it affects the transmissions of each party.
Definition 2.1. An (α, β)-corruption is a noise pattern that changes at most α|π| symbols sent by Alice and at most β|π| symbols sent by Bob. Note that the effective (combined) noise rate is (α + β).
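The bookkeeping behind Definition 2.1 can be illustrated on a single run. The sketch below assumes a run is summarized as a list of `(sender, was_corrupted)` pairs, one per round; this representation and the function name are hypothetical.

```python
from fractions import Fraction

def is_corruption(rounds, alpha, beta):
    """Check that at most alpha*n of Alice's symbols and at most beta*n of
    Bob's symbols were changed, where n is the total number of rounds."""
    n = len(rounds)
    by_alice = sum(1 for sender, hit in rounds if sender == "A" and hit)
    by_bob = sum(1 for sender, hit in rounds if sender == "B" and hit)
    return by_alice <= alpha * n and by_bob <= beta * n

# Ten rounds: Alice speaks 4 times (2 corrupted), Bob speaks 6 times (2 corrupted).
rounds = ([("A", True)] * 2 + [("A", False)] * 2 +
          [("B", True)] * 2 + [("B", False)] * 4)
fifth = Fraction(1, 5)
within_budget = is_corruption(rounds, fifth, fifth)
eps = Fraction(1, 100)
within_strict = is_corruption(rounds, fifth - eps, fifth - eps)
alice_own_rate = Fraction(2, 4)  # fraction of Alice's own symbols corrupted
```

In this example the run is a $(1/5, 1/5)$-corruption even though half of Alice's own transmissions are corrupted, since the budget is measured against the total length of the protocol rather than the number of symbols each party sends.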

Resilience to (1/5, 1/5)-Corruptions is Impossible
In this section we prove that no coding scheme with constant overhead can be resilient to a $(1/5, 1/5)$-corruption. To this end we show a specific $(1/5, 1/5)$-corruption that confuses any protocol for a specific function $f$ that is "hard" to compute in linear communication. Our result does not apply to coding schemes with vanishing rates. In fact, if the communication is exponentially large, coding schemes with resilience higher than $1/5$ exist.

Normally, we discuss the case where protocols compute a function $f : X \times Y \to Z$. While our converse bound on the resilience of interactive protocols works for some hard function (e.g., the pointer jumping), such a proof does not suffice towards our converse on the resilience of boolean formulas (Theorem 1.2). The reason is that the conversion from formulas to protocols does not yield a protocol that computes a function, but rather a protocol that computes a relation. Recall that for any given function $f$ and any input $(x, y)$ such that $f(x) = 0$ and $f(y) = 1$, the KW-game for $f$, $KW_f$, outputs an index $i \in [n]$ for which $x_i \ne y_i$ (see Section 6.1 for a formal definition). However, multiple such indices may exist and each such index is a valid output.
Let X, Y, Z be finite sets and R ⊆ X × Y × Z be a ternary relation. For any (x, y) ∈ X × Y and a given relation R let R(x, y) = {z | (x, y, z) ∈ R} be the set of all z that satisfy the relation for x, y. We assume that for any x, y it holds that |R(x, y)| > 0. Given such a relation, a protocol that computes the relation is the following two-party task. Alice is given x ∈ X and Bob is given y ∈ Y . The parties need to agree on some z ∈ R(x, y).
We now show an explicit relation for which no protocol (of "short" length) is resilient to $(1/5, 1/5)$-corruptions. Specifically, in the rest of this section we consider the binary parity function on n bits, par : be the KW-game for the parity function, defined by We will need the following technical claim.
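Proof. Since $|Y| > 2^{n/2}$, there exist $k = \lfloor n/2 \rfloor + 1$ linearly independent elements $b_1, \ldots, b_k \in Y$. Let $L$ be the linear span of $b_1, \ldots, b_k$, let $L^\perp = \{v \in \{0, 1\}^n \mid \langle v, w \rangle = 0 \text{ for all } w \in L\}$ be the orthogonal space of $L$ with respect to the $\langle \cdot, \cdot \rangle$ product, and recall that $\dim L + \dim L^\perp = n$. Since $\dim L = k$ we get that $\dim L^\perp = n - k < n/2$ and therefore $|L^\perp| < 2^{n/2} < |Y|$. Hence some $y \in Y$ lies outside $L^\perp$, which, in turn, means that its product with at least one of the $b_i$'s must be non-zero (or otherwise $y$ would belong to the orthogonal space $L^\perp$). That is, there exists $b_i$ for which $\langle b_i, y \rangle = 1$, as stated.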
Lemma 3.2. Let π be an interactive protocol for KW par (with inputs of n bits) of length |π| = r defined over a communication channel with alphabet Σ and noiseless feedback. Without loss of generality, let Alice be the party who speaks less in the first 2r/5 rounds of π (averaging over all possible inputs (x, y) ∈ X × Y ). Additionally, assume n/3 > 2r log(2|Σ|)/5.
(2) During the first 2r/5 rounds of the execution π(x, y) Alice speaks fewer times than Bob. ( Note that the above lemma assumes Alice is the party that speaks fewer times in the first 2r/5 rounds of π when averaging over all possible inputs (x, y) ∈ X × Y ; otherwise, a symmetric lemma holds for Bob.
Proof. Let x be an input for Alice such that on most of the values y, Alice speaks fewer times in the first 2r/5 rounds of π(x, y). Such an input must exist by our choice of Alice. Let Y ′ ⊆ Y be the set of all inputs for Bob, where Alice speaks fewer times in the first 2r/5 rounds of π assuming Alice holds the above x. By the choice of x, it holds that |Y ′ | ≥ 2 n /2.
Consider the set of transcript prefixes of length 2r/5 generated by π when Alice holds the above x and Bob holds some input from the set Y ′ . Note that there are at most $(2|\Sigma|)^{2r/5}$ different prefixes of length 2r/5 over Σ with an arbitrary order of speaking. Since we assumed n/3 > 2r log(2|Σ|)/5, we have, for large enough n, $|Y'| / (2|\Sigma|)^{2r/5} \ge \Upsilon$ with $\Upsilon = 2^{(n+1)/2} + 1$. Using a pigeon-hole principle, there must be $y_1, y_2, \ldots, y_{\Upsilon} \in Y'$ for which the executions $\pi(x, y_i)$ agree on the first 2r/5 rounds of the protocol: they have an identical order of speaking and they communicate the same information.
Next consider the set $\{x \oplus y_i\}_{i=1}^{\Upsilon}$. Claim 3.1 guarantees that there exist two elements in that set such that $\langle x \oplus y_i, x \oplus y_j \rangle = \mathrm{par}(x)$; these $y_i, y_j$ will be our y, y ′ . Note that Properties (1) and (2) of the lemma are satisfied by the above x, y, y ′ . We are left to show an input x ′ for Alice that satisfies property (3).
Based on the above x, y, y ′ we construct x ′ in the following manner. For any i ∈ [n] set
The above x ′ is constructed such that outputs given by KW par are disjoint if we change only the input of Alice or only the input of Bob. Formally,
Claim 3.3. The following claims hold for the above x, x ′ , y, y ′ ,
Since we picked y, y ′ for which $\langle x \oplus y, x \oplus y' \rangle = \mathrm{par}(x)$, we conclude that par(x ′ ) = 0.
i are all bits and these two inequalities imply y i = y ′ i . But then, x ′ i = y i by the way we construct x ′ , which is a contradiction.
Both options lead to a contradiction. The proof of the second part is identical.
The first claim proves that x ′ ∈ X. The other claims prove property (3) of the lemma and conclude its proof.
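The three properties of x ′ can be checked exhaustively for small n. The explicit formula for x ′ is not reproduced in this text, so the sketch below assumes one construction that is consistent with the proof steps above: x ′ i = y i whenever y i = y ′ i , and x ′ i = 1 − x i otherwise. This is an assumption made for illustration, not necessarily the exact definition intended by the authors; all function names are hypothetical.

```python
from itertools import product


def parity(v):
    return sum(v) % 2


def inner(u, v):
    """Inner product over GF(2)."""
    return sum(a & b for a, b in zip(u, v)) % 2


def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))


def kw(x, y):
    """Valid outputs of the KW-game: indices where x and y differ."""
    return {i for i, (a, b) in enumerate(zip(x, y)) if a != b}


def build_x_prime(x, y, y2):
    # ASSUMED construction (see the lead-in): copy y where y and y' agree,
    # flip x where they disagree.
    return tuple(yi if yi == y2i else 1 - xi for xi, yi, y2i in zip(x, y, y2))


def check(n):
    space = list(product((0, 1), repeat=n))
    X = [v for v in space if parity(v) == 0]
    Y = [v for v in space if parity(v) == 1]
    for x in X:
        for y in Y:
            for y2 in Y:
                xp = build_x_prime(x, y, y2)
                # Changing only Alice's input gives disjoint output sets.
                if kw(x, y) & kw(xp, y):
                    return False
                # Changing only Bob's input gives disjoint output sets.
                if kw(xp, y) & kw(xp, y2):
                    return False
                # Under the inner-product condition, x' has parity 0.
                if inner(xor(x, y), xor(x, y2)) == parity(x) and parity(xp) != 0:
                    return False
    return True
```

Calling `check(4)` or `check(5)` enumerates all triples (x, y, y ′ ) and returns `True` only if every property holds under the assumed construction.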
Our main result in this section is the following theorem, proving that no protocol for the KW par can be resilient to a (1/5, 1/5)-corruption if its communication is bounded. This will imply that any coding scheme that is resilient to (1/5, 1/5)-corruption must have rate 0. Specifically, it cannot produce a protocol with a constant overhead with respect to the optimal protocol that computes KW par over reliable channels.
We now generate a transcript T and show that T is consistent with a (1/5, 1/5)-corruption of π(x 1 , y 0 ). Additionally, it is either the case that T is consistent with a (1/5, 1/5)-corruption of π(x 1 , y 1 ) or it is consistent with a (1/5, 1/5)-corruption of π(x 0 , y 0 ). In the first case, Alice is unable to distinguish the case where Bob holds y 0 and y 1 ; in the second, Bob cannot tell if Alice holds x 0 or x 1 . The outputs for different inputs are distinct by property (3). Thus the confused party is bound to err on at least one of them.
Note that the transcript T contains messages received by the two parties, which may be noisy. Due to the feedback, both parties learn T . Additionally, the order of speaking in π is entirely determined by (prefixes of) T . Specifically, if two different instances of π have the same received transcript by round j, the party to speak in round j + 1 is identical in both instances.
The string T is obtained in the following manner: 1. Run π(x0, y0) for 2r/5 rounds. Let T1 be the generated transcript.
In the case where the above algorithm did not execute Step i, for i ∈ {3, 4}, assume Ti = ∅.
We now show that T corresponds to a (1/5, 1/5)-corrupted execution of π for two different valid inputs with disjoint outputs. We consider two cases: (i) when Step 3 halts since T reached its maximal size of r symbols (i.e., when T 4 = ∅), and (ii) when Step 3 halts since Bob transmitted r/5 symbols in this step (i.e., when T 4 ≠ ∅).
case (i) T 4 = ∅. In this case we show that a (1/5, 1/5)-corruption suffices to make the executions of π(x 1 , y 0 ) and π(x 1 , y 1 ) look the same from Alice's point of view.
Let Π be the transcript of a noisy execution of π(x 1 , y 0 ) (defined shortly) and split Π into three parts, Π = Π 1 Π 2 Π 3 , that correspond in length to T 1 , T 2 , T 3 . The noise changes all Alice's transmissions in Π 1 so that they correspond to Alice's symbols in T 1 ; the noise changes all Bob's transmissions in Π 3 so that they correspond to Bob's transmissions in T 3 . It is easy to verify that the obtained transcript Π of received messages is exactly T . Furthermore, the first part changes at most r/5 transmissions by Alice, since by property (2) Alice speaks fewer times than Bob in the first 2r/5 rounds of the instance π(x 0 , y 0 ). The second part changes at most r/5 transmissions of Bob since Step 3 halts before Bob communicates an additional r/5 transmissions. Hence, the noise described above is a valid (1/5, 1/5)-corruption.
On the other hand, and abusing notation, consider a (noisy) instance of π(x 1 , y 1 ) and let Π = Π 1 Π 2 Π 3 be the received messages transcript split into parts that correspond in length to T 1 , T 2 , T 3 , assuming the following noise. Again, the noise changes all Alice's transmissions in Π 1 to be the corresponding symbols received in T 1 . This makes the 2r/5 first rounds of the received transcript look like the ones in the instance π(x 0 , y 1 ). By Property (1), these transmissions agree with the first 2r/5 transmissions in the noiseless instance π(x 0 , y 0 ); hence, the corrupted Π 1 equals T 1 . Next, the noise changes Bob's transmissions in Π 2 to correspond to T 2 . The obtained transcript Π is then exactly T (note that Π 3 = T 3 by definition). Again, T 1 contains at most r/5 of Alice's transmissions (by Property (2), Alice speaks fewer times than Bob in these 2r/5 rounds), and T 2 contains at most r/5 transmissions of Bob by definition. Hence, this is a valid (1/5, 1/5)-corruption.
We conclude by recalling that KW par (x 1 , y 0 ) ∩ KW par (x 1 , y 1 ) = ∅. Thus, Alice must be wrong on at least one of the above executions, since her view in both executions is the same. Note that the above proof holds even when T 3 = ∅.
case (ii) T 4 ≠ ∅. In this case we show a (1/5, 1/5)-corruption that makes the executions of π(x 0 , y 0 ) and π(x 1 , y 0 ) look the same from Bob's point of view. We point out that Alice speaks at most r/5 times after Step 1. Indeed, Step 1 contains 2r/5 rounds, and Steps 2-3 contain 2r/5 rounds where Bob speaks; hence, Alice may speak at most another r/5 times after Step 1.
Let Π be the transcript of a noisy execution of π(x 0 , y 0 ) where the noise is defined below. Split Π into 4 parts: Π = Π 1 Π 2 Π 3 Π 4 that correspond in length to T 1 , T 2 , T 3 , T 4 . The noise changes all Alice's transmissions in Π 2 Π 3 Π 4 so that they match the corresponding symbols of T 2 , T 3 , T 4 . As mentioned, this corrupts at most r/5 symbols. Additionally, the noise changes Bob's transmissions in Π 3 to correspond to T 3 ; this, by definition, entails r/5 corruptions of Bob's transmissions. The obtained transcript Π is exactly T .
On the other hand, and abusing notation again, consider a noisy execution of π(x 1 , y 0 ) denoted by Π = Π 1 Π 2 Π 3 Π 4 . Here the noise is defined as follows. The noise changes all Alice's transmissions in Π 1 to match the corresponding symbols of T 1 . As before, the noise changes Bob's transmissions in Π 3 to match T 3 . Now it holds that Π = T , while the noise corrupted at most r/5 of each party's transmissions.
We conclude by recalling that KW par (x 0 , y 0 ) ∩ KW par (x 1 , y 0 ) = ∅. Thus, Bob must be wrong on at least one of the above executions, since his view in both executions is exactly the same.
Note that KW par has a protocol of length O(log n) assuming reliable channels. Theorem 3.4 leads to the following conclusion.
Corollary 3.5. There exists an interactive protocol π 0 defined over a noiseless channel with feedback such that any protocol π that computes the same functionality as π 0 and is resilient to (1/5, 1/5)-corruptions (assuming noiseless feedback) must incur an exponential blowup in the communication.
As a consequence, any coding scheme that compiles any protocol into a (1/5, 1/5)-resilient version must have rate 0.

A Coding Scheme with a Large Alphabet
In this section we construct a coding scheme for interactive protocols assuming noiseless feedback. We show that for any constant ε > 0, any protocol π 0 defined over noiseless channels (with noiseless feedback) can be simulated by a protocol π = π ε defined over noisy channels (with noiseless feedback) such that (1) CC(π)/CC(π 0 ) = O ε (1), and (2) π is resilient to (1/5−ε, 1/5−ε)-corruptions. The protocol π in this section communicates symbols from a large alphabet of polynomial size in |π 0 |. In later sections we show how to reduce the size of the alphabet. While the coding scheme π will alter its order of speaking in accordance with the noise, we will assume that π 0 is an alternating protocol. This is without loss of generality, since any protocol can be made alternating by increasing its communication complexity by a factor of at most 2.

The coding scheme
At a high level, the coding scheme (Algorithm 1) runs for n = |π 0 |/ε rounds in which it tries to simulate π 0 step by step. The availability of noiseless feedback allows a party to notice when the channel alters a transmission sent by that party. The next time that party speaks, it will re-transmit its message and "link" the new transmission to its latest uncorrupted transmission. That is, each message carries a "link"-a pointer to a previous message sent by the same party. By following the links, the receiver learns the "chain" of uncorrupted transmissions; the party considers all "off-chain" transmissions as corrupted. Note that there are two chains: one for symbols received by Alice and one for symbols received by Bob. However, due to the feedback, both parties learn the received symbols at both sides and can infer both chains.
The algorithm consists of several sub-procedures. The Parse procedure (line 14) parses all the transmissions received so far at a given party and outputs the "current chain" of that party: the (rounds of the) transmissions linked by the latest received transmission. Note that once a new transmission arrives, the current chain of the receiving party possibly changes. Moreover, upon reception of a corrupt transmission, a corrupt chain may be retrieved.
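The chain-following idea behind Parse can be illustrated with a short sketch. Algorithm 1 itself is not reproduced here, so the data layout is an assumption: the transmissions received from one sender are stored as a mapping from round number to a pair (link, payload), where link is the round of an earlier transmission by the same sender and 0 marks the start of the chain. The name `parse_chain` and the handling of malformed links are hypothetical.

```python
def parse_chain(received):
    """Return the rounds on the current chain, oldest first.

    `received` maps a round number to (link, payload) for one sender's
    transmissions as they were received (possibly corrupted).
    """
    if not received:
        return []
    chain = []
    current = max(received)  # start from the latest received transmission
    while current != 0:
        if current not in received:
            break  # dangling link (assumed policy: stop following)
        chain.append(current)
        link, _payload = received[current]
        if link >= current:
            break  # a link must point backwards; stop on a malformed one
        current = link
    chain.reverse()
    return chain


# Alice's rounds: m1 in round 1, a corrupted symbol X in round 3 (linking to 1).
received_from_alice = {1: (0, "m1"), 3: (1, "X")}
chain_after_round_3 = parse_chain(received_from_alice)

# In round 5 Alice links back to round 1, taking round 3 off the chain.
received_from_alice[5] = (1, "m5")
chain_after_round_5 = parse_chain(received_from_alice)
```

In this toy run the corrupted round 3 is on the chain until Alice's next uncorrupted transmission arrives, after which it becomes an off-chain transmission.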
Given the current chains of both parties, the parties can infer a candidate for a partial transcript for π 0 . We call this transcript the simulated transcript of π 0 given the current chains. Note that if the last received transmission is corrupted, the current chain linked to it might be arbitrary, and the candidate for the simulated transcript will be wrong. The TempTranscript procedure (line 21) determines the partial simulated transcript of π 0 according to all the messages received in π so far at both sides, i.e., according to the current chains. Again, the scheme considers only transmissions that are on the current chains and ignores all off-chain transmissions.
Figure caption: At the beginning of round 4, the current chain contains m 1 , X for Alice and m 2 for Bob, thus Bob sends m 4 according to this false information. Note that Bob's message is not corrupted, yet Alice knows it is based on wrong information and ignores it. In round 5 Alice indicates to Bob that round 3 was corrupted by linking m 5 to m 1 rather than to the corrupted X. Bob learns that m 4 was generated based on wrong information and can correct it using future transmissions. In round 6, the current chain contains m 1 , m 5 for Alice and m 2 , m 4 , m 6 for Bob. The information in m 4 , although on Bob's chain, will not affect the generated simulated transcript.
The TempTranscript procedure computes the simulated transcript by concatenating all the messages that according to the current knowledge (a) were received uncorrupted and (b) were generated (at the sender's side) according to "correct" information, i.e., information that is consistent with the current chains. To clarify this behavior, consider round i where, without loss of generality, Alice sends the message m i . Assume that the last transmission received by Alice prior to round i, which we denote Prev(i) (see Definition 4.1), is uncorrupted. This implies that Alice learns in round Prev(i) the correct current chain of Bob, i.e., which of Bob's transmissions so far are correct and which ones are corrupted. Using the feedback, Alice knows which of her transmissions were corrupted and thus she knows both current chains. Learning the correct chains allows Alice to retrieve a correct partial transcript of π 0 (Lemma 4.2). Hence, she can generate the correct m i that extends the simulation of π 0 by one symbol.
In each round of the protocol, the parties construct the partial simulated transcript implied by the current chains. If the received transmission is not corrupted, the TempTranscript procedure retrieves the correct transcript (i.e., the simulated transcript at that point is indeed a prefix of the transcript of π 0 ). Then, the parties simulate the next rounds of π 0 assuming the simulated partial transcript. As long as there is no noise in two alternating rounds, the next transmission extends the simulation of π 0 by one symbol. Otherwise, the sent symbol may be wrong; however, it will be ignored in future rounds once the chains indicate that this transmission was generated due to false information. Finally, at the end of the protocol, the parties output the simulated transcript implied by the longest chain at each side. The main part of this section is the proof that the longest chain indeed implies a complete and correct simulation of π 0 .
An important property of the coding scheme is its adaptive order of speaking. In the first 2n/5 rounds, the order of speaking alternates. In later rounds, the order of speaking is determined according to the observed noise: the more corrupted transmissions a party has, the less the party gets to speak. In particular, the protocol is split into epochs of 2 or 3 rounds each. In the first two rounds of an epoch, the order is fixed: Alice speaks in the first round and Bob speaks in the second. Then, the parties estimate the noise each party suffered so far (namely, the length of their current chain) and decide whether or not the epoch has a third round and who gets to speak in that extra round. For Alice to be the speaker in the third epoch-round, her current chain must be of length at least n/5 while Bob's current chain must be shorter than n/5; Bob gets to speak if his chain is of length at least n/5 while Alice's chain is shorter than n/5. In all other cases, the epoch contains only two rounds. We emphasize that due to the noiseless feedback, both parties agree on the received symbols (on both sides), which implies they agree on the current chains on both sides, and thus, on the order of speaking in every epoch. The Next procedure (line 28), which determines the next speaker according to the current received transcript, captures the above idea.
The coding scheme is depicted in Algorithm 1.

Good rounds and the implied transcript
Next, we show that whenever a transmission arrives correctly at the other side, the receiver learns all the uncorrupted transmissions communicated so far. First, let us define the notions of good rounds and implied transcript. Note: R ≤j A is the prefix of R A as received by the j-th round of the protocol (incl. j) and R <j A excluding round j. The terms R ≤j B and R <j B are similarly defined.
We additionally set any round i ≤ 0 to be good (and uncorrupted) by definition.
Definition 4.3 (Implied Transcript). For any round i, the transcript T (i) is defined as the (natural order) concatenation of bits {b j } j∈GOOD ≤i where σ j = (link j , b j ) is the symbol transmitted (and correctly received) in round j.
While the above GOOD and T (i) are tools for the analysis, the next lemma shows that whenever the i-th transmission arrives correctly at the other side, the receiver learns GOOD ≤i and T (i). Specifically, the variable GoodChain (Line 24) equals GOOD ≤i and TempTranscript outputs T (i). This allows us (despite some abuse of notation) to treat T (i) and the output of TempTranscript interchangeably, as long as round i is uncorrupted.
Proof. Assume, without loss of generality, that Alice is the receiver of the i-th transmission. Since transmission i is uncorrupted, it holds that G B = Parse(R A ) contains exactly all the uncorrupted transmissions sent by Bob so far. Alice knows her own uncorrupted transmissions G A via the feedback. Then GoodChain indeed holds all the good rounds up to round i, and TempTranscript outputs T (i) by definition.
Remark 1. Assume that round i is corrupted and let j < i be the latest uncorrupted round. Then T (i) = T (j), which equals the output of TempTranscript in round j. However, the output of TempTranscript in round i may be arbitrary.
The next lemma argues that, if round i is uncorrupted, then the implied transcript T (i) (and hence the output of TempTranscript in round i) is indeed a correct (partial) simulation of π 0 .
Lemma 4.2. If round i is uncorrupted, then T (i) is a prefix of π 0 (x, y).
Proof. The proof goes by induction on i. The base case T (0) = ∅ is trivial. Assume that the claim holds for T (j) for any uncorrupted round j < i; we show that the same holds for round i.
Assume, without loss of generality, that Alice is the receiver in round i. Let j be the maximal previous round where Alice's transmission was not corrupted. Since round j is uncorrupted, Lemma 4.1 proves that, at round j, Bob learns GOOD ≤j and T (j). By the induction hypothesis, T (j) is a prefix of π 0 (x, y).
If j < Prev A (i) then i is not a good round, i ∉ GOOD ≤i . It holds that GOOD ≤i = GOOD ≤j and T (i) = T (j); therefore, T (i) is indeed a prefix of π 0 (x, y). Otherwise, j = Prev A (i) and i is a good round. As said, in round j Bob learns T (j) (which is a correct prefix of π 0 (x, y)). Next, in round i it is Bob's turn to send the symbol σ i = (link i , b i ). If it is Bob's turn to speak in π 0 , then b i = π 0 (y | T (j)) will indeed be the correct continuation of T (j) according to π 0 ; otherwise, Bob sends b i = ∅. In both cases, the channel does not corrupt σ i , Alice learns GOOD ≤i and the implied transcript she constructs equals T (i) = T (j) • b i . Hence, T (i) is indeed a prefix of π 0 (x, y).

Skipped rounds, the order of speaking and noise-progress tradeoffs
The order of speaking in the protocol depends on the observed noise measured through the length of the current chain. Whenever the current chain is shorter than n/5 for only one of the parties, this party "skips" one round of communication: the other party gets to speak one additional round. We now define the skipping mechanism and use it to show that the coding scheme makes progress unless too much noise has occurred.
Next we prove some properties regarding the number of rounds each party gets to speak, as a function of the noise. In particular, we relate the variables SkipCnt A , SkipCnt B to the number of rounds Alice and Bob get to speak, denoted RC A , RC B , respectively.
Lemma 4.3. Alice is the sender in $\frac{1}{2}(n - \mathrm{SkipCnt}_A + \mathrm{SkipCnt}_B)$ rounds and Bob is the sender in $\frac{1}{2}(n - \mathrm{SkipCnt}_B + \mathrm{SkipCnt}_A)$ rounds.
Proof. We split the protocol into the epochs generated by the Next procedure. For the i-th epoch denote n(i) ∈ {2, 3} the number of rounds in that epoch, and let A(i) (resp., B(i)) be an indicator which is 1 if the epoch is Alice-skipped (resp., Bob-skipped).
Note that Alice speaks in the i-th epoch exactly $\frac{1}{2}(n(i) - A(i) + B(i))$ times: if n(i) = 2 it must hold that A(i) = B(i) and Alice speaks once. She also speaks once if n(i) = 3, but Bob speaks at the third round, A(i) = 1, B(i) = 0, i.e., if this is an Alice-skipped but not a Bob-skipped epoch. Finally, Alice speaks twice only when n(i) = 3 and A(i) = 0, B(i) = 1. Then, $RC_A = \sum_i \frac{1}{2}\bigl(n(i) - A(i) + B(i)\bigr) = \frac{1}{2}(n - \mathrm{SkipCnt}_A + \mathrm{SkipCnt}_B)$. The case for Bob is symmetric.
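The per-epoch accounting in this proof can be checked numerically. The sketch below is a toy model: the skip indicators A(i), B(i) are drawn at random rather than derived from chain lengths as in the coding scheme, and it assumes (as the proof suggests) that the skip counters are the sums of these indicators over all full epochs. The function name is hypothetical.

```python
import random


def simulate_epochs(num_epochs, rng):
    """Simulate full epochs and return (total_rounds, rc_a, rc_b, skip_a, skip_b)."""
    total = rc_a = rc_b = skip_a = skip_b = 0
    for _ in range(num_epochs):
        a_skipped = rng.random() < 0.5  # A(i): toy choice, not the real rule
        b_skipped = rng.random() < 0.5  # B(i): toy choice, not the real rule
        # First two rounds of every epoch: Alice speaks, then Bob.
        rc_a += 1
        rc_b += 1
        total += 2
        # A third round exists only if exactly one party is skipped;
        # the party that is NOT skipped speaks in it.
        if a_skipped and not b_skipped:
            rc_b += 1
            total += 1
        elif b_skipped and not a_skipped:
            rc_a += 1
            total += 1
        skip_a += int(a_skipped)
        skip_b += int(b_skipped)
    return total, rc_a, rc_b, skip_a, skip_b


total, rc_a, rc_b, skip_a, skip_b = simulate_epochs(1000, random.Random(1))
# Lemma 4.3 in doubled form, to stay within integer arithmetic.
identity_holds = (2 * rc_a == total - skip_a + skip_b) and (
    2 * rc_b == total - skip_b + skip_a
)
```

The identity holds for every sequence of indicators because it already holds epoch by epoch, which is the content of the case analysis above.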
Remark 2. In fact, due to rounding and the fact that Alice is the first to speak, she might get one extra round if the total number of rounds does not divide into full epochs, e.g., when the last epoch contains only a single round. A more accurate statement is RC A ≥ (n − SkipCnt A + SkipCnt B )/2 − 2 (and similarly for Bob). In order to simplify the proof, we ignore this issue.
Next, we connect the number of skips with the amount of noise that happens during the first part of the protocol.
Claim 4.4. If t transmissions by Alice were corrupted during the 2n/5 first rounds, then at the end of the protocol, SkipCnt A ≥ n/5 + t.
Proof. During the first 2n/5 rounds, all the epochs are both Alice- and Bob-skipped (i.e., epochs of size 2). This means that by round i = 2n/5, SkipCnt A = n/5. Split rounds [2n/5 + 1, n] into epochs as done by the Next procedure; note that there are at least n/5 epochs in this part of the protocol. Since the noise corrupted t of Alice's transmissions before round 2n/5, it can corrupt at most n/5 − εn − t additional transmissions of Alice beyond round 2n/5. That is, in at least n/5 − (n/5 − εn − t) = t + εn of the epochs after round 2n/5, Alice's transmission (in the first round of the epoch) is not corrupted; call these epochs Alice-uncorrupted. Note that by round 2n/5, Alice's "correct" chain is of length at most n/5 − t. As long as the length of Alice's correct chain is less than n/5, any Alice-uncorrupted epoch is also Alice-skipped. In each such epoch, SkipCnt A increases by one and Alice gets to speak only once. The length of Alice's correct chain also increases by one in each such epoch. It follows that in each of the following t Alice-uncorrupted epochs, SkipCnt A increases until the length of the correct chain exceeds n/5 and the condition of line 35 does not hold any longer (other epochs may increase SkipCnt A even further). Since the number of Alice-uncorrupted epochs is t + εn > t, the counter will indeed reach at least n/5 + t.
The following lemma captures a key property of our resilient protocol: a relation between the length of the implied transcript and the number of corruptions that have occurred so far.
Proof. We prove the claim by induction on k for all r ≥ t ≥ k.
The base case: k = 0, trivially holds since for all r, t, we have |T (r)| ≥ 0. For the inductive step, we assume that the lemma holds (for both parties) for some k and any r ≥ t ≥ k, and wish to show that it also holds for k + 1 and any r ≥ t ≥ k + 1. Specifically, let r, t where r ≥ t be fixed. We are given that by round r there are t Alice-skipped Alice-uncorrupted epochs, and that Bob's transmissions suffer from at most t − (k + 1) corruptions; we need to show that |T (r)| ≥ k + 1.
Since the number of corruptions at Bob's side by round r is less than t − k, the induction hypothesis tells us that there exists some round j ≤ r where |T (j)| ≥ k. Let j be the minimal round such that j ∈ GOOD and |T (j)| = k while |T (j − 1)| = k − 1. Recall that T extends only in good rounds (Definition 4.3), hence, such a round j must exist.
Assume there are t ′ Alice-skipped Alice-uncorrupted epochs until round j − 1. It follows that the number of corruptions at Bob's side (up to round j − 1) is at least t ′ − k + 1: if the number of corruptions is strictly less than t ′ − k + 1 then by induction |T (j − 1)| ≥ k, contradicting the way we chose j. It follows that the number of corruptions in Bob's transmissions in rounds [j, r] is at most (t − (k + 1)) − (t ′ − k + 1) = t − t ′ − 2.
We now split the analysis into different cases. Assume Alice is the speaker in the j-th round. We know that j ∈ GOOD and |T (j)| > |T (j − 1)|. This implies that round (j − 1) is uncorrupted and that Bob is the next to speak in π 0 given T (j), due to the alternating nature of π 0 . Note that Alice has at least t − t ′ additional uncorrupted rounds within Alice-skipped epochs in rounds [j, r] (note that Alice speaks in round j, which is uncorrupted). In all these cases, either Bob speaks a single time immediately after Alice, or he speaks twice after Alice (Alice-skipped epoch). Since at most t − t ′ − 2 of Bob's transmissions are corrupted, it follows that there must exist an uncorrupted round j ′ ∈ [j + 1, r] where Bob is the speaker and Prev A (j ′ ) is uncorrupted, i.e., j ′ ∈ GOOD. This implies that Bob sends the correct symbol that extends T in round j ′ ; thus, |T (j ′ )| = |T (j)| + 1 = k + 1. Since |T (·)| is non-decreasing, we have proved the claim.
The other case is when Bob is the speaker in the j-th round. Again, j ∈ GOOD, since |T (j)| > |T (j − 1)|, thus, round j itself is uncorrupted. Then, in [j + 1, r] Alice has t − t ′ additional uncorrupted rounds (in Alice-skipped epochs) while at most t − t ′ − 2 of Bob's transmissions are corrupted. Similar to the previous case, after each one of the aforementioned Alice-skipped Alice-uncorrupted rounds, Bob either speaks once or twice. If we consider the previous (Prev B ) of these t − t ′ rounds of Alice, we know that at most t − t ′ − 2 of them can be corrupted (notice that round j itself belongs to Bob and is uncorrupted!). This means that there must exist a round j ′ ∈ [j + 1, r] where Alice is the speaker and Prev B (j ′ ) is uncorrupted, i.e., j ′ ∈ GOOD. Then, |T (j ′ )| = |T (j)| + 1 = k + 1, which completes this case.
An immediate corollary of the above lemma shows that the "correct" chain of the coding scheme fully simulates π 0 . Indeed, the proof of Claim 4.4 suggests that by the end of the coding scheme there were at least n/5 uncorrupted Alice-skipped rounds. Furthermore, the corruption on Bob's side is bounded by (1/5 − ε)n. Lemma 4.5 then gives that |T (n)| ≥ εn = |π 0 |.
Yet, we still need to prove that the protocol outputs, in round n, a chain that contains T (n) as a prefix. In other words, we need to prove that the correct chain is a prefix of the longest chain. This is the goal of Section 4.3 below. Another useful corollary of Lemma 4.5 is the following lemma that measures the progress in the first 2n/5 alternating rounds, as a function of the total amount of corruptions in that part.
Proof. This is an immediate corollary of Lemma 4.5. Note that all the n/5 epochs up to round 2n/5 are both Alice-skipped and Bob-skipped epochs, and that the order of speaking alternates. Assume that Alice has t uncorrupted rounds until round i. Then Bob's transmissions suffer from at most (i/2 − k) − (i/2 − t) = t − k corruptions, and Lemma 4.5 gives that |T (i)| ≥ k.
In the following we implicitly assume the noise is a (1/5 − ε, 1/5 − ε)-corruption. As mentioned earlier, Algorithm 1 simulates the entire transcript of π 0 correctly in its good rounds. However, the parties cannot tell which rounds are good rounds. Instead, we show that the transcript implied by the longest chain (of each side) contains the entire transcript of π 0 as its prefix.
Before proving the theorem, let us fix some notation. Consider a complete instance of the coding scheme of Algorithm 1. Let P A be the longest chain of Alice's transmissions as seen by Bob at the end of the protocol. Formally, $P_A = \mathrm{Parse}(R_B^{\le j_{\max}})$ with $j_{\max} = \arg\max_j |\mathrm{Parse}(R_B^{\le j})|$. Given a chain P A we differentiate between several types of Alice's rounds:
1. Uncorrupted rounds that are on P A ; we denote these rounds as the set N C A .
2. Corrupted rounds that are on P A ; we denote these rounds as the set D.
3. Corrupted rounds that are not on P A ; we denote these rounds as the set J.
In a similar way we can define P B as the longest chain of Bob's transmissions as observed by Alice at the end of the protocol, and N C B as the set of uncorrupted rounds that are on P B , where Bob is the speaker.
Proof. Assume an instance of the algorithm with (1/5 − ε, 1/5 − ε)-corruption. We prove that the algorithm outputs the transcript of π 0 correctly. Let P A be the longest chain of Alice at the end of the protocol. Then,
$$|P_A| \ge RC_A - (1/5 - \varepsilon)n. \qquad (1)$$
This holds since, in any uncorrupted round, Alice's transmission contains a link to the longest previous correct chain (Line 6), thus extending this chain by at least one link. These uncorrupted rounds where Alice is the speaker form a chain of length at least RC A − (1/5 − ε)n. The longest chain at the end of the protocol, P A , may only be longer. We can further classify each transmission of Alice, and understand its effect on P A , i.e., whether it belongs to the set N C A of uncorrupted rounds, the set D of harmful corrupted rounds (that got into P A ), or the set J of corrupted rounds that are not on P A , and thus are somewhat harmless.
It is easy to verify that
$$|D| + |J| \le (1/5 - \varepsilon)n, \qquad (2)$$
$$|NC_A| + (1/5 - \varepsilon)n - |J| \ge RC_A - (1/5 - \varepsilon)n. \qquad (3)$$
Eq. (2) follows trivially from bounding the noise to a (1/5 − ε, 1/5 − ε)-corruption. Eq. (3) is an immediate corollary of Eq. (1) and Eq. (2), since |P A | = |N C A | + |D|.
Define w A , w B to be the number of corrupted transmissions in the first 2n/5 rounds (of Alice's and Bob's transmissions, respectively). Furthermore we distinguish between before and after round 2n/5 via a prime and double prime superscripts, respectively, i.e., $NC_A = NC'_A \cup NC''_A$, $D = D' \cup D''$ and $J = J' \cup J''$.
Lemma 4.8. If |N C ′ A | − w B > εn, then P A contains, as a prefix, a correct and complete simulation of π 0 .
Proof. Indeed, up to round 2n/5 there were at most n/5 − |N C ′ A | corruptions on Alice's side and w B corruptions on Bob's, with a total of n/5 − |N C ′ A | + w B < n/5 − εn corruptions. Lemma 4.6 implies that the progress up to round 2n/5 is at least εn, that is, |T (2n/5)| ≥ εn = |π 0 |. Then, the entire transcript of π 0 is correctly simulated by the |N C ′ A | uncorrupted rounds, and these rounds are on the chain P A .
We now show that the output of Algorithm 1 is indeed correct. Assume towards contradiction that the longest chain P A does not imply the correct answer. Then, Lemma 4.8 suggests that the number of corruptions in Bob's transmissions in the first 2n/5 rounds is w B ≥ |N C ′ A | − εn. Claim 4.4 then implies that SkipCnt B ≥ n/5 + w B ≥ n/5 + |N C ′ A | − εn. Thus, Further, since SkipCnt A increases by n/5 during the first 2n/5 rounds, and at most by 1 in every round in J ′′ . Rounds in N C ′′ A ∪ D ′′ can increase the counter only until the length of the chain reaches n/5, that is, at most n/5 − |N C ′ A | times. Putting these all together gives Eq. (5). We show that in this case we have |N C A | + (1/5 − ε)n − |J| < RC A − (1/5 − ε)n, which contradicts Eq. (3). Note that via Lemma 4.3, the above can be written as The above bounds on the skip-counters, Eq. (4) and Eq. (5), allow us to bound the left-hand side of Eq. (6) by Now, if |N C ′′ A | ≤ 3εn/2, then Eq. (6) holds and we reached a contradiction. Otherwise,

we will prove this shortly), then Eq. (7) is upper bounded by
We obtained a contradiction for the second case as well. We are left to show that the assumption we took earlier holds, i.e., that |N C ′ A | + |N C ′′ A | ≤ n/5. If the above equation does not hold, then there are n/5 Alice-skipped Alice-uncorrupted transmissions on the chain that becomes the output. Since the number of Bob's corrupted transmissions is limited to n/5 − εn, Lemma 4.5 immediately gives that the length of the correct simulation of π 0 is at least εn = |π 0 |. This transcript is contained in the output chain and contradicts the assumption that the longest chain implies an incorrect output.
Finally, we argue that Algorithm 1 is computationally efficient, as long as π 0 itself is efficient.
Proof. The algorithm performs n = |π 0 |/ε iterations, in each of which it needs to determine the next speaker, determine the partial transcript so far and determine the next message to send. The former two activities require performing Parse on all the symbols received by both parties; this takes O(n) time. Setting the next message requires a single activation of π 0 .

From large to constant alphabet: Overview
The coding scheme of Section 4 uses an alphabet whose size is polynomial in n, which is large enough to describe links to each of the n rounds of the protocol. We now show how to decrease the size of the alphabet to a constant. The main, and quite natural, idea is to encode each link using several symbols. We will use a constant-size alphabet Σ of size |Σ| ≈ C 2 , where C is some constant we set later as a function of ε, i.e., C = O ε (1). We interpret each symbol m ∈ Σ as the triplet (link, type, msg), where link ∈ {0, . . . , C}, type ∈ {std, start, stop, cont} and msg ∈ [C] ∪ {0, 1, ∅}.
In order to link to a transmission which is at most C transmissions back, the link field can be used directly to contain a relative pointer. That is, link = 1 means the previous transmission, link = 2 means the second previous transmission, etc. In this case, type = std and the msg field contains the payload: the bit b ∈ {0, 1} sent by the party according to π 0 (or msg = ∅ if the other party is to speak in π 0 ).
When the protocol needs to link to a transmission which is x > C transmissions back, we use a variable-length encoding of the relative pointer. Specifically, the encoding begins with a message with type = start. Next, the value of x is encoded in the msg fields of the next log_C x transmissions. In each such segment (except for the first one), the link field still points to the last uncorrupted transmission. The type field equals cont to denote that this transmission is a (middle) fragment of the encoding. On the last fragment, type = stop denotes the end of the encoding.
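The variable-length encoding of a far pointer can be illustrated by a minimal sketch. It assumes that x is written in base C with the most significant digit first, that the start fragment carries no digit, and that the last digit travels in the stop fragment; none of these conventions is fixed by the text above. The names `encode_far_link` and `decode_far_link` are hypothetical, and the link field is omitted.

```python
from typing import List, Tuple

Fragment = Tuple[str, int]  # (type, msg); the link field is left out of this sketch


def encode_far_link(x: int, C: int) -> List[Fragment]:
    """Split a relative pointer x > C into base-C digits, one digit per fragment."""
    if C < 2 or x <= C:
        raise ValueError("expects C >= 2 and a far pointer x > C")
    digits = []
    while x > 0:
        digits.append(x % C)  # least significant digit first
        x //= C
    digits.reverse()          # send the most significant digit first (assumption)
    frags: List[Fragment] = [("start", 0)]       # opening marker, no digit (assumption)
    frags += [("cont", d) for d in digits[:-1]]  # middle fragments
    frags.append(("stop", digits[-1]))           # the last digit closes the encoding
    return frags


def decode_far_link(frags: List[Fragment], C: int) -> int:
    """Recover the pointer from the digits carried by cont and stop fragments."""
    x = 0
    for kind, digit in frags:
        if kind in ("cont", "stop"):
            x = x * C + digit
    return x
```

For instance, with C = 4 the pointer x = 10 is written as the digits 2, 2 and is sent as a start fragment followed by one cont fragment and one stop fragment.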
A possible problem occurs when a party wishes to send an encoding of some (large) value x, but during the transmission of this encoding many corruptions occur. Due to the noise, the link field of some specific segment of the encoding of x is too small to point to the previous segment. For example, say the two segments are y > C transmissions apart. In this case, the above encoding acts recursively. That is, we initiate a new encoding (for y) by sending a message with type = start, whilst the encoding of x is still in progress. In the following transmissions, the msg fields contain the value y. After all the bits of y have been transmitted, a message with type = stop indicates the end of y's encoding. Then, the encoding of x resumes from the point it stopped. Once all the fragments of x have been communicated, a message with type = stop indicates the end of x's encoding, and the protocol continues as before.
This encoding does not harm the rate of the coding: most of the time the pointer is small enough and fits in a single link field with no further encoding (type = std). A burst of t > C consecutive corruptions causes the addition of ⌈log C t⌉ transmissions that describe a pointer to t transmissions beforehand. It is not too difficult to verify that n/C is a bound on the total added communication due to these encodings. We can set C = 1/ε so that the added communication is bounded by εn transmissions.
These transmissions do not take part in the simulation of π 0 and can be considered as a "corruption" towards that goal (although they serve a critical role in generating the uncorrupted chain). We argue that the effect of these transmissions on the simulation of π 0 is at most as harmful as εn corrupted transmissions. It then follows that if the noise corrupts at most 1/5 − 2ε transmissions in each direction, the "effective" noise level (including transmissions used for encoding links) is bounded by 1/5 − ε, which is low enough to allow the correct simulation of π 0 .
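As a rough illustration of the overhead accounting above, the following sketch sums ⌈log_C t⌉ over all bursts of length t > C. The function names and the representation of a noise pattern as a list of burst lengths are assumptions made for this example only; the integer loop avoids floating-point logarithms.

```python
from typing import Iterable


def ceil_log(t: int, C: int) -> int:
    """Smallest integer d >= 0 with C**d >= t, computed with integers only."""
    d, power = 0, 1
    while power < t:
        power *= C
        d += 1
    return d


def encoding_overhead(bursts: Iterable[int], C: int) -> int:
    """Extra transmissions: every burst of length t > C costs ceil(log_C t)."""
    return sum(ceil_log(t, C) for t in bursts if t > C)
```

For example, with n = 10000 and C = 10, ten bursts of length 200 (a total of n/5 corruptions) cost 3 transmissions each, i.e., 30 in total, well below n/C = 1000.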

A coding scheme with a constant-size alphabet
Towards a scheme with a constant-size alphabet, let us (re)define some of the basic elements we use. Let C = 1/ε be a constant (without loss of generality, we assume C is an integer). We define our alphabet to be
$$\Sigma = \{0, \ldots, C\} \times \{\mathsf{std}, \mathsf{start}, \mathsf{stop}, \mathsf{cont}\} \times \big([C] \cup \{0, 1, \emptyset\}\big).$$
Every m ∈ Σ is interpreted as m = (link, type, msg), where link points to a previous symbol m ′ unless type = start, which indicates that the link to m ′ is encoded in the msg field of the next symbols. The type field indicates whether the encoding has been completed (type = stop) or is still going on (type = cont). We emphasize that whenever type ≠ start the link field indeed points to the previous uncorrupted transmission. We let link = 0 indicate the first message in the chain (no previous message).
For any m 1 , . . . , m t ∈ Σ, the "chain" of messages, Parse(m 1 , . . . , m t ), is determined by going over the chain link-by-link, until we hit the head of the chain (link = 0) or an encoded link (type ≠ std). In the latter case we collect the fragments of the link (recursively, in case we hit another instance of encoding before we are done collecting all the fragments of the current encoding), decode them and continue parsing from the transmission pointed to by the encoded value. The fragments that contain the encoding are omitted from the parsed output (so that the chain contains only the "real" messages of π 0 ). The Parse procedure is formally described in Algorithm 3. The coding scheme with a constant-size alphabet is given in Algorithm 4. It is very similar to the coding scheme of Algorithm 1 except for the handling of encoded links, i.e., the encoding of a far link and the parsing of a chain that contains encoded links.
Similar to Algorithm 1, the coding scheme of Algorithm 4 is clearly computationally-efficient.
Algorithm 4 A coding scheme with a constant-size alphabet (Alice's side)
Input: A binary alternating protocol π0 defined over noiseless channels with feedback; a noise parameter 1/5 − 2ε. Alice's input for π0 is x.
Without loss of generality, we assume log 2 C is an integer. The procedures Next and TempTranscript are as described in Algorithm 1. Let lastMsg be the offset to the latest uncorrupted round where Alice is the speaker.
8: Alice is the sender in π0, otherwise (or if π0 has terminated) b = ∅.
9: if lastMsg > C then ⊲ Encode link using multiple segments
10: Write lastMsg as a binary string s = s1s2 · · · st where ∀i, |si| = log C

The proof is similar to the proof of Proposition 4.9, once it has been verified that the new Parse procedure still takes linear time in n.

Analysis
Lemma 5.2. Let E denote the set of all the rounds where the transmission is uncorrupted and has type ≠ std (i.e., is a part of an encoding). Then |E| ≤ εn.
Proof. Any burst of t > C corruptions causes at most ⌈log C t⌉ uncorrupted transmissions with type ≠ std, which encode a link to t transmissions back. Due to the recursive manner of the encoding, a later burst of corruptions has no effect on the encoding of previous links; it only delays the rounds in which the first encoding is transmitted by the number of rounds needed to encode the link that comes after the later burst.
In other words, a burst of t corruptions followed by a burst of t ′ corruptions causes at most ⌈log C t⌉ + ⌈log C t ′ ⌉ rounds with type ≠ std. Since the total number of corrupted rounds (per party) is bounded by (1/5 − ε)n, the total encoding length (for that party) is bounded by n/C. Partition E into E A , E B , the encoding rounds on Alice's and Bob's sides, respectively. Assume that the noise pattern on Alice's transmissions is composed of bursts of lengths t 1 , t 2 , . . . , t k where for every i we have t i > C (otherwise t i does not add any transmissions with type ≠ std). Note that the above requirement implies that k < n/5C.
where the first inequality follows from Jensen's inequality. E B is bounded by the same value. The above function monotonically increases in [0, n/5]. The number of messages with type ≠ std is then upper bounded by the value of the function at k = n/5C,
We now prove that Algorithm 4 simulates π 0 correctly as long as the corruption level is below 1/5. The idea is to reduce Algorithm 4 to Algorithm 1. This is done by considering fragments of encoding as "corrupted" transmissions of Algorithm 1, while still obtaining the correct link from these encoded transmissions. Since the number of transmissions used for encodings is at most εn, they "increase" the effective noise level by this small amount, which is still tolerable for Algorithm 1.
Proof. Algorithm 4 differs from Algorithm 1 in one main aspect: rounds in which type ∈ {start, cont, stop}. Other than those rounds, the two algorithms behave exactly the same: given a similar transcript m 1 , . . . , m t for which type = std, they both generate exactly the same partial transcript, the same next message, and the same next speaker.
We can interpret any instance of Algorithm 4 as an instance of Algorithm 1 in which transmissions with type ≠ std correspond to "erased" transmissions in Algorithm 1: transmissions whose "link" part is invalid (hence, the parsed chain is empty). Formally, there exists a transformation that takes any transcript m = m 1 , . . . , m n generated by Algorithm 4 on the input x, y assuming a (1/5 − 2ε, 1/5 − 2ε)-corruption, and generates a transcript m ′ = m ′ 1 , . . . , m ′ n such that
1. m ′ is an instance of Algorithm 1 on the input x, y that suffers from a (1/5 − ε, 1/5 − ε)-corruption.
The transformation is as follows: if m i .type = std and m i .link points to a message m j with m j .type = std, then m ′ i .msg = m i .msg and m ′ i .link = j. If m j .type = stop then m ′ i .link = EffectiveAddress(m 1 , . . . , m j ). Other cases are irrelevant (m ′ i will be attributed to a corruption). That is, the transmissions contain (logically) the same messages and links except for transmissions that contain encoded links in m. These correspond to corrupted transmissions in m ′ . However, in every round i where m i links to the end of an encoded link (m j ), we set the link in m ′ i to EffectiveAddress(m 1 , . . . , m j ), i.e., to the last non-encoding uncorrupted transmission prior to m i .
Item 2 holds by induction. Assume that the claim holds for all rounds up to i. Since both algorithms generate the same parsed chain, they make identical decisions regarding the order of speaking and the identity of the next speaker. If the (i + 1)-th transmission in m links to a transmission more than C steps back, or if m i+1 .type ≠ std, then m ′ i+1 is assumed to be corrupted. In this case it holds that Parse(m 1 , . . . , m i+1 ) = Parse(m ′ 1 , . . . , m ′ i+1 ) = ∅. Otherwise, the (i + 1)-th transmission links to a transmission m j at most C steps back, and m i+1 .type = std. If m j .type = std, then m ′ i+1 .link = j and the claim holds. If m j .type = stop, then m ′ i+1 .link points to the link encoded by EffectiveAddress(m 1 , . . . , m i ). Since Parse in Algorithm 4 resolves the identity of the message prior to m i+1 as the one pointed to by EffectiveAddress(m 1 , . . . , m i ), it outputs the same sequence as Parse(m ′ 1 , . . . , m ′ i+1 ) does in Algorithm 1.
As a consequence of Item 2, the parsed chains, and hence the implied transcripts, are identical between the two instances for any i ∈ [n]. Therefore, for any round i, the transmission generated by Algorithm 1 given m ′ 1 , . . . , m ′ i−1 equals m ′ i defined by the above transformation, except for two cases: when m i is corrupted and when m i is an encoding (m i .type ≠ std). Lemma 5.2 bounds the number of encoded transmissions by εn. Hence, any instance with a (1/5 − 2ε, 1/5 − 2ε)-corruption in Algorithm 4 translates to an instance of Algorithm 1 with a (1/5 − ε, 1/5 − ε)-corruption.
The correctness of Algorithm 4 then follows from the correctness of Algorithm 1.

Applications for Circuits with Short-Circuit Noise
In this section, we prove our main theorems (Theorems 1.1 and 1.2). We show that the KW-transformation between formulas and protocols (and vice versa) extends to the noisy setting in a manner that preserves noise-resilience. Applying the results from Sections 4-5 to the realm of boolean formulas gives a construction that is resilient to an optimal level of noise, namely, a fraction of (1/5 − ε) of short-circuit gates in any input-to-output path. Additionally, the results of Section 3 imply that noise-resilience of 1/5 is maximal for formulas (assuming a polynomial overhead). In the following subsections, we show how to convert between formulas and protocols while preserving their noise-resilience. If we start with a formula that is resilient to (α, β)-corruptions, our transformation yields a protocol that is resilient to (α, β)-corruptions (Proposition 6.6). Moreover, given a protocol that is resilient to (α, β)-corruptions, the transformation yields a formula that is resilient to a similar level of noise (Proposition 6.9).

Preliminaries
Formulas A formula F (z) over n-bit inputs z ∈ {0, 1} n is a k-ary tree where each node is a {∧, ∨} gate with fan-in k and fan-out 1. (While our results apply to any k, in this section we will usually assume k = 2 for simplicity.) Each leaf is a literal (either z i or ¬z i ). The value of a node v given the input z ∈ {0, 1} n , denoted v(z) ∈ {0, 1}, is computed in a recursive manner: the value of a leaf is the value of the literal (given the specific input z); the value of an ∧ gate is the boolean AND of the values of its k descendants, v 0 , · · · , v k−1 , that is, v(z) = v 0 (z) ∧ · · · ∧ v k−1 (z). The value of an ∨ gate is v(z) = v 0 (z) ∨ · · · ∨ v k−1 (z).
The output of the formula on z, F (z), is the value of the root node. We say that F computes the function f : {0, 1} n → {0, 1} if for any z ∈ {0, 1} n it holds that F (z) = f (z). The depth of a formula, denoted depth(F ), is the length of the longest root-to-leaf path in it. The size of a formula, denoted |F |, is the number of nodes it contains. We denote by V ∧ the set of all the ∧ nodes, and by V ∨ the set of all the ∨ nodes.
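The definitions above translate directly into a small recursive evaluator. The following is a minimal sketch; the class names `Lit` and `Gate`, the 0-based variable indices, and the convention that a single literal has depth 0 are choices made for this illustration only.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class Lit:
    index: int             # position i of the variable z_i (0-based here)
    negated: bool = False  # True for the literal "not z_i"


@dataclass(frozen=True)
class Gate:
    op: str                        # "and" or "or"
    children: Tuple["Node", ...]   # the k inputs of the gate


Node = Union[Lit, Gate]


def evaluate(v: Node, z: Sequence[int]) -> int:
    """Value v(z) of the node v on the input z, computed recursively."""
    if isinstance(v, Lit):
        return z[v.index] ^ int(v.negated)
    vals = [evaluate(c, z) for c in v.children]
    return int(all(vals)) if v.op == "and" else int(any(vals))


def depth(v: Node) -> int:
    """Number of gates on the longest root-to-leaf path."""
    return 0 if isinstance(v, Lit) else 1 + max(depth(c) for c in v.children)


def size(v: Node) -> int:
    """Number of nodes (gates and leaves) in the formula tree."""
    return 1 if isinstance(v, Lit) else 1 + sum(size(c) for c in v.children)
```

For instance, the formula (z_0 ∧ ¬z_1) ∨ (¬z_0 ∧ z_1) computes the XOR of two bits; in this representation it has depth 2 and size 7.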

Karchmer-Wigderson Games
For any boolean function f : {0, 1} n → {0, 1}, the Karchmer-Wigderson game is the following interactive task. Alice is given an input x ∈ f −1 (0) and Bob gets y ∈ f −1 (1). Their task is to find an index i ∈ [n] such that x i ≠ y i . We are guaranteed that such an index exists since f (x) = 0 while f (y) = 1. We denote the above task by KW f . Karchmer and Wigderson [KW90] proved the following relation between formulas and protocols.
Theorem 6.1 ( [KW90]). For any function f : {0, 1} n → {0, 1}, the depth of the optimal formula for f equals the length of the optimal interactive protocol for KW f .
The above theorem is proven by showing a conversion between a formula for f and a protocol for KW f , which we term the KW-transformation. In this conversion, the formula-tree is converted into a protocol tree, where every ∧-gate becomes a node where Alice speaks and every ∨-gate becomes a node where Bob speaks. For a node v, the mapping a v : {0, 1} n → {0, 1} is set as follows. For a given input z, consider the evaluation of the formula F on z. Suppose the node v is an ∧ gate, so that we can write v(z) = v 0 (z) ∧ v 1 (z) where v 0 and v 1 are v's left and right descendants, respectively. If v 0 (z) = 0 we set a v (z) = 0; otherwise we set a v (z) = 1. For an ∨ gate, the mapping b v is defined analogously. If the protocol reaches a leaf which is marked with the literal z i or ¬z i , it outputs i. For technical reasons we will assume that the protocol outputs either z i or ¬z i rather than just giving the index i. Note that the literal then always evaluates to the value of f ; in this work, a KW f protocol must satisfy this additional requirement.
It is easy to verify that the following invariant holds: for every node v reached by the protocol on some input (x, y) ∈ f −1 (0) × f −1 (1), it holds that v(x) = 0 while v(y) = 1. This holds for the root node by definition, and our selection of mappings a v , b v maintains this property. Specifically, for an ∧-gate v for which v(x) = 0 it must hold that at least one of the gate's inputs is zero. Indeed, the way we chose a v advances the protocol to a child node that evaluates to 0. Since v(y) = 1, both children of v evaluate to 1 on y; thus both descendants satisfy the invariant. The analysis for an ∨ gate is symmetric. It follows that once the protocol reaches a leaf (the literal z i or ¬z i ), that literal evaluates differently on x and on y, so x i ≠ y i as required. In particular, the literal evaluates to 0 on x and to 1 on y.
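The walk down the formula tree described above can be sketched as follows for fan-in 2. Formulas are represented as nested tuples, `("and", left, right)`, `("or", left, right)` and `("lit", i, negated)`; this representation, the function names, and the exact rule used at ∨-gates (send 0 if the left child evaluates to 1 on y) are assumptions of this example.

```python
def evaluate(v, z):
    """Noiseless value of the (sub)formula v on the input z."""
    if v[0] == "lit":
        return z[v[1]] ^ int(v[2])
    vals = [evaluate(c, z) for c in v[1:]]
    return int(all(vals)) if v[0] == "and" else int(any(vals))


def kw_protocol(F, x, y):
    """Simulate the KW protocol induced by F; returns (index, transcript)."""
    if evaluate(F, x) != 0 or evaluate(F, y) != 1:
        raise ValueError("expects F(x) = 0 and F(y) = 1")
    v, transcript = F, []
    while v[0] != "lit":
        if v[0] == "and":
            # Alice speaks: she points to a child that evaluates to 0 on x.
            bit = 0 if evaluate(v[1], x) == 0 else 1
        else:
            # Bob speaks: he points to a child that evaluates to 1 on y.
            bit = 0 if evaluate(v[1], y) == 1 else 1
        transcript.append(bit)
        v = v[1 + bit]  # both parties move to the announced child
    return v[1], transcript
```

By the invariant, the returned index i always satisfies x[i] != y[i], and the transcript length equals the depth of the leaf that was reached.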
The same reasoning allows us to convert a protocol for KW f into a formula for f : consider the protocol tree and convert each (reachable) node where Alice speaks to an ∧-gate and each (reachable) node where Bob speaks to an ∨-gate. If the protocol outputs z i or ¬z i at some leaf, that literal is assigned to that leaf.
Proving that this conversion yields a formula for f is by induction on the length of the protocol. If |KW f | = 0, then the protocol outputs (say) z i without communicating. It is clear that all inputs in the domain satisfy x i ≠ y i , and that x i = 0 while y i = 1 (negate these values if the output of the protocol is ¬z i ). For the induction step, assume, without loss of generality, that Alice is to speak first. For some partition X 0 ∪ X 1 = f −1 (0), Alice sends 0 when x ∈ X 0 and otherwise she sends 1. By induction, the continuation of the protocol can be converted into formulas F 0 and F 1 (corresponding to the cases where Alice sends 0 and 1, respectively), for which F 0 (x) = 0 when x ∈ X 0 , F 1 (x) = 0 when x ∈ X 1 , and F 0 (y) = F 1 (y) = 1 when y ∈ f −1 (1). Taking F = F 0 ∧ F 1 completes the proof. The other case, where Bob is to speak first, is symmetric. See [KW90] for further details about the KW-transformation from formulas to protocols and vice versa, and for the formal proofs.
Remark 3. In the above, formulas are assumed to have fan-in 2 and protocols are assumed to communicate bits. However, the same reasoning and conversion also applies for a more general case, where each ∧-gate and ∨-gate has fan-in k, and the protocol sends symbols from an alphabet of size |Σ| = k.
Furthermore, while our claims below are stated and proved assuming fan-in 2, all our claims apply to any arbitrary fan-in k.
Short-Circuit Noise Short-circuit noise replaces the value of a specific node with the value of one of its descendants. A noise pattern E ∈ {0, 1, . . . , k − 1, * } |V∧|∪|V∨| defines for each node whether it is short-circuited and to which input. Specifically, if for some node v, E v = * , then the gate is not corrupted and it behaves as defined above. Otherwise, the value of the node is the value of its E v -th descendant, v(z) = v Ev (z). We denote by F E the formula with short-circuit pattern E; we sometimes write F for the formula with no short-circuit noise, i.e., with the noise pattern E = * |V∧|∪|V∨| .
We say that a circuit is resilient to a noise pattern E if for any z ∈ {0, 1} n it holds that F (z) = F E (z).
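Evaluation under a short-circuit pattern can be sketched as below. Formulas are nested tuples `("and", ...)`, `("or", ...)`, `("lit", i, negated)`; a gate is identified by the tuple of child indices leading to it from the root, and the pattern E is a dictionary from such paths to the index of the input the gate is short-circuited to. A missing key plays the role of *. These representation choices and the function names are assumptions of this example.

```python
from itertools import product


def evaluate_noisy(v, z, E, path=()):
    """Value of v on z under the short-circuit pattern E."""
    if v[0] == "lit":
        return z[v[1]] ^ int(v[2])
    children = v[1:]
    fault = E.get(path)  # None corresponds to '*': the gate is intact
    if fault is not None:
        # A short-circuited gate simply forwards its fault-th input.
        return evaluate_noisy(children[fault], z, E, path + (fault,))
    vals = [evaluate_noisy(c, z, E, path + (i,)) for i, c in enumerate(children)]
    return int(all(vals)) if v[0] == "and" else int(any(vals))


def is_resilient_to(F, n, E):
    """Check F(z) = F_E(z) for every input z in {0,1}^n (brute force)."""
    return all(
        evaluate_noisy(F, z, {}) == evaluate_noisy(F, z, E)
        for z in product([0, 1], repeat=n)
    )
```

For example, the single gate z_0 ∧ z_1 is not resilient to the pattern that short-circuits the root to its first input: on z = (1, 0) the noiseless value is 0 while the noisy value is z_0 = 1.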
Definition 6.1. We say that F is resilient to a δ-fraction of noise if it is resilient to all noise patterns E in which the fraction of corrupted gates in any input-to-output path in F is at most δ.
We can also be more precise and distinguish between noise in ∧-gates and ∨-gates.
Definition 6.2. An (α, β)-corruption of short-circuit errors, is a noise pattern on a formula F of depth n that changes at most αn ∧-gates and at most βn ∨-gates in any input-to-output path in F .
Remark 4. Note that an (α, β)-corruption is defined with respect to the maximal depth n of the formula. If the formula has shorter paths, then α and β no longer describe the fraction of corrupted gates in these paths. Hence, we will assume F 's underlying tree is a perfect k-ary tree: every inner node has exactly k children, and all the leaves are of the same depth n. We denote these as perfect formulas. It is easy to convert every formula of depth n to be perfect without affecting its function.
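One possible way to make a fan-in-2 formula perfect is sketched below: every leaf ℓ that sits above the target depth is replaced by the gate ℓ ∧ ℓ, repeatedly, until all leaves are at the same depth. This padding rule is an assumption of the example, not necessarily the conversion intended above; it preserves the computed function because ℓ ∧ ℓ = ℓ. Formulas are nested tuples `("and", l, r)`, `("or", l, r)`, `("lit", i, negated)`.

```python
def evaluate(v, z):
    """Noiseless value of the (sub)formula v on the input z."""
    if v[0] == "lit":
        return z[v[1]] ^ int(v[2])
    vals = [evaluate(c, z) for c in v[1:]]
    return int(all(vals)) if v[0] == "and" else int(any(vals))


def leaf_depths(v, d=0):
    """Set of depths at which the leaves of v appear."""
    if v[0] == "lit":
        return {d}
    return set().union(*(leaf_depths(c, d + 1) for c in v[1:]))


def make_perfect(v, target):
    """Pad v so that every leaf lies exactly `target` levels below the root of v."""
    if v[0] == "lit":
        if target == 0:
            return v
        padded = make_perfect(v, target - 1)
        return ("and", padded, padded)  # l AND l computes the same value as l
    if target == 0:
        raise ValueError("target depth is smaller than the depth of the formula")
    return (v[0],) + tuple(make_perfect(c, target - 1) for c in v[1:])
```

Note that this padding adds ∧-gates to short paths, so it changes the number of gates of each type per path; the sketch only illustrates that the function itself is unaffected.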
The following is immediately clear by definition.
Claim 6.2. If, for some δ > 0, the formula F is resilient to any (δ, δ)-corruption of short-circuit errors, then F is also resilient to a δ-fraction of noise.
On the surface, the other direction does not necessarily hold: (δ, δ)-corruption may corrupt up to a fraction 2δ of the gates in each path, hence, resilience to a δ-fraction appears to be insufficient to resist all (δ, δ)-corruptions. Nevertheless, we argue that these two notions are indeed equivalent. The reason for this is that a short-circuit in an ∧-gate can only turn the output from 0 to 1. A short-circuit in an ∨-gate can only turn the output from 1 to 0. Then, if a formula evaluates to 1 on some input, the output remains 1 regardless of any amount of short-circuited ∧-gates. If the output is 0, it remains so regardless of any number of short-circuited ∨-gates. This observation was already made by Kalai et al. [KLR12]. Lemma 6.3 ([KLR12, Claim 7]). Let F be a formula, z an input and E any error pattern. Let E ∧ be the error pattern induced by E on the ∧-gates alone (no errors on the ∨-gates); Let E ∨ be the error pattern induced by E on the ∨-gates alone. It holds that if F E∧ (z) = 0 then F E (z) = 0, and if F E∨ (z) = 1 then F E (z) = 1.
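The monotonicity statement of Lemma 6.3 can be checked by brute force on small fan-in-2 formulas. The sketch below enumerates every noise pattern E, splits it into its ∧-part and ∨-part, and tests both implications on every input. The tuple representation of formulas, the path-based gate identifiers and all function names are assumptions of this example; it is a sanity check on small instances, not a proof.

```python
from itertools import product


def evaluate_noisy(v, z, E, path=()):
    """Value of v on z under the short-circuit pattern E (dict: path -> input)."""
    if v[0] == "lit":
        return z[v[1]] ^ int(v[2])
    children = v[1:]
    fault = E.get(path)
    if fault is not None:
        return evaluate_noisy(children[fault], z, E, path + (fault,))
    vals = [evaluate_noisy(c, z, E, path + (i,)) for i, c in enumerate(children)]
    return int(all(vals)) if v[0] == "and" else int(any(vals))


def gate_paths(v, path=()):
    """List of (path, gate type) for every gate in the formula."""
    if v[0] == "lit":
        return []
    found = [(path, v[0])]
    for i, c in enumerate(v[1:]):
        found += gate_paths(c, path + (i,))
    return found


def lemma_holds(F, n):
    """Check both implications of the lemma for all patterns and all inputs."""
    gates = gate_paths(F)
    op_of = dict(gates)
    for choice in product([None, 0, 1], repeat=len(gates)):  # fan-in 2 assumed
        E = {p: c for (p, _), c in zip(gates, choice) if c is not None}
        E_and = {p: c for p, c in E.items() if op_of[p] == "and"}
        E_or = {p: c for p, c in E.items() if op_of[p] == "or"}
        for z in product([0, 1], repeat=n):
            full = evaluate_noisy(F, z, E)
            if evaluate_noisy(F, z, E_and) == 0 and full != 0:
                return False
            if evaluate_noisy(F, z, E_or) == 1 and full != 1:
                return False
    return True
```

The enumeration has 3^g patterns for g gates, so it is only practical for formulas with a handful of gates.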
The above lemma then implies that resilience to a δ-fraction of noise corresponds to resilience to the same fraction of noise in both types of gates.
Lemma 6.4. If, for some δ > 0, the perfect formula F is resilient to a fraction δ of short-circuit noise, then F is also resilient to any (δ, δ)-corruption.
Proof. Assume F has depth n and consider any inputs x, y such that F (x) = 0 and F (y) = 1.
Let E be an arbitrary (δ, δ)-corruption pattern. In particular, E short-circuits up to δn of the ∧-gates and additionally up to δn of the ∨-gates in any input-to-output path. Let E ∧ be the error pattern induced by E on the ∧-gates alone and let E ∨ be the error pattern induced by E on the ∨-gates alone. Note that both the noise patterns E ∨ and E ∧ corrupt at most a δ-fraction of the gates in each path.
Since F is resilient to a δ-fraction of noise, we have
$$F_{E_\wedge}(x) = F(x) = 0, \qquad (8)$$
$$F_{E_\vee}(y) = F(y) = 1. \qquad (9)$$
Lemma 6.3 and Eq. (8) then imply that F E (x) = 0. Similarly, the lemma and Eq. (9) imply that F E (y) = 1. Since the above holds for an arbitrary (δ, δ)-corruption E and for all inputs x, y, we get that F is resilient to (δ, δ)-corruptions.
Following the mapping between formulas and protocols, the authors in [KLR12] made the observation that a short-circuit error in a formula translates to channel noise in the equivalent KW protocol, assuming both parties learn the noise, i.e., assuming noiseless feedback. Specifically, the feedback allows both parties to continue to the same node in the protocol tree, despite the noise. Thus, it is crucial in order to keep the parties synchronized. We will sometimes abuse notation and identify a short-circuit noise pattern with a transmission noise pattern for a formula F and a protocol π that share the same underlying tree structure. Furthermore, we will denote the two different objects with the same identifier E.

From Formulas to Protocols
In this part we describe a variant of the KW-transformation, which we call the resilient KW-transformation from formulas to protocols. We prove that there is a way to choose the mappings a v (x), b v (y) in the protocol tree in a way that preserves resilience, that is, if F is resilient against (α, β)-corruptions, then the resulting interactive protocol will feature the same resilience.
Recall that in the standard KW-transformation for some formula F that computes f , for any (x, y) ∈ f −1 (0) × f −1 (1), an invariant that v(x) = 0 and v(y) = 1 holds for any node v reached by the protocol given by the transformation. This invariant is the key for the one-to-one correspondence between the formula and the protocol. Keeping this invariant in the noiseless case amounts to selecting the mapping a v (x) to be the child of v that evaluates to 0 on x, and b v (y) to be the child of v that evaluates to 1 on y.
The main observation is that, given any (α, β)-resilient formula F (i.e., a formula that is resilient to any (α, β)-corruption), we can choose the mapping a v (x) as the child of v that evaluates to 0 given any (α, β)-corruption: such a child always exists! Similarly, the mapping b v (y) is set to be the child of v that evaluates to 1 given any (α, β)-corruption. This allows us to maintain the invariant that v(x) = 0 and v(y) = 1 for any node v reached by the protocol, regardless of the possible noise pattern. Keeping this invariant leads to proving that the protocol correctly computes KW f despite (α, β)-corruptions.
We begin by introducing our variant of the KW-transformation that preserves resilience.
1. The formula-tree is converted into a protocol tree, where every ∧-gate becomes a node where Alice speaks and every ∨-gate becomes a node where Bob speaks.
2. Order the nodes in the protocol tree in a BFS order starting from the root, and determine the mappings associated with each node in that order (i.e., before setting the mapping of some node, set the mapping of all its ancestors).
3. For any inner node v, let S (v,x,y) be the set of noise patterns E such that E is a (α, β)-corruption and such that an instance of π given the input (x, y) and noise E causes the protocol to reach the node v (note that this process is well defined due to the BFS order).

4. If v is an ∧-node, for any x, the mapping a v (x) maps to the child w for which the subformula of F rooted at w evaluates to 0 on x for all noise patterns $E \in \bigcup_{y' \in F^{-1}(1)} S_{(v,x,y')}$. If v is an ∨-node, then for any y, the map b v (y) maps to the child w for which the subformula of F rooted at w evaluates to 1 on y for all noise patterns $E \in \bigcup_{x' \in F^{-1}(0)} S_{(v,x',y)}$.
5. A leaf of F marked with the literal z i or ¬z i becomes a leaf (output) of the protocol with the same literal.
Note that the mappings a v (x), b v (y) defined in item (4) may be partial functions. Specifically, if an ∧-node v is not reachable given the input x with any y and any valid noise, then the definition of a v on that input x has no meaning.
Proposition 6.5 guarantees that for any reachable node v we can always find a child w that satisfies the condition of item (4). The proposition further proves that every node v reached by the constructed protocol π (assuming any valid noise) satisfies the invariant that v(x) = 0 and v(y) = 1. Similar to the noiseless KW-transformation, this invariant would imply that π correctly computes KW f in a resilient manner (Proposition 6.6).
Proposition 6.5. Let F (z) be an (α, β)-resilient formula, and consider the resilient KW-transformation of F (Definition 6.3). For any node v reached during the construction, and for any (x, y) ∈ F −1 (0) × F −1 (1) such that the partial protocol π constructed thus far reaches v on x, y and some (α, β)-corruption E, the following holds.
(a) In F E , it holds that v(x) = 0 and v(y) = 1.
(b) Let F 0 and F 1 be the subformulas (of F ) rooted at the left and right child of v, respectively. If v is an ∧-node, there is at least one subformula G ∈ {F 0 , F 1 } that satisfies G E (x) = 0 for all noise patterns $E \in \bigcup_{y' \in F^{-1}(1)} S_{(v,x,y')}$. Symmetrically, if v is an ∨-node, there is at least one subformula G ∈ {F 0 , F 1 } that satisfies G E (y) = 1 for all noise patterns $E \in \bigcup_{x' \in F^{-1}(0)} S_{(v,x',y)}$.
Proof. Let us begin with property (a). The proof is by induction on the depth of v in F E . For the base case, when v is the root, v(x) = F (x) = 0 and v(y) = F (y) = 1 in F E since E is an (α, β)-corruption to which F is resilient. Now, let v be an arbitrary node and let w be its parent in F E ; we denote by u the other child of w. Property (a) holds for w by the induction hypothesis. Consider the case where w is an ∧-gate (the case of an ∨-gate is shown in a similar manner). There are two cases according to the noise associated with w.
If there is no noise at w, E w = * , then for any input z it holds that w(z) = v(z) ∧ u(z). Using the induction hypothesis, w(y) = 1, and it must hold that v(y) = u(y) = 1. Additionally, w(x) = 0, therefore at least one of v(x) and u(x) must be 0. The protocol π E proceeds to v only if v(x) = 0 for all noise patterns in $\bigcup_{y' \in F^{-1}(1)} S_{(v,x,y')}$, and, in particular, v(x) = 0 for the noise E, which clearly belongs to S (v,x,y) . If the protocol does not proceed to v, then v is not reachable for (x, y), E and the statement holds vacuously.
The other case is when there is noise at w. Then π E reaches v only if E w directs to the child v. In this case w(z) = v(z), and thus v(x) = w(x) = 0 and v(y) = w(y) = 1 by the induction hypothesis.
We continue to the proof of property (b). The base case for v being the root node is a simple special case of the proof given below for an arbitrary v.
Let v be given and assume that the claim holds for all nodes v ′ that come before v in the BFS ordering. Specifically, it holds for all the ancestors of v. We show that the claim holds for v as well. Consider the case where v is an ∧ node (the other case is similar). Assume towards contradiction that the claim does not hold for v. That is, there are two noise patterns $E_0, E_1 \in \bigcup_{y' \in F^{-1}(1)} S_{(v,x,y')}$ such that (F 0 ) E0 (x) = 1 and (F 1 ) E1 (x) = 1.
Define the noise pattern E * (over the nodes of F ) in the following way. For any ancestor of v, E * is defined as the ∨-minimal noise-pattern between E 0 and E 1 , i.e., the one that induces the least noise on ∨-gates in the root-to-v path. Furthermore, for any ∧-gate u in the root-to-v path, if either E 0 or E 1 contain no noise at u, set E * to have no noise at that gate. Otherwise, both E 0 and E 1 have noise at u and since both reach v, the noise must be the same; in this case E * contains the same noise for u as E 0 and E 1 . For the nodes that belong to the subformula F 0 , the noise E * is identical to E 0 , and for nodes that belong to the subformula F 1 , E * is identical to E 1 . For all other nodes there is no noise in E * .
Clearly by this construction, E * is an (α, β)-corruption, since compared to either E 0 or E 1 , we only reduced the amount of corruptions in both ∧ and ∨ gates between the root and v (and kept the same number of corruptions below v). Furthermore, there must exist some y ′ such that π E * (x, y ′ ) reaches v: assume E 0 was the ∨-minimal pattern. Since $E_0 \in \bigcup_{y' \in F^{-1}(1)} S_{(v,x,y')}$ there exists y ′ for which π E0 (x, y ′ ) reaches v. We argue that π E * (x, y ′ ) also reaches v. Indeed, in any ∨-gate π E * (x, y ′ ) behaves exactly like π E0 (x, y ′ ) since the noise in both is identical. For any ∧-gate u, E 0 may have noise in u while E 1 (and thus, E * ) does not. However, there exists y ′′ such that π E1 (x, y ′′ ) reaches v. Hence, it also reaches u, and it also advances to the same child as π E0 (x, y ′ ) does when it reaches u. Since u is an ∧-gate, this decision depends only on x. By the above we learn that if the protocol reaches u and there is no noise, it advances to the same child determined by the noise E 0 at u. Therefore, π E * (x, y ′ ) takes the same child of u as π E0 (x, y ′ ). It follows that π E * (x, y ′ ) reaches v, and $E^* \in \bigcup_{y' \in F^{-1}(1)} S_{(v,x,y')}$.
Additionally, in F E * , the node v evaluates to 1 on x, because (F 0 ) E * (x) = (F 0 ) E0 (x) = 1 and (F 1 ) E * (x) = (F 1 ) E1 (x) = 1. But this contradicts property (a), asserted at the beginning of this proof, that for any noise E (and specifically for E * ), any node v that is reachable by π E * (x, y ′ ) must evaluate to 0 on x. Therefore, at least one of F 0 (x) and F 1 (x) evaluates to 0 on all noise patterns within the scope.
With the above we can show our main proposition for converting formulas to protocols in a noise-preserving way.
Proposition 6.6. Let F be a perfect formula that computes the function f and is resilient to (α, β)-corruption of short-circuit gates in every input-to-output path. Then, the resilient KW-transformation yields an interactive protocol π over channels with feedback, that solves KW f and is resilient to (α, β)-corruptions.
Proof. Let E be a given (α, β)-corruption, and let π E be the protocol defined above for F assuming the transmission noise induced by E. We claim that the protocol π E , that is, the protocol π under the noise E, computes KW f , which means that π is an (α, β)-resilient protocol for KW f . Say that on inputs (x, y) ∈ F −1 (0) × F −1 (1) the protocol terminates at a leaf v marked with either z i or ¬z i . By Proposition 6.5 it holds that v(x) = 0 while v(y) = 1 in F E (and thus in F ), which implies that x i ≠ y i . Note that the literal evaluates to the output of the function, as we additionally require from KW f protocols.
The conversion from resilient formulas into resilient protocols in Proposition 6.6 implies an upper bound on the maximal resilience of formulas, and proves Theorem 1.2.
Theorem 6.7. There exists a function f : {0, 1} n → Z such that no formula F that computes f with fan-in k and depth less than r < 5 6 n log 2k is resilient to a fraction of 1/5 of short-circuit noise.
Let F be a perfect formula that computes par(z) with AND/OR gates of fan-in k and depth(F ) < 5 6 n log 2k . Assume that F is resilient to a fraction of 1/5 of short-circuit noise. Lemma 6.4 shows that F is also resilient to (1/5, 1/5)-corruptions of short-circuits. Then, using Proposition 6.6 we obtain an interactive protocol π for KW par of length |π| = depth(F ) < 5 6 n log 2k that communicates symbols from an alphabet of size |Σ| = k, and is resilient to (1/5, 1/5)-corruptions. This contradicts Theorem 3.4.
Note that computing the parity of n bits can be done with a formula of depth O(log n). However, the above theorem shows that any resilient formula for the parity function will have an exponential blow-up in depth, and thus exponential blow-up in size.
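The $O(\log n)$ depth mentioned in the note can be checked on small cases with the textbook divide-and-conquer construction, sketched below in Python. This is an assumption-laden illustration rather than a construction taken from the text: the tuple encoding and the names `parity_formula`, `depth`, and `evaluate` are introduced here, gates have fan-in 2, and negations are pushed to the leaves.

```python
from itertools import product

def parity_formula(variables, negate=False):
    """AND/OR formula for the parity of the given variable indices
    (or its negation), with negations only at the leaves."""
    if len(variables) == 1:
        return ("lit", variables[0], negate)
    mid = len(variables) // 2
    left, right = variables[:mid], variables[mid:]
    L, not_L = parity_formula(left), parity_formula(left, True)
    R, not_R = parity_formula(right), parity_formula(right, True)
    if negate:
        # NOT(L xor R) = (L and R) or (not L and not R)
        return ("or", [("and", [L, R]), ("and", [not_L, not_R])])
    # L xor R = (L and not R) or (not L and R)
    return ("or", [("and", [L, not_R]), ("and", [not_L, R])])

def depth(node):
    """Number of gates on the longest root-to-leaf path."""
    if node[0] == "lit":
        return 0
    return 1 + max(depth(c) for c in node[1])

def evaluate(node, z):
    """Noiseless evaluation of the formula on input z."""
    if node[0] == "lit":
        _, i, negated = node
        return 1 - z[i] if negated else z[i]
    op, children = node
    values = [evaluate(c, z) for c in children]
    return min(values) if op == "and" else max(values)

F8 = parity_formula(list(range(8)))
# Each halving step costs one OR level and one AND level.
assert depth(F8) == 6
assert all(evaluate(F8, z) == sum(z) % 2 for z in product((0, 1), repeat=8))
```

Each halving of the variable set adds two levels, so for $n$ a power of two this sketch has depth $2\log_2 n$, which is the non-resilient baseline that the theorem compares against.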
Corollary 6.8. There is no coding scheme that converts any formula $F$ of size $s$ into a formula $F'$ of size $o(\exp(s))$, such that $F'$ computes the same function as $F$ and is resilient to a $1/5$-fraction of short-circuit gates on every input-to-output path.

From Protocols to Formulas
In this part we show the other direction of the resilient KW-transformation, namely, that a resilient protocol can be transformed into a resilient formula, with the same resilience level. This result is based on the result in [KLR12], adapted to the setting of (α, β)-corruptions (rather than resilience to δ-fraction of noise). Then, we can use our coding scheme that is resilient against (1/5 − ε, 1/5 − ε)-corruptions in order to transform any formula F into a (1/5 − ε, 1/5 − ε)-resilient version.
Proposition 6.9. Let $\pi$ be a protocol that solves $KW_f$ for some function $f$ and is resilient to $(\alpha, \beta)$-corruptions. The KW-transformation on the reachable protocol tree of $\pi$ yields a formula $F$ that computes $f$ and is resilient to any $(\alpha, \beta)$-corruption of short-circuit noise in any of its input-to-output paths.
The above proposition is, in fact, a reformulation of a result by Kalai, Lewko, and Rao [KLR12], implied by Lemma 6.3 and the following.
Lemma 6.10 ([KLR12, Lemma 8]). Let $f$ be a boolean function, and let $\pi$ be a protocol with root $p_{\mathrm{root}}$. Let $T \subset f^{-1}(0) \times ([k] \cup \{*\})^{V_A}$ and $U \subset f^{-1}(1) \times ([k] \cup \{*\})^{V_B}$ be two nonempty sets such that the protocol $\pi$ solves $KW_f$ on every pair of input and noise in $T \times U$, and assume that any vertex that is a descendant of $p_{\mathrm{root}}$ can be reached using some input and noise from $T \times U$.
Then there is a formula $F$ that is obtained by replacing every vertex where Alice speaks with an $\wedge$ gate, every vertex where Bob speaks with an $\vee$ gate and every leaf with a literal, such that for every $(x, E_A) \in T$, $(y, E_B) \in U$ it holds that $F_{E_\wedge}(x) = 0$ and $F_{E_\vee}(y) = 1$, where $E_\wedge$ is $E_A$ on Alice's vertices and $*$ on Bob's vertices, and $E_\vee$ is $*$ on Alice's vertices and $E_B$ on Bob's vertices.
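The syntactic replacement in the lemma, together with the short-circuit semantics, can be sketched in Python as follows. The encoding is hypothetical and introduced only for this illustration: protocol vertices are tuples tagged `"alice"`, `"bob"`, or `"leaf"`, a noise pattern is a dictionary mapping a gate's position (the tuple of child indices leading to it from the root) to the index of the input it is short-circuited to, and a gate missing from the dictionary plays the role of $*$.

```python
from itertools import product

def to_formula(vertex):
    """Replace Alice's vertices by AND gates, Bob's by OR gates, leaves by literals."""
    if vertex[0] == "leaf":
        _, i, negated = vertex
        return ("lit", i, negated)
    speaker, children = vertex
    op = "and" if speaker == "alice" else "or"
    return (op, [to_formula(c) for c in children])

def evaluate(node, z, noise=None, path=()):
    """Evaluate under short-circuit noise: a noisy gate outputs one of its inputs."""
    noise = noise or {}
    if node[0] == "lit":
        _, i, negated = node
        return 1 - z[i] if negated else z[i]
    op, children = node
    if path in noise:
        j = noise[path]  # the gate is short-circuited to its j-th input
        return evaluate(children[j], z, noise, path + (j,))
    values = [evaluate(c, z, noise, path + (j,)) for j, c in enumerate(children)]
    return min(values) if op == "and" else max(values)

# Toy protocol tree for KW of two-bit XOR: Bob speaks first, then Alice.
TREE = ("bob", [("alice", [("leaf", 0, False), ("leaf", 1, True)]),
                ("alice", [("leaf", 0, True), ("leaf", 1, False)])])
F = to_formula(TREE)

for z in product((0, 1), repeat=2):
    assert evaluate(F, z) == z[0] ^ z[1]       # noiseless: F computes XOR
    for j in (0, 1):
        # Noise on the OR gate alone can only lower the output.
        assert evaluate(F, z, {(): j}) <= evaluate(F, z)
```

In this toy tree, short-circuiting the second $\wedge$ gate to its first input (`{(1,): 0}`) changes the output on $x = (0, 0)$ from 0 to 1, so the resulting formula is not resilient; the example only illustrates the replacement rule and the noise semantics. Noise on an $\vee$ gate can only lower the output and noise on an $\wedge$ gate can only raise it, which matches the lemma's pairing of $E_\wedge$ with inputs from $f^{-1}(0)$ and of $E_\vee$ with inputs from $f^{-1}(1)$.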
Using our coding scheme that is resilient to (1/5 − ε, 1/5 − ε)-corruptions (Algorithm 4) we get that we can fortify any formula F so it becomes resilient to a (1/5 − ε)-fraction of short-circuit noise, with only polynomial growth in size.
Theorem 6.11. For any $\varepsilon > 0$, any formula $F$ of depth $n$ and fan-in 2 that computes a function $f$ can be efficiently converted into a formula $F'$ that computes $f$ even if up to a fraction of $1/5 - \varepsilon$ of the gates in any of its input-to-output paths are short-circuited. $F'$ has a constant fan-in $O_\varepsilon(1)$ and depth $O(n/\varepsilon)$.
Proof. The conversion is done in the following manner. Given a formula $F$ (that computes some function $f$), we first balance it, i.e., convert it to an equivalent formula $\tilde{F}$ of depth $O(\log |F|)$ with no redundant branches. It is well known that such a formula always exists. Next, we convert $\tilde{F}$ into a protocol $\pi$ for $KW_f$ via the KW-transformation (Section 6.1); note that the length of $\pi$ is at most the depth of $\tilde{F}$, that is, $O(\log |F|)$. Then, we convert $\pi$ into a protocol $\pi'$ that solves the same function $KW_f$ and is resilient to $(1/5 - \varepsilon, 1/5 - \varepsilon)$-corruptions, assuming noiseless feedback. This step is possible due to Theorem 5.3. The resilient $\pi'$ is then transformed back into the resilient formula $F'$ that satisfies the theorem assertions, using Proposition 6.9. Recall that the depth of the obtained formula is exactly the length of the resilient protocol.
To complete the proof we only need to argue that the conversion can be done efficiently. It is easy to verify that converting $\tilde{F}$ to $\pi$ is efficient, and converting $\pi$ to $\pi'$ (Algorithm 4) is also efficient by Lemma 5.1. The only part which is possibly inefficient is the reverse KW-transformation from $\pi'$ back to a formula, which requires finding the reachable protocol tree of $\pi'$, that is, the vertices $v$ for which there exist an input $(x, y)$ and a $(1/5 - \varepsilon, 1/5 - \varepsilon)$-corruption $E$ such that $\pi'(x, y)$ reaches $v$ if the noise is $E$. This part can be shown to be efficient by a technique similar to that presented in [KLR12]. In Appendix A we give a detailed proof.
Theorem 1.1 is an immediate corollary of the above theorem, by noting that $|F'| \le k^{\mathrm{depth}(F')} = 2^{O(\log(1/\varepsilon)) \cdot O((\log |F|)/\varepsilon)} = \mathrm{poly}_\varepsilon(|F|)$.
Here, $k \approx \varepsilon^{-2}$ is the fan-in of $F'$, given by the alphabet size of the resilient interactive protocol $\pi'$ constructed earlier.
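For concreteness, the size bound can be unrolled as follows. We write $c$ and $C$ for the constants hidden in $k \le c\,\varepsilon^{-2}$ and $\mathrm{depth}(F') \le C(\log |F|)/\varepsilon$; both names are introduced here only for illustration, and logarithms are taken to base 2.

$$
|F'| \;\le\; k^{\mathrm{depth}(F')} \;=\; 2^{\log k \cdot \mathrm{depth}(F')} \;\le\; 2^{(2\log(1/\varepsilon) + \log c)\cdot C(\log |F|)/\varepsilon} \;=\; |F|^{\,C(2\log(1/\varepsilon) + \log c)/\varepsilon}.
$$

For any fixed $\varepsilon$ the exponent is a constant, so the right-hand side is polynomial in $|F|$, with a degree that grows as $\varepsilon$ shrinks.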
$(v_0, v_1, \ldots, v_h = v)$ be the nodes on the unique path from the root to $v$. We examine each one of the possible noise patterns that affects only this path. That is, for each node $v_i$ we decide whether it is corrupted or not; there are at most $2^d$ different such noise patterns. For any fixed noise pattern, we verify that all the other edges $(v_i, v_{i+1})$ are consistent with the behavior of the simulation, and reject the noise pattern if they are not. If no inconsistency is found, we show that a valid run of $\pi'$ on some input with that noise pattern leads to $v$. First, we recall that we assume that the complete protocol tree of $\pi$ of depth $|\pi|$ is reachable for some input, given there is no noise at all; that is, we prune all the redundant branches.
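The per-path enumeration described above can be sketched in Python as follows. This is only a schematic under stated assumptions: the name `candidate_noise_patterns` is introduced here, and `is_consistent` is a hypothetical placeholder for the check of a pattern against the behavior of the simulation, whose details are not reproduced in this sketch.

```python
from itertools import product
from typing import Callable, Iterator, Sequence, Tuple

Pattern = Tuple[bool, ...]  # pattern[i] is True when node v_i is marked corrupted

def candidate_noise_patterns(
    path: Sequence[str],
    is_consistent: Callable[[Sequence[str], Pattern], bool],
) -> Iterator[Pattern]:
    """Yield every corrupted/uncorrupted assignment to the nodes of `path`
    that the (hypothetical) consistency check does not reject."""
    # One binary choice per node on the path: 2 ** len(path) patterns in total.
    for pattern in product((False, True), repeat=len(path)):
        if is_consistent(path, pattern):
            yield pattern

# Toy usage with a stand-in check that allows at most one corrupted node.
path = ["v0", "v1", "v2"]
survivors = list(candidate_noise_patterns(path, lambda p, c: sum(c) <= 1))
```

The loop touches one pattern per subset of the nodes on the path, so its cost is exponential only in the length of that single path, not in the size of the whole protocol tree.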