Error Correction for Message Streams
Abstract
In the setting of error correcting codes, Alice wants to send a message to Bob via an encoding that is resilient to error. In this work, we investigate the scenario where Bob is a low space decoder. More precisely, he receives Alice’s encoding bit-by-bit and desires to compute some function in low space. A generic error-correcting code does not accomplish this because decoding is a very global process and requires at least linear space. Locally decodable codes partially solve this problem as they allow Bob to learn a given bit of in low space, but not compute a generic function .
Our main result is an encoding and decoding procedure where Bob is still able to compute any such function in low space when a constant fraction of the stream is corrupted. More precisely, we describe an encoding function of length so that for any decoder (streaming algorithm) that on input computes in space , there is an explicit decoder that computes in space as long as there were not more than fraction of (adversarial) errors in the input stream .
Keywords and phrases:
error-correcting codes, streaming algorithms, space-efficient algorithmsFunding:
Meghal Gupta: Supported by an NSF GRFP Fellowship.Copyright and License:
2012 ACM Subject Classification:
Mathematics of computing Coding theoryEditors:
Raghu MekaSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Consider the following scenario: Alice streams a message denoted to Bob that he receives and processes bit-by-bit. His goal is to compute some function unknown to Alice in significantly less space than necessary to store the entire stream. This scenario arises for instance in automatic control applications, where a device receives an incoming stream of data and needs to use it to make decisions.
As an example, a class of functions that Bob may wish to compute in small space from a message stream is the class of linear functions. If Bob’s function is for some , then after receiving each bit of , Bob adds to a running sum. Bob only needs to track and the running sum modulo 2, which is in total bits of space.
However, this and other small space algorithms are very rigid when it comes to errors in the stream. Corruption of even one bit of Alice’s message can change the output of Bob’s linear function. The same applies if Bob’s function is an index function, a majority, or the result of a decision tree.
In this work, we study what happens when there is noise in the incoming stream. In particular, we ask if it’s possible to convert a given algorithm that processes a message stream into one that is robust to errors in the message while still allowing the output to be computed in low space.
In the usual error correction setting, Alice could just encode her message using a generic large distance error-correcting code. Bob would be able to compute any function in the presence of a small constant fraction of error, but he must receive and store the whole stream in order to decode, which is far too much space. Even if Alice sends a locally decodable code, which has the property that to determine a single bit of one only needs to track a few random bits of the codeword, it is not clear how Bob can compute a function requiring all bits without storage. Our question is whether there is an encoding that preserves the space complexity of Bob’s original decoding algorithm while being resilient to error.
In this work, we answer the question in the affirmative. Our main result is a scheme that protects any streaming algorithm against noise in the incoming stream. More precisely, we give an encoding of length such that any streaming algorithm running in space on the noiseless stream can be converted to one running in space on the encoded stream. It is correct with high probability whenever at most of the stream was corrupted. In the specific case where Bob’s function is a linear function such as dot product, our encoding can be made to have length .
1.1 The Model
We provide a formal definition of a noise resilient transformation for a message stream and processing algorithm. The transformation has two components:
-
An encoding function .
-
An explicit transformation that takes as input a deterministic streaming algorithm and outputs a randomized streaming algorithm .
The algorithm is said to be -error resilient if whenever , then outputs with high probability.
1.2 Our Results
Our main result is a noise-resilient conversion for deterministic streaming algorithms.
Theorem 1.
Fix . For any , there is a noise resilient transformation consisting of an encoding , and a transformation of algorithms that is error resilient with probability . Moreover, , and for any deterministic algorithm on domain that runs in space and time , the algorithm runs in space and time .
In other words, given an algorithm that accepts a stream of length and uses space, we demonstrate a noise resilient algorithm computing that uses space. The transformation of the stream from the sender is independent of and has length .
A priori, it is not obvious that there is any low-memory noise resilient transformation of the algorithm even with an arbitrary blow-up in communication. For example, if the sender were to encode their message using an arbitrary error-correcting code, the receiver would need to store the entire message in memory in order to decode it before applying the algorithm . Our result shows that not only does this low-memory transformation exist, but that it can be done efficiently (both with respect to communication complexity and computational complexity).
Linear Streaming Algorithms
In the above transformation, the length of the resulting stream is for any . In the specific case where the streaming algorithm is linear, we construct a scheme where the stream length is only quadratic, namely . The key property of linear streaming algorithms that we will be leveraging is that they can be computed in pieces in any order, and later computations do not depend on the results of previous ones.
Definition 2 (Linear Streaming Algorithms).
A linear streaming algorithm is described by a list of functions for , and computes the value (where addition is over ) by tracking the partial sum upon receiving the ’th bit.
Note, for example, that every linear function on codomain admits a linear streaming algorithm. We note that linear streaming algorithms capture a large class of interesting functions, including linear sketching algorithms (see for example [12, 10, 1, 37]). For linear streaming algorithms, we show the following result.
Theorem 3.
Fix . There is a function where and an explicit transformation such that the following holds: For any linear streaming algorithm that takes as input as a stream, runs in space and time , and outputs , the algorithm takes as input as a stream, runs in space and time , and satisfies that whenever then outputs with probability .
Randomized Algorithms
We remark that our transformations in Theorems 1 and 3 are only for deterministic algorithms. However, this easily implies the result for algorithms that pre-sample all of their randomness (that is, algorithms that fix their randomness before receiving any bits of , at which point they are deterministic for the remainder of the algorithm). Such algorithms can be viewed as sampling from a distribution over deterministic algorithms. Our transformation can then be applied to the ensuing deterministic algorithm, thus correctly computing the function with high probability while being resilient to noise.
We remark that this notion of randomized algorithms whose only access to randomness is pre-sampled at the start of computation is quite general: randomized algorithms that sample its randomness online can often be converted to protocols where the randomness is pre-sampled, see e.g. [44].
Non-Binary Input Alphabets
Our main theorems are stated for the setting where the input stream is binary. We remark that this assumption is without loss of generality: a larger alphabet stream can be converted to a binary one by assigning each alphabet symbol to a unique binary string.
1.3 Discussion of Subsequent Work
Since this paper was first posted, the work of [29] improved upon our schemes in a few major ways.
-
For the case of general deterministic streaming algorithms, they demonstrate a transformation that only requires near-quadratic encoding length.
-
They show that this is essentially tight: there is no transformation that has sub-quadratic encoding length.
-
And, in the case of linear streaming algorithms, they construct a transformation that has an encoding of near linear length.
This addresses most of the open questions posed by our paper.
1.4 Related Works
Streaming Algorithms
The study of streaming algorithms began with the work Morris [42] on approximate counting; a rigorous analysis was given later in [20]. Later works introduced other classic problems in streaming, including heavy hitters [9], approximation [2, 41, 36], and finding a nonzero entry in a vector (for turnstile algorithms) [41].
Many works, for example [22, 11, 39, 3], consider the problem of processing noisy data sets using streaming algorithms. [22] shows a memory lower bound for learning a parity function with noisy samples of random linear equations.
However, this type of noise is quite different from our treatment. In these problems, noise is inherent in the way samples are generated, whereas we are investigating noise that occurs in the communication process.
Error Correcting Codes
Our result can be viewed through the lens of low-space noise resilient one-way communication.
Noise resilient one-way communication, in other words, error correction [51, 35], is one of the most well-studied problems in computer science. One specific line of works related to our result is that of locally decodable codes. Locally decodable codes [52] can be viewed as a low-time and low-space version of error-correcting codes, where the goal is to learn a single bit of the original message. By constrast, for us, the decoder must be able to piece together any function of the input that is noiselessly computable in low space, with the tradeoff that the decoder accesses the entire encoding via a stream rather than via query access. Local decodable codes have been constructed in a variety of different parameter regimes, including constant query [16, 15] and near rate [38]. In our work, we will use Reed-Muller codes [43, 46] that achieve query complexity and slightly super-linear block length.
Coding for Interactive Communication
The analogue of noise resilient one-way communication in the interactive setting, where Alice and Bob have inputs and engage in a protocol to compute some while being resilient to error, comprises the study of coding for interactive communication. Interactive coding was studied starting with the seminal works of Schulman [48, 49, 50] and continuing in a prolific sequence of followup works, including [8, 6, 4, 5, 34, 7, 13, 25, 24, 17, 28, 26, 19, 30, 31]. We refer the reader to an excellent survey by Gelles [23] for an extensive list of related work.
The recent work of [18] studies the space complexity of interactive coding schemes. Their main result is an interactive coding scheme that preserves the space complexity of the original noiseless protocol up to a factor, where is the length of the protocol. Their result can be viewed as the interactive analogue of ours, where their noiseless and noise resilient protocols are both interactive. We remark that their techniques are quite different than ours and do not apply to our setting, however, since their approach crucially relies on the communication of feedback from Bob to Alice.
2 Overview of Techniques
Consider the task of computing a linear function . In a noiseless setting, one can compute by tracking the partial sum and updating it with upon receiving the ’th bit of .
As a first attempt to create a noise resilient version of this algorithm, one might consider the input stream , where is a locally decodable code. In short, a locally decodable code is a noise resilient encoding of such that in order to decode any specific bit of , say , the decoder simply has to read (randomized) bits of . In a streaming setting, the decoder can record only these bits into memory, then run the decoding algorithm to determine after all such query bits have been seen.
However, it’s not clear that this local decoding property extends to arbitrary linear functions.111The smallest local decodable code we know of that supports local decoding to any arbitrary linear function is an extension of the Hadamard code, which has block length exponential in . This will not be a satisfactory solution for us because (a) we would like encodings that are efficiently computable, and (b) keeping track of which bit of the stream one is receiving takes memory. If we attempt to simultaneously decode all bits from using a separate query set for each index, the issue is that we will need to track queries simultaneously since we cannot ensure that the local decoding for any index finishes before another: the query sets are typically randomized so that any bit of the codeword is equally likely to belong to the query set for any particular index.
A second attempt is to send times, with the intent that Bob locally decodes in the ’th and computes by tracking the partial sum at the end of the ’th chunk. Now, the only space requirements are the space required to store this partial sum and the space required to locally decode in each , which is . However, the adversary can simply corrupt the first so that Bob mis-decodes , thereby corrupting his final output, and ultimately, this approach is not so different from just sending the bits in order.
What Bob would like to do is randomize which he is computing from a given copy of so that the adversary does not know. This crucially uses the property that LDC’s allow Bob to decode any not known to the encoder or adversary. Then, if he computes each many times and takes the majority, he will learn correctly with high probability, regardless of the adversary’s corruption process. Since the adversary does not know in advance which chunks Bob is computing in, she cannot target corruptions towards a specific . However, this runs into an issue with the amount of space Bob requires. Since he computes each many times throughout the protocol and only takes the majority when he has finished these computations, he needs to track his current guess for its value through (almost) the entire protocol. When each was computed only once, Bob could just add to the running sum, but now he needs to track each individually between the first and last time he queries it, for a total of bits of space.
The crux of our construction is an organized system for aggregating decoding attempts in space much smaller than storing all attempts simultaneously.
A recursive computation
Let . Suppose that we already know how to compute linear functions on inputs of size in space , that is resilient to fraction of errors in the stream of copies of . We will use this to build a scheme for computing linear functions on inputs of size . Note that the base case is given by computing linear functions on a single bit, which can be done by local decoding on a single .
To compute , we just need to compute . Each sub-computation can individually be done with the guarantee that if fraction of the stream is corrupted, then the function is computed correctly.
Consider a stream consisting of copies of (you can think of ). We split up this stream into chunks each consisting of blocks of ’s. In each of the chunks, we will assign the blocks of ’s to the computation of a random permutation of . Throughout the algorithm, we will keep track of all the outputs from each of the sub-computations for each of . At the end, we can take a majority vote for each of the . We illustrate the recursion process in Figure 1.
Because of the way we’ve randomly assigned blocks to , the adversary cannot concentrate her attack on the computation of any specific . More precisely, we can show that the amount of error present in the blocks corresponding to the computation of is (roughly) proportional to the total error present. Then, as long as at least half the blocks corresponding to have fraction of errors, we have that the majority vote for each of will be correct. In total, this means that our algorithm for computing is resilient to fraction of errors.
Unfortunately, this factor of loss in the error resilience in each iteration of the recursion will prove to be problematic, as we can no longer guarantee a positive error resilience when gets large.
Decoding with confidences
In the above analysis, our worst case error distribution was that where just under one-half of the chunks have error while the rest are error-free. In this case, the chunks with error will produce a wrong output, while the error-free ones will produce a correct output.
However, we notice that in certain local decoding constructions (for us, we will use a Reed-Muller code), one can actually obtain an estimate of how much error there was in the codeword based on the inconsistency of the queries. In particular, in chunks where there were errors, even though we will obtain a wrong answer, we can also see that there were a lot of errors in the codeword, and thereby be less sure of the obtained answer.
This motivates our definition of local decoding with confidences: in addition to outputting an answer, the decoding algorithm should also output a confidence indicating how confident it is that the answer is correct. A confidence of indicates that it’s not sure at all: the fraction of errors seen was near the threshold ; whereas a confidence of indicates that it’s quite sure: either the answer is right and no errors occurred, or at least errors were required to obtain a wrong answer with confidence . Another way to view the confidence is as an estimate of how much error was present in the total codeword (how far the actual error is from .
Now, when we perform the recursive step to obtain a noise resilient algorithm for from , in addition to recording the outputted answer from each chunk, we will also record the associated confidence. Suppose that the answers and confidences obtained for were . Then, the weighted majority vote of these answers is defined to be the with the largest total confidence. We can show that as long as the cumulative fractional error is at most , we’re guaranteed that the correct answer has more the highest total confidence.
We remark that one does not have to store all pairs . Rather, we can keep track of a most likely and associated confidence . To update with a new pair , one can update by ( if , and otherwise). If the resulting confidence is less than , then it means that the answer is supported by less overall error, so we set our most likely to be and flip the sign of .
Now, we analyze the space and communication complexity of the resulting algorithm. The space required is simply , the space required to do a size sub-computation, plus the space required to store a pair for each of . Overall, this means that for each layer of the recursion, we are only gaining an additive factor of space overhead, so the total space overhead is at most polylogarithmic in . As for the size of the resulting stream, we see that the number of ’s is ,222In our actual protocol, we will repeat the procedure times and take a final majority vote, so the number of ’s used will be . which is polynomial in .
Computing non-linear functions
Our algorithm so far has only captured functions that are splittable into sub-computations whose computations do not depend on each other. However, for a general function, this may not be the case. Indeed, for many functions, the computations must be done sequentially, where later computations can only be done once earlier computations have finished. We may split a general function into sub-functions where is the state of the algorithm after bits of the stream, is the state of the algorithm after receiving the next bits of the stream, and so on. The difference from the linear function case is that to compute , we must have the correct output of .
In order to handle such sequential computations, we modify our above algorithm as follows. In the recursive layer computing from functions , we always compute from the starting state which is our current best guess for the output of . This value of may not always be correct, in which case the computation done for will be doomed.
As with our algorithm for linear functions, after every chunk, we update the most likely state and confidence for the sub-function that was computed in that chunk. However, when a state changes, this also means that computations done for were based on wrong information. In the case of linear functions, the computations are independent so this does not matter, but for sequential functions, we have to discard the guesses . Formally, we set these states to and set the corresponding confidences all to .
Because each sub-function is only computed usefully when is correct and also the value of is frequently discarded, it is less clear that over many iterations the ’s will converge to the correct values. Nevertheless, we prove in Section 6 that over enough iterations, the guesses will eventually all be correct as long as there are not too many errors in the stream.
Outline
The rest of this paper is organized as follows. In Section 3, we start with some preliminary theorems about error-correcting codes. In Section 4, we define our notion of local decoding with confidences and prove that Reed-Muller codes satisfy this property. Then, in Sections 5 and 6, we present our noise resilient streaming algorithms for linear and general (sequential) functions, respectively.
3 Preliminaries
Notation
-
The function is in base unless otherwise specified.
-
The set denotes the integers .
-
We use to denote the ’th to ’th bits of inclusive. We use to refer to the ’th to ’th bit inclusive.
-
We use as a variable rather than the universal constant .
3.1 Error-Correcting Codes
We begin with some results about error-correcting codes. The first is the existence of a relative distance binary code.
Theorem 4 ([33]).
For all , there exists an explicit error-correcting code with relative distance and , and a -time decoding algorithm , such that for any and satisfying , it holds that .
Our second theorem is about the efficient decoding of Reed-Solomon codes [45].
Theorem 5 (Berlekamp-Welch; see e.g. [27]).
Given a collection of tuples where and a parameter , there is an algorithm running in time that outputs a polynomial of degree at most such that .
Finally, we recall generalized minimum distance decoding, which allows us to decode concatenated codes efficiently.
4 Locally Decoding with Confidences
We now recall the definition of locally decodable codes. For us, besides just correctness of local decoding when the distance to a codeword is small, we require that the decoder output a confidence indicating how much error it witnessed (less error means higher confidence).
Theorem 7 (Locally Decodable Code).
For any and , there is a code where that satisfies the following properties:
-
Distance: For any , it holds that .
-
Locally Decoding with Confidence: For any index and any word , there exists a randomized algorithm that reads bits of , runs in time, and outputs a bit along with confidence , satisfying that
-
–
For any such that where , it holds that
with probability at least .
-
–
For any such that where , it holds that
with probability at least .
Furthermore, the confidence can be represented as a rational number with denominator .
-
–
Proof.
This proof is omitted for the conference version.
5 Noise Resilient Streaming for Linear Algorithms
Our first result is a noise resilient conversion for linear streaming algorithms . We begin by recalling the definition of a linear streaming algorithm.
See 2
For such algorithms, we describe a noise resilient conversion that incurs quadratic blow-up in communication complexity.
See 3
5.1 Statement of Algorithm
Let . Throughout the section let and define . We also assume for simplicity that is an integer. We will also assume that is sufficiently large, so that also .
We begin by specifying the encoding that Alice sends in the stream.
Definition 8 ().
Let be a locally decodable code with and query complexity satisfying the guarantees of Theorem 7 for . Then Alice sends where (that is, is copies of ). In particular, .
Throughout this section, we use the notation to denote the quantity . Then, . We define which, as we’ll see in the description below, represent the number of bits read by the algorithm to approximate .
We describe Bob’s streaming algorithm . Before stating it formally, we provide a high-level description.
Description
Let be the incoming stream that Bob receives. At a high level, the algorithm runs by generating many guesses for along with confidences indicating how likely that guess was correct (depending on how much error is witnessed while generating the guess). At the end, the guess that has the most cumulative confidence is outputted.
For where is an integer power of , we define the algorithm that takes in two input two indices . It reads many bits of after which it outputs a guess for along with a confidence. In particular, outputs a guess for along with a confidence. By running many times and aggregating the guesses weighted by the confidences, we obtain a final output.
To compute , we break down the interval into subintervals where . We then recursively formulate guesses for each over many iterations by calling each many times. The choice of in each iteration must be randomized, so that the adversary cannot attack any single choice of . More specifically, we will split up the length input stream into chunks, each of length bits. Each chunk is split into sections, each of size . In each chunk, we pick a random permutation of . In the ’th section of the chunk, we will compute on the subproblem corresponding to interval by calling .
Whenever a recursive call to is completed, outputting , then: if then the confidence is increased by , and if then the confidence is decreased by . If the confidence becomes negative, then we are more sure that the correct state is rather than : then is replaced by and the confidence is negated (so that it’s positive).
5.2 Proof of Theorem 3
We begin by proving the stated claims about the communication, time, and space complexities, and then proceed to prove correctness.
5.2.1 Algorithmic Complexity
The proofs of communication, time, and space complexities are omitted for the conference version.
5.2.2 Correctness
We now show correctness. Formally, correctness is shown by the following lemma.
Lemma 9.
When at most fraction of the stream is corrupted, Algorithm 1 outputs with probability .
We prove the following statement which will easily imply Lemma 9. For a given and associated string that is read by the algorithm in the computation, let the random variable denote the state and confidence after the computation . We define the signed confidence, denoted , to be defined as if and otherwise. For intuition, the more positive is, the more correct with higher confidence the output is, whereas the more negative is, the more is incorrect with high confidence. So, gives us a scaled measure of how correct the output of is.
Lemma 10.
For any and , given fraction of corruptions in the bits read by the computation of , we have .
Proof.
The proof of this lemma is omitted for the conference version.
Proof of Lemma 9.
In the special case where and there are errors (where the inequality follows because ), by Lemma 10 we have that .
Then, the final amplification step of the protocol computes the pair for times, where in each chunk we denote the fraction of error to be , where since we assumed the total error was . Also, the protocol outputs if and only if (denoting the value of in the ’th chunk) is positive. By Azuma’s inequality,
This concludes the proof of Theorem 3.
6 Noise Resilient Streaming for General Algorithms
We now consider general streaming algorithms, whose computational may be sequential in nature. Our main result is a noise resilient conversion for deterministic streaming algorithms . Compared to our scheme for linear algorithms from Section 5, the length of our encoding is larger: compared to .
See 1
Throughout this section, we use the notation
to denote the state of algorithm when starting with the state and executing on received in a stream. By definition, there is an explicit algorithm that computes in space. We also use the shorthand to denote . Notice that is simply the output state of the stream.
6.1 Statement of Algorithm
Let . Throughout the section let and define . We also assume for simplicity that is an integer. We will also assume that the parameter is sufficiently large, so that also .
We begin by specifying the encoding that Alice sends in the stream.
Definition 11 ().
Let be a locally decodable code with and query complexity satisfying the guarantees of Theorem 7 for . Then Alice sends where (that is, is copies of ). In particular, .
Next, we describe Bob’s streaming algorithm . Before stating it formally, we provide a high-level description.
Description
Let be the incoming stream that Bob receives. At a high level, the algorithm runs by generating many guesses for along with confidences indicating how likely that guess was correct (depending on how much error is witnessed while generating the guess). At the end, the guess that has the most cumulative confidence is outputted.
For where is an integer power of , we define the algorithm that has the following syntax:
-
takes as input two indices along with a state which represents a guess for the state of .
-
It reads many bits of after which it outputs a guess for along with a confidence.
In particular, outputs a guess for along with a confidence. By running many times and aggregating the guesses weighted by the confidences, we obtain a final output.
At a high level, to compute , we break down the interval into subintervals where . We then recursively formulate guesses for each over many iterations by calling for some state . The choice of in each iteration must be randomized, so that the adversary cannot attack any single choice of . More specifically, we will split up the length input stream into chunks, each of length bits. Each chunk is split into sections, each of size . In each chunk, we pick a random permutation of . In the ’th section of the chunk, we will compute on the subproblem corresponding to interval by calling .
Note that as the algorithm is computed sequentially, computing a guess for requires having a starting state for . Thus, throughout the entire computation, we will keep track of the best guess and associated confidence for each of the states , denoted and all initialized to .
These pairs are maintained as follows. Whenever a recursive call to is completed, outputting , then: if then the confidence is increased by , and if then the confidence is decreased by . If the confidence becomes negative, then we are more sure that the correct state is rather than : then is replaced by and the confidence is negated (so that it’s positive). In this last case where is replaced, we have no more reason to believe that further computations of , , are correct since they all depended on , so we erase all such pairs and reset them to . A key point in our analysis is to understand why this does not cause the error probability to accumulate.
6.2 Proof of Theorem 1
The proofs that the algorithm satisfies the required space, computational and communication guarantees are essentially the same as in Section 5, but we reproduce them for completeness. The proof of correctness is somewhat more complicated but also follows the same general outline.
6.2.1 Algorithmic Complexity
The proofs of communication, time, and space complexities are omitted for the conference version.
6.2.2 Correctness
Next, we show correctness. Formally, correctness is shown by the following lemma.
Lemma 12.
When at most fraction of the stream is corrupted, Algorithm 2 outputs with probability .
We prove the following statement which will easily imply Lemma 12. For a given and associated string that is read by the algorithm in the computation, let the random variable denote the state and confidence after the computation . We define the signed confidence, denoted , to be defined as if and otherwise. For intuition, the more positive is, the more correct with higher confidence the output is, whereas the more negative is, the more is incorrect with high confidence. So, gives us a scaled measure of how correct the output of is.
Lemma 13.
For any , , and , given fraction of corruptions in the bits read by the computation of , we have .
Proof.
The proof of this lemma is omitted for the conference version.
Proof of Lemma 12.
In the special case where and there are errors (where the inequality follows because ), by Lemma 13 we have that .
Then, the final amplification step of the protocol computes the pair for times, where in each chunk we denote the fraction of error to be , where since we assumed the total error was . Also, the protocol outputs if and only if (denoting the value of in the ’th chunk) is positive. By Azuma’s inequality,
This concludes the proof of Theorem 1.
References
- [1] Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. Analyzing Graph Structure via Linear Measurements, pages 459–467. SIAM, 2012. doi:10.1137/1.9781611973099.40.
- [2] Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages 20–29, 1996. doi:10.1145/237814.237823.
- [3] Omri Ben-Eliezer, Rajesh Jayaram, David P Woodruff, and Eylon Yogev. A framework for adversarially robust streaming algorithms. ACM Journal of the ACM (JACM), 69(2):1–33, 2022. doi:10.1145/3498334.
- [4] Zvika Brakerski and Yael Tauman Kalai. Efficient Interactive Coding against Adversarial Noise. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 160–166, 2012. doi:10.1109/FOCS.2012.22.
- [5] Zvika Brakerski and Moni Naor. Fast Algorithms for Interactive Coding. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’13, pages 443–456, USA, 2013. Society for Industrial and Applied Mathematics. doi:10.1137/1.9781611973105.32.
- [6] Mark Braverman. Towards Deterministic Tree Code Constructions. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS ’12, pages 161–167, New York, NY, USA, 2012. Association for Computing Machinery. doi:10.1145/2090236.2090250.
- [7] Mark Braverman and Klim Efremenko. List and Unique Coding for Interactive Communication in the Presence of Adversarial Noise. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS), pages 236–245, Los Alamitos, CA, USA, October 2014. IEEE Computer Society. doi:10.1109/FOCS.2014.33.
- [8] Mark Braverman and Anup Rao. Towards Coding for Maximum Errors in Interactive Communication. In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, STOC ’11, pages 159–166, New York, NY, USA, 2011. Association for Computing Machinery. doi:10.1145/1993636.1993659.
- [9] Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In International Colloquium on Automata, Languages, and Programming, pages 693–703. Springer, 2002. doi:10.1007/3-540-45465-9_59.
- [10] Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. Theoretical Computer Science, 312(1):3–15, 2004. Automata, Languages and Programming. doi:10.1016/S0304-3975(03)00400-6.
- [11] Jiecao Chen and Qin Zhang. Bias-aware sketches. arXiv preprint, 2016. arXiv:1610.07718.
- [12] Graham Cormode and S. Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58–75, 2005. doi:10.1016/j.jalgor.2003.12.001.
- [13] Varsha Dani, Thomas P. Hayes, Mahnush Movahedi, Jared Saia, and Maxwell Young. Interactive Communication with Unknown Noise Rate, 2015. arXiv:1504.06316.
- [14] Irit Dinur, Shai Evra, Ron Livne, Alexander Lubotzky, and Shahar Mozes. Locally testable codes with constant rate, distance, and locality. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 357–374, 2022. doi:10.1145/3519935.3520024.
- [15] Zeev Dvir, Parikshit Gopalan, and Sergey Yekhanin. Matching vector codes. SIAM Journal on Computing, 40(4):1154–1178, 2011. doi:10.1137/100804322.
- [16] Klim Efremenko. 3-query locally decodable codes of subexponential length. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 39–44, 2009. doi:10.1145/1536414.1536422.
- [17] Klim Efremenko, Ran Gelles, and Bernhard Haeupler. Maximal Noise in Interactive Communication Over Erasure Channels and Channels With Feedback. IEEE Trans. Inf. Theory, 62(8):4575–4588, 2016. doi:10.1109/TIT.2016.2582176.
- [18] Klim Efremenko, Bernhard Haeupler, Yael Tauman Kalai, Gillat Kol, Nicolas Resch, and Raghuvansh R Saxena. Interactive coding with small memory. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3587–3613. SIAM, 2023. doi:10.1137/1.9781611977554.CH137.
- [19] Klim Efremenko, Gillat Kol, and Raghuvansh R. Saxena. Binary Interactive Error Resilience Beyond (or why . In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 470–481, 2020. doi:10.1109/FOCS46700.2020.00051.
- [20] Philippe Flajolet. Approximate counting: a detailed analysis. BIT Numerical Mathematics, 25(1):113–134, 1985. doi:10.1007/BF01934993.
- [21] G. Forney. Generalized minimum distance decoding. IEEE Transactions on Information Theory, 12(2):125–131, 1966. doi:10.1109/TIT.1966.1053873.
- [22] Sumegha Garg, Pravesh K Kothari, Pengda Liu, and Ran Raz. Memory-sample lower bounds for learning parity with noise. arXiv preprint, 2021. arXiv:2107.02320.
- [23] Ran Gelles. Coding for Interactive Communication: A Survey. Foundations and Trends® in Theoretical Computer Science, 13:1–161, January 2017. doi:10.1561/0400000079.
- [24] Ran Gelles and Bernhard Haeupler. Capacity of Interactive Communication over Erasure Channels and Channels with Feedback. SIAM Journal on Computing, 46:1449–1472, January 2017. doi:10.1137/15M1052202.
- [25] Ran Gelles, Bernhard Haeupler, Gillat Kol, Noga Ron-Zewi, and Avi Wigderson. Towards Optimal Deterministic Coding for Interactive Communication, pages 1922–1936. SIAM, 2016. doi:10.1137/1.9781611974331.ch135.
- [26] Ran Gelles and Siddharth Iyer. Interactive coding resilient to an unknown number of erasures. arXiv preprint, 2018. arXiv:1811.02527.
- [27] Peter Gemmell and Madhu Sudan. Highly resilient correctors for polynomials. Information Processing Letters, 43(4):169–174, 1992. doi:10.1016/0020-0190(92)90195-2.
- [28] Mohsen Ghaffari and Bernhard Haeupler. Optimal Error Rates for Interactive Coding II: Efficiency and List Decoding. Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS, December 2013. doi:10.1109/FOCS.2014.49.
- [29] Meghal Gupta, Venkatesan Guruswami, and Mihir Singhal. Tight bounds for stream decodable error-correcting codes, 2024. doi:10.48550/arXiv.2407.06446.
- [30] Meghal Gupta and Rachel Yun Zhang. Efficient interactive coding achieving optimal error resilience over the binary channel. arXiv preprint, 2022. doi:10.48550/arXiv.2207.01144.
- [31] Meghal Gupta and Rachel Yun Zhang. The Optimal Error Resilience of Interactive Communication Over Binary Channels. In Symposium on Theory of Computing, STOC 2012, New York, NY, USA, June 20 - June 24, 2022, STOC ’22. ACM, 2022. doi:10.1145/3519935.3519985.
- [32] Venkat Guruswami. Error-correcting codes: Constructions and algorithms, lecture no. 11, 2006. URL: http://www.cs.washington.edu/education/courses/533/06au/.
- [33] Venkatesan Guruswami and Madhu Sudan. List decoding algorithms for certain concatenated codes. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, STOC ’00, pages 181–190, New York, NY, USA, 2000. Association for Computing Machinery. doi:10.1145/335305.335327.
- [34] Bernhard Haeupler. Interactive Channel Capacity Revisited. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 226–235, 2014. doi:10.1109/FOCS.2014.32.
- [35] Richard W Hamming. Error detecting and error correcting codes. The Bell system technical journal, 29(2):147–160, 1950.
- [36] Piotr Indyk and David Woodruff. Optimal approximations of the frequency moments of data streams. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 202–208, 2005. doi:10.1145/1060590.1060621.
- [37] William Johnson and Joram Lindenstrauss. Extensions of lipschitz maps into a hilbert space. Contemporary Mathematics, 26:189–206, January 1984. doi:10.1090/conm/026/737400.
- [38] Swastik Kopparty, Shubhangi Saraf, and Sergey Yekhanin. High-rate codes with sublinear-time decoding. J. ACM, 61(5), September 2014. doi:10.1145/2629416.
- [39] Chaoyi Ma, Haibo Wang, Olufemi Odegbile, and Shigang Chen. Noise measurement and removal for data streaming algorithms with network applications. In 2021 IFIP Networking Conference (IFIP Networking), pages 1–9. IEEE, 2021. doi:10.23919/IFIPNETWORKING52078.2021.9472845.
- [40] Andrew McGregor, Atri Rudra, and Steve Uurtamo. Polynomial fitting of data streams with applications to codeword testing. Symposium on Theoretical Aspects of Computer Science (STACS2011), 9, March 2011. doi:10.4230/LIPIcs.STACS.2011.428.
- [41] Morteza Monemizadeh and David P Woodruff. 1-pass relative-error lp-sampling with applications. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 1143–1160. SIAM, 2010. doi:10.1137/1.9781611973075.92.
- [42] Robert Morris. Counting large numbers of events in small registers. Communications of the ACM, 21(10):840–842, 1978. doi:10.1145/359619.359627.
- [43] David E Muller. Application of boolean algebra to switching circuit design and to error detection. Transactions of the IRE professional group on electronic computers, 3:6–12, 1954. doi:10.1109/IREPGELC.1954.6499441.
- [44] N. Nisan. Pseudorandom generators for space-bounded computations. In Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing, STOC ’90, pages 204–212, New York, NY, USA, 1990. Association for Computing Machinery. doi:10.1145/100216.100242.
- [45] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300–304, 1960. doi:10.1137/0108018.
- [46] Irving S Reed. A class of multiple-error-correcting codes and the decoding scheme. IEEE Transactions on Information Theory, 4(4):38–49, 1954. doi:10.1109/TIT.1954.1057465.
- [47] Atri Rudra and Steve Uurtamo. Data stream algorithms for codeword testing. In International Colloquium on Automata, Languages, and Programming, pages 629–640. Springer, 2010. doi:10.1007/978-3-642-14165-2_53.
- [48] Leonard J. Schulman. Communication on noisy channels: a coding theorem for computation. In Proceedings., 33rd Annual Symposium on Foundations of Computer Science, pages 724–733, 1992. doi:10.1109/SFCS.1992.267778.
- [49] Leonard J. Schulman. Deterministic Coding for Interactive Communication. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC ’93, pages 747–756, New York, NY, USA, 1993. Association for Computing Machinery. doi:10.1145/167088.167279.
- [50] Leonard J. Schulman. Coding for interactive communication. IEEE Transactions on Information Theory, 42(6):1745–1756, 1996. doi:10.1109/18.556671.
- [51] Claude E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27(3):379–423, 1948. doi:10.1002/j.1538-7305.1948.tb01338.x.
- [52] Sergey Yekhanin et al. Locally decodable codes. Foundations and Trends® in Theoretical Computer Science, 6(3):139–255, 2012. doi:10.1561/0400000030.
