List-Recovery of Random Linear Codes over Small Fields
Abstract
We study list-recoverability of random linear codes over small fields, both from errors and from erasures. We consider codes of rate $\varepsilon$-close to capacity, and aim to bound the dependence of the output list size $L$ on the gap-to-capacity $\varepsilon$, the input list size $\ell$, and the alphabet size $q$. Prior to our work, the best upper bound was $L = q^{O(\ell/\varepsilon)}$ (Zyablov and Pinsker, Prob. Per. Inf. 1981).
Previous work has identified cases in which linear codes provably perform worse than non-linear codes with respect to list-recovery. While there exist non-linear codes that achieve $L = O(\ell/\varepsilon)$, it is known that a much larger list size is necessary for list recovery from erasures over fields of small characteristic, and for list recovery from errors over large alphabets.
We show that in other relevant regimes there is no significant price to pay for linearity, in the sense that we get the correct dependence on the gap-to-capacity $\varepsilon$ and go beyond the Zyablov–Pinsker bound for the first time. Specifically, when the input list size $\ell$ is constant and $\varepsilon$ approaches zero,
-
For list-recovery from erasures over prime fields, we show that the output list size can be taken linear in $1/\varepsilon$. By prior work, such a result cannot be obtained for low-characteristic fields.
-
For list-recovery from errors over arbitrary fields, we prove that the output list size can likewise be taken linear in $1/\varepsilon$.
Above, the leading constants depend on the decoding radius, input list size, and field size. We provide concrete bounds on these constants, and the upper bounds on $L$ improve upon the Zyablov–Pinsker bound whenever $\varepsilon \le c$ for some small universal constant $c > 0$.
Keywords and phrases: List recovery, random linear codes
Category: RANDOM
2012 ACM Subject Classification: Theory of computation → Error-correcting codes
Funding: Dean Doron: Supported in part by NSF-BSF grant #2022644. This work was done in part while the authors were visiting the Simons Institute for the Theory of Computing, supported by DOE grant #DE-SC0024124.
Editors: Alina Ene and Eshan Chattopadhyay
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Error-correcting codes enable reliable communication over noisy channels by encoding messages as codewords. A code of rate $R$ and minimum (relative) Hamming distance $\delta$ allows for reliable communication over an adversarial noisy channel that corrupts up to a $\delta/2$-fraction of codeword symbols. To tolerate more corruptions, one can relax unique decoding to list decoding, where the decoder outputs all codewords within a given Hamming radius.
This notion is further generalized by list recovery (from errors), which models scenarios where the receiver gets a small list of possible values for each symbol. Formally, a code $\mathcal{C} \subseteq [q]^n$ is said to be $(\rho, \ell, L)$-list-recoverable if for every sequence of sets $S_1, \dots, S_n \subseteq [q]$ with $|S_i| \le \ell$, we have
$|\mathcal{C} \cap \mathcal{B}_\rho(S_1 \times \cdots \times S_n)| \le L,$
where $\mathcal{B}_\rho(S_1 \times \cdots \times S_n)$ denotes the generalized Hamming ball consisting of all words in $[q]^n$ that agree with $S_1 \times \cdots \times S_n$ (in the sense that $x_i \in S_i$) in at least $(1-\rho)n$ coordinates. When $\ell = 1$, list-recovery from errors reduces to standard list-decoding (from errors).
A related variant is list recovery from erasures, where some coordinates are entirely unknown, modeled by setting $S_i = [q]$. A code is said to be $(\rho, \ell, L)$-list-recoverable from erasures if
$|\mathcal{C} \cap (S_1 \times \cdots \times S_n)| \le L$
whenever $|S_i| \le \ell$ for at least $(1-\rho)n$ positions $i$. Here too, the case $\ell = 1$ corresponds to list-decoding from erasures.
List recoverable codes are used as a building block for list-decodable and uniquely decodable codes [13, 14, 15, 16, 26, 9, 21]. They have also gained a significant independent interest, in part due to their applications in pseudorandomness [38, 19, 4, 24], algorithms (in particular, for heavy hitters, compressed sensing, and combinatorial group testing [23, 34, 28, 7, 5]), and cryptography [20, 22].
For both list-recovery from errors and from erasures, there exists a well-defined capacity threshold that characterizes the maximal achievable rate for which bounded list-size decoding is possible. Specifically, given parameters $\rho$, $\ell$, and alphabet size $q$, there is a critical rate $R^* = R^*(\rho, \ell, q)$ such that:
-
For every $\varepsilon > 0$ and any large enough block length, there exist codes of rate $R^* - \varepsilon$ that are $(\rho, \ell, L)$-list-recoverable (from errors or erasures) with $L$ bounded independently of the block length. (1)
-
For every $\varepsilon > 0$ and any large enough block length $n$, no code of rate $R^* + \varepsilon$ is $(\rho, \ell, L)$-list-recoverable for any $L$ subexponential in $n$.
The exact threshold depends on the recovery model:
-
For $(\rho, \ell)$-list-recoverability from erasures, the corresponding threshold is $R^* = (1 - \rho)\left(1 - \log_q \ell\right)$.
The dependence of the list size $L$ on the parameters $\varepsilon$, $\ell$, and $q$ (see Equation 1) is often critical, and has been the focus of extensive research (e.g., [36, 37, 32, 10, 17, 8, 39, 2, 31]).
Using the probabilistic method, it is easy to show that plain random codes achieve $L = O(\ell/\varepsilon)$, and this dependence is often viewed as the optimal benchmark. Codes achieving this tradeoff are said to match the Elias bound for list-recovery, in reference to the analogous threshold in list-decoding [6] (see also [33]).
To set expectations for list-recovery of linear codes, we recall the classic argument of Zyablov and Pinsker [40], adapted to the setting of list-recovery by Guruswami [11]. Since any set of $L + 1$ vectors has a linearly independent subset of size $\lceil \log_q(L+1) \rceil$, and since the events that linearly independent vectors lie in a random linear code are stochastically independent, the argument for plain random codes gives $L = q^{O(\ell/\varepsilon)}$. This naturally raises the question: what is the actual price of linearity in list-recovery? While various forms of degradation are possible, we seek to understand how the requirement of linearity affects the list-size achievable near capacity.
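To make the Zyablov–Pinsker calculation concrete, here is a hedged sketch in our own notation (fixing symbols for the quantities discussed informally above: rate $R = 1 - H - \varepsilon$, where $q^{Hn}$ bounds the size of each relevant bad set, input list size $\ell$, output list size $L$). If a bad set $\mathcal{B}$ contained $L + 1$ codewords of a random linear code $\mathcal{C}$, it would contain $d = \lceil \log_q(L+1) \rceil$ linearly independent ones, and

\[
\Pr\left[\exists\ \text{lin.\ indep.}\ v_1, \dots, v_d \in \mathcal{B} \cap \mathcal{C}\right]
\;\le\; |\mathcal{B}|^{d} \cdot q^{-(1-R)nd}
\;\le\; q^{dn\left(H - (1-R)\right)}
\;=\; q^{-dn\varepsilon}.
\]

A union bound over the at most $\binom{q}{\ell}^n \le q^{\ell n}$ choices of input lists then succeeds once $d\varepsilon > \ell$, i.e., once $L + 1 \ge q^{\ell/\varepsilon}$, recovering the $q^{O(\ell/\varepsilon)}$ bound.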
Prior work
Most previous results concerning the output list size have focused on the large alphabet regime, where $q$ is at least exponential in $\ell$. In this setting, Li and Shagrithaya [31] recently proved strong upper bounds on the list size of random linear codes. By a reduction between code ensembles [30], the same upper bound also holds with high probability for Reed–Solomon codes over random evaluation sets. On the other hand, [31] proved a lower bound on the list size of every linear code over a large alphabet that is list-recoverable from errors, implying an exponential gap in the list-size between linear codes and plain random codes. A similar negative result was previously proven by Chen and Zhang [2] for Reed–Solomon codes and folded Reed–Solomon codes. Recently, Komech and Mosheiff [25] constructed a new ensemble of non-linear codes that achieves the optimal list size in list-recovery from errors. These are the only codes other than plain random codes known to achieve the list-recovery Elias bound.
Older works of Rudra and Wootters [36, 37], incomparable to [31], show (in our terms) that random linear codes achieve list-size for list-recovery (from errors) in the large alphabet regime, but only under the guarantee that .
Other works concern the list-recoverability of folded Reed–Solomon codes, multiplicity codes, tensor codes, and variants of them (e.g., [27, 21, 39]). In particular, Tamo [39] shows that folded Reed–Solomon codes and multiplicity codes achieve list size (in the errors case) in the large alphabet regime.
In contrast, very little is known about list-recovery in the small alphabet regime, where $q$ is sub-exponential in $\ell$. To the best of our knowledge, the only positive result in this setting is the aforementioned $q^{O(\ell/\varepsilon)}$ bound for random linear codes [40]. On the other hand, we know that when $q$ is a power of a small prime, random linear codes in this regime are very unlikely to be list-recoverable from erasures with a small output list [17] (we state this as Theorem 1 below).
Provable separations between linear and nonlinear codes were also established in the setting of list decoding from erasures, i.e., the special case of $\ell = 1$. Aiming for a large erasure decoding radius, Guruswami [11] showed that for linear codes the output list size must be significantly larger than what plain random codes achieve, as long as the rate is sufficiently non-vanishing; we even have explicit constructions that nearly match plain random codes in some parameter regimes [1].
1.1 Our Contribution
We establish new upper bounds on the output list size for list-recovery of random linear codes near capacity in the small alphabet regime. To the best of our knowledge, these are the only known bounds for general list-recovery in this setting beyond the classical Zyablov–Pinsker argument. Notably, our results achieve list sizes with only linear dependence on $1/\varepsilon$, in contrast to the exponential dependence in previous bounds.
List-recovery from erasures over prime fields
For our first result, we consider $(\rho, \ell, L)$-list-recovery from erasures. Recall that the capacity in this case is $(1 - \rho)\left(1 - \log_q \ell\right)$.
As mentioned before, [17] showed that there is a price to pay for linearity over fields of small characteristic. More precisely, they proved the following.
Theorem 1 (informal; see [17, Theorem III.1]).
If divides , then with high probability over the choice of the linear code, the output list size cannot be taken smaller than .
We show that this limitation disappears when the field size $q$ is prime. In this case, we can make the output list size linear in $1/\varepsilon$.
Theorem 2 (list recovery from erasures over prime fields; see Theorem 25).
Given with prime and , there exists such that the following holds. Let be a random linear code of rate for some . Then, is with high probability -list-recoverable from erasures.
While our focus is on the setting where $q$ and $\ell$ are constants, we determine effective bounds on $L$ even when $q$ is non-constant and $\ell$ is slightly super-constant. (For a concrete example, when $\ell$ grows mildly with $1/\varepsilon$ and the erasure radius is bounded away from $1$, Theorem 25 still gives an effective bound; we refer the reader there for the precise statement.)
List-recovery from errors
Next, we consider the case of list-recovery from errors, where we recall the capacity is $1 - H_{q,\ell}(\rho)$, with $H_{q,\ell}$ the $(q,\ell)$-entropy function defined in Equation 3.
Here, contrary to what happens in the regime of large $q$, we do not observe any price to pay for linearity, at least in terms of the dependence on the gap-to-capacity $\varepsilon$.
Theorem 3 (list recovery from errors; see Theorem 31).
Given with a prime power and , there exists such that the following holds. Let be a random linear code of rate for some . Then, is with high probability -list-recoverable from errors.
Again, we determine effective bounds on the leading constant, which can be found in Theorem 31. (Specifically, when $\rho$ is bounded away from both endpoints of its admissible range, we obtain a concrete constant; see Theorem 31 for the precise bound.) For context, we recall that in the "large $q$ regime", such a result is provably impossible. Specifically, when $q$ is large, the list-recovery capacity is (essentially) the Singleton bound, namely, $1 - \rho$. It is known that at rate $\varepsilon$ below this capacity, any linear code list-recoverable from errors must have a list size exponential in $1/\varepsilon$ [31]. That is, the dependence of $L$ on the gap-to-capacity must be exponential. But in our case, at least if $q$ and $\ell$ are held constant, the dependence of $L$ on $\varepsilon$ is just linear.
We conclude this section with some remarks.
Remark 4 (on the field insensitivity).
Both of our results show that there is no significant price to pay for linearity in list-recovery for certain parameter regimes. Our result in Theorem 3 on list-recovery from errors is insensitive to the field size. On the other hand, as discussed above, our result in Theorem 2 on list-recovery from erasures is (necessarily!) field sensitive. We provide some informal discussion on why this happens. First, note that we work with rates close to capacity, and the capacity in the erasures setting is larger for comparable $\rho$ and $\ell$. Second, looking ahead (see Section 1.2 for more details), list-recovery from erasures depends on the "additive structure" of $S_1 \times \cdots \times S_n$, for arbitrary size-$\ell$ subsets $S_i \subseteq \mathbb{F}_q$. If the $S_i$-s are subspaces of $\mathbb{F}_q$, then it is quite likely these form a bad configuration for list-recovery from erasures. In contrast, list-recovery from errors depends on the additive structure of the "puffed-up" combinatorial rectangles, i.e., the list-recovery balls around $S_1 \times \cdots \times S_n$.
Even if the $S_i$-s are subspaces of $\mathbb{F}_q$, the "puffing up" operation kills any additive structure that could lead to a bad list-recovery configuration.
Remark 5 (on the dependence of on the various parameters).
Recall that a code which is $\varepsilon$-close to capacity is said to achieve the Elias bound if $L = O(\ell/\varepsilon)$. Note that in both Theorem 2 and Theorem 3, the dependence of $L$ on $\varepsilon$ is linear, as we would hope. However, the dependence on the other parameters (particularly $q$ and $\ell$) is exponentially worse than what plain random codes achieve. We leave it as a natural open problem to improve this dependence.
Remark 6 (comparison to [40]).
As mentioned earlier, we are not aware of any prior arguments establishing non-trivial bounds on the list-size for codes -close to capacity which do not require to be large, other than what follows from the Zyablov–Pinsker argument [40] (given formally in [11]). Recall that this method guarantees list size . In comparison, we obtain (roughly) in the case of erasures (over prime fields), and in the case of errors (over all fields, assuming is not too close to or ). Thus, compared to [40], we obtain a smaller bound on once, roughly, . In particular, when and are constants we achieve asymptotic improvement in .
1.2 Technical Overview
At a conceptual level, our work reconsiders the approach of Guruswami, Håstad, and Kopparty [12], which allowed for an understanding of the list-decodability from errors of random linear codes over constant-sized alphabets, and adapts it to the case of list-recovery (either from erasures or errors). Recall that list-decoding from errors is the special case of list-recovery from errors with $\ell = 1$.
Using terminology that we define in this work, the first step in the argument of [12] is to argue that Hamming balls are nontrivially mixing. Specifically, fix any center $z$ and consider sampling twice uniformly and independently from the Hamming ball of radius $\rho n$ centered at $z$; denote the two samples by $X$ and $Y$. The authors argue that, for some $\varepsilon' > 0$, any nontrivial linear combination of $X$ and $Y$ lands in any other ball of the same radius with probability at most $1 - \varepsilon'$. (In fact, [12] only needed a special case of this notion, but their argument naturally generalizes, and the general notion of mixing is what we require for our list-recovery results.) From here, using some additional tools like 2-increasing chains – whose existence they establish via a Ramsey-theoretic argument – they are able to show that random linear codes with gap-to-capacity $\varepsilon$ are with high probability list-decodable from errors with list size $O_{\rho}(1/\varepsilon)$. In particular, their techniques promise the correct dependence on $\varepsilon$ (although the dependence on the other parameters is quite poor).
We recommend viewing this "$\varepsilon$-mixing" property in the following light. Any argument establishing that random linear codes have good list-decodability must somehow argue that random subspaces and Hamming balls don't tend to "correlate" too much. In particular, it should not be the case that Hamming balls have noticeable "linear structure", and, in particular, they should be "far" from being closed under addition.
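To make the mixing phenomenon concrete, here is a small self-contained sketch (our own toy illustration, not from the paper: we take $n = 10$ and radius $3$ over $\mathbb{F}_2$, chosen only so that exact enumeration is feasible) computing the probability that the sum of two independent uniform samples from a Hamming ball lands back in the ball; nontrivial mixing corresponds to this probability being bounded away from $1$.

```python
from itertools import combinations

n, r = 10, 3  # toy parameters: ambient space F_2^10, Hamming radius 3

# Enumerate the Hamming ball of radius r around 0, encoding vectors as bitmasks.
ball = [0]
for w in range(1, r + 1):
    for supp in combinations(range(n), w):
        mask = 0
        for i in supp:
            mask |= 1 << i
        ball.append(mask)

# Over F_2, vector addition is bitwise XOR.
members = set(ball)
hits = sum(1 for x in ball for y in ball if (x ^ y) in members)
prob = hits / len(ball) ** 2
print(len(ball), round(prob, 3))
```

Even at these tiny parameters the collision probability is well below $1$ (and well above $0$), matching the qualitative behaviour that the argument of [12] exploits.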
In our work, we consider whether or not sets that are relevant for list-recovery, i.e., the sorts of sets that list-recoverable codes cannot intersect with too much, also have nontrivial mixing. Firstly, we crystallize in a definition what it means for an arbitrary set $A \subseteq \mathbb{F}_q^n$ to be $\varepsilon$-mixing: for any nonzero $a, b \in \mathbb{F}_q$ and any shift $c \in \mathbb{F}_q^n$, if $X, Y \sim A$ – which denotes that $X$ and $Y$ are sampled independently and uniformly from $A$ – then $\Pr[aX + bY \in A + c] \le 1 - \varepsilon$.
Now, for list-recovery from erasures the relevant sets are combinatorial rectangles $S_1 \times \cdots \times S_n$ where for at least $(1-\rho)n$ values of $i$ we have $|S_i| \le \ell$. For list-recovery from errors the sets are "puffed-up" combinatorial rectangles. Namely, for each $S_1, \dots, S_n$ with $|S_i| \le \ell$, we consider list-recovery balls of radius $\rho$ around $S_1 \times \cdots \times S_n$.
Following the argument of [12], once we establish that these sets are nontrivially mixing, we can obtain bounds on the list-size with the correct dependence on $\varepsilon$. Our task then boils down to understanding the mixing properties of the sets relevant for list-recovery. We consider first the erasures case, and subsequently discuss the errors case.
List-recovery from erasures
Firstly, observe that for a combinatorial rectangle $S_1 \times \cdots \times S_n$, if each of the sets $S_i$ is nontrivially mixing as a subset of $\mathbb{F}_q$ (take the $n = 1$ case of the above definition), then $S_1 \times \cdots \times S_n$ is also nontrivially mixing (as a subset of $\mathbb{F}_q^n$). Hence, it suffices for us to consider whether or not subsets of $\mathbb{F}_q$ mix. It is here that the dependence on the field size shows up.
Recall from the earlier discussion that [17] established that random linear codes over $\mathbb{F}_q$ (where $q$ is a power of a small prime) are with high probability not list-recoverable from erasures with a small output list. Indeed, it is easy to find a subset of $\mathbb{F}_q$ which is not $\varepsilon$-mixing for any $\varepsilon > 0$: take $A$ to be a proper subfield of $\mathbb{F}_q$ (or, more generally, any multiplicative coset of one). Since $A$ is closed under addition, in particular we have that if $X$ and $Y$ are sampled independently and uniformly from $A$ then $\Pr[X + Y \in A] = 1$, so $A$ is not $\varepsilon$-mixing for any $\varepsilon > 0$.
What went wrong in this example? The fact that $\mathbb{F}_q$ contains a proper subfield means that $\mathbb{F}_q$ contains non-trivial linear subspaces over that subfield. Such subspaces naturally create "bad" input lists, and the argument of [17] establishes that indeed a random linear code is likely to contain many vectors from a combinatorial rectangle where at least a constant fraction of the $S_i$-s are low-dimensional subspaces of $\mathbb{F}_q$.
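The additive-closure obstruction can be seen in a two-line experiment. In the sketch below (our own illustration; elements of $\mathbb{F}_{16}$ are represented as 4-bit integers, so that field addition in characteristic 2 is bitwise XOR), an $\mathbb{F}_2$-subspace sums back into itself with probability $1$, so it cannot be $\varepsilon$-mixing for any $\varepsilon > 0$:

```python
from itertools import product

def collision_prob(S):
    """Pr[X + Y in S] for X, Y independent and uniform over S,
    with field addition in characteristic 2 realized as XOR."""
    hits = sum(1 for x, y in product(S, S) if (x ^ y) in S)
    return hits / len(S) ** 2

subspace = {0b0000, 0b0001, 0b0010, 0b0011}  # an F_2-subspace of F_16: additively closed
generic = {1, 2, 4, 8}                       # a same-size set with no additive closure
print(collision_prob(subspace), collision_prob(generic))
```

The subspace never mixes (probability exactly $1$), while the generic set behaves very differently; over prime fields, as discussed next, no subset can be as badly behaved as a subspace.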
If we insist that $q$ be a prime, then $\mathbb{F}_q$ does not have any non-trivial subfields, and hence no non-trivial subspaces. However, in this case $\mathbb{F}_q$ still contains some subsets with "additive structure"; for example, taking the interval length to be odd here for simplicity, centered intervals (or, more generally, arithmetic progressions) such as $I = \{-k, \dots, k\} \bmod q$ have the property that if one samples $X, Y \in I$ independently and uniformly, then $\Pr[X + Y \in I] \approx 3/4$, assuming $k \ll q$. But note that this is still non-trivially bounded away from $1$! Remarkably, an argument of Lev [29] shows that this is the worst case over prime fields. More precisely, over all sets of a given size, the probability that $X + Y$ lands in a target set of that size, where $X$ and $Y$ are uniform over (possibly different) sets of that size, is maximized when all sets involved are centered intervals. In our technical section we give an effective bound on this extremal probability for all set sizes, which allows us to argue that combinatorial rectangles in which at least a constant fraction of the $S_i$-s have size at most $\ell$ are nontrivially mixing, as desired.
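The behaviour of centered intervals can be verified by exhaustive enumeration. The following sketch (our own; the prime $p = 101$ and the interval size $11$ are arbitrary toy choices) computes the collision probability for a centered interval, which comes out close to the limiting value $3/4$, and compares it against a same-size set without interval structure:

```python
def collision_prob(S, p):
    """Pr[X + Y mod p in S] for X, Y independent and uniform over S."""
    hits = sum(1 for x in S for y in S if (x + y) % p in S)
    return hits / len(S) ** 2

p = 101                                      # a toy prime field size
interval = {x % p for x in range(-5, 6)}     # centered interval of size 11
squares = {(x * x) % p for x in range(11)}   # a size-11 set with little additive structure

print(collision_prob(interval, p), collision_prob(squares, p))
```

The interval's probability ($91/121 \approx 0.75$) is noticeably larger than that of the structureless set, yet still bounded away from $1$, which is exactly the dichotomy that Lev's result [29] quantifies.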
List-recovery from errors
We now wish to establish that list-recovery balls are nontrivially mixing. Notably, unlike in the case of erasures, the argument here is insensitive to the base field. Let $S = S_1 \times \cdots \times S_n$, where each $|S_i| = \ell$. Let $X, Y$ be sampled independently and uniformly from the list-recovery ball of radius $\rho$ centered at $S$; in this overview, we will sketch how one bounds the probability that $X + Y$ lands back in the ball (the argument easily generalizes to allow for multipliers and a shift).
Unlike in the case of combinatorial rectangles, it is not the case that $X$ and $Y$ have independent coordinates. For example, conditioned on $X_i$ lying in $S_i$, the coordinate $X_j$ is less likely to lie in $S_j$. However, these correlations are relatively minor, and we can essentially "pretend" that both $X$ and $Y$ are sampled as follows: for each coordinate $i$, with probability $1 - \rho$ set the $i$-th coordinate to a uniformly random element of $S_i$, and otherwise set it to a uniformly random element of the complement, and these choices are made independently for each $i$. (In fact, for technical reasons we have to consider slightly perturbed sampling probabilities, but by a concentration argument one can easily establish that the resulting words are with high probability very close to the list-recovery ball.) We remark that a similar trick is implicit in [12], and made explicit in the context of rank-metric codes by Guruswami and Resch [18].
This new distribution is much more amenable to analysis. In particular, letting $Z_i$ be the indicator for the event that the $i$-th coordinate of $X + Y$ lands in $S_i$, we have that $X + Y$ lies in the list-recovery ball iff $\sum_i Z_i \ge (1 - \rho)n$. Thus, if we can argue that $\mathbb{E}\left[\sum_i Z_i\right]$ is noticeably smaller than $(1 - \rho)n$, then a classic Chernoff-Hoeffding bound establishes that $\Pr\left[\sum_i Z_i \ge (1-\rho)n\right]$ is exponentially small, implying the desired mixing.
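The concentration step can be sanity-checked numerically. The sketch below (an independent illustration with arbitrary parameters, not the paper's) compares the exact upper tail of a sum of independent indicators, i.e., a binomial, against the Chernoff-Hoeffding bound $\Pr[S \ge n(p + t)] \le e^{-2nt^2}$:

```python
import math

def binom_upper_tail(n, p, k):
    """Exact Pr[Bin(n, p) >= k], by direct summation."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

n, p, t = 100, 0.3, 0.1  # arbitrary toy parameters
exact = binom_upper_tail(n, p, math.ceil(n * (p + t)))
bound = math.exp(-2 * n * t**2)
print(exact, bound)
```

The exact tail sits comfortably below the Hoeffding estimate, and both decay exponentially as $n$ grows with $t$ fixed, which is the behaviour the mixing argument relies on.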
Bounding is the most novel part of the analysis, and is done in Lemma 26. Recall that , and let also . We have
| (2) |
As is standard, is proportional to , the convolution of the indicator functions for , and similarly and are proportional to and , respectively. Using the simple identity , we can rewrite Equation 2 as
where is a constant which we show to be positive assuming (and indeed, if then it could be negative). Upon giving a trivial upper bound (which corresponds to the case that does not mix at all, as we must do since we are making no assumptions on the field) and simplifying, we obtain the bound
To our satisfaction, the required inequality holds iff $\rho < 1 - \ell/q$, which is precisely the range of decoding radius at which we can hope for positive-rate list-recovery from errors in the first place! Thus, the relevant deviation probability is exponentially small, establishing that list-recovery balls nontrivially mix.
1.3 Open Problems
Lastly, we leave here some directions for future research:
-
In our results the dependency on $\varepsilon$ is correct, but the dependency on $q$ and $\ell$ is rather poor. Can we improve this dependency? Or, can we perhaps prove new lower bounds on $L$ in terms of $q$ and $\ell$ that apply when these parameters are not too big?
-
[17] showed that over $\mathbb{F}_q$ with $q$ a power of a small prime (and hence of small characteristic), a random linear code is with high probability not list-recoverable from erasures unless the output list size is large. Can we show that this lower bound actually applies to every $q$-ary linear code?
-
Many arguments for codes being list-recoverable from errors in fact establish the stronger property of average-radius-list-recovery, where one instead shows that for any input lists of size $\ell$, given $L + 1$ codewords, their average agreement with the input lists is bounded.
This in particular implies that there cannot be $L + 1$ codewords lying in a list-recovery ball of radius $\rho$. We believe our method should be able to establish this (slightly) stronger guarantee for random linear codes.
2 Preliminaries
2.1 Notation
We will often denote random variables and sets by uppercase Roman letters. The distinction will be clear from context. We write $[n] = \{1, \dots, n\}$ for any positive integer $n$. For a vector $x \in \mathbb{F}_q^n$, with $\mathbb{F}_q$ the finite field of order $q$, we write $x = (x_1, \dots, x_n)$. We define the weight of $x$ to be the number of its nonzero coordinates, and the Hamming distance between $x$ and $y$ is the number of coordinates on which they differ. For a collection of vectors $v_1, \dots, v_m \in \mathbb{F}_q^n$, we denote by $\mathrm{span}(v_1, \dots, v_m)$ the subspace of $\mathbb{F}_q^n$ spanned by them.
We denote the binary entropy function by $h(\cdot)$. By default, $\log$ is the base-2 logarithm. We write $1_S$ for the indicator function of a set $S$, and write $1_E$ for the indicator random variable that equals $1$ if and only if the event $E$ holds.
2.2 The Random Code Model
For an alphabet size $q$, a plain random code $\mathcal{C}$ of block length $n$ and rate $R$ is obtained by including each $x \in [q]^n$ in $\mathcal{C}$ with probability $q^{-(1-R)n}$, and these choices are made independently for each $x$. (Note then that such a code has size $q^{Rn}$ in expectation, and by a Chernoff bound it follows that it has rate close to $R$ with high probability.)
When $q$ is a prime power, a random linear code $\mathcal{C}$ of block length $n$ and rate $R$ is obtained by sampling a uniformly random matrix $G \in \mathbb{F}_q^{Rn \times n}$, and defining
$\mathcal{C} = \{xG : x \in \mathbb{F}_q^{Rn}\}.$
Note that $G$ is full-rank with probability $1 - o(1)$, and therefore $\mathcal{C}$ has rate $R$ with high probability.
Given any subset $S \subseteq [q]^n$, if $\mathcal{C}$ is a plain random code of rate $R$ then $\Pr[S \subseteq \mathcal{C}] = q^{-(1-R)n|S|}$. When dealing with random linear codes, the probability that a set appears in the code is determined by the span of the set.
Proposition 7.
Let $\mathcal{C}$ be a random linear code of block length $n$ and rate $R$, and let $S \subseteq \mathbb{F}_q^n$. Then,
$\Pr[S \subseteq \mathcal{C}] \le q^{-(1-R)n \cdot \dim(\mathrm{span}(S))}.$
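As a sanity check on this random code model, the following sketch (our own toy computation, not from the paper: $q = 2$, block length $n = 4$, dimension $k = 2$, enumerating all $2^8$ generator matrices exactly) computes the probability that a fixed nonzero word lies in the random linear code; as expected, it does not exceed $q^{-(1-R)n} = 1/4$:

```python
from itertools import product

n, k = 4, 2        # toy block length and dimension over F_2
v = (1, 1, 1, 1)   # an arbitrary fixed nonzero word

count = 0
for flat in product((0, 1), repeat=k * n):  # every k x n generator matrix
    G = [flat[i * n:(i + 1) * n] for i in range(k)]
    # The code consists of all F_2-linear combinations of the rows of G.
    code = {tuple(sum(x[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
            for x in product((0, 1), repeat=k)}
    count += v in code

print(count, count / 2 ** (k * n))
```

The strict gap below $q^{-(n-k)}$ comes from generator matrices that fail to be full-rank; for large $n$ this gap vanishes.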
2.3 List-Recovery Notions
This section collects the basic notions of list-recovery we study.
List-recovery from erasures
We begin with the relevant definition.
Definition 8 (list recovery from erasures).
Let $\mathcal{C} \subseteq [q]^n$ be a $q$-ary code of block-length $n$. For an erasure radius $\rho$ and input list size $\ell$, we say that $\mathcal{C}$ is $(\rho, \ell, L)$-list-recoverable from erasures if for every $S_1, \dots, S_n \subseteq [q]$ such that $|S_i| \le \ell$ for at least $(1-\rho)n$ of the $S_i$-s and $S_i = [q]$ for the remaining, it holds that
$|\mathcal{C} \cap (S_1 \times \cdots \times S_n)| \le L.$
That is, in any combinatorial rectangle of which at least $(1-\rho)n$ of its side-lengths are at most $\ell$ (and the remainder can be as large as $q$), there are at most $L$ codewords.
We will also consider list-recovery from errors. The concept of a list-recovery ball – which generalizes that of a Hamming ball – will be useful.
Definition 9 (list-recovery ball).
Let $S_1, \dots, S_n \subseteq [q]$ and let $\rho \in [0, 1]$. The list-recovery ball of radius $\rho$ centered at $S = S_1 \times \cdots \times S_n$ is
$\mathcal{B}_\rho(S) = \{x \in [q]^n : \Delta(x, S) \le \rho n\}.$
Above, we have extended the Hamming metric by setting $\Delta(x, S) = |\{i \in [n] : x_i \notin S_i\}|$.
We now state the relevant capacity theorem for list-recovery from erasures. The proofs of the two implications are standard: the possibility result follows from analyzing the performance of a plain random code, while the impossibility result follows from a counting argument.
Theorem 10 (list-recovery from erasures capacity).
Let $\ell \le q$ be an input list size and let $\rho \in (0, 1)$ be an erasure radius. Fix $\varepsilon > 0$. For $n$ large enough, the following hold:
-
There exists a code of rate which is -list-recoverable from erasures.
-
For any code of rate , there exist with for at least values of such that .
We therefore say that the capacity for $(\rho, \ell)$-list-recovery from erasures is $(1-\rho)(1 - \log_q \ell)$. We will study what happens for codes of rate $\varepsilon$ below this capacity, for some $\varepsilon > 0$, and determine the value of their output list-size $L$. We will be focused on the case where $\ell$ is held constant, and the gap-to-capacity $\varepsilon$ tends to $0$.
List-recovery from errors
We now define what it means for a code to be list-recoverable from errors.
Definition 11 (list recovery from errors).
Let $\mathcal{C} \subseteq [q]^n$ be a $q$-ary code of block-length $n$. For a decoding radius $\rho$ and input list size $\ell$, we say that $\mathcal{C}$ is $(\rho, \ell, L)$-list-recoverable from errors if for every $S_1, \dots, S_n \subseteq [q]$ such that $|S_i| \le \ell$ for all $i$, it holds that
$|\mathcal{C} \cap \mathcal{B}_\rho(S_1 \times \cdots \times S_n)| \le L.$
That is, every list-recovery ball of radius $\rho$ with side-lengths at most $\ell$ contains at most $L$ codewords.
We will need an estimate on the size of list-recovery balls. It makes use of the $(q,\ell)$-entropy function, defined as follows:
$H_{q,\ell}(x) = x \log_q \frac{q - \ell}{x} + (1-x) \log_q \frac{\ell}{1-x}.$ (3)
An operational interpretation of this quantity is as the base-$q$ entropy of a random variable which, with probability $1 - x$, samples a uniformly random element from a set of size $\ell$, and with probability $x$ samples a uniformly random element from the complement. Additionally, so long as $x \le 1 - \ell/q$, it holds that $H_{q,\ell}(x) \le 1$. Note that if $\ell = 1$ one recovers the $q$-ary entropy function, which we denote as $H_q$ (i.e., if $\ell$ is omitted from the subscript, then it is by default $1$).
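The operational interpretation pins the function down, and can be checked mechanically. The sketch below (our own; it assumes the closed form $H_{q,\ell}(x) = \bigl(h(x) + (1-x)\log\ell + x\log(q-\ell)\bigr)/\log q$, with $h$ the binary entropy function, which is what the two-stage sampling described above yields) recomputes the quantity directly as a base-$q$ Shannon entropy and compares:

```python
import math

def H_two_stage(q, ell, x):
    """Base-q Shannon entropy of: w.p. 1-x, a uniform element of an
    ell-element set; w.p. x, a uniform element of its complement."""
    probs = [(1 - x) / ell] * ell + [x / (q - ell)] * (q - ell)
    return -sum(p * math.log(p, q) for p in probs if p > 0)

def H_closed_form(q, ell, x):
    h = -x * math.log2(x) - (1 - x) * math.log2(1 - x)  # binary entropy, in bits
    return (h + (1 - x) * math.log2(ell) + x * math.log2(q - ell)) / math.log2(q)

print(H_two_stage(5, 2, 0.3), H_closed_form(5, 2, 0.3))
```

At $x = 1 - \ell/q$ the two-stage distribution becomes uniform over all $q$ symbols, so the function attains its maximum value $1$ there, consistent with the remark above.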
We now state the relevant estimate.
Proposition 12 ([35, Proposition 2.4.11]).
Let $\ell \le q$ be integers and $\rho \in (0, 1 - \ell/q)$. Let $S_1, \dots, S_n \subseteq [q]$ with $|S_i| = \ell$ for all $i$. Then,
$|\mathcal{B}_\rho(S_1 \times \cdots \times S_n)| \le q^{H_{q,\ell}(\rho) \cdot n}.$
This estimate drives the following capacity theorem.
Theorem 13 (list-recovery from errors capacity).
Let $\ell \le q$ be an input list size and let $\rho \in (0, 1 - \ell/q)$ be a decoding radius. Fix $\varepsilon > 0$. For $n$ large enough, the following hold:
-
There exists a code of rate which is -list-recoverable from errors.
-
For any code of rate , there exists with for all such that .
Thus, we will concern ourselves with codes of rate $1 - H_{q,\ell}(\rho) - \varepsilon$, and determine the output list size $L$ for $(\rho, \ell, L)$-list-recovery from errors. And, as in the list-recovery from erasures case, we will hold $q$, $\ell$, and $\rho$ constant and consider the asymptotic behaviour of $L$ as the gap-to-capacity $\varepsilon \to 0$.
We will also need the following lower bound on the difference . See the full version of the paper [3] for the proof.
Claim 14.
For any integers and , any , and any , we have
2.4 Increasing Chains
The following definition of an increasing chain was first introduced by Guruswami, Håstad, and Kopparty [12].
Definition 15 (-increasing chain).
A sequence of vectors is said to be a -increasing chain of length if for all we have
We require the following lemma on the existence of appropriately long increasing chains in an appropriate shift of an arbitrary subset .
Lemma 16 ([12, Lemma 6.3]).
For every prime power , and all positive integers and , the following holds. For every with , there is such that has a -increasing chain of length at least .
2.5 Mixing Sets
In our analysis we need to understand the probability that the sum of two independent uniformly random samples and from a set lands in a shifted set , for an arbitrary shift (and in fact a more general question of that form). We begin with the necessary definitions.
Definition 17 (mixing over ).
For a prime power , and , we say that is -mixing, if for any , where and are nonzero, it holds that
where are independent and uniformly distributed over , and .
Definition 18 (mixing over ).
For , a prime power , and , we say that is -mixing, if for any nonzero , and any , it holds that
where are independent and uniformly distributed over .
Remark 19.
Note that a set mixes nontrivially when , and moreover, we will want to not depend on . However, in the case of list recovery from erasures, where for some -s, , the case of will be useful towards bounding the expected mixing of .
The following connection then follows easily.
Remark 20.
Suppose that , and each is -mixing. Then, is -mixing, for .
3 List-Recovery from Erasures over Prime Fields
In this section we establish the list-recoverability of random linear codes over prime fields. To achieve this, we must first understand the mixing properties of worst-case subsets of prime fields. Most of the proofs are deferred to the full version [3].
3.1 Worst-Case Mixing of Subsets of Prime Fields
Towards understanding mixing of subsets of prime fields, we leverage a general result of Lev [29] which characterizes worst-case -s, when is prime. Before we introduce it, we set up some relevant notation.
For a set $A \subseteq \mathbb{F}_p$ we let $I_A$ denote a "centered interval" of length $|A|$. More precisely, $I$ is the centered interval associated with $A$ if $I = \{-k, \dots, k'\} \bmod p$ with $|I| = |A|$ and $|k - k'| \le 1$. Note that when $|A|$ is odd there is a unique centered interval (because necessarily $k = k'$), but when $|A|$ is even there are two centered intervals, corresponding to $k' = k \pm 1$.
Lemma 21 ([29, Theorem 1], adapted).
Let be prime and be arbitrary sets with the associated -centered intervals. Then, if , for any set and some associated -centered interval , we have
We can use Lemma 21 to prove the following.
Lemma 22.
Fix a prime . Let be arbitrary sets of size . Then,
and this is tight for all . In particular:
-
1.
When we have
-
2.
When we have
We can then record the following corollary.
Corollary 23.
For a prime , any set of size ,
-
If , is -mixing for .
-
Otherwise, is -mixing for .
3.2 List-Recovery from Erasures over Prime Fields via Mixing
In this section we adapt the technique in [12], together with the worst-case mixing result from Corollary 23, to establish list recovery from erasures over prime fields.
Lemma 24.
Let be -mixing. For and any satisfying , the following holds. Let be sampled independently and uniformly at random from . Then, we have that
where . In particular, when , and each is -mixing, we get the same result as above for .
Proof.
Let $E$ denote the bad event that we want to bound. Note that $E$ implies that there exists some set of linearly independent vectors all of whose images land in the combinatorial rectangle. (Notice that if the vectors are not linearly independent, this can only decrease the probability that the intersection is large, so we can concentrate on the case that distinct vectors give rise to distinct codewords.) Hence, it suffices to bound the probability that such a set exists.
Fix some of size . Applying Lemma 16 with (and note that we can assume that ), we know there exists such that has a -increasing chain of length . That is, we have such that for all ,
Now, we can bound
| (4) |
Next, we bound Equation 4 by
| (5) |
Towards bounding each term in the sum, observe that the increasing chain property tells us that for each , we can write , where contains -s that participated in , whereas contains two new -s. Now,
| (6) |
where is an indicator for whether past -s landed in , and is a fixed string that depends on the fixing of . Assume for simplicity that , where are nonzero. Then, using the fact that is -mixing, each summand of Equation 5 can now be bounded by
and summing over all -s gives us
Union-bounding over all -s, we get
| (7) |
First, note that we set parameters so that . Indeed, we can set , and then need to be at most, say, . Under this choice of , it also holds that , since is large enough. Overall, Equation 7 gives , as desired. The “In particular” part simply follows from Remark 20.
We are now ready to give our list recovery result.
Theorem 25 (list recovery with erasures).
For any , a prime , an integer , and , the following holds. With probability at least , a random linear code of rate
is -list-recoverable from erasures, with
provided that . In particular, there exists a universal constant such that:
-
When , we can take , and,
-
When for some , we can take .
4 List-Recovery from Errors over Arbitrary Fields
We now turn to the case of list-recovery from errors. Unlike in the case of list-recovery from erasures, we will make no assumptions on the underlying field. We firstly show that list-recovery balls are non-trivially mixing. We subsequently sketch how, again using Lemma 22, we can conclude the desired list-recovery result. The proofs are deferred to the full version [3].
4.1 Mixture of List-Recovery Balls
Lemma 26.
Suppose is a prime power, is an integer and . Let be of size , let , , and all independent. Then,
| (8) | ||||
| (9) |
In the sequel, we will need upper and lower bounds on the RHS of Lemma 26 to . The following lemma establishes the required bounds.
Lemma 27.
Suppose is a prime power, is an integer and . Suppose and . Then, the following inequalities hold:
| (10) | ||||
| (11) |
We can now state a lemma bounding the probability that a nontrivial linear combination of two uniform samples from a list-recovery ball lands in a shift of the list-recovery ball. Then, in Corollary 30, we show how to set parameters to turn this into a statement about the -mixing of a list-recovery ball.
Lemma 28.
Let , a prime power, an integer, and let . Let be subsets, each of size . Fix small enough so that
Let and , and let . Then,
| (12) |
Remark 29.
Since Lemma 27 implies
it follows that one can indeed choose small enough to ensure
In Corollary 30 we show how to choose to bound the two terms in Equation 12 by for a concrete (which depends on ).
Corollary 30.
Let , a prime power, an integer, and let . Let , each of size . The list-recovery ball is -mixing, for
assuming for some universal constant .
4.2 List Recovery from Errors
Having established that the list-recovery ball is -mixing, we repeat roughly the same argument used for list recovery from erasures, which relies on Lemma 24.
Theorem 31 (list recovery with errors).
For every , a prime power , an integer , , and , the following holds. With probability at least , a random linear code of rate
is -list-recoverable from errors, with
provided that is large enough. More concretely, there exists a universal constant such that
provided that .
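Analogously to the erasure case, the guarantee of Theorem 31 can be checked by brute force at toy scale. The sketch below is illustrative only: it collects the codewords of a small random linear code that disagree with the input lists in at most a rho-fraction of coordinates. Note that the mod-q arithmetic here is a field only for prime q, whereas the theorem allows arbitrary prime powers; all parameters are hypothetical.

```python
import itertools
import random

def random_linear_code(q, n, k, rng):
    """All q**k codewords of a random linear [n, k] code over F_q (q prime)."""
    G = [[rng.randrange(q) for _ in range(n)] for _ in range(k)]
    return [tuple(sum(m * G[i][j] for i, m in enumerate(msg)) % q
                  for j in range(n))
            for msg in itertools.product(range(q), repeat=k)]

def error_output_list(code, lists, rho):
    """Codewords in the radius-rho list-recovery ball: those that
    disagree with the input lists in at most rho * n coordinates."""
    n = len(lists)
    return [c for c in code
            if sum(1 for i in range(n) if c[i] not in lists[i]) <= rho * n]

# Hypothetical toy parameters: q = 5, n = 5, k = 2, lists of size 2,
# decoding radius rho = 0.2 (i.e., one disagreement allowed).
rng = random.Random(2)
code = random_linear_code(5, 5, 2, rng)
lists = [{0, 1}, {1, 2}, {0, 3}, {2, 4}, {0, 1}]
output = error_output_list(code, lists, 0.2)
```

As before, list-recoverability from errors with output list size L amounts to `len(output)` being at most L for every collection of input lists of the given size.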
References
- [1] Avraham Ben-Aroya, Dean Doron, and Amnon Ta-Shma. Near-optimal erasure list-decodable codes. In 35th Computational Complexity Conference (CCC), pages 1:1–1:27. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.CCC.2020.1.
- [2] Yeyuan Chen and Zihan Zhang. Explicit folded Reed-Solomon and multiplicity codes achieve relaxed generalized Singleton bounds, 2024. doi:10.48550/arXiv.2408.15925.
- [3] Dean Doron, Jonathan Mosheiff, Nicolas Resch, and João Ribeiro. List-recovery of random linear codes over small fields, 2025. doi:10.48550/arXiv.2505.05935.
- [4] Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman. Nearly optimal pseudorandomness from hardness. Journal of the ACM, 69(6), November 2022. doi:10.1145/3555307.
- [5] Dean Doron and Mary Wootters. High-probability list-recovery, and applications to heavy hitters. In 49th International Colloquium on Automata, Languages, and Programming (ICALP), pages 55:1–55:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.ICALP.2022.55.
- [6] Peter Elias. List decoding for noisy channels. Technical Report 335, Research Laboratory of Electronics, MIT, 1957. URL: https://dspace.mit.edu/handle/1721.1/4484.
- [7] Anna C. Gilbert, Yi Li, Ely Porat, and Martin J. Strauss. For-all sparse recovery in near-optimal time. ACM Transactions on Algorithms, 13(3):1–26, 2017. doi:10.1145/3039872.
- [8] Eitan Goldberg, Chong Shangguan, and Itzhak Tamo. Singleton-type bounds for list-decoding and list-recovery, and related results. In International Symposium on Information Theory (ISIT), pages 2565–2570. IEEE, 2022. doi:10.1109/ISIT50566.2022.9834849.
- [9] Sivakanth Gopi, Swastik Kopparty, Rafael Oliveira, Noga Ron-Zewi, and Shubhangi Saraf. Locally testable and locally correctable codes approaching the Gilbert-Varshamov bound. IEEE Transactions on Information Theory, 64(8):5813–5831, 2018. doi:10.1109/TIT.2018.2809788.
- [10] Zeyu Guo, Ray Li, Chong Shangguan, Itzhak Tamo, and Mary Wootters. Improved list-decodability and list-recoverability of Reed-Solomon codes via tree packings. In 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 708–719. IEEE, 2021. doi:10.1109/FOCS52979.2021.00074.
- [11] Venkatesan Guruswami. List decoding from erasures: Bounds and code constructions. IEEE Transactions on Information Theory, 49(11):2826–2833, 2003. doi:10.1109/TIT.2003.815776.
- [12] Venkatesan Guruswami, Johan Håstad, and Swastik Kopparty. On the list-decodability of random linear codes. IEEE Transactions on Information Theory, 57(2):718–725, 2011. Preliminary version at STOC 2010. doi:10.1109/TIT.2010.2095170.
- [13] Venkatesan Guruswami and Piotr Indyk. Near-optimal linear-time codes for unique decoding and new list-decodable codes over smaller alphabets. In 34th Annual Symposium on Theory of Computing (STOC), pages 812–821. ACM, 2002. doi:10.1145/509907.510023.
- [14] Venkatesan Guruswami and Piotr Indyk. Linear time encodable and list decodable codes. In 35th Annual Symposium on Theory of Computing (STOC), pages 126–135. ACM, 2003. doi:10.1145/780542.780562.
- [15] Venkatesan Guruswami and Piotr Indyk. Efficiently decodable codes meeting Gilbert-Varshamov bound for low rates. In 15th Annual Symposium on Discrete Algorithms (SODA), pages 756–757. SIAM, 2004. URL: http://dl.acm.org/citation.cfm?id=982792.982907.
- [16] Venkatesan Guruswami and Piotr Indyk. Linear-time encodable/decodable codes with near-optimal rate. IEEE Transactions on Information Theory, 51(10):3393–3400, 2005. doi:10.1109/TIT.2005.855587.
- [17] Venkatesan Guruswami, Ray Li, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, and Mary Wootters. Bounds for list-decoding and list-recovery of random linear codes. IEEE Transactions on Information Theory, 68(2):923–939, 2022. doi:10.1109/TIT.2021.3127126.
- [18] Venkatesan Guruswami and Nicolas Resch. On the list-decodability of random linear rank-metric codes. In International Symposium on Information Theory (ISIT), pages 1505–1509. IEEE, 2018. doi:10.1109/ISIT.2018.8437698.
- [19] Venkatesan Guruswami, Christopher Umans, and Salil P. Vadhan. Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes. Journal of the ACM, 56(4):20:1–20:34, 2009. doi:10.1145/1538902.1538904.
- [20] Iftach Haitner, Yuval Ishai, Eran Omri, and Ronen Shaltiel. Parallel hashing via list recoverability. In Advances in Cryptology – CRYPTO 2015, pages 173–190. Springer Berlin Heidelberg, 2015. doi:10.1007/978-3-662-48000-7_9.
- [21] Brett Hemenway, Noga Ron-Zewi, and Mary Wootters. Local list recovery of high-rate tensor codes and applications. SIAM Journal on Computing, 49(4):FOCS17–157, January 2020. doi:10.1137/17M116149X.
- [22] Justin Holmgren, Alex Lombardi, and Ron D. Rothblum. Fiat–Shamir via list-recoverable codes (or: parallel repetition of GMW is not zero-knowledge). In 53rd Annual Symposium on Theory of Computing (STOC), pages 750–760. ACM, 2021. doi:10.1145/3406325.3451116.
- [23] Piotr Indyk, Hung Q. Ngo, and Atri Rudra. Efficiently decodable non-adaptive group testing. In 21st Annual Symposium on Discrete Algorithms (SODA), pages 1126–1142. SIAM, 2010. doi:10.1137/1.9781611973075.91.
- [24] Itay Kalev and Amnon Ta-Shma. Unbalanced expanders from multiplicity codes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 12:1–12:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.APPROX/RANDOM.2022.12.
- [25] Sergey Komech and Jonathan Mosheiff. Let’s have both! Optimal list-recoverability via alphabet permutation codes, 2025. doi:10.48550/arXiv.2502.05858.
- [26] Swastik Kopparty, Or Meir, Noga Ron-Zewi, and Shubhangi Saraf. High-rate locally correctable and locally testable codes with sub-polynomial query complexity. Journal of the ACM, 64(2):1–42, 2017. doi:10.1145/3051093.
- [27] Swastik Kopparty, Noga Ron-Zewi, Shubhangi Saraf, and Mary Wootters. Improved decoding of folded Reed-Solomon and multiplicity codes. In 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 212–223. IEEE, 2018. doi:10.1109/FOCS.2018.00029.
- [28] Kasper Green Larsen, Jelani Nelson, Huy L. Nguyen, and Mikkel Thorup. Heavy hitters via cluster-preserving clustering. In 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 61–70. IEEE, 2016. doi:10.1109/FOCS.2016.16.
- [29] Vsevolod F. Lev. Linear equations over and moments of exponential sums. Duke Mathematical Journal, 107(2):239–263, 2001. doi:10.1215/S0012-7094-01-10722-9.
- [30] Matan Levi, Jonathan Mosheiff, and Nikhil Shagrithaya. Random Reed-Solomon codes and random linear codes are locally equivalent, 2024. doi:10.48550/arXiv.2406.02238.
- [31] Ray Li and Nikhil Shagrithaya. Near-optimal list-recovery of linear code families, 2025. doi:10.48550/arXiv.2502.13877.
- [32] Ben Lund and Aditya Potukuchi. On the list recoverability of randomly punctured codes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 30:1–30:11. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.APPROX/RANDOM.2020.30.
- [33] Jonathan Mosheiff, Nicolas Resch, Kuo Shang, and Chen Yuan. Randomness-efficient constructions of capacity-achieving list-decodable codes, 2024. doi:10.48550/arXiv.2402.11533.
- [34] Hung Q. Ngo, Ely Porat, and Atri Rudra. Efficiently decodable error-correcting list disjunct matrices and applications. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 557–568. Springer, 2011.
- [35] Nicolas Resch. List-Decodable Codes: (Randomized) Constructions and Applications. PhD thesis, Carnegie Mellon University, 2020. URL: http://reports-archive.adm.cs.cmu.edu/anon/2020/abstracts/20-113.html.
- [36] Atri Rudra and Mary Wootters. Every list-decodable code for high noise has abundant near-optimal rate puncturings. In 46th Annual Symposium on Theory of Computing (STOC), pages 764–773. ACM, 2014. doi:10.1145/2591796.2591797.
- [37] Atri Rudra and Mary Wootters. Average-radius list-recoverability of random linear codes. In 29th Annual Symposium on Discrete Algorithms (SODA), pages 644–662. SIAM, 2018. doi:10.1137/1.9781611975031.42.
- [38] Amnon Ta-Shma and David Zuckerman. Extractor codes. IEEE Transactions on Information Theory, 50(12):3015–3025, 2004. doi:10.1109/TIT.2004.838377.
- [39] Itzhak Tamo. Tighter list-size bounds for list-decoding and recovery of folded Reed-Solomon and multiplicity codes. IEEE Transactions on Information Theory, 70(12):8659–8668, 2024. doi:10.1109/TIT.2024.3402171.
- [40] Victor Vasilievich Zyablov and Mark Semenovich Pinsker. List concatenated decoding. Problemy Peredachi Informatsii, 17(4):29–33, 1981.
