Tight Bounds on List-Decodable and List-Recoverable Zero-Rate Codes
Abstract
In this work, we consider the list-decodability and list-recoverability of codes in the zero-rate regime. Briefly, a code $\mathcal{C} \subseteq [q]^n$ is $(p, \ell, L)$-list-recoverable if for all tuples of input lists $(Y_1, \dots, Y_n)$ with each $Y_i \subseteq [q]$ and $|Y_i| \le \ell$, the number of codewords $c \in \mathcal{C}$ such that $c_i \notin Y_i$ for at most $pn$ choices of $i \in [n]$ is less than $L$; list-decoding is the special case of $\ell = 1$. In recent work by Resch, Yuan and Zhang (ICALP 2023) the zero-rate threshold for list-recovery was determined for all parameters: that is, the work explicitly computes $p_* := p_*(q, \ell, L)$ with the property that for all $\epsilon > 0$ (a) there exist positive-rate $(p_* - \epsilon, \ell, L)$-list-recoverable codes, and (b) any $(p_* + \epsilon, \ell, L)$-list-recoverable code has rate $0$. In fact, in the latter case the code has constant size, independent of the block length $n$. However, the constant size in their work is quite large as a function of $1/\epsilon$.
Our contribution in this work is to show that for all choices of $q$, $\ell$ and $L$ with $q \ge 3$, any $(p_* + \epsilon, \ell, L)$-list-recoverable code must have size $O_{q,\ell,L}(\epsilon^{-3/2})$, and furthermore this upper bound is complemented by a matching lower bound $\Omega_{q,\ell,L}(\epsilon^{-3/2})$. This greatly generalizes work by Alon, Bukh and Polyanskiy (IEEE Trans. Inf. Theory 2018) which focused only on the case of a binary alphabet (and thus necessarily only list-decoding). We remark that we can in fact recover the same result for $q = 2$ and even $L$, as obtained by Alon, Bukh and Polyanskiy: we thus strictly generalize their work.
Our main technical contribution is to (a) properly define a linear programming relaxation of the list-recovery condition over large alphabets; and (b) to demonstrate that a certain function defined on the $q$-ary probability simplex is maximized by the uniform distribution. This represents the core challenge in generalizing to larger $q$ (as a binary simplex can be naturally identified with a one-dimensional interval). We can subsequently re-utilize certain Schur convexity and convexity properties established for a related function by Resch, Yuan and Zhang, along with ideas of Alon, Bukh and Polyanskiy.
Keywords and phrases:
List Decoding, List Recovery, Zero Rate
2012 ACM Subject Classification:
Mathematics of computing → Coding theory
Funding:
The research of C. Yuan was supported in part by the National Key R&D Program of China under Grant 2023YFE0123900 and the Natural Science Foundation of Shanghai under the 2024 Shanghai Action Plan for Science, Technology and Innovation Grant 24BC3200700. The research of N. Resch is supported in part by an NWO (Dutch Research Council) grant with number C.2324.0590; this work was done in part while he was visiting the Simons Institute for the Theory of Computing, supported by DOE grant #DE-SC0024124.
Editors:
Raghu Meka
Series and Publisher:
Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Given an error-correcting code $\mathcal{C} \subseteq [q]^n$, a fundamental requirement is that the codewords are sufficiently well-spread in order to guarantee some non-trivial correctability properties. This is typically enforced by requiring a lower bound on the minimum distance of the code, $d(\mathcal{C}) := \min_{c \ne c' \in \mathcal{C}} d_H(c, c')$, where $d_H$ denotes the Hamming distance (i.e., the number of coordinates on which two strings differ). Note that minimum distance at least $d$ is equivalent to the following “packing” property: if we put a ball of radius $\lfloor (d-1)/2 \rfloor$ around any point $y \in [q]^n$ – i.e., we consider the Hamming ball $B(y, \lfloor (d-1)/2 \rfloor) := \{x \in [q]^n : d_H(x, y) \le \lfloor (d-1)/2 \rfloor\}$ – then all these balls contain at most 1 codeword from $\mathcal{C}$.
This latter viewpoint can easily be generalized to obtain list-decodability, where we now require that such Hamming balls do not capture “too many” codewords. That is, for $p \in (0,1)$ and $L \in \mathbb{N}$, a code $\mathcal{C} \subseteq [q]^n$ is called $(p, L)$-list-decodable if every Hamming ball of radius $pn$ contains less than $L$ codewords from $\mathcal{C}$. In notation: for all $y \in [q]^n$, $|B(y, pn) \cap \mathcal{C}| < L$. (Footnote: Typically the upper bound is $L$ rather than $L - 1$. However, for “impossibility” arguments this parametrization is more common, as it leads to less cumbersome computations.) This notion was already introduced in the 50's by Elias and Wozencraft [10, 30, 11] but has in the past 20 years seen quite a bit of attention due to its connections to other parts of theoretical computer science [13, 2, 21, 19, 18, 26].
One can push this generalization further to obtain list-recoverability. Here, we consider a tuple of input lists $(Y_1, \dots, Y_n)$, where each $Y_i \subseteq [q]$ has size at most $\ell$ (for some $1 \le \ell < q$). The requirement is that the number of codewords that “disagree” with $(Y_1, \dots, Y_n)$ in at most $pn$ coordinates is less than $L$. More formally, if for all tuples $(Y_1, \dots, Y_n)$ the number of codewords $c \in \mathcal{C}$ such that $|\{i \in [n] : c_i \notin Y_i\}| \le pn$ is less than $L$, the code is called $(p, \ell, L)$-list-recoverable. Note that $(p, L)$-list-decodability is nothing other than $(p, 1, L)$-list-recoverability. Initially, list-recoverability was abstracted as a useful stepping stone towards list-decoding concatenated codes. However, in recent years this notion has found many connections to other parts of computer science, e.g. in cryptography [15, 16], randomness extraction [14], hardness amplification [8], group testing [17, 23], streaming algorithms [9], and beyond.
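To make the definition concrete, the following brute-force checker (a small illustrative sketch of ours – the function name and toy parameters are not from the paper) enumerates all tuples of input lists and verifies the condition directly. It is feasible only for toy parameters, since there are $\binom{q}{\ell}^n$ tuples to check.

```python
from itertools import combinations, product

def is_list_recoverable(code, q, p, ell, L):
    """Check that `code` (tuples over {0,...,q-1}) is (p, ell, L)-list-
    recoverable: for every tuple of input lists (Y_1,...,Y_n) with
    |Y_i| = ell, fewer than L codewords c satisfy c_i not in Y_i on at
    most p*n coordinates. (Taking |Y_i| = ell exactly is without loss of
    generality, since larger lists only capture more codewords.)"""
    n = len(code[0])
    all_lists = list(combinations(range(q), ell))
    for Y in product(all_lists, repeat=n):
        captured = sum(
            1 for c in code
            if sum(c[i] not in Y[i] for i in range(n)) <= p * n
        )
        if captured >= L:
            return False
    return True

# Toy example: the ternary repetition code of length 6.
code = [(0,) * 6, (1,) * 6, (2,) * 6]
print(is_list_recoverable(code, q=3, p=0.4, ell=1, L=2))  # True
```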
Rate versus noise-resilience.
Having fixed a desired “error-tolerance” as determined by the parameters $p$, $\ell$ and $L$, we would also like the code to be as large as possible: intuitively, this implies that the code contains the minimal amount of redundancy possible. A fundamental question in coding theory is to understand the achievable tradeoffs between the rate $R := \frac{\log_q |\mathcal{C}|}{n}$ and some “error-resilience” property of the code, e.g., minimum distance, list-decodability, or list-recoverability.
This question in full generality is wide open. Even the special case of $q = 2$ and relative minimum distance $\delta$ (i.e., determining the optimal tradeoff between rate and distance for binary codes) is unclear: on the possibility side we have the Gilbert-Varshamov bound [12, 28] showing $R \ge 1 - h(\delta)$ is achievable (here, $h$ is the binary entropy function), while bounds of Elias and Bassalygo [3] and the linear programming bound [29, 22, 7] give incomparable and non-tight upper bounds. None of these bounds has been substantially improved in at least 40 years. The situation is even more complicated for larger $q$: for $q = 49$ (and larger square prime powers) the celebrated algebraic geometry codes of Tsfasman, Vlăduţ and Zink [27] provide explicit codes of higher rate in certain regimes than those promised by the Gilbert-Varshamov bound.
When one relaxes the question to allow an asymptotically growing list size $L$ then we do have a satisfactory answer: the answer is provided by the list-decoding/-recovery capacity theorem, which states that for all $\epsilon > 0$ and all sufficiently large $L$ (as a function of $\epsilon$, $q$ and $\ell$) there exist $(p, \ell, L)$-list-recoverable codes of rate $1 - h_{q,\ell}(p) - \epsilon$, where
$$h_{q,\ell}(p) := p \log_q\left(\frac{q - \ell}{p}\right) + (1 - p)\log_q\left(\frac{\ell}{1 - p}\right)$$
is the $(q,\ell)$-ary entropy function [24]. (Footnote: Note that setting $\ell = 1$ recovers the standard $q$-ary entropy function, which itself reduces to the binary entropy function upon setting $q = 2$.) On the other hand, any code of rate $1 - h_{q,\ell}(p) + \epsilon$ fails to be $(p, \ell, L)$-list-recoverable unless $L$ grows exponentially in $n$. However, this does not provide very meaningful bounds if one is interested in, say, $(p, \ell, L)$-list-recoverable codes with a small constant $L$.
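For intuition, here is a small script (ours) evaluating the entropy expression displayed above and its boundary behaviour; the function names are ours.

```python
import math

def h_q_ell(p, q, ell):
    """The (q, ell)-ary entropy from the capacity theorem above;
    ell = 1 recovers the usual q-ary entropy function."""
    if p == 0:
        return math.log(ell, q)  # limiting value as p -> 0
    return (p * math.log((q - ell) / p, q)
            + (1 - p) * math.log(ell / (1 - p), q))

def capacity(p, q, ell):
    """Optimal rate of (p, ell, L)-list-recoverable codes for large L."""
    return 1 - h_q_ell(p, q, ell)

print(capacity(0.5, 2, 1))        # ~0: binary list-decoding capacity at p = 1/2
print(capacity(1 - 2 / 4, 4, 2))  # ~0: capacity vanishes at p = 1 - ell/q
print(capacity(0.25, 4, 2))       # positive rate below the threshold
```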
Positive versus zero-rate regimes.
Thus far, we have implicitly been discussing the positive-rate regime. However, one can also ask questions about the behaviour of codes in the zero-rate regime. For context, recent work by Resch, Yuan and Zhang [25] computed the zero-rate threshold for list-recovery: that is, for all alphabet sizes $q$, input list sizes $\ell$ and output list sizes $L$, they determine the value $p_* = p_*(q, \ell, L)$ such that (a) for all $p < p_*$ there exist infinite families of positive-rate $(p, \ell, L)$-list-recoverable codes over the alphabet $[q]$, and (b) for all $p > p_*$ there does not exist such an infinite family.
Having now delineated the “positive-rate” and the “zero-rate” regimes depending on how $p$ compares to $p_*$, in this work we study the zero-rate regime for list-recoverable codes for all alphabet sizes $q \ge 3$. In [25], it is shown that $(p_* + \epsilon, \ell, L)$-list-recoverable codes have constant size (that is, independent of the block length $n$); however, this constant is massive in the parameters due to the use of a Ramsey-theoretic bound. In particular, the dependence on $1/\epsilon$ is already enormous, and it is additionally multiplied by a tower of 2's whose height grows with $L$.
To the best of our knowledge, prior work on this question focuses exclusively on the case $q = 2$. For example, in the case of $L = 2$ (i.e., unique-decoding) we have $p_*(2, 1, 2) = 1/4$, and work by Levenshtein [20] shows that a code construction based on Hadamard matrices corrects a $(1/4 + \epsilon)$-fraction of errors and has size $\Omega(1/\epsilon)$. A particularly relevant prior work is due to Alon, Bukh and Polyanskiy [1]. Herein the authors consider this question for the special case of $q = 2$ (and thus, necessarily, only for list-decoding). In particular, they show that when $L$ is even, if $p = p_*(2, 1, L) + \epsilon$ then such a $(p, L)$-list-decodable code has size at most $O_L(\epsilon^{-3/2})$, and moreover provide a construction of such a code with size $\Omega_L(\epsilon^{-3/2})$. (Footnote: Note that for the special case of $q = 2$, the zero-rate threshold for list-decoding had already been established by Blinovsky [4].) They observe some interesting behaviour in the case of odd $L$; in particular, the maximum size of a $(p_* + \epsilon, L)$-list-decodable code then has a different asymptotic dependence on $\epsilon$. (Footnote: This argument in fact shows a flaw in an earlier claimed proof of Blinovsky, which asserted the same asymptotic size for all $L$.)
Our motivations for this investigation are three-fold. Firstly, the zero-rate regime offers combinatorial challenges and interesting behaviours that we uncover in this work. Secondly, many codes that find applications in other areas of theoretical computer science in fact have subconstant rate. Lastly, the zero-rate regime appears much more tractable than the positive-rate regime – indeed, we can obtain tight upper and lower bounds on the size of a code, as we will soon see. It would be interesting to determine to what extent such techniques could be useful for understanding the positive-rate regime as well.
1.1 Our results
Our main result in this work is a tight bound on the size of a $(p, \ell, L)$-list-recoverable code over an alphabet of size $q \ge 3$ when $p = p_*(q, \ell, L) + \epsilon$. The main technical challenge is to compute the following upper bound on the size of such a code.
Theorem 1.
Let $q, \ell, L \in \mathbb{N}$ with $q \ge 3$ and $\ell < q$ be fixed constants. Let $\epsilon > 0$ and put $p = p_*(q, \ell, L) + \epsilon$. Suppose $\mathcal{C} \subseteq [q]^n$ is $(p, \ell, L)$-list-recoverable. Then $|\mathcal{C}| = O(\epsilon^{-3/2})$.
We complement the above negative result with the following code construction, showing the upper bound is tight.
Theorem 2.
Let $q, \ell, L \in \mathbb{N}$ with $q \ge 3$ and $\ell < q$ be fixed constants. Let $\epsilon > 0$ and put $p = p_*(q, \ell, L) + \epsilon$. There exists a $(p, \ell, L)$-list-recoverable code $\mathcal{C} \subseteq [q]^n$ such that $|\mathcal{C}| = \Omega(\epsilon^{-3/2})$.
We emphasize that in the above theorems the implied constants may depend on $q$, $\ell$ and $L$.
Note that our results explicitly exclude the case of $q = 2$. As the authors of [1] prove, the binary alphabet behaves in subtle ways: the bound on the code size depends on the parity of $L$. Intriguingly, our work demonstrates that such behaviour does not arise over larger alphabets.
1.2 Technical Overview
The double-counting argument.
Since our focus is on zero-rate list-decodable/-recoverable codes, it helps to first review the proof of the zero-rate threshold $p_* = p_*(q, \ell, L)$. A lower bound can be easily obtained by a random construction that attains a positive rate for any $p < p_*$. For the upper bound, let us first consider the list-decoding case, i.e., $\ell = 1$. The proof in [5, 6, 25], at a high level, proceeds via a double-counting argument. (Footnote: A characterization of $p_*$ was announced in [5, 6] whose proof was flawed. The work [25] filled in the gaps therein and characterized $p_*$ for general parameters.) For any $(p, L)$-list-decodable code $\mathcal{C}$, the proof aims to upper and lower bound the radius of a list averaged over the choice of the list from $\mathcal{C}$:
$$\mathbb{E}_{c_1, \dots, c_L \sim \mathcal{C}}\left[\mathrm{rad}(c_1, \dots, c_L)\right], \qquad (1)$$
where the $c_j$ are drawn independently and uniformly from $\mathcal{C}$. Comparing the bounds produces an upper bound on $|\mathcal{C}|$. Here $\mathrm{rad}(\cdot)$, known as the Chebyshev radius of a list, is the relative radius of the smallest Hamming ball containing all codewords in the list. A lower bound on Equation 1 essentially follows from the list-decodability of $\mathcal{C}$. Indeed, each term corresponding to a list consisting of $L$ distinct codewords is lower bounded by $p$, as otherwise a list that fits into a ball of relative radius at most $p$ is found, violating the list-decodability of $\mathcal{C}$. Therefore Equation 1 is at least $p - o(1)$, where the $o(1)$ term is to account for lists with not-all-distinct codewords.
On the other hand, it is much trickier to upper bound Equation 1 as, in general, $\mathrm{rad}(\cdot)$ admits no analytically closed form and can only be computed by solving a min-max problem. Previous proofs [25] therefore first extract a subcode with highly regular list structures via the hypergraph Ramsey theorem. This allows one to assert that all lists have essentially the same radius and that all codewords in each list have essentially the same distance to the center of the list. As a result, the min-max expression is “linearized” and Equation 1 can be upper bounded when restricted to the subcode. The downside is that the Ramsey reduction step is rather lossy for the code size.
Weighted average radius.
The effect of the Ramsey reduction, put formally, is to enforce the average radius
$$\overline{\mathrm{rad}}(c_1, \dots, c_L) := \frac{1}{n} \min_{y \in [q]^n} \frac{1}{L} \sum_{j \in [L]} d_H(y, c_j) \qquad (2)$$
of every list in the subcode to be approximately equal. To extract the regularity structures in lists without resorting to extremal bounds from Ramsey theory, [1] introduced the notion of weighted average radius which “linearizes” the Chebyshev radius in a weighted manner:
$$\mathrm{rad}_w(c_1, \dots, c_L) := \frac{1}{n} \min_{y \in [q]^n} \sum_{j \in [L]} w_j \, d_H(y, c_j),$$
where $w = (w_1, \dots, w_L)$ is a distribution on $L$ elements. For any weighting $w$, the collection of values $\mathrm{rad}_w$ over lists from the code forms a suite of succinct statistics of the list distribution. It turns out that, in expectation over random lists, $\mathrm{rad}_{\mathrm{unif}}$ (where $\mathrm{unif}$ denotes the uniform distribution on $[L]$) is maximal among all $\mathrm{rad}_w$. Recall that the double-counting argument suggests that in an optimal zero-rate code, the behaviour of the ensemble average of $\mathrm{rad}$ is essentially captured by that of $\mathrm{rad}_{\mathrm{unif}} = \overline{\mathrm{rad}}$. In particular, list-decodability ensures that $\mathrm{rad}_{\mathrm{unif}}$ of most lists should be large. However, not too many lists in an optimal code are expected to have large $\mathrm{rad}_w$ for any $w$. [1] then managed to quantify the gap between $\mathrm{rad}_w$ and $\mathrm{rad}_{\mathrm{unif}}$ (with $w \neq \mathrm{unif}$), which yields improved (and sometimes optimal) size-radius trade-offs for zero-rate codes.
Generalization to $q$-ary list-decoding.
Our major technical contribution is in extrapolating the above ideas to list-recovery. The challenge lies particularly in defining a proper notion of weighted average radius and proving its properties. Our definition relies crucially on an embedding from $[q]$ to the simplex in $\mathbb{R}^q$ and relaxes the center of the list to be a fractional vector. Specifically, denoting by $\Delta_q$ the simplex in $\mathbb{R}^q$ and by $V(\Delta_q) = \{e_1, \dots, e_q\}$ its vertices (i.e., the standard basis of $\mathbb{R}^q$), we let the embedding $\varphi : [q] \to V(\Delta_q)$ map each symbol $a$ to the standard basis vector $e_a$. Denoting by $\mathbf{x}_1, \dots, \mathbf{x}_L$ the (element-wise) images of a list $(c_1, \dots, c_L)$, we define the weighted average radius of the list as:
$$\mathrm{rad}_w(c_1, \dots, c_L) := \frac{1}{n} \min_{\boldsymbol{\alpha} \in \Delta_q^n} \sum_{j \in [L]} w_j \cdot \frac{1}{2}\left\|\mathbf{x}_j - \boldsymbol{\alpha}\right\|_1, \qquad (3)$$
where $w$ is any distribution on $[L]$.
The nontriviality and significance of the above notion, especially the embedding used therein, is three-fold (a small computational sketch follows this list).
-
First, as the weighting $w$ varies, $\mathrm{rad}_w$ serves as a bridge between the standard average radius in Equation 2 and the Chebyshev radius. Indeed, $w = \mathrm{unif}$ recovers the former, and the maximum over all $w$ recovers the latter. However, we caution that the second statement does not hold without the embedding, since the Hamming distance between $q$-ary symbols per se is not defined over a convex domain and cannot be interpolated by a convex function, which makes the minimax theorem inapplicable. Fortunately, our embedding affinely extends the $q$-ary Hamming distance to the simplex and therefore brings back the applicability of the minimax theorem, connecting $\max_w \mathrm{rad}_w$ to the (relaxed) Chebyshev radius.
-
Second, our definition in Equation 3 allows $\boldsymbol{\alpha}$ to take any value in the simplex, instead of only its vertices, i.e., the image of $[q]$ under $\varphi$. Though embedding naively to the hypercube seems convenient, upon solving the expression with a fractional center one does not necessarily obtain a notion that is guaranteed to closely approximate the original version with an integral center. In contrast, using linear programming duality, we show that our embedding yields a relaxed notion of radius which closely approximates the actual Chebyshev radius. Indeed, upon rounding the fractional center and taking its pre-image under $\varphi$, our results guarantee that the resulting radius has a negligible difference from the Chebyshev radius. Precisely speaking, we want to find a vector $\boldsymbol{\alpha}$ close to the images of the codewords by linear programming. Meanwhile, we want $\boldsymbol{\alpha}$ to belong to $V(\Delta_q)^n$ so that we can find a preimage of $\boldsymbol{\alpha}$ in $[q]^n$. Since each block $\boldsymbol{\alpha}_i$ lies in $\Delta_q$, its components are subject to $\sum_{a \in [q]} \alpha_{i,a} = 1$. This implies that at least one component of $\boldsymbol{\alpha}_i$ is nonzero. The theory of basic feasible solutions guarantees that there exists a feasible solution in which almost all blocks have exactly one nonzero component. Combining with the constraint $\sum_{a \in [q]} \alpha_{i,a} = 1$ forces $\boldsymbol{\alpha}_i \in V(\Delta_q)$ for almost all $i$. Thus, we obtain a negligible loss in the conversion between the Hamming distance and its geometric relaxation.
-
Finally, under the embedding $\varphi$, the weighted average radius still retains the appealing feature that the minimization can be analytically solved, therefore giving rise to an explicit expression (see Equation 31) which greatly facilitates our analysis.
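To illustrate the embedding and Equation 3 numerically, the following sketch (ours; the helper names and the tiny example are only for illustration) evaluates $\mathrm{rad}_w$ via the coordinate-wise closed form alluded to in the last item:

```python
import numpy as np

def phi(c, q):
    """Embed a word over {0,...,q-1} as an n x q matrix of basis rows."""
    return np.eye(q)[np.asarray(c)]

def half_l1(alpha, beta):
    """Half-L1 distance; on vertices of the simplex it equals d_H."""
    return 0.5 * np.abs(alpha - beta).sum()

def rad_w(codewords, w, q):
    """Weighted average radius via the explicit per-coordinate solution
    (cf. Equation 31): rad_w = (1/n) sum_i (1 - max_a sum_{j: c_j[i]=a} w_j)."""
    C = np.asarray(codewords)              # shape (L, n)
    L, n = C.shape
    total = 0.0
    for i in range(n):
        weight_on = np.zeros(q)
        for j in range(L):
            weight_on[C[j, i]] += w[j]
        total += 1.0 - weight_on.max()     # optimal center block is a vertex
    return total / n

# For two codewords and w = (1/2, 1/2), rad_w is half the relative Hamming
# distance (the optimal fractional center sits "between" them).
c1, c2 = [0, 1, 2, 0], [0, 2, 2, 1]
print(half_l1(phi(c1, 3), phi(c2, 3)))    # equals d_H(c1, c2) = 2
print(rad_w([c1, c2], [0.5, 0.5], q=3))   # d_H = 2, n = 4  ->  0.25
```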
We then show, via techniques deviating from those in [1], three key properties that are required by the subsequent arguments (a numerical illustration of the first property is given after the list).
-
1.
For any fixed weighting $w \in \Delta_L$ with $w_j > 0$ for all $j$, if the entries of the codewords in the list are generated i.i.d. according to a distribution $P$ on $[q]$, then the expected weighted average radius
$$g_P(w) := \mathbb{E}\left[\mathrm{rad}_w(c_1, \dots, c_L)\right]$$
is maximized when $P = \mathrm{unif}$. Moreover, equality holds if and only if $P = \mathrm{unif}$. Our approach is different from [1], as we cannot explicitly represent the function $g_P(w)$.
-
2.
Furthermore, if the entries of the codewords in the list are generated i.i.d. according to the uniform distribution, then $g_{\mathrm{unif}}(w)$ is upper bounded by $g_{\mathrm{unif}}(\mathrm{unif})$, with a quantitative gap when $w$ is far from $\mathrm{unif}$. This follows from the Schur convexity property proved in [25].
-
3.
Finally, denoting by $P_i$ the distribution of the $i$-th components of codewords in the code $\mathcal{C}$, Schur convexity promises $g_{P_i}(\mathrm{unif}) \le g_{\mathrm{unif}}(\mathrm{unif})$. In [25], it is proved that the relevant function is convex in the regime of interest. Thus, we can conclude that
$$\frac{1}{n} \sum_{i \in [n]} g_{P_i}(\mathrm{unif}) \le g_{\mathrm{unif}}(\mathrm{unif}) = p_*,$$
with $p_*$ the zero-rate threshold.
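As a quick numerical corroboration of the first property, the following toy computation (ours) enumerates all $q^L$ symbol tuples exactly and compares $g$ at the uniform distribution against random alternatives:

```python
from itertools import product
import numpy as np

def g(w, P):
    """g_P(w) = E[1 - max_a sum_{j : X_j = a} w_j] for X_1,...,X_L iid P,
    computed by exact enumeration over all q^L tuples."""
    q, L = len(P), len(w)
    total = 0.0
    for xs in product(range(q), repeat=L):
        prob = float(np.prod([P[x] for x in xs]))
        weight_on = np.zeros(q)
        for j, x in enumerate(xs):
            weight_on[x] += w[j]
        total += prob * (1.0 - weight_on.max())
    return total

q, L = 3, 3
unif = np.full(q, 1 / q)
rng = np.random.default_rng(0)
w = rng.dirichlet(np.ones(L))             # a fixed, fully supported weighting
for _ in range(200):
    P = rng.dirichlet(np.ones(q))         # random symbol distribution
    assert g(w, P) <= g(w, unif) + 1e-12  # uniform P maximizes g(w, .)
print("maximum over sampled P attained at unif:", g(w, unif))
```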
The remaining part of our proof is similar to [1]. We show that a code either has radius noticeably exceeding $p_*$ on a large subcode, or most of the $L$-tuples of distinct codewords in $\mathcal{C}$ are distributed close to uniform. For the former case, we use the convexity property to show that the list-decoding radius of $\mathcal{C}$ cannot exceed $p_*$ by much. For the latter case, since most $L$-tuples of distinct codewords in $\mathcal{C}$ look uniformly random, we can show that the list-decodability of $\mathcal{C}$ is very close to that of random codes.
Generalization to list-recovery.
For list-recovery, i.e., $\ell \ge 2$, we find an embedding that maps each element of $[q]$ to a superposition of vertices of the simplex in $\mathbb{R}^{\binom{q}{\ell}}$; i.e., we map each element of $[q]$ to a vector in $\mathbb{R}^{\binom{[q]}{\ell}}$, where $\binom{[q]}{\ell}$ is the collection of all $\ell$-subsets of $[q]$. Concretely, we define $\varphi(a)$ as the normalized superposition of the basis vectors $e_Y$ over the $\ell$-subsets $Y$ containing $a$, where $\{e_Y\}_{Y \in \binom{[q]}{\ell}}$ is the standard basis of $\mathbb{R}^{\binom{[q]}{\ell}}$. The intuition behind this map is that the distance from $\varphi(a)$ to the vertex $e_Y$ takes only two values, the smaller one precisely when $a \in Y$. Similarly to list-decoding, given $L$ codewords in $[q]^n$, we obtain $L$ vectors under the map $\varphi$. Our goal is to find a center $\boldsymbol{\alpha}$ close to these vectors, subject to the constraint that each block $\boldsymbol{\alpha}_i$ lies in the simplex. This constraint, combined with the basic feasible solution argument, forces that for almost all $i$, the block $\boldsymbol{\alpha}_i$ is of the form $e_Y$ for some $\ell$-subset $Y$. For such $\boldsymbol{\alpha}$, we can find a tuple of $\ell$-subsets preserving the distance, i.e., a tuple of input lists whose list-recovery radius is essentially the relaxed radius.
Besides the linear programming relaxation, further adjustments for the proofs of the properties analogous to Items 1, 2, and 3 above are required. A small illustration of this embedding follows.
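The following sketch (ours) illustrates the list-recovery embedding; the normalization is our illustrative choice, and the point is only that the distance from $\varphi(a)$ to a vertex $e_Y$ takes exactly two values, the smaller one precisely when $a \in Y$.

```python
from itertools import combinations
import numpy as np

q, ell = 4, 2
subsets = list(combinations(range(q), ell))       # the binom(q, ell) vertices
index = {Y: k for k, Y in enumerate(subsets)}

def phi(a):
    """Normalized superposition of e_Y over the ell-subsets Y containing a."""
    v = np.zeros(len(subsets))
    for Y in subsets:
        if a in Y:
            v[index[Y]] = 1.0
    return v / v.sum()

def half_l1(u, v):
    return 0.5 * np.abs(u - v).sum()

e = np.eye(len(subsets))
dists = {(a in Y): round(half_l1(phi(a), e[index[Y]]), 6)
         for a in range(q) for Y in subsets}
print(dists)   # two distances only: {True: 2/3, False: 1.0} for q=4, ell=2
```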
Code construction.
As alluded to before, a code that saturates the optimal size-radius trade-off should essentially saturate both the upper and lower bounds on the quantity
$$\mathbb{E}_{c_1, \dots, c_L \sim \mathcal{C}}\left[\mathrm{rad}(c_1, \dots, c_L)\right]$$
considered in the double-counting argument. Indeed, our impossibility result implies that any optimal zero-rate code must contain a large fraction of random-like $L$-tuples $(c_1, \dots, c_L)$, i.e., tuples such that for every weighting $w$,
$$\mathrm{rad}_w(c_1, \dots, c_L) \ge g_{\mathrm{unif}}(w) - o(1), \qquad (4)$$
where $g_{\mathrm{unif}}(w)$ is the expected weighted radius of a list of i.i.d. uniform symbols. To match such an impossibility result, an optimal construction should contain as many such $L$-tuples as possible. A simplex-like code then becomes a natural candidate. This is a natural extension of the construction in [1] to larger alphabets. A codebook $\mathcal{C}$ consisting of $M$ codewords, each of length $n = \binom{M}{M/q, \dots, M/q}$, is constructed by putting as columns all possible distinct length-$M$ vectors that contain identical numbers of each symbol of $[q]$. It is not hard to see by symmetry that (4) holds with equality for every $L$-tuple of distinct codewords in $\mathcal{C}$. Thus, $\mathcal{C}$ is the most regular code.
We also remark that, unlike for positive-rate codes, the prototypical random construction (with expurgation) does not lead to a favorable size-radius trade-off, since the deviation of random sampling is comparatively too large in the zero-rate regime. In contrast, the simplex code is deterministically regular and has no deviation.
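To make the construction concrete, here is a tiny instance (ours; the parameters are chosen only so the whole code fits on screen):

```python
from itertools import permutations
import numpy as np

# Simplex-like code with q = 3 and M = 6 codewords: the columns are all
# distinct length-M vectors in which every symbol appears exactly M/q times,
# so each column is a permutation of (0,0,1,1,2,2) and n = 6!/(2!2!2!) = 90.
q, M = 3, 6
base = tuple(sorted(a for a in range(q) for _ in range(M // q)))
columns = sorted(set(permutations(base)))
code = np.array(columns).T                 # rows are the M codewords
print(code.shape)                          # (6, 90)

# Regularity: all pairwise Hamming distances coincide by symmetry.
dists = {int(np.count_nonzero(code[i] != code[j]))
         for i in range(M) for j in range(i + 1, M)}
print(dists)                               # a single value: {72}
```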
1.3 Organization
The remainder is organized as follows. First, Section 2 provides the necessary notation and definitions, together with some preliminary results which will be useful in the subsequent arguments. Sections 3.1, 3.2, 3.3, and 3.4 contain our argument establishing Theorem 1 for list-decoding (i.e., the case $\ell = 1$); in Section 4 we elucidate the changes that need to be made to establish the theorem for general $\ell$. Next, Section 5 provides the code construction establishing Theorem 2.
2 Preliminaries
Firstly, for the convenience of the reader we begin by summarizing the notation that we use. This is particularly relevant as we will often be in situations where we need multiple indices for, e.g., lists of vectors where each coordinate lies in a probability simplex, so the reader is encouraged to refer to this table whenever it is unclear what is intended.
| Notation | Meaning |
|---|---|
| English letter in boldface (e.g., $\mathbf{x}$) | $[q]$-valued vector |
| Greek letter in boldface (e.g., $\boldsymbol{\alpha}$) | $\mathbb{R}^q$-valued vector |
| $\Delta_q$ | Simplex in $\mathbb{R}^q$, i.e., $\{\boldsymbol{\alpha} \in \mathbb{R}_{\ge 0}^q : \sum_{a \in [q]} \alpha_a = 1\}$ |
| $V(\Delta_q)$ | Set of vertices of $\Delta_q$ |
| $e_a = \varphi(a)$ | The image of $a \in [q]$ under $\varphi$, i.e., the $a$-th vertex of $\Delta_q$ |
| $c_j$ | The $j$-th codeword in a list |
| $\mathbf{x}_j$ | Image of $c_j$ under $\varphi$ (applied component-wise) |
| $\boldsymbol{\alpha}$ | Relaxed center of a list |
| $\mathbf{x}_{j,i}$, $\boldsymbol{\alpha}_i$ | The $i$-th block (of length $q$) in $\mathbf{x}_j$, $\boldsymbol{\alpha}$, respectively |
| $x_{j,i,a}$, $\alpha_{i,a}$ | The $a$-th element of $\mathbf{x}_{j,i}$, $\boldsymbol{\alpha}_i$, respectively |
| $\mathrm{rad}$ | (Standard) Chebyshev radius |
| $\widetilde{\mathrm{rad}}$ | Relaxed Chebyshev radius |
| $\overline{\mathrm{rad}}$ | Average radius |
| $\mathrm{rad}_w$ | Average radius weighted by $w$ |
| $g_P(w)$ | Expected average radius (weighted by $w$) of $L$ i.i.d. $P$-distributed symbols |
| $X_1, \dots, X_L \sim P$ | A list of i.i.d. $P$-distributed symbols |
| $\mathrm{unif}$ | Uniform distribution (on $[q]$, or on $[L]$ when weighting lists) |
For a finite set $S$ and an integer $0 \le t \le |S|$, we denote $\binom{S}{t} := \{T \subseteq S : |T| = t\}$. Let $[n] := \{1, 2, \dots, n\}$.
2.1 List-Decoding
Fix $n \in \mathbb{N}$ and an alphabet size $q \ge 2$. Let $d_H(x, y)$ denote the Hamming distance between $x, y \in [q]^n$, i.e., the number of coordinates on which the strings differ. For $y \in [q]^n$ and a radius $r \ge 0$, let $B(y, r) := \{x \in [q]^n : d_H(x, y) \le r\}$ denote the Hamming ball centered around $y$ of radius $r$.
Definition 3 (List-decodable code).
Let $p \in (0, 1)$ and $L \in \mathbb{N}$. A code $\mathcal{C} \subseteq [q]^n$ is $(p, L)$-list-decodable if for any $y \in [q]^n$,
$$|B(y, pn) \cap \mathcal{C}| < L.$$
In [25] the zero-rate threshold for list-decoding was derived, which is the supremum over $p$ for which $(p, L)$-list-decodable codes of positive rate exist. This value was shown to be
$$p_* = \mathbb{E}_{X_1, \dots, X_L \sim \mathrm{unif}([q])}\left[1 - \frac{\mathrm{pl}(X_1, \dots, X_L)}{L}\right], \qquad (5)$$
where the function $\mathrm{pl}(\cdot)$ outputs the number of times the most popular symbol appears. In [25] it is shown that $(p_* + \epsilon, L)$-list-decodable codes have size $O_{q, L, \epsilon}(1)$, i.e., some constant independent of $n$. Our target in this work is to show that the correct dependence on $\epsilon$ is $\Theta(\epsilon^{-3/2})$, except for the case of $q = 2$ with odd $L$.
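The threshold of Equation 5 is easy to evaluate exactly for small parameters; the following snippet (ours) does so by enumerating all $q^L$ equally likely tuples:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

def p_star(q, L):
    """Zero-rate list-decoding threshold of Equation 5: the expectation of
    1 - pl(X_1,...,X_L)/L over i.i.d. uniform symbols, where pl counts the
    multiplicity of the most popular symbol."""
    total = Fraction(0)
    for xs in product(range(q), repeat=L):
        pl = max(Counter(xs).values())
        total += 1 - Fraction(pl, L)
    return total / q**L

print(p_star(2, 2))   # 1/4: the Plotkin-type unique-decoding threshold
print(p_star(3, 3))   # threshold for q = 3, L = 3
```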
A “dual” definition of list-decodability is proffered by the Chebyshev radius.
Definition 4 (Chebyshev radius).
The Chebyshev radius of a list $\mathcal{L} = \{c_1, \dots, c_L\} \subseteq [q]^n$ of distinct vectors is defined as
$$\mathrm{rad}(\mathcal{L}) := \frac{1}{n} \min_{y \in [q]^n} \max_{j \in [L]} d_H(y, c_j).$$
Observe that a code $\mathcal{C}$ is $(p, L)$-list-decodable if and only if
$$\mathrm{rad}(\mathcal{L}) > p \quad \text{for all } \mathcal{L} \in \binom{\mathcal{C}}{L}. \qquad (6)$$
In particular, to show a code fails to be list-decodable, it suffices to upper bound the Chebyshev radius of some $L$ distinct codewords.
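For tiny parameters one can evaluate the Chebyshev radius of Definition 4 by brute force (a sketch of ours), which also makes the equivalence in Equation 6 directly checkable:

```python
from itertools import combinations, product

def chebyshev_radius(lst, q):
    """rad(L) = (1/n) min over integral centers y of max_j d_H(y, c_j)."""
    n = len(lst[0])
    best = min(
        max(sum(a != b for a, b in zip(y, c)) for c in lst)
        for y in product(range(q), repeat=n)
    )
    return best / n

def is_list_decodable(code, q, p, L):
    """(p, L)-list-decodability via Equation 6: rad > p for all L-subsets."""
    return all(chebyshev_radius(lst, q) > p
               for lst in combinations(code, L))

code = [(0, 0, 0, 0), (1, 1, 1, 0), (2, 2, 0, 2)]
print(chebyshev_radius(code, q=3))               # radius of this 3-list
print(is_list_decodable(code, q=3, p=0.25, L=2))
```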
Recall that our main target is an upper bound on the size of list-decodable/-recoverable codes (in the zero-rate regime). A natural approach is to derive from Equation 6 the desired bound on the code size; however, this quantity is quite difficult to work with directly. We therefore work with a relaxed version, which we now introduce.
We require the following definitions. Let us embed $[q]$ into the simplex via the following map $\varphi$:
$$\varphi : [q] \to \mathbb{R}^q, \quad a \mapsto e_a, \qquad (9)$$
where $e_a$ is the $q$-dimensional vector with a $1$ in the $a$-th location and $0$ everywhere else. Denote by $\Delta_q \subseteq \mathbb{R}^q$ the simplex and by $V(\Delta_q) = \{e_1, \dots, e_q\}$ its vertices. For $\boldsymbol{\alpha}, \boldsymbol{\beta} \in \Delta_q$, let
$$d(\boldsymbol{\alpha}, \boldsymbol{\beta}) := \frac{1}{2}\|\boldsymbol{\alpha} - \boldsymbol{\beta}\|_1. \qquad (10)$$
Note that if $x, y \in [q]$, then
$$d(\varphi(x), \varphi(y)) = d_H(x, y) = \mathbb{1}\{x \ne y\}.$$
From now on we will only work with $\mathbb{R}^q$-valued vectors and will still denote such length-$n$ vectors by boldface letters, abusing the notation. For $\boldsymbol{\alpha} \in (\mathbb{R}^q)^n$, we use $\alpha_{i,a}$ to denote its $(i,a)$-th element and use $\boldsymbol{\alpha}_i$ to denote its $i$-th block of size $q$. For $c \in [q]^n$, we use $c_i$ to denote its $i$-th element.
For $\boldsymbol{\alpha}, \boldsymbol{\beta} \in \Delta_q^n$, the definition of $d$ can be extended to length-$n$ vectors in the natural way. Specifically,
$$d(\boldsymbol{\alpha}, \boldsymbol{\beta}) := \sum_{i \in [n]} d(\boldsymbol{\alpha}_i, \boldsymbol{\beta}_i) = \frac{1}{2}\|\boldsymbol{\alpha} - \boldsymbol{\beta}\|_1. \qquad (11)$$
We may now define the relaxed Chebyshev radius.
Definition 5.
The relaxed Chebyshev radius of a list $\mathcal{L} = \{c_1, \dots, c_L\} \subseteq [q]^n$ of distinct vectors, with images $\mathbf{x}_j := \varphi(c_j)$, is
$$\widetilde{\mathrm{rad}}(\mathcal{L}) := \frac{1}{n} \min_{\boldsymbol{\alpha} \in \Delta_q^n} \max_{j \in [L]} d(\mathbf{x}_j, \boldsymbol{\alpha}). \qquad (12)$$
Observe that
$$\widetilde{\mathrm{rad}}(\mathcal{L}) \le \mathrm{rad}(\mathcal{L}), \qquad (13)$$
since $\varphi([q]^n) = V(\Delta_q)^n \subseteq \Delta_q^n$ (here we extend the definition of $\varphi$ to length-$n$ inputs in a similar way as in Equation 11). This justifies the “relaxation” terminology.
As a last piece of terminology, we define the radius of a code.
Definition 6.
For any code $\mathcal{C} \subseteq [q]^n$ (of size at least $L$), we define the Chebyshev radius of $\mathcal{C}$ as
$$\mathrm{rad}(\mathcal{C}) := \min_{\mathcal{L} \in \binom{\mathcal{C}}{L}} \mathrm{rad}(\mathcal{L}).$$
3 Zero-Rate List-Decoding
3.1 Linear Programming Relaxation
We have shown in Equation 13 that $\widetilde{\mathrm{rad}}$ is at most $\mathrm{rad}$. Conversely, Lemma 7 below establishes that $\widetilde{\mathrm{rad}}$ and $\mathrm{rad}$ do not differ much. That is, for any list, if a fractional center achieves a relaxed radius $r$, then there must exist an integral center $y \in [q]^n$ attaining approximately the same $r$ for sufficiently large $n$.
Lemma 7 ($\widetilde{\mathrm{rad}}$ is close to $\mathrm{rad}$).
Let $\mathcal{L} = \{c_1, \dots, c_L\} \subseteq [q]^n$. Denote by $\mathbf{x}_1, \dots, \mathbf{x}_L$ the images of $c_1, \dots, c_L$ under the embedding $\varphi$. Then
$$\widetilde{\mathrm{rad}}(\mathcal{L}) \le \mathrm{rad}(\mathcal{L}) \le \widetilde{\mathrm{rad}}(\mathcal{L}) + \frac{L}{n}.$$
Proof.
Suppose $\widetilde{\mathrm{rad}}(\mathcal{L}) = r$. Then there exists $\boldsymbol{\alpha} \in \Delta_q^n$ such that for every $j \in [L]$,
$$rn \ge d(\mathbf{x}_j, \boldsymbol{\alpha}) = \sum_{i \in [n]} \left(1 - \alpha_{i, c_{j,i}}\right),$$
where the equality is by Equations 10 and 11. That is, the following polytope is nonempty:
$$\left\{\boldsymbol{\alpha} \in \mathbb{R}^{qn} : \boldsymbol{\alpha} \ge 0, \;\; \sum_{a \in [q]} \alpha_{i,a} = 1 \;\forall i \in [n], \;\; \sum_{i \in [n]} \alpha_{i, c_{j,i}} \ge (1 - r)n \;\forall j \in [L]\right\}. \qquad (14)$$
Equivalently, the following linear program (LP) is feasible:
$$\sum_{i \in [n]} \alpha_{i, c_{j,i}} \ge (1 - r)n \;\forall j \in [L], \quad \sum_{a \in [q]} \alpha_{i,a} = 1 \;\forall i \in [n], \quad \boldsymbol{\alpha} \ge 0. \qquad (19)$$
Since the inequality $\sum_{i} \alpha_{i, c_{j,i}} \ge (1 - r)n$ is equivalent to the equality $\sum_{i} \alpha_{i, c_{j,i}} - u_j = (1 - r)n$ for a slack variable $u_j \ge 0$, the above LP can be written in equational form:
$$\sum_{i \in [n]} \alpha_{i, c_{j,i}} - u_j = (1 - r)n \;\forall j \in [L], \quad \sum_{a \in [q]} \alpha_{i,a} = 1 \;\forall i \in [n], \quad \boldsymbol{\alpha} \ge 0, \; \mathbf{u} \ge 0, \qquad (25)$$
or more compactly in matrix form:
$$\begin{pmatrix} B & -I_L \\ E & 0 \end{pmatrix} \begin{pmatrix} \boldsymbol{\alpha} \\ \mathbf{u} \end{pmatrix} = \begin{pmatrix} (1 - r)n \cdot \mathbf{1}_L \\ \mathbf{1}_n \end{pmatrix}, \quad \boldsymbol{\alpha} \ge 0, \; \mathbf{u} \ge 0. \qquad (29)$$
Here $B$ and $E$ encode respectively the first and second constraints in Equation 25, and $I_L$ and $\mathbf{1}$ denote respectively the identity matrix of size $L$ and the all-one vector of the indicated length. It is clear that the coefficient matrix has rank at most $n + L$. This implies that there exists a feasible solution that has at most $n + L$ nonzeros, and thus $\boldsymbol{\alpha}$ has at most $n + L$ nonzeros. Indeed, such solutions are known as basic feasible solutions. Note that for every block $i \in [n]$, $\sum_{a \in [q]} \alpha_{i,a} = 1$. This implies that the entries of a block cannot be simultaneously $0$. Moreover, if $q - 1$ of them are $0$, the remaining one is forced to be $1$. Since there are $n$ blocks in total, by the pigeonhole principle, there are at least $n - L$ choices of $i$ such that $\boldsymbol{\alpha}_i \in V(\Delta_q)$. Without loss of generality, we assume that these indices are $1, \dots, n - L$. Let $y \in [q]^n$ be such that $\varphi(y_i) = \boldsymbol{\alpha}_i$ for $i \le n - L$ and $y_i$ is any value in $[q]$ for $i > n - L$. Since each of the remaining $L$ coordinates changes the distance by at most $1$, the difference between $d_H(c_j, y)$ and $d(\mathbf{x}_j, \boldsymbol{\alpha}) \le rn$ is at most $L$ for every $j$. The proof is completed.
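The LP in the proof is small enough to run directly; the following sketch (ours, using scipy's linprog) computes the relaxed radius and then performs the vertex rounding described above:

```python
import numpy as np
from scipy.optimize import linprog

def relaxed_radius(codewords, q):
    """Minimize r s.t. each block of alpha lies in the simplex and
    d(x_j, alpha) = sum_i (1 - alpha[i, c_j[i]]) <= r*n for all j;
    then round each block of alpha to its nearest vertex."""
    C = np.asarray(codewords)                  # shape (L, n)
    L, n = C.shape
    nv = q * n + 1                             # variables: alpha and r
    A_eq = np.zeros((n, nv)); b_eq = np.ones(n)
    for i in range(n):                         # block i sums to 1
        A_eq[i, i * q:(i + 1) * q] = 1.0
    A_ub = np.zeros((L, nv)); b_ub = np.full(L, -float(n))
    for j in range(L):                         # -sum_i alpha[i,c_ji] - n*r <= -n
        for i in range(n):
            A_ub[j, i * q + C[j, i]] = -1.0
        A_ub[j, -1] = -float(n)
    cost = np.zeros(nv); cost[-1] = 1.0        # minimize r
    res = linprog(cost, A_ub, b_ub, A_eq, b_eq,
                  bounds=[(0, None)] * (q * n) + [(0, 1)])
    alpha = res.x[:-1].reshape(n, q)
    y = alpha.argmax(axis=1)                   # vertex rounding
    rounded = max(int((y != C[j]).sum()) for j in range(L)) / n
    return res.x[-1], rounded                  # relaxed vs rounded radius

words = [[0, 1, 2, 0, 1], [0, 2, 2, 1, 1], [1, 1, 0, 0, 2]]
print(relaxed_radius(words, q=3))
```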
We further relax $\widetilde{\mathrm{rad}}$ by defining the weighted average radius. For $w \in \Delta_L$ and a list $\mathcal{L} = \{c_1, \dots, c_L\}$ with images $\mathbf{x}_j = \varphi(c_j)$, let
$$\mathrm{rad}_w(\mathcal{L}) := \frac{1}{n} \min_{\boldsymbol{\alpha} \in \Delta_q^n} \sum_{j \in [L]} w_j \, d(\mathbf{x}_j, \boldsymbol{\alpha}).$$
In words, the weighted average radius is obtained by replacing the maximization over $j \in [L]$ in the definition of the relaxed radius (see Equation 12) with an average with respect to a distribution $w$ on $[L]$.
Since the objective of the minimization is separable, one can minimize over each block $\boldsymbol{\alpha}_i$ individually and obtain an alternative expression. Suppose $\mathbf{x}_1, \dots, \mathbf{x}_L$ are the images of $c_1, \dots, c_L$ under the embedding $\varphi$. Then
$$\mathrm{rad}_w(\mathcal{L}) = \frac{1}{n} \sum_{i \in [n]} \min_{\boldsymbol{\alpha}_i \in \Delta_q} \sum_{j \in [L]} w_j \, d(\mathbf{x}_{j,i}, \boldsymbol{\alpha}_i) \qquad (30)$$
$$= \frac{1}{n} \sum_{i \in [n]} \left(1 - \max_{a \in [q]} \sum_{j : c_{j,i} = a} w_j\right). \qquad (31)$$
Equation 30 holds since the objective in brackets only depends on $\boldsymbol{\alpha}_i$, and not on the other blocks. To see Equation 31, we note that a maximizer to the problem $\max_{\boldsymbol{\beta} \in \Delta_q} \sum_{a \in [q]} \beta_a W_a$, where $W_1, \dots, W_q \ge 0$ are fixed, is given by $\boldsymbol{\beta} = e_{a^*}$ where $a^*$ satisfies
$$W_{a^*} = \max_{a \in [q]} W_a.$$
Obviously, by definition, for any $w \in \Delta_L$ and any list $\mathcal{L}$, $\mathrm{rad}_w(\mathcal{L}) \le \widetilde{\mathrm{rad}}(\mathcal{L})$. In fact, the following lemma shows that $\widetilde{\mathrm{rad}}$ is equal to the maximum of $\mathrm{rad}_w$ over $w$.
Lemma 8 ($\widetilde{\mathrm{rad}}$ equals maximum $\mathrm{rad}_w$).
For any list $\mathcal{L}$,
$$\widetilde{\mathrm{rad}}(\mathcal{L}) = \max_{w \in \Delta_L} \mathrm{rad}_w(\mathcal{L}).$$
Proof.
Note that
$$\widetilde{\mathrm{rad}}(\mathcal{L}) = \frac{1}{n} \min_{\boldsymbol{\alpha} \in \Delta_q^n} \max_{j \in [L]} d(\mathbf{x}_j, \boldsymbol{\alpha}) = \frac{1}{n} \min_{\boldsymbol{\alpha} \in \Delta_q^n} \max_{w \in \Delta_L} \sum_{j \in [L]} w_j \, d(\mathbf{x}_j, \boldsymbol{\alpha}),$$
since the inner maximum is anyway achieved by a singleton distribution. Note also that the objective function
$$(\boldsymbol{\alpha}, w) \mapsto \sum_{j \in [L]} w_j \, d(\mathbf{x}_j, \boldsymbol{\alpha})$$
is affine in $\boldsymbol{\alpha}$ and linear in $w$. Therefore, von Neumann's minimax theorem allows us to interchange $\min_{\boldsymbol{\alpha}}$ and $\max_w$ and obtain
$$\widetilde{\mathrm{rad}}(\mathcal{L}) = \max_{w \in \Delta_L} \frac{1}{n} \min_{\boldsymbol{\alpha} \in \Delta_q^n} \sum_{j \in [L]} w_j \, d(\mathbf{x}_j, \boldsymbol{\alpha}) = \max_{w \in \Delta_L} \mathrm{rad}_w(\mathcal{L}),$$
as claimed by the lemma.
In fact, we can say something stronger: it is not necessary to maximize over the entire (uncountable) probability simplex $\Delta_L$. Instead, we can extract a finite subset $W \subseteq \Delta_L$ and maximize over this set to recover $\widetilde{\mathrm{rad}}$. The following lemma is analogous to [1, Lemma 6].
Lemma 9 ($\widetilde{\mathrm{rad}}$ is achieved by finitely many $w$).
For every $L \in \mathbb{N}$, there exists a finite set of probability measures $W \subseteq \Delta_L$ such that
$$\widetilde{\mathrm{rad}}(\mathcal{L}) = \max_{w \in W} \mathrm{rad}_w(\mathcal{L})$$
for all lists $\mathcal{L}$.
Proof.
The idea is to view the computation of $\widetilde{\mathrm{rad}}(\mathcal{L})$ as finding the maximum among some finite set of linear program maxima over some convex polytopes, and then to take $W$ to be the set of vertices of the defined convex polytopes.
First, we define the convex polytopes based on a ($q$-ary version of a) signature. For each $w \in \Delta_L$, we define a signature for $w$, which is a function $\sigma$ assigning to each pair of subsets $S, T \subseteq [L]$ the order relation between $w(S) := \sum_{j \in S} w_j$ and $w(T)$. Define further the halfspaces $H_{S,T} := \{w \in \mathbb{R}^L : w(S) \ge w(T)\}$ for $S, T \subseteq [L]$. Observe that if $w \in H_{S,T}$, where $\sigma$ is a signature for $w$, then $\sigma$ declares $S$ to dominate $T$. Thus, by ranging over the choices for $S$ and $T$ we obtain finitely many halfspaces that partition the $L$-dimensional space into finitely many regions.
For each possible signature $\sigma$, let $\Delta_\sigma := \{w \in \Delta_L : \sigma \text{ is a signature for } w\}$, and note that $\Delta_\sigma$ is a convex polytope. Indeed, it is an intersection over pairs $(S, T)$ of the convex polytopes $\Delta_L \cap H_{S,T}$, where $\sigma$ declares $S$ to dominate $T$. Now, to maximize $\mathrm{rad}_w(\mathcal{L})$ over $w \in \Delta_\sigma$, consider for each coordinate $i \in [n]$ the sets $S_i(a) := \{j \in [L] : c_{j,i} = a\}$ for $a \in [q]$, and let $S_i^*$ denote the one that dominates according to $\sigma$. We claim it suffices to find the maximum of the following linear function:
$$w \mapsto \frac{1}{n} \sum_{i \in [n]} \left(1 - w(S_i^*)\right) \qquad (32)$$
over all $w \in \Delta_\sigma$. Indeed, by Equation 31, we have
$$\mathrm{rad}_w(\mathcal{L}) = \frac{1}{n} \sum_{i \in [n]} \left(1 - \max_{a \in [q]} w(S_i(a))\right).$$
This implies that within $\Delta_\sigma$ the maximizing index set in each coordinate only depends on the signature, and furthermore that the value of $\mathrm{rad}_w(\mathcal{L})$ is given by the linear function in Equation 32. A linear function over a polytope attains its maximum at a vertex.
We can thus take $W$ to be the union of all vertex sets of the polytopes $\Delta_\sigma$ over all signatures $\sigma$. Since there are finitely many signatures arising from the regions defined by all the halfspaces, and each polytope has finitely many vertices, we obtain a finite set $W$, as desired.
3.2 Properties of $g_P(w)$
Now, we consider the expected weighted average radius of a sequence of i.i.d. symbols. Specifically, for $P \in \Delta_q$ (viewed as a distribution on $[q]$) and $w \in \Delta_L$, let
$$g_P(w) := \mathbb{E}_{X_1, \dots, X_L \sim P}\left[1 - \max_{a \in [q]} \sum_{j : X_j = a} w_j\right].$$
[1] studies $g_P(w)$ for $q = 2$ and even $L$. In this case, one can take advantage of the fact that $P$ may be parametrized by a single real number, and thereby yield a fairly simple expression for $g_P(w)$.
Nonetheless, in this subsection, we will show that all properties of $g_P(w)$ in [1] holding for $q = 2$ and even $L$ can be generalized to any $q$ and any $L$. Let us first provide a more explicit expression for $g_P(w)$ using Equation 31:
$$g_P(w) = 1 - \sum_{(x_1, \dots, x_L) \in [q]^L} \left(\prod_{j \in [L]} P(x_j)\right) \max_{a \in [q]} \sum_{j : x_j = a} w_j.$$
We define the shorthand notation
$$\mathrm{pl}_w(x_1, \dots, x_L) := \max_{a \in [q]} \sum_{j : x_j = a} w_j \qquad (33)$$
for any $w \in \Delta_L$ and $(x_1, \dots, x_L) \in [q]^L$.
The first property that we would like to establish is that $g_P(w)$ only increases if $P$ is replaced by $\mathrm{unif}$, and furthermore that the maximum is uniquely obtained at $P = \mathrm{unif}$ if $w_j > 0$ for all $j$. In order to do this, we will regularly “average-out” coordinates of $P$ and then show that the function value increases (or at least, does not decrease). To introduce some terminology, for $\tau, \tau' \in \Delta_q$ we say that $\tau'$ is obtained from $\tau$ by averaging-out the subset $S \subseteq [q]$ of coordinates if $\tau'$ is defined as
$$\tau'_a = \begin{cases} \frac{1}{|S|} \sum_{b \in S} \tau_b & \text{if } a \in S, \\ \tau_a & \text{if } a \notin S. \end{cases}$$
The following lemma gives a simple criterion for establishing that, if $\tau'$ is obtained from $\tau$ by averaging two coordinates, then $g_{\tau'}(w) \ge g_\tau(w)$, and it furthermore gives a criterion for the inequality to be strict. The main thrust of the proof of Lemma 11 is thus to show that this criterion is always satisfied.
Lemma 10.
Let $w \in \Delta_L$ and $\tau \in \Delta_q$, and suppose $\tau'$ is obtained by averaging-out the last two coordinates of $\tau$. For $x \in [q]^L$, $j \ne j' \in [L]$ and $a, b \in [q]$, write $x^{j \to a, j' \to b}$ for the tuple obtained from $x$ by setting its $j$-th entry to $a$ and its $j'$-th entry to $b$. Suppose that for all $j \ne j'$ and all $x \in [q]^L$ we have
$$\mathrm{pl}_w\big(x^{j \to q-1, j' \to q-1}\big) + \mathrm{pl}_w\big(x^{j \to q, j' \to q}\big) \ge \mathrm{pl}_w\big(x^{j \to q-1, j' \to q}\big) + \mathrm{pl}_w\big(x^{j \to q, j' \to q-1}\big). \qquad (34)$$
Then $g_{\tau'}(w) \ge g_\tau(w)$.
Furthermore, suppose that additionally there exist such $j \ne j'$ and $x$ with $\tau(x_k) > 0$ for all $k \notin \{j, j'\}$ such that the inequality in Equation 34 is strict. Then $g_{\tau'}(w) > g_\tau(w)$.
Proof.
Define $\tilde{\tau} \in \Delta_q$ as
$$\tilde{\tau} := (\tau_1, \dots, \tau_{q-2}, \tau_q, \tau_{q-1}).$$
That is, $\tilde{\tau}$ is obtained by swapping the last two components of $\tau$. By symmetry (relabeling the symbols $q-1$ and $q$ leaves $\mathrm{pl}_w$ invariant), we have $g_{\tilde{\tau}}(w) = g_\tau(w)$. Moreover, $\tau'$ is the midpoint of the segment from $\tau$ to $\tilde{\tau}$, and Equation 34 guarantees that $\mathbb{E}[\mathrm{pl}_w]$ is convex along this segment: its second derivative is a nonnegative combination, with coefficients $\prod_{k \notin \{j, j'\}} \tau_\lambda(x_k) \ge 0$ at each interior point $\tau_\lambda$ of the segment, of the left-hand side minus the right-hand side of Equation 34. Hence
$$g_{\tau'}(w) \ge \frac{1}{2}\left(g_\tau(w) + g_{\tilde{\tau}}(w)\right) = g_\tau(w),$$
where the inequality follows from Equation 34. From the above sequence of inequalities, it is also clear that if additionally there exists a triple $(j, j', x)$ with positive coefficient for which the inequality in Equation 34 is strict, then $g_{\tau'}(w) > g_\tau(w)$.
We now establish that the function value cannot decrease if $\tau$ is replaced by the uniform distribution.
Lemma 11.
For any $w \in \Delta_L$ and $\tau \in \Delta_q$, $g_\tau(w) \le g_{\mathrm{unif}}(w)$.
Proof.
Fix any $w \in \Delta_L$. Let $\tau$ be non-uniform. Without loss of generality, assume $\tau_{q-1} \ne \tau_q$. Let $\tau'$ be obtained by uniformizing the last two components of $\tau$, i.e.,
$$\tau' := \left(\tau_1, \dots, \tau_{q-2}, \frac{\tau_{q-1} + \tau_q}{2}, \frac{\tau_{q-1} + \tau_q}{2}\right).$$
We claim $g_{\tau'}(w) \ge g_\tau(w)$. By Lemma 10, we just need to establish Equation 34. Fix $j \ne j'$ and $x \in [q]^L$. Let $S = \{k \notin \{j, j'\} : x_k = q - 1\}$ and $T = \{k \notin \{j, j'\} : x_k = q\}$. Let
$$A := w(S), \quad B := w(T), \quad M := \max_{a \in [q-2]} w\big(\{k \notin \{j, j'\} : x_k = a\}\big).$$
Then, we have that the four quantities appearing in Equation 34 are $f(w_j + w_{j'})$, $f(0)$, $f(w_j)$ and $f(w_{j'})$, respectively, where
$$f(u) := \max\left\{A + u, \; B + (w_j + w_{j'}) - u, \; M\right\}.$$
We first consider the case that $M \ge \max\{A, B\} + w_j + w_{j'}$. Here all four quantities equal $M$, where the equality follows since the maximizing set does not contain $j$ or $j'$; Equation 34 therefore holds with equality. We proceed to the case that $M < \max\{A, B\} + w_j + w_{j'}$. As a maximum of an increasing affine function, a decreasing affine function and a constant, $f$ is convex. Since $w_j, w_{j'} \in [0, w_j + w_{j'}]$ and $w_j + w_{j'} = (w_j + w_{j'}) + 0$, convexity yields
$$f(w_j) + f(w_{j'}) \le f(0) + f(w_j + w_{j'}),$$
which is exactly Equation 34. Thus, Lemma 10 implies $g_{\tau'}(w) \ge g_\tau(w)$, as desired.
We can then continue averaging components of $\tau$ and in this way obtain a sequence of distributions $\tau = \tau^{(0)}, \tau^{(1)}, \tau^{(2)}, \dots$ with $g_{\tau^{(0)}}(w) \le g_{\tau^{(1)}}(w) \le \cdots$. This sequence converges in $\ell_1$-norm to the uniform distribution and satisfies $g_{\tau^{(k)}}(w) \ge g_\tau(w)$ for all $k$. Observing that $\tau \mapsto g_\tau(w)$ is a continuous function – the term $\mathrm{pl}_w$ is a maximum over linear functions, hence continuous, implying that $g_\tau(w)$ is a linear combination of continuous functions – it follows that $g_{\tau^{(k)}}(w) \to g_{\mathrm{unif}}(w)$, and in particular that $g_\tau(w) \le g_{\mathrm{unif}}(w)$, as desired.
We now strengthen the conclusion of Lemma 11 by showing that for all $q$ and $w$ the function $\tau \mapsto g_\tau(w)$ is uniquely maximized by the setting $\tau = \mathrm{unif}$, except for degenerate cases where $w_j = 0$ for some $j$.
Before stating and proving this fact, we note that the proof of Lemma 11 in fact shows that we can average out any subset of coordinates of $\tau$ and only increase the value of $g$. We formalize this fact in the following lemma, which will be useful in the following arguments.
Lemma 12.
Let $w \in \Delta_L$, $\tau \in \Delta_q$ and $S \subseteq [q]$. Let $\tau'$ be obtained from $\tau$ by averaging-out the subset $S$ of coordinates. Then $g_{\tau'}(w) \ge g_\tau(w)$.
Theorem 13.
Let $q \ge 3$, and let $w \in \Delta_L$ be such that $w_j > 0$ (Footnote: In fact, we only apply our proof with $w = \mathrm{unif}$, which clearly satisfies the condition.) for all $j \in [L]$. Then for all $\tau \in \Delta_q$, $g_\tau(w) \le g_{\mathrm{unif}}(w)$, with equality if and only if $\tau = \mathrm{unif}$.
Proof.
The inequality was already established in Lemma 11, so we focus on showing that equality forces $\tau = \mathrm{unif}$. As $\tau \ne \mathrm{unif}$, let $\tau_{\min}$ and $\tau_{\max}$ denote the minimum and maximum components of $\tau$, so that $\tau_{\min} < \tau_{\max}$. Suppose for a contradiction that $\tau$ is a maximum of the function $\tau \mapsto g_\tau(w)$. The proof proceeds via a number of cases.
-
1.
$q$ is even. Without loss of generality, $\tau_1 \ge \tau_2 \ge \cdots \ge \tau_q$. If the $q/2$ largest components are not all equal, let $\tau'$ be obtained from $\tau$ by averaging-out the first $q/2$ coordinates; by Lemma 12, $g_{\tau'}(w) \ge g_\tau(w)$. If they are all equal, set $\tau' = \tau$. In either case $\tau' \ne \mathrm{unif}$, so after reordering we may assume that the last two components of $\tau'$ are unequal. We observe that for this $\tau'$ and the distribution $\tau''$ obtained from $\tau'$ by averaging-out the last two coordinates, Equation 34 strictly holds for a positive-probability tuple $x$ in which the two averaged symbols compete for the plurality; here we use that $w_j > 0$ for all $j$. Thus Lemma 10 implies $g_{\tau''}(w) > g_{\tau'}(w) \ge g_\tau(w)$, a contradiction.
-
2.
$q$ is odd and at least three components of $\tau$ take distinct values, or the components of $\tau$ only take two different values and at least two of them take the minimum value. Without loss of generality, $\tau_1 \ge \tau_2 \ge \cdots \ge \tau_q$. If the $(q-1)/2$ largest components are not all equal, let $\tau'$ be obtained from $\tau$ by averaging-out the first $(q-1)/2$ coordinates; by Lemma 12, $g_{\tau'}(w) \ge g_\tau(w)$. If they are all equal, set $\tau' = \tau$. Since $\tau \ne \mathrm{unif}$, the case assumption guarantees that $\tau'$ still has two unequal components which, exactly as in the previous case, admit a positive-probability tuple $x$ for which Equation 34 strictly holds. Thus Lemma 10 implies $g_{\tau''}(w) > g_{\tau'}(w)$ for the distribution $\tau''$ obtained by averaging-out those two coordinates, a contradiction.
-
3.
$q$ is odd and only one component takes the minimum value. That is, $\tau_1 \ge \cdots \ge \tau_{q-1} > \tau_q$. Let $\tau'$ be obtained from $\tau$ by averaging-out the subset $\{q-1, q\}$. Then $g_{\tau'}(w) \ge g_\tau(w)$ by Lemma 12, and moreover $\tau'$ is such that at least two coordinates take on the minimum value, as $\frac{\tau_{q-1} + \tau_q}{2} < \tau_{q-1} \le \tau_a$ for all $a \le q - 2$. The argument from the previous case can now be applied to derive a contradiction.
Thus, except for degenerate choices for $w$, it follows that the function $\tau \mapsto g_\tau(w)$ is maximized by the choice of $\tau = \mathrm{unif}$. The next step is to determine the distribution $w$ maximizing $g_{\mathrm{unif}}(w)$. At this point, we can rely on a main result of [25], as the function $w \mapsto \mathbb{E}[\mathrm{pl}_w]$ (under uniform symbols) is the same as the function defined in [25, Equation (17)]. It is shown therein that this function is strictly Schur convex, which in particular means that it has a unique minimum at $w = \mathrm{unif}$. That is, $g_{\mathrm{unif}}(w)$ has a unique maximum at $w = \mathrm{unif}$.
The (strict) Schur convexity also implies the following: if $\|w - \mathrm{unif}\|_1 \ge \delta$, then
$$g_{\mathrm{unif}}(w) \le g_{\mathrm{unif}}(\mathrm{unif}) - \Delta(\delta) \qquad (35)$$
for a gap function $\Delta(\delta) > 0$. That is, we can conclude that weightings far from uniform incur a quantitative loss in $g$. We encapsulate these facts in the following proposition.
Proposition 14 (Theorems 1, 2 of [25]).
Let $q, L \in \mathbb{N}$ and $w \in \Delta_L$. Suppose $w \ne \mathrm{unif}$. Then $g_{\mathrm{unif}}(w) < g_{\mathrm{unif}}(\mathrm{unif})$. Furthermore, $g_{\mathrm{unif}}(w)$ is monotone decreasing as $w$ moves away from $\mathrm{unif}$ in the majorization order. Lastly, the underlying function is concave in the relevant range, which yields the quantitative gap $\Delta(\delta)$ of Equation 35.
A further fact that we have from [25] is that
$$p_* = g_{\mathrm{unif}}(\mathrm{unif}).$$
In fact, this was taken as the definition of $p_*$. To end this subsection, we prove the following theorem by utilizing the concavity of $g$.
Theorem 15.
Assume $\mathcal{C} \subseteq [q]^n$; then we have
$$\mathbb{E}_{c_1, \dots, c_L \sim \mathcal{C}}\left[\mathrm{rad}_{\mathrm{unif}}(c_1, \dots, c_L)\right] \le g_{\mathrm{unif}}(\mathrm{unif}) = p_*,$$
where the $c_j$ are drawn independently and uniformly from $\mathcal{C}$.
Proof.
For $i \in [n]$, let $P_i$ be the distribution of the symbols in the $i$-th index of a uniformly random codeword, i.e., $P_i(a) = \Pr_{c \sim \mathcal{C}}[c_i = a]$, with the distribution taken over a uniform choice of $c \in \mathcal{C}$. Let $\bar{P} := \frac{1}{n} \sum_{i \in [n]} P_i$. Since the codewords $c_1, \dots, c_L$ are drawn independently, the $i$-th coordinates of the list are i.i.d. $P_i$-distributed. By Equation 31, we then have
$$\mathbb{E}\left[\mathrm{rad}_{\mathrm{unif}}(c_1, \dots, c_L)\right] = \frac{1}{n} \sum_{i \in [n]} g_{P_i}(\mathrm{unif}) \le \frac{1}{n} \sum_{i \in [n]} g_{\mathrm{unif}}(\mathrm{unif}) = p_*.$$
The inequality is due to Lemma 11; alternatively, one may first pass to the average $\bar{P}$ via the concavity from Proposition 14 and then apply Lemma 11. The proof is completed.
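A quick exact check of Theorem 15 on an arbitrary small code (ours; everything is enumerated with exact rational arithmetic):

```python
from itertools import product
from collections import Counter
from fractions import Fraction

def rad_unif(tup):
    """Uniformly weighted average radius of a tuple, via Equation 31."""
    L, n = len(tup), len(tup[0])
    return sum(1 - Fraction(max(Counter(col).values()), L)
               for col in zip(*tup)) / n

def p_star(q, L):
    return sum(1 - Fraction(max(Counter(xs).values()), L)
               for xs in product(range(q), repeat=L)) / Fraction(q ** L)

q, L = 3, 2
code = [(0, 1, 2, 0), (1, 1, 0, 2), (2, 2, 2, 0), (0, 0, 1, 1)]
avg = (sum(rad_unif(t) for t in product(code, repeat=L))
       / Fraction(len(code) ** L))
print(avg, p_star(q, L), avg <= p_star(q, L))   # inequality of Theorem 15
```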
3.3 Abundance of Random-Like $L$-tuples
Recall the notations
$$g_P(w) = \mathbb{E}_{X_1, \dots, X_L \sim P}\left[1 - \mathrm{pl}_w(X_1, \dots, X_L)\right]$$
and
$$p_* = g_{\mathrm{unif}}(\mathrm{unif}),$$
where $P \in \Delta_q$ and $w \in \Delta_L$. In this subsection, we prove that a code either contains a large subcode with radius noticeably exceeding $p_*$, or most of its $L$-tuples are of type close to the uniform distribution (in every coordinate).
We first show that for any projection onto a set $S \subseteq [n]$ with $|S| \ge (1 - \gamma)n$ (for some parameter $\gamma$), the projection almost preserves the radius, with small loss. Then, assuming no large subcode has large radius, in any subcode of large size we find a codeword whose symbol distribution is close to uniform; in fact, most codewords in the subcode satisfy this requirement. Let $S_a$ be the index set on which such a codeword takes value $a$. We apply the projection onto $S_a$ and claim that it preserves the radius. Thus, we can find another codeword such that for every $a \in [q]$, its symbol distribution on $S_a$ is close to uniform. Moreover, most of the codewords satisfy this requirement. The proof is then completed by induction.
Lemma 16.
Let $\pi_S$ be the projection onto a set $S \subseteq [n]$ of size $|S| \ge (1 - \gamma)n$. Suppose $\mathcal{C} \subseteq [q]^n$ is a code of size $M$ satisfying $\mathrm{rad}(\mathcal{C}) \ge p$. Then, there exists a subcode $\mathcal{C}' \subseteq \mathcal{C}$ of size at least $M/2$ such that $\mathrm{rad}(\pi_S(\mathcal{C}')) \ge p - O(\gamma)$.
Proof.
The proof is omitted due to the limited space.
Theorem 17.
Let $q, L \in \mathbb{N}$ be fixed. For every $\delta > 0$, there exist constants $\gamma > 0$, $c > 0$ and $M_0 \in \mathbb{N}$ with the following property: for any code $\mathcal{C} \subseteq [q]^n$ with size $|\mathcal{C}| \ge M_0$, at least one of the following must hold:
-
1.
There exists a subcode $\mathcal{C}' \subseteq \mathcal{C}$ such that $|\mathcal{C}'| \ge c\,|\mathcal{C}|$ and $\mathrm{rad}(\mathcal{C}') \ge p_* + \gamma$.
-
2.
There exist at least $(1 - \delta)\binom{|\mathcal{C}|}{L}$ many $L$-tuples of distinct codewords in $\mathcal{C}$ such that for all $w \in \Delta_L$ we have
$$\left|\mathrm{rad}_w(c_1, \dots, c_L) - g_{\mathrm{unif}}(w)\right| \le \delta,$$
and thus
$$\widetilde{\mathrm{rad}}(c_1, \dots, c_L) \le p_* + \delta.$$
Proof.
The proof is omitted due to the limited space.
3.4 Putting Everything Together
The argument follows the same line of reasoning as [1]. Define $\widetilde{\mathrm{rad}}(\mathcal{C}) := \min \widetilde{\mathrm{rad}}(c_1, \dots, c_L)$ with the minimum taken over all $L$-tuples of distinct elements of $\mathcal{C}$, where we recall that $\widetilde{\mathrm{rad}}$ denotes the relaxed Chebyshev radius (Definition 5).
Theorem 18.
Let $q \ge 3$ and $\epsilon > 0$. If $\mathcal{C} \subseteq [q]^n$ is $(p_* + \epsilon, L)$-list-decodable, then $|\mathcal{C}| = O(\epsilon^{-3/2})$.
Proof.
The proof is omitted due to the limited space.
4 Zero-Rate List-Recovery
In this section, we show how our results on list-decoding can naturally be extended to list-recovery. Due to the page limit, we refer the reader to the full version. The result is summarized in Theorem 1.
5 Code Construction
In this section, we present a simple simplex-like code construction and show that it attains the optimal size-radius trade-off by analyzing its list-decoding and -recovery radius.
Our construction will be identical for list-decoding and -recovery and therefore we will directly analyze its list-recovery radius. Before presenting the construction and its analysis, let us define the average radius $\overline{\mathrm{rad}}$ for list-recovery. This is a standard notion that “linearizes” the Chebyshev radius and often finds its use in the analysis of list-recoverable codes in the literature. The definition reads as follows: for any $(c_1, \dots, c_L) \in ([q]^n)^L$,
$$\overline{\mathrm{rad}}(c_1, \dots, c_L) := \frac{1}{nL} \min_{\substack{(Y_1, \dots, Y_n) \\ Y_i \in \binom{[q]}{\ell}}} \sum_{j \in [L]} \left|\{i \in [n] : c_{j,i} \notin Y_i\}\right|.$$
It is well-known and easy to verify (by, e.g., following the derivations leading to Equation 31) that the above minimization admits the following explicit solution:
$$\overline{\mathrm{rad}}(c_1, \dots, c_L) = \frac{1}{nL} \sum_{i \in [n]} \left(L - \max_{Y \in \binom{[q]}{\ell}} \left|\{j \in [L] : c_{j,i} \in Y\}\right|\right). \qquad (36)$$
Equation 36 should be interpreted as the average distance from each $c_j$ to the “centroid” of the list, defined as (Footnote: If there are multiple maximizers, take an arbitrary one; the value of $\overline{\mathrm{rad}}$ remains the same.)
$$Y_i^* := \operatorname{argmax}_{Y \in \binom{[q]}{\ell}} \left|\{j \in [L] : c_{j,i} \in Y\}\right|$$
for each $i \in [n]$.
Finally, for integers $M$ and $q$, denote by
$$\Lambda_{M,q} := \left\{\lambda \in \mathbb{Z}_{\ge 0}^q : \sum_{a \in [q]} \lambda_a = M\right\}$$
the set of $q$-partitions of $M$. For $v \in [q]^M$, we shorthand $\lambda(v) := (\lambda_1(v), \dots, \lambda_q(v))$, where $\lambda_a(v) := |\{m \in [M] : v_m = a\}|$ is the number of indices taking value $a$. Define $\Sigma_\ell(\lambda)$ as the sum of the $\ell$ largest components in $\lambda$.
Theorem 19 (Construction of zero-rate list-recoverable codes).
Fix any integers $q \ge 3$, $\ell < q$ and $L$. For any sufficiently large $M$ divisible by $q$, there exists a $(p, \ell, L)$-list-recoverable code $\mathcal{C} \subseteq [q]^n$ with $|\mathcal{C}| = M$, blocklength
$$n = \binom{M}{M/q, \dots, M/q}, \qquad (37)$$
and the trade-off between code size and (relative) radius given by:
$$p = \mathbb{E}_V\left[1 - \frac{\Sigma_\ell(\lambda(V))}{L}\right], \qquad (38)$$
where $V \in [q]^L$ denotes the restriction to (any) fixed $L$ coordinates of a uniformly random balanced vector $v \in [q]^M$ with $\lambda(v) = (M/q, \dots, M/q)$.
Proof.
The proof is omitted due to the page limit.
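Although the proof is deferred, the expectation in Equation 38 can be evaluated exactly for a tiny instance; the following sketch (ours) enumerates all balanced columns:

```python
from itertools import permutations
from collections import Counter
from fractions import Fraction

def simplex_code_radius(q, M, ell, L):
    """E[1 - Sigma_ell(lambda(V))/L], where V is the restriction of a
    uniformly random balanced column in [q]^M to the first L coordinates
    (by symmetry, any L coordinates give the same value)."""
    base = tuple(sorted(a for a in range(q) for _ in range(M // q)))
    cols = sorted(set(permutations(base)))
    total = Fraction(0)
    for col in cols:
        counts = sorted(Counter(col[:L]).values(), reverse=True)
        total += 1 - Fraction(sum(counts[:ell]), L)
    return total / len(cols)

print(simplex_code_radius(q=3, M=6, ell=1, L=2))  # 2/5, i.e. p_* + Theta(1/M)
print(simplex_code_radius(q=3, M=6, ell=2, L=3))
```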
References
- [1] Noga Alon, Boris Bukh, and Yury Polyanskiy. List-decodable zero-rate codes. IEEE Transactions on Information Theory, 65(3):1657–1667, 2018. doi:10.1109/TIT.2018.2868957.
- [2] László Babai, Lance Fortnow, Noam Nisan, and Avi Wigderson. BPP has subexponential time simulations unless EXPTIME has publishable proofs. Comput. Complex., 3:307–318, 1993. doi:10.1007/BF01275486.
- [3] L. A. Bassalygo. New upper bounds for error-correcting codes. Probl. of Info. Transm., 1:32–35, 1965.
- [4] Vladimir M Blinovsky. Bounds for codes in the case of list decoding of finite volume. Problems of Information Transmission, 22:7–19, 1986.
- [5] Vladimir M Blinovsky. Code bounds for multiple packings over a nonbinary finite alphabet. Problems of Information Transmission, 41:23–32, 2005. doi:10.1007/S11122-005-0007-5.
- [6] Vladimir M Blinovsky. On the convexity of one coding-theory function. Problems of Information Transmission, 44:34–39, 2008. doi:10.1134/S0032946008010031.
- [7] Philippe Delsarte. An algebraic approach to the association schemes of coding theory. Philips Res. Rep. Suppl., 10:vi+–97, 1973.
- [8] Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman. Nearly optimal pseudorandomness from hardness. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1057–1068. IEEE, 2020. doi:10.1109/FOCS46700.2020.00102.
- [9] Dean Doron and Mary Wootters. High-probability list-recovery, and applications to heavy hitters. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.ICALP.2022.55.
- [10] Peter Elias. List decoding for noisy channels. Wescon Convention Record, Part 2, pages 94–104, 1957.
- [11] Peter Elias. Error-correcting codes for list decoding. IEEE Transactions on Information Theory, 37(1):5–12, 1991. doi:10.1109/18.61123.
- [12] Edgar N Gilbert. A comparison of signalling alphabets. The Bell System Technical Journal, 31(3):504–522, 1952.
- [13] Oded Goldreich and Leonid A Levin. A hard-core predicate for all one-way functions. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing (STOC), pages 25–32. ACM, 1989. doi:10.1145/73007.73010.
- [14] Venkatesan Guruswami, Christopher Umans, and Salil Vadhan. Unbalanced expanders and randomness extractors from Parvaresh–Vardy codes. Journal of the ACM (JACM), 56(4):1–34, 2009. doi:10.1145/1538902.1538904.
- [15] Iftach Haitner, Yuval Ishai, Eran Omri, and Ronen Shaltiel. Parallel hashing via list recoverability. In Annual Cryptology Conference, pages 173–190. Springer, 2015. doi:10.1007/978-3-662-48000-7_9.
- [16] Justin Holmgren, Alex Lombardi, and Ron D Rothblum. Fiat–Shamir via list-recoverable codes (or: parallel repetition of GMW is not zero-knowledge). In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 750–760, 2021. doi:10.1145/3406325.3451116.
- [17] Piotr Indyk, Hung Q Ngo, and Atri Rudra. Efficiently decodable non-adaptive group testing. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 1126–1142. SIAM, 2010. doi:10.1137/1.9781611973075.91.
- [18] Jeffrey C Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. Journal of Computer and System Sciences, 55(3):414–440, 1997. doi:10.1006/JCSS.1997.1533.
- [19] Eyal Kushilevitz and Yishay Mansour. Learning decision trees using the Fourier spectrum. SIAM Journal on Computing, 22(6):1331–1348, 1993. doi:10.1137/0222080.
- [20] V. I. Levenshtein. The application of Hadamard matrices to a problem in coding. Problems of Cybernetics, 5:123–136, 1961.
- [21] Richard J Lipton. Efficient checking of computations. In Proceedings of the 7th Annual Symposium on Theoretical Aspects of Computer Science (STACS), pages 207–215. Springer, 1990. doi:10.1007/3-540-52282-4_44.
- [22] Robert J. McEliece, Eugene R. Rodemich, Howard Rumsey, Jr., and Lloyd R. Welch. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Trans. Inform. Theory, IT-23(2):157–166, 1977. doi:10.1109/tit.1977.1055688.
- [23] Hung Q Ngo, Ely Porat, and Atri Rudra. Efficiently decodable error-correcting list disjunct matrices and applications. In International Colloquium on Automata, Languages, and Programming, pages 557–568. Springer, 2011.
- [24] Nicolas Resch. List-decodable codes:(randomized) constructions and applications. School Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep., CMU-CS-20-113, 2020.
- [25] Nicolas Resch, Chen Yuan, and Yihan Zhang. Zero-rate thresholds and new capacity bounds for list-decoding and list-recovery. arXiv preprint, 2022. doi:10.48550/arXiv.2210.07754.
- [26] Madhu Sudan, Luca Trevisan, and Salil Vadhan. Pseudorandom generators without the XOR lemma. Journal of Computer and System Sciences, 62(2):236–266, 2001. doi:10.1006/JCSS.2000.1730.
- [27] Michael A. Tsfasman, S. G. Vlăduţ, and Th. Zink. Modular curves, Shimura curves, and Goppa codes, better than Varshamov-Gilbert bound. Mathematische Nachrichten, 109(1):21–28, 1982.
- [28] R. R. Varshamov. Estimate of the number of signals in error correcting codes. Doklady Akad. Nauk SSSR, 117:739–741, 1957.
- [29] Lloyd R. Welch, Robert J. McEliece, and Howard Rumsey, Jr. A low-rate improvement on the Elias bound. IEEE Trans. Inform. Theory, IT-20:676–678, 1974. doi:10.1109/tit.1974.1055279.
- [30] Jack Wozencraft. List decoding. Quarter Progress Report, 48:90–95, 1958.
