Abstract 1 Introduction 2 Preliminaries 3 List-Decoding Reed–Solomon Codes 4 General 𝒑 (Semi)Metrics 5 The 𝟐 Metric and Gaussian Error 6 The 𝟏 Metric and Laplacian Error References

List Decoding Reed–Solomon Codes in the
Lee, Euclidean, and Other Metrics

Chris Peikert ORCID University of Michigan, Ann Arbor, MI, USA Alexandra Veliche Hostetler ORCID University of Michigan, Ann Arbor, MI, USA
Abstract

Reed–Solomon error-correcting codes are ubiquitous across computer science and information theory, with applications in cryptography, computational complexity, communication and storage systems, and more. Most works on efficient error correction for these codes, like the celebrated Berlekamp–Welch unique decoder and the (Guruswami–)Sudan list decoders, are focused on measuring error in the Hamming metric, which simply counts the number of corrupted codeword symbols. However, for some applications, other metrics that depend on the specific values of the errors may be more appropriate. This work gives a polynomial-time algorithm that list decodes (generalized) Reed–Solomon codes over prime fields in p (semi)metrics, for any 0<p2. Compared to prior algorithms for the Lee (1) and Euclidean (2) metrics, ours decodes to arbitrarily large distances (for correspondingly small rates), and has better distance-rate tradeoffs for all decoding distances above some moderate thresholds. We also prove lower bounds on the 1 and 2 minimum distances of a certain natural subclass of GRS codes, which establishes that our list decoder is actually a unique decoder for many parameters of interest. Finally, we analyze our algorithm’s performance under random Laplacian and Gaussian errors, and show that it supports even larger rates than for corresponding amounts of worst-case error in 1 and 2 (respectively).

Keywords and phrases:
Reed–Solomon codes, list decoding, unique decoding, Lee metric, Euclidean metric, Guruswami–Sudan algorithm
Copyright and License:
[Uncaptioned image] © Chris Peikert and Alexandra Veliche Hostetler; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Error-correcting codes
Related Version:
Full Version: https://arxiv.org/abs/2510.11453
Editor:
Shubhangi Saraf

1 Introduction

Reed–Solomon codes [11] are among the most widely used families of error-correcting codes, with applications across computer and communication sciences. Their many virtues include: a very simple definition; the largest possible minimum distance as a function of rate; and efficient decodability from errors, via either unique decoding up to half the minimum distance (see, e.g., [6, Section 12.1]), or list decoding up to the larger Johnson bound, via the celebrated works of Sudan [14] and Gururswami–Sudan [7] (see also [6, Section 12.2]).

List decoding [3, 15] is the task of finding all codewords that are within some desired distance of a (potentially corrupted) received word. When this radius is more than half the code’s minimum distance, there can potentially be more than one codeword within range (hence the name “list decoding”). Despite this non-uniqueness, list decoding can suffice for many purposes (e.g., finding a nearest codeword within range), and indeed, it has found numerous applications.

Most work on decoding Reed–Solomon codes has measured errors in the Hamming metric, which simply counts the number of corrupted codeword symbols (regardless of how they are corrupted). However, there are many other natural metrics that depend on the specific values of the errors. Such metrics can be more appropriate for settings where introducing a “large” error at a coordinate is more costly than a “small” error, or where the communication channel might add some nonzero error to every coordinate. An example is a channel that adds error according to a Gaussian or other fairly concentrated distribution. When the code alphabet is q (the integers modulo q) – in particular, a prime field 𝔽q – one metric of frequent study is the Lee metric, which is merely the 1 norm 𝐱1=i|xi| after lifting q to its distinguished representatives in [q/2,q/2). Other natural, analogously defined choices include the Euclidean (2) or other p metrics.

We know of only a few prior works on efficiently decoding Reed–Solomon codes in metrics others than Hamming. For the Lee (1) metric, Roth and Siegel [12] gave an algorithm that uniquely decodes up to half of (a lower bound on) the minimum distance; their algorithm works for certain subclasses of (generalized) Reed-Solomon and BCH codes. In addition, Wu, Kuijper, and Udaya [16] gave a list-decoding algorithm for 1, built around Guruswami–Sudan [7], that decodes to larger distances than in [12] for all small enough rates. Finally, for the Euclidean (2) metric, Mook and Peikert [10] recently gave a list-decoding algorithm that also uses [7] as a black box.

1.1 Contributions

This work gives a polynomial-time algorithm that list decodes any generalized Reed–Solomon (GRS) code over a prime field in the p (semi)metric for any 0<p2; in particular, this includes the Lee (1) and Euclidean (2) metrics.111A semimetric is just a metric that does not necessarily satisfy the triangle inequality (which we will not need). Our algorithm works for a broader range of parameters, and has a better distance-rate tradeoff for all decoding distances above some moderate thresholds, than the prior algorithms for 1 and 2 [12, 16, 10]; see below for elaboration and Figure 1 for a visual depiction. For ease of comparison across the various works and p (semi)metrics, we use a suitably normalized version of distance: for code length n, distance d corresponds to relative distance δ:=d/n1/p.

For p=2, our algorithm can handle an arbitrarily large decoding distance, for a correspondingly small enough rate: specifically, as δ and the alphabet size grow, we can decode for rates rapidly approaching 1/(δ2πe). By contrast, the prior work [10] applies only for relative distance δ<1/20.7071 (i.e., 2 distance less than n/2). In addition, our algorithm works for larger rates than the one in [10] whenever δ exceeds about 0.51797. (See Section 5.2 for a detailed comparison.) This is particularly interesting since the rates obtained in [10] were shown to be optimal (in a certain sense) for δ<1/2, but not for larger values.

For p=1, again our algorithm (like the one from [16]) can handle an arbitrarily large decoding distance, whereas [12] is limited to relative distance δ<1 (i.e., 1 distance less than n). In addition, our algorithm works for larger rates than those of [12, 16] whenever the relative decoding distance exceeds about 0.78988, and in general, as δ and the alphabet size grow, we can decode for rates rapidly approaching 1/(2eδ). (See Section 6.2 for details.) Our algorithm is also qualitatively broader: it decodes from continuous (real-valued) error, whereas the ones from [12, 16] require discrete (integer) error. While continuous error can be discretized by rounding, this can increase the relative distance from the codeword by up to 1/2 in 1, which significantly degrades the distance-rate tradeoffs of the prior works, making them worse than ours for all distances.

We also give several useful supplementary results. By adapting an argument of [12], we prove lower bounds on the 1 and 2 minimum distances for a certain natural subclass of GRS codes. These imply that for many parameters of interest, our list-decoding algorithm outputs at most one codeword, i.e., it is actually a unique decoder. (See Lemmas 5.7 and 6.4 and the discussions thereafter.) And in addition to worst-case errors added by an adversarial channel, we also consider our algorithm’s performance under average-case errors produced by “memoryless additive” channels. Such channels add independent identically distributed error, drawn from some specified distribution, to each coordinate of the transmitted codeword. For Laplacian and Gaussian errors (which roughly correspond to 1 and 2, respectively), we show that our algorithm supports even larger rates than what we would get by merely applying concentration bounds on the error vector and invoking our worst-case results.

Figure 1: Plots of the adjusted rate R,(p), as a function of the p relative decoding distance δ=d/n1/p or corresponding channel error width r=p1/pcpδ, for which our algorithm can list decode prime-field GRS codes in the worst case (wc) or average case (ac), respectively, for p=2 (left) and p=1 (right). (For simplicity, these plots assume a field size qδ,r.) For comparison, also shown are the corresponding functions from the prior work on decoding GRS codes in these metrics: [10] is for list decoding in 2, and [12, 16] are respectively for unique and list decoding in the 1 (Lee) metric, but only for discrete (integer) error. Also shown are rate bounds Runiq(p) for which decoding to p relative distance δ is guaranteed to yield a unique codeword, for a certain natural subclass of GRS codes. (See Lemmas 5.7 and 6.4 and the discussions thereafter.)

1.2 Technical Overview

At the highest level, our algorithm for list decoding prime-field GRS codes in p follows the basic approach of [10] for list decoding (G)RS codes in 2: we first translate the received word into a suitable weight (or reliability) vector, then invoke a soft-decision list-decoding algorithm [7, 5, 8] for GRS codes. Informally, a weight vector specifies, for each coordinate of the received word and each symbol in the code alphabet, a “confidence level” that the transmitted codeword had that symbol at that coordinate. Given such a weight vector, a soft-decision decoding algorithm then finds all codewords that are sufficiently correlated with it, as determined by the code rate. (For the formal definitions and theorem statement, see Definitions 3.1, 3.2, and 3.3.)

For our purposes, the principal challenge is in mapping a received word to an appropriate weight vector so that any codeword that is close enough to the received word (in the p metric) has sufficient correlation with the weight vector. The prior work [10] uses very simple weights: given a real received value r, only its floor r and ceiling r receive positive weights, of 1(rr) and 1(rr), respectively. It was shown that with these weights, the cited soft-decision algorithms decode up to any 2 relative distance δ<1/2 for any code rate up to 12δ2. Moreover, for δ1/2 it was shown that this rate is optimal for those algorithms, i.e., no other weight assignment can work for a larger rate. However, [10] did not consider larger decoding distances than these, nor other metrics.

In this work, to handle large decoding distances and other metrics, we use “smoother” weights, which typically assign a positive weight to every alphabet symbol. Our overall approach (see Section 3) is quite general, and is parameterized by a function f:[0,1] satisfying mild hypotheses, primarily that its Fourier transform f^ is non-negative (see ˜2.12). This function can be seen as defining a weight – or a relative “likelihood,” in the case of a random channel – for every potential real-valued error.222For example, we take f to be a Gaussian function for decoding in the 2 metric, or under a Gaussian channel. For a prime-order code alphabet 𝔽qq:=/q, and a received real value r/q, we assign weight f(rw+q)=e=rw(modq)f(e) to each alphabet symbol wq. Since the elements of the coset rw+q are exactly those errors that would convert a transmitted symbol w to the received value r, this assignment captures the total weight of all such errors.

Our first main result, given in Theorem 3.5, lower bounds the correlation between the weight vector for a received word 𝐫 over /q and any (code)word 𝐜 over q, by the ratio of two quantities determined by f: its (arithmetic or geometric) mean over the coordinates of the error vector 𝐫𝐜, and (the square root of) its sum over a certain two-dimensional integer lattice q. So, for a particular decoding distance in a metric of interest (or channel distribution, in the average case), the goal becomes to choose a suitable function f that nearly maximizes this ratio. The proof of the theorem uses a mild generalization of Fourier-analytic results on lattices from [1, 9], and is the source of our requirement that f^ is non-negative, and ultimately the restriction that 0<p2 for p (semi)metrics.

The bulk of the remaining work is then devoted to making a suitable choice of function f for the p (semi)metric (and corresponding channel distributions), and analyzing its summation over q. In Section 4 we consider scalings of the function f(p)(x):=exp(|x|p), which is known to have non-negative Fourier coefficients for 0<p2 (but not for any other p). Then in Sections 5 and 6 we specialize to p=2 and p=1, respectively, and give fairly tight upper bounds on f(q) using Fourier-analytic techniques or direct analysis. Finally, we use these bounds to optimize the distance-rate tradeoffs for which we can list decode GRS codes in these p metrics, and for Gaussian and Laplacian random channels as well.

2 Preliminaries

For a positive integer n, let [n]:={1,,n}. For a positive integer q, define the quotient ring q:=/q and the additive quotient group q:=/q. For a prime power q, let 𝔽q denote the finite field of size q. When q is prime, we identify q with the finite field 𝔽q in the natural way.

For any xq (which is a coset of q), define its “lift” x¯[q/2,q/2) to be the unique real number such that x¯=x(modq), i.e., the “zero-centered” distinguished representative of x. We also apply this notation entry-wise to vectors over q.

For any p>0, define the p (quasi)norm on n as 𝐱p:=(i=1n|xi|p)1/p. It is well known that this is a norm if and only if p1, and is a quasinorm for any p>0.333A quasinorm relaxes the triangle inequality axiom to require only that 𝐱+𝐲K(𝐱+𝐲) for some fixed K. We do not use the triangle inequality, or even this relaxation, so we can consider p<1. Similarly, we define the p (semi)metric on qn by lifting, i.e., via 𝐱p:=𝐱¯p.444Formally, this is not a norm because it is not defined on a vector space (since q is not a field), and it does not satisfy homogeneity due to the mod-q reduction. However, it does define a (semi)metric (where “semi” does not require the triangle inequality), with distance function d(𝐱,𝐲)=𝐱𝐲p. For p=1, this generalizes the Lee metric over q to q.

For two groups X,Y, their direct sum group XY is their Cartesian product with the group operation defined component-wise. This notation extends to the direct sum of group cosets, which is a coset of the direct sum of the groups.

For any two vectors 𝐱=(x1,,xn) and 𝐲=(y1,,yn) of the same dimension, their coordinate-wise (or Hadamard) product is denoted by 𝐱𝐲:=(x1y1,,xnyn).

For a finite sequence X1,,Xn of real values, we denote their average by Avgi[Xi]:=1ni=1nXi. We use the following special case of the well known Hoeffding (lower-)tail bound.

Lemma 2.1 (Hoeffding’s Inequality).

Let X1,,Xn be independent identically distributed random variables in [0,1] with common expectation μ=𝔼[Xi]. Then for any γ0,

Pr[Avgi[Xi]μγ]<exp(2γ2n).

Operations on functions

For any function f:D and countable subset XD, we define f(X):=𝐱Xf(𝐱). We extend the domain to Dk multiplicatively, as

fk(𝐱):=i=1kf(xi), (2.1)

often omitting the superscript k when it is clear from context. When D=n, for any real s0 we define fs(𝐱):=f(𝐱/s).

2.1 Linear Codes

A linear (error-correcting) code of (block) length n over the alphabet 𝔽q is a linear subspace of 𝔽qn. As a subspace, it has a dimension. In this paper, we consider the following family of codes.

Definition 2.2 ((Generalized) Reed–Solomon code).

Let nq be positive integers, with q a prime power. For a non-negative integer k, a vector 𝛂𝔽qn with distinct entries, and a vector 𝐭(𝔽q{0})n with (not necessarily distinct) non-zero entries, the Generalized Reed–Solomon (GRS) code of dimension k with evaluation points 𝛂 and twist factors 𝐭 is defined as

GRSq,k(𝜶,𝐭):={𝐭f(𝜶)=(t1f(α1),,tnf(αn)):f𝔽q[x],deg(f)<k}.

A special case is a Reed–Solomon (RS) code, which is obtained by using trivial twist factors 𝐭=(1,,1).

2.2 Lattices

Definition 2.3 (Lattice, Basis).

An (n-dimensional, full-rank) lattice n is the set of all integer linear combinations of some n linearly independent basis vectors 𝐁={𝐛1,,𝐛n} n:

=(𝐁):={i=1nzi𝐛i:zi}.

Equivalently, it is a discrete additive subgroup of n whose -span is n; as such, it defines the quotient group n/ of lattice cosets 𝐱+ for 𝐱n. A sublattice of n is called an integer lattice.

In this work, all lattices are implicitly full rank. A lattice basis can equivalently be seen as an invertible matrix 𝐁n×n whose columns are the vectors 𝐛1,,𝐛n. Note that a given lattice has multiple different bases, which are all related by right-multiplication by unimodular matrices in n×n.

Definition 2.4 (Determinant).

The determinant of a lattice generated by basis 𝐁 is det():=|det(𝐁)|.

Note that the determinant of a lattice is invariant under the choice of basis, by the above-mentioned relationship between the bases of a lattice.

Definition 2.5 (Dual lattice).

The dual lattice of a lattice n is

:={𝐱n:𝐯,𝐯,𝐱}.

If 𝐁 is a basis of , then its dual basis 𝐁:=𝐁t is a basis of , and hence det()=det()1.

Lemma 2.6.

Let f:D and X,YD be countable subsets of its domain (e.g., lattice cosets). Then f(XY)=f(X)f(Y).

Proof.

This follows directly from the definition of direct sum and Equation 2.1.

2.3 Fourier Analysis

Let f:n be a (Borel) measurable function that satisfies n|f(𝐱)|d𝐱<. Its Fourier transform f^:n is defined as

f^(𝐰):=nf(𝐱)exp(2πi𝐱,𝐰)d𝐱.

It satisfies the following standard properties, which follow by routine calculations.

Lemma 2.7 (Multiplicativity).

For any function f as above, fk^=f^k (where the exponent notation is as defined in Equation 2.1).

Lemma 2.8 (Time-scaling property).

For any function f as above and real s0, fs^(𝐰)=snf^1/s(𝐰).

Lemma 2.9 (Time-shift property).

For any function f as above and 𝐜n, let g(𝐱)=f(𝐱𝐜). Then g^(𝐰)=f^(𝐰)exp(2πi𝐰,𝐜).

We say that f is nice if it satisfies conditions that are sufficient for the following formula to hold, e.g., those given in [13, pages 106–107]. All of the specific functions f we use in this work are easily seen to be nice.

Lemma 2.10 (Poisson Summation Formula (PSF)).

For any lattice and nice function f,

f()=det()f^().

We will use a more general version of the PSF for lattice cosets.

Lemma 2.11 (Generalized PSF).

For any lattice n, nice function f, and 𝐲n,

f(𝐲+)=det()𝐰f^(𝐰)exp(2πi𝐰,𝐲).

2.4 Lattice Roughness

Continuing from Section 2.3, for the rest of this work we require the following properties of f.

Assumption 2.12.

The function f has range [0,1] and is nice, and f^ is non-negative real with f^(0)>0.

Because f is real, its Fourier transform is conjugate symmetric, i.e., f^(w)=f^(w) for all w, where the star denotes complex conjugation. Since f^ is also real, this implies that it is symmetric, i.e., f^(w)=f^(w). Finally, note that if f satisfies this assumption, then so does its multiplicative extension fk.

We now define an important Fourier-analytic quantity that plays an important role in our analysis. We adopt the name “roughness” because it is the functional inverse of the “smoothing parameter” from [9], which is the smallest s that makes the function fs(𝐲+) sufficiently “smooth” as a function of 𝐲.

Definition 2.13.

For a function f, lattice n, and real s>0, the roughness is defined as

ε,s:=fs^({𝟎})fs^(𝟎)=fs^()fs^(𝟎)10.

More generally, for a (linear) subspace H of n, the H-roughness is defined as

ε,s(H):=fs^(H)fs^(H)=fs^()fs^(H)1ε,s(n)=ε,s.

(Both inequalities follow from the non-negativity of fs^.)

Lemma 2.14 (adapted from [9, Lemmas 2.9 and 4.1]).

For any f satisfying ˜2.12, lattice n, real s>0, subspace H of n defining roughness ε:=ε,s(H), and 𝐲H,

fs(𝐲+)det()fs^(H)[1ε,1+ε],

with equality against the upper bound when 𝐲=𝟎. In particular, fs(𝐲+)fs()[1ε1+ε,1].

Due to space constraints, the proof is left to the full version.

3 List-Decoding Reed–Solomon Codes

3.1 Soft-Decision Decoding

To list-decode Reed–Solomon codes under various norms and probabilistic channel models, we use the “weighted,” or soft-decision, list decoder of Guruswami and Sudan (hereafter GS) [7], as elaborated upon in Guruswami’s thesis [5, Section 6.2.10] and the work of Koetter and Vardy [8]. A soft-decision decoder takes a “weight vector” as input, and outputs a set of codewords.

Definition 3.1.

A weight vector for a length-n code over 𝔽q is some W:=(W1,,Wn)[0,1]qn where each block Wi[0,1]q is indexed by 𝔽q; equivalently, each block is a function Wi:𝔽q[0,1].

Conceptually, each block Wi of a weight vector may be thought of as specifying a (posterior) probability distribution Πi over 𝔽q, where Πi(x) is proportional to the probability that the ith transmitted symbol was x𝔽q, given what was received from the channel (which need not be an element of 𝔽q). At a formal level, this interpretation makes sense only when the channel is probabilistic (for average-case decoding), but it still serves as useful intuition when the channel is adversarial (for worst-case decoding). We consider both types of channels in our results below.

For c𝔽q, define [c][0,1]q to be the binary indicator vector indexed by 𝔽q that has a 1 in coordinate c and 0s elsewhere. Similarly, for any vector 𝐜=(c1,,cn)𝔽qn, define the weight vector [𝐜]:=([c1],,[cn])[0,1]qn. Observe that its Euclidean norm is [𝐜]=n.

Definition 3.2.

The correlation between a weight vector W[0,1]qn and a word 𝐜𝔽qn is defined as their length-normalized inner product (or the cosine of the angle between them):

corr(W,𝐜):=W,[𝐜]Wn.
Theorem 3.3 (adapted from [7, Theorem 18] and [5, Theorem 6.21]).

For a prime power q, let 𝒞𝔽qn be a Generalized Reed–Solomon code of dimension k and adjusted rate R:=(k1)/n. There is a deterministic algorithm that, given a weight vector W and a “tolerance” τ>0, outputs in time poly(n,q,1/(τW)) the set of all codewords 𝐜𝒞 that satisfy

corr(W,𝐜)R+τ. (3.1)

We remark that the above theorem is originally stated for rational weights, but the supporting argument (from [5, Lemma 6.20]) easily adapts to handle real-valued weights that can be lower bounded to any needed precision in polynomial time, as all of ours can be.

3.2 From Received Words to Weight Vectors

Here we describe a general approach for translating a received word to a weight vector. This translation is parameterized by a function that, conceptually, can be viewed as (proportional to) the channel’s probability density function, even if the channel is not actually probabilistic.

Let f:[0,1] be a function that satisfies ˜2.12, extended multiplicatively to n as in Equation 2.1, and recall that fs(x):=f(x/s) for any constant s>0. Next let q be a positive integer, and recall that we identify q:=/q with 𝔽q in the natural way when q is prime. Let the set of possible received values be q=/q, and for any such value yq, define the weight function Ws,y:q[0,1] by

Ws,y(x):=fs(yx+q).

Notice that here fs is applied to a coset of q, which represents an infinite series; for all our concrete choices, these series converge and so the function Ws,y is well defined. This function can also be seen as the vector Ws,y:=(Ws,y(x))xq[0,1]q, indexed by q.

In line with the probabilistic conception of weight vectors from Section 3.1 above, the function Ws,y can be seen as follows. Suppose that a uniformly random symbol in q is sent over a channel, which adds (modulo q) noise drawn from a distribution over whose probability density function is proportional to fs. Then the probability that the sent symbol was xq, conditioned on receiving y, is proportional to Ws,y(x). This is because the coset yxq is the set of all noise values that yield y if x is sent. Note that in the definition of Ws,y we do not normalize by the total weight Ws,y(q)=fs(y+) (which may vary based on the received value y); this turns out to yield simpler analyses and tighter results.

Definition 3.4.

For a function fs as above and any received vector 𝐲=(y1,,yn)qn, define the corresponding weight vector as

Ws,𝐲:=(Ws,y1,,Ws,yn)[0,1]nq.

In order to use the soft-decision algorithm (Theorem 3.3) for decoding under an adversarial channel, it suffices to show that we can choose a suitable s so that for any received word 𝐲 and any sufficiently close codeword 𝐜 (in the norm of interest), the correlation corr(Ws,𝐲,𝐜) satisfies (3.1). Similarly, for decoding under a probabilistic channel, it suffices to show that with high probability over the channel noise 𝐞, the transmitted codeword 𝐜 has large enough correlation with the weight vector Ws,𝐲 of the received word 𝐲=𝐜+𝐞 (again, for some suitably chosen s). To this end, in what follows we give a lower bound on Ws,𝐲,[𝐜] and an upper bound on Ws,𝐲, in terms of fs and the difference 𝐲𝐜 between the received word and the codeword of interest.

3.3 Main Theorem

Here we state and prove the main result of this section. For this we define the two-dimensional integer lattice q that consists of all shifts of the lattice q2 by (z,z) for an integer z, i.e.,

q:=xq(xx)=z((z,z)+q2)q2.

We have that det(q)=q, and so det(q)=1/q. We sometimes omit the q subscript when it is clear from context or its value is unimportant.

Theorem 3.5.

Suppose that f satisfies ˜2.12. For any s>0 and 𝐲qn defining W=Ws,𝐲, and any 𝐜qn,

corr(W,𝐜)Avgi[n][fs(yici)]fs(q)fs(𝐲𝐜)1/nfs(q).
Proof.

This follows immediately from the following lower and upper bounds on the numerator and denominator of corr(W,𝐜)=W,[𝐜]/nW/n. For the numerator, by the definitions of W and [𝐜],

W,[𝐜]/n=Avgi[n][Ws,yi(ci)]=Avgi[n][fs(yici)]fs(𝐲𝐜)1/n,

where the last step follows by the inequality of arithmetic and geometric means, and the non-negativity and multiplicativity of fs over direct sums of cosets (Lemma 2.6). For the denominator, the upper bound W/nfs(q) is proved in Lemma 3.6 below.

Lemma 3.6.

Adopting the setup from Theorem 3.5, and letting ε~=εq,s(H) where H=span(1,1),

W2/nfs(q)[1ε~1+ε~,1].
Proof.

By definition of W,

W2/n=Avgi[n][xqfs(yix)2].

To bound this, let yq be arbitrary. By Lemma 2.6,

xqfs(yx)2 =xqfs((yx)(yx))
=xqfs((yy)(xx))
=fs((y¯,y¯)+q)
fs(q)[1ε~1+ε~,1],

where the last step follows by the latter part of Lemma 2.14 on the lattice q with subspace H, and noting that (y¯,y¯)H. The claim follows by averaging over i[n].

3.4 Average-Case Decoding

Here we consider list-decoding in the average case, where the channel is probabilistic (not worst case) and the goal is to output a list of codewords that includes the transmitted one. We consider channels that add independent, identically distributed random error (drawn from some specified distribution) to each coordinate of the transmitted codeword; this is often known as a memoryless additive channel. Specifically, we assume that the channel’s error distribution (for each coordinate) is proportional to fr for some r>0, i.e., it has probability density function

Dr(x):=fr(x)fr^(0).

For example, if fr is a Gaussian function, this is known as the additive white Gaussian noise (AWGN) channel model. In some settings one may also consider a discrete channel distribution, e.g., over , in which case its probability mass function is Dr(x):=fr(x)/fr(). For any s>0 (which may differ from r), define

μr,s:=𝔼eDr[fs(e)].

In Section 4 we will use the following bound for a specific family of functions f to show that the transmitted codeword is recovered with high probability over the channel error.

Lemma 3.7.

For any r,s>0 and T defining γ:=μr,sTfs(q)0, and any 𝐜qn,

Pr𝐞Drn[corr(Ws,𝐜+𝐞,𝐜)T]<exp(2γ2n).

This follows from ˜2.12, 3.5, and 2.1. Due to space constraints, the details of the proof are left to the full version.

4 General 𝒑 (Semi)Metrics

In this section we define weight vectors via Definition 3.4 using the function f:[0,1] defined as

f(x)=f(p)(x) :=exp((cp|x|)p) (4.1)
where cp :=2Γ(1+1/p),

where the gamma function Γ(z)=0uz1exp(u)du for z>0, and satisfies Γ(1)=1 and Γ(1+z)=zΓ(z). As two important examples, c1=2 and c2=π.

Note that by multiplicativity (Equation 2.1),

f(𝐱)=i=1nf(xi)=exp(i=1n(cp|xi|)p)=exp((cp𝐱p)p)=f(𝐱p).

Regarding the Fourier transform of f, the “normalizing constant” cp has been defined to make f^(0)=1.

It is also known that f^ is non-negative for 0<p2; this follows immediately from an elegant lemma and proof due to Logan, given in [4, Lemma 5].666For p>2, by contrast, f^ can have negative values, which prevents our framework from supporting p metrics for such p. So, f satisfies ˜2.12 for such p. Another immediate consequence of Logan’s lemma is that as s grows, fs^(w)/s strictly decreases and approaches zero for every w0.

We will need the following simple lemma.

Lemma 4.1.

For any s>0 and 𝐲qn defining W=Ws,𝐲, we have that W2n/exp(cpp/(2s)p).

Due to space constraints, the proof is left to the full version.

4.1 Worst-Case Decoding

We now address list-decoding in the p (semi)metric for 0<p2, under worst-case error. Consider decoding distance d=δn1/p, where n is the code length, and δ can be seen as the relative decoding distance (relative to n1/p, which is the most natural normalization factor for p). For s>0, relative distance δ0, and positive integer modulus q, define

Wq,δ(p)(s):=fs(δ)fs(q)=exp((cpδ/s)p)fs(q)0. (4.2)

By Theorems 3.5 and 3.3, to decode a GRS code of adjusted rate R over a prime field 𝔽q to within p distance δn1/p using the GS algorithm, it suffices to set s>0 so that Wq,δ(p)(s)>R. In other words, we can decode under relative distance δ for any R less than

Rwc,q,(p)(δ):=sups>0Wq,δ(p)(s)2. (4.3)

The following makes this formal.

Theorem 4.2.

For any 0<p2, δ0, and prime q, the GS soft-decision algorithm using weight vector given by fs(p) for any s>0 list-decodes, up to p distance d=δn1/p, any GRS code 𝒞𝔽qn with adjusted rate R<Wq,δ(p)(s)2, in time polynomial in n, q, and exp(1/sp)/(Wq,δ(p)(s)R).777We remark that in many cases, the bound on the polynomial running time can be improved using a better lower bound for W2, such as the one given by Lemma 3.6.

Proof.

We invoke the GS algorithm on the weight vector W=Ws,𝐲 given by the choice of s and the received word 𝐲, and tolerance τ=Wq,δ(p)(s)R>0.888To be more precise, we can invoke GS on any approximation of τ in [τ/2,τ], say. This can be computed by approximating fs(q) from above to the needed precision, by enumerating sufficiently many points of q near the origin, and upper-bounding the contribution of the remaining points in the “tails” using, e.g., Lemma 5.3. The running time is polynomial in n, q, and 1/(τW2)exp(cpp/(2s)p)/(τn), by Lemma 4.1.

Now let 𝐜𝒞 be a codeword within distance d of 𝐲, i.e., 𝐲𝐜¯pd. By Theorems 3.5, 2.12, and 4.2,

corr(W,𝐜)fs(𝐲𝐜)1/nfs(q)fs(d)1/nfs(q)=Wq,δ(p)(s)=R+τ.

So, by Theorem 3.3, the output of the GS algorithm includes 𝐜, as needed.

 Remark 4.3.

Interestingly, as δ, q/δ, and n grow (and the other parameters remain fixed), the product of the relative distance δ and the adjusted rate R for which we can decode approaches the relative radius of a unit-volume p ball. Due to space constraints, we defer this derivation to the full version.

4.2 Average-Case Decoding

We now consider average-case decoding under a memoryless additive (continuous or discrete) channel whose density function is proportional to a scaling of f=f(p). Specifically, we consider the continuous distribution with probability density function Dr(x):=fr(x)/r, and the discrete distribution over with probability mass function Dr(x):=fr(x)/fr(). Following Section 3.4, for any r,s>0 define

μr,s(p):=μr,s=𝔼eDr[fs(e)].

For these channel distributions we derive suitable bounds on μr,s(p), then reach the conclusion via Lemmas 3.7 and 3.3.

Lemma 4.4.

For any 0<p2, any r>0 defining a continuous or discrete distribution Dr, and s>0,

μr,s(p)s(r,s)p,

with equality in the continuous case and strict inequality in the discrete case.

Due to space constraints, the proof is left to the full version.

Now, for any channel parameter r>0 and for s>0, define

Aq,r(p)(s):=μr,s(p)fs(q)s(r,s)pfs(q), (4.4)

where the inequality is by Lemma 4.4. By Theorems 3.5 and 3.3, to decode (with high probability) a GRS code of adjusted rate R over a prime field 𝔽q under a channel with parameter r, it suffices to set s>0 so that Aq,r(p)(s)>R. In other words, we can decode under channel parameter r for any R less than

Rac,q,(p)(r):=sups>0Aq,r(p)(s)2. (4.5)

The following makes this formal.

Theorem 4.5.

Let 0<p2, r>0, α(0,1), and q be prime. Under a memoryless additive (continuous or discrete) channel with distribution Dr, the GS soft-decision algorithm, using weight vector given by fs(p) for any s>0, list-decodes any GRS code 𝒞𝔽qn with adjusted rate R<Aq,r(p)(s)2, in time polynomial in n, q, and exp(1/sp)/(Aq,r(p)(s)R), except with probability less than

exp(2nfs(q)α2(Aq,r(p)(s)R)2).

The proof is similar to that of Theorem 4.2, but using Lemma 3.7 to bound the probability that corr(W,𝐜)<R+τ. Due to space constraints, the details are left to the full version.

 Remark 4.6.

Theorem 4.5 outperforms Theorem 4.2 (for worst-case decoding) by a factor that approaches (e/2)1/p in the adjusted rate R it can handle, as r and q/r grow. Specifically, consider a channel with parameter r. A calculation reveals that its relative error (in p, relative to n1/p) is tightly concentrated around δ=r/(p1/pcp), so following the analysis in Remark 4.3, Theorem 4.2 applies for R that approaches 1/(re1/p). By comparison, Theorem 4.5 applies for R that approaches 1/(r21/p).

5 The 𝟐 Metric and Gaussian Error

In the remainder of the paper we instantiate our general list-decoding results for p (semi)metrics (Theorems 4.2 and 4.5) to specific metrics of interest and memoryless additive channels. In this section, we consider the 2 metric and Gaussian channels.

We specialize Equation 4.1 to p=2, i.e., the Gaussian function

f(x):=f(2)(x)=exp(πx2).

By a straightforward calculation it can be seen that this function is its own Fourier transform: f^=f. Note that fs^=sf1/s by the time-scaling property of the Fourier transform (Lemma 2.8). Finally, recalling that f(𝐱)=f(𝐱2), we get that f is invariant under rotations.

5.1 Bounds

In this subsection we derive fairly tight bounds on the factor fs(q) that appears in the quantities that govern the adjusted rates under which we can decode in the worst and average cases (Equations 4.2 and 4.4, respectively). For this purpose we need to define a suitable “fudge factor.” For rr0:=ln(4)/π0.66428, define

E(r):=12exp(πr2/2)[0,1).

Notice that E(r) is positive for r>r0, is strictly increasing, and rapidly approaches 1 as r increases. Next, for real s,q such that s[r0,q/r0], define

Eq(s):=E(q/s)E(s)[0,1).

Similarly, Eq(s) is positive for s(r0,q/r0), and rapidly approaches 1 as both s,q/s increase.

We next state some bounds on fs(q) and Eq(s); Lemma 5.1 is the main one we use. Due to space constraints, the proofs are given in the full version.

Lemma 5.1.

For any real s and positive integer q such that s(r0,q/r0),

1fs(q)>2sEq(s)2.

This follows directly from Lemmas 5.2 and 5.4 below.

Lemma 5.2.

For any real s>0 and positive integer q, let ε=ε,q/(s2) and ε~=εq,s(H) where H=span(1,1). Then

fs(q)=s2(1+ε)(1+ε~).

Next we bound the roughness quantities ε,ε~ from Lemmas 5.1 and 5.2, using the following classic tail inequality.

Lemma 5.3 (adapted from [2, Lemma 2.4]).

For any lattice , unit vector 𝐮, and s,t>0, let T𝐮,t={𝐱:|𝐱,𝐮|t}. Then

fs(T𝐮,t)<2exp(πt2/s2)fs().
Lemma 5.4.

Let r>r0 and H=span(1,1). Then

11+ε,r/2,11+εq,r(H)>E(r)=12exp(πr2/2).

5.2 Worst-Case Decoding

We now address list-decoding in the 2 metric, under worst-case error of bounded distance, by specializing the material of Section 4.1 to p=2 and using our bounds on fs(q) from Section 5.1. So, we consider decoding distance d=δn, where n is the code length and δ is the relative decoding distance. Then by Equations 4.2 and 4.3, we can list-decode for any R less than

Rwc,q,(2)(δ)=sups>0Wq,δ(2)(s)2>sups(r0,q/r0)2exp(2πδ2/s2)sEq(s)2, (5.1)

where the inequality follows by Lemma 5.1.

Corollary 5.5 below is obtained by nearly maximizing the right-hand side of (5.1). More specifically, a standard calculation shows that taking s=δ4π maximizes the “main term” 2exp(2πδ2/s2)/s, to have value 1/(δ2πe). For moderate or larger values of δ (and hence s), this very nearly maximizes the entire expression, because Eq(s)E(s) since q/ss, and E(s) rapidly approaches 1 as s grows. For example, E(s)21108 for δ1. So, as δ grows, the R for which we can list-decode rapidly approaches 1/(δ2πe).

Corollary 5.5.

For any δ>ln(4)/(2π)0.1874 and prime q4πδ2, the GS algorithm using weight vector given by fs for s=δ4π list-decodes, up to 2 distance δn in time poly(n,q,1/(R~wc,q,(2)(δ)R)), any GRS code with adjusted rate

R<R~wc,q,(2)(δ):=1δ2πeEq(δ4π)2.
Proof.

For s=δ4π, the lower bounds on δ and q imply that s=δ4π(r0,q/r0). Then by hypothesis and Lemmas 5.1 and 4.2,

R<R~wc,q,(2)(δ)=1δ2πeEq(δ4π)2<exp(2πδ2/s2)fs(q)=Wq,δ(2)(s)2.

The claim then follows directly by Theorem 4.2.

Comparison to [10]

The previous best result for list-decoding (Generalized) Reed–Solomon codes in the 2 metric was given by Mook and Peikert [10].999By a standard reduction, the result from [10] also applies to GRS codes, not just RS codes as was originally stated.

Proposition 5.6 ([10, Theorem 3.4]).

For any GRS code 𝒞𝔽qn with any adjusted rate R<1 and any ε>0, there is a poly(n,q,1/ε)-time algorithm that list-decodes 𝒞 up to 2 distance d=n(1R)(1ε)/2.

Equivalently, for a relative decoding distance δ=d/n>0, the result from [10] works for adjusted rates R approaching 12δ2, so it applies only for

δ(1R)/21/2.

By contrast, our Theorem 4.2 works for any (arbitrarily large) δ>0 (and Corollary 5.5 gives a simpler and more explicit rate bound for any δ>0.1875). Moreover, for those δ for which both Theorems 4.2 and 5.6 apply, our result works for a larger R as long as Rwc,q,(2)(δ)>12δ2 (see (4.3)). For typical (moderate or larger) q, this holds for all δ0.51797, which corresponds to R0.46342. (For tiny δ0, Theorem 4.2 works for R0.93700, whereas [10] works for R1, so the latter is better for very small distances.)

We also point out that [10] proves that for any δ1/2, which corresponds to R1/2, its (very simple) choice of weight vector gives an optimal tradeoff between δ and R for the GS/KV soft-decision algorithm and analysis. However, the optimality argument breaks down for δ>1/2 (equivalently, for R<1/2). And indeed, as we have just seen, we obtain a better distance-rate tradeoff than [10] for almost all such δ. This highlights the interesting question of determining an optimal choice of weights for the GS soft-decision algorithm for δ>1/2 (especially at the low end of this range).

5.3 Unique Decoding for a Subclass of GRS Codes

For a certain natural subclass of GRS codes, and certain rates and decoding distances covered by our list-decoding algorithm, decoding is in fact unique (i.e., the list size is at most one). We show this by giving a lower bound on the 2 minimum distance of such codes, and then observing that our list-decoding algorithm can decode to beyond half this distance for all small enough rates.

Lemma 5.7 (adapted from [12, Theorem 4]).

Any prime-field GRS code GRSq,k(𝛂,𝛂)𝔽qn (whose twist factors 𝐭 equal the nonzero evaluation points 𝛂) of rate R=k/n has squared 2 minimum distance at least

(n+1)2k212k2(n+1)>1R212R2n.

Due to space constraints, the proof is left to the full version.

Lemma 5.7 gives a relationship between the code rate R and (a lower bound on) half the 2 minimum distance, for which decoding to that distance yields a unique solution. By taking the functional inverse of half this minimum-distance bound, we see that decoding to relative distance δ yields a unique solution as long as

R<Runiq(2)(δ):=148δ2+1,

which approaches 1/(43δ) as δ grows. This curve is shown in Figure 1. Observe that for any δ for which our list-decoding algorithm outperforms the one of [10], we have that Rwc,(2)(δ)>Runiq(2)(δ). In other words, we can efficiently list decode to relative distance δ for all rates up to Runiq(2)(δ) (and beyond), thus yielding a unique decoder for these parameters. Alternatively, as the rate R approaches zero, we can efficiently list decode to a multiple of the unique-decoding distance bound that approaches 43/2πe1.6764.

5.4 Average-Case Decoding

We now consider average-case decoding under a memoryless additive (continuous or discrete) Gaussian channel, by specializing the material of Section 4.2 to p=2 and using our bounds on fs(q) from Section 5.1. Consider a Gaussian channel of parameter r>0. Then by Equations 4.4 and 4.5, we can list-decode for any R less than

Rac,q,(2)(r)=sups>0Aq,r(2)(s)2>sups(r0,q/r0)s2r2+s2Eq(s)2, (5.2)

where the inequality is by Lemma 5.1.

Corollary 5.8 below is obtained by nearly maximizing the right-hand side of (5.2). More specifically, setting s=r maximizes the “main term” s2/(r2+s2), to have value 1/(r2). As above, for moderate or larger values of r (and hence s), this very nearly maximizes the entire expression, because Eq(s) rapidly approaches 1 as s grows.101010By contrast, Eq(s)1 for values of s very close to r0, in which case the bound is maximized by taking s somewhat larger than r. So, as r grows, the rate R for which we can list-decode rapidly approaches 1/(r2).

Corollary 5.8.

For any r(r0,q/r0), α(0,1), and prime q, the GS algorithm using weight vector given by fr list-decodes, in time poly(n,q,1/(R~ac,q,(2)(r)R)), any GRS code with adjusted rate

R<R~ac,q,(2)(r):=1r2Eq(r)2,

except with probability less than exp(2nα2r(R~ac,q,(2)(r)R)2).

Proof.

By hypothesis, Lemmas 4.4, 5.1, and 4.4,

R<1r2Eq(r)2<μr,r2fr(q)=Aq,r(2)(r)2.

The claim then follows directly by Theorem 4.5, and the fact that fr(q)>r/2 by Lemma 5.2.

6 The 𝟏 Metric and Laplacian Error

In this section, we consider the 1 metric and Laplacian channels. We specialize Equation 4.1 to p=1, i.e., the Laplacian function

f(x):=f(1)(x)=exp(2|x|).

(The Fourier transform of this function is given by f^(w)=1/(1+(πw)2), but we will not use this; as already noted earlier, f(1) satisfies ˜2.12.)

Throughout this section we use the hyperbolic tangent function

tanh(x):=exexex+ex=1e2x1+e2x=e2x1e2x+1<1

and its reciprocal coth(x)=1/tanh(x)>1. Observe that tanh(x) approaches 1 as x grows; it also satisfies tanh(x)<x for all x>0, and approaches x as x approaches zero.111111Both facts can be seen from the Taylor series tanh(x)=xx3/3+, valid for |x|<π/2.

6.1 Bounds

In this subsection, we analyze the exact value of fs(q) and derive an asymptotic bound. This appears in the quantities that govern the adjusted rates under which we can decode in the worst and average cases (Equations 4.2 and 4.4, respectively). For this purpose, we define a suitable “fudge factor”. For any real x>0, define

E(x):=(coth(x)+4xe2x(e2x1)2)1(0,1), (6.1)

where the upper bound comes from the fact that coth(x)>1. Note that, as x grows, the first term in the sum rapidly approaches one, and the second term rapidly approaches zero. More precisely, a brief calculation reveals that

E(x)=1O(xe2x). (6.2)
Lemma 6.1.

For any s>0 and positive integer q,

1fs(q)>tanh(2/s)E(q/s).

Note that by Equation 6.2, for any fixed s>0, as q (or equivalently, q/s) grows, 1/fs(q) rapidly approaches tanh(2/s). In turn, this approaches 2/s as s grows.

The proof of Lemma 6.1 follows directly from Lemma 6.2 below and Equation 6.1. Due to space constraints, the details are left to the full version.

Lemma 6.2.

For any s>0 and positive integer q,

fs(q)=coth(2/s)coth(q/s)+2qe2q/s(e2q/s1)2.

Due to space constraints, the proof is left to the full version.

6.2 Worst-Case Decoding

Now we address list-decoding in the 1 metric, under worst-case error of bounded distance, by specializing the material of Section 4.1 to p=1 and using our bound on fs(q) from Lemma 6.1. We consider decoding distance d=δn, where n is the code length and δ is the relative decoding distance. Then by Equations 4.2, 4.3, and 6.1, we can list-decode for any R less than

Rwc,q,(1)(δ)=sups>0Wq,δ(1)(s)2>sups>0exp(4δ/s)tanh(2/s)E(q/s). (6.3)

Corollary 6.3 below is obtained by maximizing the “main term” exp(4δ/s)tanh(2/s) of the right-hand side of (6.3). By calculus, this is done by taking s=4/ln(D(δ))>0, where

D(δ):=1+1δ2+1δ>1.

Substituting, this means we can list-decode for any R less than

R~wc,q,(1)(δ):=tanh(lnD(δ))D(δ)δE(qln(D(δ))/4)=D(δ)1D(δ)+1E(qln(D(δ))/4)D(δ)δ. (6.4)

We consider this quantity’s asymptotic behavior for large and small δ:

  • As δ grows, D(δ)=1+1/δ+O(1/δ2) and D(δ)δ approaches e, hence R~wc,q,(1)(δ) approaches 1/(2eδ) as q/δ also grows. This is consistent with Remark 4.3.

  • As δ approaches zero, D(δ) approaches 2/δ and D(δ)δ approaches 1, hence R~wc,q,(1)(δ) approaches 1 as q/δ also grows.

Corollary 6.3.

For any δ>0 and prime q, the GS algorithm using weight vector fs for s=4/ln(D(δ)) list-decodes, up to 1 distance δn in time poly(n,q,1/(R~wc,q,(1)(δ)R)), any GRS code with adjusted rate R<R~wc,q,(1)(δ) (see Equation 6.4).

Proof.

By hypothesis and Lemmas 6.1 and 4.2,

R<R~wc,q,(1)(δ)=tanh(lnD(δ))D(δ)δE(q/s)<exp(4δ/s)fs(q)=Wq,δ(1)(s)2.

The claim then follows directly by Theorem 4.2.

Comparison to [12, 16]

To our knowledge, the only prior algorithms for (unique or list) decoding Reed–Solomon codes in the 1 (Lee) metric are [12, Section 5] and [16]. We note that both of these require discrete (integer) error, whereas our algorithm works for continuous error.

For a certain subclass of GRS codes (and BCH codes more generally), [12] gives a unique decoding algorithm for up to half (a lower bound on) the 1 minimum distance, using Euclid’s algorithm for polynomials. This algorithm decodes up to any relative distance δ<1R<1R. For any prime-field GRS code, [16] gives a list-decoding algorithm that uses GS as a subroutine, and has a piecewise distance-rate tradeoff due to its optimization over an integer parameter. (The algorithm works by putting equal weight on a range of alphabet symbols centered at the received symbol, optimizing over the range size for a given rate.)

By contrast with [12], and like [16], our Corollary 6.3 works for any GRS code, and for any (arbitrarily large) relative decoding distance δ>0, for sufficiently small R>0. Our rate-distance trade-off surpasses that of both [12, 16] for all δ0.78988, which corresponds to rates R0.21012; see Figure 1.

6.3 Unique Decoding for a Subclass of GRS Codes

As in Section 5.3, for the same subclass of GRS codes and certain parameters covered by our list-decoding algorithm, the decoding output is in fact unique. To show this, we give a lower bound on the 1 minimum distance of such codes, and then observe that our list-decoding algorithm can decode to beyond half this distance for all small enough rates.

Lemma 6.4 (adapted from [12, Theorem 4]).

Any prime-field GRS code GRSq,k(𝛂,𝛂)𝔽qn (whose twist factors 𝐭 equal the nonzero evaluation points 𝛂) of rate R=k/n has 1 minimum distance at least

(n+1)2k24k>1R24Rn.

Due to space constraints, the proof is left to the full version.

Lemma 6.4 gives a relationship between the code rate R and (a lower bound on) half the 1 minimum distance, for which decoding to that distance yields a unique solution. By taking the functional inverse of half this minimum-distance bound, we see that decoding to relative distance δ yields a unique solution as long as

R<Runiq(1)(δ):=4δ+(4δ)2+1,

which approaches 1/(8δ) as δ grows. This curve is shown in Figure 1. Observe that for any δ for which our list-decoding algorithm outperforms the unique decoder of [12] (or for which [12] does not apply), we have that Rwc,(1)(δ)>Runiq(1)(δ). In other words, we can efficiently list decode to relative distance δ for all rates up to Runiq(1)(δ) (and beyond), thus yielding a unique decoder for these parameters. Alternatively, as the rate R approaches zero, we can efficiently list decode to a multiple of the unique-decoding distance bound that approaches 8/(2e)1.4715.

6.4 Average-Case Decoding

We now consider average-case decoding under a memoryless additive (continuous or discrete) Laplacian channel, by specializing the material of Section 4.2 to p=1 and using our bound on fs(q) from Lemma 6.1. Consider a Laplacian channel of parameter r>0. Then by Equations 4.4 and 4.5, we can list-decode for any R less than

Rac,q,(1)(r)=sups>0Aq,r(1)(s)2>sups>0s2tanh(2/s)(r+s)2E(q/s), (6.5)

where the inequality is by Lemma 6.1.

Corollary 6.5 below is obtained by nearly maximizing the right-hand side of (6.5), at least for moderate or large values of r. Specifically, we use the bound tanh(2/s)<2/s to approximate the “main term” of (6.5) by 2s/(r+s)2. This is maximized at s=r, which makes the original main term equal to tanh(2/r)/4. Note that Rac,q,(1)(r) does indeed approach this value as r and q/r grow, because tanh(2/r) approaches 2/r, and E(q/r) rapidly approaches 1 (see Equation 6.2).

However, for small values of r, the expression in (6.5) is maximized for s significantly larger than r, to have value much larger than tanh(2/r)/4<1/4. This maximization can be computed numerically, and indeed, Rac,q,(1)(r) approaches 1 as r approaches 0; see Figure 1.

Corollary 6.5.

For any r>0, α(0,1), and prime q, the GS algorithm using weight vector given by fr list-decodes, in time poly(n,q,1/(Rac,q,(1)R)), any GRS code with adjusted rate

R<R~ac,q,(1)(r):=tanh(2/r)4E(q/r),

except with probability less than exp(nα2r(R~ac,q,(1)(r)R)2).

Proof.

By hypothesis, Lemmas 4.4, 6.1, and 4.4,

R<R~ac,q,(1)(r)=tanh(2/r)4E(q/r)<μr,r2fr(q)=Aq,r(1)(r)2.

The claim then follows directly by Theorem 4.5, and (for the probability bound) the fact that fr(q)>coth(2/r)>r/2 by Lemma 6.2.

References

  • [1] Wojciech Banaszczyk. New bounds in some transference theorems in the geometry of numbers. Mathematische Annalen, 296(4):625–635, 1993.
  • [2] Wojciech Banaszczyk. Inequalites for convex bodies and polar reciprocal lattices in n. Discrete & Computational Geometry, 13:217–231, 1995.
  • [3] Peter Elias. Zero error capacity under list decoding. IEEE Transactions on Information Theory, 34(5):1070–1074, September 1988. Originally appeared as Quarterly Progress Report, vol. 48, pp. 88-90, Research Laboratory of Electronics, MIT, January 1958. doi:10.1109/18.21233.
  • [4] N. D. Elkies, A. M. Odlyzko, and J. A. Rush. On the packing densities of superballs and other bodies. Inventiones mathematicae, 105:613–639, December 1991.
  • [5] Venkatesan Guruswami. List decoding of error correcting codes. PhD thesis, Massachusetts Institute of Technology, 2001. URL: http://dspace.mit.edu/handle/1721.1/8700.
  • [6] Venkatesan Guruswami, Atri Rudra, and Madhu Sudan. Essential coding theory, March 2019. URL: https://cse.buffalo.edu/faculty/atri/courses/coding-theory/book/web-coding-book.pdf.
  • [7] Venkatesan Guruswami and Madhu Sudan. Improved decoding of Reed-Solomon and algebraic-geometry codes. IEEE Trans. Inf. Theory, 45(6):1757–1767, 1999. Preliminary version in FOCS 1998. doi:10.1109/18.782097.
  • [8] Ralf Koetter and Alexander Vardy. Algebraic soft-decision decoding of Reed-Solomon codes. IEEE Trans. Inf. Theory, 49(11):2809–2825, 2003. doi:10.1109/TIT.2003.819332.
  • [9] Daniele Micciancio and Oded Regev. Worst-case to average-case reductions based on Gaussian measures. SIAM J. Comput., 37(1):267–302, 2007. Preliminary version in FOCS 2004. doi:10.1137/S0097539705447360.
  • [10] Ethan Mook and Chris Peikert. Lattice (list) decoding near Minkowski’s inequality. IEEE Trans. Inf. Theory, 68(2):863–870, 2022. doi:10.1109/TIT.2021.3126540.
  • [11] Irving S. Reed and Gustave Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300–304, 1960.
  • [12] Ron M. Roth and Paul H. Siegel. Lee-metric BCH codes and their application to constrained and partial-response channels. IEEE Trans. Inf. Theory, 40(4):1083–1096, 1994. doi:10.1109/18.335966.
  • [13] Jean-Pierre Serre. A Course in Arithmetic. Springer New York, NY, 1973. doi:10.1007/978-1-4684-9884-4.
  • [14] Madhu Sudan. Decoding of Reed Solomon codes beyond the error-correction bound. J. Complex., 13(1):180–193, 1997. doi:10.1006/JCOM.1997.0439.
  • [15] John M. Wozencraft. List decoding. Quarterly Progress Report, Research Laboratory of Electronics, MIT, 48:90–95, 1958.
  • [16] Xin-Wen Wu, Margreta Kuijper, and Parampalli Udaya. Lee-metric decoding of BCH and Reed–Solomon codes. Electronics Letters, 39(21):1522–1524, October 2003.