Abstract 1 Introduction 2 Preliminaries 3 Coset Complex and Code 4 Rate 5 Code Testability 6 Codes in Higher Dimensions References

New Codes on High Dimensional Expanders

Irit Dinur ORCID Weizmann Institute of Science, Rehovot, Israel Siqi Liu ORCID UC Berkeley, CA, USA Rachel Yun Zhang ORCID Massachusetts Institute of Technology, Cambridge, MA, USA
Abstract

We describe a new parameterized family of symmetric error-correcting codes with low-density parity-check matrices (LDPC).

Our codes can be described in two seemingly different ways. First, in relation to Reed-Muller codes: our codes are functions on a subset of the points in 𝔽n whose restrictions to a prescribed set of affine lines has low degree. Alternatively, they are Tanner codes on high dimensional expanders, where the coordinates of the codeword correspond to triangles of a 2-dimensional expander, such that around every edge the local view forms a Reed-Solomon codeword.

For some range of parameters our codes are provably locally testable, and their dimension is some fixed power of the block length. For another range of parameters our codes have distance and dimension that are both linear in the block length, but we do not know if they are locally testable. The codes also have the multiplication property: the coordinate-wise product of two codewords is a codeword in a related code.

The definition of the codes relies on the construction of a specific family of simplicial complexes which is a slight variant on the coset complexes of Kaufman and Oppenheim. We show a novel way to embed the triangles of these complexes into 𝔽n, with the property that links of edges embed as affine lines in 𝔽n.

We rely on this embedding to lower bound the rate of these codes in a way that avoids constraint-counting and thereby achieves non-trivial rate even when the local codes themselves have arbitrarily small rate, and in particular below 1/2.

Keywords and phrases:
error correcting codes, high dimensional expanders, multiplication property
Funding:
Irit Dinur: Supported by ERC grant 772839, and ISF grant 2073/21. Part of the work was done while visiting the Simons Institute for the Theory of Computing.
Siqi Liu: Supported in part by the Berkeley Haas Blockchain Initiative and a donation from the Ethereum Foundation.
Rachel Yun Zhang: This research was supported in part by DARPA under AgreementNo. HR00112020023, an NSF grant CNS-2154149, and NSF Graduate Research Fellowship 2141064.
Copyright and License:
[Uncaptioned image] © Irit Dinur, Siqi Liu, and Rachel Yun Zhang; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Error-correcting codes
; Theory of computation Expander graphs and randomness extractors
Related Version:
Full Version: https://arxiv.org/abs/2308.15563
Acknowledgements:
We wish to thank Gilles Zemor for pointing us to the work of [16]. We thank Amnon Ta-Shma for early comments on this work.
Editors:
Srikanth Srinivasan

1 Introduction

A locally testable code (LTC) is an error correcting code that has a property-tester. The tester reads q bits that are randomly chosen, and rejects words with probability that grows with their distance from the code.

LTCs were initially studied and constructed side by side with PCPs (probabilistically checkable proofs). It was only recently that the question of existence of LTCs with the c3 property was resolved in the affirmative [6, 31]. A code has the c3 property if it has constant rate, constant distance and testable with constant locality.

It was initially hoped that high dimensional expanders, a la [25, 24], can serve as the combinatorial structure underlying the code, and all one needs to do is to find an appropriate collection of local codes (at the links) to match up with the combinatorics, see [8, 7, 4].

This approach has not borne fruit up until now, essentially due to the stringent requirements on the local codes, which turned out difficult to fulfill. Nevertheless, c3 codes were eventually constructed by circumventing the problem and switching from simplicial to square complexes. The main benefit of square complexes is that their local views support tensor codes, which satisfy the requirements for local testability.

In this work we return to the simplicial setting and construct a new parameterized family of locally testable codes on simplicial (bounded-degree) high dimensional expanders. In addition to serving as a new and potentially interesting family of LDPC error-correcting codes, these codes satisfy further properties that could be potentially useful for other applications such as PCPs and beyond. In particular, the codes are symmetric in the sense of [22, 18], meaning that there is a group acting on the coordinates of the codeword, that takes every coordinate to every coordinate. In addition, the fact that local views of our codes are Reed-Solomon, immediately implies that the codes satisfy the multiplication property: the coordinate-wise product of two codewords is a codeword in a related code.

These two properties (symmetry and multiplicativity) do not hold for the known constructions of c3 LTCs due to their using random linear codes as local codes.

Our codes can be described in two seemingly different ways. First, in relation to Reed-Muller codes: our codes are functions on a subset X¯n𝔽n whose restrictions to a prescribed set of affine lines has low degree. Alternatively, they are Tanner codes on high dimensional expanders, where the coordinates of the codeword correspond to triangles of a 2-dimensional expander X, such that around every edge the local view forms a Reed-Solomon codeword. The definition of the codes relies on the construction of a specific family of simplicial complexes whose triangles embed naturally into 𝔽n, with the property that links of edges embed as affine lines111An affine line in 𝔽n is given by 𝐚0𝔽n and 𝐚1𝔽n{0} so that e={𝐚0+t𝐚1|t𝔽} in 𝔽n.

Theorem 1.

Let 𝔽=𝔽q be a fixed finite field. For every n divisible by 9 there exists a connected 2-dimensional 3-partite simplicial complex Xn, such that

  1. (a)

    For each vertex vXn(0), the link of v is a bipartite q-regular graph on 2q2 vertices whose normalized adjacency matrix has second largest eigenvalue λ2=1/q.

  2. (b)

    There is a set of points X¯n𝔽n and an injective map ι:Xn(2)𝔽n, such that X¯n=Im(ι)𝔽n, and

    |Xn(2)|=|X¯n|qcn

    for some absolute constant c>0.

  3. (c)

    There is a set n of affine lines in 𝔽n with one line per edge eXn(1). The line corresponding to an edge e is given by a bijection e:𝔽X+e such that ιe:𝔽𝔽n is an affine line in 𝔽n, where we denote X+e={tXn(2)|te}. Let n={ιe|eX(1)}.

  4. (d)

    Xn is 3-partite, so the edges have 3 distinct types and we denote by ni the set of lines corresponding to type-i edges.

Furthermore, Xn and the maps ι,{e}e are constructible in polynomial time in the size of the complex Xn.

The complexes {Xn} are a slight variant on the coset complexes of [19]. The embedding of Xn(2) into 𝔽n, so that the links of edges map into affine lines is new, and allows us to define a new family of codes on Xn,

Definition 2 (HDX local Reed-Solomon Codes).

Let 𝔽=𝔽q and {Xn}n be as above, and let d1,d2,d3<q be positive integers. We define a family of codes as n,

Cn,d1,d2,d3 ={f:X¯n𝔽|i=1,2,3,ni,fRS(q,di)} (1)

where RS(q,d)𝔽𝔽 is the Reed Solomon code of degree d over 𝔽. Note that we have three distinct degree parameters since the edges of the complex have three distinct types.

The code Cn,d1,d2,d3 can alternatively be described as

Cn,d1,d2,d3 ={f:Xn(2)𝔽|eX(1),f|X+eCe}. (2)

for an appropriate choice of codes Ce𝔽X+e such that Ce isomorphic to RS(q,di) whenever e is an edge of type i.

We also denote Cn,d=Cn,d,d,d. In the sequel we will often focus on Cn,d as it contains all of the ideas, but we wanted to include the slightly more general definition of Cn,d1,d2,d3.

The isomorphism between definitions (1) and (2) is essentially given by ι from Theorem 1. The map ι identifies X(2) and X¯n𝔽n, and further identifies the set of triangles containing an edge eX(1) with an affine line in 𝔽n (see also Section 3.4 and Claim 25 for a full proof).

By definition (1) we can see that every n-variate polynomial of total degree at most d=min(d1,d2,d3) gives rise to a codeword in Cn,d. This allows us to give a non-trivial lower bound on the dimension of Cn,d even when d is small and constraint-counting fails. On the other hand, definition (2) allows us to prove distance and local testability through the high dimensional expansion machinery.

Theorem 3.

Fix a prime power q and let d<q. The family {Cn,d}n has the following properties

  1. 1.

    Rate: The dimension of the code is at least the dimension of the Reed-Muller code with n/3 variables and degree d. Furthermore, for d>23q the code has dimension Ω(|Xn(2)|) (namely, linear rate as n).

  2. 2.

    Distance: If d<qΩ(q) the code has constant relative distance Ωq(1).

  3. 3.

    LDPC and Local Testability: For all d and n, the code is defined by parity checks of length d+2. When q is prime, if d<q/4 the code is locally testable with d+2 queries.

  4. 4.

    Multiplication: For any d1,d2, and for any w1Cn,d1 and w2Cn,d2, we have w1w2Cn,d1+d2 where w1w2 is the coordinate-wise product of w1 and w2.

  5. 5.

    Symmetric: There is a transitive group action on the coordinates of the code that preserves the code. Transitivity means that for every pair of coordinates t,t there is a group element that takes t to t.

 Remark 4.

Here is a table that summarizes the rate, relative distance, and local testability of {Cn,d}n in different regimes of d.

d Rate Distance Local testability
(0,q/4) (n/3d) Ωq(1) d+2
[q/4,23q) ?
[23q,qΩ(q)) Ω(|Xn(2)|)

We fail to give a family of c3 codes (with constant relative rate, constant relative distance, and local testability with constant number of queries). Specifically, in the low degree regime d<q/4, the lower bound on the relative rate is subconstant >(n/3d)/|Xn(2)|=qΩ(n). In the high degree regime d>2q/3, our approach for showing local testability fails, and we don’t know whether or not local testability occurs.

There is a natural way to generalize our complexes and codes to higher dimensions. In Section 6 we describe this generalization and show that distance as well as local testability “trickles down”. We also show that the codes have a homological description as the space of cycles on the top dimension.

1.1 Background

Recent works [6, 31] give locally testable codes that have the c3 property, namely they have constant relative rate, constant relative distance and locally testable with a constant number of queries. These codes are constructed on an especially designed squares complex. Similar schemes have been hypothesized to exist on simplicial complexes (see discussion in [6], and see [10]).

The simplicial complex codes are defined as

C={w𝔽qX(2)|w|X+eCe} (3)

where we fix, for each edge e, a local code Ce𝔽qX+e where X+e is the set of triangles containing the edge e. In other words, C is defined by aggregating the local codes Ce into one big parity check matrix, using the structure of the HDX. The definition is simple, but the challenge is in analysing properties of C such as rate, distance, or local testability.

  1. 1.

    Distance follows quite directly from the distance of each Ce together with expansion of X.

  2. 2.

    Local testability can be shown if we manage to show a local version of local testability, per each vertex v. Namely, we look at Cv:=C|X+v𝔽qX+v, the restriction of the code to the set X+v of triangles containing a vertex v. This itself is a linear subspace of 𝔽qX+v. We say that Cv itself is locally testable if a good test for z?Cv is to select a random ev and check if z|X+eCe.

    If each Cv is locally testable, then the high dimensional expansion of X would allow us to deduce local testability of the entire code C.

    This is the content of Theorem 41, whose proof is based on the same ideas underlying the proof of local testability in the the squares complex [6, 31] and also earlier the proof of cosystolic expansion in high dimensional expanders [17, 9], and is similar to [10, Proposition 6.4], although there are technical differences.222They show that if the links are coboundary expanders, then the global sheaf is a cosystolic expander. This is morally equivalent to saying that local robustness implies global local-testability. However, our proof makes use of a weaker local robustness condition, which is what we manage to prove for the local codes.

  3. 3.

    The dimension (or rate) of C could a priori be 0, so one needs to show that C is non-trivial somehow. So far the only successful method has been through constraint counting. One shows that the total number of parity checks is smaller than the number of degrees of freedom. Of course the constraints are often highly dependent so this argument is not tight, and one would like new techniques for lower bounding the dimension.

A major difficulty in construction of LTCs is in coming up with a collection of local codes Ce that simultaneously allow proving that each Cv is locally testable, and at the same time maintain non-trivial dimension for C. Indeed, several prior works gave a general framework for HDX codes but did not know how to instantiate them non-trivially. In [5], the authors gave a definition based on double samplers, and the proof of local testability relies on agreement testing as does ours. In [6], the c3 problem has been resolved via a similar approach, but where the underlying complex is a squares complex that has a simpler link structure, which, in turn, implies more flexibility in the choice of local codes. A similar framework is introduced and studied in [10], where codes are described as sections in sheaved high dimensional expanders. All of these works, including the current paper, follow one unifying local to global principle, and are descendants of the work of [17, 9] that showed coboundary expansion based on the same property in the links, plus some form of spectral expansion.

In this work we give a first family of codes on high dimensional expanders that are defined by instantiating (3). As mentioned before, our local edge codes Ce are Reed-Solomon codes. The codes Cv at each vertex turn out to be the following

CvCdx,dy={f:𝔽q3𝔽q|a,b,c,degx(f(x,b,c))dx;degy(f(a,y,ay+c))dy}. (4)

Quite surprisingly we have learned, after completing this work, that these (local) codes have already been studied in [16], and their rate and distance is known (for the case dx=dy). We compute the rate of these codes in the case dxdy for dx+dy<q/2 (see Theorem 36), and prove that Cdx,dy is agreement-testable when dx,dy<q/4 and when q is a prime (see Theorem 38). Namely, given two functions X,Y:𝔽q3𝔽q such that for all b,c𝔽q, X(x,b,c) is a low degree function of x; and such that for all a,c𝔽q, Y(a,y,ay+c) is a low degree function of y, if

a,b,c[X(a,b,c)Y(a,b,c)]<ϵ

then there is some fCdx,dy that is close to both X and Y. To show that this is a good test we adapt the analysis of Polyschuk and Spielman [32] of the local testability of bivariate polynomial functions. In order to make the analysis go through, we need a characterization of the codewords of the local code, namely that every codeword is described by a polynomial that is low degree in all three coordinate directions. This is given in Lemma 34. We remark that the local testability we obtain is slightly weaker: there is a quadratic, rather than linear, relation between the distance of the word to a code and the probability of failing the test. Nevertheless we derive local-testability of C from this slightly weaker local testability of the Cv’s.

Finally, as for the dimension of the code C. For degree parameters that allow local testability, constraint counting is out of the question, because there are more constraints than degrees of freedom. Instead we show that the code contains many codewords by showing that every low degree multi-variate polynomial is a valid codeword in of our code. We leave for future research to try and get tighter bounds on the rate of these codes.

Multiplication Property

A triple C1,C2,C3 of codes are so-called multiplication codes if for any w1C1 and w2C2 the word w1w2 obtained by coordinate-wise product satisfies w1w2C3. Codes with the multiplication property enable computation “under the hood”, namely, without decoding. Roughly speaking, the product of two codewords is an encoding of the product of the two messages. This can clearly be useful for secure computation. For example, if the two messages represent two subsets, then their product is the intersection of the sets, and we can compute set disjointness without decoding.

This property was introduced in [27] who used it towards a proof for IP=PSPACE involving abstract codes, not necessarily Reed-Muller. The original proof of [26, 34] relied on Reed-Muller codes which are indeed multiplication codes. The multiplication property is used for example in the proof of the PCP theorem [26, 2, 1], and could potentially be useful for constructing PCPs from LTCs.

Since our codes are defined as lifts of Reed-Solomon codes they are automatically multiplication codes for appropriate degree choices (a Tanner codes inherits the multiplication property from the local-view codes). Alternatively, they are also Tanner codes with Reed-Solomon codes as local codes. So these Tanner codes inherit their multiplication property from the local codes. By choosing the parameters appropriately, we can get C1,C3 an LDPC code with linear rate, and C2 a locally testable (LDPC) code with polynomial rate, such that this triple has the multiplication property (see Lemma 27).

Connection to Reed-Muller codes, lifted codes, and Tanner Codes

Recall that a Tanner code is given by two pieces of data:

  • A bipartite graph 𝒢=(,𝒫,E) with left vertices (for bits), right vertices 𝒫 (for parity checks), and edges E.

  • For each p𝒫 there is a local code Cp𝔽Γ(p) where Γ(p) are the neighbors of p.

Given 𝒢 and {Cp}, the Tanner code is

𝒯(𝒢,{Cp})={w𝔽|p𝒫,w|Γ(p)Cp}.

The Reed-Muller code RMn,d is the space of all n-variate polynomials of total degree at most d. For a broad set of parameters, Reed-Muller codes are themselves known to be Tanner codes, with local parity check codes being Reed Solomon codes, and with a =𝔽n and 𝒫 being the set of all affine lines in 𝔽n, so that the bipartite graph 𝒢RM is the point-vs.-line graph which connects a point in 𝔽n to all lines passing through it. For some parameter regimes (when the degree is high) such Tanner codes form a larger space than the Reed-Muller codes, as was discovered and studied in [14], where such codes are termed “lifted codes”. Later in [11], partial lifts are considered where some of the vertices of 𝒫 are erased and one considers the remaining punctured code. Clearly, different choices of where to puncture will have different properties. The codes studied in [11] are locally correctable and each point in 𝔽n is contained in poly(n) affine lines in 𝒫. This type of codes resembles the situation in our codes, as we explain next, but a key difference is that in our construction each point in 𝔽n only participates in constant number of parity checks.

The code Cn,d1,d2,d3, as per (1), can be viewed a Tanner code whose graph is a subgraph of 𝒢RM obtained by keeping a subset n of the lines, and a subset X¯n𝔽n of the points. We also allow different degree restrictions on different types of lines. If we set d=d1=d2=d3 the parity check matrix of Cn,d is thus a sub-matrix of the parity check matrix of the Reed-Muller code matrix. As for (2), it can be seen as a Tanner code on the bipartite graph 𝒢=(X(2),X(1),E) connecting each edge eX(1) on the right to the set X+eX(2) of triangles containing it, and putting a Reed-Solomon code on every X+e in a specified way.

Two-query testability

One can often convert a code that is testable with q queries to another code that is testable with two queries, while increasing the alphabet. This is done by converting each codeword to the list of its local views. For example, in the case of Reed-Muller codes, instead of representing a polynomial by its evaluation on points, we represent it by providing its restriction to each plane. This is sometimes called the planes table, and is two-query testable by the plane-vs.-plane test, see [33]. In the case of c3 LTCs of [6, 31], this conversion is immediate, although it is not described there explicitly. For our codes it also holds, and we include details in Claim 17. This property was also highlighted in [10].

1.2 Further Work

This work raises many questions for further investigation.

  • Better bounds. What is the exact rate of our codes? We give a lower bound based on Reed-Muller codes, but we don’t know how close this bound is to the truth. Are there parameter regimes where our codes are c3?

  • Efficient encoding. Our codes are specified by their parity-check matrices. A generator matrix can be computed in O(N3) time, where N is the block length. Given the generator matrix, computing an encoding takes time O(N2). Can encoding be done more efficiently, ideally even in linear-time?

  • Quantum codes. It has recently become clear that locally testable HDX codes are related to quantum LDPC codes of the CSS type. Our construction gives a 2-chain (from vertices to edges to triangles) which can automatically be viewed as a quantum LDPC code. Indeed, we show in Section 2.6 how any HDX code gives rise to a sheaf which, by [10, Section 7.4], gives rise to a quantum code. Our local testability analysis implies distance in one direction, whereas for a good quantum LDPC code, one needs to prove both distance as well as co-distance. This is an interesting challenge. Moreover, in Section 6 we show how our construction generalizes to k dimensions, which, by the same recipe as above, can be converted to a sheaf and a k+1-chain. Studying the resulting quantum codes at intermediate levels 0<i<k seems like an interesting direction.

  • Sparsified Grassmannian. We have constructed a collection of affine lines with the property that each point has exactly three lines passing through it. This is a very sparse collection of lines, that never-the-less expands quite well when we walk from point to line to point. How does this generalize to higher dimensional complexes? The question of finding sparse Grassmannians has been studied in [29, 3, 21, 13] and could provide new PCP gadgets with properties similar to [23].

  • Other coset complexes. We have considered a variant of the coset complexes of Kaufman and Oppenheim, by choosing subgroups based on a sub-ring of the ring chosen in [19]. Many high dimensional expanders are based on matrix groups [25, 24, 19, 30]. Thus similar embeddings of the complexes into 𝔽n could potentially be useful for coming up with new codes, whose properties can be studied through the high dimensional expansion framework.

  • Generalized sumcheck. Many interactive proof systems reduce proving certain relations to checking that a multi-variate polynomial p(x1,,xm) over 𝔽m sums up to zero over some subset m𝔽m. The sumcheck protocol is designed to check this relation by recursively restricting variables of p and checking the condition over smaller sets of random variables. Can one design sumcheck-like protocols for codes other than Reed-Muller? For example, in the case of our code, we would need to find subsets of points that play the role of Hm, so that one can test whether some partial sums of a codeword sum up to zero.

    Perhaps a first step would be to find a sumcheck protocol for our local codes Cdx,dy.

1.3 Organization

We define our coset complex in Section 3 and prove its properties (Theorem 1) in Sections 3.1, 3.2, and 3.3. We then define our code on this coset complex in Section 3.4. Its properties as detailed in Theorem 3 are proved in the following sections:

  • In Section 3.5 we show that our code has the multiplication property, and in Section 3.6, we show that our code has constant distance in appropriate parameter regimes (items 2 and 4 of Theorem 3). We also show that our code has the transivity property (item 5).

  • We discuss the rate (item 1) of the global and local codes in Sections 4.1 and 4.2 respectively.

  • In Sections 5.1 and 5.3, we prove that the local and global codes are agreement testable.

Finally, in Section 6 we describe the generalization to higher dimensions.

2 Preliminaries

2.1 Expander Graphs

A d-regular graph G is said to be a γ-one-sided expander if it has eigenvalues d=λ1λ2λnd which satisfy λiγd for all i>1.

Lemma 5 (Alon-Chung).

Let G=(V,E) be a d-regular γ one-sided expander. Let TV be such that the graph induced on T has average degree at least δd. Then |T|(δγ)|V|.

Proof.

Let A be the normalized adjacency matrix of G and let f be the indicator function of T. Using the spectral decomposition f=|T||V|𝟏+f we get

δd|T|2E(T)=fAf|T|2d/|V|+γd|T|

where E(T) denotes the number of edges in the induced graph on T. Dividing both sides by d|T| and rearranging gives the lemma.

2.2 High Dimensional Expanders

A pure k-dimensional simplicial complex X is a set system (or hypergraph) consisting of a set of vertices X(0) and an arbitrary collection of subsets of size k+1 together with all their subsets. The sets of size i+1 in X are denoted by X(i). We will sometimes omit set brackets and write for example uvwX(2) instead of {u,v,w}X(2). As convention X(1)={}. Unless it is otherwise stated, we always assume that X is finite.

Let i<k and sX(i). It is standard to define the link of s to be a ki1-dimensional simplicial complex defined by Xs={ts|tX,ts}. We also define the less-standard but useful notation

Definition 6 (Star).

For a k-dimensional complex X and a face sX(i) for some i<k, the star of s is the k-dimensional complex containing all faces that contain s.

X+s(j)={tX(j)|ts}.

For a face sX(i), there is a natural bijection X+s(j)Xs(ji1) mapping tX+s to tsXs.

Definition 7 (High dimensional link expander).

Let X be a k-dimensional simplicial complex. Let λ0. We say that X is a one-sided λ-link expander if for every sXk2, the graph (Xs(0),Xs(1)) is a λ-one sided spectral expansion.

Definition 8 (Coset complex).

A k-dimensional coset complex is given by a group G and subgroups K1,,Kk+1. The vertices are all cosets of the k+1 subgroups, and the i-faces are all i+1-tuples of cosets that have a non-empty intersection. The complex is denoted X[G;K1,,Kk+1].

The degree of a complex is defined as maxvX(0)|{eX(1)|ve}|, the maximum edge degree of a vertex. A beautiful construction of a constant-degree coset complex that is a high dimensional expander was given in [19], see also [15, 30].

It is not hard to see that links in a coset complex are themselves coset complexes.

2.3 Random Walks on High Dimensional Expanders

Let X be a regular two-dimensional complex, so that every vertex touches the same number edges, and every edge touches the same number of triangles. This assumption is not needed, but it is satisfied by our coset complexes and it slightly simplifies the definitions below.

Let V=X(0),E=X(1),X+=X(2), and also denote X(1)={ϕ}.

The down operator Di:X(i)X(i1) (for 0i2) and the up operator Ui:X(i)X(i+1) (for 1i1) are defined by

aX(i1),Dif(a) =𝔼ba[f(b)],
bX(i+1),Uig(b) =𝔼ab[g(a)].

We will often drop the superscript i in Ui and Di when it is clear what i is.

Let ee denote the lower random walk on the edges (choose a random e, then ve, then ev). It is easy to see that the Markov operator corresponding to this walk is just UD. Let ee denote the non-lazy upper random walk on the edges (choose a random e, then te, then et such that ee). It is not hard to see that the Markov operator corresponding to this walk, denoted M+, satisfies DU=23M++13I.

We define inner products on the spaces V,E by expectation according to the uniform distribution (here we are using the regularity assumption).

For example, for f,g:V

f,g=𝔼xV[f(x)g(x)],f=(f,f)1/2.

The following is by now well known, see [8, 20], and we include a proof for completeness.

Lemma 9.

Let X be a one-sided γ-link expander. Every gE satisfies

g,M+gg,(UD+γI)g.

In particular, for a set RX(1) such that β=ee[eR|eR] it must be that

ee[eR|eR]βγ

by applying the lemma on g=𝟏R and observing that tX(2),eet[e,eR]=β[R].

Proof.
g,M+g =𝔼abcX+g(ab)g(ac)
=𝔼a𝔼bcXa(1)[g(ab)g(ac)]
𝔼a𝔼b,cXa(0)[g(ab)g(ac)]+γ𝔼a𝔼bXa(0)[g(ab)2]
=g,UDg+γg2

where the inequality follows since Xa is a γ-one-sided expander for each a, and this means that for any f:Xa(0) (and in particular setting f(b):=g(ab)),

𝔼bcXa(1)f(b)f(c)𝔼b,cXa(0)f(b)f(c)+γ𝔼bXa(0)f(b)2.

Note that the distribution of choosing a and then two edges ab,ac independently, is equivalent to choosing a random edge e, then a random vertex ae and then another edge ea. The following swap walk S0,1 that starts from a random edge in X(1) and ends at some vertex in X(0) will be used in the analysis of local testability of the global code. Starting with a random edge e, choose a random triangle te and output v=te.

Lemma 10.

Let X be a two-dimensional γ-link-expander. The random walk M~=S0,1D on the edges has second largest eigenvalue bounded by 3γ.

Proof.

We claim that

M+UD=12S0,1D+12UD.

Let us analyze the random walk corresponding to the operator M+UD. We note that when viewed as an operator for functions in E,

M+UDf(e)=𝔼e1M+e[UDf(e1)]=𝔼eUDe1M+e[f(e)].

The underlying random walk from e to e is as follows. We start from an edge e, go (via M+) to a random edge e1 such that ee1X(2). We then go from e1, via UD, to an edge e. In this step two things can happen, each with probability 1/2. Either e doesn’t contain {v}=e1e, in which case we end up with the distribution of S0,1D; or ev, in which case we end up with UD. This proves the required equality. We can thus write S0,1D=2M+UDUD. Furthermore, by Lemma 9 we have that for any fE,

f,(2M+UDUD)ff,(2(UDUD+γUD)UD)f.

So for any fE with 𝔼[f]=0 we get,

f,S0,1Dff,(2UDUDUD)f+2γf2=f,U(2DUI)Df+2γf23γf2

where in the last inequality we have used the fact that 2DUI is nothing other than the random walk from vertex to vertex in the graph (X(0),X(1)), so by assumption on X, (2DUI)g,gγg2 for every function gV such that 𝔼[g]=0.

2.4 HDX Codes

An HDX code is defined by two objects

  • A k-dimensional simplicial complex X, and a dimension 0<ik.

  • A collection {Cs} of local codes Cs𝔽qX+s(k), one per face sX(ki).

The HDX code at dimension k is defined as

𝒞k[X,{Cs}]={f𝔽qX(k)|sX(ki),(f(sv):vXs(i1))Cs} (5)

When X is one dimensional these are the expander codes of [35].

Example 1: expander code

An expander code is given by an expander graph X=(V,E) and a local code Cv for every vertex vV such that Cv𝔽2{ev}. A word f𝔽2E is in the code if for every vertex v, the bits on the edges touching v form a codeword in a small code Cv. Formally, if (f(e):ev)Cv. Often the graph (V,E) is d-regular and the local codes are taken to all be copies of some C0𝔽2d.

This is a special case of (5) for dimension k=1, where X(0)=V and X(1)=E and Cv𝔽2Xv(0) via the identification Xv(0){ev}.

Example 2: cocycle codes

Fix some simplicial complex X, and some dimension 0<k<dim(X). Suppose for every sX(k1), the local code Cs is taken to be the parity code consisting of all even length words, namely the local code at sX(k1) is,

Zs={f𝔽2Xs(0)|vXs(0)f(sv)=0}.

Then the k dimensional HDX code C[X,{Zs}] coincides with the space of k-cocycles of X. For example, when k=1, the code is spanned by all closed walks.

2.5 Local-Testability and Agreement-Testability

For a function f:AB and a subset AA we denote by f|A:AB the restriction of f to A.

Let X be a k-dimensional simplicial complex and assume that for every vX(0) we are given a local code Cv𝔽qX+v. Let C=𝒞k[X,{Cv}]𝔽qX(k) be an HDX code.

Definition 11 (Locally testable code).

Let ρ:++ be a strictly increasing function with ρ(0)=0, and let ϵ>0. A code C𝔽qn is a (Q,ϵ,ρ())-locally testable code if there is a randomized tester that, upon receiving a given word f𝔽qn, queries f in at most Q locations and then accepts or rejects, such that if p=[Tester rejects f]ϵ then dist(f,C)ρ(p).

We specialize this definition to the case of HDX codes by considering a “cannonical” local tester that selects a random vertex v and checks if the restriction of the codeword to the star of v is in the local code Cv. Namely, if f|X+vCv. The number of queries made by this tester is equal to the maximal number of k-faces containing a vertex v.

Definition 12 (Local testability of HDX Codes).

Let ρ:++ be a strictly increasing function with ρ(0)=0, and let ϵ>0. The HDX code C=𝒞k[X,{Cv}] is (ϵ,ρ())-locally testable if for any f𝔽qX(k), denoting p=v[f|X+vCv], the following holds. If pϵ then

dist(f,C)ρ(p).
 Remark 13.

If an HDX code is locally testable as per Definition 12, then the number of queries made by the tester is at most maxv|X+v(k)| which is the maximal number of k-faces containing a vertex v. In a bounded-degree complex X this is a constant number. Moreover, if each Cv is an LDPC, we can reduce the query complexity further (without changing the code), as in the following Claim.

Claim 14.

Suppose each local code Cv is defined by at most m0 parity checks, each looking at most q0 bits. If an HDX code 𝒞k[X,{Cv}] is (ϵ,ρ())-locally testable per Definition 12, then it is (q0,ϵm0,ρ())-locally testable as per Definition 11, where ρ(x):=ρ(m0x).

Proof.

For any vertex v, f|X+vCv iff none of the m0 parity checks fail. By union bound, the fraction of vertices v for which at least one of the m0 parity checks fail is at most m0ϵm0=ϵ. This means that v[f|X+vCv]m0p<ϵ, and by the local testability per Definition 12 we deduce that dist(f,C)ρ(m0p).

Definition 15 (Agreement testability of HDX Codes).

Let ρ:++ be a strictly increasing function, and let ϵ>0. The HDX code 𝒞k[X,{Cv}] is called (ϵ,ρ())-agreement testable if, for any given collection {zvCv|vX(0)}, if

α:=uvX(1)[zv(X+uv(k))zu(X+uv(k))]<ϵ

then there exists some hC such that

vX(0)(zvh|X+v(k))ρ(α).
Claim 16 (Agreement testability implies local testability).

Let X be a k-dimensional simplicial complex, and assume that every vertex is contained in the same number of k faces. Let C=𝒞k[X,{Cv}] be an HDX code. If C is (ϵ0,ρ0())-agreement testable then it is (ϵ0/2,ρ1())-locally testable, where ρ1(p):=ρ0(2p)+p.

Proof.

Suppose f𝔽qX(k). Assume that p=v[f|X+vCv]. Let V={vX(0)|f|X+vCv}. Set zv={f|X+vvV0vV. Choose a random edge uvX(1). The probability that either u or v are not in V is at most 2p. In the remaining probability they surely agree, so the disagreement is at most α2p. If 2pϵ0, then by the assumption on agreement testability, there must be some codeword hC such that v[zvh|X+v]ρ0(α)ρ0(2p). The codeword h disagrees with f on some X+v either when vV or when zvh|X+v. This event is upper bounded by p+ρ0(2p)=ρ1(p). By assumption every vertex is contained in the same number of k-faces. So choosing a random vertex and then a random k-face containing it, is the same as choosing a random k-face. Therefore,

dist(f,C)dist(f,h)=sX(k)[f(s)h(s)]vX(0)[vV]+vX(0)[zvh|X+v]ρ1(p).

Since there has been some discussion of this in the literature [10], we spell out how any HDX code that is agreement testable automatically gives rise to the “local-view” code that is two-query testable. The idea is to move from a codeword fC to the collection {zv}vX(0) of local views where zv=f|X+v.

Claim 17 (Agreement testability implies 2-query LTCs).

Suppose C is an HDX code. Let Σ be a finite alphabet such that |Σ|=maxv|Cv|, and fix an injection σv:CvΣ for each vX(0). Define

LC={z:X(0)Σ|fC, s.t. z(v)=σv(f|X+v)vX(0)}. (6)

If C is agreement testable, then LC is locally testable with two queries. If C is (ϵ,ρ())-agreement testable, then LC is (2,ϵ,ρ())-locally testable per Definition 11, where ρ(p)=p+ρ(p).

Proof.

We only give a sketch. Given zLC, the tester will select a random edge uv and read z(u),z(v)Σ. It will interpret z(u),z(v) as words in Cu,Cv respectively. This is done by computing zv=σv1(z(v)) and similarly zu=σu1(z(u)). This inversion may fail since σv is an injection but not necessarily a bijection, in which case the tester rejects. Otherwise, the tester will accept iff zu|X+uv=zv|X+uv. The analysis follows from the definition of agreement testability: define z^v=σv1(z(v)) for each vX(0), where if the inversion fails we define z^v=0Cv, then the probability pϵ the tester fails is at least uvX(0)[z^u|X+uvz^v|X+uv]=:α, which by Definition 15 means that there is some xC such that vX(0)[z^vx|X+v(k)]ρ(α)ρ(p). Define y:vσv(x|X+v)LC, so that vX(0)[σv(z^v)y(v)]ρ(p).

Now note that

dist(z,LC) =minlcLCvX(0)[z(v)lc(v)]
vX(0)[z(v)y(v)]
vX(0)[z(v)σv(Cv)]+vX(0)[σv(z^v)y(v)]
p+ρ(p),

where in the last line we used that vX(0)[z(v)σv(Cv)]uvX(1)[z(u)Σu(Cv)z(v)Σv(Cv)]p.

2.6 Sheaves and HDX codes

Meshulam shows in [28] how to view the expander codes of [35] as a twisted homology of a graph with certain local coefficients. He also describes a higher dimensional generalization of systems of local coefficients attached to higher dimensional simlicial complexes. First and Kaufman [10] focus on the cohomological (as opposed to homological) variant and give a framework for studying codes as sheaves. The HDX codes we have defined can be placed in this framework, as we briefly explain next.

An 𝔽-sheaf X over a simplicial complex X is a collection of 𝔽-vector spaces F(x) for every xX, together with linear maps Rests:F(s)F(t) for every pair of faces st. These maps are called restriction maps, and are required to satisfy certain transitive consistency, see more details in [10].

A k-dimensional HDX code naturally gives rise to a sheaf as follows.

Definition 18.

Let 𝔽 be a field, let X be a k-dimensional simplicial complex, and let {Cs𝔽X+s}sX(k1) be a collection of 𝔽-linear local codes. Define a sheaf over X with respect to {Cs} to be

  • F(t)=𝔽 for every tX(k),

  • F(s)=Cs𝔽X+s(k) for every sX(k1),

  • F(r)={f𝔽X+r(k)|sX+r(k1),f|X+s(k)Cs} for every rX(i) and every i. In other words, F(r) is isomorphic to the HDX code defined over Xr with a local code appropriately isomorphic to Cs at a face sr,

  • Since X+sX+r whenever sr, we can define the restriction maps by actual restriction.

The coboundary operator from vertices to edges is δ:vX(0)F(v)eX(1)F(e) is defined by δf(e)=veResevf(v) (and the boundary operator is defined as the dual). For full definitions of the coboundary and boundary operators, see [10] (the coboundary operators are defined in Section 4 and the boundary operators in Section 7.4).

Claim 19.

Let 𝔽 be a field, let X be a k-dimensional simplicial complex, and let {Cs𝔽X+s}sX(k1) be a collection of 𝔽-linear local codes. Let Cs={f𝔽X+s|fCs} be the code dual to Cs, and let X be the sheaf over X with respect to {Cs}. Then, letting k:Ck(X,X)Ck1(X,X) denote the k-th boundary operator, Zk=Kerk is the HDX code 𝒞k[X,{Cs}]. ∎

Moreover, for the sheaf X with respect to {Cs}, the cocycle code Z0=Kerδ0 is the related “local-views” code defined in (6), given by replacing each codeword with its ensemble of local views (see the definition in (6) and the ensuing discussion). Agreement testability of our code is equivalent to cosystolic expansion of the sheaf X in dimension 0, as proven in [10, Proposition 7.6].

Finally, First and Kaufman describe in [10, Section 7.4] how a sheaf gives rise to a quantum CSS code.

3 Coset Complex and Code

3.1 The Triangle Complex

We describe a family of simplicial complexes {Xn}n. The construction is a variant of the coset complexes constructed by Kaufman and Oppenheim [19].

Let 𝔽=𝔽q be a fixed finite field. Let φ𝔽q[t] be a primitive (and irreducible) polynomial of degree n and let Rn=𝔽q[t]/φ𝔽qn (i.e. the ring of univariate polynomials of degree n1 where multiplication is done modulo φ). Further assume that we choose n so that 3qn1 (this is easy as long as q1mod3).

We define three matrix groups

K1={(1atct201bt001)M3(Rn)}, (7)
K2={(100ct21atbt01)M3(Rn)},
K3={(1at0010btct21)M3(Rn)}.

and let G=Gn be the group generated by K1,K2,K3. Clearly GnSL3(Rn). We will show that for ϕ and n as we specified above, it will hold that Gn=SL3(Rn).

We define three additional (smaller) subgroups,

H1=K2K3={(100010αt01)|α𝔽q}, (8)
H2=K1K3={(1αt0010001)|α𝔽q},
H3=K1K2={(10001αt001)|α𝔽q}.
Claim 20.

Each subgroup H1,H2,H3 is isomorphic to the abelian group (𝔽q,+) via the isomorphism αhi(α) for hi(α)Hi the matrix with αt in the appropriate location. ∎

The coset complex considered in this paper is

X=X[G;K1,K2,K3] (9)

as per Definition 8. The group G=Gn depends on the underlying ring R=Rn. When we let the size of the ring Rn grow by increasing the degree n, we get a family of complexes Xn as required.

By definition X is a 3-partite simplicial complex satisfying the following,

Claim 21.

Let G,K1,K2,K3,H1,H2,H3 be as above. Let X=X[G;K1,K2,K3]. Then

  1. 1.

    The vertices correspond to cosets of Ki: X(0)G/K1G/K2G/K3. Each vertex is contained in q3=|Ki| triangles.

  2. 2.

    The edges of X connect a vertex u=giKi to a vertex v=gjKj iff ij and the cosets intersect. In this case their intersection corresponds to a coset of Hk for ki,j. The elements of this coset are in 11 correspondence with the set Tuv of triangles containing the edge uv. There are exactly |Hk|=q such triangles.

  3. 3.

    X(2)G. Moreover, given three vertices u=g1K1, v=g2K2, w=g3K3, the triangle uvw belongs to X(2) iff the three cosets have a nonempty intersection g1K1g2K2g3K3={g}.

  4. 4.

    Assuming that φ is primitive and 3qn1, then G=𝖲𝖫3(Rn).

Proof.

The first three items are properties of coset complexes, as in [19]. We prove the last item:

Let eij(α) denote the matrix with 1’s along the diagonal and α in the (i,j)’th position. (Note that h1(α)=e12(αt),h2(α)=e23(αt), and h3(α)=e31(αt).) The following so-called Steinberg relations hold for all ijk,

[eij(α),ejk(β)]=eik(αβ),

where [g,h]=ghg1h1 denotes the commutator (this can be verified by direct calculation).

We claim that using h1(1)=e12(t), h2(1)=e23(t), and h3(1)=e31(t) it’s possible to generate all matrices eij(tβ) where (i,j){(1,2),(2,3),(3,1)} and β1mod 3, as well as where (i,j){(1,3),(2,1),(3,2)} and β2mod 3. We will prove this via induction. Suppose that for some B0 all eij(tβ) with βB,βjimod 3 are generatable. Then, we will show the claim for B+1. If B+1 is a multiple of 3, then there’s nothing to show. Otherwise, if B+1 is 1mod 3, then by assumption we have that eik(t2) and ekj(tB1) are both generatable where kijk2mod 3, so eij(tB+1)=[eik(t2),ekj(tB1)] is also generatable. If B+1 is 2mod 3, then we have that eik(t) and ekj(tB) are both generatable where kijk1mod 3 by inductive hypothesis, so eij(tB+1)=[eik(t),ekj(tB)] are also both generatable.

Now, since φ is primitive, it holds that the elements t,,tqn1 are all distinct and thus must cover all of Rn\{0}. If 3qn1, then the elements t3β+1 where 0β<qn1 are also distinct and range over Rn\{0}, as do the elements t3β+2. The reason is that if t generates the multiplicative group of Rn=𝔽q[t]/φ, whose order is qn1, then as long as 3qn1 then t3 also generates the group as well. In other words, from h1(1),h2(1),h3(1) we can generate all eij(γ) for any ij and γRn\{0} (that is, we can generate all elementary matrices).

To finish, it is well known that if we can generate all elementary matrices then we can generate all of SL3(Rn). This claim proves item (b) in Theorem 1. Item (a) is proven in the next subsection.

3.2 Structure of the Links

In this section we describe the links. That is, we understand the structure of X(Ki;Hi+1,Hi1). Without loss of generality we focus on the link G1=(K1/H2K1/H3,E1), a bipartite graph described in more details below.

Recall that the vertices of G1 corresponds to cosets of the form kH2 and kH3 while the edges correpsond to pairs of cosets {kH2,kH3} that have non-empty intersection k1H1k2H2. For any coset kH2 with representative k=(10ct201bt001), we use (,b,c) to denote the vertice in G1. Similarly for any coset kH3 with representation k=(1αtγt2010001) we use (α,,γ) to denote the corresponding vertice in G1. So an edge connects (,b,c) and (α,,γ) iff the following equation has a solution in 𝔽q2:

(1xtct201bt001)=(1αt(αy+γ)t201yt001). (10)

This system of equations is solvable iff c=αb+γ. Therefore we can deduce the following statement about the degree of G1.

Claim 22.

Every vertex in G1 has degree q.

Furthermore let A be the normalized adjacency matrix of G1. By the edge characterization Equation 10, we derive the following spectral gap for A.

Claim 23.

|λ2(A)|=1q.

Proof.

Let B be the |K1/H2|×|K1/H3| unnormalized biadjacency matrix of G1 such that

A=(𝟎1qB1qBT𝟎).

Then the matrix BBT is the |K1/H2|×|K1/H2| matrix whose value in its ((,b,c),(,b,c))-th entry is the number of 2-step walks from (,b,c) to (,b,c)). Equivalently, by Equation 10, the value is also the number of solutions to the equations c=xb+y and c=xb+y over 𝔽q2. Therefore

BBT[(,b,c),(,b,c)]={1when bbqwhen b=bc=c0o.w.

We can thus explicitly write BBT=(JqIq)Jq+qIqIq where Jqq×q is the all-ones matrix.

So its top eigenvalue satisfies λ1(BBT)=q2 and the second-largest eigenvalue is λ2(BBT)=q. From this we can deduced that |λ2(A)|=1q.

3.3 Embedding the Complex into a Vector Space

Recall that there is a natural isomorphism between the group G and the triangles of the complex, X(2). We describe a natural way to biject G to a set of points S𝔽q9n,

X(2)GιSR9𝔽q9n.

Every gG is a 3×3 matrix (rij)1i,j3 such that rijR for each i,j. An element in R is a univariate polynomial of degree at most n1 so it is specified by n coefficients rij(t)==0n1rij()t. We simply map each of the nine matrix entries into a vector of coefficients in 𝔽qn and concatenate them all:

(rij)ι(r11(0),r11(1),r11(n1),r12(0),r12(1),r12(n1),,r33(0),r33(1),r33(n1)) (11)

This embedding is clearly injective and it is linear in the coefficients of the matrix entries, namely ι(αg+βg)=αι(g)+βι(g) for any α,β𝔽q and g,gG.

Claim 24.

For gG and i=1,2,3 let g,i:𝔽qgHi be defined by g,i(α)=ghi(a). Then ιg,i:𝔽q𝔽qn is an affine line that can be written as

ι(g,i(α))=v0+αvi

where v0=ι(g) and for g=(|||g1g2g3|||), we have

v1=ι((|00tg300|00));v2=ι((0|00tg100|0));v3=ι((00|00tg200|)).

Here the entries in tgi are taken modulo φ(t), and vi0 for i=1,2,3.

Moreover, for any ggHi, the line ιg,i is a re-parameterization of ιg,i, satisfying

ιg,i(α)=ιg,i(α+α)

for α𝔽 such that g=ghi(α).

Proof.

Fix gG. We prove the first part for i=1, the other cases are similar. The matrix gh1(α) is obtained from g by adding αt times the third column of g to the first column of g, namely,

gh1(α)=(g1,g2,g3)+(αtg3,0,0)

Since the embedding ι is linear in the coefficients, we get that

ι(gh1(α))=ι((g1,g2,g3))+ι((αtg3,0,0))=v0+αv1

as in the claim.

Regarding the moreover part, by definition g,i(α)=ghi(α)=ghi(α)hi(α)=ghi(α+α)=g,i(α+α). Since the expression is linear in α, and since ι is additive, ιg,i is clearly the same affine line as ιg,i, reparameterized. We define, for i=1,2,3,

ni={ιg,i:𝔽q𝔽qn|gG}, (12)

and n=n1n2n3. This establishes item (c) in Theorem 1.

3.4 The Global Code

Let RS(q,d) be the Reed Solomon code of degree d over 𝔽q. Namely, RS(q,d) is the set of length-q tuples (p(α):α𝔽q) where p is a univariate polynomial of degree at most d.

For any three parameters 0d1,d2,d3q, we defined the code Cd1,d2,d3 in two ways, see Definition 2. First, we defined it as a (punctured) lifted Reed-Solomon code,

C1=Cn,d1,d2,d3 ={f:X¯n𝔽q|i=1,2,3,ni,fRS(q,di)} (1)

where ni is the set of affine lines defined in (12), and then we defined it as an HDX code,

C2=Cn,d1,d2,d3 ={f:Xn(2)𝔽q|eX(1),f|X+eCe} (2 )

where Ce is isomorphic to RS(q,di) for edges of type i. We now define Ce more explicitly. Recall from Claim 21 (item 2) that every edge e of type i corresponds to a coset gHi, in the sense that the triangles containing e, which we denote X+e, correspond to the elements of gHi. Choose, for each edge, one group element g to be a coset representative. We write X+e={ghi(α)}α𝔽 and define

Ce={f𝔽qX+e|f(ghi(α)) is a degree di function of α}. (13)

The definition of Ce appears to depend on the choice of coset representative but it does not, because the degree of f(ghi(α)) as a function of α is the same as the degree of f(ghi(α+α))=f(ghi(α)) as a function of α.

Claim 25.

The codes C1,C2 are isomorphic, and ι:GX¯n gives the isomorphism.

Proof.

Fix fC1. We define f~:G𝔽q by f~=fι, and show f~C2. We need to check that f~|X+eCe for each e. By definition this is true iff (f~(ghi(α)))αRS(q,di) for e a type i edge. But f~(ghi())=f~g,i=fιg,iRS(q,di) where the last inclusion is because fC1 and ιg,ini.

The other direction is easy as well. Given f~C2, we let f=f~ι1:X¯n𝔽q the unique function such that fι=f~. To check that fC1 consider any line =ιg,in and observe that f=f(ιg,i)=f~g,iRS(q,di).

Since there is a 11 correspondence between X(2) and G, we may also write codewords of C2 as f:G𝔽q. A group always acts transitively on itself by left multiplication. Moreover,

Claim 26.

If w:G𝔽q is a codeword, then wg is a codeword, where wg(g)=w(gg)

Proof.

This is clear since for any gG and any i=1,2,3, (wg(ghi(α))α=(w(gghi(α))αRSqdi.

This establishes the transitivity of the code, as claimed in last item of Theorem 3. It implies that the restriction of our code to Ki is isomorphic to the restriction of our code to any coset gKi. To study the local view of the code at a link of a vertex it suffices to study its restriction to Ki for i=1,2,3.

3.5 Multiplication Property

It is immediate that the codes have the multiplication property by inheritance from the Reed-Solomon code. Recall that for w,w𝔽qN we define w′′=ww𝔽qN by coordinate-wise product: w′′[i]=w[i]w[i].

Lemma 27.

Suppose C=Cn,d1,d2,d3 and C=Cn,d1,d2,d3, then for every wC and wC, we have wwC′′=Cn,d1+d1,d2+d2,d3+d3.

Proof.

For every eX(1) the local views of w|X+e,w|X+e are Reed-Solomon codewords of degrees di,di (relying on the definition in (2)). So the coordinate-wise product, w′′|X+e, is a Reed-Solomon codeword of degree at most di+di, as needed.

3.6 Distance

The global code C can also easily be shown to have constant relative distance,

Lemma 28 (Distance).

If the relative distance of Ce is at least δ>0 for every eX(1) then C has relative distance at least (δ2γ)(δγ)δ.

Proof.

Let 0xC. Let V={vX(0)|x|X+v0} and let vV. We will first show that at least (δγ) of the edges touching v are non-zero. (An edge e is non zero iff x(e)0.)

Let Av={uXv(0)|x|X+uv0}. Each uAv has 0x|X+uvCuv so x|X+uv must have at least δ fraction of non-zero entries. Each of these non-zero entries corresponds to some vertex w such that uvwX+ with x(uvw)0 so wAv. We found that for each uAvXv(0), at least δ of its neighbors (inside Xv) are in Av. Since the graph Xv is a γ-expander, the Alon-Chung lemma (Lemma 5) implies that |Av|(δγ)|Xv(0)|.

Observe that every uAv must itself be in V, so each vV has at least δγ fraction of its neighbors in V. We can again apply Lemma 5, (now using the fact that the graph (X(0),X(1)) is a γ-expander, to deduce |V|(δ2γ)|X(0)|. We have seen that each vV has at least δγ fraction of non-zero edges touching it, so the total fraction of non-zero edges is at least (δ2γ)(δγ). Each such edge touches at least δ non-zero triangles, so the total fraction of non-zero triangles is at least (δ2γ)(δγ)δ as claimed. This along with the fact that q is constant and γ=1/q from Claim 23 establishes the second item in Theorem 3.

We now prove a generalization of the above distance lemma to higher dimensional HDX codes.

Lemma 29 (Distance of k-Dimensional HDX Code).

Let X be a k-dimensional γ-one-sided local expander, and assume that for every tX(k1) we have a code Ct{f:X+t(k)𝔽q} with minimum relative distance δ>0. Define, for every 1ik2 and every face tX(i), the code

Ct={f:X+t(k)𝔽q|f|X+tCttX+t(k1)}.

Then for every 1i<k2 and every tX(i) the code Ct has relative distance j=0k1i(δjγ). In particular, the code C=C on the entire complex has relative distance j=0k(δjγ).

Before giving the proof we remark that the distance in the lemma decays exponentially with the dimension, assuming that γδ. This is necessary, as can be seen from the example the (k+1)-dimensional tensor code Ck+1, for C any code with distance δ. This code has distance δk+1 and it can be viewed as an HDX code on the complete k+1-partite k-dimensional complex333There is a natural identification of [n]k+1 with the faces of this complex. The link of every k1 face is identified with a row in the appropriate direction..

Proof.

Let 0xC. For all 1ik1 and sX(i) we define As={uXs(0)|x|X+(s{u})0}. We claim that if x|X+s0, then |As|(δ(k1i)γ)|Xs(0)|, and furthermore that Cs has relative distance j=0k1i(δjγ).

To see this, we proceed by (downwards) induction on i. This is clearly the case for i=k1. Now for i<k1, let tX(i). For vXt(0) such that t{v}At, we have that each uAt{v} has 0x|X+(t{u,v})Ct{u,v} so t{u}At as well. By the inductive hypothesis, we have that |At{v}|(δ(k2i)γ)|Xt{v}(0)| for all vVt. Since Xt is a γ-expander, the Alon-Chung lemma (Lemma 5) implies that |At|(δ(k2i)γγ)|Xt(0)|=(δ(k1i)γ)|Xt(0)|.

Now, to compute the distance, we have that each tAv has a δ(k1i)γ fraction of non-zero (i+1)-faces touching it, each of which has a j=0k2i(δjγ) fraction of k-faces touching it, so the total fraction of non-zero k-faces in X+t is at least j=0k1i(δjγ).

3.7 Local Code at a Vertex

For each vX(0), let

Cv={f𝔽qX+v|ev,f|X+eCe}. (14)

It is easy to see that our code can be written as

C={f𝔽X(2)|v,f|X+vCv}

because we are simply aggregating the constraints differently than in (2).

What does Cv look like when moving to X¯n?

Lemma 30.

Fix vX(0) a vertex of type i.

  • ι(X+v) is a 3 dimensional affine subspace in X¯n.

  • The code Cv𝔽qX+v is isomorphic to Cdx,dy for dx=di+1, dy=di1, where we define Cdx,dy by

    Cdx,dy={f:𝔽q3𝔽q|a,b,c,degx(f(x,b,c))dx;degy(f(a,y,ay+c))dy}. (15)
Proof.

Fix first v=gK1=K1 given by g=id. The elements of ι(K1) are

{ι((1atct201bt001))|a,b,c𝔽q}

and when we range over all possible choices of a,b,c we get an 𝔽q-linear subspace. If v=gK1 for some gK1, every element becomes

(|||g1g2g3|||)(1atct201bt001)=(|||g1atg1+g2ct2g1+btg2+g3|||)

which after embedding into 𝔽qn becomes ι(g1,g2,g3)+aι(0,tg1,0)+bι(0,0,tg2)+cι(0,0,t2g1). This is a 3 dimensional affine subspace. A similar proof applies to vertices of type 2,3.

For the second item, we focus again on v=K1. The code Cv consists of all f𝔽qX+v that, for each ev, satisfy f|X+eCe. We identify functions on X+v with functions on 𝔽q3 through the isomorphism 𝔽q3K1 given by (a,b,c)(1atct201bt001). An edge ev corresponds to a coset of H2 or H3 in K1, say gH2, given by a coset representative g=(10ct201bt001). The group elements of the coset are gh2(x) for all x𝔽q,

{(10ct201bt001)(1xt0010001)=(1xtct201bt001)|x𝔽q},

each corresponding to an triangle of X+e. So, by definition of Ce (see (13)) the constraint f|X+eCe translates to f(gh2(x)) having degree d2 in x. In other words, f(x,b,c) must have degree at most d2 in x.

Similarly, suppose the edge e is gH3 for g=(1atct2010001). The group elements of the coset are gh3(y) for all y𝔽q, where

{(1atct2010001)(10001yt001)=(1at(ay+c)t201yt001)|y𝔽q}.

The constraint f|X+eCe translates to requiring that f(a,y,ay+c) have degree is at most d3 in y.

It is now clear that Cv is isomorphic to Cd2,d3 when v=K1. The same also holds for any v=gK1 since by Claim 26 the code is invariant under the action of G. This implies that CvCv, for any v,v of the same color (since the group action moves any Ki coset to any other Ki coset, it is thus transitive on each color class).

To complete the proof we check that for v=K2 we have CvCd3,d1 and for v=K3 CvCd1,d2.

Let us start with v=K2. Our requirement is that f:K2𝔽q evaluates to a degree d1 polynomial on cosets of H1<K2 and a degree d3 polynomial on cosets of H3<K2. Fix some g=(100ct21atbt01)K2. For any element gx=(10001xt001)H3, and gy=(100010yt01)H1, we have

ggx=(100ct21(a+x)tbt01),ggy=(100(c+ay)t21at(b+y)t01).

Writing now f(a,b,c)=f((100ct21atbt01)), we require that for all a,b,c,

  • f(a+x,b,c) must have degree d3 in x; and

  • f(a,b+y,c+ay) must have degree at most d1 in y.

This is clearly equivalent to requiring

  • f(x,b,c) must have degree d3 in x for all b,c; and

  • f(a,y,c+ay) must have degree at most d1 in y for all a,c.

Namely, it is equivalent to requiring fCd3,d1

Finally, for v=K3, we fix some g=(1at0010btct21)K3. For any element gx=(100010xt01)H1, and gy=(1yt0010001)H2, we have

ggx=(1at0010(b+x)tct21),ggy=(1(y+a)t0010bt(yb+c)t21).

Writing now f(a,b,c)=f((1at0010btct21)), we require that for all a,b,c,

  • f(a,b+x,c) must have degree d1 in x; and

  • f(a+y,b,c+by) must have degree at most d2 in y.

This is clearly equivalent to requiring that fCd1,d2.

Moving to cosets gKi, by the fact that the code is invariant under the group action, for any coset gKi, the local code is isomorphic to Cdi+1,di1

4 Rate

4.1 Rate of Global Code

We analyze the rate of our code in two regimes. The first, is when the relative rate of the local codes Ce is at least 2/3. In this case a standard constraint counting implies that the global code has constant relative rate.

Lemma 31.

Suppose dim(Ce)>(2/3+ϵ)q for each eX(1). Then dim(C)>3ϵ.

The second parameter regime is when the relative rate of the local codes Ce is arbitrarily small. In this case we give a non-trivial lower bound by demonstrating a collection of linearly independent codewords. These are

C={f|S|f:𝔽q9m𝔽q has degree d}

where d=min(d1,d2,d3) and S𝔽q9m is the image of G when embedded into the vector space, see (11).

Lemma 32.

dim(C)dim(RMd3m). If we choose |𝔽q|poly(m) we get polynomial rate.

Proof.

Consider the upper triangular matrices with 1 on the diagonal. This set of matrices belongs to S and is isomorphic to 𝔽3m, so any polynomial in 9m variables that depends only on these 3m variables and has total degree at most d gives rise to a distinct codeword in C. An alternative way to bound the rate is as follows. If the ring R is a field (which by Claim 21 is true whenever φ is a primitive polynomial), then the fraction of matrices with determinant 0 is about 1/|R|, which is tiny. Moreover, let Sr={mM3×3(R)|det(m)=r}. Since every 0rR has an inverse, there is a bijection between Sr1 and Sr2 for every r1,r20, given by m1m2=(r2r1100010001)m1. In particular, our set S=S1 bijects to each Sr. Each Sr for r0 supports a copy of our complex, obtained by a linear transformation of the entire vector space which moves shifts the lines. We leave open the question of obtaining better bounds on the global rate.

4.2 Rate of Local Code at a Vertex

In this section we analyse the rate of the code Cq,dx,dy{f:𝔽q3𝔽q} defined in (15). In this section, we will only consider the case where q is a prime p. We remark that a proof of the rate in the case of dx=dy was given in an earlier work [16]. However, in our case, we will need a stronger statement about the nonexistence of monomials above a certain degree in order to prove local testability in Section 5.1, which is given in Lemma 34.

Since f(x,b,c) lies on a degree dx polynomial in x for any b,c, we can write f(x,y,z) as a polynomial that is degree dx in x and degree p1 in y and z:

f(x,y,z)=0idx0j,kp1cijkxiyjzk.

Plugging in (x,y,xy+z), we get that

f(x,y,xy+z)=0idx0j,kp1cijkxiyj(xy+z)k.

We can reduce f(x,y,xy+z) modulo ypy to get a polynomial g(x,y,z) that is degree p1 in y and z, and dx+p1 in x. We let gj(x,z) and gjk(x) denote the coefficient of yj and yjzk respectively, so that

g(x,y,z)=0jp1gj(x,z)yj=0j,kp1gjk(x)yjzk.

Note that we can also write

gj(x,z) ={hj(x,z)j=0hj(x,z)+hj+p1(x,z)j0
gjk(x) ={hj,k(x)j=0hj,k(x)+hj+p1,k(x)j0.

The condition that f(a,y,ay+c) lies on a degree dy polynomial in y for all a and c means that for all dy<jp1, it holds that gj(x,z)=0 for all x,z. In particular, for any 0kp1, it must hold that gjk(x)=0 for all x. So, if we know that gjk(x) is degree p1 in x, then in fact all coefficients in gjk(x) must be 0. We will use this fact extensively in the analysis of the rate.

First, we will need the following lemma.

Lemma 33 ([12]).

Assuming that rkm<p, the following matrix has full rank in 𝔽p:

((mk)(mk1)(mkr)(m1k)(m1k1)(m1kr)(mrk)(mrk1)(mrkr))
Proof.

We divide the entries of row i by (mi)! (where the top row is row 0), and multiply the entries of column j by (mk+j)!(kj)! (where the leftmost column is column 0). Since p>m and mkri,j, both (mi)! and (mk+j)!(kj)! are nonzero in 𝔽p. We obtain that the rank of the above matrix is equivalent to the rank of the below matrix:

(1111h1(mk)h1(mk+1)h1(mk+2)h1(mk+r)h2(mk)h2(mk+1)h2(mk+2)h2(mk+r)hr(mk)hr(mk+1)hr(mk+2)hr(mk+r)),

where hi(x)=x(x1)(x2)(xi+1) is a degree i polynomial.

This matrix is invertible. To see this, let αj=mkj. The above matrix is rewritten as

(1111h1(α0)h1(α1)h1(α2)h1(αr)h2(α0)h2(α1)h2(α2)h2(αr)hr(α0)hr(α1)hr(α2)hr(αr)).

Using the smaller degree monomials in the rows above it, the polynomial hi in each row can iteratively be made into simply the monomial xi. In particular, the rank of the above matrix is the same as the rank of the Vandermonde matrix, which has full rank.

Lemma 34.

Assuming that pdx+dy+2, then for j+k>dx+dy and for all 0idx, it holds that cijk=0. In other words, the combined degree of y and z in f(x,y,z) is at most dx+dy.

Proof.

Recall that we’ve written

f(x,y,z)=0idxcijkxiyjzk.

We will backwards induct on the value of j+k to show that cijk=0i[0,dx]. The base case is j+k=2p1. In this case, there are no terms with j+k=2p1, so cijk=0 for all i[0,dx].

Now, assume that for s>dx+dy and j+k>s, it holds that cijk=0idx. We will prove that for all j+k=s, it holds that cijk=0 for i[0,dx].

To begin, we express the polynomial gjk(x) in terms of the nonzero coefficients cijk.

Claim 35.

Assume that cijk=0 for j+k>s and i[0,dx]. Then for j+k=s, it holds that

gjk(x)=0idxkkp1ci,sk,k(kk)xkk+i.
Proof.

Consider a term cijkxiyj(xy+z)k that appears in f(x,y,xy+z). This expands to

cijk0idx0kk(kk)xkk+iyj+kkzk.

Thus, the coefficient cijk appears in the expression of gjk(x) when (j+kkmodp)=j. Note that all the sum of the exponents of y and z in every term of the above expression is (j+kk)+k=j+k. Then, the coefficients that appear in the expression of gjk(x) must satisfy the property that j+k{s,s+p1}. Since we’ve assumed that cijk=0 for j+k>s, it holds that the cijk that appear are precisely those where j+k=s. We thus obtain the formula in the claim.

We now continue to prove the inductive step for j+k=s. We split into three cases.

Case 1: 𝒔>𝒑𝟏+𝒅𝒚.

In this case, we proceed by backwards induction on the value of k to show that cijk=0. For k>p1, there are no such terms, so ci,sk,k=0 for all i[0,dx].

Now assume the inductive hypothesis that for all k>k it holds that ci,sk,k=0. Using Claim 35, we have that

gsk,k(x) =0idxkkp1ci,sk,k(kk)xkk+i
=0idxci,sk,kxi,

where in the second line we’ve used the inductive hypothesis. But sks(p1)>dy, so gsk,k(x)=0x𝔽p. Since gsk,k(x) is a polynomial of degree dx<p, this implies that all the coefficients ci,sk,k are equal to 0. This completes the inductive step.

Case 2: 𝒑𝟏<𝒔𝒑𝟏+𝒅𝒚.

In this case, to prove that ci,sk,k=0 for all i[0,dx] and k[0,p1], our strategy is to find independent linear relations between these coefficients. It will be convenient to view a matrix which represents the data given by Claim 35, as follows:

The rows correspond to the coefficients ci,sk,k for k=p1,,sp+1 as labeled on the left. Each row actually corresponds to dx+1 coefficients, for i[0,dx], but for sake of space we condense these into a single row.

The columns correspond to gd,sd(x) for d[dy+1,p1]. Because d>dy, these gd,sd(x) all should be 0. The way to read off the value of gd,sd(x) is by looking at the corresponding column, and summing the product of each entry with the corresponding row coefficient ci,sk,k, remembering that each row actually corresponds to dx+1 rows for i[0,dx]. The entries of the matrix were chosen according to the expansion of gd,sd(x) given in Claim 35.

Now, to prove that ci,sk,k=0 for all i[0,dx] and k[0,p1], we will backwards induct on the value of i+k, which we will denote by t. So assume that for i+k>t that ci,sk,k=0. This is true for t=dx+(p1), since there do not exist coefficients with larger values of i+k (as idx and kp1).

Consider all the coefficients with i+k=t, meaning coefficients ci,st+i,ti where i[0,dx] and ti,st+i[0,p1]. These coefficients correspond to dx+1 consecutive rows of the matrix with increasing values of i. Note that in each column, these coefficients only contribute to a single monomial xi: namely, in the column for gd,sd, the exponents of x that appear for i+k=t are all equal to i=i+(k(sd))=ts+d. Thus, for any gd,sd(x) which has degree <p, we obtain a linear constraint on ci,st+i,ti by restricting the corresponding column corresponding to the rows corresponding to the coefficients (we may also drop the powers of x, so that the coefficients of the linear constraint are simply binomial coefficients). For instance, for t=p, the column gd,sd gives us the constraint

(c1,sp+1,p1c2,sp+2,p2cdx,sp+dx,pdx)((p1sd)(p2sd)(pdxsd))=0.

Our strategy may therefore be summarized as follows: for each tp1+dx, assuming that there are m coefficients of the form ci,st+i,ti, we will find m consecutive columns such that the degree of x in gd,sd(x) is <p. These columns will also satisfy that if we restrict the matrix to these columns and to the rows corresponding to the coefficients, the resulting m×m matrix has diagonal that lies above the lower left triangle of 0’s. Then, by Lemma 33, we know these m linear constraints are independent, telling us that ci,st+i,ti=0 for each i.

The columns will be chosen as follows.

  • For tp1, we are considering the m=pt+dx coefficients

    ctp+1,sp+1,p1,ctp+2,sp+2,p2,,cdx,st+dx,tdx.

    We pick the first pt+dx columns of the matrix, corresponding to gdy+1,sdy1, gdy+2,sdy2, , gp+dx+dyt,s+tpdxdy. Each of gd,sd(x) for d[dy+1,p+dx+dyt] is a polynomial in x. Using the inductive hypothesis that ci,sk,k=0 for i+k>t, we see that the maximum degree of x in any of these polynomials is i+kmin(sd)=t(s+tpdxdy)=p+dx+dys<dx+dy<p.

  • For sdy1t<p1, we are looking at the dx+1 coefficients

    c0,st,t,c1,st+1,t1,,cdx,st+dx,tdx.

    Consider the dx+1 columns corresponding to gdy+1,sdy1,,gdx+dy+1,sdxdy1. Because tsdy1, the submatrix has nonzero diagonal. The largest exponent of x in any of these columns is i+k(sdxdy1)ts+dx+dy1<dx+dy1<p.

  • For t<sdy1, let d^=min(dx,p1s+t). We consider the d^+1 coefficients

    c0,st,t,c1,st+1,t1,,cd^,p1,td^.

    Look at the d^+1 columns corresponding to gst,t,,gst+d^,td^. This matrix has nonzero diagonal. The largest degree of x in any of these columns is i+k(td^)d^<p.

Case 3: 𝒅𝒙+𝒅𝒚<𝒔𝒑𝟏.

Similar to the previous case, we consider the following matrix, interpreted the same way as before.

As in the previous case, we will downwards induct on the value of i+k=t. Assume that for i+k>t it holds that ci,sk,k=0. This is true for t=dx+s (the maximal possible value of i+k) since there do not exist coefficients with larger values of i+k.

We again consider all m coefficients satisfying i+k=t. These occupy a number of consecutive rows of the matrix. Furthermore, for any column with degree in x less than p, restricting that column to those rows gives a linear constraint on the coefficients. Thus, we again demonstrate, for each t, m consecutive columns that each have exponent in x less than p. Restricting these columns to the coefficient rows will result in a m×m submatrix lying above the lower left triangle of 0’s, which is full rank by Lemma 33, implying that all m coefficients must in fact be 0.

  • For ts, we are interested in the st+dx+1 coefficients

    cts,0,s,cts+1,1,s1,,cdx,st+dx,tdx.

    Look at the first st+dx+1 columns, which correspond to gd,sd for d[dy+1,st+dx+dy+1]. The maximum degree of any of these polynomials is i+k(tdxdy1)=dx+dy+1<p.

  • For sdy1t<s, we consider the dx+1 coefficients

    c0,st,t,c1,st+1,t1,,cdx,st+dx,tdx.

    The columns we are interested in are gdy+1,sdy1,,gdx+dy+1,sdxdy1. Since tsdy1, the submatrix has nonzero diagonal. The maximum degree of any of these polynomials is i+k(sdxdy1)=ts+dx+dy+1<dx+dy+1<p.

  • For t<sdy1, let d^=min(dx,t). Then we are considering the following d^+1 coefficients.

    c0,st,t,c1,st+1,t1,,cd^,st+d^,td^.

    For these, we will look at the d^+1 columns corresponding to gst,t,,gst+d^,td^. This clearly has nonzero diagonal, and the maximum degree of x in any of these polynomials is i+k(td^)=d^<p.

Theorem 36.

For pdx+dy+2, the dimension of the local code is 12(dx+1)(dy+1)(dx+dy+2).

Proof.

From Lemma 34, we know that cijk=0 for all i[0,dx] and j+k>dx+dy. Thus we can write

f(x,y,z) =0idx0j+kdx+dycijkxiyjzk.

Note that cijkxiyj(xy+z)k=0kk(kk)xkk+iykk+jzk and in particular the sum of the exponents of y and z is always j+k, so in particular the value of gjk(x) only depends on coefficients cijk were j+k=j+k (note also that we’re now in the setting where j+kdx+dy<p, so g(x,y,z)=f(x,y,xy+z) without having to reduce modulo ypy).

Again, we will use the fact that for all j>dy and k[0,p1], it holds that gjk(x)=0x𝔽p. Note that gjk(x) could have degree as large as 2dx+dy which could be larger than p. We thus let g^jk(x) denote gjk(x) reduced modulo xpx. The condition that gjk(x)=0x𝔽p then is equivalent to g^jk(x)0.

We will work through the polynomials g^jk(x) and set each to 0 by choosing coefficients cijk appropriately. We will consider all the polynomials g^jk with a fixed value of j+k, denoted by s, simultaneously.

As such, let dy<sdx+dy. The relevant polynomials gjk(x) that evaluate to 0 everywhere are gdy+1,sdy1,,gs,0, and the relevant coefficients are ci,0,s,,ci,s,0, where i[0,dx]. The dependency of gd,sd on the coefficients is given in the following matrix, interpretted the same as before.

We will set the coefficients ci,sk,k starting with those with the largest value of t:=i+k, which is dx+s, and proceeding downwards. We will show that if we’ve already set all coefficients ci,sk,k with i+k>t, then the degree of g^d,sd(x) is d+ts (that is, all larger monomials have been set to 0 by the previous choices of coefficient assignments). This is certainly satisfied in the base case: if t=dx+s, then since idx we have that the largest power of x in any gd,sd(x) is dx+d=d+ts. This largest exponent can only decrease when we pass to g^d,sd(x).

Now, suppose that we’re part way through this process and have already set the values for all ci,sk,k where i+k>t. The number of ways we have to set the values ctk,sk,k is as follows.

  • For t=dx+s, dx+s1, , dx+dy+1, the coefficients of interest are cts,0,s, cts+1,1,s1, , cdx,s+dxt,tdx. We will show that these must all be set to 0. We look at the columns corresponding to gdy+1,sdy1,,gdx+dyt+s+1,tdxdy1. Note that there are as many columns as coefficients. By the inductive assumption, the maximal degree of column gd,sd is d+tsdx+dy+1<p. Thus the coefficient of xd+ts in gd,sd(x), which is equal to tdxks(ksd)ctk,sk,k, must also be 0. This gives us the following linear system of equations:

    (cts,0,scts+1,1,s1cdx,st+dx,tdx)T((ssdy1)(ssdy2)(stdxdy1)(s1sdy1)(s1sdy2)(s1tdxdy1)(tdxsdy1)(tdxsdy2)(tdxtdxdy1))=0,

    where the matrix is also a top left submatrix of the aforementioned matrix, ignoring the powers of x. Since s>sdy1, this matrix has nonzero diagonal and by Lemma 33 is invertible. This implies that ctk,sk,k=0 for all tdxks.

    Now, we’ve set all cijk with j+kt. It remains to show that g^d,sd(x) is now degree d+ts1. We already have that g^d,sd(x) was degree d+ts from the inductive assumption. For any g^d,sd(x), the coefficient of the xd+ts term is the sum of the coefficients of xd+ts and xd+ts+(p1) in gd,sd(x). But recall that we’ve set all cijk with j+kt to 0, so both these coefficients in gd,sd must be 0. Therefore, g^d,sd(x) is degree d+ts1.

  • Next, for t=dx+dy,,s+1, we are looking at the s+dxt+1>sdy coefficients cts,0,s,,cdx,st+dx,tdx. We have so far that all g^d,sd(x) have degree d+ts. The coefficient of xd+ts in g^d,sd(x) is equal to the sum of the coefficients of xd+ts and xd+ts+(p1) in gd,sd(x), which in turn is equal to

    tdxks(ksd)ctk,sk,k+0ks(ksd)ctk+(p1),sk,k,

    where in the second summation the terms ctk+(p1),sk,k that don’t exist are understood to be 0. Note that we’ve already set the values of ctk+(p1),sk,k. Thus, in order to set (4.2) to 0, we need to choose ctk,sk,k, tdxks so that

    tdxks(ksd)ctk,sk,k=0ks(ksd)ctk+(p1),sk,k.

    There are sdy such equations for the sdy polynomials g^d,sd(x), which are all independent by Lemma 33 since ssdy1. Then, there are ps+dxt+1(sdy)=pdx+dy+1t ways to choose the coefficients ctk,sk,k. We remark also that once cts,0,s, , cdx,st+dx,tdx are set, all g^d,sd(x) must have degree d+ts1 since we chose the values so that the coefficient of xd+ts was 0 for all columns.

  • The next case is t=s,s1,,dx. In this case, we are looking at the dx+1 coefficients c0,st,t,,cdx,st+dx,tdx. We have so far that all g^d,sd(x) have degree d+ts, and the coefficient of xd+ts in g^d,sd(x) is equal to the sum of the coefficients of xd+ts and xd+ts+(p1) in gd,sd(x). Similar to the previous case, this results in sdy independent linear equations (by Lemma 33, using the fact that tsdy1. Thus, there are pdx+dy+1s ways to set c0,st,t,,cdx,st+dx,tdx. Note that after we’ve set these coefficients, the degree of g^d,sd(x) necessarily must be d+ts1 by choice of these coefficients.

  • If t=dx1,,sdy, then the coefficients we care about are c0,st,t,,ct,s,0. We have that g^d,sd(x) has degree d+ts and wish to set the ctk,sk,k so that the coefficient of xd+ts is 0. As before, this results in sdy linearly independent equations, which are independent because tsdy1. Thus, there are pt+1s+dy ways to set the coefficients ctk,sk,k. The degrees of all sdy polynomials g^tk,sk,k(x) are now d+ts1.

  • For t=sdy1,,0, we are considering the t+1 coefficients c0,st,t,,ct,s,0. Note also that for d<st, we must already have that g^d,sd(x)0 since the maximum degree, if it exists, is already d+ts. So, we look at the t+1 polynomials g^st,t(x),,g^s,0(x). Since we want to set the coefficient of xd+ts to be 0, this results in t+1 linearly independent equations in c0,st,t,,ct,s,0. So there is exactly one way to set c0,st,t,,ct,s,0 to make all these coefficients 0.

In total, for dy<sdx+dy, if Cs is the number of ways to assign all the coefficients ci,sk,k, then

logpCs =0+t=s+1dx+dy(dx+dy+1t)+t=dxs(dx+dy+1s)
+t=sdydx1(t+1s+dy)+0
=(dy+1)(dx+dy+1s).

Furthermore, if j+k=sdy, then f(x,y,xy+z) is always degree dy in y, so all such coefficients cijk are permissible. There are (dx+1)(dy+22) such coefficients. In total, this gives that the dimension of the local code is

(dx+1)(dy+22)+s=dy+1dx+dy((dy+1)(dx+dy+1s))
=12(dx+1)(dy+1)(dx+dy+2).

5 Code Testability

In this section, we will work with the field size q=p being a prime. The main reason is that we will need results from Section 4.2 about the degree of codewords in Cdx,dy, viewed as low degree polynomials.

5.1 Testability of the Local Code at a Vertex

In this section we prove that CvCdx,dy (as defined in Lemma 30) is agreement-testable whenever dx,dy<p/4. Our definition of agreement testability (see Definition 15) applies to HDX codes. Let us restate it in a form that is specialized for Cv:

Definition 37.

The local code Cv defined in (15) is (ϵ,ρ())-agreement testable if whenever we are given a collection of zeCe for each ev, such that

α:=uwXv(1)[zuv(Tuvw)zwv(Tuvw)]<ϵ

then there exists some xCv such that

uX(0)(zuvx|Tuv)ρ(α).

Recall CvCdx,dy. The local views have two kinds. Let us see the case where v=K1: Cosets of H2 correspond to lines (x,b,c) for all b,c𝔽p. Cosets of H3 correspond to lines (a,y,ay+c) for all a,c𝔽. Therefore, the local views can be packaged through X,Y, which will be collections of degree d2 and d3 polynomials on the lines corresponding to cosets of H2 and H3, respectively.

The following theorem will immediately imply local testability.

Theorem 38.

Let X,Y:𝔽p3𝔽p be such that

  • For each b,c, the x degree of X(x,b,c) is at most dx.

  • For each a,c, the y degree of Y(a,y,ay+c) is at most dy.

Then if [X(x,y,z)Y(x,y,z)]=δ3 so that pdx+dy+2max(dx,dy)+4δp, there exists codeword Q(x,y,z)Cdx,dy such that

b,c[X(x,b,c)Q(x,b,c)]+a,c[Y(a,y,ay+c)Q(a,y,ay+c)]4δ.

Before proving Theorem 38, let us see that it immediately implies agreement testability.

Corollary 39.

If dx,dy<p4, the local code Cv=Cdx,dy is ((p/2(dx+dy)4p)3,4()1/3)-agreement testable.

Proof.

Without loss of generality, let v be the coset K1 in X(G;K1,K2,K3) (the argument applies to other choices of v since G acts transitively on the code Cv, see Claim 26). Then every edge ev is of type 2 or type 3. Recall from Theorem 1 that type 2 edges e are cosets {gh2(x)}x while type 3 edges e correpsond to cosets of the form {gh3(y)}y. Given a collection of local views zeCe, we define functions X,Y:𝔽p3𝔽p in the following way:

X(x,b,c)=zgh2(gh2(x))where g=ι1(0,b,c),
Y(a,y,ay+c)=zgh3(gh3(y))where g=ι1(a,0,c).

Here ι is the embedding of elements in K1 to 𝔽p3 as used in the proof of Lemma 30. Since zgh2(gh2(x)) has degree at most d2 and zgh3(gh3(y)) has degree at most d3, X and Y satisfy the conditions in the theorem statement for dx=d2,dy=d3. Also by construction of X and Y

uwXv(1)[zuv(Tuvw)zwv(Tuvw)]=x,y,z[X(x,y,z)Y(x,y,z)]

Similarly, given a codeword QCdx.dy, by Lemma 30 we can define its corresponding codeword fCv as

f(g)=Q(ι(g)).

Then the disagreement probability between ze and f satisfies:

2uX(0)(zuvf|Tuv)=b,c[X(x,b,c)Q(x,b,c)]+a,c[Y(a,y,ay+c)Q(a,y,ay+c)].

Now applying Theorem 38 to ze and f we have that the local code Cv is
((p(dx+dy)2max(dx,dy)4p)3(p/2(dx+dy)4p)3,4()1/3)-agreement testable.

We now move to prove Theorem 38. Let S denote the set of all points (x,y,z) on which X(x,y,z)Y(x,y,z). Consider all polynomials e(x,y,z) that are degree δp in x, and degree δp in y in e(x,y,xy+z). By Theorem 36 , the dimension of the space of such polynomials is (δp+1)3. Thus, by dimension counting, there is a nonzero polynomial E(x,y,z) that evaluates to 0 on all points of S. Then, we have that for all x,y,z, X(x,y,z)E(x,y,z)=Y(x,y,z)E(x,y,z).

We have that X(x,y,z)E(x,y,z) is degree dx+δp in x, and Y(x,y,xy+z)E(x,y,xy+z) is degree dy+δp in y. Thus there is a polynomial P(x,y,z) that is degree dx+δp in x, and degree dy+δp in y in P(x,y,xy+z), that agrees on all points. (To see this, we can write P(x,y,z) as a polynomial that is degree q1 in each of x,y,z. Since P(x,y0,z0) has the same evaluation as a degree dx+δp polynomial for each y0,z0, it follows that P must be degree dx+δp in x. Next, by Lemma 34, we have that P must be combined degree dx+δp+dy+δp<q in y and z. Thus, P(x,y,xy+z) is degree <q in y. Since P(x0,y,x0y+z0) has the same evaluation as a degree dy+δp polynomial for each x0,z0, it follows that P(x,y,xy+z) must actually be degree dy+δp in y.)

We may write

X(x,y,z)E(x,y,z)=Y(x,y,z)E(x,y,z)=P(x,y,z)x,y,z.

We’d like to formally divide P(x,y,z) by E(x,y,z) to obtain a polynomial Q(x,y,z) that is degree dx in x and degree dy in the skew-y direction.

Lemma 40.

There exists a polynomial Q(x,y,z) that is degree dx in x and degree dy in the skew-y direction, such that E(x,y,z)Q(x,y,z)=P(x,y,z).

Proof.

First, suppose P(x,y,z) and E(x,y,z) share a common factor F(x,y,z)E(x,y,z) of degree (e,f) in the x and skew-y directions, and set

P(x,y,z)P¯(x,y,z)F(x,y,z)andE(x,y,z)E¯(x,y,z)F(x,y,z)

so that P¯(x,y,z) and E¯(x,y,z) share no common factors.

Note that whenever F(x,y0,z0) is not the 0 polynomial (in x), then we have that E¯(x,y0,z0)|P¯(x,y0,z0). Since the combined degree of y and z in F(x,y,z) is e+f<|𝔽| by Lemma 34 and F(x,y,xy+z) is degree f in y, there can be at most f values of y0𝔽 for which F(x,y0,z)0 and also at most f values of z0𝔽 for which F(x,y,z0)0. Let Yx denote the values of y0 for which F(x,y0,z)0 and Zx denote the values z0 for which F(x,y,z0)0, so that |Yx|,|Zx||𝔽|f. For any y0Yx, there can be at most e+f values of z0 for which f(x,y0,z0)0, and for any z0Zx, there can be at most e+f values of y0 for which f(x,y0,z0)0. Thus, there are at least max(|Yx|,|Zx|)(|𝔽|ef) pairs (y0,z0)𝔽×𝔽 for which E¯(x,y0,z0)|P¯(x,y0,z0). Let the set of such pairs (y0,z0) be denoted by Mx. We may show a similar statement for the skew-y direction, that there is a set My of size at least max(|Zx|,|Zy|)(|𝔽|ef) consisting of pairs (x0,z0) for which E¯(x0,y,x0y+z0)|P¯(x0,y,x0y+z0) as polynomials in y.

Our goal is to show that E¯(x,y,z) divides P¯(x,y,z). Assume for the sake of contradiction that P¯(x,y,z) and E¯(x,y,z) have no common factors. Let the degree of E¯(x,y,z) be (a=ae,b=bf) in the x and skew-y directions, so that P¯(x,y,z) has degree (dx+a,dy+b) in the x and skew-y directions. Assume without loss of generality that ab, otherwise we can consider the change of basis P(x,y,z)=P¯(y,x,xyz) and E(x,y,z)=E¯(y,x,xyz).

Write

P¯(x,y,z) P0(y,z)+P1(y,z)x++Pa+dx(y,z)xa+dx
E¯(x,y,z) E0(y,z)+E1(y,z)x++Ea(y,z)xa.

Then, P¯(x,y,z) and E¯(x,y,z) have a common factor if there exists A(x,y,z) that is degree a1 in x and B(x,y,z) that is degree a+dx1 in x such that

P¯(x,y,z)A(x,y,z)=E¯(x,y,z)B(x,y,z),

or

Pa+dxAa1 =EaBa+dx1
Pa+dx1Aa1+Pa+dxAa2 =Ea1Ba+dx1+EaBa+dx2
P0A0 =E0B0.

These equations are linear in Ai and Bi and can be summarized by the following (2a+dx)×(2a+dx) matrix:

M(P¯,E¯)(y,z)=(Pa+dxPa+dx1P0000Pa+dxP1P0000Pa+dxP1P0EaEa1E1E000000EaEa1E0)

consisting of a rows with P¯ entries and a+dx rows of E¯ entries.

We define R(P¯,E¯)(y,z), the resultant of P¯ and E¯, to be the polynomial in the coefficients of P¯ and E¯ (viewing both as a polynomial in x, with coefficients that are polynomials in y and z) obtained by taking the determinant of M(P¯,E¯). Solutions A and B exist if R(P¯,E¯)=0, which would give our contradiction.

Viewing R(P¯,E¯) as a polynomial in y and z, and recalling by Lemma 34 that P¯ (resp. E¯) is total degree dx+dy+a+b (resp. a+b) in y and z, we see that R(P¯,E¯) has degree a(dx+dy+a+b)+(a+dx)(a+b).

Meanwhile, for each (y0,z0)Mx, we have that E(x,y0,z0)|P(x,y0,z0), so each of the top a rows are linear combinations of the bottom a+dx rows. Thus, the polynomial R(P,E) has a zero at each (y0,z0)Mx of multiplicity a. Then, because

a|Mx| amax(|Yx|,|Zx|)(|𝔽|ef)
amax(|Yx|,|Zx|)(3dx+dy+3a+b)
>(a(dx+dy+a+b)+(a+dx)(a+b))max(|Yx|,|Zx|)
degRmax(|Yx|,|Zx|),

R(P¯,E¯)(y,z) must be the zero polynomial by Schwarz-Zippel. This implies that P¯(x,y,z) and E¯(x,y,z) must have non-trivial common factor when considered as polynomials in x, which implies (since we assumed that they’re coprime) that E¯(x,y,z)|P¯(x,y,z) in 𝔽(y,z). Then, by Gauss’ Lemma, this implies that E(x,y,z)|P(x,y,z) when considered as polynomials in 𝔽[y,z], the ring of polynomials in y and z.

Thus, polynomial Q(x,y,z) that is degree dx in x and degree dy in the skew-y directions exists.

The final step in the proof of Theorem 38 is to analyze the probability of the local views X and Y disagreeing with Q.

We have that

X(x,y,z)E(x,y,z)=Y(x,y,z)E(x,y,z)=Q(x,y,z)E(x,y,z)x,y,z.

Thus, for any (b,c) for which E(x,b,c) is nonzero, we have that Q agrees with X on the entire row. Since E can be zero on at most 2δp2 rows (since the combined degree of y and z in E(x,y,z) is at most 2δp by Lemma 34), this means that

b,c[X(x,b,c)Q(x,b,c)]2δ.

Similarly, we have that a,c[Y(a,y,ay+c)Q(a,y,ay+c)]2δ. This proves Theorem 38.

5.2 Local Testability: From Local to Global

Our main theorem is that if the local codes around an edge have distance δ, and the local codes around a vertex are agreement-testable, then the global code C is agreement-testable, and therefore robustly locally testable.

Theorem 41.

Let ϵ0,δ>0, let ρ0() be a monotone increasing function and assume that γ<min(δ8,δ2128min(ϵ0,ρ01(δ4))). Let X be a two-dimensional bounded-degree γ-one-sided link expander. Suppose we are given codes Ce𝔽X+e with relative minimum distance at least δ for each edge eX(1), and let Cv and C be as defined above. If Cv is (ϵ0,ρ0())-agreement-testable for all vX(0) then C is (ϵ,ρ())-agreement-testable, where ϵ=δ2128min(ϵ0,ρ01(δ4)) and where ρ(t)=Dt for D the maximal degree of a vertex in X. Namely, given any collection of local views {zvCv|vX(0)}, if

α(z)=uvX(1)[zu(X+uv)zv(X+uv)]<ϵ

then there is some xC such that

v[x|X+vzv]Dα(z).
Corollary 42.

Under the assumptions above, and assuming that each local code Cv is defined by at most m0 parity checks each looking at most q0 bits, the code C is (d+2,ϵ/m0,ρ()) locally testable under Definition 11 where ρ(t)=ρ(m0t).

The corollary follows from a standard conversion from agreement-testability to robust testability, see Section 2.5, particularly Claims 16 and 14.

Proof of Theorem 41.

Fix z0={zv0} a collection of local views, such that α(z0)<ϵ. Run the following local correction algorithm:

Local Algorithm.

If there is a vertex v and a choice zvCv that reduces α(z) then replace zv by zv and repeat.

Let z={zv} be the final collection, after the termination of the local algorithm. The algorithm must halt in at most α(z0)|X(1)| steps. At this point, either α(z)=0, or α(z)>0. In the first case, we will show a nearby codeword. In the second case, we will show that in fact α(z)>ϵ, a contradiction since α(z)α(z0)ϵ.

Claim 43.

If α(z)=0 then z corresponds to a codeword x^C such that vX(0)[zv0x^|X+v]Dα(z0), where D bounds the maximal degree of a vertex in X.

Proof.

For each triangle tX(2) choose arbitrarily a vertex vt and set x^(t)=zv(t). This choice does not depend on the choice of v because α(z)=0 implies that zv(t)=zv(t) for any v,vt.

At every step of the local algorithm, one local view changes. So the number of local views where x^|X+vzv0 is at most the number of steps of the algorithm, which is at most α(z0)|E|. So

[x^|X+vzv0]α(z0)|E||V|=Dα(z0).

Assume α(z)>0, and let

R={uvX(1)|zu|X+uvzv|X+uv}.

The rest of the proof will show that R has large size. First, we claim that

ee[eR|eR]δ/2. (17)

where ee is shorthand for the distribution of selecting e,e from the upper random walk, namely, first choose a random triangle and then choose two distinct edges in it.

Fix an edge e=uvR. By definition, (zu)|X+uv(zv)|X+uv. Since both are codewords in Cuv, for at least δ fraction of the triangles uvwX+uv either zu(uvw)zw(uvw) or zv(uvw)zw(uvw) and in particular either uwR or vwR. So the random walk from uv to a triangle uvw and then to uw or vw has probability at least δ/2 of staying in R. This establishes (17). By Lemma 9 we further deduce that

ee[eR|eR]δ/2γ. (18)

where ee is shorthand for the distribution of the lower random walk, namely, selecting a two random edges that intersect on a vertex.

The next step is to focus on the neighborhood of a fixed vertex. Fix vV. For every neighbor u of v let yu=zu|X+uvCuv. This gives us a local view for each neighbor of v, which may or may not agree with (zv)|X+uv. Let R(v)=RXv(1){uwXv(1)|yu(uvw)yw(uvw)}. We next show that vertices v with small R(v) have few R edges adjacent to them. Here we use the agreement testability of the code Cv.

Claim 44.

Fix vV such that ϵv=Δ|R(v)|/|Xv(1)|ϵ0. Then uXv(0)[uvR]ρ0(ϵv). Contrapositively, if uXv(0)[uvR]τ, then ϵvmin(ϵ0,ρ01(τ)).

Proof.

For every neighbor u of v , yu=zu|X+uvCuv either agrees or disagrees with zv. This is measured by the fraction of R edges touching v,

uXv(0)[uvR]=uXv(0)[zv|X+uvyu]. (19)

Note also that for uwXv(1), if yu(uvw)yw(uvw) then uwR(v) and so uwR. The assumption of the claim implies that PruwXv(1)[yu(uvw)yw(uvw)]ϵvϵ0. The (ϵ0,ρ0()) agreement-testability of Cv (see Definition 15) guarantees that in this case there exists a codeword z^vCv such that

uXv(0)[z^v|X+uvyu]ρ0(ϵv) (20)

where we have used the monotonicity of ρ0().

Since the local algorithm halted without changing zv to z^v, we conclude, together with (19) and (20), that

uXv(0)[uvR]=uXv(0)[zv|X+uvyu]uXv(0)[z^v|X+uvyu]ρ0(ϵv).

To prove the contrapositive notice that if ϵvϵ0 it is immediate, and if not, τρ0(ϵv) which means that ρ01(τ)ϵv as needed (ρ0 is invertible since it is monotone).

Let f=𝟏R. By (18),

Df,Df=f,UDf(δ/2γ)f2. (21)

Observe also that

𝔼vDf(v)=𝔼ef(e)=𝔼ef(e)2=f2.
Claim 45.

For any τ>0 let Vτ={vV|Df(v)>τ}. Then

v(Vτ)(δ/2γτ)f2
Proof.

Let μ(v) denote the probability of a vertex.

(δ/2γ)f2 Df2
=vVτμ(v)Df(v)2+vVτμ(v)Df(v)2
vVτμ(v)1+vVτμ(v)Df(v)τ
vVτμ(v)1+vVμ(v)Df(v)τ
=v[Vτ]+τf2

where we have used (21) in the first inequality, the definition of Vτ in the next inequality, and Df(v)0 in the last one. Let M~=S1,0D be the Markov operator corresponding to the random walk that starts at an edge e, selects a random vertex such that e{v}X(2) (we denote this condition by ev), and then chooses a random ev. Then,

f,M~f=Df,S0,1f =vμ(v)Df(v)S0,1f(v)
=vμ(v)(𝔼evf(e))(𝔼evf(e))
vμ(v)(𝔼evf(e))ϵv
vVτμ(v)τmin(ϵ0,ρ01(τ))
=τmin(ϵ0,ρ01(τ))[Vτ]
τmin(ϵ0,ρ01(τ))(δ/2γτ)f2.

where the second inequality is because whenever vVτ, Claim 44 implies that ϵvmin(ϵ0,ρ01(τ)).

Lemma 10 gives a bound of 3γ on the second largest eigenvalue of M~. We finally apply Lemma 5 on the graph corresponding to M~ to deduce that

|R|(τ(δ/2γτ)min(ϵ0,ρ01(τ))3γ)|X(1)|. (22)

Choosing for example τ=δ/4 and as long as γ<min(δ/8,δ2128min(ϵ0,ρ01(δ4)) we deduce |R|>δ2128min(ϵ0,ρ01(δ4))|X(1)|=ϵ|X(1)|.

5.3 Testability of HDX codes in Dimensions Above Two

In this section we prove a “trickle-down” statement for agreement-testability. We show that if the local codes at i-links are agreement-testable, then this implies agreement-testability for the codes at i1 links.

Lemma 46.

Let δ,γ>0, and let X be a k-dimensional γ-one-sided local expander, and assume that for every tX(k1) we have a code Ct{f:X+t(k)𝔽} with minimum relative distance δ. Let Di denote the maximal number of i+1 faces that touch an i face in X. Define, for every 1ik2 face sX(i), the code

Cs={f:X+s(k)𝔽|f|X+tCttX+s(k1)},

and assume that for sX(i) the code Cs has minimum relative distance δi. If there is ϵk2>0 and a monotone increasing function ρk2() such that for every tX(k2) the code Ct is (ϵk2,ρk2())-agreement testable, then for every 1i<k2 and every sX(i) the code Cs is (ϵi,ρi())-agreement testable with

ϵi=δi+12128min(ϵi+1,ρi+11(δi+14)),ρi(t)=Di+1t

as long as γ<min(δi+18,δi+12128min(ϵi+1,ρi+11(δi+1/4))) for all i. In particular, the code C{f:X(k)𝔽} is (ϵ1,ρ1())-agreement testable.

Proof.

The proof is very similar to the proof of Theorem 41. Fix sX(i). Our goal is to prove that Cs is testable assuming that for all vXs(0), Cs{v} is testable. By bijecting X+s to Xs (mapping each tX+s to ts), we can move to the case where s=, and our codes sit on the faces of dimension ki1, namely Cv𝔽X+v(ki1) and Cϕ𝔽X(ki1). The complex Xs is a γ high dimensional expander, and each vertex v touches at most Di+1 edges . We also let δi+1 denote the distance of the code Cs{v} (see Lemma 29 for a calculation).

When i=k3, this was done in the previous section. So the only difference is that now Cv itself is a (ki1)-dimensional HDX code for possibly ki1>1. In fact we will see that this makes almost no difference.

We start with z0={zv0Cv}vX(0) a collection of local views, and define α(z0)=uvX(1)[zu(X+uv)zv(X+uv)]. Same as in the proof of Theorem 41, we run the local algorithm, replacing zv with zvCv whenever it reduces α(z), until no more changes can be made. Let z={zv}vX(0) be the final collection.

Then, by following the same steps as in the proof of Theorem 41, we have the following:

  • If α(z)=0, then just as in Claim 43 we have that z corresponds to a codeword x^C satisfying vX(0)[zv0x^|X+v]Di+1α(z0).

  • If α(z)0, then the goal is to show that there must have been many edge disagreements to begin with. Define R={uvX(1)|zu|X+uvzv|X+uv}. Then as long as γ<min(δi+18,δi+12128min(ϵi+1,ρi+11(δi+1/4))), it holds that

    |R|δi+12128min(ϵi+1,ρi+11(δi+14))|X(1)|=ϵi|X(1)|.

    Therefore, the number of edge disagreements to begin with in the original ensemble z0 must have been at least ϵi|X(1)| also.

6 Codes in Higher Dimensions

The coset complexes considered here have a higher dimensional version as follows. Fix k>2 and define Hi={hi(α)|α𝔽} where hi(α)=ei,i+1(αt)+Ik+1 and let Ki=span(Hj:ji) and G=span(H1,,Hk+1). Let X be the k-dimensional complex X[G;K1,,Kk+1].

We have the following properties of X:

  • X is a γ-expander, where γ=1|𝔽|(k1). We use the trickle down theorem together with the fact that the link of any tX(d2) is either a 1|𝔽|-expander (Claim 23) or a complete bipartite graph (justification in proof of Lemma 48).

  • For any tX(i), the number of (i+1)-faces that touch t is at most Di=(ki)|𝔽|ki1. The reason is that tX(i) corresponds to a coset of the group generated by ki subgroups Hj1,,Hjki. Each (i+1) face that touches t is a coset of the group generated by ki1 of those subgroups. For each such collection of ki1 subgroups, there are at most |𝔽|ki1 cosets of the resulting group contained within t.

Define like before an embedding of the group elements of X into a vector space

GR(k+1)2𝔽n(k+1)2

so that the elements in each gHi embed to an entire affine line. Here we will be working with fields 𝔽=𝔽p that have prime order. Fix a degree parameter kγ|𝔽|<d<|𝔽|/4 and define for every tX(k1) the code Ct to be the Reed-Solomon code RS(|𝔽|,d). Further define, for every i<k1 and every face wX(i),

Cw={f:X+w(k)𝔽|f|X+tCttX+w(k1)}.

It is immediate that the code C=𝒞k[X,{Cv}vX(0)]𝔽X(k) is an HDX code as defined in Section 2.5. Furthermore, by Lemma 29, for any wX(i), the code Cw has distance δi=j=0k1i(δjγ).

Theorem 47.

Let 𝔽 be a field of prime order, and let X=X[G;K1,,Kk+1] be a k-dimensional complex. If kγ|𝔽|<d<(1418δk2)|𝔽|, then for every i<k1 and every wX(i), the code Cw is (ϵi,ρi)-agreement testable, where

ϵk2=(18d2p)3andρk2()=4()1/3,

and for 1i<k2,

ϵi=δk23j=i+1k2δj22527(ki1)(k+1)|𝔽|kandρi=Di+1(),

where δi=j=0k1i(δjγ) and Di=(ki)|𝔽|ki1.

In particular, the code Cϕ{f:X(k)𝔽} is (ϵ1,ρ1())-agreement testable.

The rest of this section is dedicated to proving Theorem 47. Suppose first that i=k2 and sX(k2). Recall that X is (k+1)-partite, and denote color(s)[k+1] the set of colors of s. For k>2 we observe two kinds of links Xs, and therefore two kinds of codes Cs, depending on the color of s.

Lemma 48.

Let sX(k2). Let color(s)=[k+1]{i,j}. If |ij|1mod(k+1) then Cs is isomorphic to Cd,d as defined in (15). Otherwise, CsRS(|𝔽|,r)2.

In both cases Cs is ((18d2p)3,4()1/3)-agreement testable.

Proof.

When |ij|>1mod(k+1) the subgroups Hi and Hj commute since [hi(α),hj(β)]=0. Then span(Hi,Hj)𝔽2, and CsRS(|𝔽|,d)2. Also, ((p2d2p)2,2())-agreement testability was proven in [32] as long as d<|𝔽|/2. Note that (p2d2p)2>(18d2p)3 when 0dp/4 and 2()4()1/3.

When |ij|=1mod(k+1), we can ignore a large part of the matrices (which is identity). For example, if i=1 and j=2 then we can restrict attention to the first 3 rows and columns of the matrices. Thus, we are back in the k=2 case, with span(Hi,Hj) isomorphic to the group K6ij defined in Section 3.1, and the code Cs is isomorphic to Cd,d as defined in (15). In this case, by Corollary 39, Cs is ((18d2p)3,4()1/3)-agreement testable when d<|𝔽|/4.

Moving to i<k2, our proof relies on reverse induction, deducing agreement testability of the level i codes from agreement testability of the level i+1 codes.

Lemma 49.

If d<(1418δk2)|𝔽|, then for every 1i<k2 and every sX(i) the code Cs is (ϵi,ρi())-agreement testable with

ϵi=δk23j=i+1k2δj22527(ki1)(k+1)|𝔽|kandρi(x)=Di+1x,

where δi=j=0k1i(δjγ) and Di=(ki)|𝔽|ki1.

Proof.

We have from Lemma 48 that for any sX(k2) Cs is (ϵk2,ρk2())-agreement testable, where ϵk2=(12d2p)3 and ρk2()=4()1/3. Then for sX(k3), we have by Lemma 46 that Cs is (ϵk3,ρk3())-agreement testable where ρk3(x)=Dk2x and

ϵk3 =δk22128min(ϵk2,ρk21(δk2/4))
=δk22128min((18d2p)3,(δk2/16)3)
=δk25219
δk25219D1,

where the third equality holds whenever 18d2p>δk216, which happens whenever d<(1418δk2)|𝔽|.

In general, for sX(i), the code Cs is (ϵi,ρi())-agreement testable where ρi(x)=Di+1x and

ϵi =δi+12128min(ϵi+1,δi+14Di+2)δk23j=i+1k2δj22527(ki1)D1

since

ϵi+1=δk23j=i+2k2δj2252ki2D1<δi+14Di+2.

Plugging in D1=(k+1)|𝔽|k finishes the calculation.

References

  • [1] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and intractability of approximation problems. Journal of the ACM, 45(3):501–555, 1998. 6
  • [2] S. Arora and S. Safra. Probabilistic checking of proofs: A new characterization of NP. Journal of the ACM, 45(1):70–122, 1998. doi:10.1145/273865.273901. 6
  • [3] Yotam Dikstein, Irit Dinur, Yuval Filmus, and Prahladh Harsha. Boolean function analysis on high-dimensional expanders. In Proc. 20th International Workshop on Randomization and Computation (RANDOM), volume 116, pages 37:1–37:21, Princeton, NJ, 2018. RANDOM/APPROX. doi:10.4230/LIPIcs.APPROX/RANDOM.2018.37. 7
  • [4] Yotam Dikstein, Irit Dinur, Prahladh Harsha, and Noga Ron-Zewi. Locally testable codes via high-dimensional expanders. Electron. Colloquium Comput. Complex., TR20-072, 2020. URL: https://eccc.weizmann.ac.il/report/2020/072. 2
  • [5] Yotam Dikstein, Irit Dinur, Prahladh Harsha, and Noga Ron-Zewi. Locally testable codes via high-dimensional expanders. arXiv preprint, 2020. arXiv:2005.01045. 5
  • [6] Irit Dinur, Shai Evra, Ron Livne, Alexander Lubotzky, and Shahar Mozes. Locally testable codes with constant rate, distance, and locality. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 357–374. ACM, 2022. doi:10.1145/3519935.3520024. 2, 4, 5, 7
  • [7] Irit Dinur, Prahladh Harha, Tali Kaufman, and Noga Ron-zewi. From local to robust testing via agreement testing. In Proceedings of the 11th Innovations in Theoretical Computer Science (ITCS) Conference, pages 29:1–29:18, 2019. doi:10.4230/LIPIcs.ITCS.2019.29. 2
  • [8] Irit Dinur and Tali Kaufman. High dimensional expanders imply agreement expanders. In Proc. 58th IEEE Symp. on Foundations of Comp. Science (FOCS), pages 974–985, 2017. doi:10.1109/FOCS.2017.94. 2, 9
  • [9] Shai Evra and Tali Kaufman. Bounded degree cosystolic expanders of every dimension. In Proc. 48th ACM Symp. on Theory of Computing (STOC), pages 36–48, 2016. doi:10.1145/2897518.2897543. 4, 5
  • [10] Uriya A First and Tali Kaufman. On good 2-query locally testable codes from sheaves on high dimensional expanders. arXiv preprint, 2022. doi:10.48550/arXiv.2208.01778. 4, 5, 7, 13, 14
  • [11] S. Luna Frank-Fischer, Venkatesan Guruswami, and Mary Wootters. Locality via partially lifted codes. CoRR, abs/1704.08627, 2017. arXiv:1704.08627. 6
  • [12] Ira Gessel and Gérard Viennot. Binomial determinants, paths, and hook length formulae. Advances in Mathematics, 58(3):300–321, 1985. doi:10.1016/0001-8708(85)90121-5. 24
  • [13] Louis Golowich. From grassmannian to simplicial high-dimensional expanders. CoRR, abs/2305.02512, 2023. doi:10.48550/arXiv.2305.02512. 7
  • [14] Alan Guo, Swastik Kopparty, and Madhu Sudan. New affine-invariant codes from lifting. In Robert D. Kleinberg, editor, Innovations in Theoretical Computer Science, ITCS ’13, Berkeley, CA, USA, January 9-12, 2013, pages 529–540. ACM, 2013. doi:10.1145/2422436.2422494. 6
  • [15] Prahladh Harsha and Ramprasad Saptharishi. A note on the elementary construction of high-dimensional expanders of kaufman and oppenheim. arXiv, December 2019. arXiv:1912.11225. 9
  • [16] Tom Hoholdt and Jorn Justesen. Graph codes with Reed-Solomon component codes. In 2006 IEEE International Symposium on Information Theory, pages 2022–2026, 2006. doi:10.1109/ISIT.2006.261904. 1, 5, 23
  • [17] Tali Kaufman, David Kazhdan, and Alexander Lubotzky. Ramanujan complexes and bounded degree topological expanders. In Proc. 55th IEEE Symp. on Foundations of Comp. Science (FOCS), pages 484–493, 2014. doi:10.1109/FOCS.2014.58. 4, 5
  • [18] Tali Kaufman and Alexander Lubotzky. Edge transitive ramanujan graphs and symmetric LDPC good codes. In Howard J. Karloff and Toniann Pitassi, editors, Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, May 19 - 22, 2012, pages 359–366. ACM, 2012. doi:10.1145/2213977.2214011. 2
  • [19] Tali Kaufman and Izhar Oppenheim. Construction of new local spectral high dimensional expanders. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 773–786, 2018. doi:10.1145/3188745.3188782. 3, 7, 9, 14, 15
  • [20] Tali Kaufman and Izhar Oppenheim. High order random walks: Beyond spectral gap. Comb., 40(2):245–281, 2020. doi:10.1007/s00493-019-3847-0. 9
  • [21] Tali Kaufman and Ran J. Tessler. Garland’s technique for posets and high dimensional grassmannian expanders. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 78:1–78:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ITCS.2023.78. 7
  • [22] Tali Kaufman and Avi Wigderson. Symmetric LDPC codes and local testing. Comb., 36(1):91–120, 2016. doi:10.1007/s00493-014-2715-1. 2
  • [23] Subhash Khot, Dor Minzer, and Muli Safra. Pseudorandom sets in grassmann graph have near-perfect expansion. In Mikkel Thorup, editor, 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 592–601. IEEE Computer Society, 2018. doi:10.1109/FOCS.2018.00062. 7
  • [24] Alexander Lubotzky, Beth Samuels, and Uzi Vishne. Explicit constructions of Ramanujan complexes of type Ad~. European J. Combin., 26(6):965–993, 2005. doi:10.1016/j.ejc.2004.06.007. 2, 7
  • [25] Alexander Lubotzky, Beth Samuels, and Uzi Vishne. Ramanujan complexes of type Ad~. Israel J. Math., 149(1):267–299, 2005. doi:10.1007/BF02772543. 2, 7
  • [26] C. Lund, L. Fortnow, H. Karloff, and N. Nisan. Algebraic methods for interactive proof systems. Journal of the ACM, 39(4):859–868, October 1992. doi:10.1145/146585.146605. 6
  • [27] Or Meir. IP = PSPACE using error-correcting codes. SIAM J. Comput., 42(1):380–403, 2013. doi:10.1137/110829660. 6
  • [28] Roy Meshulam. Graph codes and local systems, 2018. arXiv:1803.05643. 13
  • [29] Dana Moshkovitz and Ran Raz. Sub-constant error low degree test of almost-linear size. SIAM J. Computing, 38(1):140–180, 2008. doi:10.1137/060656838. 7
  • [30] Ryan O’Donnell and Kevin Pratt. High-dimensional expanders from chevalley groups. In Shachar Lovett, editor, 37th Computational Complexity Conference, CCC 2022, July 20-23, 2022, Philadelphia, PA, USA, volume 234 of LIPIcs, pages 18:1–18:26. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.CCC.2022.18. 7, 9
  • [31] Pavel Panteleev and Gleb Kalachev. Asymptotically good quantum and locally testable classical LDPC codes. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 375–388. ACM, 2022. doi:10.1145/3519935.3520017. 2, 4, 7
  • [32] Alexander Polishchuk and Daniel A. Spielman. Nearly-linear size holographic proofs. In Frank Thomson Leighton and Michael T. Goodrich, editors, Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing, 23-25 May 1994, Montréal, Québec, Canada, pages 194–203. ACM, 1994. doi:10.1145/195058.195132. 5, 40
  • [33] R. Raz and S. Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In STOC 1997, pages 475–484, 1997. 7
  • [34] A. Shamir. IP = PSPACE. Journal of the ACM, 39(4):869–877, October 1992. Prelim. version in 1990 FOCS, pages 11–15. doi:10.1145/146585.146609. 6
  • [35] Michael Sipser and Daniel A. Spielman. Expander codes. IEEE Trans. Inf. Theory, 42(6):1710–1722, 1996. doi:10.1109/18.556667. 11, 13