
Amortized Locally Decodable Codes for Insertions and Deletions

Jeremiah Blocki ORCID Department of Computer Science, Purdue University, West Lafayette, IN, USA Justin Zhang ORCID Department of Computer Science, Purdue University, West Lafayette, IN, USA
Abstract

Locally Decodable Codes (𝖫𝖣𝖢s) are error correcting codes which permit the recovery of any single message symbol with a low number of queries to the codeword (the locality). Traditional 𝖫𝖣𝖢 tradeoffs between the rate, locality, and error tolerance are undesirable even in relaxed settings where the encoder/decoder share randomness or where the channel is resource-bounded. Recent work by Blocki and Zhang initiated the study of Hamming amortized Locally Decodable Codes (𝖺𝖫𝖣𝖢s), which allow the local decoder to amortize their number of queries over the recovery of a small subset of message symbols. Surprisingly, Blocki and Zhang construct asymptotically ideal (constant rate, constant amortized locality, and constant error tolerance) Hamming 𝖺𝖫𝖣𝖢s in private-key and resource-bounded settings. While this result overcame previous barriers and impossibility results for Hamming 𝖫𝖣𝖢s, it is not clear whether the techniques extend to Insdel 𝖫𝖣𝖢s. Constructing Insdel 𝖫𝖣𝖢s which are resilient to insertion and/or deletion errors is known to be even more challenging. For example, Gupta (STOC’24) proved that no Insdel 𝖫𝖣𝖢 with constant rate and error tolerance exists even in relaxed settings.

Our first contribution is to provide a Hamming-to-Insdel compiler which transforms any amortized Hamming 𝖫𝖣𝖢 that satisfies a particular property (consecutive interval querying) to amortized Insdel 𝖫𝖣𝖢 while asymptotically preserving the rate, error tolerance and amortized locality. Prior Hamming-to-Insdel compilers of Ostrovsky and Paskin-Cherniavsky (ICITS’15) and Block et al. (FSTTCS’20) worked for arbitrary Hamming 𝖫𝖣𝖢s, but incurred an undesirable polylogarithmic blow-up in the locality. Our second contribution is a construction of an ideal amortized Hamming 𝖫𝖣𝖢 which satisfies our special property (consecutive interval querying) in the relaxed settings where the sender/receiver share randomness or where the channel is resource bounded. Taken together, we obtain ideal Insdel 𝖺𝖫𝖣𝖢s in private-key and resource-bounded settings with constant amortized locality, constant rate and constant error tolerance. This result is surprising in light of Gupta’s (STOC’24) impossibility result which demonstrates a strong separation between locality and amortized locality for Insdel 𝖫𝖣𝖢s.

Keywords and phrases:
Amortized Locally Decodable Codes, Insertion and Deletion Errors
Funding:
Jeremiah Blocki: Supported in part by NSF CAREER Award CNS-2047272.
Copyright and License:
© Jeremiah Blocki and Justin Zhang; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Mathematics of computing → Coding theory
Related Version:
Full Version: https://arxiv.org/abs/2507.03141
Editor:
Niv Gilboa

1 Introduction

Locally Decodable Codes (𝖫𝖣𝖢s) are a variant of error correcting codes which provide single-symbol recovery with highly efficient query complexity over a (possibly corrupted) codeword. Specifically, an 𝖫𝖣𝖢 over alphabet Σ is defined as a tuple of algorithms: the encoding algorithm 𝖤𝗇𝖼 : Σ^k → Σ^n and the local decoder 𝖣𝖾𝖼^𝒚~ : [k] → Σ. Here, k denotes the message length, n denotes the codeword length and 𝒚~ ∈ Σ^n is a (possibly corrupted) codeword: for any message 𝒙 ∈ Σ^k, its corresponding codeword is 𝒚 = 𝖤𝗇𝖼(𝒙) and 𝒚~ denotes a (possibly corrupted) copy of 𝒚. The local decoder 𝖣𝖾𝖼^𝒚~(i) is a randomized oracle algorithm that is given an index i ∈ [k] := {1, 2, …, k} as input and has oracle access to a (possibly corrupted) codeword 𝒚~. 𝖣𝖾𝖼^𝒚~(i) attempts to output the i’th symbol of the message 𝒙 after making at most ℓ queries to the symbols of 𝒚~. We require that for any index i ∈ [k] and any 𝒚~ that is not too far from the uncorrupted codeword 𝒚 (i.e., 𝖣𝗂𝗌𝗍(𝒚,𝒚~) ≤ δ|𝒚| for a chosen distance metric 𝖣𝗂𝗌𝗍) that 𝖣𝖾𝖼^𝒚~(i) outputs the correct symbol 𝒙[i] with probability at least 1−ε. Broadly, codes which satisfy the above parameters are called (ℓ,δ,ε)-𝖫𝖣𝖢s. The main parameters of interest are the query locality ℓ, error tolerance δ and rate k/n. Ideally, we would like an 𝖫𝖣𝖢 with constant locality, constant error tolerance, and constant rate simultaneously.

Within the classical setting of worst-case Hamming errors, i.e., when the distance metric is the Hamming metric and error patterns are introduced in a worst-case fashion, the trade-offs between the locality, error tolerance, and rate have been extensively studied. Even so, achievable parameters in this setting remain undesirably sub-optimal. Katz and Trevisan show that any 𝖫𝖣𝖢 with constant locality ℓ ≥ 2 and constant error tolerance δ > 0 must have non-constant rate [17], immediately ruling out the existence of an ideal 𝖫𝖣𝖢 for worst-case Hamming errors. Moreover, the best known constructions with constant locality and error tolerance have super-polynomial codeword length [24, 11]. Several relaxations have been considered, such as allowing the local decoder to reject corrupted codewords [13, 2, 9, 18, 10], the assumption that the encoder and decoder share a cryptographic key [20, 14, 15] that is unknown to a probabilistic polynomial time channel, or the assumption that the channel is resource-bounded in other ways [1, 6]. Yet, even under these assumptions, we do not have an 𝖫𝖣𝖢 with constant locality, constant error tolerance, and constant rate. The literature on 𝖫𝖣𝖢s and relaxed variants is vast; we provide an expanded discussion of work related to 𝖫𝖣𝖢s in the full version of the paper [8].

Blocki and Zhang [7] introduced the notion of an amortized locally decodable code where the total number of queries to the codeword can be amortized over the total number of message symbols recovered (see Definition 2). The local decoder for an amortized 𝖫𝖣𝖢 takes as input a range [L,R] (we will assume R−L+1 ≥ κ for a suitable parameter κ) instead of a single index i and attempts to output the entire substring 𝒙[L,R] instead of the single message symbol x[i]. The amortized locality is ℓ/(R−L+1), i.e., the total number of queries divided by the number of message symbols recovered. Surprisingly, they showed how to construct ideal amortized locally decodable codes in relaxed settings where the channel is resource-bounded or where the sender/receiver share a secret key. In particular, given any (sufficiently large) interval [L,R] ⊆ [k] the local decoder can recover all of the corresponding message symbols x[L], …, x[R] while making at most O(R−L) queries to the corrupted codeword. They also observed that the Hadamard code can have amortized locality approaching 1. By contrast, Katz and Trevisan show that codes with locality ℓ < 2 do not exist unless the message length is a constant [17].

Given the surprising success in constructing ideal amortized 𝖫𝖣𝖢s for Hamming channels, even in relaxed settings, it is natural to wonder whether or not it is possible to obtain similar results for insertion and deletion (Insdel) channels.

Is it possible to obtain Insdel 𝖫𝖣𝖢s with constant amortized locality, constant error tolerance, and constant rate when the channel is oblivious, resource-bounded, or when the sender/receiver share a cryptographic key?

A construction of an ideal amortized 𝖫𝖣𝖢 for Insdel channels would be especially surprising since, compared to known lower bounds for Hamming 𝖫𝖣𝖢s, the current state of affairs is even worse for Insdel 𝖫𝖣𝖢s. Blocki et al. [5] proved the first separation result between Insdel and Hamming 𝖫𝖣𝖢s by showing that linear Insdel 𝖫𝖣𝖢s with locality ℓ = 2 do not exist. This stands in contrast to Hamming 𝖫𝖣𝖢s, where such constructions do exist, albeit with exponential codeword length (e.g., the Hadamard code). They additionally provide an exponential length lower bound for any constant locality Insdel 𝖫𝖣𝖢. Their lower bound holds in both traditional and relaxed (private-key and oblivious channel) settings, making progress towards a conjecture raised by Block et al. [5] that constant locality Insdel 𝖫𝖣𝖢s do not exist. Recently, Gupta resolved this conjecture by showing that constant query deletion codes do not exist, via a randomized, oblivious adversarial deletion strategy [12]. The results of Gupta extend even to relaxed settings where the sender/receiver share random coins or where the channel is resource-bounded, i.e., even in relaxed settings it is impossible to construct constant query Insdel 𝖫𝖣𝖢s. Thus, it would be especially surprising to construct an ideal amortized 𝖫𝖣𝖢 for Insdel channels even in relaxed settings with shared randomness or resource-bounded channels.

On the constructions side, Ostrovsky and Paskin-Cherniavsky provide the first Insdel 𝖫𝖣𝖢 via a compilation of a Hamming 𝖫𝖣𝖢 into an Insdel 𝖫𝖣𝖢 [22] with a polylogarithmic blowup to the query complexity and only a constant blowup to the error tolerance and rate. Their construction was revisited by Block et al. [4], who reproved and extended their results to construct Locally Correctable Codes (𝖫𝖢𝖢s), a variant of 𝖫𝖣𝖢s which considers the local decodability of any codeword symbol. At a high level, their compiler takes a Hamming 𝖫𝖣𝖢 encoding of the message, partitions it into blocks, and applies an Insdel code to each block. The local decoder then simulates the underlying Hamming decoder’s queries by retrieving and decoding the corresponding Insdel-encoded block to recover the queried Hamming code symbol, where each answered Hamming query incurs a polylogarithmic query blowup.

Unfortunately, a black box approach of instantiating their Insdel 𝖫𝖣𝖢 constructions with an amortizable Hamming 𝖫𝖣𝖢 does not preserve the amortized locality. Even if all symbols recovered by the decoder are used, the amortized locality will still suffer a polylogarithmic blowup from the Hamming decoder simulation. Thus, it is unclear whether the Insdel 𝖫𝖣𝖢 constructions of Ostrovsky and Paskin-Cherniavsky or Block et al. are amortization friendly.

1.1 Our Contributions

We provide the first framework for constructing amortized Insdel 𝖫𝖣𝖢s by modifying the compiler of Block et al. [4] to be amenable to amortization. Our resulting compiler takes in an amortized Hamming 𝖫𝖣𝖢 whose decoder satisfies a property we define as consecutive interval querying, and outputs an amortized Insdel 𝖫𝖣𝖢 with asymptotically equivalent parameters (amortized locality, rate, and error tolerance). More specifically, given an amortized Hamming 𝖫𝖣𝖢 with constant rate, constant error tolerance, and constant amortized locality, whose decoder queries contiguous blocks of polylogarithmic size instead of individual codeword symbols, our compiler yields an amortized Insdel 𝖫𝖣𝖢 with constant rate, constant error tolerance, and constant amortized locality. While prior Insdel 𝖫𝖣𝖢 compilers suffer a polylogarithmic blow-up in the query complexity of the underlying Hamming 𝖫𝖣𝖢, our amortized compiler preserves amortized locality as well as rate and error tolerance (up to constant factors). Thus, given any ideal Hamming amortized 𝖫𝖣𝖢 which satisfies our consecutive interval querying property, we obtain an ideal Insdel amortized 𝖫𝖣𝖢 achieving constant rate, error tolerance, and amortized locality.

Theorem (Informal, see Corollary 13).

If there exist ideal (constant rate, constant error tolerance, and constant amortized locality) Hamming amortized 𝖫𝖣𝖢s whose decoder only queries consecutive blocks of size Ω(polylog n), then there exist ideal Insdel amortized 𝖫𝖣𝖢s (with asymptotically equivalent parameters).

Furthermore, in the private-key setting, we construct the first consecutive interval querying Hamming 𝖫𝖣𝖢 with constant rate, constant error tolerance, and constant amortized locality. We leave it as an open question whether one can construct a consecutive interval querying Hamming 𝖫𝖣𝖢 with constant amortized locality in the information-theoretic setting. However, using our updated Hamming-to-Insdel compiler we obtain an ideal amortized Insdel 𝖫𝖣𝖢 construction (constant rate, constant error tolerance and constant amortized locality) in the private-key setting. Specifically, the local decoder 𝖣𝖾𝖼^𝒀~(a,b) makes at most O(b−a+polylog(n)) queries to the corrupted codeword 𝒀~ (here, 𝒀~ is the corrupted codeword output by the PPT-bounded channel, which does not have the secret key, under the constraint that the edit distance fraction between 𝒀 and 𝒀~ is at most 2δ) and (whp) outputs all of the correct symbols x[a], x[a+1], …, x[b] in the interval [a,b]. As long as b−a ≥ polylog(n), the amortized locality is O(1) per symbol recovered.

Theorem (Informal, see Theorem 16 and Corollary 19).

For any t and n with t ≤ n, there exist ideal private-key Hamming amortized 𝖫𝖣𝖢s with codeword length n whose decoder only queries consecutive blocks of size t and which have constant amortized locality whenever decoding any consecutive interval of size at least ω(t log n). This implies that there exist ideal private-key Insdel amortized 𝖫𝖣𝖢s.

We show further that our private-key construction implies an ideal amortized Insdel 𝖫𝖣𝖢 in resource-bounded settings i.e. when the channel has a bounded amount of some resource (e.g. space, circuit-depth, cumulative memory, etc.).

Theorem (Informal, see Corollary 24).

If there exist ideal private-key Insdel amortized 𝖫𝖣𝖢s, then there exist ideal resource-bounded Insdel amortized 𝖫𝖣𝖢s (with asymptotically equivalent parameters).

Our results demonstrate a definitive, strong separation between locality and amortized locality for Insdel 𝖫𝖣𝖢s. In particular, Gupta [12] rules out the existence of any Insdel 𝖫𝖣𝖢 with constant locality and error tolerance, even in relaxed settings and even with exponentially long codewords.

1.2 Technical Overview

Recalling the BBGKZ Compiler

Block et al. [4] give a construction of a compiler taking any Hamming 𝖫𝖣𝖢 and compiling it into an Insdel 𝖫𝖣𝖢. We refer to their construction as the BBGKZ compiler. In the compiled encoding procedure, the message 𝒙 is first encoded under the Hamming 𝖫𝖣𝖢 as codeword 𝒚, and 𝒚 is then partitioned into size-τ Hamming blocks 𝒚 = 𝒘1 ∘ ⋯ ∘ 𝒘B. Each Hamming block 𝒘j, along with its index j, is encoded using a constant rate Insdel code with the special property that the relative Hamming weight of every logarithmic-sized interval is at most 2/5 (the Schulman-Zuckerman code [23] satisfies this property). Lastly, each block is pre/postpended with a zero buffer 0^{ϕτ} of length ϕτ, where ϕ ∈ (0,1), resulting in the Insdel block 𝒘′j and the Insdel codeword 𝒀 = 𝒘′1 ∘ ⋯ ∘ 𝒘′B. Intuitively, the zero buffers 0^{ϕτ} help to identify the beginning/end of each block after insertions/deletions.

Then, the compiled Insdel local decoding procedure, when given oracle access to the (possibly corrupted) Insdel codeword 𝒀~, attempts to simulate the Hamming decoder with access to a corrupted codeword 𝒚~ that is close to the Hamming Codeword 𝒚 in Hamming Distance. Specifically, when the Hamming decoder queries symbol i of its expected oracle access to (possibly corrupted) codeword 𝒚~, the Insdel decoding procedure will identify the block 𝒘j which contains 𝒚[i], (attempt to) locate the corresponding padded Insdel block 𝒘j, undo the padding, run the Insdel decoder to recover the pre-compiled block 𝒘j, and return the corresponding symbol 𝒚~[i]. This query simulation process is facilitated by a subroutine we refer to as the RecoverBlock procedure which (attempts to) locate a particular Insdel block 𝒘j in the presence of corruptions using a noisy binary search procedure.

Challenge 1: RecoverBlock is not Amortizable

Our first challenge in constructing amortizable Insdel 𝖫𝖣𝖢s is the observation that RecoverBlock performs asymptotically strictly more queries than the number of symbols it recovers. Since the codeword 𝒀 may contain insertions and deletions, in order for RecoverBlock to locate the Hamming decoder’s queries, it performs a noisy binary search and makes O(τ + τ·polylog(n)) queries to recover the block corresponding to each query made by the Hamming decoder. The Hamming decoder only uses one of these symbols, but even if the decoder utilized all τ symbols from the recovered block, the amortized locality would still be at least Ω(polylog(n)) per symbol recovered. Thus, we cannot achieve constant amortized locality by using the BBGKZ compiler as a black box.

Our insight to address this challenge is that RecoverBlock can be modified to recover a consecutive interval of blocks rather than just a single block. We refer to this modified procedure as RecoverBlocks, where on input block interval [a,b] := {a, …, b}, RecoverBlocks recovers the pre-compiled blocks 𝒘a, …, 𝒘b while making at most O((b−a+1)τ + τ·polylog(n)) queries. Hence, as long as b−a+1 ≥ polylog(n), the ratio of recovered symbols to queried symbols will be constant.
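As a quick sanity check (a worked calculation from the query bound just stated, not a new claim), the per-call ratio of queries to recovered symbols is

\[ \frac{O\big((b-a+1)\tau + \tau\,\mathrm{polylog}(n)\big)}{(b-a+1)\,\tau} \;=\; O\!\left(1 + \frac{\mathrm{polylog}(n)}{b-a+1}\right), \]

which is O(1) once b−a+1 ≥ polylog(n).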

Challenge 2: Proving Correctness of RecoverBlocks

We show that the modified procedure RecoverBlocks preserves correctness. That is, a majority of the (corrupted) Insdel blocks 𝒘~j satisfy the property that running RecoverBlocks on any interval [a,b] that contains j (i.e., j ∈ [a,b]) will yield the original Hamming block 𝒘j except with negligible probability. Block et al. [4] proved that the procedure RecoverBlock will correctly find any local-good block and that most blocks will be local-good as long as the number of corruptions is bounded. The main technical challenge stems from an inherent mismatch for RecoverBlocks, because an interval [a,b] may contain a mixture of blocks that are (resp. are not) local-good. We show that RecoverBlocks successfully recovers every local-good block in its interval except with negligible probability.

We prove correctness in a similar manner to the analysis of Block et al. [4]. We also utilize the notion of local-good blocks: if block 𝒘j is (θ,γ)-local-good, then it has at most γτ corruptions, with the additional desirable property that any block interval [a,b] surrounding the block, i.e., j ∈ [a,b], has at least (1−θ)×(b−a+1) blocks that also have at most γτ corruptions. Utilizing the analysis of [4] we can pick our parameters to ensure that at least a (1−δh) fraction of our blocks are (θ,γ)-local good. What we prove is that if block 𝒘j is (θ,γ)-local good and j ∈ [a,b], then calling RecoverBlocks on the interval [a,b] will recover the original (uncorrupted) 𝒘j with high probability, along with every other (θ,γ)-good block in the interval [a,b]. Thus, for any partition [a1,b1], …, [az,bz] of [B], calling RecoverBlocks on each interval [ai,bi] would recover the correct (uncorrupted) 𝒘j for at least (1−δh)B blocks. We will use this observation to reason about the correctness of our decoding procedure. Intuitively, we can view the compiled Insdel decoder as simulating the Hamming decoder with a corrupted codeword whose fractional Hamming distance from the original codeword is at most δh.

Challenge 3: Consecutive Hamming Queries Needed For Amortized Decoding

The next challenge is utilizing the RecoverBlocks procedure to obtain amortized locality in the Insdel local decoding procedure. Similar to the BBGKZ compiler, our Insdel decoder with oracle access to the (possibly corrupted) codeword 𝒀~ simulates the compiled Hamming decoder’s oracle access to a (possibly corrupted) codeword 𝒚~. Our hope is that if the underlying Hamming 𝖫𝖣𝖢 is amortizable then the compiled Insdel 𝖫𝖣𝖢 will also be amortizable. In particular, we hope to group the Hamming decoder’s queries q1, …, qℓ together into corresponding (disjoint) index intervals [a1,b1], …, [ap,bp] such that (1) ∑i (bi+1−ai) = Θ(ℓ), (2) {q1, …, qℓ} ⊆ ∪i [ai,bi], and (3) the average size (1/p)∑i (bi+1−ai) of each interval is large enough that we can amortize over the calls to RecoverBlocks, e.g., we need to ensure that the average interval size is at least (1/p)∑i (bi+1−ai) = Ω(polylog n). Unfortunately, in general, we cannot place any such structural guarantees on the queries made by an (amortizable) Hamming 𝖫𝖣𝖢.

We address this problem by introducing the notion of a t-consecutive interval querying Hamming 𝖫𝖣𝖢 decoder which is amortization-friendly under our Insdel compiler. Intuitively, the local decoder for a t-consecutive interval querying Hamming 𝖫𝖣𝖢 accesses the (possibly corrupted) codeword 𝒚~ by querying t-sized intervals instead of individual symbols, i.e., the decoder may submit the query [i, i+t−1] and the output will be 𝒚~[i, i+t−1]. In more detail, the Hamming decoder takes as input an interval [L,R] of message symbols we would like to recover, makes queries for t-sized intervals of the (possibly corrupted) codeword 𝒚~, and then attempts to output 𝒙[L,R]. Note that if the Hamming decoder makes ℓ/t interval queries to 𝒚~ this still corresponds to locality ℓ, as the total number of codeword symbols accessed is t×(ℓ/t) = ℓ.

Correspondingly, when the Hamming decoder is t-consecutive interval querying, our Insdel decoder will make at most 2ℓ/t calls to RecoverBlocks on block intervals covering t codeword symbols each. The resulting query complexity is roughly ℓ×(τ log³n / t + O(1)); in other words, the query complexity blow-up is by a factor of O(τ log³n / t). This implies that if there exists a t-consecutive interval querying Hamming decoder for t = Ω(τ log³ n) with constant amortized locality, our compiled Insdel construction will have constant amortized locality. Thus, the primary question is whether or not it is possible to construct t-consecutive interval querying Hamming 𝖫𝖣𝖢s with constant amortized locality. We leave this as an open question for worst-case errors, but show that it is possible in settings where the channel is computationally/resource restricted.
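Spelling out the bookkeeping behind the previous paragraph (a worked calculation using the per-call cost of RecoverBlocks):

\[ \frac{2\ell}{t} \cdot O\!\left(\frac{t}{\tau}\cdot \tau + \tau\log^3 n\right) = O\!\left(\ell + \frac{\ell\,\tau\log^3 n}{t}\right), \]

so choosing t = Ω(τ log³ n) keeps the total query count at O(ℓ) and the amortized locality constant.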

Challenge 4: A 𝒕-consecutive interval querying 𝗮𝗟𝗗𝗖 in Private-Key Settings

We present a t-consecutive interval querying private-key amortized Hamming 𝖫𝖣𝖢 (𝗉𝖺𝖫𝖣𝖢) construction with constant rate, constant amortized locality, and constant error tolerance, i.e., ideal parameters. As a starting point we use the one-time private-key Hamming 𝖫𝖣𝖢 constructed by Ostrovsky et al. [21]. Blocki and Zhang show that this construction is amortizable and that it may be extended to multi-round secret-key reuse with cryptographic building blocks [7]. For simplicity, we focus on the single round construction, which is information-theoretically secure as long as the channel is computationally bounded.

The single round construction first generates the secret key as a randomly drawn permutation π and a random string 𝒓. To encode the message 𝒙, it is partitioned into blocks 𝒙 = 𝒆1 ∘ ⋯ ∘ 𝒆B and each block 𝒆j is encoded as 𝒆′j = 𝖤𝗇𝖼(𝒆j) by a constant rate, constant error tolerance Hamming code 𝖤𝗇𝖼. Lastly, the bits of the encoded blocks are permuted by π and the random string 𝒓 is xor’d, i.e., the resulting codeword is 𝒚 = π(𝒆′1 ∘ ⋯ ∘ 𝒆′B) ⊕ 𝒓. The local decoder can recover any block 𝒆j by finding the indices of its corresponding encoded block 𝒆′j, reversing the permutation and random masking, and running the Hamming code decoder 𝖣𝖾𝖼 to recover 𝒆j. Unfortunately, the local decoder is not t-consecutive interval querying since its queries are not in a contiguous interval. The main issue is that the application of the permutation π removes any pre-existing block structure in the codeword 𝒚, i.e., if the local decoder wants to query for consecutive (pre-permutation) code symbols (𝒆′1 ∘ ⋯ ∘ 𝒆′B)[s+1, s+ℓ], it would have to query 𝒚[π(s+1)], …, 𝒚[π(s+ℓ)], which with high probability are no longer consecutive.

A first attempt to remedy this issue is to permute at the block level instead of the bit level, i.e., we draw a permutation π over [B] and instead set the codeword as 𝒚 = (𝒆′π(1) ∘ ⋯ ∘ 𝒆′π(B)) ⊕ 𝒓. While this admits contiguous access when recovering any block 𝒆′j, it defeats the original purpose of the permutation. The permutation π (and the one-time pad 𝒓) ensure that (whp) an adversarial channel cannot produce too many corruptions in any individual block. If we permute at the block level then it is trivial for an attacker to corrupt entire blocks. In particular, the adversary could focus their entire error budget on corrupting symbols at the start of the codeword. This simple strategy will always corrupt a constant fraction of the blocks 𝒆′j. Thus, the proposed code will not satisfy correct decoding.

Our primary insight is that we can apply the permutation at an intermediate “sub-block” granularity to ensure correctness and simultaneously support consecutive interval queries. In particular, each encoded block 𝒆′j, for j ∈ [B], is further partitioned into β sub-blocks 𝒆′j = (ϵ(j−1)β+1 ∘ ⋯ ∘ ϵjβ) and the permutation is applied over the sub-blocks ϵr, for r ∈ [Bβ], rather than over the Hamming blocks 𝒆′j themselves, i.e., the codeword is

𝒚 = (ϵπ(1) ∘ ⋯ ∘ ϵπ(Bβ)) ⊕ 𝒓.

Figure 1: An overview of our private-key Hamming 𝖺𝖫𝖣𝖢 construction, satisfying correct decoding (of any Hamming block 𝒆j) and consecutive interval querying (of any sub-block ϵπ(r)) simultaneously. The message 𝒙 = 𝒆1 ∘ ⋯ ∘ 𝒆B is encoded block-wise into 𝒆′1, …, 𝒆′B, each 𝒆′j is split into β sub-blocks, and the sub-blocks are permuted by π and masked with 𝒓 to form 𝒚 = (ϵπ(1) ∘ ⋯ ∘ ϵπ(Bβ)) ⊕ 𝒓.

Intuitively, while a constant fraction of the sub-blocks ϵr can still be corrupted with non-negligible probability, as long as the parameter β is suitably large (e.g., β = O(polylog n)), we can ensure that, except with negligible probability, the overall error in each encoded block 𝒆′j will still be at most a δ-fraction. Thus, except with negligible probability we can recover 𝒆j (the uncorrupted content in block j) by making β consecutive interval queries to obtain ϵr for each r ∈ [(j−1)β+1, jβ]. This is exactly what we needed to achieve amortization with our ideal Insdel compiler.

The above construction assumes that the sender and receiver share a secret key and only allows the sender to transmit one encoded message. However, if the channel is probabilistic polynomial time (PPT) then the construction can be extended to allow the sender to transmit multiple messages using standard cryptographic tools, e.g., pseudo-random functions [21, 7]. Furthermore, if we assume stronger resource bounds on the channel (space, sequential depth, etc.) then one can use cryptographic puzzles to completely eliminate the requirement for a secret key, using ideas from prior works, e.g., [1, 6, 3].

Putting it All Together: Ideal Insdel 𝖺𝖫𝖣𝖢s in Relaxed Settings

With the building blocks established above, namely the 𝖺𝖫𝖣𝖢 Hamming-to-Insdel compiler and the consecutive interval querying, ideal Hamming 𝗉𝖺𝖫𝖣𝖢, we construct the first ideal Insdel 𝖺𝖫𝖣𝖢 in relaxed settings (private-key and resource-bounded). Naturally, we first construct a private-key, ideal Insdel 𝖺𝖫𝖣𝖢 by directly applying our compiler to the t-consecutive interval querying, ideal Hamming 𝗉𝖺𝖫𝖣𝖢. For compiler block size τ, our construction achieves constant amortized locality by setting t = Ω(τ log³ n) as discussed previously.

The remaining challenge lies in showing that the compiler’s correctness is preserved in private-key settings. Intuitively, we argue via a reduction to the security of the underlying Hamming 𝗉𝖺𝖫𝖣𝖢. Suppose there exists a probabilistic polynomial time adversary 𝒜 who introduces an Insdel corruption pattern into the compiled codeword 𝒀 of their chosen message 𝒙 that causes a decoding error with non-negligible probability. Then, we can construct a probabilistic polynomial time adversary who introduces a Hamming error pattern into the (pre-compiled) codeword 𝒚 by (1) using our compiler procedure to transform the Hamming codeword 𝒚 into the compiled Insdel codeword 𝒀, (2) giving the compiled Insdel codeword 𝒀 to adversary 𝒜 and receiving the corrupted Insdel codeword 𝒀~, and finally (3) constructing the pre-compiled codeword 𝒚~ by the same simulation process performed by the compiled decoder via the RecoverBlocks procedure. To streamline the reduction argument, we draw inspiration from Block and Blocki [3] and repackage our compiler into a convenient form for the reduction. Altogether, we obtain an ideal Insdel 𝗉𝖺𝖫𝖣𝖢.

Next, we generalize the 𝗉𝖺𝖫𝖣𝖢 construction to construct an 𝖺𝖫𝖣𝖢 for resource-bounded settings (𝗋𝖺𝖫𝖣𝖢). Similarly, our first step is constructing a t-consecutive interval querying Hamming 𝗋𝖺𝖫𝖣𝖢. To do this, we make use of the Hamming private-key-to-resource-bounded 𝖫𝖣𝖢 compiler by Ameri et al. [1], which was observed by Blocki and Zhang to be amortizable with only a constant blow-up to the amortized locality [7]. At a high level, their compiler makes use of a cryptographic primitive known as a cryptographic puzzle to embed the secret key within the codeword.

More specifically, a cryptographic puzzle consists of two algorithms 𝖯𝗎𝗓𝗓𝖦𝖾𝗇 and 𝖯𝗎𝗓𝗓𝖲𝗈𝗅𝗏𝖾. On input seed 𝒄, 𝖯𝗎𝗓𝗓𝖦𝖾𝗇 outputs a puzzle 𝒁 with solution 𝒄, i.e., 𝖯𝗎𝗓𝗓𝖲𝗈𝗅𝗏𝖾(𝒁) = 𝒄. The security requirement states that any adversary 𝒜 in a defined algorithm class cannot solve the puzzle 𝒁 and cannot even distinguish between the tuples (𝒁0, 𝒄0, 𝒄1) and (𝒁1, 𝒄0, 𝒄1) for random 𝒄i and 𝒁i = 𝖯𝗎𝗓𝗓𝖦𝖾𝗇(𝒄i).
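To make the 𝖯𝗎𝗓𝗓𝖦𝖾𝗇/𝖯𝗎𝗓𝗓𝖲𝗈𝗅𝗏𝖾 interface concrete, here is a minimal Python sketch using an iterated-hash chain as a stand-in puzzle; the function names, the difficulty parameter T, and the hash-chain instantiation are our own illustrative assumptions (a real instantiation must be hard for the specific resource bound, which a plain hash chain does not by itself guarantee).

    import hashlib, os

    def _hash_chain(seed: bytes, T: int) -> bytes:
        # T sequential SHA-256 evaluations; intuitively, recovering the end of the
        # chain requires roughly T sequential steps.
        h = seed
        for _ in range(T):
            h = hashlib.sha256(h).digest()
        return h

    def puzz_gen(c: bytes, T: int = 10**6):
        # Hide the solution c (at most 32 bytes here) under the end of a hash chain.
        salt = os.urandom(16)
        pad = _hash_chain(salt, T)
        masked = bytes(a ^ b for a, b in zip(c, pad[:len(c)]))
        return (salt, T, masked)                   # the puzzle Z

    def puzz_solve(Z) -> bytes:
        salt, T, masked = Z
        pad = _hash_chain(salt, T)
        return bytes(a ^ b for a, b in zip(masked, pad[:len(masked)]))   # recovers c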

Then, the Hamming 𝗋𝖺𝖫𝖣𝖢 compiler takes a 𝗉𝖺𝖫𝖣𝖢 (𝖦𝖾𝗇𝗉, 𝖤𝗇𝖼𝗉, 𝖣𝖾𝖼𝗉), and on message 𝒙 will (1) generate the secret key 𝗌𝗄 ← 𝖦𝖾𝗇𝗉(1^λ; 𝒄) with some random coins 𝒄, (2) compute the 𝗉𝖺𝖫𝖣𝖢 encoding of the message 𝒚𝗉 ← 𝖤𝗇𝖼𝗉,𝗌𝗄(𝒙), (3) compute an encoding 𝒚′ of the puzzle 𝒁 using a code with low locality that recovers the entire message (known as an 𝖫𝖣𝖢), and (4) output 𝒚 = 𝒚𝗉 ∘ 𝒚′. Intuitively, a resource-bounded adversary will not be able to solve the puzzle, while the 𝗋𝖺𝖫𝖣𝖢 decoder 𝖣𝖾𝖼 will have sufficient resources to solve the puzzle, recover the secret key 𝗌𝗄, and run the amortized local decoder 𝖣𝖾𝖼𝗉,𝗌𝗄.
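A compressed Python sketch of steps (1)-(4) follows; gen_p, enc_p, puzz_gen, and enc_low_locality are hypothetical stand-ins for the components named above and are assumed rather than taken from any specific library.

    import os

    def ra_encode(x, gen_p, enc_p, puzz_gen, enc_low_locality):
        # Resource-bounded encoding sketch: hide the paLDC secret key inside a puzzle.
        c = os.urandom(32)                  # random coins for key generation
        sk = gen_p(c)                       # (1) derive the secret key from coins c
        y_p = enc_p(sk, x)                  # (2) paLDC encoding of the message
        Z = puzz_gen(c)                     # puzzle whose solution is the coins c
        y_prime = enc_low_locality(Z)       # (3) encode the puzzle with a low-locality code
        return y_p + y_prime                # (4) concatenate the two parts

    # The decoder would first read and decode y_prime, solve the puzzle to recover c,
    # regenerate sk = gen_p(c), and then run the paLDC local decoder Dec_{p,sk} on y_p.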

We make the additional observation that if the compiler is instantiated with a t-consecutive interval querying 𝗉𝖺𝖫𝖣𝖢, then the resulting 𝗋𝖺𝖫𝖣𝖢 will also be t-consecutive interval querying. Thus, when it is plugged into our amortized Hamming-to-Insdel compiler with t = Ω(τ log³ n), we achieve constant amortized locality. Lastly, the same reduction argument as in the ideal Insdel 𝗉𝖺𝖫𝖣𝖢 construction may be used, with the additional subtle detail of allowing sufficient resources for the reduction. That is, for the class of resource-bounded channels/adversaries that our Insdel 𝖺𝖫𝖣𝖢 is secure against, the underlying Hamming 𝖺𝖫𝖣𝖢 must be secure against a wider class of adversaries, one that additionally includes adversaries 𝒜 who can also compute the simulation process in the private-key reduction.

1.3 Preliminaries

Notation

Let x ← y represent a variable assignment of x as y and x ←$ X denote a uniformly random assignment of x from the space X. Let ∘ denote the concatenation function. For l ≤ r, [l,r] := {l, l+1, …, r} denotes an interval from l to r inclusive of its endpoints. Similarly, [l,r) := {l, l+1, …, r−1} denotes an interval from l to r exclusive of the right endpoint. Over an alphabet Σ, bolded notation 𝒙 denotes vectors/words while non-bolded x ∈ Σ denotes single symbols. For a word 𝒙, x[i] denotes the symbol of 𝒙 at index i, while 𝒙[l,r] := (x[l], x[l+1], …, x[r]) denotes the subword of 𝒙 projected to the indices in the interval [l,r]. It will also be useful to define a shorthand for retrieving symbols from “blocks” of a word 𝒙: 𝒙_i^b := 𝒙[(i−1)b+1, ib] for any block size b and index i.
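For instance, the block shorthand 𝒙_i^b corresponds to the following one-line helper (an illustrative Python snippet; Python sequences are 0-indexed while the paper's words are 1-indexed):

    def block(x, i, b):
        # Return x_i^b = x[(i-1)b+1 .. ib] for a 1-indexed word x given as a Python sequence.
        return x[(i - 1) * b : i * b]

    # Example: the 2nd block of size 3 of "abcdefghi" is "def".
    assert block("abcdefghi", 2, 3) == "def"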

For any words 𝒙, 𝒚 ∈ Σ^n, 𝖧𝖺𝗆(𝒙,𝒚) denotes the Hamming distance between 𝒙 and 𝒚, i.e., |{i ∈ [n] : x[i] ≠ y[i]}|. Further, for 𝒛 ∈ Σ*, 𝖤𝖣(𝒙,𝒛) denotes the edit distance between 𝒙 and 𝒛, i.e., the number of symbol insertions and deletions needed to transform 𝒙 into 𝒛. We say 𝒙 and 𝒚 (resp. 𝒙 and 𝒛) are δ-Hamming-close (resp. δ-edit-close) when 𝖧𝖺𝗆(𝒙,𝒚) ≤ δ|𝒙| (resp. 𝖤𝖣(𝒙,𝒛) ≤ 2δ|𝒙|).
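Both distances can be computed directly from the definitions; the sketch below is a plain Python rendering (edit distance counts only insertions and deletions, so it equals |𝒙| + |𝒛| − 2·LCS(𝒙,𝒛)):

    def hamming(x, y):
        # Ham(x, y): number of positions at which two equal-length words differ.
        assert len(x) == len(y)
        return sum(1 for a, b in zip(x, y) if a != b)

    def edit_distance(x, z):
        # ED(x, z): insertions + deletions to turn x into z (no substitutions),
        # computed via a longest common subsequence table.
        m, n = len(x), len(z)
        lcs = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if x[i - 1] == z[j - 1]:
                    lcs[i][j] = lcs[i - 1][j - 1] + 1
                else:
                    lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
        return m + n - 2 * lcs[m][n]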

We recall the formal definitions of an 𝖫𝖣𝖢 and an amortized 𝖫𝖣𝖢.

Definition 1 (𝖫𝖣𝖢).

An (n,k)-code 𝒞 = (𝖤𝗇𝖼, 𝖣𝖾𝖼) (over Σ) is an (ℓ,δ,ε)-locally decodable code (𝖫𝖣𝖢) for Hamming errors (resp. Insdel errors) if for every 𝒙 ∈ Σ^k, every 𝒚~ ∈ Σ* such that 𝒚~ is δ-Hamming-close (resp. δ-edit-close) to 𝖤𝗇𝖼(𝒙), and every index i ∈ [k], we have Pr[𝖣𝖾𝖼^𝒚~(i) = x[i]] ≥ 1−ε, and 𝖣𝖾𝖼^𝒚~ makes at most ℓ queries to 𝒚~.

Definition 2 (𝖺𝖫𝖣𝖢 [7]).

An (n,k)-code 𝒞 = (𝖤𝗇𝖼, 𝖣𝖾𝖼) (over Σ) is an (α,κ,δ,ε)-amortizable LDC (𝖺𝖫𝖣𝖢) for Hamming errors (resp. Insdel errors) if for every 𝒙 ∈ Σ^k, every 𝒚~ ∈ Σ* such that 𝒚~ is δ-Hamming-close (resp. δ-edit-close) to 𝖤𝗇𝖼(𝒙), and every interval [L,R] ⊆ [k] of size R−L+1 ≥ κ, we have Pr[𝖣𝖾𝖼^𝒚~(L,R) = (x[i] : i ∈ [L,R])] ≥ 1−ε, and 𝖣𝖾𝖼^𝒚~ makes at most α×(R−L+1) queries to 𝒚~.

Throughout the paper, it will be assumed codes are over the binary alphabet i.e. Σ={0,1} unless specified otherwise. The primary metrics of interest for Hamming/Insdel 𝖺𝖫𝖣𝖢s are the rate k/n, amortized locality α, and the error/edit tolerance δ.

1.4 Organization

We start by reintroducing and modifying the encoding and decoding procedures of the BBGKZ compiler in Section 2. The modification takes several steps organized into the following subsections: in Subsection 2.1, the notion of good blocks and intervals is introduced in order to analyze our new RecoverBlocks procedure in Subsection 2.2. We then present the overall construction of our Hamming-to-Insdel 𝖺𝖫𝖣𝖢 compiler in Subsection 2.3.

In Section 3, we present our construction of an ideal Insdel 𝖺𝖫𝖣𝖢 in settings where the encoder/decoder share a secret key (𝗉𝖺𝖫𝖣𝖢). More specifically, this construction first relies on a modified Hamming 𝗉𝖺𝖫𝖣𝖢 construction in Subsection 3.1, where the local decoder performs consecutive interval queries. The ideal Hamming 𝗉𝖺𝖫𝖣𝖢 is then compiled into our ideal Insdel 𝗉𝖺𝖫𝖣𝖢 in Subsection 3.2. Lastly, in Section 4, we present our construction of an ideal Insdel 𝖺𝖫𝖣𝖢 in settings where the channel is resource-bounded (𝗋𝖺𝖫𝖣𝖢). Section 4 follows a structure symmetric to the previous section: we present a consecutive interval querying Hamming 𝗋𝖺𝖫𝖣𝖢 construction in Subsection 4.1, which is then compiled into an ideal Insdel 𝗋𝖺𝖫𝖣𝖢 construction in Subsection 4.2.

2 The Hamming 𝖺𝖫𝖣𝖢 to Insdel 𝖺𝖫𝖣𝖢 Compiler

Our results crucially rely on a modified BBGKZ compiler suited for amortized local decoding. We start with the encoding procedure. Interestingly, we will not need to make any changes to the encoding procedure of the BBGKZ compiler, which takes as parameters the block size τ and padding rate ϕ ∈ (0,1): the message 𝒙 is first encoded into a codeword 𝒚 using a Hamming 𝖫𝖣𝖢. Next, the codeword 𝒚 is broken up into blocks of equal size τ as 𝒚 = 𝒘1 ∘ ⋯ ∘ 𝒘B, where each block together with its index (j ∘ 𝒘j) is 1) encoded with a specifically chosen Insdel code and 2) pre- and post-pended with ϕτ many 0s. The result is the overall codeword 𝒀. Formally, we describe the encoder 𝖤𝗇𝖼 as first computing the Hamming codeword 𝒚, then applying a Hamming-to-Insdel compiler 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾 to transform it into an Insdel codeword 𝒀. The encoder 𝖤𝗇𝖼 and compiler 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾 are formally described below.

Let 𝒞h=(𝖤𝗇𝖼h,𝖣𝖾𝖼h) be a Hamming 𝖫𝖣𝖢. Let 𝒞insdel=(𝖤𝗇𝖼insdel,𝖣𝖾𝖼insdel) be an Insdel binary code.

𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾ϕ(𝒚)

Input: Hamming codeword 𝒚 ∈ {0,1}^m, padding rate ϕ ∈ (0,1).

Output: Insdel codeword 𝒀 ∈ {0,1}^n.

  1. Parse 𝒚 = 𝒘1 ∘ ⋯ ∘ 𝒘_{|𝒚|/τ} where 𝒘j ∈ {0,1}^τ for all j ∈ [|𝒚|/τ].

  2. Compute 𝒘′j ← 0^{ϕτ} ∘ 𝖤𝗇𝖼insdel(j ∘ 𝒘j) ∘ 0^{ϕτ} for all j ∈ [|𝒚|/τ].

  3. Output 𝒀 = 𝒘′1 ∘ ⋯ ∘ 𝒘′_{|𝒚|/τ}.

𝖤𝗇𝖼(𝒙)

Parameters: Block size τ, padding rate ϕ ∈ (0,1).

Input: message 𝒙 ∈ {0,1}^k.

Output: Insdel codeword 𝒀 ∈ {0,1}^n.

  1. Compute 𝒚 ← 𝖤𝗇𝖼h(𝒙).

  2. Output 𝒀 ← 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾ϕ(𝒚).
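To fix ideas, a Python rendering of 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾 is sketched below; enc_insdel stands in for the chosen Insdel encoder (e.g., an SZ-style code) and the exact serialization of the block index is an assumed detail, so this shows only the block-and-buffer layout.

    def enc_compile(y: str, tau: int, phi: float, enc_insdel) -> str:
        # Partition y into tau-sized blocks, Insdel-encode (index, block), and pad with zero buffers.
        assert len(y) % tau == 0
        buffer = "0" * int(phi * tau)
        blocks = [y[j * tau:(j + 1) * tau] for j in range(len(y) // tau)]
        out = []
        for j, w in enumerate(blocks, start=1):
            index_bits = format(j, "b")          # serialize the block index j (format is illustrative)
            out.append(buffer + enc_insdel(index_bits + w) + buffer)
        return "".join(out)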

Our main modifications to the BBGKZ compiler will be in the construction of the decoder 𝖣𝖾𝖼. Recall that the original BBGKZ decoder simulated oracle access for the Hamming 𝖫𝖣𝖢 decoder 𝖣𝖾𝖼h by recovering the Hamming blocks corresponding to the queries of 𝖣𝖾𝖼h. Since insertions and deletions in the corrupted codeword 𝒀~ modify the structure of the blocks, this simulation is underpinned by a noisy binary search procedure we denote as the RecoverBlock procedure. More specifically, the RecoverBlock procedure on input block index j iteratively cuts an initial search radius [1, ñ] by a fractional amount, until the search radius is at most a constant factor larger than the block size τ. To decide how to cut the search radius on each iteration, the procedure samples N = Θ(log² n) block indices by decoding N noisy blocks within the middle of the search radius to pairs of the form (jp ∘ 𝒘jp). The median j𝗆𝖾𝖽 of the recovered block indices j1, …, jN is compared to the desired block index j: depending on whether j < j𝗆𝖾𝖽, RecoverBlock cuts off a fraction of the front or the back of the search radius.

Since the search radius decreases by a constant fraction in each iteration, there are O(log n) iterations, and in each iteration Θ(log² n) blocks of size O(τ) are read. Thus, the query complexity to answer a single query of 𝖣𝖾𝖼h is O(τ log³ n).
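Multiplying out the three factors gives the stated per-query cost:

\[ \underbrace{O(\log n)}_{\text{iterations}} \times \underbrace{\Theta(\log^2 n)}_{\text{blocks sampled per iteration}} \times \underbrace{O(\tau)}_{\text{queries per Block-Decode}} = O(\tau \log^3 n). \]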

Unfortunately, amortization is not feasible with the current Hamming decoder simulation method. Intuitively, since the RecoverBlock procedure must make a multiplicative polylog n factor more queries than the τ symbols it recovers, the amortized locality will be at least Ω(polylog n).

Our solution will recover multiple blocks at once, using the observation that the RecoverBlock procedure can be extended from recovering a single block to recovering an interval of contiguous blocks. That is, we will modify the noisy binary search procedure to take in a block interval [a,b] with a ≤ b ∈ [B], and we change the stopping condition to be linear in the interval size b−a+1. Further, the decision on how to shrink the search radius will take into account the respective endpoints a and b of the interval rather than a single block j.

While this change is relatively straightforward on paper, the analysis of the noisy binary search procedure in the BBGKZ compiler does not immediately carry over to our modified procedure RecoverBlocks. Thus, to prove correctness of the modified RecoverBlocks procedure, we reintroduce the good block analysis from Block et al. [4] and modify their proof structure to account for the recovery of an interval of blocks instead of a single block.

2.1 Good Blocks and Intervals

In this subsection, we introduce good block notation to analyze the modifications we will make to the BBGKZ decoder and its sub-routine RecoverBlock.

Additional Notation

First, we fix notation consistent with the encoder definition above for the rest of the paper. We will refer to the codeword 𝒚 from step 1 as the Hamming encoding and the final codeword 𝒀 as the Insdel encoding, or simply as the encoding. Let 𝒙 ∈ {0,1}^k be the message of size k and let 𝒚 = 𝖤𝗇𝖼h(𝒙) ∈ {0,1}^m be the Hamming encoding of size m. Let B = m/τ be the number of blocks. Define β such that the rate of the overall encoding is 1/β. Similarly, define βinsdel such that the rate of the Insdel code 𝒞insdel is 1/βinsdel. Then, the rate of the Hamming encoding is (βinsdel·logB + 2ϕ)/β. Let δh and δinsdel be the error tolerances of the Hamming encoding and the Insdel code respectively. Let each 𝒘j = 𝒚_j^τ ∈ {0,1}^τ denote Hamming block j. Similarly, each 𝒘′j ∈ {0,1}^{βτ} denotes Insdel block j (or simply block j). Then, the overall codeword 𝒀 ∈ {0,1}^n has length n = βm. We consider any corrupted codeword 𝒀~ ∈ {0,1}^{ñ} such that 𝖤𝖣(𝒀, 𝒀~) ≤ 2δn.

We define the block decomposition as a mapping from codeword symbols to their blocks post corruption.

Definition 3.

A block decomposition of 𝒀~ is a non-decreasing mapping ϕ : [ñ] → [B].

By definition, the pre-image of each block decomposition defines a partition of [ñ] into B intervals {ϕ^{−1}(j) : j ∈ [B]}. Each of these intervals defines the block structure, where the j’th interval defines the j’th block in the corrupted codeword. In other words, for each j ∈ [B], ϕ^{−1}(j) contains all the indices corresponding to block j.

As in the BBGKZ compiler, our modified decoder will rely on a noisy recovery process, where the decoder needs to locate blocks without knowing the block boundaries, which have been modified by Insdel errors. For any search interval [l,r) ⊆ [ñ], it will be helpful to consider its corresponding block interval, i.e., the smallest block-aligned interval that contains [l,r). Block intervals are defined formally below.

Definition 4 (Block Interval).

The block interval of an interval I = [l,r) ⊆ [ñ] is defined as ∪_{i=l}^{r−1} ϕ^{−1}(ϕ(i)) ⊆ [ñ]. An interval I is a block interval if the block interval of I is I itself. Equivalently, every block interval has the form [a,b] := ∪_{j=a}^{b} ϕ^{−1}(j) for some a, b ∈ [B].

We say a block is good if it does not have too many edit corruptions. Intuitively, good blocks will correspond to Insdel blocks that are correctly decodable, except with negligible probability.

Definition 5 (γ-Good Block).

For γ ∈ (0,1) and j ∈ [B], a block j is γ-good (with respect to 𝒀~) if 𝖤𝖣(𝒀~[ϕ^{−1}(j)], 𝒘′j) ≤ γτ. Otherwise, block j is γ-bad.

Good blocks may be naturally extended to good intervals, where the total edit corruption summed over the blocks within the interval is not too large. Additionally, we constrain the number of bad blocks in any good interval.

Definition 6 ((θ,γ)-Good Interval).

A block interval [a,b] is (θ,γ)-good (with respect to 𝐘~) if the following hold:

  1. 1.

∑_{j=a}^{b} 𝖤𝖣(𝒀~[ϕ^{−1}(j)], 𝒘′j) ≤ γτ(b−a+1).

  2. 2.

The number of γ-bad blocks in the block interval [a,b] is at most θ×(b−a+1).

Otherwise, the interval is (θ,γ)-bad.

Lastly, we define the notion of a locally good block, which captures the idea that any interval including such a block must also be good.

Definition 7 ((θ,γ)-Local Good Block).

For θ, γ ∈ (0,1), block j ∈ [B] is (θ,γ)-local good (with respect to 𝒀~) if for every a, b ∈ [B] such that a ≤ j ≤ b, the interval [a,b] is (θ,γ)-good. Otherwise, block j is (θ,γ)-locally bad.

2.2 Extending RecoverBlock to RecoverBlocks

We start by recalling specific details of the RecoverBlock procedure. The RecoverBlock procedure for recovering a single block takes as input the desired block index j ∈ [B] and the initial search radius [1, ñ]. The procedure iteratively splits the search radius into three contiguous, linear-sized segments, a beginning, a middle, and an end segment, and chooses to cut either the beginning or the end segment. To decide which, it randomly samples N = Θ(log² n) indices i1, …, iN within the middle segment and recovers their respective block indices j1, …, jN. More specifically, for each index ib, a subroutine Block-Decode is called which queries an O(τ)-sized interval around ib to decode the noisy block containing (j ∘ 𝒘j). The median j𝗆𝖾𝖽 of these retrieved indices is then calculated and, depending on whether j < j𝗆𝖾𝖽 or j ≥ j𝗆𝖾𝖽, either the end or the beginning segment is cut from the search interval, respectively. The binary search continues until the search radius is linear in the size of the block, for a sufficiently small constant.

Lastly, an interval decoding procedure is then run over the symbols in the final search interval, where it is guaranteed to recover any local-good block located within the interval. The subroutines used, Block-Decode and the final interval decoding, are based on local and global variations of the SZ-code decoding algorithm, finding the pre/post-pended 0s on the Insdel blocks to locate the blocks in between.

Intuitively, we may modify RecoverBlock to recover an interval [a,b] ⊆ [B] of contiguous Hamming blocks instead of a single block by stopping the search once the search radius is linear in the size of the interval rather than the size of a single block. Lastly, we run a procedure we call Sim-RecoverBlock to recover each block j ∈ [a,b] within the final search radius [l,r]. If we naively called RecoverBlock^𝒀~(l,r,q) repeatedly for every block q ∈ [a,b], then each call to RecoverBlock would execute its own noisy binary search in the interval [l,r], resulting in redundant queries which would blow up our amortized locality, i.e., the total number of queries would be too large: Ω((r−l)·polylog(r−l)). Instead, we simply make r−l+1 queries to 𝒀~ to recover the entire substring 𝒀~[l,r] and then run Sim-RecoverBlock(𝒀~[l,r], l, r, q) for each q ∈ [a,b], where Sim-RecoverBlock(𝒀~[l,r], l, r, q) uses the hardcoded string 𝒀~[l,r] to simulate the execution of RecoverBlock^𝒀~(l,r,q). We call this overall extended procedure RecoverBlocks, and present the construction below (modifications to the RecoverBlock procedure are in blue).

RecoverBlocks^𝒀~(l, r, a, b)

Parameters: Block size τ, γ ∈ (0,1)

Input: Search range l < r ∈ [ñ+1], block range a ≤ b ∈ [B]

Output: pre-compiled blocks 𝒔 ∈ {0,1}^{(b−a+1)τ}

Let c = 36(βγ).

Let ρ = min{ (1/4)×(β−γ)/(β+γ), 1 − (3/4)×(β+γ)/(β−γ) }.

  1. While r−l > c(b−a+1)τ:

    (a) Let m1 ← (1−ρ)l + ρr, and m2 ← ρl + (1−ρ)r.

    (b) For p = 1, …, N = Θ(log² n):

      i. randomly sample ip ←$ [m1, m2)

      ii. set S ← Block-Decode^𝒀~(ip)

      iii. If S = ⊥, ignore and continue

      iv. Else, parse (jp, 𝒘jp) ← S

    (c) Let jmed ← median(j1, …, jN) (ignoring ⊥ values)

    (d) If a > jmed, set l = m1. Otherwise, set r = m2.

  2. Make (r−l+1) queries to recover 𝒀~[l,r].

  3. Output [Sim-RecoverBlock(𝒀~[l,r], l, r, q) : q ∈ [a,b]].
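The control flow of RecoverBlocks can also be summarized in Python. In the sketch below, block_decode and sim_recover_block are assumed black boxes for the SZ-style subroutines, and the constants c, rho, and N are left as parameters; it is a structural sketch of the noisy binary search, not a faithful implementation.

    import random
    from statistics import median

    def recover_blocks(query, n_tilde, a, b, tau, c, rho, N, block_decode, sim_recover_block):
        # Noisy binary search for the region holding blocks a..b, then decode them all at once.
        # `query(l, r)` returns the (possibly corrupted) substring Y~[l..r].
        l, r = 1, n_tilde + 1
        while r - l > c * (b - a + 1) * tau:
            m1 = int((1 - rho) * l + rho * r)
            m2 = int(rho * l + (1 - rho) * r)
            indices = []
            for _ in range(N):                    # N = Theta(log^2 n) samples from the middle segment
                i = random.randrange(m1, m2)
                decoded = block_decode(query, i)  # tries to decode the block around position i
                if decoded is not None:
                    indices.append(decoded[0])    # keep only the recovered block index
            if indices and a > median(indices):
                l = m1                            # target lies to the right: drop the front segment
            else:
                r = m2                            # otherwise drop the back segment
        window = query(l, r)                      # recover the whole final search window at once
        return [sim_recover_block(window, l, r, q) for q in range(a, b + 1)]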

The first step is to prove that RecoverBlocks on input (1, ñ+1, a, b), for any block interval [a,b] ⊆ [B], will return all Hamming blocks 𝒘j within that interval, provided those blocks are good. We will then argue that, by a careful setting of parameters, the number of correctly recovered blocks is sufficient for an overall correct decoding process. We refer to the full version [8] for the proofs of the following theorem and lemma on the correctness and query complexity of the RecoverBlocks procedure.

Theorem 8.

Let 𝒫 = {[a_i, b_i] : 1 ≤ a_i ≤ b_i < a_{i+1} ≤ B} be any partition of [B]. For each [a_i, b_i] ∈ 𝒫 and each j ∈ [a_i, b_i], let 𝒔_j^{[a_i,b_i]} be the random variable defined as the (j−a_i+1)-th τ-sized block of the output of RecoverBlocks^𝒀~(1, ñ+1, a_i, b_i). If δ = δh·ϕ·γ/(8β(1+1/θ)), then

Pr[ ∑_{[a_i,b_i] ∈ 𝒫} ∑_{j=a_i}^{b_i} 1(𝒔_j^{[a_i,b_i]} ≠ 𝒘j) ≥ δh·B ] ≤ negl(n),

where the probability is taken over the joint distribution {𝒔_j^{[a_i,b_i]} : [a_i, b_i] ∈ 𝒫, j ∈ [a_i, b_i]}.

Lemma 9.

For any word 𝒀~ ∈ {0,1}^{ñ}, interval [a,b] ⊆ [B], and constants β, γ > 0, on input (1, ñ+1, a, b), RecoverBlocks has query complexity O((b−a)τ + τ log³ n).

2.3 The Amortized Insdel Decoder

We construct an amortized Insdel local decoder 𝖣𝖾𝖼 utilizing RecoverBlocks to simulate oracle access for the underlying Hamming decoder 𝖣𝖾𝖼h. Our Insdel decoder 𝖣𝖾𝖼 follows a similar procedure to the BBGKZ compiler decoder. 𝖣𝖾𝖼 calls the underlying Hamming decoder 𝖣𝖾𝖼h, and answers its queries i1, …, iq ∈ [m] to the (possibly corrupted) Hamming codeword 𝒚~ by calling RecoverBlocks to recover the corresponding Hamming blocks and queried symbols. Correctness of our procedure will follow from Theorem 8, which roughly states that the symbols returned by RecoverBlocks are equivalent to what 𝖣𝖾𝖼h expects from true oracle access to the (possibly corrupted) Hamming codeword 𝒚~.

Our next step is to ensure that the queries made by our Insdel decoder amortize over the RecoverBlocks calls, by restricting our Hamming decoder to only make contiguous queries, where each contiguous interval is of a fixed size. Suppose the Hamming decoder 𝖣𝖾𝖼h is guaranteed to make queries in contiguous intervals of size t, say [u1+1, u1+t], …, [up+1, up+t] ⊆ [m] for some t. Intuitively, when the Insdel decoder 𝖣𝖾𝖼 is simulating the Hamming decoder’s queries with RecoverBlocks, it can make O(p) calls to RecoverBlocks to recover codeword intervals of size O(t). For large enough t, the number O(p) of RecoverBlocks calls decreases and the Insdel decoder’s queries amortize.

We formally define a Hamming decoder with this desirable property as a t-consecutive interval querying Hamming decoder.

Definition 10 (Consecutive Interval Querying).

Consider any (m,k)-code 𝒞 = (𝖤𝗇𝖼, 𝖣𝖾𝖼) that is an (α,κ,δ,ε)-𝖺𝖫𝖣𝖢. For any word 𝒚~, interval [L,R] ⊆ [k] with R−L+1 ≥ κ, and random coins r, let 𝖰𝗎𝖾𝗋𝗒(𝒚~, [L,R], r) := {i1, …, iq} ⊆ [m] denote the codeword indices queried when 𝖣𝖾𝖼^𝒚~(L,R) is run with randomness r. Code 𝒞 and decoder 𝖣𝖾𝖼 are t-consecutive interval querying if the set 𝖰𝗎𝖾𝗋𝗒(𝒚~, [L,R], r) can be partitioned into q/t disjoint intervals [u1,v1], …, [u_{q/t}, v_{q/t}] of size t, i.e.,

  1. vj − uj + 1 = t for all j ≤ q/t,

  2. 𝖰𝗎𝖾𝗋𝗒(𝒚~, [L,R], r) = ∪_{j ≤ q/t} [uj, vj], and

  3. [u_{j1}, v_{j1}] ∩ [u_{j2}, v_{j2}] = ∅ for all j1 ≠ j2.

We now show how to construct the amortized local decoder 𝖣𝖾𝖼 using RecoverBlocks and a t-consecutive interval querying Hamming 𝖺𝖫𝖣𝖢 decoder 𝖣𝖾𝖼h, for a value of t to be specified in the analysis. We choose the SZ-code as the Insdel code 𝒞insdel, which has constant rate 1/βinsdel = Ω(1) and constant error tolerance δinsdel = Θ(1).

Construction 11.

Let 𝒞h = (𝖤𝗇𝖼h, 𝖣𝖾𝖼h) be a t-consecutive interval querying Hamming 𝖫𝖣𝖢 such that τ divides t. Let 𝒞insdel = (𝖤𝗇𝖼insdel, 𝖣𝖾𝖼insdel) be an Insdel binary code.

𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾ϕ(𝒚)

Input: Hamming codeword 𝒚 ∈ {0,1}^m, padding rate ϕ ∈ (0,1).

Output: Insdel codeword 𝒀 ∈ {0,1}^n.

  1. Parse 𝒚 = 𝒘1 ∘ ⋯ ∘ 𝒘_{|𝒚|/τ} where 𝒘j ∈ {0,1}^τ for all j ∈ [|𝒚|/τ].

  2. Compute 𝒘′j ← 0^{ϕτ} ∘ 𝖤𝗇𝖼insdel(j ∘ 𝒘j) ∘ 0^{ϕτ} for all j ∈ [|𝒚|/τ].

  3. Output 𝒀 = 𝒘′1 ∘ ⋯ ∘ 𝒘′_{|𝒚|/τ}.

𝖤𝗇𝖼(𝒙)

Parameters: Block size τ, padding rate ϕ ∈ (0,1).

Input: message 𝒙 ∈ {0,1}^k.

Output: Insdel codeword 𝒀 ∈ {0,1}^n.

  1. Compute 𝒚 ← 𝖤𝗇𝖼h(𝒙).

  2. Output 𝒀 ← 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾ϕ(𝒚).

𝖣𝖾𝖼^𝒀~(L,R)

Parameters: Block size τ, padding rate ϕ ∈ (0,1).

Input: 1 ≤ L ≤ R ≤ ñ

Output: word 𝒔~ ∈ {0,1}^{(R−L+1)τ}

  1. Suppose 𝖣𝖾𝖼h(L,R) queries disjoint intervals [u1,v1], …, [up,vp] ⊆ [m] of size t.

  2. For each r ∈ [p]:

    (a) Compute j ∈ [n/t] such that [ur, vr] ⊆ [(j−1)t+1, (j+1)t].

    (b) Let t′ = t/τ and compute

      𝒔^{(j)} ← RecoverBlocks^𝒀~(1, ñ+1, (j−1)t′+1, jt′)

      𝒔^{(j+1)} ← RecoverBlocks^𝒀~(1, ñ+1, jt′+1, (j+1)t′).

  3. From {𝒔^{(j)}}, send 𝖣𝖾𝖼h the bits corresponding to the intervals [u1,v1], …, [up,vp].

  4. Output the output of 𝖣𝖾𝖼h.

Our Insdel decoder 𝖣𝖾𝖼 processes the t-sized intervals [u1,v1], …, [up,vp] ⊆ [m] queried by the Hamming decoder 𝖣𝖾𝖼h into corresponding block intervals of size t′ = t/τ as inputs to RecoverBlocks (step 2a). Note that, for ease of presentation, the decoder always computes RecoverBlocks on two t′-sized block intervals (step 2b). The amortized locality can be optimized by a constant factor by calling RecoverBlocks on the exact block intervals queried by the Hamming decoder 𝖣𝖾𝖼h. However, it will be convenient to assume the Insdel decoder 𝖣𝖾𝖼 calls RecoverBlocks in a predictable manner for all inputs when proving correctness, i.e., the input block interval to RecoverBlocks is always [(j−1)t′+1, jt′] for some j, and asymptotically there is no change to the amortized locality.

Theorem 12.

Let (m,k)-code 𝒞h = (𝖤𝗇𝖼h, 𝖣𝖾𝖼h) be a t-consecutive interval querying Hamming (αh,κh,δh,εh)-𝖺𝖫𝖣𝖢. Then, for any block size τ = Ω(log n) with τ ≤ n, 𝒞 = (𝖤𝗇𝖼, 𝖣𝖾𝖼) in Construction 11 is an (α,κ,δ,ε)-𝖺𝖫𝖣𝖢 for α = O(αh·τ·log³n/t), κ = κh, δ = Ω(δh), and ε = εh + negl(n).

Proof.

Suppose 𝖣𝖾𝖼h^𝒚~(L,R) queries disjoint intervals [u1,v1], …, [up,vp] ⊆ [m]. For each r ∈ [p], we compute RecoverBlocks on the block intervals [(j−1)t′+1, jt′] and [jt′+1, (j+1)t′] such that [ur, vr] ⊆ [(j−1)t+1, (j+1)t]. Let 𝒫 = {[1, t′], [t′+1, 2t′], …, [B−t′+1, B]} be the partition of [B] into equal t′-sized intervals. Let 𝒔^{(j)} be the random variable defined as the output of RecoverBlocks^𝒀~(1, ñ+1, (j−1)t′+1, jt′). Define 𝒚~u as the random string given by (𝒚~u)_j^{t} = 𝒔^{(j)} for each j ∈ [m/t]. Then, since 𝖧𝖺𝗆(𝒚~u, 𝒚) ≤ δh·m is implied by the event ∑_{[(j−1)t′+1, jt′] ∈ 𝒫} ∑_{r=1}^{t′} 1(𝒔_r^{(j)} ≠ 𝒘_{(j−1)t′+r}) ≤ δh·B, by Theorem 8 we have Pr[𝖧𝖺𝗆(𝒚~u, 𝒚) ≤ δh·m] ≥ 1 − negl(n). Thus, from the view of 𝖣𝖾𝖼h, it is interacting with 𝒚~u over the partition 𝒫, and so by definition, for any R−L+1 ≥ κh,

Pr[𝖣𝖾𝖼h^{𝒚~u}(L,R) = (x[i] : i ∈ [L,R]) | 𝖧𝖺𝗆(𝒚~u, 𝒚) ≤ δh·m] ≥ 1 − εh.

By construction of 𝖣𝖾𝖼, for any R−L+1 ≥ κh,

Pr[𝖣𝖾𝖼^𝒀~(L,R) = (x[i] : i ∈ [L,R])]
 ≥ Pr[𝖧𝖺𝗆(𝒚~u, 𝒚) ≤ δh·m] × Pr[𝖣𝖾𝖼h^{𝒚~u}(L,R) = (x[i] : i ∈ [L,R]) | 𝖧𝖺𝗆(𝒚~u, 𝒚) ≤ δh·m]
 ≥ (1 − negl(n)) × (1 − εh) ≥ 1 − εh − negl(n).

Since 𝒞h is t-consecutive interval querying and makes at most αh(R−L+1) queries, the decoder 𝖣𝖾𝖼 calls RecoverBlocks at most 2αh(R−L+1)/t times on block intervals of size t′ = t/τ. By Lemma 9, the query complexity of 𝖣𝖾𝖼 is

α(R−L+1) = (2αh(R−L+1)/t) × O((t/τ)·τ + τ log³ n), and so α = O(αh·τ·log³n/t).

If a t-consecutive interval querying, ideal Hamming 𝖺𝖫𝖣𝖢 does exist, we obtain the following corollary.

Corollary 13.

If there exists a t-consecutive interval querying, ideal Hamming 𝖺𝖫𝖣𝖢 for t = Ω(τ log³ n), then there exists an ideal Insdel 𝖺𝖫𝖣𝖢.

We construct t-consecutive interval querying, ideal Hamming 𝖺𝖫𝖣𝖢s in private-key and resource-bounded settings and leave a construction for worst-case Hamming errors as an open question.

3 Ideal Insdel 𝖺𝖫𝖣𝖢s in Private-key Settings

Given the compiler and the resulting Theorem 12 from the previous section, which converts an ideal Hamming 𝖺𝖫𝖣𝖢 𝒞h into an ideal Insdel 𝖺𝖫𝖣𝖢 𝒞 whenever 𝒞h is Ω(τ log³ n)-consecutive interval querying, we show that such an 𝖺𝖫𝖣𝖢 exists under private-key assumptions. We start by recalling the definition of a private-key 𝖺𝖫𝖣𝖢 (𝗉𝖺𝖫𝖣𝖢).

Definition 14 (𝗉𝖺𝖫𝖣𝖢 [7]).

Let λ be the security parameter. A triple of probabilistic polynomial time algorithms (𝖦𝖾𝗇, 𝖤𝗇𝖼, 𝖣𝖾𝖼) over Σ is a private (α,κ,δ,ε,q)-amortizable LDC (𝗉𝖺𝖫𝖣𝖢) for Hamming errors (resp. Insdel errors) if

  • for all keys 𝗌𝗄 ∈ Range(𝖦𝖾𝗇(1^λ)) the pair (𝖤𝗇𝖼𝗌𝗄, 𝖣𝖾𝖼𝗌𝗄) is an (n,k)-code 𝒞 over Σ,

  • for all 𝒚~ ∈ Σ* and all L, R ∈ [k] with R−L+1 ≥ κ, the local decoding algorithm 𝖣𝖾𝖼𝗌𝗄^𝒚~(L,R) makes at most (R−L+1)·α queries to 𝒚~, and

  • for all probabilistic polynomial time algorithms 𝒜 there is a negligible function μ such that Pr[paLDC-Sec-Game(𝒜, λ, α, κ, δ, ε, q) = 1] ≤ μ(λ), where the probability is taken over the randomness of 𝒜, 𝖦𝖾𝗇, and paLDC-Sec-Game. The experiment paLDC-Sec-Game is defined as follows:

paLDC-Sec-Game(𝒜, λ, α, κ, δ, ε, q)

The challenger generates the secret key 𝗌𝗄 ← 𝖦𝖾𝗇(1^λ). For q rounds, on iteration h, the challenger and adversary 𝒜 interact as follows:

  1. The adversary 𝒜 chooses a message 𝒙^{(h)} ∈ Σ^k and sends it to the challenger.

  2. The challenger sends 𝒚^{(h)} ← 𝖤𝗇𝖼𝗌𝗄(𝒙^{(h)}) to the adversary.

  3. The adversary outputs 𝒚~^{(h)} ∈ Σ* that is δ-Hamming-close (resp. δ-edit-close) to 𝒚^{(h)}.

  4. If there exist L^{(h)}, R^{(h)} ∈ [k] such that R^{(h)}−L^{(h)}+1 ≥ κ and Pr[𝖣𝖾𝖼𝗌𝗄^{𝒚~^{(h)}}(L^{(h)}, R^{(h)}) ≠ 𝒙^{(h)}[L^{(h)}, R^{(h)}]] > ε(λ), then this experiment immediately terminates and outputs 1.

If the experiment did not output 1 on any iteration h, then output 0.

Note that for 𝗉𝖺𝖫𝖣𝖢s, we assume that the local decoder takes in a consecutive interval [L,R] rather than an arbitrary subset Q as input. Blocki and Zhang show that explicit, ideal Hamming 𝗉𝖺𝖫𝖣𝖢 constructions exist for such decoders, and the existence of ideal 𝗉𝖺𝖫𝖣𝖢s whose local decoders take in arbitrary subsets Q is left as an open problem.

3.1 A consecutive interval querying, Ideal Hamming 𝗉𝖺𝖫𝖣𝖢

We construct a t-consecutive interval querying, ideal Hamming 𝗉𝖺𝖫𝖣𝖢 by modifying the private-key 𝖫𝖣𝖢 construction of Ostrovsky et al. [21]. Recall that the encoding procedure on input message 𝒙 first partitions it into equal-sized blocks 𝒙 = 𝒆1 ∘ ⋯ ∘ 𝒆B. Each block is then individually encoded as 𝒆′j = 𝖤𝗇𝖼(𝒆j), for each j ∈ [B], by a constant rate, constant error tolerance, and constant alphabet size Hamming code encoding 𝖤𝗇𝖼 (e.g., the Justesen code) to form the encoded message 𝒙′ = 𝒆′1 ∘ ⋯ ∘ 𝒆′B. Lastly, an additional secret-key permutation π and random mask 𝒓 are applied, i.e., the codeword is 𝒚 = π(𝒙′) ⊕ 𝒓, which effectively makes the codeword look random from the view of a probabilistic polynomial time channel.

Blocki and Zhang [7] observe that the encoding procedure of Ostrovsky et al. is amenable to amortization when the recovered symbols lie in a consecutive interval [L,R]. Their amortized local decoder, with access to the secret key, undoes the permutation π and random mask 𝒓, recovers the contiguous blocks 𝒆′_{s+1}, …, 𝒆′_{s+ℓ} corresponding to the requested interval [L,R], decodes each block, and returns the corresponding queried symbols. Unfortunately, since we applied a permutation π in the encoding procedure, the amortized local decoder does not make consecutive interval queries to the (corrupted) codeword 𝒚~.

To add t-consecutive interval querying to this decoder, we modify the permutation step of the encoding scheme to permute t-sized sub-blocks of the encoded blocks 𝒆′j in the encoded message 𝒙′ = 𝒆′1 ∘ ⋯ ∘ 𝒆′B, rather than permuting individual bits. We highlight in blue the significant changes made from the original 𝗉𝖺𝖫𝖣𝖢 presented by Blocki and Zhang.

Construction 15.

Suppose that 𝒞 = (𝖤𝗇𝖼𝒞, 𝖣𝖾𝖼𝒞) is a Hamming (A,a)-code over an alphabet Σ with rate R. Let c = log|Σ|. Define 𝒞𝗉 = (𝖦𝖾𝗇𝗉, 𝖤𝗇𝖼𝗉, 𝖣𝖾𝖼𝗉) with message length k, codeword length m = k/R, block size cA, and sub-block size t (dividing cA) as follows:

𝖦𝖾𝗇(1^λ)

  1. Generate 𝒓 ←$ {0,1}^m and a uniformly random permutation π : [m/t] → [m/t].

  2. Output 𝗌𝗄 ← (𝒓, π).

𝖤𝗇𝖼𝗌𝗄(𝒙)

Input: Message 𝒙 ∈ {0,1}^k

Output: Codeword 𝒚 ∈ {0,1}^m

  1. Parse (𝒓, π) ← 𝗌𝗄.

  2. Let 𝒙 = 𝒆1 ∘ 𝒆2 ∘ ⋯ ∘ 𝒆B where each 𝒆s ∈ Σ^a (and B = k/(ca)).

  3. For each s ∈ [B]:

    (a) set 𝒆′s ← 𝖤𝗇𝖼𝒞(𝒆s), and

    (b) let 𝒆′s = ϵ_{(s−1)β+1} ∘ ⋯ ∘ ϵ_{sβ} where each ϵ_i ∈ {0,1}^t (and β = cA/t).

  4. Output 𝒚 = (ϵ_{π(1)} ∘ ⋯ ∘ ϵ_{π(Bβ)}) ⊕ 𝒓.

𝖣𝖾𝖼𝗌𝗄^𝒚~(L,R)

Input: Interval [L,R] ⊆ [k]

Output: word 𝒙~[L,R] ∈ {0,1}^{R−L+1}.

  1. Parse (𝒓, π) ← 𝗌𝗄 and interpret 𝒚~ = ϵ~_{π(1)} ∘ ⋯ ∘ ϵ~_{π(Bβ)} where each ϵ~_j ∈ {0,1}^t.

  2. Let the bits of 𝒙[L,R] lie in blocks 𝒆_{s+1}, …, 𝒆_{s+ℓ}. For each j = s+1, …, s+ℓ:

    (a) Compute s1, …, sβ such that π(s_r) = (j−1)β + r for all r ∈ [β].

    (b) Query 𝒚~ to compute 𝒆~′j ← ϵ~_{π(s1)} ∘ ⋯ ∘ ϵ~_{π(sβ)} (undoing the mask 𝒓 on the queried bits).

    (c) Compute 𝒆~j ← 𝖣𝖾𝖼𝒞(𝒆~′j).

  From 𝒆~_{s+1}, …, 𝒆~_{s+ℓ}, output the bits corresponding to the interval [L,R].
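The consecutive query pattern of Construction 15 is perhaps easiest to see in code. The following toy Python sketch works over bit lists; enc_block and dec_block stand in for the inner Hamming code 𝒞, the mapping from message indices to blocks is elided, and all names are illustrative only. Its purpose is to show that every access to the codeword is one consecutive t-bit interval (one sub-block), and that recovering a block takes β such accesses.

    import random

    def keygen(m, t):
        # sk = (random mask r, random permutation pi over the m/t sub-block positions).
        r = [random.randrange(2) for _ in range(m)]
        pi = list(range(m // t))
        random.shuffle(pi)
        return r, pi

    def encode(x_blocks, sk, t, enc_block):
        # Encode each message block, split into t-bit sub-blocks, permute sub-blocks, XOR the mask.
        r, pi = sk
        subs = []
        for e in x_blocks:
            ep = enc_block(e)                                   # e'_s, a cA-bit block (list of bits)
            subs += [ep[i:i + t] for i in range(0, len(ep), t)]
        permuted = [subs[pi[p]] for p in range(len(subs))]      # position p stores sub-block pi[p]
        flat = [bit for s in permuted for bit in s]
        return [b ^ mask for b, mask in zip(flat, r)]

    def decode_block(y_tilde, sk, j, beta, t, dec_block):
        # Recover message block j using beta consecutive t-bit interval queries to y_tilde.
        r, pi = sk
        inv = {v: p for p, v in enumerate(pi)}                  # inv[q] = position holding sub-block q
        bits = []
        for q in range((j - 1) * beta, j * beta):               # the beta sub-blocks of encoded block j
            p = inv[q]
            chunk = y_tilde[p * t:(p + 1) * t]                  # one consecutive t-bit query
            bits += [b ^ mask for b, mask in zip(chunk, r[p * t:(p + 1) * t])]
        return dec_block(bits)                                  # inner Hamming decoding of e'_j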

Any queries made by the decoder 𝖣𝖾𝖼 are to the t-sized (corrupted) sub-blocks ϵ~π(r), so Construction 15 is t-consecutive interval querying. The main challenge is choosing the overall block size cA such that decoding error remains negligible. We show that by increasing the original setting of block size τ in the work of Blocki and Zhang [7] by a multiplicative t-factor, negligible decoding error follows.

Theorem 16.

Let λ ∈ ℕ. In Construction 15, suppose the (A,a)-code 𝒞 has (constant) rate R, (constant) error tolerance δ𝗉, and (constant-size) alphabet Σ with c=log|Σ|. Then, for any t such that t ∣ cA and k ≤ poly(λ), 𝒞𝗉=(𝖦𝖾𝗇𝗉,𝖤𝗇𝖼𝗉,𝖣𝖾𝖼𝗉) is a t-consecutive interval querying (2/R, O(a), θ(δ𝗉), negl(λ), 1)-𝗉𝖺𝖫𝖣𝖢 whenever a=ω(t log λ).

Proof.

We show that the probability of an incorrect decoding, Pr[paLDC-Sec-Game(𝒜,λ,α,κ,δ,1)=1], is negligible for δ=θ(δ𝗉) and any probabilistic polynomial-time adversary 𝒜. Define the event Bad = ⋃_{j∈[k/(ca)]} Badj, where Badj is the event that 𝒆′j has more than a δ𝗉 fraction of errors. We show that the probability Pr[Bad] is negligible.

Since the t-sized sub-blocks of the codeword are permuted and a random mask is applied, any errors introduced by a probabilistic polynomial-time adversary 𝒜 into the corrupted codeword 𝒚~ are distributed uniformly at random over the t-bit sub-blocks. This follows by generalizing the argument of Lipton [19]: we interpret each sub-block as a single t-bit symbol and observe that the probability that event Bad occurs when 𝒜 leaves a given t-bit interval error-free is at most the probability that Bad occurs when 𝒜 places errors in that interval.

Then, the number of corrupted sub-blocks in any given block 𝒆′j follows a Hypergeometric(m/t, δm/t, cA/t) distribution, so by the CDF bound of [15, 16] we have Pr[Badj] < exp(−2(((δ𝗉−δ)(cA/t))²−1)/((cA/t)+1)). Thus, for δ𝗉 > δ and a/t = ω(log n), this probability is negligible. By a union bound, Pr[Bad] is also negligible, so ε ≤ Pr[Bad] < negl(λ).
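As a numerical sanity check of this tail bound (with arbitrary, purely illustrative values of δ𝗉, δ, and the sub-block count cA/t), the following snippet evaluates the expression above for growing block sizes:

import math

def tail_bound(delta_p, delta, subblocks_per_block):
    """Hypergeometric CDF bound on Pr[Bad_j] used in the proof of Theorem 16."""
    lam = (delta_p - delta) * subblocks_per_block
    return math.exp(-2 * (lam ** 2 - 1) / (subblocks_per_block + 1))

# illustrative (hypothetical) parameters: delta_p = 0.1, delta = 0.05
for n_sub in [100, 1_000, 10_000, 100_000]:
    print(n_sub, tail_bound(0.1, 0.05, n_sub))
# the bound decays exponentially in cA/t, so taking a/t = omega(log n) drives
# Pr[Bad_j] (and, by a union bound, Pr[Bad]) below any inverse polynomial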

The proof of the query parameters α=2/R and κ=a is the same as for Theorem 16 in the work of Blocki and Zhang [7] (see the full version [8]). We note that the poly-round 𝗉𝖺𝖫𝖣𝖢 construction of Blocki and Zhang [7] can also be made t-consecutive interval querying by the same technique of applying a higher-order permutation. We omit the construction and proof from this work since they follow from an almost-identical modification.

3.2 The Ideal Insdel 𝗉𝖺𝖫𝖣𝖢 Construction

We compile the t-consecutive interval querying, ideal Hamming 𝗉𝖺𝖫𝖣𝖢 in the prior subsection into an ideal Insdel 𝗉𝖺𝖫𝖣𝖢 using the Hamming-to-Insdel compiler 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾 in Section 2.

Construction 17.

Suppose (m,k)-code 𝒞𝗉=(𝖦𝖾𝗇𝗉,𝖤𝗇𝖼𝗉,𝖣𝖾𝖼𝗉) is a t-consecutive interval querying Hamming (α𝗉,κ𝗉,δ𝗉,ε𝗉,q)-𝗉𝖺𝖫𝖣𝖢. Then define 𝒞=(𝖦𝖾𝗇,𝖤𝗇𝖼,𝖣𝖾𝖼) as 𝖦𝖾𝗇(1λ):=𝖦𝖾𝗇𝗉(1λ), 𝖤𝗇𝖼𝗌𝗄(𝐱):=𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾(𝖤𝗇𝖼𝗉,𝗌𝗄(𝐱)), and 𝖣𝖾𝖼𝗌𝗄𝐘~(L,R):=𝖣𝖾𝖼𝗉,𝗌𝗄𝐘~(L,R).

We will need to show that the Hamming-to-Insdel compiler 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾, when used within the private-key setting, retains correctness. See the full version [8] for the reduction argument proving the following theorem.

Theorem 18.

Let 𝒞𝗉=(𝖦𝖾𝗇𝗉,𝖤𝗇𝖼𝗉,𝖣𝖾𝖼𝗉) be an (m,k)-code that is a t-consecutive interval querying Hamming (α𝗉,κ𝗉,δ𝗉,ε𝗉)-𝗉𝖺𝖫𝖣𝖢. Then, 𝒞=(𝖦𝖾𝗇,𝖤𝗇𝖼,𝖣𝖾𝖼) in Construction 17 is an (n,k)-code that is an Insdel (α,κ,δ,ε)-𝗉𝖺𝖫𝖣𝖢 with α=O(α𝗉·τ·log³n / t), κ=κ𝗉, δ=Ω(δ𝗉), and ε=ε𝗉+negl(n).

By instantiating our Hamming-to-Insdel compiler in the private-key setting with our t-consecutive interval querying Hamming 𝗉𝖺𝖫𝖣𝖢, we obtain an ideal Insdel 𝗉𝖺𝖫𝖣𝖢.

Corollary 19.

If 𝒞𝗉 of Construction 15 is instantiated with a constant rate, constant error tolerance, and constant alphabet size code (e.g. a Justesen code), t=θ(τlog³n), and τ=θ(log n), then the code 𝒞 in Construction 17 is an (Ideal) Insdel (O(1), O(log⁵n), O(1), negl(n))-𝗉𝖺𝖫𝖣𝖢 with constant rate.
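For intuition on the parameter choice, plugging the settings of Corollary 19 into Theorem 18 gives α = O(α𝗉·τlog³n/t) = O(α𝗉·τlog³n/(τlog³n)) = O(α𝗉) = O(1), i.e. the polylogarithmic query blow-up of the compiler is exactly absorbed by the sub-block size t, while κ = κ𝗉, δ = Ω(δ𝗉), and the rate are inherited from the Hamming 𝗉𝖺𝖫𝖣𝖢 up to constant factors.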

4 Ideal Insdel 𝖺𝖫𝖣𝖢s for Resource-bounded Channels

In this section, we present an ideal Insdel 𝖺𝖫𝖣𝖢 in settings where the channel is bounded in some resource, such as parallel time or circuit depth. As in the construction of an ideal Insdel 𝗉𝖺𝖫𝖣𝖢, our construction for resource-bounded channels relies on a consecutive interval querying Hamming 𝖺𝖫𝖣𝖢 that allows the compiler to amortize over its queries. We then formally prove that the compiler constructed in Section 2 remains secure in resource-bounded settings, which yields our ideal Insdel construction for resource-bounded channels.

We start by recalling the definition of an 𝖺𝖫𝖣𝖢 for resource-bounded channels (𝗋𝖺𝖫𝖣𝖢).

Definition 20 (𝗋𝖺𝖫𝖣𝖢 [1]).

An (n,k)-code 𝒞=(𝖤𝗇𝖼,𝖣𝖾𝖼) is an (α,κ,δ,ε,ℛ)-resource-bounded amortizable LDC (𝗋𝖺𝖫𝖣𝖢) for Hamming errors (resp. Insdel errors) if, for any 𝒚~ ∈ Σ^* and all L,R ∈ [k] with R−L+1 ≥ κ, the local decoding algorithm 𝖣𝖾𝖼^𝒚~(L,R) makes at most (R−L+1)·α queries to 𝒚~, and for any resource-bounded algorithm 𝒜 ∈ ℛ there is a negligible function μ such that Pr[raLDC-Game(𝒜,λ,δ,κ)=1] ≤ μ(λ), where the probability is taken over the randomness of 𝒜, 𝖤𝗇𝖼, and raLDC-Game. The experiment raLDC-Game is defined as follows:

raLDC-Game(𝒜,λ,δ,κ)

  1. The adversary 𝒜 chooses a message 𝒙 ∈ {0,1}^k and sends it to the encoder.
  2. The encoder sends 𝒚 ← 𝖤𝗇𝖼(𝒙) to the adversary.
  3. The adversary outputs 𝒚~ ∈ {0,1}^* that is δ-Hamming-close (resp. δ-edit-close) to 𝒚.
  4. If there exist L,R ∈ [k] with R−L+1 ≥ κ such that Pr[𝖣𝖾𝖼^𝒚~(L,R) ≠ 𝒙[L,R]] > ε(λ), then the experiment outputs 1. Otherwise, output 0.

4.1 A consecutive interval querying, Ideal 𝗋𝖺𝖫𝖣𝖢

To construct a t-consecutive interval querying, ideal 𝗋𝖺𝖫𝖣𝖢, we employ the Hamming 𝗉𝖺𝖫𝖣𝖢-to-𝗋𝖺𝖫𝖣𝖢 compiler of Blocki and Zhang [7], which is a modification of the original compiler by Ameri et al. [1]. The construction makes use of two other building blocks. The first is cryptographic puzzles, consisting of two algorithms 𝖯𝗎𝗓𝗓𝖦𝖾𝗇 and 𝖯𝗎𝗓𝗓𝖲𝗈𝗅𝗏𝖾. On input seed 𝒄, 𝖯𝗎𝗓𝗓𝖦𝖾𝗇 outputs a puzzle 𝒁 whose solution is 𝒄, i.e. 𝖯𝗎𝗓𝗓𝖲𝗈𝗅𝗏𝖾(𝒁)=𝒄. The security requirement states that any adversary 𝒜 in a defined algorithm class ℛ cannot solve the puzzle 𝒁, and cannot even distinguish between the tuples (𝒁0,𝒄0,𝒄1) and (𝒁1,𝒄0,𝒄1) for random 𝒄i and 𝒁i=𝖯𝗎𝗓𝗓𝖦𝖾𝗇(𝒄i). Ameri et al. construct memory-hard puzzles, i.e. cryptographic puzzles unsolvable by the class ℛ𝖼𝗆𝖼 of algorithms with bounded cumulative memory complexity, under standard cryptographic assumptions. We generalize their definition to any class of algorithms ℛ: if a cryptographic puzzle is unsolvable by any algorithm in ℛ, we say the puzzle is ℛ-hard.

The second building block is a variant of a locally decodable code, referred to as 𝖫𝖣𝖢, which recovers the entire (short) encoded message 𝒔 with locality scaling only linearly in the message length. The original 𝖫𝖣𝖢 construction given by Blocki et al. [6] is a repetition code of a constant rate, constant error tolerance code (e.g. the Justesen code), where 𝖣𝖾𝖼 queries a constant number of the repeated codewords and performs a majority-vote decoding. On message length k, their 𝖫𝖣𝖢 decoder makes θ(k) queries, while the codeword length n ≫ k can be arbitrarily large.
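For intuition, a small Python sketch of this repetition-based decoder follows (our own simplification, not the construction of [6]: the string s below stands in for the already-encoded inner codeword, and all parameter values are arbitrary). Each query of the decoder reads one full copy, i.e. one consecutive interval, and the copies are combined by a position-wise majority vote.

import random
from collections import Counter

def ldc_star_encode(s, target_len):
    # Repeat s (standing in for an inner constant-rate codeword of the short
    # message) until the codeword has length at least target_len.
    reps = max(1, -(-target_len // len(s)))    # ceiling division
    return s * reps

def ldc_star_decode(y_corrupted, k, num_samples=7):
    # Recover all k symbols with O(k) queries: read num_samples randomly chosen
    # copies in full (each copy is one consecutive interval of length k) and
    # take a position-wise majority vote.
    num_copies = len(y_corrupted) // k
    picks = [random.randrange(num_copies) for _ in range(num_samples)]
    copies = [y_corrupted[p * k:(p + 1) * k] for p in picks]
    return ''.join(Counter(col).most_common(1)[0][0] for col in zip(*copies))

# usage: the codeword can be made arbitrarily long without changing the query count
random.seed(1)
s = '1011001110001011'
y = list(ldc_star_encode(s, 10_000))
for i in random.sample(range(len(y)), len(y) // 20):   # corrupt 5% of positions
    y[i] = '0' if y[i] == '1' else '1'
print(ldc_star_decode(''.join(y), len(s)) == s)        # True except with tiny probability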

Given a 𝗉𝖺𝖫𝖣𝖢 code 𝒞𝗉=(𝖦𝖾𝗇𝗉,𝖤𝗇𝖼𝗉,𝖣𝖾𝖼𝗉) and an 𝖫𝖣𝖢 code 𝒞=(𝖤𝗇𝖼,𝖣𝖾𝖼), the constructed Hamming 𝗋𝖺𝖫𝖣𝖢 encoder 𝖤𝗇𝖼, on message 𝒙, will (1) generate the secret key 𝗌𝗄 ← 𝖦𝖾𝗇𝗉(1^λ;𝒄) with some random coins 𝒄 and the puzzle 𝒁 ← 𝖯𝗎𝗓𝗓𝖦𝖾𝗇(𝒄), (2) compute the 𝗉𝖺𝖫𝖣𝖢 encoding of the message 𝒚𝗉 ← 𝖤𝗇𝖼𝗉,𝗌𝗄(𝒙), (3) compute the 𝖫𝖣𝖢 encoding of the puzzle 𝒚 ← 𝖤𝗇𝖼(𝒁), where the 𝖫𝖣𝖢 encoding has codeword length θ(|𝒚𝗉|), and (4) finally, output 𝒚 = 𝒚𝗉‖𝒚. Intuitively, a resource-bounded channel will not be able to solve the puzzle to retrieve the secret key from 𝒚. Hence, the message encoding 𝒚𝗉 looks random to the channel, reducing the resource-bounded channel's view of the codeword to the view of a computationally-bounded channel in the private-key setting. On the other hand, the 𝗋𝖺𝖫𝖣𝖢 decoder 𝖣𝖾𝖼 will have sufficient resources to solve the puzzle and locally decode the codeword on input interval [L,R], i.e. 𝖣𝖾𝖼 computes 𝒄~ ← 𝖯𝗎𝗓𝗓𝖲𝗈𝗅𝗏𝖾(𝖣𝖾𝖼(𝒚~)), 𝗌𝗄~ ← 𝖦𝖾𝗇𝗉(1^λ;𝒄~), and outputs 𝖣𝖾𝖼𝗉,𝗌𝗄~^𝒚𝗉~(L,R). By choosing an ideal 𝗉𝖺𝖫𝖣𝖢 code 𝒞𝗉 and an appropriate 𝖫𝖣𝖢 code 𝒞, the resulting compiled code (𝖤𝗇𝖼,𝖣𝖾𝖼) is an ideal 𝗋𝖺𝖫𝖣𝖢.
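To make the data flow of steps (1)-(4) concrete, the following Python sketch assembles a toy version of the compiled encoder and decoder. All three building blocks are simplified placeholders of our own (an iterated-hashing puzzle, a keystream mask in place of the 𝗉𝖺𝖫𝖣𝖢, and a 3-fold repetition in place of the 𝖫𝖣𝖢), so the sketch illustrates only the structure of the compiler, not its security or error tolerance.

import hashlib, random

STEPS = 10_000   # sequential work needed to open the placeholder puzzle

# Placeholder puzzle: the seed c is hidden behind STEPS sequential hash iterations.
# A real construction would use an R-hard puzzle (e.g. the memory-hard puzzles of [1]).
def _pad(salt):
    p = salt
    for _ in range(STEPS):
        p = hashlib.sha256(p).digest()
    return p

def puzz_gen(c, salt=b'\x00' * 16):
    return salt + bytes(a ^ b for a, b in zip(c, _pad(salt)))

def puzz_solve(Z):
    salt, masked = Z[:16], Z[16:]
    return bytes(a ^ b for a, b in zip(masked, _pad(salt)))

# Placeholder "paLDC": a keystream derived from the coins masks the message.
# It has no error tolerance; it only shows that Dec_p reads y_p[L..R] locally.
def gen_p(c, m):
    stream, counter = b'', 0
    while len(stream) < m:
        stream += hashlib.sha256(c + counter.to_bytes(4, 'big')).digest()
        counter += 1
    return stream[:m]

def enc_p(x, sk):
    return bytes(a ^ b for a, b in zip(x, sk))

def dec_p(y_p, sk, L, R):                    # queries only positions L..R of y_p
    return bytes(y_p[i] ^ sk[i] for i in range(L, R + 1))

# Placeholder LDC*: repeat the puzzle and recover it by majority vote over 3 copies.
def enc_star(Z, target_len):
    reps = max(3, -(-target_len // len(Z)))
    return Z * reps

def dec_star(y_star, zlen):
    copies = [y_star[i * zlen:(i + 1) * zlen] for i in range(3)]
    return bytes(max(set(col), key=list(col).count) for col in zip(*copies))

# Compiled raLDC encoder/decoder, following steps (1)-(4) above.
def enc_r(x):
    c = bytes(random.randrange(256) for _ in range(16))   # random coins c
    sk = gen_p(c, len(x))                    # (1) secret key sk <- Gen_p(1^lambda; c)
    y_p = enc_p(x, sk)                       # (2) paLDC encoding of the message
    Z = puzz_gen(c)                          #     puzzle Z <- PuzzGen(c)
    y_star = enc_star(Z, len(y_p))           # (3) LDC encoding of the puzzle
    return y_p + y_star, len(y_p), len(Z)    # (4) codeword y = y_p || y_star

def dec_r(y, split, zlen, L, R):
    y_p, y_star = y[:split], y[split:]
    c = puzz_solve(dec_star(y_star, zlen))   # solve the puzzle to recover the coins
    sk = gen_p(c, split)                     # re-derive the secret key
    return dec_p(y_p, sk, L, R)              # locally decode the requested interval

# usage
random.seed(0)
x = bytes(random.randrange(256) for _ in range(64))
y, split, zlen = enc_r(x)
assert dec_r(y, split, zlen, 10, 25) == x[10:26]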

We observe that the compiled 𝗋𝖺𝖫𝖣𝖢 decoder 𝖣𝖾𝖼 satisfies t-consecutive interval querying when instantiated with a t-consecutive interval querying 𝗉𝖺𝖫𝖣𝖢 and the original 𝖫𝖣𝖢 construction of Blocki et al. [6]. First, observe that the decoder 𝖣𝖾𝖼 queries the message encoding 𝒚𝗉 and the secret-key randomness encoding 𝒚 disjointly, using 𝖣𝖾𝖼𝗉 and 𝖣𝖾𝖼 respectively. Second, 𝖣𝖾𝖼 is also t-consecutive interval querying as long as the encoding used within the repetition code has codeword length t. Thus, the overall 𝗋𝖺𝖫𝖣𝖢 decoder 𝖣𝖾𝖼 is t-consecutive interval querying. We summarize this result formally below.

Theorem 21.

Let λ ∈ ℕ. Let 𝒞𝗉=(𝖦𝖾𝗇𝗉,𝖤𝗇𝖼𝗉,𝖣𝖾𝖼𝗉) be an (n𝗉,k𝗉)-code that is a t-consecutive interval querying Hamming (α𝗉,κ𝗉,δ𝗉,ε𝗉)-𝗉𝖺𝖫𝖣𝖢. For any algorithm class ℛ such that ℛ-hard cryptographic puzzles exist, there exists an (n𝗋,k𝗋)-code 𝒞𝗋=(𝖤𝗇𝖼𝗋,𝖣𝖾𝖼𝗋) that is a t-consecutive interval querying Hamming (α𝗋,κ𝗋,δ𝗋,ε𝗋,ℛ)-𝗋𝖺𝖫𝖣𝖢 for n𝗋=θ(n𝗉), k𝗋=k𝗉=poly(λ), α𝗋=θ(α𝗉), κ𝗋=κ𝗉, δ𝗋=θ(δ𝗉), and ε𝗋=ε𝗉+negl(n).

4.2 The Ideal Insdel 𝗋𝖺𝖫𝖣𝖢 Construction

We present our final construction of an ideal Insdel 𝗋𝖺𝖫𝖣𝖢. We will proceed similarly to our construction of an ideal Insdel 𝗉𝖺𝖫𝖣𝖢 in Section 3, where we use our Hamming-to-Insdel compiler 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾 with the ideal, t-consecutive interval querying Hamming 𝗋𝖺𝖫𝖣𝖢 constructed in the prior subsection.

Construction 22.

Suppose the (m,k)-code 𝒞𝗋=(𝖤𝗇𝖼𝗋,𝖣𝖾𝖼𝗋) is a t-consecutive interval querying Hamming (α𝗋,κ𝗋,δ𝗋,ε𝗋,ℛ)-𝗋𝖺𝖫𝖣𝖢. Then define 𝒞=(𝖤𝗇𝖼,𝖣𝖾𝖼) as 𝖤𝗇𝖼(𝐱):=𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾(𝖤𝗇𝖼𝗋(𝐱)) and 𝖣𝖾𝖼^𝐘~(L,R):=𝖣𝖾𝖼𝗋^𝐘~(L,R).

Just as before, we need to show that security holds when the Hamming-to-Insdel compiler is used in a resource-bounded setting by giving a reduction argument. One additional nuance we must handle is giving any adversary for the Hamming code sufficient resources to perform the reduction. Let T(n) (resp. S(n)) be the running time (resp. space usage) of computing 𝖤𝗇𝖼𝖢𝗈𝗆𝗉𝗂𝗅𝖾(𝒚) and 𝒚~𝗌𝗂𝗆. Let ℬ = reducen(𝒜) be a reduction from algorithm 𝒜 to ℬ running in time T(n) with space usage S(n). Then, for any class of algorithms ℛ(n), denote its closure class ℛ̄(n) with respect to reducen, defined as the minimum class of algorithms such that reducen(𝒜) ∈ ℛ̄(n) for all 𝒜 ∈ ℛ(n). See the full version [8] for the detailed proof.

Theorem 23.

Let 𝒞𝗋=(𝖤𝗇𝖼𝗋,𝖣𝖾𝖼𝗋) be an (m,k)-code that is a t-consecutive interval querying Hamming (α𝗋,κ𝗋,δ𝗋,ε𝗋,ℛ(n))-𝗋𝖺𝖫𝖣𝖢. Then, 𝒞=(𝖤𝗇𝖼,𝖣𝖾𝖼) in Construction 22 is an (n,k)-code that is an Insdel (α,κ,δ,ε,ℛ̄(n))-𝗋𝖺𝖫𝖣𝖢 with α=O(α𝗋·τ·log³n / t), κ=κ𝗋, δ=Ω(δ𝗋), and ε=ε𝗋+negl(n).

Corollary 24.

Let t=θ(τlog³n), τ=θ(log n), and let ℛ be any algorithm class such that ℛ-hard cryptographic puzzles exist. Then by Theorems 21 and 23, the code 𝒞 in Construction 22 is an Insdel (O(1), O(log⁵n), O(1), negl(n), ℛ̄(n))-𝗋𝖺𝖫𝖣𝖢 with constant rate.

References

  • [1] Mohammad Hassan Ameri, Alexander R. Block, and Jeremiah Blocki. Memory-hard puzzles in the standard model with applications to memory-hard functions and resource-bounded locally decodable codes. In Clemente Galdi and Stanislaw Jarecki, editors, SCN 22: 13th International Conference on Security in Communication Networks, volume 13409 of Lecture Notes in Computer Science, pages 45–68, Amalfi, Italy, September 12–14 2022. Springer, Cham, Switzerland. doi:10.1007/978-3-031-14791-3_3.
  • [2] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil P. Vadhan. Robust PCPs of proximity, shorter PCPs and applications to coding. In László Babai, editor, 36th Annual ACM Symposium on Theory of Computing, pages 1–10, Chicago, IL, USA, June 13–16 2004. ACM Press. doi:10.1145/1007352.1007361.
  • [3] Alexander R. Block and Jeremiah Blocki. Private and resource-bounded locally decodable codes for insertions and deletions. In 2021 IEEE International Symposium on Information Theory (ISIT), pages 1841–1846, 2021. doi:10.1109/ISIT45174.2021.9518249.
  • [4] Alexander R Block, Jeremiah Blocki, Elena Grigorescu, Shubhang Kulkarni, and Minshen Zhu. Locally decodable/correctable codes for insertions and deletions. arXiv preprint arXiv:2010.11989, 2020. arXiv:2010.11989.
  • [5] Jeremiah Blocki, Kuan Cheng, Elena Grigorescu, Xin Li, Yu Zheng, and Minshen Zhu. Exponential lower bounds for locally decodable and correctable codes for insertions and deletions. In 62nd Annual Symposium on Foundations of Computer Science, pages 739–750, Denver, CO, USA, February 7–10 2022. IEEE Computer Society Press. doi:10.1109/FOCS52979.2021.00077.
  • [6] Jeremiah Blocki, Shubhang Kulkarni, and Samson Zhou. On locally decodable codes in resource bounded channels. In Yael Tauman Kalai, Adam D. Smith, and Daniel Wichs, editors, ITC 2020: 1st Conference on Information-Theoretic Cryptography, volume 163 of Leibniz International Proceedings in Informatics (LIPIcs), pages 16:1–16:23, Boston, MA, USA, June 17–19 2020. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ITC.2020.16.
  • [7] Jeremiah Blocki and Justin Zhang. Amortized locally decodable codes, 2025. doi:10.48550/arXiv.2502.10538.
  • [8] Jeremiah Blocki and Justin Zhang. Amortized locally decodable codes for insertions and deletions. arXiv preprint, 2025. doi:10.48550/arXiv.2507.03141.
  • [9] Alessandro Chiesa, Tom Gur, and Igor Shinkar. Relaxed locally correctable codes with nearly-linear block length and constant query complexity. In Shuchi Chawla, editor, 31st Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1395–1411, Salt Lake City, UT, USA, January 5–8 2020. ACM-SIAM. doi:10.1137/1.9781611975994.84.
  • [10] Gil Cohen and Tal Yankovitz. Asymptotically-good RLCCs with (log n)^(2+o(1)) queries. In 39th Computational Complexity Conference (CCC 2024), pages 8:1–8:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.CCC.2024.8.
  • [11] Klim Efremenko. 3-query locally decodable codes of subexponential length. In Michael Mitzenmacher, editor, 41st Annual ACM Symposium on Theory of Computing, pages 39–44, Bethesda, MD, USA, May 31 – June 2 2009. ACM Press. doi:10.1145/1536414.1536422.
  • [12] Meghal Gupta. Constant query local decoding against deletions is impossible. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, 56th Annual ACM Symposium on Theory of Computing, pages 752–763, Vancouver, BC, Canada, June 24–28 2024. ACM Press. doi:10.1145/3618260.3649655.
  • [13] Tom Gur, Govind Ramnarayan, and Ron D. Rothblum. Relaxed locally correctable codes. In Anna R. Karlin, editor, ITCS 2018: 9th Innovations in Theoretical Computer Science Conference, volume 94, pages 27:1–27:11, Cambridge, MA, USA, January 11–14 2018. Leibniz International Proceedings in Informatics (LIPIcs). doi:10.4230/LIPIcs.ITCS.2018.27.
  • [14] Brett Hemenway and Rafail Ostrovsky. Public-key locally-decodable codes. In David Wagner, editor, Advances in Cryptology – CRYPTO 2008, volume 5157 of Lecture Notes in Computer Science, pages 126–143, Santa Barbara, CA, USA, August 17–21 2008. Springer Berlin Heidelberg, Germany. doi:10.1007/978-3-540-85174-5_8.
  • [15] Brett Hemenway, Rafail Ostrovsky, Martin J Strauss, and Mary Wootters. Public key locally decodable codes with short keys. In International Workshop on Approximation Algorithms for Combinatorial Optimization, pages 605–615. Springer, 2011. doi:10.1007/978-3-642-22935-0_51.
  • [16] Don Hush and Clint Scovel. Concentration of the hypergeometric distribution. Statistics & Probability Letters, 75(2):127–132, November 2005. doi:10.1016/j.spl.2005.05.019.
  • [17] Jonathan Katz and Luca Trevisan. On the efficiency of local decoding procedures for error-correcting codes. In 32nd Annual ACM Symposium on Theory of Computing, pages 80–86, Portland, OR, USA, May 21–23 2000. ACM Press. doi:10.1145/335305.335315.
  • [18] Vinayak M. Kumar and Geoffrey Mon. Relaxed local correctability from local testing. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, 56th Annual ACM Symposium on Theory of Computing, pages 1585–1593, Vancouver, BC, Canada, June 24–28 2024. ACM Press. doi:10.1145/3618260.3649611.
  • [19] Richard J. Lipton. A new approach to information theory. In Proceedings of the 11th Annual Symposium on Theoretical Aspects of Computer Science, STACS ’94, pages 699–708, Berlin, Heidelberg, 1994. Springer-Verlag. doi:10.1007/3-540-57785-8_183.
  • [20] Rafail Ostrovsky, Omkant Pandey, and Amit Sahai. Private locally decodable codes. Cryptology ePrint Archive, Report 2007/025, 2007. URL: https://eprint.iacr.org/2007/025.
  • [21] Rafail Ostrovsky, Omkant Pandey, and Amit Sahai. Private locally decodable codes. In Lars Arge, Christian Cachin, Tomasz Jurdzinski, and Andrzej Tarlecki, editors, ICALP 2007: 34th International Colloquium on Automata, Languages and Programming, volume 4596 of Lecture Notes in Computer Science, pages 387–398, Wroclaw, Poland, July 9–13 2007. Springer Berlin Heidelberg, Germany. doi:10.1007/978-3-540-73420-8_35.
  • [22] Rafail Ostrovsky and Anat Paskin-Cherniavsky. Locally decodable codes for edit distance. In Anja Lehmann and Stefan Wolf, editors, ICITS 15: 8th International Conference on Information Theoretic Security, volume 9063 of Lecture Notes in Computer Science, pages 236–249, Lugano, Switzerland, May 2–5 2015. Springer, Cham, Switzerland. doi:10.1007/978-3-319-17470-9_14.
  • [23] L.J. Schulman and D. Zuckerman. Asymptotically good codes correcting insertions, deletions, and transpositions. IEEE Transactions on Information Theory, 45(7):2552–2557, 1999. doi:10.1109/18.796406.
  • [24] Sergey Yekhanin. Locally decodable codes. Foundations and Trends® in Theoretical Computer Science, 6(3):139–255, 2012. doi:10.1561/0400000030.