Abstract 1 Introduction 2 Upper bounds for coding problems 3 Hardness of coding problems 4 Finding short lattice vectors is in 𝗣𝗠𝗣𝗣 5 Inclusions 6 Black-box separations 7 A non-black-box non-separation References

The More the Merrier! On Total Coding and Lattice Problems and the Complexity of Finding Multicollisions111An earlier version of this work used the title “The more the merrier! On the complexity of finding multicollisions, with connections to codes and lattices.”

Huck Bennett University of Colorado Boulder, CO, USA Surendra Ghentiyala Cornell University, Ithaca, NY, USA Noah Stephens-Davidowitz Cornell University, Ithaca, NY, USA
Abstract

We show a number of connections between two types of search problems: (1) the problem of finding an L-wise multicollision in the output of a function; and (2) the problem of finding two codewords in a code (or two vectors in a lattice) that are within distance d of each other. Specifically, we study these problems in the total regime, in which L and d are chosen so that such a solution is guaranteed to exist, though it might be hard to find.

In more detail, we study the total search problem in which the input is a function 𝒞:[A][B] (represented as a circuit) and the goal is to find LA/B distinct elements x1,,xLA such that 𝒞(x1)==𝒞(xL). The associated complexity classes Polynomial Multi-Pigeonhole Principle ((A,B)-𝖯𝖬𝖯𝖯L) consist of all problems that reduce to this problem.

We show close connections between (A,B)-𝖯𝖬𝖯𝖯L and many celebrated upper bounds on the minimum distance of a code or lattice (and on the list-decoding radius). In particular, we show that the associated computational problems (i.e., the problem of finding two distinct codewords or lattice points that are close to each other) are in (A,B)-𝖯𝖬𝖯𝖯L, with a more-or-less smooth tradeoff between the distance d and the parameters A, B, and L. These connections are particularly rich in the case of codes, in which case we show that multiple incomparable bounds on the minimum distance lie in seemingly incomparable complexity classes.

Surprisingly, we also show that the computational problems associated with some bounds on the minimum distance of codes are actually hard for these classes (for codes represented by arbitrary circuits). In fact, we show that finding two vectors within a certain distance d is actually hard for the important (and well-studied) class 𝖯𝖶𝖯𝖯=(B2,B)-𝖯𝖬𝖯𝖯2 in essentially all parameter regimes for which an efficient algorithm is not known, so that our hardness results are essentially tight. In fact, for some d (depending on the block length, message length, and alphabet size), we obtain both hardness and containment. We therefore completely settle the complexity of this problem for such parameters and add coding problems to the short list of problems known to be complete for 𝖯𝖶𝖯𝖯.

We also study (A,B)-𝖯𝖬𝖯𝖯L as an interesting family of complexity classes in its own right, and we uncover a rich structure. Specifically, we use recent techniques from the cryptographic literature on multicollision-resistant hash functions to (1) show inclusions of the form (A,B)-𝖯𝖬𝖯𝖯L(A,B)-𝖯𝖬𝖯𝖯L for certain non-trivial parameters; (2) black-box separations between such classes in different parameter regimes; and (3) a non-black-box proof that (A,B)-𝖯𝖬𝖯𝖯L𝖥𝖯 if (A,B)-𝖯𝖬𝖯𝖯L𝖥𝖯 for yet another parameter regime. We also show that (A,B)-𝖯𝖬𝖯𝖯L lies in the recently introduced complexity class Polynomial Long Choice for some parameters.

Keywords and phrases:
Multicollisions, Error-correcting codes, Lattices
Funding:
Huck Bennett: This work is supported in part by NSF Grant No. 2312297. Part of this work was done while the author was at Oregon State University.
Surendra Ghentiyala: This work is supported in part by the NSF under Grants Nos. CCF-2122230 and CCF-2312296, a Packard Foundation Fellowship, and a generous gift from Google.
Noah Stephens-Davidowitz: This work is supported in part by the NSF under Grants Nos. CCF-2122230 and CCF-2312296, a Packard Foundation Fellowship, and a generous gift from Google. Some of this work was completed while the author was visiting the National University of Singapore and the Centre for Quantum Technologies.
Copyright and License:
[Uncaptioned image] © Huck Bennett, Surendra Ghentiyala, and Noah Stephens-Davidowitz; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Computational complexity and cryptography
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2024/018/ [4]
Acknowledgements:
The authors would like to thank Atri Rudra for very helpful discussions.
Editor:
Raghu Meka

1 Introduction

We study the complexity of total coding problems, as well as total lattice problems. (We focus on coding problems in the introduction.)

Recall that a q-ary code with messages of length k, block length n, and distance d is a function 𝒞:𝔽qk𝔽qn such that

d=min𝒙1𝒙2Δ(𝒞(𝒙1),𝒞(𝒙2)),

where Δ is the Hamming distance, i.e., the number of entries in which two elements in 𝔽qn differ. For our purposes, we think of 𝒞 as an arbitrary circuit with size poly(n,k,logq). (Much of the literature is concerned with the special case when 𝒞 is a non-singular linear function, in which case the code is called a linear code. We discuss our choice of definition more in Section 1.5.)

The most fundamental question in coding theory is to find bounds on d for fixed n, k, and q. In other words, one wishes to argue that any set of qk points in 𝔽qn must contain two points that are within Hamming distance d for the smallest value of d possible. Many beautiful upper bounds on the distance d are known in terms of n, k, and q.

We therefore define the natural family of computational search problem associated with this question. Specifically, we define the (n,k,d)q-Short Distance Problem ((n,k,d)q-SDP) as the computational search problem in which the input is a circuit 𝒞:𝔽qk𝔽qn and the goal is to find 𝒙1𝒙2 such that Δ(𝒞(𝒙1),𝒞(𝒙2))d.222The problem of finding 𝒙1𝒙2 that minimizes Δ(𝒞(𝒙1),𝒞(𝒙2)) for the given input 𝒞 is called the Minimum Distance Problem (MDP), and it is known to be 𝖭𝖯-hard, even to approximate, and even for linear codes [14]. In contrast, SDP only asks for two codewords within some fixed distance, independent of the minimum distance of the code. Furthermore, one is often interested only in the important special case in which 𝒞 computes a linear function over 𝔽q, but we stick to this more general setting because our upper bounds for this problem apply even to arbitrary codes (making them stronger), and unfortunately our lower bounds seem to require us to work with arbitrary codes specified by circuits. Notice that (n,k,d)q-SDP is total if and only if all q-ary codes with message length k and block length n have distance at most d.

So, upper bounds on the minimum distance of a code correspond to proofs of totality of SDP for some choice of d=d(n,k,q). In other words, such bounds can be viewed as proofs that for such a choice of d, SDP is in the complexity class 𝖳𝖥𝖭𝖯, which consists of total search problems with efficiently verifiable solutions. Starting with the seminal work of Papadimitriou [34], the study of 𝖳𝖥𝖭𝖯 has focused largely on placing important total search problems in subclasses of 𝖳𝖥𝖭𝖯, where it is often convenient to think of each subclass as capturing problems whose totality can be attributed to a particular fundamental principle. E.g., 𝖯𝖯𝖯 (which we will discuss much more shortly) can be thought of as the subclass of 𝖳𝖥𝖭𝖯 corresponding to problems whose totality can be proven via a particular version of the pigeonhole principle.

It is therefore natural to ask what more we can say about the complexity of (n,k,d)q-SDP when d=d(n,k,q) is equal to one of the many celebrated upper bounds on the minimum distance of a q-ary code with block length n and message length k. Indeed, there are now many important upper bounds known on the minimum distance of codes (in terms of n, k, and q), such as the Singleton bound [39], the Hamming bound [19], the Plotkin bound [36], the Elias-Bassalygo bound [3], and the MRRW bound(s) [29]. These bounds are ubiquitous in the computer science literature and beyond. (See, e.g., [18].)

However, in spite of the importance of these bounds, there has been surprisingly little work addressing the complexity of the associated (total) search problems of actually finding a pair of distinct codewords within one of these bounds. Indeed, to our knowledge, the only prior work directly addressing such a question is the beautiful (and recent) work of Debris-Alazard, Ducas, and van Woerden [11], which showed (among other things) that the problem of solving SDP on linear codes within the Griesmer bound [16] can be solved efficiently.

1.1 The problem of finding multicollisions

In this work, we will show connections between the complexity of such total coding (and lattice) problems and a natural family of total search problems.

Specifically, for integers A>B and L2, we consider the computational problem in which the input is a circuit 𝒞:[A][B],333See the full version [4] for discussion of what we mean by a circuit with input size A and output size B when A and B are not necessarily powers of 2. and the goal is to find L distinct input values x1,x2,,xL such that 𝒞(x1)=𝒞(x2)==𝒞(xL). Notice that, by the (generalized) pigeonhole principle, this problem is total if (and only if) the size A of the domain of 𝒞 and the size B of its range satisfy LA/B. We will focus on the regime in which the problem is total.

In the special case when L=2 and A=B+1, this is the canonical complete problem for the Polynomial Pigeonhole Principle complexity class (𝖯𝖯𝖯).444When L=2 and A/B1+1/poly(logB), one obtains the canonical complete problem for Polynomial Weak Pigeonhole Principle complexity class (𝖯𝖶𝖯𝖯). We will say much more about this distinction later, but for now we largely ignore this. This class was introduced by Papadimitriou in his work studying problems in 𝖳𝖥𝖭𝖯 (i.e., total search problems in 𝖥𝖭𝖯[34]. Since then, the class 𝖯𝖯𝖯 has been of great interest because it is known to contain many important computational problems, such as the problem of breaking a cryptographic collision-resistant hash function, the problem of breaking a one-way permutation, factoring (under randomized reductions) [22], and many more problems of interest [41, 8]. Most relevantly to our work, the problem of finding a non-zero vector in a lattice within Minkowski’s bound (in the norm, though see below) was shown to be in 𝖯𝖯𝖯 in [2].

Because of its relationship with the pigeonhole principle, we call the computational problem of finding such a collision Pigeon. (Papadimitriou originally used this name for the special case when L=2 [34].) In fact, we get a family of problems (A,B)-PigeonL, parameterized by the circuit input size A, the circuit output size B, and the number of colliding inputs LA/B that we must find. Accordingly, we define the complexity class (A,B)-𝖯𝖬𝖯𝖯L as the set of all search problems that have a polynomial-time (Karp) reduction to (A,B)-PigeonL.555To define 𝖯𝖬𝖯𝖯 formally, A and B should be functions of some asymptotic parameter n, and L might also be a function of n. But, we mostly ignore this issue in the introduction for simplicity.

In the complexity-theoretic literature, there is very little work on Pigeon for L>2 (although we note Sotiraki’s thesis [40, Section 4.5] as foundational work in this direction; and see also Section 1.3 for discussion of recent independent work by [21]). On the other hand, there is an exciting line of work in the cryptographic literature studying multicollision-resistant hash functions [23, 33, 43, 5, 28, 6, 25, 40, 13, 37, 20]. From our perspective, one can think of such cryptographic works as studying the average-case complexity of the Pigeon problem (under efficiently sampleable distributions of circuits 𝒞). However, even these works have not fully explored this space, primarily focusing on the case where L is constant and logA2logB.

1.2 Our results

In this work, we show many connections between coding and lattice problems and 𝖯𝖬𝖯𝖯. We also further study the classes (A,B)-𝖯𝖬𝖯𝖯L.

1.2.1 Connections between coding problems and 𝗣𝗠𝗣𝗣

Our first set of results shows a number of connections between 𝖯𝖬𝖯𝖯 and computational search problems related to error-correcting codes. (See Figure 1 for a summary of some of our results for binary codes.)

Upper bounds on the Shortest Distance Problem

We show that many of the celebrated bounds on the minimum distance of a code yield versions of SDP that are in different versions of 𝖯𝖬𝖯𝖯, including the Singleton bound [39], the Hamming bound [19], the Plotkin bound [36], and the Elias-Bassalygo bound [3]. The following theorem shows some of these results.

Theorem 1 (Informal, see full version [4]).

The following hold for any n, k, and prime power q and any constant ε>0.

  1. 1.

    (n,k,dS)-SDPq is in 𝖯𝖶𝖯𝖯 for dS(n,k,q):=nk+1 equal to the Singleton bound.

  2. 2.

    (n,k,dH)-SDPq is in 𝖯𝖯𝖯 and (n,k,(1+ε)dH)-SDPq is in 𝖯𝖶𝖯𝖯 for dH=dH(n,k,q) equal to the Hamming bound.

  3. 3.

    (n,k,dP)-SDPq is in 𝖯𝖬𝖯𝖯 for dP(n,k,q)=(11/q+ε)(nk) slightly larger than the Plotkin bound.

  4. 4.

    (n,k,dE)-SDPq is in 𝖯𝖬𝖯𝖯 for dE(n,k,q) equal to the Elias-Bassalygo bound.

In fact, these results are special cases of more general results that we prove. Specifically, we show that one can obtain smaller values of d by increasing L or decreasing A (while holding B fixed and maintaining totality).

Our smooth tradeoff in particular is perhaps a bit surprising. Indeed, it is a-priori not clear why it would be useful to take L>2 in such a reduction, since, after all, the goal in SDP is to find two codewords. (Of course, this is because our reductions mimic the proofs of the Plotkin and Elias-Bassalygo bounds. Both proofs work by first finding L>2 codewords that lie in a relatively small Hamming ball and then showing that two of these codewords must be quite close to each other.)

Tight hardness for 𝐒𝐃𝐏 and completeness when 𝑳=𝟐

We next show tight hardness results for SDP. In particular, we consider both the case when a code is represented by an arbitrary circuit 𝒞 and the case when a code is represented in systematic form (i.e., the case when the first k symbols in a codeword consist of the message; see Section 1.5). We call this restricted problem sysSDP.

Theorem 2 (Informal, see full version [4]).

For any n, k, and prime power q and any constant ε>0, (n,k,d)-SDPq is 𝖯𝖶𝖯𝖯-hard for d(n,k,q)=(11/qε)n. Furthermore, (n,k,dP)-sysSDPq is 𝖯𝖶𝖯𝖯-hard for distance dP(n,k,q)=(11/qε)(nk) slightly below the Plotkin bound.

Theorem 2 is essentially tight. In particular, for d(11/q)n, it is easy to see that SDP can be solved efficiently. And, we show that there is also a simple efficient algorithm for sysSDP for d(11/q)(nk) (see full version [4]). So, Theorem 2 essentially characterizes when there is a polynomial-time algorithm for SDP and sysSDP (assuming that 𝖥𝖯𝖯𝖶𝖯𝖯).

Furthermore, there is a large overlap between the parameter regime for which we show containment in 𝖯𝖶𝖯𝖯 and the regime for which we show 𝖯𝖶𝖯𝖯-hardness (both for arbitrary codes and codes in systematic form), so that for a large range of parameters we show that SDP is actually complete for 𝖯𝖶𝖯𝖯. This essentially settles the complexity of SDP in these regimes and adds SDP to the relatively short list of problems known to be complete for this class. For example, we show that in many settings, the Hamming bound is 𝖯𝖶𝖯𝖯-complete. (Since the Hamming bound can be viewed as a coding-theoretic analogue of Minkowski’s bound for lattices, one might view this result as resolving a coding-theoretic analogue of the conjecture that Minkowski’s bound is 𝖯𝖯𝖯-complete [2]. However, perhaps a better coding-theoretic analogue would be 𝖯𝖯𝖯-hardness of this problem over linear codes, which we do not prove. See Section 1.6.)

We summarize these results for binary codes in Figure 1. The picture for q-ary codes is quite similar (with the various bounds replaced by their q-ary versions). See full version [4] for the details.

Figure 1: Two plots showing the complexity of the problem of finding two codewords within relative distance δ:=d/n in a code with rate R:=k/n for binary codes (represented by circuits). The shaded red region represents 𝖯𝖶𝖯𝖯-hardness. (The entirety of both figures are shaded red.) Regions not shown (δ>1/2 for the left figure and δ>12R for the right figure) are known to be in 𝖥𝖯. The shaded blue region represents containment in 𝖯𝖶𝖯𝖯. The region of overlap between red and blue therefore represents regimes where the problem is 𝖯𝖶𝖯𝖯-complete. The area covered with green dots represents problems in 𝖯𝖬𝖯𝖯L for some L2. The left figure is for arbitrary (binary) codes represented by arbitrary circuits, while the right figure is for (binary) codes in systematic form. (See Section 1.5 for discussion of systematic form and how we define codes in this context.)
Finding many close codewords and 𝗣𝗠𝗣𝗣

We also show analogous results for the problem of finding many codewords that all lie in a small ball. This problem corresponds to bounds on the list decoding radius of a code (just as the problem of finding close pairs of codewords corresponds to bounds on the unique decoding radius).

We first show that list-decoding generalizations of the Singleton bound and the Hamming bound lie in 𝖯𝖬𝖯𝖯 for appropriate parameters. (See full version [4]) We then show that the problem of finding many codewords that all lie in a small ball is also hard for 𝖯𝖬𝖯𝖯 for some choices of parameters.

Our results for this problem are significantly more subtle than the L=2 case, since for L>2, the complexity of (A,B)-PigeonL seems to vary quite a bit with the parameters, whereas for L=2, the parameters A and B matter much less. In particular, for L>2 we do not find a complete problem for 𝖯𝖬𝖯𝖯L. See full version [4] for the details.

1.2.2 Lattice problems in 𝗣𝗠𝗣𝗣

We next show similar results for computational lattice problems. Recall that a lattice is the set of all integer linear combinations of n linearly independent basis vectors 𝐁=(𝒃1,,𝒃n)n, i.e.,

=(𝐁)={z1𝒃1++zn𝒃n:zi}.

A fundamental question about lattices asks when there must exist a non-zero lattice point 𝒚𝟎 such that 𝒚Kr, where K is some norm of interest and r is some bound.

Minkowski’s celebrated theorem [32] tells us that such a 𝒚 is guaranteed to exist for some

r2det()1/n/vol(K)1/n,

where det():=|det(𝐁)| is the lattice determinant and K is the unit ball of the norm K. The corresponding computational problem MinkowskiK is the search problem in which the input is a basis 𝐁 for a lattice, and the goal is to output a non-zero vector 𝒚𝟎 such that

𝒚K2det()1/n/vol(K)1/n.

This problem is quite important in cryptography, particularly in the 2 norm. In the special case of the norm, it is known that Minkowski𝖯𝖯𝖯, and Ban, Jain, Papadimitriou, Psomas, and Rubinstein [2] conjectured that Minkowski is actually 𝖯𝖯𝖯-complete. (This conjecture remains open.)

We study the more general problem of finding 𝒚𝟎 with

𝒚pγdet()1/n,

where

𝒚p:=(|y1|p++|yn|p)1/p

is the p norm. This problem is known as the γ-Hermite Shortest Vector Problem (γ-HSVPp). Since this problem is relevant to cryptography, it is very well studied for a wide range of parameters γ [27, 38, 15, 31, 1], particularly for p=2.

The case γ=2/vol(pn)1/n corresponds to Minkowskip, where pn is the unit p ball. However, Minkowski’s bound is not tight (except in the case when K tiles space, such as when K is the hypercube). For example, Blichfeldt improved on Minkowski’s celebrated theorem in the 2 norm, proving that γ-HSVP2 is still total when γ2/vol(2n)1/n [7].666We note that Blichfeldt proved two distinct relevant theorems in this context, which might easily confuse the reader. So, we endeavor to clarify here. The theorem commonly referred to as “Blichfeldt’s theorem” says that any measurable set Sn with vol(S)>det() is guaranteed to contain two points 𝒙1,𝒙2S with 𝒙1𝒙2 such that 𝒙1𝒙2. This theorem is often discussed in the context of total lattice problems because it can be used to prove Minkowski’s theorem. In fact, Sotiraki, Zampetakis, and Zirdelis introduced a related computational problem that they called BLICHFELDT (in which the set S is represented implicitly by a circuit), and they showed that BLICHFELDT is actually 𝖯𝖯𝖯-complete. It is not clear how BLICHFELDT is related to γ-HSVP for γ2/vol(2n)1/n (except that they are two computational lattice problems whose totality was proven by Blichfeldt).

Perhaps surprisingly, we show that γ-HSVP2(A,B)-𝖯𝖬𝖯𝖯L for γ2/vol(2n)1/n and appropriate choices of A, B, and L. In fact, we show a smooth tradeoff between the Hermite factor γ and the parameters A, B, and L, showing that one can obtain shorter vectors by either increasing L or decreasing the ratio between A and B (while maintaining totality). A similar story holds for p norms more generally.

Theorem 3 (Informal; see full version [4]).

For any constant integer p1 and any A,B,L+ satisfying 2s(n)=B<A2poly(n) and LA/B for a sufficiently large polynomial s(n), it holds that γ-HSVPp(A,B)-𝖯𝖬𝖯𝖯L for

γdp(L,n)(A/B)1/nvol(pn)1/n,

where n is the rank of the lattice in the γ-HSVP instance and 1dp(L,n)2 is a particular function that is decreasing in the parameter L.

1.2.3 Containments between different classes

The above results bring new attention to the classes 𝖯𝖬𝖯𝖯. So, we next study containments between these 𝖯𝖬𝖯𝖯 classes with different parameters A, B, and L, and other classes of interest in 𝖳𝖥𝖭𝖯.

Recall that the celebrated Merkle-Damgård construction [30, 10] shows that the ratio of the input size A to the output size B of a circuit essentially does not matter in the special case when L=2, since one can efficiently reduce from the problem of finding a single collision (i.e., L=2) in a barely compressing circuit 𝒞:AB with A=B+B/poly(logB) to the (seemingly much easier) problem of finding a single collision in a much more compressing circuit 𝒞:AB with logA=log(B)C for any constant C>1. In our terminology,

(B+B/poly(logB),B)-𝖯𝖬𝖯𝖯2=(2logCB,B)-𝖯𝖬𝖯𝖯2.

This surprising collapse of complexity classes is known as domain extension, and it has innumerable applications in cryptography and complexity theory.777Admittedly, we are deliberately conflating the distinction between the problem of breaking a cryptographic hash function (which is what Merkle and Damgård actually studied) and the problem of finding a collision in an arbitrary worst-case circuit. All of the above statements hold in both cases.

Already in 2004, Joux noticed that Merkle and Damgård’s elegant domain extension technique does not seem to work for L>2 [23]. So, it appears that for L>2, the relationship between A and B (i.e., how “compressing” the circuit 𝒞 is) might matter quite a bit. This suggests a surprising fundamental difference between the case L=2 and the case when L>2.

However, we show two (rather weak) notions of domain extension that still work in the setting of multicollisions. The first such result follows by analyzing the Merkle-Damgård construction in this setting and showing that it does achieve something, albeit with a large loss in parameters. The second result shows a more sophisticated reduction (using Merkle trees together with list-recoverable codes) that is better than the Merkle-Damgård-based reduction in the regime where A is very large compared to B. The latter result can be thought of as a translation into our setting of a beautiful result due to Bitanski, Kalai, and Paneth [6] in the setting of multicollision-resistant hash functions. (In fact, the proof is substantially simpler in our setting than in that of [6], since we do not have to worry about the many additional issues that arise in the average-case setting.)

Together, these results show that it is possible to reduce the problem of finding multicollisions in a less compressing function to the problem of finding multicollisions in a more compressing function, but at the expense of a large loss in the number of collisions found.

Theorem 4 (Informal; see full version [4]).

For m>a>b,

(2a,2b)-𝖯𝖬𝖯𝖯L(2m,2b)-𝖯𝖬𝖯𝖯L

for LL(ab)/(mb).

For any r2, any k1, and any L,

(2vr,2v)-𝖯𝖬𝖯𝖯M(2vrk,2v)-𝖯𝖬𝖯𝖯M

under randomized reductions, for

MMlogr/(2logr+logk+log(M)/2).

We also show that the class 𝖯𝖬𝖯𝖯 is contained in the recently introduced ([35]) complexity class Polynomial Long Choice (𝖯𝖫𝖢), for appropriate choices of the parameters A, B, and L. In fact, we show a reduction to the Unary Long Choice problem. (This result was recently independently discovered in [21]. See Section 1.3.) This strengthens a result of Pasarkar, Papadimitriou, and Yannakakis [35], who showed that 𝖯𝖶𝖯𝖯𝖯𝖫𝖢.

Theorem 5 (Informal; see full version [4]).

For any L<n,

(2n,2nL)-𝖯𝖬𝖯𝖯L𝖯𝖫𝖢

1.2.4 Black-box separations (and a non-black-box non-separation)

Our final set of results concerns black-box separations between (A,B)-𝖯𝖬𝖯𝖯L for different values of A, B, and L, which suggest that it might be hard to prove stronger containments than what we show above. However, we note that recent independent work of Jain, Li, Robere, and Xun [21] also showed exciting black-box separations of this form. While our results are formally incomparable to theirs, we believe that the results of [21] are more interesting than our own. (See Section 1.3.)

The starting point for our results is a beautiful idea due to Komargodski, Naor, and Yogev [25] for separating (2n,2n/2)-𝖯𝖬𝖯𝖯L from (2n,2n/2)-𝖯𝖬𝖯𝖯L for any constants LL (particularly in the average-case setting relevant to cryptography). Unfortunately, however, the proof in [25] had a subtle bug that does not seem easy to fix [26].

We use similar ideas to show two different black-box separations, which can be seen as partial evidence that domain extension and range compression are not possible when L>2. However, we note that our black-box separations are rather weak, since they only rule out rather fine-grained black-box reductions between the classes.

Finally, we show that a very clever non-black-box proof due to Rothblum and Vasudevan [37] extends to our setting. In particular, Rothblum and Vasudevan show, using non-black-box techniques, that the existence of a certain sufficiently strong multicollision-resistant hash function implies the existence of a collision-resistant hash function. We prove an analogue of their result in our (worst-case) setting, showing that suitable hardness of 𝖯𝖬𝖯𝖯 for large L implies hardness of 𝖯𝖬𝖯𝖯 for smaller L. (After a preliminary version of this work appeared, Buzek and Tessaro [9] improved upon the main result in [37]. We do not know whether the stronger result in [9] extends to our setting.)

As these results are rather technical, we refer the reader to the full version [4] for the details.

1.3 Comparison with Jain, Li, Robere, and Xun

Recent exciting work by Jain, Li, Robere, and Xun [21] also defines and studies the computational problem (A,B)-PigeonL and the associated complexity class (A,B)-𝖯𝖬𝖯𝖯L (with slightly different notation). (They also define additional classes that correspond to the union of (A,B)-𝖯𝖬𝖯𝖯L over different parameters A, B, and L.) [21] is concurrent with and independent of this work (and appeared as a preprint shortly before we released a preprint of the present work). Here, we provide a brief comparison of their work with ours.

Both the present work and [21] define multi-collision classes and study relationships between them. At a high level, [21] has morally stronger (although formally incomparable) results on the structural complexity of these classes, whereas the present work focuses on showing containment and hardness of coding and lattice problems with respect to these classes (which [21] does not study at all).

In terms of structural complexity, [21] contains exciting black-box separation results, which, although formally incomparable to our black-box separations, we think of as stronger. In particular, [21] are essentially able to black-box separate (A,B)-𝖯𝖬𝖯𝖯L from (A,B)-𝖯𝖬𝖯𝖯L for any constants LL and any (reasonable) A, B, A, and B. They also show black-box separations between 𝖯𝖬𝖯𝖯 and other interesting complexity classes in 𝖳𝖥𝖭𝖯, and in particular show a black-box separation between the Ramsey problem and 𝖯𝖬𝖯𝖯. We refer the reader to [21] for the technical details.

[21] also studies the relationship between 𝖯𝖫𝖢 and 𝖯𝖬𝖯𝖯. Indeed, they prove a result that is essentially identical to our Theorem 5, which shows that 𝖯𝖬𝖯𝖯𝖯𝖫𝖢 for certain parameters. (Formally, our technical result in the full version [4] is more general than the analogous result in [21], but it is clear that the proof in [21] yields the more general result as well.) In addition, they show a containment in the other direction, that 𝖯𝖫𝖢𝖯𝖬𝖯𝖯, albeit for different parameters. Finally, [21] shows that an interesting problem in 𝖳𝖥𝖡𝖰𝖯 is in 𝖯𝖬𝖯𝖯. The latter two results have no analogue in the present work.

In this work, we focus on the relationship of 𝖯𝖬𝖯𝖯 with coding and lattice problems (see the full version [4]). We also show Merkle-Damgård-style reductions between 𝖯𝖬𝖯𝖯 with different parameters (see full version [4]) and a non-block-box non-separation in the style of [37] (see full version [4]). These topics are not studied in [21].

1.4 Other related work

Our work lies at the intersection of a number of different areas, and there is therefore much related work to discuss in addition to [21]. Here, we focus on how this prior work relates to our work.

The complexity of total lattice problems

The complexity of Minkowski’s bound and HSVP more generally is quite well studied, particularly in the norm and the 2 norm. In particular, algorithms for HSVP2 play a very important role in lattice-based cryptography, and algorithms for HSVP2 with different approximation factors are a very active area of research [27, 38, 15, 31, 1]. Some of these algorithms can be viewed as constructive proofs of classical results about the minimum distance of a lattice relative to the determinant. (For example, the celebrated LLL algorithm gives a constructive proof of Hermite’s bound [27], and the slide reduction algorithm gives a constructive proof of Mordell’s inequality [15].)

Finding a vector within Minkowski’s bound in the norm is considered one of the most important problems in the complexity class 𝖯𝖯𝖯. In particular, Ban, Jain, Papadimitriou, Psomas, and Rubinstein showed that other important problems in 𝖯𝖯𝖯 can be reduced to this problem, and they conjectured that this problem is actually 𝖯𝖯𝖯-complete [2]. Sotiraki, Zampetakis, and Zirdelis further investigated the relationship between lattice problems, 𝖯𝖯𝖯, and 𝖯𝖶𝖯𝖯, showing two problems related to lattices that are 𝖯𝖯𝖯-complete and 𝖯𝖶𝖯𝖯-complete respectively [41].

The complexity of total coding problems

To our knowledge, much less is known about the complexity of total problems that arise in coding theory. Instead, much work has focused on the γ-approximate Minimum Distance Problem (γ-MDP), in which the input is a linear code and the goal is to output a pair of distinct codewords 𝒄1,𝒄2 such that the distance between them is within a factor γ of the minimum distance of the input code (or, since the code is linear, one can equivalently output a non-zero codeword with nearly minimal Hamming weight). This problem is known to be 𝖭𝖯-hard [42], even to approximate [14]. In contrast, we are interested in the problem of finding distinct codewords 𝒄1,𝒄2 that are within distance d, where d depends only on the message length k and block size n of the code (and not on the minimum distance). We are particularly interested in the total regime, where such problems are very unlikely to be 𝖭𝖯-hard.

To our knowledge, the only direct work in this area is a beautiful paper by Debris-Alazard, Ducas, and van Woerden [11], which showed an LLL-like algorithm for linear codes that efficiently finds codewords within the Griesmer bound [16]. (Note that this algorithm only works for linear codes, and indeed the Griesmer bound itself only applies to linear codes.)

Literature on multicollisions

There is quite a bit of prior work in the cryptography literature on multicollision-resistant hash functions. In our terminology, a multicollision-resistant hash function is some efficiently sampleable distribution of instances of (A,B)-PigeonL (i.e., circuits 𝒞:[A][B]) that are actually hard (i.e., that cannot be solved by polynomial-time algorithms with non-negligible probability). A survey of this literature is beyond the scope of this work. Broadly speaking, work in this area has been concerned with (1) understanding the relationship between collision resistance and multicollision resistance (i.e., the relationship between the case when L=2 and the case when L>2); (2) finding applications of multicollision-resistant hash functions; and (3) building multicollision-resistant hash functions from various cryptographic hardness assumptions.

Many of the techniques that we use to understand the relationship between (A,B)-PigeonL in different parameter regimes are directly inspired by this cryptographic literature. In particular, our inclusion based on Merkle trees and list-recoverable codes is a direct adaption of Bitanski, Kalai, and Paneth’s construction from the cryptographic setting to our setting [6]; our black-box separations are inspired by [25]; and our non-black-box non-separation is a direct adaptation of Rothblum and Vasudevan’s proof from the cryptographic setting to our setting [37]. (See also the very recent work of [9].)

In contrast, there is very little prior work in the worst-case setting. To our knowledge, the only works that consider the worst-case complexity of finding multicollisions are Komargodski, Naor, and Yogev [25] and Sotiraki [40] (though see Section 1.3). In both of these works, the worst-case complexity of (A,B)-PigeonL is not the primary focus, but both do define the special cases of the complexity class (A,B)-𝖯𝖬𝖯𝖯L, when A=B2 (specifically, A=22n and B=2n) and L is a constant. This is a natural setting of parameters, but we show interesting results in other parameter regimes as well.

Sotiraki in particular shows a complete problem for 𝖯𝖬𝖯𝖯 that is similar to the 𝖯𝖯𝖯-complete and 𝖯𝖶𝖯𝖯-complete problems from [41]. In particular, Sotiraki’s 𝖯𝖬𝖯𝖯-complete problem bears some resemblance to certain lattice and coding problems, though the relationship is unclear. We refer the reader to [41, 40] for more.

Komargodski, Naor, and Yogev claimed a black-box separation between (22n,2n)-𝖯𝖬𝖯𝖯L and (22n,2n)-𝖯𝖬𝖯𝖯L for any constants L<L, but their elegant proof contained a subtle bug that has not been fixed [26]. Our black-box separations use similar ideas but are weaker than what they originally claimed.

Literature on 𝗣𝗣𝗣 and 𝗣𝗪𝗣𝗣

In contrast, there is much literature studying the complexity classes 𝖯𝖯𝖯 and 𝖯𝖶𝖯𝖯, which correspond to the special case of 𝖯𝖬𝖯𝖯 when L=2 (in different parameter regimes). 𝖯𝖯𝖯 was introduced by Papadimitriou in his seminal work [34]. (𝖯𝖶𝖯𝖯 appeared in many works but seems to have been first named 𝖯𝖶𝖯𝖯 in [22].) Since then, many problems of interest have been shown to be contained in either 𝖯𝖶𝖯𝖯 or 𝖯𝖯𝖯 [22, 2, 41, 8]. Until recently, only a small handful of problems were known to be complete for 𝖯𝖯𝖯 or 𝖯𝖶𝖯𝖯 [34, 41, 2]. However, Bourneuf, Folwarczný, Hubáček, Rosen, and Schwartzbach recently showed a number of problems arising from extremal combinatorics that are complete for either 𝖯𝖯𝖯 or 𝖯𝖶𝖯𝖯 [8].

There has also been some literature concerned with generalizing 𝖯𝖯𝖯 and 𝖯𝖶𝖯𝖯 to classes other than 𝖯𝖬𝖯𝖯. In particular, Pasarkar, Papadimitriou, and Yannakakis [35] recently introduced the class Polynomial Long Choice (𝖯𝖫𝖢), which can be thought of as corresponding to the generalization of the pigeonhole principle obtained by iterating the pigeonhole principle many times.

1.5 A note on “codes” represented by circuits, injectivity, and systematic form

The coding theory results in this paper are concerned with the problem of finding close codewords (or many codewords that lie in a relatively small ball) in a “code” represented by an arbitrary circuit 𝒞:𝔽qk𝔽qn with size poly(n,k,logq). It is far more common in the literature to consider linear codes represented by a generator matrix (or, equivalently, an invertible circuit with linear gates). (Sometimes when arbitrary codes are considered in the literature, the code is simply represented by listing all qk points, while we represent our codes succinctly.)

Whether one should really think of a generic circuit 𝒞:𝔽qk𝔽qn as representing a “code” is not so clear. In particular, such a circuit might not be injective, i.e., there might be distinct “messages” 𝒙1,𝒙2𝔽qk that map to the same “codeword” 𝒞(𝒙1)=𝒞(𝒙2). (And, there is likely no way to efficiently determine whether such a circuit is injective. Indeed, determining this is 𝖼𝗈𝖭𝖯-complete.) But, we find the coding-theoretic perspective to be quite useful. In particular, the notion of the distance of a code still makes sense with this slightly more general definition, and the bounds on the distance that we discuss in this paper still apply. Indeed, much of coding theory still makes sense if we treat such degenerate, non-injective codes simply as “codes with distance 0,” and standard bounds in coding theory, such as the Singleton and Hamming bounds, still apply.

Of course, it is not an issue, and actually a strength that our upper bounds apply to such general “codes.” Indeed, any upper bound that applies in the more general case when “codes” are represented by arbitrary circuits certainly applies to the special case of injective circuits or the even more special (and quite important) case of linear codes.

For our lower bounds, however, this can be viewed as a major flaw in our model. To partially mitigate this, we prove our hardness results in two different settings.

In the first “generic” setting, we show hardness for “codes” represented by arbitrary circuits 𝒞:[q]k[q]n, with no presumed structure. In particular, the circuit is not necessarily injective. Indeed, one may argue that our reductions are rather artificial, in that the reductions often produce “codes” 𝒞 such that Δ(𝒞(𝒙i),𝒞(𝒙j)) is either zero or strictly larger than d, where d is the bound on the distance needed to solve the associated coding problem. So, these reductions rely quite heavily on this rather strange definition of a code in which multiple messages can map to the same codeword.

In the “systematic” setting, our codes 𝒞:[q]k[q]n are in systematic form, which means that the first k characters of 𝒞(𝒙) are simply 𝒙 itself. (Formally, we actually work with a smaller circuit 𝒞:[q]k[q]nk, and we simply interpret (𝒙,𝒞(𝒙)) as the codeword associated with 𝒙.) Such circuits are clearly injective, and therefore this setting is much less artificial. (Codes in systematic form are also quite fundamental objects that arise naturally in coding theory.) In this setting, our formal results are a bit different, and we even show that there are efficient algorithms for this setting in regimes where the problem is provably hard for codes represented by arbitrary circuits. However, the high-level picture is the same. In particular, we still show completeness in this setting. (See Figure 1.)

1.6 Open problems

We leave a number of interesting open problems. Here, we mention some of them.

Complexity of coding and lattice problems

We place the computational problems associated with a number of fundamental bounds on the minimum distance of a code in 𝖯𝖬𝖯𝖯. However, we were unable to say anything non-trivial about the complexity of Delsarte’s linear programming bound [12] and the closely related MRRW bound(s) [29], which are the best known bounds in an important range of parameters. The associated computational problems are, by definition, in 𝖳𝖥𝖭𝖯, but we do not know any natural subclass of 𝖳𝖥𝖭𝖯 that contains these problems.

Similarly, the best known bound on the minimum distance in the 2 norm of a lattice relative to the determinant is the celebrated bound due to Kabatjanskiĭ and Levenšteĭn [24] (which is better than Blichfeldt’s [7] by a factor of roughly 21/10). The problem of finding a vector within the Kabatjanskiĭ and Levenšteĭn bound is again, by definition, in 𝖳𝖥𝖭𝖯, but we do not know any natural subclass of 𝖳𝖥𝖭𝖯 that contains this problem.

In a different direction, we show hardness of various total coding problems (and even completeness in some regimes), but only for codes represented by circuits. It would be very exciting to show hardness results of total problems on linear codes (represented, e.g., by generator matrices), since these are the more standard problems. (The analogous question for lattices was already asked in [2] and also remains open.)

Better understanding of 𝗣𝗠𝗣𝗣 across different regimes

The first question that comes to mind about the complexity classes (A,B)-𝖯𝖬𝖯𝖯L is, of course, “how do these classes relate to each other across different parameter regimes?” In the case when L=2, it has long been known that the relationship between A and B does not matter much. For example,

((1+1/poly(n))2n,2n)-𝖯𝖬𝖯𝖯2=((2n+poly(n),2n)-𝖯𝖬𝖯𝖯2=𝖯𝖶𝖯𝖯.

For L>2, it seems unlikely that a similar result holds. We instead describe a rich and rather complicated set of inclusions (and non-black-box relationships) among these classes, as well as black-box separations.

However, we have no evidence that these results are tight. One would ideally like to show black-box separations and inclusions that are tight with one another. (Or, alternatively, one might hope to prove non-trivial equalities (A,B)-𝖯𝖬𝖯𝖯L=(A,B)-𝖯𝖬𝖯𝖯L like what we know in the case of L=L=2.) The analogous average-case question has been the topic of much research in the cryptography literature [23, 33, 43, 5, 6, 25, 28, 13, 40, 37, 20], but to the authors’ knowledge the worst-case setting was barely explored before this work and the recent exciting (concurrent and independent) work of Jain, Li, Robere, and Xun [21].

2 Upper bounds for coding problems

In this section, we prove that SDP and DenseBall are contained in 𝖯𝖬𝖯𝖯 in many parameter regimes of interest. In particular, for SDP, we show such results when the distance d(n,k,q) corresponds to a number of celebrated bounds on the minimum distance of a code, including the Singleton bound (Corollary 7), the Plotkin bound (Corollary 8), the Hamming bound (Corollary 13), and the Elias-Bassalygo bound (Corollary 15). For DenseBall, we show analogous results for the list Singleton bound (Corollary 11) and list Hamming bound (Corollary 13).

In fact, all of these results are corollaries of four technical results: Theorem 6, which reduces SDP to Pigeon via a simple truncation technique; Theorem 10, which does something similar for DenseBall; Theorem 12, which reduces DenseBall to Pigeon by “finding many overlapping Hamming balls centered on codewords”; and Theorem 14, which shows a generic reduction from SDP to DenseBall.

We also show in Section 2.1.1 an efficient algorithm for SDP within the Plotkin bound for codes in systematic form.

2.1 Upper bounds for the Singleton and Plotkin bounds

We start by giving a fairly general reduction from SDP (the problem of finding a pair of close codewords) to PigeonL. On input an instance 𝒞 of SDP, the reduction works by defining a compressing circuit 𝒞 that truncates the output of 𝒞. It then finds 𝒙1,,𝒙L such that 𝒞(𝒙1)==𝒞(𝒙L) using its PigeonL oracle, and finally outputs a pair 𝒙i𝒙j that minimizes Δ(𝒞(𝒙i),𝒞(𝒙j)). We then observe that special cases of this general result place (n,k,d)-SDPq in 𝖯𝖶𝖯𝖯 for d corresponding to the Singleton bound (Corollary 7) and 𝖯𝖬𝖯𝖯 for d corresponding to the Plotkin bound (Corollary 8).

Theorem 6.

Let k,m,n,q,L+ be such that mkn and 2Lqkm with Lpoly(n,logq). Then there is a Karp reduction from (n,k,d)-SDPq to (qk,qm)-PigeonL where d:=dq(nm,L).

Proof.

See full version [4].

We get useful corollaries from Theorem 6 that show how to compute vectors achieving the Singleton bound (the L=2 case) and the Plotkin bound (for larger L) using a PigeonL oracle.

We now show that the problem of finding a pair of codewords whose distance is at most the Singleton bound is in 𝖯𝖶𝖯𝖯.

Corollary 7 (Singleton bound in 𝖯𝖶𝖯𝖯).

Let k,n,q+ with kn. Then there is a Karp reduction from (n,k,nk+1)-SDPq to (qk,qk1)-Pigeon2. In particular, (n,k,nk+1)-SDPq is in 𝖯𝖶𝖯𝖯.

Proof.

The claim follows from Theorem 6 by setting m:=k1 and noting that, trivially, dq(nm,L)=dq(nk+1,L)nk+1 for any L2.

We now show that the problem of computing a pair of codewords whose distance satisfies the high-rate Plotkin bound is in 𝖯𝖬𝖯𝖯.

Corollary 8 (Plotkin bound in 𝖯𝖬𝖯𝖯).

Let k,n,d,m,q,L+ be such that kn, Lpoly(n), Lqkm, and

d>LL1(11q)(nm), (1)

then there is a Karp reduction from (n,k,d1)-SDPq to (qk,qm)-PigeonL which runs in time poly(n,log(q)).

Proof.

See full version [4].

2.1.1 Efficiently solving 𝐬𝐲𝐬𝐒𝐃𝐏 up to the Plotkin bound

We now note that there is an efficient algorithm for finding codewords within the Plotkin bound when the code is in systematic form. In other words, we show that (n,k,d)-sysSDPq can be solved efficiently for d below the Plotkin bound, where we recall that sysSDP is the special case of SDP in which the code is in systematic form.

Theorem 9.

Let k,n,q+ with k<n and q2. There is a poly(n,q)-time algorithm for (n,k,d)-sysSDPq where

d:=(11/q)(nk+logq(4qn)).
Proof.

See full version [4].

2.2 An upper bound for the list Singleton bound

Next, we give a reduction from DenseBall to Pigeon similar to the reduction from SDP to Pigeon in Theorem 6. We then show that this reduction places the list Singleton bound (i.e., a variant of the Singleton bound that corresponds to list decoding rather than unique decoding) in 𝖯𝖬𝖯𝖯 (up to a small error).

Theorem 10.

Let k,n,m,q,L+ be such that m<n and 2Lqkm. Then there is a Karp reduction from (n,k,nm(nm)/L)-DenseBallqL to (qk,qm)-PigeonL.

Proof.

See full version [4].

Finally, we show that the problem of computing a set of codewords lying in a ball whose radius almost satisfies the list Singleton bound is in 𝖯𝖬𝖯𝖯.

Corollary 11 (List Singleton bound in 𝖯𝖬𝖯𝖯).

Let n,k,q,L+ with k>logq(L), let m:=klogqL, and let

t:=nm+(nm)/L=nk+logqL(nk+logqL)/L. (2)

Then (n,k,t)-DenseBallqL is in (qk,qm)-𝖯𝖬𝖯𝖯L.

Proof.

See full version [4].

We remark that the radius t in Equation 2 is almost as small the smallest t satisfying the list Singleton bound. Indeed, one can check that any t satisfying the list Singleton bound must be such that

t>(11/L)(nk+logq(L1)),

and that the radius t in Equation 2 satisfies

t(11/L)(nk+logq(L))+1.

2.3 Upper bounds for the (list) Hamming and Elias-Bassalygo bounds

We now show containment of (n,k,d)-SDPq in 𝖯𝖬𝖯𝖯, where d corresponds to the Hamming bound and Elias-Bassalygo bound. In particular, when d is equal to the value given by the Hamming bound, we show that SDP is contained in 𝖯𝖯𝖯 (see Item 1 of Corollary 13) and for d slightly above the Hamming bound, SDP is contained in 𝖯𝖶𝖯𝖯 (see Item 2 of Corollary 13). In fact, we show a more general result that applies to the generalization (n,k,d)-DenseBallqL of SDP and shows that for d within the list Hamming bound, this problem is in 𝖯𝖬𝖯𝖯 (see Item 3 of Corollary 13). We then show that (n,k,d)-SDPq is in 𝖯𝖬𝖯𝖯 whenever d above the Elias-Bassalygo, via a generic reduction from SDP to DenseBall (see Corollary 15). In fact, we obtain a smooth tradeoff between the distance d obtained by the reduction and the parameters A, B, and L in the reduction to (A,B)-PigeonL.

2.3.1 The (list) Hamming bound

We now give a reduction from DenseBall to PigeonL, the key to which is the circuit 𝒞Hn,V,q giving an injection from [V] into a Hamming ball for sufficiently large V defined in the full version [4]. (In fact, we will use such circuits 𝒞Hn,V,q that are bijective.)

Theorem 12.

Let n,k,t,L+, and let q be a prime power satisfying

LqkVqn(t)qn.

Then there is a Karp reduction from (n,k,t)-DenseBallqL to (qkVqn(t),qn)-PigeonL.

Proof.

See full version [4].

We now use Theorem 12 to show that SDP and DenseBall with parameters corresponding to the (list) Hamming bound are in (one or more of) 𝖯𝖯𝖯, 𝖯𝖶𝖯𝖯, and 𝖯𝖬𝖯𝖯.

Corollary 13 ((List) Hamming bound in 𝖯𝖯𝖯, 𝖯𝖶𝖯𝖯, and 𝖯𝖬𝖯𝖯).

Let n,k,t,q,L+ with 2qpoly(n) and Lpoly(n). Let

L~:=qkVqn(t)qn.

Then:

  1. 1.

    If L~>1, then (n,k,2t)-SDPq is in 𝖯𝖯𝖯.

  2. 2.

    If L~1+1/poly(n), then (n,k,2t)-SDPq is in 𝖯𝖶𝖯𝖯.

  3. 3.

    If L~L, then (n,k,t)-DenseBallqL is in (qkVqn(t),qn)-𝖯𝖬𝖯𝖯L.

Proof.

See full version [4].

2.3.2 E pluribus duo – from many codewords to two

We now give a simple reduction from SDP to DenseBallL for L2. The idea is that a ball of small radius t containing L2 codewords 𝒞(𝒙1),,𝒞(𝒙L) must contain a pair of codewords at small distance d. Clearly such a pair must exist at distance at most 2t, but in fact when L is larger it is possible to get a better upper bound. So, the reduction simply computes the distance Δ(𝒞(𝒙i),𝒞(𝒙j)) between each pair of codewords 𝒞(𝒙i),𝒞(𝒙j) for ij, and outputs a pair 𝒙i,𝒙j that minimizes Δ(𝒞(𝒙i),𝒞(𝒙j)).

Theorem 14.

Let n,k,t,q,L+ be such that

LqkVqn(t)qn.

Then there is a poly(n,L,logq)-time Karp reduction from (n,k,d)-SDPq to (n,k,t)-DenseBallqL, where d:=d[q]n,Δ(t,L).

Proof.

See full version [4].

2.3.3 The Elias-Bassalygo bound

We are now ready to show that the problem of computing a pair of codewords satisfying the Elias-Bassalygo bound is in 𝖯𝖬𝖯𝖯.

Corollary 15 (Elias-Bassalygo bound in 𝖯𝖬𝖯𝖯).

Let n,k,d,t,q+ with t<Jq(d/n)n, let L:=qnd+1, and suppose that

LqkVqn(t)qn.

Then there is a poly(n,q)-time Karp reduction from (n,k,d1)-SDPq to (qkVqn(t),qn)-PigeonL. In particular, (n,k,d1)-SDPq is in (qkVqn(t),qn)-𝖯𝖬𝖯𝖯L.

Proof.

See full version [4].

3 Hardness of coding problems

We now turn to proving hardness results for SDP and DenseBall. In each of our hardness reductions we first assume that a gadget code, specified by a circuit 𝒞A, is given to the reduction as auxiliary input. We then instantiate the gadget code 𝒞A with an explicit efficiently computable code 𝒞A to obtain a true Karp reduction.

3.1 𝗣𝗪𝗣𝗣-hardness of 𝐒𝐃𝐏

We first show a generic reduction, with auxiliary from Pigeon2 to SDP.

Theorem 16.

Let k,m,n,d,d+ be such that m<k and d<d. Let q be a prime power. Then there is a Karp reduction from (qk,qm)-Pigeon2 to (n,k,d)-SDPq that takes a circuit 𝒞A:𝔽qm𝔽qn defining an (n,m,d)q code as auxiliary input.

Proof.

See full version [4].

We will instantiate the reduction in Theorem 16 with codes 𝒞A meeting the Zyablov bound, which can be computed in polynomial time (see [18, Theorem 10.2.1]). In fact, what we state is the special case of the (effective) Zyablov bound with sub-constant rate.888We note that the premise of [18, Theorem 10.2.1] is stated with the requirement δ<1/2. While δ<1/2 is necessary for the q=2 case, for general constant q, the result holds for δ<11/q, as in Theorem 17. It is also evident from the proof of the Zyablov bound that the result extends to superconstant q as well – i.e., that for any prime power q that is an unbounded function of n and any k=o(n), there is an efficiently computable family of codes 𝒞:𝔽qk𝔽qn with relative distance 1ε.

Theorem 17 (Zyablov bound [44]).

For any (efficiently computable) k=k(n)=o(n), constant prime power q, and constant ε>0, there is a poly(n)-time algorithm that takes as input n and computes (a circuit representing) a code 𝒞q𝖹𝗒𝖻:𝔽qk𝔽qn with relative distance δ11/qε for sufficiently large n.

We now show that SDP is 𝖯𝖶𝖯𝖯-hard on q-ary codes of relative distance δ:=d/n less than 11/q and any constant rate. This hardness corresponds to the red shaded region in the top plot in Figure 1. We remark that (n,k,δn)-SDPq is easy for δ11/q. In particular, for such any such δ, this problem can be solved by choosing any collection of polynomially many codewords and outputting the closest pair among them. So, Corollary 18 shows essentially tight hardness of SDP in terms of δ.

Corollary 18 (𝖯𝖶𝖯𝖯-hardness of SDP).

Let q be a fixed prime power and let ε,ε>0 be constants. Then for all sufficiently large n,k,d+ with knpoly(k) and d(11/qε)n, there is a Karp reduction from (qk,qk1ε)-Pigeon2 to (n,k,d)-SDPq.

In particular, (n,Rn,δn)-SDPq is 𝖯𝖶𝖯𝖯-hard for any constant rate for any constant rate R(0,1] and constant relative distance δ(0,11/q).

Proof.

See full version [4].

 Remark 19.

We remark that the assumption kn (equivalently, R1) in Corollary 18 is not necessary, except to ensure that the instance 𝒞 of SDP output by the reduction meets the definition of a code used there. In fact, modifying Corollary 18 slightly shows 𝖯𝖶𝖯𝖯-hardness of “(n,k,d)-SDPq” for any nΩ(kε) for constant ε>0 (and any constant relative distance less than). This is because the hard instances 𝒞 of SDP constructed in Corollary 18 are such that either 𝒞(𝐱)=𝒞(𝐲) or Δ(𝒞(𝐱),𝒞(𝐲)) is large for 𝐱𝐲.

We also extend Corollary 18 to codes in systematic form (and in particular to codes with distance d1, which are represented by injective circuits; see Section 1.5). This hardness corresponds to the red shaded region in the bottom plot in Figure 1.

Corollary 20 (𝖯𝖶𝖯𝖯-hardness of sysSDP).

Let q be a fixed prime power and let ε,ε>0 be constants. Then for all sufficiently large n,k,d+ with knkpoly(k) and d(11/qε)(nk), there is a Karp reduction from (qk,q(nk)1ε)-Pigeon2 to (n,k,d)-sysSDPq.

In particular, (n,Rn,δn)-sysSDPq is 𝖯𝖶𝖯𝖯-hard for any constant rate R(0,1) and constant relative distance δ(0,(11/q)(1R)).

Proof.

See full version [4].

3.2 𝗣𝗠𝗣𝗣-hardness of 𝐃𝐞𝐧𝐬𝐞𝐁𝐚𝐥𝐥

We next show a reduction from PigeonL for L2 to DenseBall analogous to Theorem 16.

Theorem 21.

Let k,m,n,t,L,LA,L+ and let q be a prime power. Suppose that Lmin{poly(n),qnk} and Lmin{L/LA,qkm}. Then there is a Karp reduction from (qk,qm)-PigeonL to (n,k,t)-DenseBallqL that takes a circuit 𝒞A:𝔽qm𝔽qn defining a (t,LA)-list decodable-code with minimum distance at least 1 (i.e., 𝒞A is injective) as auxiliary input.

Proof.

See full version [4].

We now state a result on explicit codes that nearly achieve list-decoding capacity [17] (see also [18, Theorem 17.3.8]). We note in passing that these codes are folded Reed-Solomon codes, but we will only use them in a black-box way as our gadget codes 𝒞A in Theorem 21.

Theorem 22 ([17][18, Theorem 17.3.8]).

For any constant rate R(0,1), any sufficiently small constant ε>0, and all sufficiently large n+, there exist linear q-ary ((1Rε)n,LA)-list-decodable codes 𝒞q𝖥𝖱𝖲 of dimension Rn and sufficiently large block length n, for some

LA(nε)O(1/ε),q(nε)O(1/ε2).

Furthermore, there is a poly(n)-time algorithm for computing (a circuit representing) such codes 𝒞q𝖥𝖱𝖲.

Finally, we use Theorem 22 to prove 𝖯𝖬𝖯𝖯-hardness of DenseBall.

Corollary 23 (𝖯𝖬𝖯𝖯-hardness of DenseBall).

For any constants R,ρ(0,1) and positive integers m:=m(n)<o(n) and L:=L(n)poly(n), there exists a prime power q:=q(n)poly(n) and a list size L:=L(n)poly(n) such that there is a Karp reduction from (qk,qm)-PigeonL to (n,k,ρn)-DenseBallqL, where k:=k(n)=Rn.

In particular, for these parameters, (n,k,ρn)-DenseBallqL is (qk,qm)-𝖯𝖬𝖯𝖯L-hard.

Proof.

See full version [4].

As with SDP, we also show hardness of DenseBall for codes in systematic form. (See Section 1.5.)

Corollary 24 (𝖯𝖬𝖯𝖯-hardness of sysDenseBall).

For any constants R(0,1) and 0<ρ<1R and positive integers m:=m(n)<o(n) and L:=L(n)poly(n), there exists a prime power q=q(n)poly(n) and a list size L=L(n)poly(n) such that there is a Karp reduction from (qk,qm)-PigeonL to (n,k,ρn)-sysDenseBallqL, where k:=k(n)=Rn.

In particular, for these parameters, (n,k,ρn)-sysDenseBallqL is (qk,qm)-𝖯𝖬𝖯𝖯L-hard

Proof.

See full version [4].

4 Finding short lattice vectors is in 𝗣𝗠𝗣𝗣

In this section, we show that the problem of finding suitably short non-zero lattice vectors (in p norms) can be placed in (A,B)-𝖯𝖬𝖯𝖯L, with a smooth tradeoff between the length of the vector obtained and L and B. In particular, when L=2 and AB, we find vectors whose length is at most the bound given by Minkowski’s celebrated theorem [32] (up to a factor of 1+1/nC for an arbitrarily large constant C>0), and when L=poly(n) and ALB, we find shorter vectors, corresponding to a stronger bound due to Blichfeldt [7]. (Blichfeldt proved his bound in the 2 norm, but we generalize it. Again, we match Blichfeldt’s bound up to a factor of 1+1/nC for an arbitrarily large constant C>0.)

To that end, we first prove the following technical proposition. We then derive the above results as corollaries.

Proposition 25.

There is an algorithm that takes as input p1, a radius r1, an integer q1, integers A>B1, an integer L1, and a basis 𝐁n×n for a lattice =(𝐁), makes a single query to an (A,B)-PigeonL oracle, and outputs a lattice vector 𝐲(𝐁) with the following behavior. If LA/B, q100pn/r and

qndet()B<A(120nrq)qnvol(pn(r)),

then 0<𝐲pdpn(r,L). Furthermore, the algorithm runs in time poly(n,L,logA,logr,log𝐁).

Proof.

See full version [4]

Recall that we are interested in γ-HSVPp for γvol(pn(1))1/n.

Corollary 26.

Let p1 be a constant integer, and let A=A(n),B=B(n),L=L(n)+ be such that 2s(n)=B<A2poly(n) and LA/B for a sufficiently large polynomial s(n). Then γ-HSVPp(A,B)-PigeonL for

γ:=(1+o(1))vol(pn(1))1/n(A/B)1/ndpn(1,L),

where n is the rank of the lattice in the γ-HSVPp instance.

In particular, γM:=2vol(pn(1))1/n corresponds to Minkowski’s theorem , and ((1+o(1))γM)-HSVPp𝖯𝖶𝖯𝖯. Furthermore, for the case of p=2, γB:=2vol(2n(1))1/n corresponds to Blichfeldt’s theorem , and ((1+o(1))γB)-HSVP2(2LB,B)-𝖯𝖬𝖯𝖯L for any L=poly(logB).

Proof.

See full version [4].

5 Inclusions

See full version [4].

6 Black-box separations

See full version [4].

7 A non-black-box non-separation

See full version [4].

References

  • [1] Divesh Aggarwal, Zeyong Li, and Noah Stephens-Davidowitz. A 2n/2-time algorithm for n-SVP and n-Hermite SVP, and an improved time-approximation tradeoff for (H)SVP. In Eurocrypt, 2021. URL: http://arxiv.org/abs/2007.09556.
  • [2] Frank Ban, Kamal Jain, Christos H. Papadimitriou, Christos-Alexandros Psomas, and Aviad Rubinstein. Reductions in PPP. Information Processing Letters, 145:48–52, 2019. doi:10.1016/j.ipl.2018.12.009.
  • [3] L. A. Bassalygo. New upper bounds for error-correcting codes. Problems of Information Transmission, pages 32–35, 1965.
  • [4] Huck Bennett, Surendra Ghentiyala, and Noah Stephens-Davidowitz. The more the merrier! on total coding and lattice problems and the complexity of finding multicollisions. In ITCS, 2025. URL: https://eccc.weizmann.ac.il/report/2024/018.
  • [5] Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. Multi-collision resistant hash functions and their applications. In Eurocrypt, 2018.
  • [6] Nir Bitansky, Yael Tauman Kalai, and Omer Paneth. Multi-collision resistance: A paradigm for keyless hash functions. In STOC, 2018.
  • [7] H. F. Blichfeldt. The minimum value of quadratic forms, and the closest packing of spheres. Mathematische Annalen, 101(1):605–608, 1929.
  • [8] Romain Bourneuf, Lukáš Folwarczný, Pavel Hubáček, Alon Rosen, and Nikolaj I. Schwartzbach. PPP-completeness and extremal combinatorics. In ITCS, 2023.
  • [9] Jan Buzek and Stefano Tessaro. Collision resistance from multi-collision resistance for all constant parameters. In CRYPTO, 2024.
  • [10] Ivan Bjerre Damgård. A design principle for hash functions. In CRYPTO, 1989.
  • [11] Thomas Debris-Alazard, Léo Ducas, and Wessel P. J. van Woerden. An algorithmic reduction theory for binary codes: LLL and more. IEEE Transactions on Information Theory, 68(5):3426–3444, 2022. doi:10.1109/TIT.2022.3143620.
  • [12] Philippe Delsarte. An algebraic approach to the association schemes of coding theory. Thesis, Universite Catholique de Louvain, 1973.
  • [13] Itai Dinur. Tight time-space lower bounds for finding multiple collision pairs and their applications. In Eurocrypt, 2020.
  • [14] I. Dumer, D. Micciancio, and M. Sudan. Hardness of approximating the minimum distance of a linear code. IEEE Transactions on Information Theory, 49(1):22–37, 2003. doi:10.1109/TIT.2002.806118.
  • [15] Nicolas Gama and Phong Q. Nguyen. Finding short lattice vectors within Mordell’s inequality. In STOC, 2008.
  • [16] J. H. Griesmer. A bound for error-correcting codes. IBM Journal of Research and Development, 4(5):532–542, 1960. doi:10.1147/RD.45.0532.
  • [17] Venkatesan Guruswami and Atri Rudra. Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Trans. Inf. Theory, 54(1):135–150, 2008. doi:10.1109/TIT.2007.911222.
  • [18] Venkatesan Guruswami, Atri Rudra, and Madhu Sudan. Essential Coding Theory. self-published, 2023. October 3rd, 2023 book version. URL: https://cse.buffalo.edu/faculty/atri/courses/coding-theory/book/web-coding-book.pdf.
  • [19] R. W. Hamming. Error detecting and error correcting codes. The Bell System Technical Journal, 29(2):147–160, 1950. doi:10.1002/j.1538-7305.1950.tb00463.x.
  • [20] Yassine Hamoudi and Frédéric Magniez. Quantum time–space tradeoff for finding multiple collision pairs. ACM Trans. Comput. Theory, 15(1-2):3:1–3:22, 2023. doi:10.1145/3589986.
  • [21] Siddhartha Jain, Jiawei Li, Robert Robere, and Zhiyang Xun. On pigeonhole principles and Ramsey in TFNP. In FOCS, 2024.
  • [22] Emil Jeřábek. Integer factoring and modular square roots. Journal of Computer and System Sciences, 82(2):380–394, 2016. doi:10.1016/J.JCSS.2015.08.001.
  • [23] Antoine Joux. Multicollisions in iterated hash functions. application to cascaded constructions. In CRYPTO, 2004.
  • [24] Grigorii A. Kabatjanskiĭ and Vladimir I. Levenšteĭn. Bounds for packings on the sphere and in space. Problemy Peredači Informacii, 14(1):3–25, 1978.
  • [25] Ilan Komargodski, Moni Naor, and Eylon Yogev. Collision resistant hashing for paranoids: Dealing with multiple collisions. In Eurocrypt, 2018.
  • [26] Ilan Komargodski and Eylon Yogev. Personal communication, 2023.
  • [27] Arjen K. Lenstra, Hendrik W. Lenstra, Jr., and László Lovász. Factoring polynomials with rational coefficients. Mathematische Annalen, 261(4):515–534, December 1982.
  • [28] Qipeng Liu and Mark Zhandry. On finding quantum multi-collisions. In Eurocrypt, 2019.
  • [29] R. McEliece, E. Rodemich, H. Rumsey, and L. Welch. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Transactions on Information Theory, 23(2):157–166, 1977. doi:10.1109/TIT.1977.1055688.
  • [30] Ralph C. Merkle. A certified digital signature. In CRYPTO, 1989.
  • [31] Daniele Micciancio and Michael Walter. Practical, predictable lattice basis reduction. In Eurocrypt, 2016. URL: http://eprint.iacr.org/2015/1123.
  • [32] Hermann Minkowski. Geometrie der Zahlen. B.G. Teubner, 1910. URL: http://books.google.com/books?id=MusGAAAAYAAJ.
  • [33] Mridul Nandi and Douglas R. Stinson. Multicollision attacks on some generalized sequential hash functions. IEEE Transactions on Information Theory, 53(2):759–767, 2007. doi:10.1109/TIT.2006.889721.
  • [34] Christos H. Papadimitriou. On the complexity of the parity argument and other inefficient proofs of existence. J. Comput. Syst. Sci., 48(3):498–532, 1994. doi:10.1016/S0022-0000(05)80063-7.
  • [35] Amol Pasarkar, Christos Papadimitriou, and Mihalis Yannakakis. Extremal combinatorics, iterated pigeonhole arguments and generalizations of PPP. In ITCS, 2023.
  • [36] M. Plotkin. Binary codes with specified minimum distance. IRE Transactions on Information Theory, 6(4):445–450, 1960. doi:10.1109/TIT.1960.1057584.
  • [37] Ron D. Rothblum and Prashant Nalini Vasudevan. Collision-resistance from multi-collision-resistance. In CRYPTO, 2022.
  • [38] Claus-Peter Schnorr. A hierarchy of polynomial time lattice basis reduction algorithms. Theor. Comput. Sci., 53(23):201–224, 1987. doi:10.1016/0304-3975(87)90064-8.
  • [39] R. Singleton. Maximum distance q-nary codes. IEEE Transactions on Information Theory, 10(2):116–118, 1964. doi:10.1109/TIT.1964.1053661.
  • [40] Aikaterini Sotiraki. New Hardness Results for Total Search Problems and Non-Interactive Lattice-Based Protocols. Thesis, Massachusetts Institute of Technology, 2020. URL: https://dspace.mit.edu/handle/1721.1/129310.
  • [41] Katerina Sotiraki, Manolis Zampetakis, and Giorgos Zirdelis. PPP-completeness with connections to cryptography. In FOCS, 2018.
  • [42] Alexander Vardy. Algorithmic complexity in coding theory and the Minimum Distance Problem. In STOC, 1997.
  • [43] Hongbo Yu and Xiaoyun Wang. Multi-collision attack on the compression functions of MD4 and 3-pass HAVAL. In ICISC, 2007.
  • [44] Victor Zyablov. An estimate of the complexity of constructing binary linear cascade codes. Probl. Peredachi Inf., 1971.