The More the Merrier! On Total Coding and Lattice Problems and the Complexity of Finding Multicollisions
(An earlier version of this work used the title "The more the merrier! On the complexity of finding multicollisions, with connections to codes and lattices.")
Abstract
We show a number of connections between two types of search problems: (1) the problem of finding an -wise multicollision in the output of a function; and (2) the problem of finding two codewords in a code (or two vectors in a lattice) that are within distance of each other. Specifically, we study these problems in the total regime, in which and are chosen so that such a solution is guaranteed to exist, though it might be hard to find.
In more detail, we study the total search problem in which the input is a function (represented as a circuit) and the goal is to find distinct elements such that . The associated complexity classes Polynomial Multi-Pigeonhole Principle (-) consist of all problems that reduce to this problem.
We show close connections between - and many celebrated upper bounds on the minimum distance of a code or lattice (and on the list-decoding radius). In particular, we show that the associated computational problems (i.e., the problem of finding two distinct codewords or lattice points that are close to each other) are in -, with a more-or-less smooth tradeoff between the distance and the parameters , , and . These connections are particularly rich in the case of codes, in which case we show that multiple incomparable bounds on the minimum distance lie in seemingly incomparable complexity classes.
Surprisingly, we also show that the computational problems associated with some bounds on the minimum distance of codes are actually hard for these classes (for codes represented by arbitrary circuits). In fact, we show that finding two vectors within a certain distance is hard for the important (and well-studied) class in essentially all parameter regimes for which an efficient algorithm is not known, so that our hardness results are essentially tight. Moreover, for some (depending on the block length, message length, and alphabet size), we obtain both hardness and containment. We therefore completely settle the complexity of this problem for such parameters and add coding problems to the short list of problems known to be complete for .
We also study - as an interesting family of complexity classes in its own right, and we uncover a rich structure. Specifically, we use recent techniques from the cryptographic literature on multicollision-resistant hash functions to (1) show inclusions of the form for certain non-trivial parameters; (2) black-box separations between such classes in different parameter regimes; and (3) a non-black-box proof that if for yet another parameter regime. We also show that - lies in the recently introduced complexity class Polynomial Long Choice for some parameters.
Keywords and phrases: Multicollisions, Error-correcting codes, Lattices
Funding: Huck Bennett: This work is supported in part by NSF Grant No. 2312297. Part of this work was done while the author was at Oregon State University.
2012 ACM Subject Classification: Theory of computation → Computational complexity and cryptography
Acknowledgements: The authors would like to thank Atri Rudra for very helpful discussions.
Editor: Raghu Meka
1 Introduction
We study the complexity of total coding problems, as well as total lattice problems. (We focus on coding problems in the introduction.)
Recall that a -ary code with messages of length , block length , and distance is a function such that
where is the Hamming distance, i.e., the number of entries in which two elements in differ. For our purposes, we think of as an arbitrary circuit with size . (Much of the literature is concerned with the special case when is a non-singular linear function, in which case the code is called a linear code. We discuss our choice of definition more in Section 1.5.)
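To make these objects concrete, here is a minimal Python sketch (the names `q`, `m`, and `encode` are ours, standing in for the alphabet size, message length, and the map computed by the circuit) of the Hamming distance and of the brute-force minimum distance of a code given as a black box. It is only a reference implementation: the search is quadratic in the number of codewords, and the point of this paper is what happens when the code is given succinctly as a circuit.

```python
from itertools import product, combinations

def hamming_distance(x, y):
    """Number of coordinates in which two equal-length words differ."""
    assert len(x) == len(y)
    return sum(1 for a, b in zip(x, y) if a != b)

def minimum_distance(encode, q, m):
    """Brute-force minimum distance of the code {encode(msg) : msg over [q]^m}.

    `encode` maps a tuple in range(q)^m to a codeword tuple. Runs in time
    roughly q^(2m), so it is only a sanity check, not an algorithm.
    """
    messages = list(product(range(q), repeat=m))
    return min(
        hamming_distance(encode(u), encode(v))
        for u, v in combinations(messages, 2)
    )

# Toy example over the ternary alphabet: (x1, x2) -> (x1, x2, x1 + x2 mod 3).
if __name__ == "__main__":
    encode = lambda msg: (msg[0], msg[1], (msg[0] + msg[1]) % 3)
    print(minimum_distance(encode, q=3, m=2))  # prints 2 = n - m + 1, meeting the Singleton bound
```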
The most fundamental question in coding theory is to find bounds on for fixed , , and . In other words, one wishes to argue that any set of points in must contain two points that are within Hamming distance for the smallest value of possible. Many beautiful upper bounds on the distance are known in terms of , , and .
We therefore define the natural family of computational search problems associated with this question. Specifically, we define the -Short Distance Problem (-) as the computational search problem in which the input is a circuit and the goal is to find such that . (The problem of finding that minimizes for the given input is called the Minimum Distance Problem (MDP), and it is known to be -hard, even to approximate, and even for linear codes [14]. In contrast, only asks for two codewords within some fixed distance, independent of the minimum distance of the code. Furthermore, one is often interested only in the important special case in which computes a linear function over , but we stick to this more general setting because our upper bounds for this problem apply even to arbitrary codes (making them stronger), and unfortunately our lower bounds seem to require us to work with arbitrary codes specified by circuits.) Notice that - is total if and only if all -ary codes with message length and block length have distance at most .
So, upper bounds on the minimum distance of a code correspond to proofs of totality of for some choice of . In other words, such bounds can be viewed as proofs that for such a choice of , is in the complexity class , which consists of total search problems with efficiently verifiable solutions. Starting with the seminal work of Papadimitriou [34], the study of has focused largely on placing important total search problems in subclasses of , where it is often convenient to think of each subclass as capturing problems whose totality can be attributed to a particular fundamental principle. E.g., (which we will discuss much more shortly) can be thought of as the subclass of corresponding to problems whose totality can be proven via a particular version of the pigeonhole principle.
It is therefore natural to ask what more we can say about the complexity of - when is equal to one of the many celebrated upper bounds on the minimum distance of a -ary code with block length and message length . Indeed, there are now many important upper bounds known on the minimum distance of codes (in terms of , , and ), such as the Singleton bound [39], the Hamming bound [19], the Plotkin bound [36], the Elias-Bassalygo bound [3], and the MRRW bound(s) [29]. These bounds are ubiquitous in the computer science literature and beyond. (See, e.g., [18].)
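For reference, two of these bounds take the following standard forms (the symbols $q$, $m$, $n$, $d$ are ours, denoting the alphabet size, message length, block length, and distance; the precise forms used later in the paper may be stated slightly differently):

\[
  d \;\le\; n - m + 1 \qquad \text{(Singleton bound)},
\]
\[
  q^{m} \cdot \sum_{i=0}^{\lfloor (d-1)/2 \rfloor} \binom{n}{i}\,(q-1)^{i} \;\le\; q^{n} \qquad \text{(Hamming bound)}.
\]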
However, in spite of the importance of these bounds, there has been surprisingly little work addressing the complexity of the associated (total) search problems of actually finding a pair of distinct codewords within one of these bounds. Indeed, to our knowledge, the only prior work directly addressing such a question is the beautiful (and recent) work of Debris-Alazard, Ducas, and van Woerden [11], which showed (among other things) that the problem of solving on linear codes within the Griesmer bound [16] can be solved efficiently.
1.1 The problem of finding multicollisions
In this work, we will show connections between the complexity of such total coding (and lattice) problems and a natural family of total search problems.
Specifically, for integers and , we consider the computational problem in which the input is a circuit (see the full version [4] for discussion of what we mean by a circuit with input size and output size when and are not necessarily powers of ), and the goal is to find distinct input values such that . Notice that, by the (generalized) pigeonhole principle, this problem is total if (and only if) the size of the domain of and the size of its range satisfy . We will focus on the regime in which the problem is total.
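As a concrete illustration, the following Python sketch uses our own parameter names A, B, and k for the domain size, range size, and number of collisions. It verifies a claimed solution and exhibits the brute-force search whose success is guaranteed by the generalized pigeonhole principle whenever A > (k - 1)B; of course, when the circuit has long inputs this search takes exponential time, which is exactly why the problem is interesting.

```python
from collections import defaultdict

def is_k_collision(f, xs, k):
    """Verify a claimed solution: k distinct inputs with equal outputs under f."""
    return len(set(xs)) == k and len({f(x) for x in xs}) == 1

def find_k_collision(f, A, B, k):
    """Brute-force search for a k-collision of f : [A] -> [B].

    If A > (k - 1) * B, then some output value has at least k preimages
    (generalized pigeonhole), so this loop always returns a solution.
    """
    preimages = defaultdict(list)
    for x in range(A):
        y = f(x)
        preimages[y].append(x)
        if len(preimages[y]) == k:
            return preimages[y]
    return None  # only possible when A <= (k - 1) * B

# Toy example: f(x) = x mod 5 on domain [16] must have a 4-collision, since 16 > 3 * 5.
if __name__ == "__main__":
    f = lambda x: x % 5
    xs = find_k_collision(f, A=16, B=5, k=4)
    print(xs, is_k_collision(f, xs, k=4))  # [0, 5, 10, 15] True
```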
In the special case when and , this is the canonical complete problem for the Polynomial Pigeonhole Principle complexity class (). (When and , one obtains the canonical complete problem for the Polynomial Weak Pigeonhole Principle complexity class (). We will say much more about this distinction later, but for now we largely ignore it.) This class was introduced by Papadimitriou in his work studying problems in (i.e., total search problems in ) [34]. Since then, the class has been of great interest because it is known to contain many important computational problems, such as the problem of breaking a cryptographic collision-resistant hash function, the problem of breaking a one-way permutation, factoring (under randomized reductions) [22], and many more problems of interest [41, 8]. Most relevant to our work, the problem of finding a non-zero vector in a lattice within Minkowski’s bound (in the norm, though see below) was shown to be in in [2].
Because of its relationship with the pigeonhole principle, we call the computational problem of finding such a collision . (Papadimitriou originally used this name for the special case when [34].) In fact, we get a family of problems , parameterized by the circuit input size , the circuit output size , and the number of colliding inputs that we must find. Accordingly, we define the complexity class - as the set of all search problems that have a polynomial-time (Karp) reduction to . (To define formally, and should be functions of some asymptotic parameter , and might also be a function of . But we mostly ignore this issue in the introduction for simplicity.)
In the complexity-theoretic literature, there is very little work on for (although we note Sotiraki’s thesis [40, Section 4.5] as foundational work in this direction; and see also Section 1.3 for discussion of recent independent work by [21]). On the other hand, there is an exciting line of work in the cryptographic literature studying multicollision-resistant hash functions [23, 33, 43, 5, 28, 6, 25, 40, 13, 37, 20]. From our perspective, one can think of such cryptographic works as studying the average-case complexity of the problem (under efficiently sampleable distributions of circuits ). However, even these works have not fully explored this space, primarily focusing on the case where is constant and .
1.2 Our results
In this work, we show many connections between coding and lattice problems and . We also further study the classes .
1.2.1 Connections between coding problems and
Our first set of results shows a number of connections between and computational search problems related to error-correcting codes. (See Figure 1 for a summary of some of our results for binary codes.)
Upper bounds on the Short Distance Problem
We show that many of the celebrated bounds on the minimum distance of a code yield versions of that are in different versions of , including the Singleton bound [39], the Hamming bound [19], the Plotkin bound [36], and the Elias-Bassalygo bound [3]. The following theorem shows some of these results.
Theorem 1 (Informal, see full version [4]).
The following hold for any , , and prime power and any constant .
1. - is in for equal to the Singleton bound.
2. - is in and - is in for equal to the Hamming bound.
3. - is in for slightly larger than the Plotkin bound.
4. - is in for equal to the Elias-Bassalygo bound.
In fact, these results are special cases of more general results that we prove. Specifically, we show that one can obtain smaller values of by increasing or decreasing (while holding fixed and maintaining totality).
Our smooth tradeoff in particular is perhaps a bit surprising. Indeed, it is a priori not clear why it would be useful to take in such a reduction, since, after all, the goal in is to find two codewords. (Of course, this is because our reductions mimic the proofs of the Plotkin and Elias-Bassalygo bounds. Both proofs work by first finding codewords that lie in a relatively small Hamming ball and then showing that two of these codewords must be quite close to each other.)
Tight hardness for and completeness when
We next show tight hardness results for . In particular, we consider both the case when a code is represented by an arbitrary circuit and the case when a code is represented in systematic form (i.e., the case when the first symbols in a codeword consist of the message; see Section 1.5). We call this restricted problem .
Theorem 2 (Informal, see full version [4]).
For any , , and prime power and any constant , - is -hard for . Furthermore, - is -hard for distance slightly below the Plotkin bound.
Theorem 2 is essentially tight. In particular, for , it is easy to see that can be solved efficiently. And, we show that there is also a simple efficient algorithm for for (see full version [4]). So, Theorem 2 essentially characterizes when there is a polynomial-time algorithm for and (assuming that ).
Furthermore, there is a large overlap between the parameter regime for which we show containment in and the regime for which we show -hardness (both for arbitrary codes and codes in systematic form), so that for a large range of parameters we show that is actually complete for . This essentially settles the complexity of in these regimes and adds to the relatively short list of problems known to be complete for this class. For example, we show that in many settings, the Hamming bound is -complete. (Since the Hamming bound can be viewed as a coding-theoretic analogue of Minkowski’s bound for lattices, one might view this result as resolving a coding-theoretic analogue of the conjecture that Minkowski’s bound is -complete [2]. However, perhaps a better coding-theoretic analogue would be -hardness of this problem over linear codes, which we do not prove. See Section 1.6.)
We summarize these results for binary codes in Figure 1. The picture for -ary codes is quite similar (with the various bounds replaced by their -ary versions). See full version [4] for the details.
Finding many close codewords and
We also show analogous results for the problem of finding many codewords that all lie in a small ball. This problem corresponds to bounds on the list decoding radius of a code (just as the problem of finding close pairs of codewords corresponds to bounds on the unique decoding radius).
We first show that list-decoding generalizations of the Singleton bound and the Hamming bound lie in for appropriate parameters. (See full version [4].) We then show that the problem of finding many codewords that all lie in a small ball is also hard for for some choices of parameters.
Our results for this problem are significantly more subtle than the case, since for , the complexity of seems to vary quite a bit with the parameters, whereas for , the parameters and matter much less. In particular, for we do not find a complete problem for . See full version [4] for the details.
1.2.2 Lattice problems in
We next show similar results for computational lattice problems. Recall that a lattice is the set of all integer linear combinations of linearly independent basis vectors , i.e.,
A fundamental question about lattices asks when there must exist a non-zero lattice point such that , where is some norm of interest and is some bound.
Minkowski’s celebrated theorem [32] tells us that such a is guaranteed to exist for some
where is the lattice determinant and is the unit ball of the norm . The corresponding computational problem is the search problem in which the input is a basis for a lattice, and the goal is to output a non-zero vector such that
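For concreteness, the standard statements read as follows (in our own normalization, which may differ from the exact constants used in the paper): the lattice generated by linearly independent $b_1, \ldots, b_n \in \mathbb{R}^n$ and Minkowski's bound for a norm with symmetric convex unit ball $K$, together with the familiar $\ell_2$ consequence.

\[
  \mathcal{L} \;=\; \Big\{ \sum_{i=1}^{n} z_i b_i \;:\; z_1, \ldots, z_n \in \mathbb{Z} \Big\}, \qquad \det(\mathcal{L}) \;=\; |\det(b_1, \ldots, b_n)|,
\]
\[
  \min_{y \in \mathcal{L} \setminus \{0\}} \|y\|_K \;\le\; 2 \left( \frac{\det(\mathcal{L})}{\operatorname{vol}(K)} \right)^{1/n}, \qquad\text{and in particular}\qquad \min_{y \in \mathcal{L} \setminus \{0\}} \|y\|_2 \;\le\; \sqrt{n} \cdot \det(\mathcal{L})^{1/n}.
\]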
This problem is quite important in cryptography, particularly in the norm. In the special case of the norm, it is known that , and Ban, Jain, Papadimitriou, Psomas, and Rubinstein [2] conjectured that is actually -complete. (This conjecture remains open.)
We study the more general problem of finding with
where
is the norm. This problem is known as the -Hermite Shortest Vector Problem (-). Since this problem is relevant to cryptography, it is very well studied for a wide range of parameters [27, 38, 15, 31, 1], particularly for .
The case corresponds to , where is the unit ball. However, Minkowski’s bound is not tight (except in the case when tiles space, such as when is the hypercube). For example, Blichfeldt improved on Minkowski’s celebrated theorem in the norm, proving that - is still total when [7]. (We note that Blichfeldt proved two distinct relevant theorems in this context, which might easily confuse the reader, so we endeavor to clarify here. The theorem commonly referred to as “Blichfeldt’s theorem” says that any measurable set with is guaranteed to contain two points with such that . This theorem is often discussed in the context of total lattice problems because it can be used to prove Minkowski’s theorem. In fact, Sotiraki, Zampetakis, and Zirdelis [41] introduced a related computational problem that they called (in which the set is represented implicitly by a circuit), and they showed that is actually -complete. It is not clear how is related to - for , except that they are two computational lattice problems whose totality was proven by Blichfeldt.)
Perhaps surprisingly, we show that for and appropriate choices of , , and . In fact, we show a smooth tradeoff between the Hermite factor and the parameters , , and , showing that one can obtain shorter vectors by either increasing or decreasing the ratio between and (while maintaining totality). A similar story holds for norms more generally.
Theorem 3 (Informal; see full version [4]).
For any constant integer and any satisfying and for a sufficiently large polynomial , it holds that - for
where is the rank of the lattice in the - instance and is a particular function that is decreasing in the parameter .
1.2.3 Containments between different classes
The above results bring new attention to the classes . So, we next study containments between these classes with different parameters , , and , and other classes of interest in .
Recall that the celebrated Merkle-Damgård construction [30, 10] shows that the ratio of the input size to the output size of a circuit essentially does not matter in the special case when , since one can efficiently reduce from the problem of finding a single collision (i.e., ) in a barely compressing circuit with to the (seemingly much easier) problem of finding a single collision in a much more compressing circuit with for any constant . In our terminology,
This surprising collapse of complexity classes is known as domain extension, and it has innumerable applications in cryptography and complexity theory. (Admittedly, we are deliberately conflating the distinction between the problem of breaking a cryptographic hash function (which is what Merkle and Damgård actually studied) and the problem of finding a collision in an arbitrary worst-case circuit. All of the above statements hold in both cases.)
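To illustrate the key step, here is a minimal Python sketch (toy interfaces of our own choosing, ignoring padding and length strengthening): the Merkle-Damgård hash iterates a compression function h over message blocks, and any two distinct, equal-length block sequences with the same iterated hash yield a collision in h itself by walking the two chains backward. In the language above, this reduces finding a collision in the barely compressing h to finding one in the much more compressing iterated circuit.

```python
def md_hash(h, iv, blocks):
    """Iterate a compression function h(state, block) -> state over the message blocks."""
    state = iv
    for block in blocks:
        state = h(state, block)
    return state

def extract_collision(h, iv, blocks1, blocks2):
    """Given distinct, equal-length block sequences with md_hash(h, iv, blocks1)
    == md_hash(h, iv, blocks2), return distinct inputs (s, b) != (s', b')
    with h(s, b) == h(s', b'), i.e., a collision for h itself."""
    assert len(blocks1) == len(blocks2) and blocks1 != blocks2
    states1, states2 = [iv], [iv]
    for b1, b2 in zip(blocks1, blocks2):
        states1.append(h(states1[-1], b1))
        states2.append(h(states2[-1], b2))
    assert states1[-1] == states2[-1], "the two messages must collide under md_hash"
    # Walk backward to the last position where the inputs to h differ but its outputs agree.
    for i in reversed(range(len(blocks1))):
        in1, in2 = (states1[i], blocks1[i]), (states2[i], blocks2[i])
        if in1 != in2 and states1[i + 1] == states2[i + 1]:
            return in1, in2
```

(The final loop always returns: at the largest index where the chaining inputs differ, the chaining outputs must already agree.)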
Already in 2004, Joux noticed that Merkle and Damgård’s elegant domain extension technique does not seem to work for [23]. So, it appears that for , the relationship between and (i.e., how “compressing” the circuit is) might matter quite a bit. This suggests a surprising fundamental difference between the case and the case when .
However, we show two (rather weak) notions of domain extension that still work in the setting of multicollisions. The first such result follows by analyzing the Merkle-Damgård construction in this setting and showing that it does achieve something, albeit with a large loss in parameters. The second result shows a more sophisticated reduction (using Merkle trees together with list-recoverable codes) that is better than the Merkle-Damgård-based reduction in the regime where is very large compared to . The latter result can be thought of as a translation into our setting of a beautiful result due to Bitansky, Kalai, and Paneth [6] in the setting of multicollision-resistant hash functions. (In fact, the proof is substantially simpler in our setting than in that of [6], since we do not have to worry about the many additional issues that arise in the average-case setting.)
Together, these results show that it is possible to reduce the problem of finding multicollisions in a less compressing function to the problem of finding multicollisions in a more compressing function, but at the expense of a large loss in the number of collisions found.
Theorem 4 (Informal; see full version [4]).
For ,
for .
For any , any , and any ,
under randomized reductions, for
We also show that the class is contained in the recently introduced complexity class Polynomial Long Choice () [35], for appropriate choices of the parameters , , and . In fact, we show a reduction to the Unary Long Choice problem. (This result was recently independently discovered in [21]. See Section 1.3.) This strengthens a result of Pasarkar, Papadimitriou, and Yannakakis [35], who showed that .
Theorem 5 (Informal; see full version [4]).
For any ,
1.2.4 Black-box separations (and a non-black-box non-separation)
Our final set of results concerns black-box separations between - for different values of , , and , which suggest that it might be hard to prove stronger containments than what we show above. However, we note that recent independent work of Jain, Li, Robere, and Xun [21] also showed exciting black-box separations of this form. While our results are formally incomparable to theirs, we believe that the results of [21] are more interesting than our own. (See Section 1.3.)
The starting point for our results is a beautiful idea due to Komargodski, Naor, and Yogev [25] for separating - from - for any constants (particularly in the average-case setting relevant to cryptography). Unfortunately, however, the proof in [25] had a subtle bug that does not seem easy to fix [26].
We use similar ideas to show two different black-box separations, which can be seen as partial evidence that domain extension and range compression are not possible when . However, we note that our black-box separations are rather weak, since they only rule out rather fine-grained black-box reductions between the classes.
Finally, we show that a very clever non-black-box proof due to Rothblum and Vasudevan [37] extends to our setting. In particular, Rothblum and Vasudevan show, using non-black-box techniques, that the existence of a certain sufficiently strong multicollision-resistant hash function implies the existence of a collision-resistant hash function. We prove an analogue of their result in our (worst-case) setting, showing that suitable hardness of for large implies hardness of for smaller . (After a preliminary version of this work appeared, Buzek and Tessaro [9] improved upon the main result in [37]. We do not know whether the stronger result in [9] extends to our setting.)
As these results are rather technical, we refer the reader to the full version [4] for the details.
1.3 Comparison with Jain, Li, Robere, and Xun
Recent exciting work by Jain, Li, Robere, and Xun [21] also defines and studies the computational problem and the associated complexity class (with slightly different notation). (They also define additional classes that correspond to the union of over different parameters , , and .) [21] is concurrent with and independent of this work (and appeared as a preprint shortly before we released a preprint of the present work). Here, we provide a brief comparison of their work with ours.
Both the present work and [21] define multi-collision classes and study relationships between them. At a high level, [21] has morally stronger (although formally incomparable) results on the structural complexity of these classes, whereas the present work focuses on showing containment and hardness of coding and lattice problems with respect to these classes (which [21] does not study at all).
In terms of structural complexity, [21] contains exciting black-box separation results, which, although formally incomparable to our black-box separations, we think of as stronger. In particular, [21] are essentially able to black-box separate from for any constants and any (reasonable) , , , and . They also show black-box separations between and other interesting complexity classes in , and in particular show a black-box separation between the problem and . We refer the reader to [21] for the technical details.
[21] also studies the relationship between and . Indeed, they prove a result that is essentially identical to our Theorem 5, which shows that for certain parameters. (Formally, our technical result in the full version [4] is more general than the analogous result in [21], but it is clear that the proof in [21] yields the more general result as well.) In addition, they show a containment in the other direction, that , albeit for different parameters. Finally, [21] shows that an interesting problem in is in . The latter two results have no analogue in the present work.
In this work, we focus on the relationship of with coding and lattice problems (see the full version [4]). We also show Merkle-Damgård-style reductions between with different parameters (see full version [4]) and a non-black-box non-separation in the style of [37] (see full version [4]). These topics are not studied in [21].
1.4 Other related work
Our work lies at the intersection of a number of different areas, and there is therefore much related work to discuss in addition to [21]. Here, we focus on how this prior work relates to our work.
The complexity of total lattice problems
The complexity of Minkowski’s bound and more generally is quite well studied, particularly in the norm and the norm. In particular, algorithms for play a very important role in lattice-based cryptography, and algorithms for with different approximation factors are a very active area of research [27, 38, 15, 31, 1]. Some of these algorithms can be viewed as constructive proofs of classical results about the minimum distance of a lattice relative to the determinant. (For example, the celebrated LLL algorithm gives a constructive proof of Hermite’s bound [27], and the slide reduction algorithm gives a constructive proof of Mordell’s inequality [15].)
Finding a vector within Minkowski’s bound in the norm is considered one of the most important problems in the complexity class . In particular, Ban, Jain, Papadimitriou, Psomas, and Rubinstein showed that other important problems in can be reduced to this problem, and they conjectured that this problem is actually -complete [2]. Sotiraki, Zampetakis, and Zirdelis further investigated the relationship between lattice problems, , and , showing two problems related to lattices that are -complete and -complete respectively [41].
The complexity of total coding problems
To our knowledge, much less is known about the complexity of total problems that arise in coding theory. Instead, much work has focused on the -approximate Minimum Distance Problem (-), in which the input is a linear code and the goal is to output a pair of distinct codewords such that the distance between them is within a factor of the minimum distance of the input code (or, since the code is linear, one can equivalently output a non-zero codeword with nearly minimal Hamming weight). This problem is known to be -hard [42], even to approximate [14]. In contrast, we are interested in the problem of finding distinct codewords that are within distance , where depends only on the message length and block size of the code (and not on the minimum distance). We are particularly interested in the total regime, where such problems are very unlikely to be -hard.
To our knowledge, the only direct work in this area is a beautiful paper by Debris-Alazard, Ducas, and van Woerden [11], which showed an LLL-like algorithm for linear codes that efficiently finds codewords within the Griesmer bound [16]. (Note that this algorithm only works for linear codes, and indeed the Griesmer bound itself only applies to linear codes.)
Literature on multicollisions
There is quite a bit of prior work in the cryptography literature on multicollision-resistant hash functions. In our terminology, a multicollision-resistant hash function is some efficiently sampleable distribution of instances of (i.e., circuits ) that are actually hard (i.e., that cannot be solved by polynomial-time algorithms with non-negligible probability). A survey of this literature is beyond the scope of this work. Broadly speaking, work in this area has been concerned with (1) understanding the relationship between collision resistance and multicollision resistance (i.e., the relationship between the case when and the case when ); (2) finding applications of multicollision-resistant hash functions; and (3) building multicollision-resistant hash functions from various cryptographic hardness assumptions.
Many of the techniques that we use to understand the relationship between in different parameter regimes are directly inspired by this cryptographic literature. In particular, our inclusion based on Merkle trees and list-recoverable codes is a direct adaptation of Bitansky, Kalai, and Paneth’s construction from the cryptographic setting to our setting [6]; our black-box separations are inspired by [25]; and our non-black-box non-separation is a direct adaptation of Rothblum and Vasudevan’s proof from the cryptographic setting to our setting [37]. (See also the very recent work of [9].)
In contrast, there is very little prior work in the worst-case setting. To our knowledge, the only works that consider the worst-case complexity of finding multicollisions are Komargodski, Naor, and Yogev [25] and Sotiraki [40] (though see Section 1.3). In both of these works, the worst-case complexity of is not the primary focus, but both do define the special cases of the complexity class -, when (specifically, and ) and is a constant. This is a natural setting of parameters, but we show interesting results in other parameter regimes as well.
Sotiraki in particular shows a complete problem for that is similar to the -complete and -complete problems from [41]. In particular, Sotiraki’s -complete problem bears some resemblance to certain lattice and coding problems, though the relationship is unclear. We refer the reader to [41, 40] for more.
Komargodski, Naor, and Yogev claimed a black-box separation between - and - for any constants , but their elegant proof contained a subtle bug that has not been fixed [26]. Our black-box separations use similar ideas but are weaker than what they originally claimed.
Literature on and
In contrast, there is much literature studying the complexity classes and , which correspond to the special case of when (in different parameter regimes). was introduced by Papadimitriou in his seminal work [34]. ( appeared in many works but seems to have been first named in [22].) Since then, many problems of interest have been shown to be contained in either or [22, 2, 41, 8]. Until recently, only a small handful of problems were known to be complete for or [34, 41, 2]. However, Bourneuf, Folwarczný, Hubáček, Rosen, and Schwartzbach recently showed a number of problems arising from extremal combinatorics that are complete for either or [8].
There has also been some literature concerned with generalizing and to classes other than . In particular, Pasarkar, Papadimitriou, and Yannakakis [35] recently introduced the class Polynomial Long Choice (), which can be thought of as corresponding to the generalization of the pigeonhole principle obtained by iterating the pigeonhole principle many times.
1.5 A note on “codes” represented by circuits, injectivity, and systematic form
The coding theory results in this paper are concerned with the problem of finding close codewords (or many codewords that lie in a relatively small ball) in a “code” represented by an arbitrary circuit with size . It is far more common in the literature to consider linear codes represented by a generator matrix (or, equivalently, an invertible circuit with linear gates). (Sometimes when arbitrary codes are considered in the literature, the code is simply represented by listing all points, while we represent our codes succinctly.)
Whether one should really think of a generic circuit as representing a “code” is not so clear. In particular, such a circuit might not be injective, i.e., there might be distinct “messages” that map to the same “codeword” . (And, there is likely no way to efficiently determine whether such a circuit is injective. Indeed, determining this is -complete.) But, we find the coding-theoretic perspective to be quite useful. In particular, the notion of the distance of a code still makes sense with this slightly more general definition, and the bounds on the distance that we discuss in this paper still apply. Indeed, much of coding theory still makes sense if we treat such degenerate, non-injective codes simply as “codes with distance ,” and standard bounds in coding theory, such as the Singleton and Hamming bounds, still apply.
Of course, it is not an issue, but rather a strength, that our upper bounds apply to such general “codes.” Indeed, any upper bound that applies in the more general case when “codes” are represented by arbitrary circuits certainly applies to the special case of injective circuits or the even more special (and quite important) case of linear codes.
For our lower bounds, however, this can be viewed as a major flaw in our model. To partially mitigate this, we prove our hardness results in two different settings.
In the first “generic” setting, we show hardness for “codes” represented by arbitrary circuits , with no presumed structure. In particular, the circuit is not necessarily injective. Indeed, one may argue that our reductions are rather artificial, in that the reductions often produce “codes” such that is either zero or strictly larger than , where is the bound on the distance needed to solve the associated coding problem. So, these reductions rely quite heavily on this rather strange definition of a code in which multiple messages can map to the same codeword.
In the “systematic” setting, our codes are in systematic form, which means that the first characters of are simply itself. (Formally, we actually work with a smaller circuit , and we simply interpret as the codeword associated with .) Such circuits are clearly injective, and therefore this setting is much less artificial. (Codes in systematic form are also quite fundamental objects that arise naturally in coding theory.) In this setting, our formal results are a bit different, and we even show that there are efficient algorithms for this setting in regimes where the problem is provably hard for codes represented by arbitrary circuits. However, the high-level picture is the same. In particular, we still show completeness in this setting. (See Figure 1.)
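As a small illustration of the systematic setting (names of our own choosing): the circuit only computes the redundant part of the codeword, and the full codeword is the message followed by that redundancy, so distinct messages always give distinct codewords.

```python
def systematic_encode(g, msg):
    """Codeword whose first len(msg) symbols are the message itself,
    followed by the redundancy g(msg); injective by construction."""
    return tuple(msg) + tuple(g(msg))

# Toy example over the binary alphabet: append a single parity symbol.
if __name__ == "__main__":
    parity = lambda msg: (sum(msg) % 2,)
    print(systematic_encode(parity, (1, 0, 1)))  # (1, 0, 1, 0)
```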
1.6 Open problems
We leave a number of interesting open problems. Here, we mention some of them.
Complexity of coding and lattice problems
We place the computational problems associated with a number of fundamental bounds on the minimum distance of a code in . However, we were unable to say anything non-trivial about the complexity of Delsarte’s linear programming bound [12] and the closely related MRRW bound(s) [29], which are the best known bounds in an important range of parameters. The associated computational problems are, by definition, in , but we do not know any natural subclass of that contains these problems.
Similarly, the best known bound on the minimum distance in the norm of a lattice relative to the determinant is the celebrated bound due to Kabatjanskiĭ and Levenšteĭn [24] (which is better than Blichfeldt’s [7] by a factor of roughly ). The problem of finding a vector within the Kabatjanskiĭ and Levenšteĭn bound is again, by definition, in , but we do not know any natural subclass of that contains this problem.
In a different direction, we show hardness of various total coding problems (and even completeness in some regimes), but only for codes represented by circuits. It would be very exciting to show hardness results of total problems on linear codes (represented, e.g., by generator matrices), since these are the more standard problems. (The analogous question for lattices was already asked in [2] and also remains open.)
Better understanding of across different regimes
The first question that comes to mind about the complexity classes - is, of course, “how do these classes relate to each other across different parameter regimes?” In the case when , it has long been known that the relationship between and does not matter much. For example,
For , it seems unlikely that a similar result holds. We instead describe a rich and rather complicated set of inclusions (and non-black-box relationships) among these classes, as well as black-box separations.
However, we have no evidence that these results are tight. One would ideally like to show black-box separations and inclusions that are tight with one another. (Or, alternatively, one might hope to prove non-trivial equalities, like what we know in the case of .) The analogous average-case question has been the topic of much research in the cryptography literature [23, 33, 43, 5, 6, 25, 28, 13, 40, 37, 20], but to the authors’ knowledge the worst-case setting was barely explored before this work and the recent exciting (concurrent and independent) work of Jain, Li, Robere, and Xun [21].
2 Upper bounds for coding problems
In this section, we prove that and are contained in in many parameter regimes of interest. In particular, for , we show such results when the distance corresponds to a number of celebrated bounds on the minimum distance of a code, including the Singleton bound (Corollary 7), the Plotkin bound (Corollary 8), the Hamming bound (Corollary 13), and the Elias-Bassalygo bound (Corollary 15). For , we show analogous results for the list Singleton bound (Corollary 11) and list Hamming bound (Corollary 13).
In fact, all of these results are corollaries of four technical results: Theorem 6, which reduces to via a simple truncation technique; Theorem 10, which does something similar for ; Theorem 12, which reduces to by “finding many overlapping Hamming balls centered on codewords”; and Theorem 14, which shows a generic reduction from to .
We also show in Section 2.1.1 an efficient algorithm for within the Plotkin bound for codes in systematic form.
2.1 Upper bounds for the Singleton and Plotkin bounds
We start by giving a fairly general reduction from (the problem of finding a pair of close codewords) to . On input an instance of , the reduction works by defining a compressing circuit that truncates the output of . It then finds such that using its oracle, and finally outputs a pair that minimizes . We then observe that special cases of this general result place in for corresponding to the Singleton bound (Corollary 7) and for corresponding to the Plotkin bound (Corollary 8).
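To illustrate the simplest (Singleton-type, two-codeword) case of this reduction, here is a Python sketch with names of our own choosing, in which a dictionary-based collision search stands in for the oracle: truncating each codeword to its first m - 1 symbols gives a map from q^m messages to only q^(m-1) possible prefixes, and any collision of that map is a pair of codewords that agree on m - 1 coordinates, hence lie at Hamming distance at most n - m + 1.

```python
from itertools import product

def close_pair_by_truncation(encode, q, m):
    """Find two distinct messages whose codewords agree on their first m - 1
    symbols (and hence differ in at most n - m + 1 coordinates).

    The dictionary lookup below plays the role of the collision-finding oracle;
    the interesting question is its complexity when the code is a succinct circuit.
    """
    seen = {}
    for msg in product(range(q), repeat=m):
        prefix = tuple(encode(msg))[: m - 1]
        if prefix in seen:
            other = seen[prefix]
            dist = sum(1 for a, b in zip(encode(msg), encode(other)) if a != b)
            return other, msg, dist  # dist <= n - (m - 1) by construction
        seen[prefix] = msg
    # Unreachable: there are q^m messages but only q^(m-1) possible prefixes.
```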
Theorem 6.
Let be such that and with . Then there is a Karp reduction from to - where .
Proof.
See full version [4].
We get useful corollaries from Theorem 6 that show how to compute vectors achieving the Singleton bound (the case) and the Plotkin bound (for larger ) using a oracle.
We now show that the problem of finding a pair of codewords whose distance is at most the Singleton bound is in .
Corollary 7 (Singleton bound in ).
Let with . Then there is a Karp reduction from - to -. In particular, - is in .
Proof.
The claim follows from Theorem 6 by setting and noting that, trivially, for any .
We now show that the problem of computing a pair of codewords whose distance satisfies the high-rate Plotkin bound is in .
Corollary 8 (Plotkin bound in ).
Let be such that , , , and
(1)
then there is a Karp reduction from - to - which runs in time .
Proof.
See full version [4].
2.1.1 Efficiently solving up to the Plotkin bound
We now note that there is an efficient algorithm for finding codewords within the Plotkin bound when the code is in systematic form. In other words, we show that can be solved efficiently for below the Plotkin bound, where we recall that is the special case of in which the code is in systematic form.
Theorem 9.
Let with and . There is a -time algorithm for where
Proof.
See full version [4].
2.2 An upper bound for the list Singleton bound
Next, we give a reduction from to similar to the reduction from to in Theorem 6. We then show that this reduction places the list Singleton bound (i.e., a variant of the Singleton bound that corresponds to list decoding rather than unique decoding) in (up to a small error).
Theorem 10.
Let be such that and . Then there is a Karp reduction from to -.
Proof.
See full version [4].
Finally, we show that the problem of computing a set of codewords lying in a ball whose radius almost satisfies the list Singleton bound is in .
Corollary 11 (List Singleton bound in ).
Let with , let , and let
(2)
Then is in -.
Proof.
See full version [4].
We remark that the radius in Equation 2 is almost as small as the smallest satisfying the list Singleton bound. Indeed, one can check that any satisfying the list Singleton bound must be such that
and that the radius in Equation 2 satisfies
2.3 Upper bounds for the (list) Hamming and Elias-Bassalygo bounds
We now show containment of in , where corresponds to the Hamming bound and the Elias-Bassalygo bound. In particular, when is equal to the value given by the Hamming bound, we show that is contained in (see Item 1 of Corollary 13), and for slightly above the Hamming bound, is contained in (see Item 2 of Corollary 13). In fact, we show a more general result that applies to the generalization of and shows that for within the list Hamming bound, this problem is in (see Item 3 of Corollary 13). We then show that is in whenever above the Elias-Bassalygo bound, via a generic reduction from to (see Corollary 15). In fact, we obtain a smooth tradeoff between the distance obtained by the reduction and the parameters , , and in the reduction to .
2.3.1 The (list) Hamming bound
We now give a reduction from to , the key to which is the circuit, defined in the full version [4], giving an injection from into a Hamming ball for sufficiently large . (In fact, we will use such circuits that are bijective.)
Theorem 12.
Let , and let be a prime power satisfying
Then there is a Karp reduction from to -.
Proof.
See full version [4].
We now use Theorem 12 to show that and with parameters corresponding to the (list) Hamming bound are in (one or more of) , , and .
Corollary 13 ((List) Hamming bound in , , and ).
Let with and . Let
Then:
1. If , then is in .
2. If , then is in .
3. If , then is in -.
Proof.
See full version [4].
2.3.2 E pluribus duo – from many codewords to two
We now give a simple reduction from to for . The idea is that a ball of small radius containing codewords must contain a pair of codewords at small distance . Clearly such a pair must exist at distance at most , but in fact when is larger it is possible to get a better upper bound. So, the reduction simply computes the distance between each pair of codewords for , and outputs a pair that minimizes .
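A minimal sketch of this step in Python (names are ours): given the codewords returned for the list problem, simply return the closest pair among them; if all of them lie in a ball of radius r, the triangle inequality already guarantees that this pair is at distance at most 2r.

```python
from itertools import combinations

def closest_pair(codewords):
    """Return the pair of codewords at minimum Hamming distance among those found in the ball."""
    def dist(x, y):
        return sum(1 for a, b in zip(x, y) if a != b)
    return min(combinations(codewords, 2), key=lambda pair: dist(*pair))
```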
Theorem 14.
Let be such that
Then there is a -time Karp reduction from to , where .
Proof.
See full version [4].
2.3.3 The Elias-Bassalygo bound
We are now ready to show that the problem of computing a pair of codewords satisfying the Elias-Bassalygo bound is in .
Corollary 15 (Elias-Bassalygo bound in ).
Let with , let , and suppose that
Then there is a -time Karp reduction from to -. In particular, is in -.
Proof.
See full version [4].
3 Hardness of coding problems
We now turn to proving hardness results for and . In each of our hardness reductions we first assume that a gadget code, specified by a circuit , is given to the reduction as auxiliary input. We then instantiate the gadget code with an explicit efficiently computable code to obtain a true Karp reduction.
3.1 -hardness of
We first show a generic reduction, with auxiliary input, from to .
Theorem 16.
Let be such that and . Let be a prime power. Then there is a Karp reduction from - to that takes a circuit defining an code as auxiliary input.
Proof.
See full version [4].
We will instantiate the reduction in Theorem 16 with codes meeting the Zyablov bound, which can be computed in polynomial time (see [18, Theorem 10.2.1]). In fact, what we state is the special case of the (effective) Zyablov bound with sub-constant rate. (We note that the premise of [18, Theorem 10.2.1] is stated with the requirement . While is necessary for the case, for general constant , the result holds for , as in Theorem 17. It is also evident from the proof of the Zyablov bound that the result extends to superconstant as well, i.e., that for any prime power that is an unbounded function of and any , there is an efficiently computable family of codes with relative distance .)
Theorem 17 (Zyablov bound [44]).
For any (efficiently computable) , constant prime power , and constant , there is a -time algorithm that takes as input and computes (a circuit representing) a code with relative distance for sufficiently large .
We now show that is -hard on -ary codes of relative distance less than and any constant rate. This hardness corresponds to the red shaded region in the top plot in Figure 1. We remark that is easy for . In particular, for any such , this problem can be solved by choosing any collection of polynomially many codewords and outputting the closest pair among them. So, Corollary 18 shows essentially tight hardness of in terms of .
Corollary 18 (-hardness of ).
Let be a fixed prime power and let be constants. Then for all sufficiently large with and , there is a Karp reduction from - to .
In particular, is -hard for any constant rate and constant relative distance .
Proof.
See full version [4].
Remark 19.
We remark that the assumption (equivalently, ) in Corollary 18 is not necessary, except to ensure that the instance of output by the reduction meets the definition of a code used there. In fact, modifying Corollary 18 slightly shows -hardness of “” for any for constant (and any constant relative distance less than). This is because the hard instances of constructed in Corollary 18 are such that either or is large for .
We also extend Corollary 18 to codes in systematic form (and in particular to codes with distance , which are represented by injective circuits; see Section 1.5). This hardness corresponds to the red shaded region in the bottom plot in Figure 1.
Corollary 20 (-hardness of ).
Let be a fixed prime power and let be constants. Then for all sufficiently large with and , there is a Karp reduction from - to .
In particular, is -hard for any constant rate and constant relative distance .
Proof.
See full version [4].
3.2 -hardness of
We next show a reduction from for to analogous to Theorem 16.
Theorem 21.
Let and let be a prime power. Suppose that and . Then there is a Karp reduction from - to that takes a circuit defining a -list-decodable code with minimum distance at least (i.e., is injective) as auxiliary input.
Proof.
See full version [4].
We now state a result on explicit codes that nearly achieve list-decoding capacity [17] (see also [18, Theorem 17.3.8]). We note in passing that these codes are folded Reed-Solomon codes, but we will only use them in a black-box way as our gadget codes in Theorem 21.
Theorem 22 ([17], [18, Theorem 17.3.8]).
For any constant rate , any sufficiently small constant , and all sufficiently large , there exist linear -ary -list-decodable codes of dimension and sufficiently large block length , for some
Furthermore, there is a -time algorithm for computing (a circuit representing) such codes .
Finally, we use Theorem 22 to prove -hardness of .
Corollary 23 (-hardness of ).
For any constants and positive integers and , there exists a prime power and a list size such that there is a Karp reduction from - to , where .
In particular, for these parameters, is --hard.
Proof.
See full version [4].
As with , we also show hardness of for codes in systematic form. (See Section 1.5.)
Corollary 24 (-hardness of ).
For any constants and and positive integers and , there exists a prime power and a list size such that there is a Karp reduction from - to , where .
In particular, for these parameters, is --hard.
Proof.
See full version [4].
4 Finding short lattice vectors is in
In this section, we show that the problem of finding suitably short non-zero lattice vectors (in norms) can be placed in -, with a smooth tradeoff between the length of the vector obtained and and . In particular, when and , we find vectors whose length is at most the bound given by Minkowski’s celebrated theorem [32] (up to a factor of for an arbitrarily large constant ), and when and , we find shorter vectors, corresponding to a stronger bound due to Blichfeldt [7]. (Blichfeldt proved his bound in the norm, but we generalize it. Again, we match Blichfeldt’s bound up to a factor of for an arbitrarily large constant .)
To that end, we first prove the following technical proposition. We then derive the above results as corollaries.
Proposition 25.
There is an algorithm that takes as input , a radius , an integer , integers , an integer , and a basis for a lattice , makes a single query to an oracle, and outputs a lattice vector with the following behavior. If , and
then . Furthermore, the algorithm runs in time .
Proof.
See full version [4].
Recall that we are interested in - for .
Corollary 26.
Let be a constant integer, and let be such that and for a sufficiently large polynomial . Then for
where is the rank of the lattice in the - instance.
In particular, corresponds to Minkowski’s theorem , and -. Furthermore, for the case of , corresponds to Blichfeldt’s theorem , and - for any .
Proof.
See full version [4].
5 Inclusions
See full version [4].
6 Black-box separations
See full version [4].
7 A non-black-box non-separation
See full version [4].
References
- [1] Divesh Aggarwal, Zeyong Li, and Noah Stephens-Davidowitz. A -time algorithm for -SVP and -Hermite SVP, and an improved time-approximation tradeoff for (H)SVP. In Eurocrypt, 2021. URL: http://arxiv.org/abs/2007.09556.
- [2] Frank Ban, Kamal Jain, Christos H. Papadimitriou, Christos-Alexandros Psomas, and Aviad Rubinstein. Reductions in PPP. Information Processing Letters, 145:48–52, 2019. doi:10.1016/j.ipl.2018.12.009.
- [3] L. A. Bassalygo. New upper bounds for error-correcting codes. Problems of Information Transmission, pages 32–35, 1965.
- [4] Huck Bennett, Surendra Ghentiyala, and Noah Stephens-Davidowitz. The more the merrier! on total coding and lattice problems and the complexity of finding multicollisions. In ITCS, 2025. URL: https://eccc.weizmann.ac.il/report/2024/018.
- [5] Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. Multi-collision resistant hash functions and their applications. In Eurocrypt, 2018.
- [6] Nir Bitansky, Yael Tauman Kalai, and Omer Paneth. Multi-collision resistance: A paradigm for keyless hash functions. In STOC, 2018.
- [7] H. F. Blichfeldt. The minimum value of quadratic forms, and the closest packing of spheres. Mathematische Annalen, 101(1):605–608, 1929.
- [8] Romain Bourneuf, Lukáš Folwarczný, Pavel Hubáček, Alon Rosen, and Nikolaj I. Schwartzbach. PPP-completeness and extremal combinatorics. In ITCS, 2023.
- [9] Jan Buzek and Stefano Tessaro. Collision resistance from multi-collision resistance for all constant parameters. In CRYPTO, 2024.
- [10] Ivan Bjerre Damgård. A design principle for hash functions. In CRYPTO, 1989.
- [11] Thomas Debris-Alazard, Léo Ducas, and Wessel P. J. van Woerden. An algorithmic reduction theory for binary codes: LLL and more. IEEE Transactions on Information Theory, 68(5):3426–3444, 2022. doi:10.1109/TIT.2022.3143620.
- [12] Philippe Delsarte. An algebraic approach to the association schemes of coding theory. Thesis, Universite Catholique de Louvain, 1973.
- [13] Itai Dinur. Tight time-space lower bounds for finding multiple collision pairs and their applications. In Eurocrypt, 2020.
- [14] I. Dumer, D. Micciancio, and M. Sudan. Hardness of approximating the minimum distance of a linear code. IEEE Transactions on Information Theory, 49(1):22–37, 2003. doi:10.1109/TIT.2002.806118.
- [15] Nicolas Gama and Phong Q. Nguyen. Finding short lattice vectors within Mordell’s inequality. In STOC, 2008.
- [16] J. H. Griesmer. A bound for error-correcting codes. IBM Journal of Research and Development, 4(5):532–542, 1960. doi:10.1147/RD.45.0532.
- [17] Venkatesan Guruswami and Atri Rudra. Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Trans. Inf. Theory, 54(1):135–150, 2008. doi:10.1109/TIT.2007.911222.
- [18] Venkatesan Guruswami, Atri Rudra, and Madhu Sudan. Essential Coding Theory. self-published, 2023. October 3rd, 2023 book version. URL: https://cse.buffalo.edu/faculty/atri/courses/coding-theory/book/web-coding-book.pdf.
- [19] R. W. Hamming. Error detecting and error correcting codes. The Bell System Technical Journal, 29(2):147–160, 1950. doi:10.1002/j.1538-7305.1950.tb00463.x.
- [20] Yassine Hamoudi and Frédéric Magniez. Quantum time–space tradeoff for finding multiple collision pairs. ACM Trans. Comput. Theory, 15(1-2):3:1–3:22, 2023. doi:10.1145/3589986.
- [21] Siddhartha Jain, Jiawei Li, Robert Robere, and Zhiyang Xun. On pigeonhole principles and Ramsey in TFNP. In FOCS, 2024.
- [22] Emil Jeřábek. Integer factoring and modular square roots. Journal of Computer and System Sciences, 82(2):380–394, 2016. doi:10.1016/J.JCSS.2015.08.001.
- [23] Antoine Joux. Multicollisions in iterated hash functions. Application to cascaded constructions. In CRYPTO, 2004.
- [24] Grigorii A. Kabatjanskiĭ and Vladimir I. Levenšteĭn. Bounds for packings on the sphere and in space. Problemy Peredači Informacii, 14(1):3–25, 1978.
- [25] Ilan Komargodski, Moni Naor, and Eylon Yogev. Collision resistant hashing for paranoids: Dealing with multiple collisions. In Eurocrypt, 2018.
- [26] Ilan Komargodski and Eylon Yogev. Personal communication, 2023.
- [27] Arjen K. Lenstra, Hendrik W. Lenstra, Jr., and László Lovász. Factoring polynomials with rational coefficients. Mathematische Annalen, 261(4):515–534, December 1982.
- [28] Qipeng Liu and Mark Zhandry. On finding quantum multi-collisions. In Eurocrypt, 2019.
- [29] R. McEliece, E. Rodemich, H. Rumsey, and L. Welch. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Transactions on Information Theory, 23(2):157–166, 1977. doi:10.1109/TIT.1977.1055688.
- [30] Ralph C. Merkle. A certified digital signature. In CRYPTO, 1989.
- [31] Daniele Micciancio and Michael Walter. Practical, predictable lattice basis reduction. In Eurocrypt, 2016. URL: http://eprint.iacr.org/2015/1123.
- [32] Hermann Minkowski. Geometrie der Zahlen. B.G. Teubner, 1910. URL: http://books.google.com/books?id=MusGAAAAYAAJ.
- [33] Mridul Nandi and Douglas R. Stinson. Multicollision attacks on some generalized sequential hash functions. IEEE Transactions on Information Theory, 53(2):759–767, 2007. doi:10.1109/TIT.2006.889721.
- [34] Christos H. Papadimitriou. On the complexity of the parity argument and other inefficient proofs of existence. J. Comput. Syst. Sci., 48(3):498–532, 1994. doi:10.1016/S0022-0000(05)80063-7.
- [35] Amol Pasarkar, Christos Papadimitriou, and Mihalis Yannakakis. Extremal combinatorics, iterated pigeonhole arguments and generalizations of PPP. In ITCS, 2023.
- [36] M. Plotkin. Binary codes with specified minimum distance. IRE Transactions on Information Theory, 6(4):445–450, 1960. doi:10.1109/TIT.1960.1057584.
- [37] Ron D. Rothblum and Prashant Nalini Vasudevan. Collision-resistance from multi-collision-resistance. In CRYPTO, 2022.
- [38] Claus-Peter Schnorr. A hierarchy of polynomial time lattice basis reduction algorithms. Theor. Comput. Sci., 53(23):201–224, 1987. doi:10.1016/0304-3975(87)90064-8.
- [39] R. Singleton. Maximum distance -nary codes. IEEE Transactions on Information Theory, 10(2):116–118, 1964. doi:10.1109/TIT.1964.1053661.
- [40] Aikaterini Sotiraki. New Hardness Results for Total Search Problems and Non-Interactive Lattice-Based Protocols. Thesis, Massachusetts Institute of Technology, 2020. URL: https://dspace.mit.edu/handle/1721.1/129310.
- [41] Katerina Sotiraki, Manolis Zampetakis, and Giorgos Zirdelis. PPP-completeness with connections to cryptography. In FOCS, 2018.
- [42] Alexander Vardy. Algorithmic complexity in coding theory and the Minimum Distance Problem. In STOC, 1997.
- [43] Hongbo Yu and Xiaoyun Wang. Multi-collision attack on the compression functions of MD4 and 3-pass HAVAL. In ICISC, 2007.
- [44] Victor Zyablov. An estimate of the complexity of constructing binary linear cascade codes. Probl. Peredachi Inf., 1971.