Worst-Case and Average-Case Hardness of Hypercycle and Database Problems
Abstract
In this paper we present tight lower bounds and new upper bounds for hypergraph and database problems. We give tight lower bounds for finding minimum hypercycles and for a substantial regime of the unweighted hypercycle problem, together with a new, faster algorithm for longer unweighted hypercycles. We give a worst-case to average-case reduction from detecting a subgraph of a hypergraph in the worst case to counting subgraphs of hypergraphs in the average case. We demonstrate two applications of this worst-case to average-case reduction, which yield average-case lower bounds for counting hypercycles in random hypergraphs and for counting queries on average-case databases. Our tight upper and lower bounds for hypercycle detection in the worst case have immediate implications for the average case via our worst-case to average-case reductions.
Keywords and phrases:
Hypergraphs, hypercycles, fine-grained complexity, average-case complexity, databases
Category:
Track A: Algorithms, Complexity and Games
Funding:
Rene Reyes: Supported by the National Science Foundation under Grant No. 2209194.
2012 ACM Subject Classification:
Theory of computation → Graph algorithms analysis; Theory of computation → Database theory; Theory of computation → Parameterized complexity and exact algorithms
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction
The fine-grained hardness of hypergraph problems is known to have interesting connections to both theoretical and practical problems in computer science. For example, hypergraph problems have deep connections to solving constraint satisfaction problems: algorithms for hypercycle problems in hypergraphs can be used to solve various Constraint Satisfaction Problems (CSPs), including Max-$k$-SAT [16]. On the practical side, a recent line of work in database theory (initiated by Carmeli and Kröll [9]) has shown that certain queries of interest can be interpreted as problems about detecting, listing and counting small structures in hypergraphs. Nevertheless, the fine-grained study of hypergraph problems remains largely under-explored, especially when compared to the volume of work on fine-grained hardness of graph problems. In this paper, we aim to extend the understanding of hypergraphs in the average-case setting. We focus on two main directions: the worst-case hardness of hypercycle problems and the average-case hardness of counting small subhypergraphs.
To our knowledge, this work is the first to show that the conditional hardness of hypercycle detection depends on both the length of the hypercycle and the size of the hyperedges. This is somewhat surprising, given that for other problems, such as hyperclique detection, hardness is conjectured to depend only on the size of the hyperclique. In the weighted case we get tight lower bounds for all hypercycle lengths. In the unweighted case, we show that brute force is necessary for short hypercycles, resulting in tight upper and lower bounds. For the parameter regime where we cannot show the brute-force lower bound, we give a new, faster algorithm for longer unweighted hypercycles using matrix multiplication. This shows that the brute-force lower bound is actually false beyond the point where our reduction stops working.
For our average-case results, we build on a line of work that has used the error-correcting properties of polynomials to show average-case fine-grained hardness. This technique, introduced by Ball et al. [5], has already been applied to hypergraph problems by Boix-Adserà et al. [7], who show that counting small hypercliques in random hypergraphs where all hyperedges have the same size is as hard as in the worst case. In the graph setting, their approach was generalized by Dalirrooyfard et al. [12], who showed similar worst-case-to-average-case reductions for counting any small subgraph. We further generalize their result to show that this is true for counting small subhypergraphs in random hypergraphs, including ones where the hyperedges have different sizes. The worst-case to average-case reduction also allows us to prove that counting queries on random databases are hard on average.
These two results are highly complementary. Applying our worst-case-to-average-case reduction to our hypercycle hardness results immediately gives us average-case hardness of hypercycle problems. Furthermore, understanding the worst-case hardness of other hypergraph substructures would shed light on which types of counting queries are hard on average. Finally, in the process of exploring these areas, we have found many new avenues for future work, ranging from understanding the parameter regimes where we do not get tight upper and lower bounds to improving some generic techniques from traditional worst-case reductions.
1.1 Hypercycle Algorithms and Lower Bounds
An $\ell$-hypercycle in a $k$-uniform hypergraph is a list of $\ell$ distinct nodes such that every hyperedge of $k$ adjacent nodes exists in the hypergraph. That is, an $\ell$-hypercycle on nodes $v_1, \ldots, v_\ell$ consists of the edges $\{v_i, v_{i+1}, \ldots, v_{i+k-1}\}$ for all $i$, with indices taken modulo $\ell$. In the literature, this is often called a tight hypercycle, but we will omit the "tight" and call these hypercycles throughout this paper (following the norm of [16]).
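For illustration, here is a minimal Python sketch of this sliding-window definition (assuming hyperedges are stored as a set of frozensets; the representation and names are our own, not from the paper). It checks whether an ordered list of nodes forms a tight hypercycle.

```python
def is_tight_hypercycle(nodes, hyperedges, k):
    """Check that every window of k cyclically consecutive nodes is a hyperedge.
    `hyperedges` is assumed to be a set of frozensets (undirected hyperedges)."""
    ell = len(nodes)
    if len(set(nodes)) != ell:
        return False  # the nodes must be distinct
    return all(
        frozenset(nodes[(i + j) % ell] for j in range(k)) in hyperedges
        for i in range(ell)
    )

# Example: a 4-hypercycle in a 3-uniform hypergraph on nodes 0, 1, 2, 3.
edges = {frozenset(e) for e in [(0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1)]}
print(is_tight_hypercycle([0, 1, 2, 3], edges, k=3))  # True
```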
We demonstrate new algorithms and conditional lower bounds for the minimum hypercycle and unweighted hypercycle problems. Our upper and lower bounds for minimum hypercycle are tight, as shown in Figure 1. In the unweighted case, we show that the hardness of hypercycle detection depends on the relationship between the hyperedge size and the hypercycle length, as shown in Figure 2. The conditional lower bounds are derived from the minimum $k$-clique hypothesis and the hyperclique hypothesis:
Definition 1 (MIN WEIGHT k-CLIQUE HYPOTHESIS (e.g. [1, 4])).
There is a constant $c > 1$ such that, on a Word-RAM with $O(\log n)$-bit words, finding a $k$-Clique of minimum total edge weight in an $n$-node graph with non-negative integer edge weights in $[1, n^{ck}]$ requires $n^{k - o(1)}$ time.
Definition 2 (-HYPERCLIQUE HYPOTHESIS [16]).
Let $h > k \geq 3$ be integers. On a Word-RAM with $O(\log n)$-bit words, finding an $h$-hyperclique in a $k$-uniform hypergraph on $n$ nodes requires $n^{h - o(1)}$ time.
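For concreteness, the following brute-force counter (a sketch of our own, not from [16]; the parameter names are illustrative) is the roughly $n^h$-time baseline that the hypothesis asserts cannot be beaten by polynomial factors.

```python
from itertools import combinations

def count_hypercliques(vertices, hyperedges, k, h):
    """Count h-hypercliques in a k-uniform hypergraph by brute force: an h-set
    of vertices is a hyperclique iff every one of its k-subsets is a hyperedge.
    This examines all ~n^h candidate vertex sets."""
    count = 0
    for candidate in combinations(vertices, h):
        if all(frozenset(sub) in hyperedges for sub in combinations(candidate, k)):
            count += 1
    return count
```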
As evidence for this hypothesis, [16] give a reduction from the maximum-degree CSP problem to the hyperclique problem. They also give a reduction from hyperclique detection to hypercycle detection. The hard instances output by this reduction only cover a specific hypercycle length for each uniformity, and this seems inherent to the technique. A natural question to ask is whether the hardness of finding hypercycles in uniform hypergraphs depends on the relationship between the cycle length and the uniformity. We answer this question in the affirmative for both weighted and unweighted hypercycle problems. This is surprising since the hardness of other hypergraph problems, such as hyperclique, is conjectured to be independent of this relationship.
For min-weight -hypercycle, we show that brute-force is required when is less than . When , we give an algorithm with runtime independent of . For unweighted hypercycle, we show that brute force is needed for all up to the length of the instances output by the [16] reduction. For longer hypercycles, we show this is not the case by giving novel algorithms which use matrix multiplication to get a speedup.
We now provide a more detailed overview of our techniques along with formal theorem statements for our upper and lower bounds.
1.1.1 Weighted Hypercycle
We now outline results for weighted hypercycle, depicted in Figure 1. For details, see the full version. The matching lower bounds come from a reduction between cliques in graphs (2-uniform hypergraphs) and hypercycles. Informally, the minimum $k$-clique hypothesis states that finding a minimum-weight $k$-clique in an $n$-node graph cannot be done faster than the naive algorithm, which takes $O(n^k)$ time (see Definition 1). We prove the following.
Theorem 3.
If the minimum -clique hypothesis holds then the minimum -hypercycle problem in a -uniform hypergraph requires time for .
1.1.2 Unweighted Hypercycle
For the unweighted version of the problem we get tight upper and lower bounds in a -uniform graph for all , where we define as is done in [16] and is the largest such that . Intuitively, this is the largest such that any three nodes in the hypercycle are covered by at least one hyperedge of size , which happens when . For larger hypercycles where , we can pick three partitions such that no hyperedge covers all three. This property allows us to reduce the problem to triangle-counting and then use fast matrix multiplication to obtain an algorithmic speedup. This algorithm runs in time where is the matrix multiplication constant. Then, when , we can reduce the problem to a more general reachability problem and once again use fast matrix multiplication to obtain an algorithm which runs in time . We depict the upper and lower bounds for unweighted hypergraphs in Figure 2.
We get tight lower bounds for hypercycle up to the "phase transition" point. These lower bounds come from the hyperclique hypothesis. Informally, this says that detecting a hyperclique in a uniform hypergraph requires essentially brute-force time (see Definition 2). We prove the following theorem in Section 3.1.
Theorem 4.
Under the -hyperclique hypothesis, an algorithm for finding (counting) -hypercycle for requires time.
When the hypercycle length exceeds this threshold, we show an algorithm which is better than brute force by a factor depending on $\omega$ (where $\omega$ is the matrix multiplication constant).
Theorem 5.
There exists a time- algorithm for finding -hypercycles in any -node -uniform hypergraph when .
Finally, when , we get an algorithm with runtime independent of that also exhibits savings from fast matrix multiplication.
Theorem 6.
There exists a time- algorithm for finding -hypercycles in any -node -uniform hypergraph when .
Proving conditional lower bounds in this regime might unexpectedly imply conditional lower bounds on $\omega$. In short, we get tight lower bounds for all values of the cycle length where lower bounds would not have implications for matrix multiplication's running time.
1.1.3 Average-Case Implications
We can apply our worst-case to average-case reduction techniques to get lower bounds for random hypergraphs. Notably, for constant-length unweighted-hypercycle, the hardness of counting in the worst-case is equivalent to the hardness in the average-case up to polylog factors. This gives tight upper and lower bounds for the average-case hardness of counting -hypercycles in the same regime of cycle length for a given uniformity when . Informally, if the -hyperclique hypothesis holds then counting -hypercycles in -uniform Erdős-Rényi hypergraphs when requires time. We prove this in Section 3.
1.2 Counting Subhypergraphs on Average is as Hard as Detecting them in the Worst Case
To enable our average-case results for databases and for counting hypercycles, we provide a new reduction. Our reduction can handle non-uniform hypergraphs, meaning hypergraphs with edges of different sizes (see Figure 3). This is necessary for the results to apply to the database setting. We transform the problem into a low-degree polynomial and use a known framework to show average-case hardness. However, to get back to average-case graphs that are not color-coded, we use a new form of inclusion-exclusion. Prior work introduced inclusion-edgesclusion [16]. We present a simultaneously generalized and simplified proof which allows us to show hardness for counting these subhypergraphs in uniformly sampled random hypergraphs (even those with mixed hyperedge sizes).
Theorem 7 (Informal).
Let be a constant. Counting the number of -node hypergraphs made up of nodes in -partite hypergraphs with partitions in the worst case where we only count copies of where can be solved with polylog calls to a counter for hypergraphs in uniformly random hypergraphs.
Via color-coding, this means that if detecting a hypergraph in the worst case is hard, then counting that hypergraph in the uniform average case is hard. So, hardness of either the detection or the counting problem in a partite graph implies hardness of the counting problem in the uniform average case.
These approaches allow us to show hardness for database problems and the problem of counting subhypergraphs. We demonstrate the usefulness of the improved approach by showing its applications to these two problem areas.
1.3 Database Results
We now provide a high-level introduction to the database setting we are interested in. For full definitions of these problems, see the full version [13]. A table in a database is called a relation. A relation, $R$, is a set of tuples. For example, a relation over user ids and user names would be a set of tuples grouping a user id with a user name. A database may have many relations, and the relations may have tuples of different sizes (e.g., the example database could also include a table whose tuples have a different number of attributes).
Queries over a database can take many forms, but a classic form is a join query. To be concrete, if we have a database with two relations, then we can form a join query over them. If we ask to enumerate all answers to the query, then we want all tuples of values that appear together in both relations as the query requires. Some query types (e.g., "FIRST" in SQL) ask for a single answer. In this paper we will focus on queries that ask to count the number of matching outputs of the query (e.g., "COUNT" in SQL). In database theory, enumeration queries are the most studied. While these queries are often trivial in the average case, non-enumerating query types are still often hard on average. See the full version [13] for more discussion of enumeration and counting on average.
You might notice that relations look like lists of hyperedges and queries look like requests to enumerate, detect, or count subhypergraphs of the implied hypergraph. However, these database "hyperedges" are directed: the tuples $(a, b)$ and $(b, a)$ have different meanings, whereas in an undirected hypergraph the edges are simply sets. Nevertheless, these problems are similar enough that we can apply analogous worst-case to average-case reduction techniques to show hardness on average for these database problems.
We now work through an example to help solidify the similarity between databases and hypergraphs. The hypergraph from Figure 3 could be represented as a database by listing its hyperedges as tuples in a small set of relations (one relation per hyperedge size). Then, to count the number of appearances of the hypergraph from Figure 3 within the hypergraph representing the entire database, we could issue a single counting join query over those relations.
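Since the concrete relations and query refer to Figure 3, the following toy stand-in (relations R1 and R2 and their contents are invented for illustration) shows how such a COUNT join query is evaluated, and why it amounts to counting occurrences of a small directed subhypergraph.

```python
# Two hypothetical relations; each tuple acts as a directed "hyperedge".
R1 = {(1, 2), (2, 3), (3, 1)}        # R1(x, y)
R2 = {(2, 3, 4), (3, 1, 5)}          # R2(y, z, w)

# COUNT query: how many assignments (x, y, z, w) satisfy R1(x, y) AND R2(y, z, w)?
count = sum(1 for (x, y) in R1 for (y2, z, w) in R2 if y == y2)
print(count)  # 2 matching assignments in this toy database
```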
A natural question for databases is this: how hard are queries on uniformly random databases? If our real world case involves uniformly distributed data, when can we hope to find algorithmic improvements? We answer this question by giving lower bounds for counting queries, which can be found in the full version [13] of the paper. We also explain why small enumeration queries will be easy.
1.4 Context and Prior Works
There has been a lot of work on average-case fine-grained complexity. Ball et al. [5] showed that, starting from popular fine-grained hypotheses, evaluating certain functions can be shown to be hard on average. Later, Ball et al. [6] showed that one can build a fine-grained proof of work using these functions. Boix-Adserà et al. [7] showed that there is an equivalence, up to polylog factors, between counting constant-sized hypercliques in the worst case and in the average case of Erdős-Rényi hypergraphs. Goldreich presented an alternative method to count cliques mod 2 [14]. Dalirrooyfard et al. [12] generalized these ideas and presented a method to produce a worst-case to average-case reduction for any problem which can be represented by a "good" low-degree polynomial. Additionally, they showed that counting small subgraphs (not just cliques) is equivalently hard in the worst case and in Erdős-Rényi graphs. In this paper we further generalize this result to the hypergraph setting. Specifically, we show that counting small subhypergraphs is equivalently hard in the worst case and the average case. We also give worst-case to average-case hardness for many classes of counting database queries. Our contributions include expanding the previous results and connecting fine-grained average-case complexity to database theory.
Recently, the database theory community has shown increased interest in the fine-grained hardness of various database queries, as highlighted by many recent papers in the area. Carmeli and Kröll [9] use fine-grained hypotheses to show that answering unions of conjunctive queries is hard. Carmeli and Segoufin [10] explore the fine-grained complexity of enumeration queries with self-joins. Bringmann et al. [8] characterize the amount of preprocessing time needed to get polylogarithmic access time. We will show that for counting queries over a constant number of relations that do not involve self-joins, the worst-case and average-case hardness are equivalent up to polylogarithmic factors.
Carmeli et al. [11] eschew enumeration to explore logarithmic-time access to the answer to a conjunctive query after quasi-linear database preprocessing. They provide fast algorithms for these problems. Recent work from Wang, Willsey, and Suciu [18] proposed a free join which achieves worst-case optimality. These are algorithms where the runtime matches the worst-case output size. As we explore counting queries, where the output is a single word of $O(\log n)$ bits, worst-case optimality is impossible as long as the full input must be read to answer the query. We thus focus on efficiency more generally. A 2023 Simons program on Logic and Algorithms in Database Theory and AI ran a workshop on Fine-Grained Complexity, Logic, and Query Evaluation [17]. In this workshop, the open problem of the average-case hardness of database queries was brought up in the open problem session on September 27th. In this paper we explore and prove this average-case hardness for many kinds of counting queries in databases, giving a new definition of an average case for database theory and providing a worst-case to average-case reduction using fine-grained complexity techniques.
Hypercycles ("tight hypercycles" in much of the literature) have been well studied in both mathematics and computer science. Allen et al. [2] showed that hypercycles in a uniform hypergraph must exist if there are at least a certain number of hyperedges, for some constant. It is unclear whether, for longer cycles, you can still guarantee a hypercycle exists. In this paper we show hardness in a regime where these results do not apply. Huang and Ma [15] continued the exploration of this existential hypercycle question, showing that, in some sense, the previous result is tight up to constants. Later, Allen et al. [3] gave a faster algorithm for tight Hamiltonian hypercycles in random graphs, that is, in average-case graphs. The results in our paper rely on the cycle length being small; for larger lengths the worst-case to average-case reduction becomes increasingly expensive. So their work and ours show an interesting dynamic where short cycles have more similar hardness in the worst case and the average case, and there is a divergence as the cycle length grows. Lincoln, Vassilevska Williams, and Williams [16] give a reduction from Max-$k$-SAT to hypercycle and then to cycle. However, their results for cycle are not tight. They give a bound for cycle, and a tighter bound for cycle in sufficiently sparse graphs. In Theorem C.1 they show a lower bound for hypercycle; however, this is for dense hypercycle. In our paper we present tight bounds for sparse hypercycle. We also present a new faster algorithm for unweighted hypercycle. Our weighted lower bounds are tight. Our unweighted lower bounds are tight up to the regime where matrix multiplication is used in the fastest algorithm.
1.5 Paper Structure
We present the basic background and definitions in Section 2. We show tight lower bounds for hypercycles in Section 3. We give fast algorithms for hypercycles in Section 4. We present open problems in Section 5.
There is a full version of this paper [13], which contains details of proofs, extra discussion, as well as the following omitted sections.
In the full version we: give the tight upper and lower bounds for minimum hypercycle, give the worst-case to average-case reduction for counting subhypergraphs, and show the average-case hardness for (self-join-free) counting queries.
2 Preliminaries
In this paper we combine ideas and techniques from average-case fine-grained complexity, databases, and worst-case fine-grained complexity and algorithms. As a result we will mostly define the needed terms in the corresponding sections. Our preliminaries are short and focus on hypergraphs as these appear in most sections.
Definition 8.
A $k$-uniform hypergraph has a vertex set $V$ and a set of hyperedges $E$, where each hyperedge is a set of $k$ vertices from $V$.
A hypergraph of mixed sizes with hyperedge sizes $k_1, \ldots, k_t$ has a vertex set $V$ and a set of hyperedges $E$, where each hyperedge is a set of vertices from $V$ whose size is one of $k_1, \ldots, k_t$.
Definition 9.
A tight $\ell$-hypercycle in a $k$-uniform hypergraph $G$ is a list of $\ell$ distinct nodes $v_1, \ldots, v_\ell$ where every hyperedge $\{v_i, v_{i+1}, \ldots, v_{i+k-1}\}$ (indices taken modulo $\ell$) exists in $G$. That is, every hyperedge of $k$ adjacent vertices on the cycle exists.
Definition 10.
An $h$-hyperclique in a $k$-uniform hypergraph is a list of $h$ nodes where every hyperedge among those nodes exists in the graph. That is, every possible hyperedge between the nodes exists in the graph.
In this paper we use the term $\ell$-hypercycle to refer to a tight $\ell$-hypercycle. We use these concepts throughout the paper. We also use a more general notion of a subgraph of a hypergraph.
Definition 11.
A subhypergraph or subgraph $H$ of a hypergraph $G$ (the hypergraph could have mixed uniformity or not) is a hypergraph whose vertex set and edge set are subsets of $G$'s vertex set and edge set. That is, if $G$ has vertex set $V$ and edge set $E$, then $H$ is a subgraph of $G$ if its vertex set is a subset of $V$ and its edge set is a subset of $E$.
Theorem 12 (Theorem 3.1 from [16]).
Let be a -uniform hypergraph on vertices , partitioned into parts .
Let .
In time we can create a -uniform hypergraph on the same node set as , so that contains an -hypercycle if and only if contains an -hyperclique with one node from each .
If has weights on its hyperedges in the range , then one can also assign weights to the hyperedges of so that a minimum weight -hypercycle in corresponds to a minimum weight -hyperclique in and every edge in the hyperclique has weight between . Notably, .
For intuition on this theorem see the full version [13].
Definition 13 (From [12]).
Let $H$ be a $k$-node graph with vertex set $\{1, \ldots, k\}$.
An $H$-partite graph is a graph with $k$ partitions $V_1, \ldots, V_k$. This graph may only have edges between nodes $u \in V_i$ and $v \in V_j$ if $(i, j)$ is an edge of $H$.
3 Short Unweighted Hypercycles Require Brute-Force
We will begin by showing that short hypercycles are tightly hard. We will do so for hypercycles of length up to a threshold depending on the uniformity. On an intuitive level, this threshold is the largest cycle length such that every set of three nodes on the cycle is included in at least one hyperedge. In this section we will show that, given this constraint, the best algorithm for finding a hypercycle is the naive brute-force algorithm. In the next section we will show that if the cycle length is even one larger than this threshold, then we can beat brute force using fast matrix multiplication. In this sense our brute-force lower bound is tight: we could not extend it to cycles of any greater length.
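For reference, the brute-force baseline that this section shows to be essentially optimal in this regime can be sketched as follows (our own illustration, assuming hyperedges are stored as a set of frozensets).

```python
from itertools import permutations

def has_tight_hypercycle(vertices, hyperedges, k, ell):
    """Brute-force detection of a tight ell-hypercycle: try every ordered
    ell-tuple of distinct vertices and check all k-windows around the cycle.
    This inspects on the order of n^ell tuples."""
    for tup in permutations(vertices, ell):
        if all(frozenset(tup[(i + j) % ell] for j in range(k)) in hyperedges
               for i in range(ell)):
            return True
    return False
```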
Our lower bounds are based on the min-weight clique and hyperclique hypotheses, which state that these problems require essentially brute-force time (see Definition 2). We use a theorem from [16] to show hardness for our hypercycle problems. For intuition about this reduction see the full version [13].
3.1 Tight Hardness for Short Hypercycles
We begin by showing that, for finding relatively short hypercycles, brute-force algorithms are optimal unless the hyperclique hypothesis is false. Note that the reduction from Theorem 12 preserves the number of vertices in the graph. We use this fact throughout the proof of our lower bound for this parameter regime.
Now, what exactly do we mean by “short hypercycles”? To formally define the range in which we get tight results, we must introduce some notation, which allows us to reason about the uniformities we show hardness for.
Definition 14.
Recall the function from Theorem 12. Then, for constant define: .
This function will make it easier to discuss the hypercycle lengths for which the hyperclique reduction yields hardness. These correspond to the range , which we will sometimes refer to as short hypercycles. In particular, we show that improving over brute-force search for short hypercycles would yield faster algorithms for finding hypercliques. More formally we will prove that:
See Theorem 4.
This conditional lower bound follows from the following ideas, which we prove as intermediate lemmas. First, the function is monotonically increasing, so applying the hyperclique-to-hypercycle reduction from [16] for yields a uniformity . Furthermore, we show a self-reducibility property of the hypercycle problem, which is that algorithms solving -hypercycle on a given uniformity can be used to solve it on smaller uniformities. Putting these ideas together, we are able to show that any hardness given by the hyperclique reduction on uniformities can be extended to .
3.1.1 Understanding the Hyperclique to Hypercycle Reduction
We begin by showing monotonicity of .
Lemma 15.
For all , the function is monotonically increasing.
Proof.
We will show that for any , . Note that . Then, since , we get .
Corollary 16.
For every such that .
We next want to show that if finding hypercycles is hard on -uniform graphs, it is also hard in graphs with uniformity . To do this, we will make use of -circle-layered hypergraphs:
Definition 17 (-circle-layered hypergraphs).
We say that a hypergraph is $\ell$-circle-layered if it is $\ell$-partite and its vertex partitions can be arranged in a circle such that hyperedges only exist between adjacent vertex partitions; that is, each hyperedge uses only vertices from consecutive partitions around the circle.
While this seems like a highly restrictive definition, [16, Lemma 2.2] shows that an algorithm for finding hypercycles in this type of hypergraph can be used to find hypercycles in arbitrary hypergraphs with only a modest overhead. Thus, for practical purposes we will treat the problems as equivalent. Moreover, when indexing into a vertex in a hypercycle or a partition in a circle-layered graph, indices should be interpreted cyclically, that is, modulo the number of partitions.
Now, we need one more property of circle-layered hypergraphs before we prove our self-reducibility result. The structure of these graphs is useful for reasoning about hypercycles when we can ensure that any hypercycle in the graph uses exactly one vertex from each partition. When this is true, we can think of the hypercycle as "going around" the circular structure of the hypergraph. Nonetheless, there are some scenarios in which more than one vertex from some partition may appear in the hypercycle. We can think of these as "backward cycles", since they must turn around at some point to re-visit the repeated partition.
We can show that such hypercycles cannot exist when .
Lemma 18.
Suppose there exists a -hypercycle in a -circle layered hypergraph with uniformity that does not use exactly one node from each partition. Then, .
For the proof of this lemma, see the full version [13].
3.1.2 Increasing Uniformity Preserves Hardness
We can now show the following self-reducibility lemma about finding and counting -hypercycles:
Lemma 19.
Let be a hypercycle length such that . Suppose there exists such that there is an algorithm for finding (counting) -hypercycles in -circle-layered graphs that runs in time . Then, for any uniformity , there exists an algorithm for finding (counting) -hypercycle in -circle-layered graphs running in time .
Proof.
We will show how to map an $n$-node hypergraph of the smaller uniformity to an $n$-node hypergraph of the larger uniformity such that the new hypergraph has a hypercycle of the desired length if and only if the original one does.
We construct the new hypergraph as follows. We keep the same vertex set and the same circular arrangement of partitions, so the new hypergraph is also circle-layered; we then construct its hyperedges as described next.
For each hyperedge of the original hypergraph, we create new hyperedges by forming all possible "extensions" of that hyperedge into the next partitions around the circle: for every combination of one vertex from each of those next partitions, we add the extended hyperedge to the new hypergraph.
This mapping results in a hypergraph such that and . This blowup in the number of edges will not be a problem since the algorithm we are assuming for -hypercycle is agnostic to sparsity.
Now we must show that preserves -hypercycles from and does not create any new ones. First, assume there exists a hypercycle in . Then, for all consecutive sets of vertices, we have that the hyperedge . Since we added all possible extensions of this hyperedge to , this necessarily implies that for the specific extension corresponding to the next vertices of the hypercycle, the hyperedge . By applying this reasoning to all of the edges in the hypercycle, we get that also form a -hypercycle in .
Now we show why does not contain any hypercycles that cannot be traced back to a hypercycle in . The key idea for this will be the fact that for every hyperedge in , the first nodes in the hyperedge form a hyperedge in .
Assume there exists a hypercycle in . We want to show that these vertices also form a hypercycle in , which means we want to show that for each consecutive set of vertices we have . Now, we know that for any set of consecutive vertices in the hypercycle, there is a hyperedge in of the form . This is due to the fact that, by Lemma 18, any -hypercycle in this parameter regime must use exactly one vertex from each partition. Then, if this edge was added to , this means the first vertices must form a hyperedge in . Consequently, form a -hypercycle in .
Now that our mapping is complete, we can specify how an algorithm solving -hypercycle in time can be used to solve -hypercycle in the same runtime. Given an -node, -uniform hypergraph , we can compute a -uniform hypergraph following our reduction. This requires iterating through all hyperedges in then creating all possible extensions and there can be up to hyperedges, so the reduction takes time. Then, since also has nodes, we can run on to determine whether there is a -hypercycle in time.
The total runtime of this algorithm is , which is what we wanted to show.
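The hyperedge-extension mapping used in this proof can be sketched as follows (a hedged illustration under assumed input conventions, not the paper's exact notation: each original hyperedge is stored together with the index of its first partition, and k_prime denotes the larger uniformity).

```python
from itertools import product

def lift_uniformity(edges, partitions, k, k_prime):
    """Map a k-uniform circle-layered instance to a k'-uniform one.
    `partitions` is a list of ell vertex lists arranged in a circle, and
    `edges` is a set of pairs (i, e) where e is a k-tuple using one vertex
    from each of partitions i, i+1, ..., i+k-1 (mod ell). Each edge is
    extended by every choice of one vertex from the next k'-k partitions."""
    ell = len(partitions)
    lifted = set()
    for (i, e) in edges:
        next_parts = [partitions[(i + k + j) % ell] for j in range(k_prime - k)]
        for choice in product(*next_parts):
            lifted.add((i, e + choice))
    return lifted
```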
3.1.3 Getting the Tight Lower Bound
We finally prove that short hypercycles require brute force:
See Theorem 4.
Proof.
We want to show that for all , finding a -hypercycle requires time. We will do this by applying Theorem 12 to go from hardness of -hyperclique to hardness of -hypercycle, then use monotonicity of and the self-reducibility lemma to get the desired hardness for -hypercycle.
First, observe that the reduction from -hyperclique to -hypercycle preserves the number of vertices in the hypergraph, and this reduction can be computed in time . Moreover, this reduction outputs a -circle-layered hypergraph. Consequently, an algorithm solving -hypercycle in time would solve -hyperclique in the same runtime. This would violate the -hyperclique conjecture, so we get a conditional lower bound for -hypercycle. Now, since , Lemma 15 tells us that . In the case these are equal, we already have the desired lower bound.
If the inequality is strict, we know from Lemma 19 that an algorithm for the larger uniformity would imply an algorithm with the same runtime for the smaller one, since the cycle length in question is not a multiple of the uniformity. This violates our conditional lower bound, so we can conclude that under the hyperclique assumption, finding such a hypercycle requires brute-force time.
From our worst-case to average-case reduction found in the full version of the paper [13], we can conclude that the average-case version of counting -hypercycles also requires brute-force:
Corollary 20.
If the -hyperclique hypothesis holds then counting -hypercycles in -uniform Erdős-Rényi hypergraphs when requires time for constant .
Proof.
If is constant then so is . Thus the number of edges and nodes in our subhypergraph is constant.
Recall we are limiting ourselves to . By Theorem 4 we have that deciding if a -hypercycle exists in a -partite graph is hard if the -hyperclique hypothesis is true.
We can then apply Theorem 7 to conclude that counting -hypercycles in -uniform Erdős-Rényi hypergraphs is hard as well.
4 Beating Brute Force for Longer Hypercycles
In this section we give a new faster algorithm for hypercycle. While the range in which we get tightness in the previous section may have seemed arbitrary, the brute-force lower bound is actually false for any longer hypercycle. In particular, we show two different algorithms which leverage matrix multiplication to beat brute force when is sufficiently bigger than . One of these algorithms allows improvement over starting at exactly , which is the point at which we can no longer prove tightness. The second algorithm exploits the sparsity of its input, and yields even better runtimes than the first when . In this section we provide details for the second algorithm, details for the first can be found in the full version [13].
These algorithms suggest the following relationship between hypercycle length and potential lower bounds. When , hardness is still dominated by the hypercycle length, but any reasonable lower bounds must account for fast matrix multiplication. Then, when , since the algorithm exploits sparsity and the total number of possible hyperedges is , hardness becomes independent of hypercycle length and is dominated by uniformity.
4.1 A Faster Algorithm for Longer Hypercycles
In this section, we explore a different approach to the hypercycle problem. Once again, we exploit the structure of a circle-layered graph. The main idea behind previous algorithms for, e.g., weighted cycle is to start at some partition and, for each vertex in it, iterate through the remaining partitions while keeping track of which vertices are reachable from that starting vertex [16]. When going from one partition to the next, a node becomes reachable if it shares an edge with a reachable node in the previous partition. This reduces the cycle problem to a reachability subproblem between adjacent partitions.
We define a more general reachability subproblem, modified in ways that are necessary to use it for hypercycle detection. The key difference is that instead of considering all possible starting vertices for the hypercycle, we have to consider all possible sets of starting vertices. This is due to the fact that at the end of the hypercycle, we will need the last edges to contain all of these vertices. See Figure 4 for a visual representation of why partitions are needed to finish the cycle.
We define a reachability problem in circle-layered graphs to capture this idea.
4.1.1 Reachability Problems in Uniform Hypergraphs
Intuitively, the following problem asks for all hyperpaths that cross a given set of partitions in a circle-layered hypergraph. The output answers, for every tuple consisting of the first few and last few nodes, whether such a hyperpath exists or, in the weighted case, what the minimum-weight such hyperpath is.
Definition 21 ((Weighted) -uniform circle-layered reachability problem ((Weighted) -CLR)).
Let be an -vertex -uniform -circle-layered (weighted) hypergraph with vertex partitions . Let and be sets of vertex partitions. Then, the -uniform circle-layered reachability problem asks for an output of size . For all tuples of nodes where and you must return the following:
- If unweighted, return whether there exists a choice of intermediate vertices such that all of the corresponding hyperedges exist in the hypergraph (that is, whether a hyperpath exists).
- If weighted, let a function return the weight of each hyperedge; then return the minimum total weight of such a hyperpath.
The output of this problem is a matrix, which we denote CLR() (we will also call this output matrix CLR$_{2u-1}$()). This matrix is binary if unweighted and is over the field of the weights if weighted.
For convenience we will also define a second version of this problem which takes in a CLR$_{k-1}$() matrix and outputs another reachability matrix, CLR$_k$(), for a graph with one more circle layer. The key intuition for why we define the problem this way is that if you have reachability information from the first nodes to the last nodes, you can extend it by another partition, because no hyperedge on the path needs information about the "middle" nodes given the uniformity of the hyperedges.
Definition 22 ((Weighted) -extension uniform circle-layered reachability problem ((Weighted) -ECLR)).
Let be an -vertex -uniform -circle-layered (weighted) hypergraph with vertex partitions . Let and be sets of vertex partitions. Additionally let CLR be a reachability matrix which gives (minimum weight) hyperpath reachability from to .
Then, the -ECLR problem asks for an output of the same shape: for all tuples of starting and ending nodes, you must return the following:
- If unweighted, return whether there exists a hyperpath which starts at the given starting nodes and ends at the given ending nodes.
- If weighted, return the minimum weight of a hyperpath which starts at the given starting nodes and ends at the given ending nodes.
The output of this problem is a matrix, which we denote CLR$_k$(). This matrix is binary if unweighted and is over the field of the weights if weighted. See Figure 5 for a visual of this problem.
4.1.2 Hypercycle Algorithm from Reachability Algorithm
We now show that a pair of algorithms solving these problems can be used efficiently to solve hypercycle.
Lemma 23.
Suppose there exists a -time algorithm for -CLR and a -time algorithm for -ECLR. Then, there exists an -time algorithm for finding (counting) -hypercycles in -vertex -uniform -circle-layered hypergraphs.
Proof.
Let be the algorithm for -CLR, be the algorithm for -ECLR and let be the vertex partitions of the -circle-layered hypergraph. Let and be as defined in Definition 22.
Note that given the appropriate CLR matrix, we can solve hypercycle by "completing the hyperpath" as follows. For each possible starting sequence, we iterate through each possible ending sequence and check its reachability. If the value in the matrix is a 1, we can then check whether all of the closing hyperedges exist. If they do, we have found a hypercycle.
Note that this procedure requires work for each possible pair of sequences, and there are pairs, so this requires time.
Now we must show how to efficiently obtain -CLR) using the algorithms and . First, we use to get -CLR) in time. Next, we can repeat the following for : Run on inputs and -CLR) to get -CLR). Once we get to , we can carry out the hyperpath completion mentioned previously and check for hypercycles.
This process requires running once and a constant number of times, which takes time. Thus, the total runtime of the algorithm is .
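The overall control flow of this proof can be summarized by the following skeleton (a sketch under assumed interfaces: base_clr, extend_clr, and complete_cycle stand in for the CLR algorithm, the ECLR algorithm, and the hyperpath-completion step described above).

```python
def hypercycle_via_reachability(H, ell, base_clr, extend_clr, complete_cycle):
    """Compute the base circle-layered reachability matrix, extend it one
    partition at a time with the ECLR subroutine, and finally check the
    closing hyperedges to detect (or count) a hypercycle."""
    matrix, layers_covered = base_clr(H)
    while layers_covered < ell:
        matrix = extend_clr(H, matrix)
        layers_covered += 1
    return complete_cycle(H, matrix)
```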
Note that if both subroutines have runtimes independent of the cycle length, then so does this algorithm. Intuitively, this should be true since one problem is completely characterized by circle-layered graphs, and the other is simply asking about extending the reachability by one layer. In the next subsection, we actually prove this by constructing algorithms for the CLR and ECLR problems which will let us prove the following theorem:
Theorem 24.
There exists a time- algorithm for finding (counting) -hypercycles in -node -uniform -circle layered hypergraphs when .
4.1.3 Algorithms for Unweighted Reachability
To achieve the desired runtime, we need algorithms for -CLR and -ECLR running in time . We will show that this can be achieved using matrix multiplication.
We first show how to do this for .
Lemma 25.
There exists a time- algorithm for solving -CLR in -node -uniform hypergraphs.
Proof.
The algorithm heavily relies on the fact that any given hyperedge will cover at most two out of the three vertex partitions. We fix these three, then for each of the possible combinations of vertices from the other vertex partitions, we will create a graph which captures the reachability between these three partitions.
The graph will have three vertex partitions, corresponding to , with edges between and as well as between and . We add an edge between any pair of vertices in and which has a hyperpath going through the current choice of vertices in . Similarly, we add an edge between a pair of vertices in and if there is a hyperpath between them going through the current choice of vertices in .
Once the graph has been constructed, assume we have two adjacency matrices, one for each pair of vertex sets with edges between them. Then, we can use matrix multiplication to obtain the reachability between and in time. Since we do this for graphs, the total runtime of the algorithm is time.
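The inner step, composing two bipartite reachability relations with one Boolean matrix product, can be sketched as follows (numpy's dense product stands in for a fast matrix multiplication routine; the matrices shown are illustrative).

```python
import numpy as np

def two_hop_reachability(A, B):
    """A is the boolean reachability matrix from partition X to partition Y,
    B the one from Y to Z. The product gives reachability from X to Z through
    some vertex of Y; with fast matrix multiplication this costs O(n^omega)."""
    return (A.astype(np.int64) @ B.astype(np.int64)) > 0

# Tiny example on 2-vertex partitions.
A = np.array([[1, 0], [0, 1]], dtype=bool)
B = np.array([[0, 1], [1, 0]], dtype=bool)
print(two_hop_reachability(A, B).astype(int))  # [[0 1] [1 0]]
```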
We now show how to extend the technique to solve reachability when .
Lemma 26.
There exists a time- algorithm for solving -ECLR in -node -uniform hypergraphs.
Proof.
Recall that we want to produce the matrix -CLR given and the reachability matrix -CLR, where all consist of tuples of vertices. The algorithm essentially creates a reachability matrix from to , then combines this with -CLR to extend the hyperpath.
To create the reachability matrix between the two new partitions, we must look at all possible pairs of tuples. If the two tuples overlap in the required way, then we check whether the corresponding hyperedge is present. If it is, then we set the corresponding entry of the matrix to 1. Note that doing this for all possible pairs takes only the stated time.
To obtain -CLR from and -CLR we use the same approach as in the previous lemma. We will fix three vertex sets and then create different graphs by taking all possible combinations of vertices in . For each combination, the graph will have three vertex partitions corresponding to and edges will be added between and as well as between and . Crucially, this can be done because is big enough so that any given hyperedge will cover at most two of these. To determine reachability between and , we can check the corresponding entry in -CLR. Even though this matrix contains a stricter condition, this condition is necessary for the reachability to , so we enforce it with this connection. To determine reachability between and , we can use the corresponding entries in .
Once we have this graph, we can use matrix multiplication to determine 2-hop reachability and fill out the appropriate entry in -CLR in time. Since we do this for all combinations of vertices, the total runtime of the algorithm is .
We can now prove that -hypercycle can be solved in for sufficiently large .
See Theorem 24.
Proof.
Recall from Lemma 23 that given a -time algorithm for -CLR and a -time algorithm for -ECLR we can solve -hypercycle in time . Applying Lemma 26, we know we can set . From Lemma 25 we get .
Combining these, we conclude that -hypercycle in -uniform hypergraphs can be solved in time when .
We can now prove our desired theorem.
See Theorem 6.
Proof.
Using the color-coding approach from [16, Lemma 2.2] we can randomly color the vertices in to obtain a -circle-layered hypergraph. Then, we can run our time- algorithm on this new hypergraph. Finally, this process can be repeated times to boost the success probability, giving us our runtime of .
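For completeness, the color-coding wrapper used here can be sketched as follows (our own hedged illustration: detect_in_layered stands in for the circle-layered detection algorithm, and the number of repetitions needed for a high success probability is left as a parameter rather than the paper's exact bound).

```python
import random

def color_coding_detect(vertices, hyperedges, k, ell, detect_in_layered, repetitions):
    """Randomly assign every vertex one of ell circularly ordered layers and run
    an assumed circle-layered detector on the colored instance; repeating the
    random coloring boosts the probability that some hypercycle is colored
    consistently with the circular layering."""
    for _ in range(repetitions):
        layer = {v: random.randrange(ell) for v in vertices}
        if detect_in_layered(vertices, hyperedges, layer, k, ell):
            return True
    return False
```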
5 Conclusion and Open Problems
We have given a worst-case to average-case reduction for counting subhypergraphs and for counting database queries. We demonstrate the usefulness of our improved reduction by giving tight lower bounds for average-case hypercycle counting for short hypercycles.
Additionally, we present new algorithms and new lower bounds for hypercycle in the worst case. We get tight bounds for the weighted problem of minimum hypercycle, and tight bounds for short hypercycle lengths in the unweighted problem. We show a new, faster algorithm for hypercycle whose running time depends on the running time of fast matrix multiplication. However, these results leave many problems open.
5.1 Unweighted Hypercycle Open Problems
Fundamentally, the open problems here amount to getting clear answers about the running time of the unweighted hypercycle problem in the longer-cycle regime. Some specific open problems which we think would mark progress towards understanding this regime are:
1. When can you show a lower bound of for unweighted -hypercycle? Note that if you could show this, you would be giving a tight lower bound assuming .
2. (Extending the previous question) When can you show a lower bound of for unweighted -hypercycle?
3. Can you give a reduction which shows that if (counting, directed) -hypercycle in a -uniform graph requires time then so does (counting, directed) -hypercycle? Note that we have a reduction which shows that if directed -hypercycle in a -uniform graph requires time then so does directed -hypercycle (directedness or something similar is needed for hypercycle lengths which are a multiple of ). This generalizes the idea of adding a partition to your graph with a matching, which works for .
4. (Proving the previous point can't be done in general) Can you show that for some and , (counting, directed) -hypercycle in a -uniform graph requires time and (counting, directed) -hypercycle in a -uniform graph can be solved in time? For example, if you could show for some that hypercycle could be solved in for some , this would be satisfied.
5. Can you give a faster algorithm for the unweighted -hypercycle problem than what we offer in this paper for any and ?
5.2 Existential Hypercycle Open Problems
There are interesting patterns we have noticed in hypercycles in uniform hypergraphs which relate to the existence of graphs rather than the existence of algorithms. We note in the introduction that a theorem exists showing that hypercycles must exist in sufficiently dense graphs [2]. However, as that theorem is currently written, it does not show that, e.g., graphs with a certain number of edges must have a hypercycle of a certain length. We note that when the cycle length is not a multiple of the uniformity, you can create an extremely dense graph with no such hypercycles. Specifically, create a complete multipartite hypergraph with equally many nodes in each partition and add every possible hyperedge which includes one node from each partition. This is metaphorically similar to no odd cycles existing in a complete bipartite graph; a small sanity check of this construction appears below. We will then give some concrete open problems related to this issue of multiple-of-the-uniformity hypercycles and density.
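As an illustrative sanity check (our own, not from the paper), the following short script verifies the smallest case of this phenomenon: the complete 3-partite 3-uniform hypergraph on six vertices contains no tight 4-hypercycle (4 is not a multiple of 3), yet it does contain a tight 6-hypercycle.

```python
from itertools import permutations

# Complete 3-partite, 3-uniform hypergraph on parts {0,1}, {2,3}, {4,5}:
# every "rainbow" triple with one vertex per part is a hyperedge.
part = {0: 'A', 1: 'A', 2: 'B', 3: 'B', 4: 'C', 5: 'C'}
edges = {frozenset(t) for t in permutations(range(6), 3)
         if len({part[v] for v in t}) == 3}

def exists_tight_cycle(ell, k=3):
    for tup in permutations(range(6), ell):
        if all(frozenset(tup[(i + j) % ell] for j in range(k)) in edges
               for i in range(ell)):
            return True
    return False

print(exists_tight_cycle(4))  # False: 4 is not a multiple of 3
print(exists_tight_cycle(6))  # True: e.g. 0, 2, 4, 1, 3, 5
```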
1. In 2-uniform undirected graphs there is an interesting pattern that emerges. A fully dense bipartite graph contains no odd cycles; however, even cycles must exist even in relatively sparse graphs. Specifically, for constant in a graph with sparsity there must exist a -cycle. How does this relate to hypergraphs of larger uniformity? Hypercycles of length can exist in a -partite hypergraph, whereas hypercycles of length do not exist in even a complete -partite hypergraph. This looks like it follows a metaphorically similar structure to graphs where . Certainly it does for . However, we have not yet been able to characterize the density at which -hypercycles are guaranteed to exist when . A construction with edges where no -hypercycle exists would settle the question, as would a proof that a -hypercycle must exist in any graph with hyperedges for some constant .
2. To make the above question more concrete: what is the lowest hyperedge density such that a -hypercycle is guaranteed to exist in any graph with that density in a -uniform hypergraph? Is this density ? Is this density ?
3. To make the above question (potentially) easier: You are given a tripartite 3-uniform hypergraph, , with partitions . Furthermore, you are guaranteed that for every pair of nodes there are exactly edges of the form where and . This is a generalization of degree. What is the degree at which a -hypercycle is guaranteed?
4. Given a close reading of the proof of Theorem 1 from "Tight cycles in hypergraphs" ([2]), can you show that in a -uniform hypergraph with at least hyperedges there must be a -hypercycle? (This is what would happen if you could set and still have the proof go through.)
5. For even cycles in 2-uniform graphs, the longer the cycle, the smaller the sparsity at which it is guaranteed to exist. Can something similar be proven for -uniform graphs in general?
5.3 Counting Color Coding
We would love to say that counting constant-sized subhypergraphs in the worst case is equivalent in hardness, up to log factors, to counting constant-sized subhypergraphs in the average case. The issue we have here is, surprisingly, in the worst case. We would like to show that counting a subhypergraph in a hypergraph is equivalent to counting it in a partite graph. The issue is that standard approaches for this, even in 2-uniform graphs, allow this reduction for detection but not for counting. For specific graph structures (notably cliques) there are ways to build this reduction. However, for most graph structures getting the counts to line up seems untenable. Such a reduction would strengthen our results, but also strengthen the many results which use color-coding as a subroutine.
References
- [1] Amir Abboud, Virginia Vassilevska Williams, and Oren Weimann. Consequences of faster alignment of sequences. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, volume 8572 of Lecture Notes in Computer Science, pages 39–51. Springer, 2014. doi:10.1007/978-3-662-43948-7_4.
- [2] Peter Allen, Julia Böttcher, Oliver Cooley, and Richard Mycroft. Tight cycles in hypergraphs. Electronic Notes in Discrete Mathematics, 49:675–682, 2015. doi:10.1016/J.ENDM.2015.06.091.
- [3] Peter Allen, Christoph Koch, Olaf Parczyk, and Yury Person. Finding tight hamilton cycles in random hypergraphs faster. In Michael A. Bender, Martin Farach-Colton, and Miguel A. Mosteiro, editors, LATIN 2018: Theoretical Informatics - 13th Latin American Symposium, Buenos Aires, Argentina, April 16-19, 2018, Proceedings, volume 10807 of Lecture Notes in Computer Science, pages 28–36. Springer, 2018. doi:10.1007/978-3-319-77404-6_3.
- [4] Arturs Backurs and Christos Tzamos. Improving viterbi is hard: Better runtimes imply faster clique algorithms. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, volume 70 of Proceedings of Machine Learning Research, pages 311–321. PMLR, 2017. URL: http://proceedings.mlr.press/v70/backurs17a.html.
- [5] Marshall Ball, Alon Rosen, Manuel Sabin, and Prashant Nalini Vasudevan. Average-case fine-grained hardness. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 483–496. ACM, 2017. doi:10.1145/3055399.3055466.
- [6] Marshall Ball, Alon Rosen, Manuel Sabin, and Prashant Nalini Vasudevan. Proofs of work from worst-case assumptions. In Hovav Shacham and Alexandra Boldyreva, editors, Advances in Cryptology - CRYPTO 2018 - 38th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 19-23, 2018, Proceedings, Part I, volume 10991 of Lecture Notes in Computer Science, pages 789–819. Springer, 2018. doi:10.1007/978-3-319-96884-1_26.
- [7] Enric Boix-Adserà, Matthew Brennan, and Guy Bresler. The average-case complexity of counting cliques in erdős-rényi hypergraphs. In David Zuckerman, editor, 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, Baltimore, Maryland, USA, November 9-12, 2019, pages 1256–1280. IEEE Computer Society, 2019. doi:10.1109/FOCS.2019.00078.
- [8] Karl Bringmann, Nofar Carmeli, and Stefan Mengel. Tight fine-grained bounds for direct access on join queries. In Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS ’22, pages 427–436, New York, NY, USA, 2022. Association for Computing Machinery. doi:10.1145/3517804.3526234.
- [9] Nofar Carmeli and Markus Kröll. On the enumeration complexity of unions of conjunctive queries. ACM Trans. Database Syst., 46(2), May 2021. doi:10.1145/3450263.
- [10] Nofar Carmeli and Luc Segoufin. Conjunctive queries with self-joins, towards a fine-grained enumeration complexity analysis. In Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS ’23, pages 277–289, New York, NY, USA, 2023. Association for Computing Machinery. doi:10.1145/3584372.3588667.
- [11] Nofar Carmeli, Nikolaos Tziavelis, Wolfgang Gatterbauer, Benny Kimelfeld, and Mirek Riedewald. Tractable orders for direct access to ranked answers of conjunctive queries. ACM Trans. Database Syst., 48(1):1:1–1:45, 2023. doi:10.1145/3578517.
- [12] Mina Dalirrooyfard, Andrea Lincoln, and Virginia Vassilevska Williams. New techniques for proving fine-grained average-case hardness. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Virtual, November 16 – 19, 2020. IEEE Computer Society, 2020.
- [13] Cheng-Hao Fu, Andrea Lincoln, and Rene Reyes. Worst-case and average-case hardness of hypercycle and database problems, 2025. arXiv:2504.18640.
- [14] Oded Goldreich. On counting -cliques mod 2. Electron. Colloquium Comput. Complex., TR20-104, 2020. URL: https://eccc.weizmann.ac.il/report/2020/104.
- [15] Hao Huang and Jie Ma. On tight cycles in hypergraphs. SIAM Journal on Discrete Mathematics, 33(1):230–237, 2019. doi:10.1137/18M117741X.
- [16] Andrea Lincoln, Virginia Vassilevska Williams, and R. Ryan Williams. Tight hardness for shortest cycles and paths in sparse graphs. In Artur Czumaj, editor, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 1236–1252. SIAM, 2018. doi:10.1137/1.9781611975031.80.
- [17] Hung Ngo, Kirk Pruhs, Atri Rudra, and Virginia Vassilevska Williams. Fine-grained complexity, logic, and query evaluation program. Simons Institute for the Theory of Computing, Workshop Schedule, September 2023. Logic and Algorithms in Database Theory and AI. URL: https://simons.berkeley.edu/workshops/fine-grained-complexity-logic-query-evaluation/schedule.
- [18] Yisu Remy Wang, Max Willsey, and Dan Suciu. Free join: Unifying worst-case optimal and traditional joins. Proc. ACM Manag. Data, 1(2):150:1–150:23, 2023. doi:10.1145/3589295.
- [19] Virginia Vassilevska Williams, Yinzhan Xu, Zixuan Xu, and Renfei Zhou. New bounds for matrix multiplication: from alpha to omega. CoRR, abs/2307.07970, 2023. doi:10.48550/arXiv.2307.07970.