Beating Treewidth for Average-Case Subgraph Isomorphism

For any fixed graph $G$, the subgraph isomorphism problem asks whether an $n$-vertex input graph has a subgraph isomorphic to $G$. A well-known algorithm of Alon, Yuster and Zwick (1995) efficiently reduces this to the "colored" version of the problem, denoted $G$-$\mathsf{SUB}$, and then solves $G$-$\mathsf{SUB}$ in time $O(n^{tw(G)+1})$ where $tw(G)$ is the treewidth of $G$. Marx (2010) conjectured that $G$-$\mathsf{SUB}$ requires time $\Omega(n^{\mathrm{const}\cdot tw(G)})$ and, assuming the Exponential Time Hypothesis is true, proved a lower bound of $\Omega(n^{\mathrm{const}\cdot emb(G)})$ for a certain graph parameter $emb(G) = \Omega(tw(G)/\log tw(G))$. With respect to the size of $\mathrm{AC}^0$ circuits solving $G$-$\mathsf{SUB}$, Li, Razborov and Rossman (2017) proved an unconditional average-case lower bound of $\Omega(n^{\kappa(G)})$ for a different graph parameter $\kappa(G) = \Omega(tw(G)/\log tw(G))$. Our contributions are as follows. First, we show that $emb(G)$ is at most $O(\kappa(G))$ for all graphs $G$. Next, we show that $\kappa(G)$ can be asymptotically less than $tw(G)$; for example if $G$ is a hypercube, then $\kappa(G)$ is $\Theta\left(tw(G)/\sqrt{\log tw(G)}\right)$. Finally, we construct $\mathrm{AC}^0$ circuits of size $O(n^{\kappa(G)+\mathrm{const}})$ that solve $G$-$\mathsf{SUB}$ in the average case, on a variety of product distributions. This improves an $O(n^{2\kappa(G)+\mathrm{const}})$ upper bound of Li et al., and shows that the average-case complexity of $G$-$\mathsf{SUB}$ is $n^{o(tw(G))}$ for certain families of graphs $G$ such as hypercubes.


Introduction
The subgraph isomorphism problem asks, given graphs $X$ and $G$, whether $X$ has a subgraph isomorphic to $G$. In the "colored" or "partitioned" version of the problem, each vertex of the larger graph $X$ comes with a "color" from the vertex set of $G$, and we ask whether $X$ has a subgraph that is isomorphic to $G$ with respect to this coloring. We denote the uncolored and colored subgraph isomorphism problems by $G$-$\mathsf{SUB}_{\mathrm{uncol}}(X)$ and $G$-$\mathsf{SUB}(X)$ respectively. Subgraph isomorphism is NP-complete (e.g. if $G$ is a clique or Hamiltonian cycle), so research has focused on algorithms for a variety of special cases in the context of parameterized complexity, surveyed in [12]. If $G$ is a fixed graph on $k$ vertices then $G$-$\mathsf{SUB}_{\mathrm{uncol}}$ is solvable in time $O(n^k)$ by brute force, where (here and throughout this section) $n$ is the order of the input graph. The color-coding algorithm of Alon, Yuster and Zwick [2] improves on this by efficiently reducing $G$-$\mathsf{SUB}_{\mathrm{uncol}}$ to $G$-$\mathsf{SUB}$ and solving the latter in time $O(n^{tw(G)+1})$, where $tw(G)$ is the treewidth of the fixed graph $G$.
We observe that a similar result holds easily on Turing machines, using as a subroutine the sort-merge join algorithm from relational algebra. This involves sorting, which cannot be done in $\mathrm{AC}^0$ [7], so our circuit instead uses hashing that relies on concentration of measure for subgraphs of random graphs.
It was also proved in [10] that $\kappa(G)$ is between $\Omega(tw(G)/\log tw(G))$ and $tw(G)+1$, from which it follows that the worst-case complexity of $G$-$\mathsf{SUB}$ on bounded-depth circuits is at least $n^{\Omega(tw(G)/\log tw(G))}$. The question was posed in [10] of whether $\kappa(G)$ is $\Theta(tw(G))$; an affirmative answer would have implied that Conjecture 1.1 holds on bounded-depth circuits.
Our main result is a separation of $\kappa$ from treewidth. The Hamming graph $K_q^d$ has vertex set $\{1,\ldots,q\}^d$ and edges between every two vertices that differ in exactly one coordinate. It is already known that $K_q^d$ has treewidth $\Theta(q^d/\sqrt{d})$ [4]. We prove the following:
Theorem 1.4. $\kappa(K_q^d)$ is $\Theta(q^d/d)$.
Thus, if $G$ is the hypercube graph $K_2^d$ for example, then $\kappa(G)$ is $\Theta\left(tw(G)/\sqrt{\log tw(G)}\right)$. It follows that an average-case analogue of Conjecture 1.1 is false if G is taken to be the set of all hypercubes. We also prove the following (for arbitrary graphs $G$):
Theorem 1.5. $emb(G)$ is $O(\kappa(G))$.
Because of Theorem 1.5, even if our upper bound generalizes to the worst case, it is still consistent with current knowledge (in particular Theorem 1.2) that ETH is true. Another consequence of Theorem 1.5 is that the lower bound from Theorem 1.2 holds unconditionally in $\mathrm{AC}^0$.
It follows from Theorems 1.4 and 1.5 that if $G$ is a hypercube then $emb(G) \le O(\kappa(G)) = o(tw(G))$, so proving that Conjecture 1.1 holds under ETH cannot be done by proving that $emb(G)$ is $\Theta(tw(G))$. In fact, this conclusion was already known: Alon and Marx [1] proved that if $G$ is a 3-regular expander then $emb(G)$ is $\Theta(tw(G)/\log tw(G))$. It was proved in [10] that if $G$ is a 3-regular expander then $\kappa(G)$ is $\Theta(tw(G))$, which makes our separation of $\kappa$ from treewidth more surprising. On the other hand, we will see that Theorem 1.5 is asymptotically tight in the case of Hamming graphs.
We can make a similar statement regarding $\mathrm{AC}^0$. Amano [3] observed that the color-coding algorithm for $G$-$\mathsf{SUB}$ can be implemented by $\mathrm{AC}^0$ circuits of size $O(n^{tw(G)+1})$ for fixed $G$. Our separation of $\kappa$ from treewidth implies that if Conjecture 1.1 holds in $\mathrm{AC}^0$, then this cannot be proved using average-case complexity as defined here and in [10].
The paper is organized as follows. In Section 2 we introduce some notation and definitions. In Section 3 we define the average-case problem and $\kappa(G)$, and give an $\tilde{O}(n^{\kappa(G)})$-time algorithm for the average-case problem. In Section 4 we define $emb(G)$ and prove that $emb(G)$ is $O(\kappa(G))$. In Section 5 we prove that $\kappa(K_q^d)$ is $\Theta(q^d/d)$, and obtain as a corollary that $emb(K_q^d)$ is $\Theta(q^d/d)$ as well. We also summarize the proof of Chandran and Kavitha [4] that $tw(K_q^d)$ is $\Theta(q^d/\sqrt{d})$. In Section 6 we prove our $\mathrm{AC}^0$ upper bound.
We use boldface to denote random variables.

Graphs
All graphs we consider are simple and undirected, and may have isolated vertices. If $G$ is a graph then let $V(G)$ and $E(G)$ denote its vertex and edge sets, with respective cardinalities $v(G)$ and $e(G)$. If $u$ and $v$ are adjacent vertices then we denote the edge connecting them by $uv$ or $vu$.
Definition 2.1 (Colored subgraph isomorphism problem). For graphs $G$ and $X$, where $X$ comes with a coloring $\chi : V(X) \to V(G)$, the problem $G$-$\mathsf{SUB}(X)$ asks whether $X$ has a subgraph $G'$ such that $\chi$ (restricted to $V(G')$) is an isomorphism from $G'$ to $G$. (Note that $G'$ is not required to be an induced subgraph of $X$.)
When the parent graph $G$ is clear in context, let $\deg(u)$ be the degree of a vertex $u$, and for disjoint $S, T \subseteq V(G)$ let $e(S,T)$ be the number of edges between $S$ and $T$. Similarly, for vertex-disjoint graphs $A$ and $B$ let $e(A,B) = e(V(A), V(B))$.
I P E C 2 0 1 9

Let $G \cap H$ be the graph with vertex set $V(G) \cap V(H)$ and edge set $E(G) \cap E(H)$, and define $G \cup H$ similarly. Note that $G \cap H$ may have isolated vertices even if $G$ and $H$ do not. If $A \subseteq B$ are graphs then let $[A,B] = \{H \mid A \subseteq H \subseteq B\}$, and let $(A,B]$ be the same interval without $A$, etc.
We denote by $K_k$ the complete graph on $k$ vertices, also called the $k$-clique. The graph $K_q^d$ has vertex set $[q]^d$, and two vertices are adjacent if and only if they differ in exactly one coordinate. Such graphs are called Hamming graphs. A special case is the $d$-dimensional hypercube $Q_d = K_2^d$; we will use $\{0,1\}^d$ for its vertex set.
Finally, let $\mathrm{ER}(n,p)$ be the Erdős-Rényi graph on $n$ vertices in which each possible edge exists independently with probability $p$.
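To make the definition of Hamming graphs concrete, the following Python sketch builds $K_q^d$ directly from the adjacency rule stated above. The function name `hamming_graph` and the representation of edges as `frozenset` pairs are assumptions made for this illustration only.

```python
from itertools import product


def hamming_graph(q, d):
    """Return (vertices, edges) of the Hamming graph K_q^d.

    Vertices are d-tuples over {1, ..., q}; two vertices are adjacent
    exactly when they differ in one coordinate.
    """
    vertices = list(product(range(1, q + 1), repeat=d))
    edges = set()
    for u in vertices:
        for i in range(d):
            for c in range(1, q + 1):
                if c != u[i]:
                    # v agrees with u everywhere except coordinate i.
                    v = u[:i] + (c,) + u[i + 1:]
                    edges.add(frozenset((u, v)))
    return vertices, edges


# The 3-dimensional hypercube Q_3 = K_2^3 (labels {1,2} instead of {0,1}).
cube_vertices, cube_edges = hamming_graph(2, 3)
print(len(cube_vertices), len(cube_edges))
```

Every vertex has degree $d(q-1)$, so the edge count is $q^d \cdot d(q-1)/2$; for $d = 1$ the construction gives the clique $K_q$.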
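As a small illustration of the Erdős-Rényi model just defined, the sketch below samples a graph by flipping an independent coin for each of the possible edges. The function name `erdos_renyi` and the optional `rng` parameter are hypothetical choices for this example.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, rng=None):
    """Sample the edge set of an Erdos-Renyi graph on vertices 0..n-1.

    Each of the n*(n-1)/2 possible edges is included independently
    with probability p.
    """
    rng = rng or random.Random()
    return {
        frozenset(pair)
        for pair in combinations(range(n), 2)
        if rng.random() < p
    }


sample = erdos_renyi(10, 0.3, random.Random(2019))
print(len(sample), "edges out of", 10 * 9 // 2)
```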

The Average-Case Problem and the Parameter κ(G)

Threshold Random Graphs
First we will define threshold weightings, which assign weights to the vertices and edges of a graph subject to certain constraints. Then we will define a family of random graphs for each threshold weighting. The content in this subsection is essentially all from [10].
We will often denote $\Delta = (\alpha,\beta)$ in a slight abuse of notation. (Since $\Delta(u) = \alpha(u)$ if $u$ is a single vertex, the pair $(\alpha,\beta)$ is uniquely determined by $\Delta$.) The requirement that $\alpha$ be nonnegative is redundant because it is a special case of the requirement that $\Delta$ be nonnegative. The requirement that $\beta \le 2$ is also redundant because for every edge $uv$,
A trivial example is $(\alpha,\beta) = (0,0)$, i.e. all vertices and edges have a weight of zero. The following example is more general:
be a column stochastic matrix (meaning each column sums to 1) such that if
with equality if $H = G$. In fact, in the full paper we prove that every threshold weighting is equivalent to at least one Markov chain.
The following threshold weighting will be especially important, and can be thought of as representing a uniform random walk on $G$:
Now we define threshold random graphs:
Definition 3.4. For $\Delta = (\alpha,\beta) \in \theta(G)$ let $\mathbf{X}_{\Delta,n}$ be the graph with vertices $u_i$ for $u \in V(G)$ and $i \in [n^{\alpha(u)}]$, and for $uv \in E(G)$, each edge $u_i v_j$ present independently with probability $n^{-\beta(uv)}$. The graph $\mathbf{X}_{\Delta,n}$ comes with the coloring to $G$ defined by $u_i \mapsto u$.
For $H \subseteq G$ and $X$ in the support of $\mathbf{X}_{\Delta,n}$, let $\mathrm{Sub}_X(H)$ be the set of subgraphs $H' \subseteq X$ such that the aforementioned coloring (restricted to $V(H')$) is an isomorphism from $H'$ to $H$. We say that such a graph $H'$ is "$H$-colored". Note that $\mathrm{Sub}_X(H)$ can be identified with a subset of $\prod_{u \in V(H)} [n^{\alpha(u)}]$.
Proof. Let $(\alpha,\beta) = \Delta$. The set $\mathrm{Sub}_{\mathbf{X}_{\Delta,n}}(H)$ contains each of its $n^{\alpha(H)}$ possible elements with probability $n^{-\beta(H)}$, so the result follows from linearity of expectation. (The $1 \pm o(1)$ accounts for having to round $n^{\alpha(\cdot)}$ to an integer.)
Lemma 3.5 motivates the requirements that $\Delta$ be nonnegative everywhere and that $\Delta(G) = 0$. Recall that the problem $G$-$\mathsf{SUB}(X)$ asks whether $\mathrm{Sub}_X(G)$ is the empty set. Since $\Delta(G)$ is required to be zero, it follows that $\mathrm{Sub}_{\mathbf{X}_{\Delta,n}}(G)$ has (approximately) one element on average, and the probability that $\mathrm{Sub}_{\mathbf{X}_{\Delta,n}}(G)$ is empty is known to be bounded away from 0 and 1 as $n$ goes to infinity [10].
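The sampling procedure in Definition 3.4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `sample_threshold_graph` is hypothetical, the weights are passed as plain dictionaries, and class sizes are obtained by rounding $n^{\alpha(u)}$ to the nearest integer (the text only says that some rounding is needed).

```python
import random
from itertools import product


def sample_threshold_graph(vertices, edges, alpha, beta, n, rng=None):
    """Sample a colored graph in the manner of Definition 3.4.

    vertices: vertices of G; edges: list of pairs (u, v) of G;
    alpha[u] and beta[(u, v)] are the vertex and edge weights.
    A vertex of X is a pair (u, i); its color is u.
    """
    rng = rng or random.Random()
    # Assumption: round n**alpha(u) to the nearest integer.
    size = {u: round(n ** alpha[u]) for u in vertices}
    x_vertices = [(u, i) for u in vertices for i in range(size[u])]
    x_edges = set()
    for (u, v) in edges:
        p = n ** (-beta[(u, v)])  # edge probability n^{-beta(uv)}
        for i, j in product(range(size[u]), range(size[v])):
            if rng.random() < p:
                x_edges.add(frozenset(((u, i), (v, j))))
    return x_vertices, x_edges


# With the trivial weighting (0, 0) the sample is a single copy of G.
V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c")]
xv, xe = sample_threshold_graph(V, E, {u: 0 for u in V}, {e: 0 for e in E}, n=100)
print(len(xv), len(xe))
```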

The Parameter κ(G) and an Algorithm for the Average Case
We now define $\kappa(G)$:
Definition 3.6 ([10]). Let $G$ be a graph with no isolated vertices. Let $\mathrm{Seq}(G)$ be the set of union sequences, meaning sequences $(H_1,\ldots,H_k)$ of distinct subgraphs of $G$ such that $H_k = G$ and each $H_i$ is either an edge or the union of two previous graphs in the sequence. For $\Delta \in \theta(G)$ let $\kappa_\Delta(G) = \min_{S \in \mathrm{Seq}(G)} \max_{H \in S} \Delta(H)$, and let $\kappa(G) = \max_{\Delta \in \theta(G)} \kappa_\Delta(G)$.
To simplify the exposition, whenever we refer to $\kappa(G)$, the graph $G$ is implicitly assumed to lack isolated vertices. It was proved in [10] that for any fixed $G$, constant-depth circuits solving $G$-$\mathsf{SUB}(\mathbf{X}_{\Delta,n})$ a.a.s. require size at least $n^{\kappa_\Delta(G)-o(1)}$ and at most $n^{2\kappa_\Delta(G)+c}$ (where $c$ is an absolute constant). The results about average-case complexity described in Section 1 are with respect to a $\Delta$ such that $\kappa_\Delta(G) = \kappa(G)$.
Proof. First we prove a weaker upper bound of $\tilde{O}(n^{2\kappa_\Delta(G)})$, in a manner analogous to the circuit from [10], and then we describe a modification (on Turing machines) that removes the factor of 2 from the exponent. In Section 6 we will remove the factor of 2 in $\mathrm{AC}^0$ using a different approach.
Let $S$ be a union sequence such that $\kappa_\Delta(G) = \max_{H \in S} \Delta(H)$. For any $H \in S$, by Lemma 3.5 and Markov's Inequality, (1) in Section 6.1.) By a union bound it follows that if $X \sim \mathbf{X}_{\Delta,n}$ then $\max_{H \in S} |\mathrm{Sub}_X(H)| \le \tilde{O}(n^{\kappa_\Delta(G)})$ a.a.s. Assume this condition holds for $X$. For each successive $H$ in $S$, compute $\mathrm{Sub}_X(H)$ as follows. If $H$ is a single edge then this is trivial. Otherwise $H = A \cup B$ for some previous $A, B \in S$, in which case $\mathrm{Sub}_X(H)$ is the set of $A' \cup B'$ such that $A' \in \mathrm{Sub}_X(A)$, $B' \in \mathrm{Sub}_X(B)$ and the projections of $A'$ and $B'$ onto $[n]^{V(A \cap B)}$ are equal. Therefore $\mathrm{Sub}_X(H)$ can be computed by brute force in time $\tilde{O}(|\mathrm{Sub}_X(A)| \cdot |\mathrm{Sub}_X(B)|)$. Finally, check whether $\mathrm{Sub}_X(G)$ is empty.
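The notion of a union sequence can be checked mechanically. The sketch below assumes that a subgraph without isolated vertices is represented by its edge set, that $\Delta(H)$ is the total vertex weight of $H$ minus its total edge weight, and that the cost of a sequence is the largest $\Delta(H)$ appearing in it; the function names are hypothetical.

```python
def is_union_sequence(seq, g_edges):
    """Check the union-sequence conditions for a list of edge sets."""
    g = frozenset(g_edges)
    seen = []
    for h in seq:
        if h in seen or not h <= g:
            return False  # graphs must be distinct subgraphs of G
        single_edge = len(h) == 1
        is_union = any(a | b == h for a in seen for b in seen)
        if not (single_edge or is_union):
            return False
        seen.append(h)
    return bool(seq) and seq[-1] == g


def delta(h, alpha, beta):
    """Assumed form: total vertex weight minus total edge weight."""
    verts = set().union(*h)
    return sum(alpha[v] for v in verts) - sum(beta[e] for e in h)


def cost(seq, alpha, beta):
    """Largest Delta(H) over the graphs H in the sequence."""
    return max(delta(h, alpha, beta) for h in seq)


# Triangle with unit weights on all vertices and edges (Delta(G) = 0).
ab, bc, ac = (frozenset(p) for p in ("ab", "bc", "ac"))
e1, e2, e3 = frozenset([ab]), frozenset([bc]), frozenset([ac])
seq = [e1, e2, e3, e1 | e2, e1 | e2 | e3]
alpha = {v: 1 for v in "abc"}
beta = {ab: 1, bc: 1, ac: 1}
print(is_union_sequence(seq, [ab, bc, ac]), cost(seq, alpha, beta))
```

In this toy example every proper subgraph in the sequence has $\Delta = 1$ and the triangle itself has $\Delta = 0$, so the cost of the sequence is 1.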

We can save a quadratic factor by computing $\mathrm{Sub}_X(H)$ from $\mathrm{Sub}_X(A)$ and $\mathrm{Sub}_X(B)$ as follows. (This is a case of the sort-merge join algorithm for computing the natural join of two relations, as defined in database theory [20].) Fix an efficiently computable total order on $[n]^{V(A \cap B)}$, e.g. interpret elements of $[n]^{V(A \cap B)}$ as $v(A \cap B)$-digit base-$n$ numbers in increasing order, and then define a partial order on $[n]^{V(A)} \cup [n]^{V(B)}$ by first projecting onto $[n]^{V(A \cap B)}$. Sort $\mathrm{Sub}_X(A)$ and $\mathrm{Sub}_X(B)$ in nondecreasing order, and for convenience add the symbol $\bot$ to the end of both sorted lists. Let $A'$ and $B'$ be the first elements of $\mathrm{Sub}_X(A)$ and $\mathrm{Sub}_X(B)$ respectively, and initialize an empty accumulator (which will ultimately equal $\mathrm{Sub}_X(H)$). While $A' \ne \bot$ and $B' \ne \bot$, do the following. If $A' < B'$ then let $A'$ be the next element of $\mathrm{Sub}_X(A)$. If $B' < A'$ then let $B'$ be the next element of $\mathrm{Sub}_X(B)$. Otherwise, let $B'' = B'$, and while $B'' \ne \bot$ and the projections of $A'$ and $B''$ onto $[n]^{V(A \cap B)}$ are equal, add $A' \cup B''$ to the accumulator and let $B''$ be the next element of $\mathrm{Sub}_X(B)$. Then (once the procedure involving $B''$ has finished) let $A'$ be the next element of $\mathrm{Sub}_X(A)$.
Sorting $\mathrm{Sub}_X(A)$ and $\mathrm{Sub}_X(B)$ takes $\tilde{O}(|\mathrm{Sub}_X(A)| + |\mathrm{Sub}_X(B)|)$ comparisons, and then computing $\mathrm{Sub}_X(H)$ takes Õ(|Sub
We will use the following graph-theoretic properties of $\kappa(G)$:
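The merge procedure described above can be sketched in Python. In this illustration each element of $\mathrm{Sub}_X(A)$ is assumed to be stored as a dictionary from vertices of $A$ to indices, list-bounds checks play the role of the end marker $\bot$, and the names `sort_merge_join` and `shared` (the vertices of $A \cap B$) are hypothetical.

```python
def sort_merge_join(sub_a, sub_b, shared):
    """Join two lists of vertex-to-index maps on the shared vertices."""

    def key(m):
        # Projection onto the shared vertices, compared lexicographically.
        return tuple(m[u] for u in shared)

    a_sorted = sorted(sub_a, key=key)
    b_sorted = sorted(sub_b, key=key)
    out = []
    i = j = 0
    while i < len(a_sorted) and j < len(b_sorted):
        ka, kb = key(a_sorted[i]), key(b_sorted[j])
        if ka < kb:
            i += 1
        elif kb < ka:
            j += 1
        else:
            # Scan the run of B-elements whose projection matches A's.
            jj = j
            while jj < len(b_sorted) and key(b_sorted[jj]) == ka:
                out.append({**a_sorted[i], **b_sorted[jj]})
                jj += 1
            i += 1  # the B pointer stays put for the next A-element
    return out


sub_a = [{"u": 1, "v": 2}, {"u": 1, "v": 3}, {"u": 2, "v": 2}]
sub_b = [{"v": 2, "w": 5}, {"v": 2, "w": 6}, {"v": 4, "w": 1}]
for row in sort_merge_join(sub_a, sub_b, ["v"]):
    print(row)
```

After sorting, each pass of the outer loop either advances a pointer or emits at least one output element, so the merge phase costs time proportional to the input and output sizes rather than to their product.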

The Parameter emb(G) and Proof that emb(G) is O(κ(G))
Recall that emb(G) is significant because of its role in Marx's ETH-hardness result for G-SUB, namely Theorem 1.2.

Definition 4.1 (emb(G)). Let $G^{(q)}$ be the graph formed by replacing each vertex of $G$ with a $q$-clique, i.e. it has vertices $u_i$ for all $u \in V(G)$ and $i \in [q]$, and edges $u_i v_j$ for all $u_i \ne v_j$ such that either $u = v$ or $uv \in E(G)$. Let $emb(G)$ be the supremum of all $r > 0$ for which there exists $m_0 = m_0(G,r)$ such that if $H$ is any graph with $m \ge m_0$ edges and no isolated vertices, then $H$ is a minor of $G^{(\lceil m/r \rceil)}$, and furthermore a minor mapping from $H$ to $G^{(\lceil m/r \rceil)}$ can be computed in time $f(G)\,m^{O(1)}$ for some function $f$.
Although the requirement that such a minor mapping be efficiently computable is crucial in Theorem 1.2, none of the other results about $emb(G)$ that we reference or derive depend on this requirement, so we may safely ignore it going forward. The following example illustrates Definition 4.1:
If $H$ has no isolated vertices then $H$ could have up to $2m$ vertices, so $2m \le k\lceil m/r \rceil$. Therefore $emb(K_k) = k/2$: it is sufficient for $2m$ to be at most $km/r$ (i.e. $r \le k/2$), and no $r > k/2$ satisfies $2m \le k\lceil m/r \rceil$ for arbitrarily large $m$.
Remark. The name $emb(G)$ comes from the fact that Marx [11] called a minor mapping from $H$ to $G^{(q)}$ an "embedding of depth $q$" from $H$ into $G$. Marx [11] used the notation $G^{(q)}$, but the parameter $emb(G)$ is new in the current paper; all results about $emb(G)$ in [11, 1] were stated in terms of embeddings of some depth.
The following is used in our proof that $emb(G)$ is $O(\kappa(G))$:
Proof Sketch. Given a threshold weighting $\Delta$ on $G^{(q)}$, collapsing each cluster of $q$ vertices to a single "mega-vertex" induces a threshold weighting $\Delta'$ on $G$. Let $S$ be an optimal union sequence for $G$ with respect to $\Delta'$, and project $S$ back onto $G^{(q)}$.
Now we prove that $emb(G)$ is $O(\kappa(G))$ (Theorem 1.5), using an argument similar to the proof by Marx [11] that $emb(G)$ is $O(tw(G))$:
Proof. Let $r > 0$, and assume there exists an arbitrarily large 3-regular expander $H$ that is a minor of $G^{(\lceil e(H)/r \rceil)}$. Then by Corollary 3.9, Theorem 3.8(iii), and Lemma 4.3,
In [10] the question was posed of whether Theorem 1.2 holds with $\kappa(G)$ in place of $emb(G)$. By Theorem 1.5 this would be a stronger bound, which makes the question even more interesting. This problem is open even in the case of 3-regular expanders: recall from Section 1 that if $G$ is a 3-regular expander then $emb(G)$ is $\Theta(tw(G)/\log tw(G))$ and $\kappa(G)$ is $\Theta(tw(G))$ [1, 10].
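The blow-up operation used in Definition 4.1 is easy to make concrete. The following sketch constructs the graph obtained by replacing each vertex with a $q$-clique; the function name `blow_up` and the pair encoding `(u, i)` of the new vertices are assumptions for illustration.

```python
from itertools import combinations


def blow_up(vertices, edges, q):
    """Replace every vertex of G by a q-clique.

    Two distinct new vertices (u, i) and (v, j) are adjacent when
    u == v or uv is an edge of G.
    """
    adjacent = {frozenset(e) for e in edges}
    new_vertices = [(u, i) for u in vertices for i in range(q)]
    new_edges = set()
    for a, b in combinations(new_vertices, 2):
        u, v = a[0], b[0]
        if u == v or frozenset((u, v)) in adjacent:
            new_edges.add(frozenset((a, b)))
    return new_vertices, new_edges


# Blowing up a single edge (K_2) with q = 3 yields the clique K_6.
nv, ne = blow_up(["a", "b"], [("a", "b")], 3)
print(len(nv), len(ne))
```

More generally the blow-up of $K_k$ by $q$ is the clique on $kq$ vertices, which matches the vertex-count argument for $emb(K_k)$ given in the text.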

Separating κ from Treewidth
In Section 5.1 we prove that $\kappa(K_k) = k/4 + O(1)$, which is a special case of the more general result that $\kappa(K_q^d) = \Theta(q^d/d)$. That section obtains tighter multiplicative constants in the case $d = 1$ and illustrates the main ideas of our proof in a simpler setting, but when reading the full paper it may be skipped without penalty. In Section 5.2 we prove that $\kappa(K_q^d)$ is $O(q^d/d)$ when $q$ is even, which is sufficient to separate $\kappa$ from treewidth. Again, this case is cleaner than the general case and conveys most of the intuition behind it. In an appendix in the full paper we prove that $\kappa(K_q^d)$ is $O(q^d/d)$ for all $q$. In Section 5.3 we prove that $\kappa(K_q^d)$ is $\Omega(q^d/d)$ in two different ways, completing the proof that $\kappa(K_q^d)$ is $\Theta(q^d/d)$ (Theorem 1.4), and we obtain as a corollary that $emb(K_q^d)$ is $\Theta(q^d/d)$ as well. In Section 5.4 we summarize the proof of Chandran and Kavitha [4] that $tw(K_q^d)$ is $\Theta(q^d/\sqrt{d})$.
Rossman [16] proved that $\kappa_{\Delta_o}(K_k) \ge k/4$, so it suffices to prove the upper bound. By Theorem 3.8(i) it suffices to prove that $\kappa_\Delta(K_k) \le k/4 + O(1)$ for an arbitrary $\Delta = (1,\beta) \in \theta(G)$. First we construct a sequence
) for all $i$. The set $U_k = V(K_k)$ satisfies this requirement because $\beta(K_k)$ and $\beta_o(K_k)$ are both equal to $k$. Given $U_i$, let $U_{i-1}$ be an $(i-1)$-element subset of $U_i$ chosen uniformly at random. Each pair of elements in $U_i$ is included in $U_{i-1}$ with the same probability $p_i$ $(= 1 - 2/i)$, so it follows from linearity of expectation that
Therefore there exists a fixed
We construct a union sequence $S$ for $K_k$ as follows. Start by enumerating the edges, and then for $i$ from
, where $e_1, e_2, \ldots$ are the edges between $U_i$ and
As observed in [16], it follows from Equation (1

Proof that
First we reduce this to the case $q = 2$. The graph $K_q^d$ is a subgraph of $Q_d^{((q/2)^d)}$ (recall Definition 4.1), as explained in the full paper. By Theorem 3.8(iii) and Lemma 4.
), following some brief definitions and a high-level overview of the argument. Fix $d$. We identify each $u \in \{0,1\}^d$ with
Remark. The intuition behind $\mu$ is as follows. The reader may note that $\kappa_{\Delta_o}(Q_d) \le \mu + 1$, by reasoning analogous to that in Section 5.1. That is, for each vertex $u$ of $Q_d$ in increasing lexicographic order, add to an accumulator all edges $uv$ for which $v < u$.
There is another union sequence captured by $\mu$ as well.
particular, the root is $Q_d$ and the leaves are vertices), and each interior node is the union of its two children along with some additional edges corresponding to a coordinate cut. This tree describes a union sequence $S$ for $Q_d$: recursively obtain the graphs $L$ and $R$ corresponding to the children of $Q_d$, and then take $L \cup R$ and add the missing edges. Note that $\max_{H \in S} \Delta_o(H) = 2\max_{0 \le k \le d} \Delta_o(G(2^k)) \le 2\mu$.
For $e \in E(G)$ we can construct $T(e)$ in a similar manner, as explained in the full paper.
Lemma 6.15. For all $H, H' \subseteq G$ there exists a random, constant-depth circuit, independent of $X$, with $\tilde{O}(n^{\max(\Delta(H),\Delta(H'))+2})$ wires, that computes $T(H \cup H', \pi)$ from $T(H, \pi)$ and $T(H', \pi')$ w.h.p. for some $\pi$.
Let $\psi_i = \min(\varphi_i, \varphi'_i)$. For $0 \le d \le v(H \cap H')$ let $S_d$ be a depth-$d$ tree in which each node at depth $i < d$ (including $i = 0$) has $\tilde{O}(n^{\psi_i})$ children. Each node of $S_d$ has a (partial) label defined the same way as in $T$, such that no two nodes share a non-null label, and {π 1 l1 , . . ., π i li } extends to both $H$ and $H'$ (but not necessarily to $H \cup H'$) if and only if some node is labeled with $l$. Each leaf of $S_d$ with a non-null label $l$ is associated with the pair $(\tau, \tau')$ of subtrees of $T$ and $T'$ respectively whose labels are also $l$.
The tree $S_0$ is the single node $(T, T')$, and we can compute $S_{d+1}$ from $S_d$ as explained in the full paper. Let $S = S_{v(H \cap H')}$. For $d$ from $v(H \cap H') - 1$ down to 0, for each depth-$d$ node $N$ in $S$, hash (Lemma 6.11) the number of children of $N$ down from $\tilde{O}(n^{\psi_d})$ to $\tilde{O}(n^{\varphi_d})$, and if all of $N$'s children are null and $d > 0$ then remove $N$'s partial label. Finally, for each leaf $(\tau, \tau')$ of $S$, append a copy of $\tau$ to each leaf of $\tau'$, and put this in place of $(\tau, \tau')$ in $S$.
For each successive $H$ in an optimal union sequence, compute $T(H)$ as described above, and then apply a single OR gate to all leaves of $T(G)$.