Abstract 1 Introduction 2 Preliminaries 3 Algorithm for 𝑷𝟒-freeness 4 Algorithm and certification for 𝑷𝟓-freeness 5 Lower bounds for 𝑷𝒌-freeness 6 Ordered path detection and applications References

Distributed Complexity of 𝑷𝒌-Freeness:
Decision and Certification

Masayuki Miyamoto University of Tsukuba, Japan
Abstract

The class of graphs that do not contain a path on k nodes as an induced subgraph (Pk-free graphs) has rich applications in the theory of graph algorithms. This paper explores the problem of deciding Pk-freeness from the viewpoint of distributed computing.

For specific small values of k, we present the first 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithms specified for Pk-freeness, utilizing structural properties of Pk-free graphs in a novel way. Specifically, we show that Pk-freeness can be decided in O~(1) rounds for k=4 in the 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, and in O~(n) rounds for k=5 in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, where n is the number of nodes in the network and O~() hides a polylog(n) factor. The main technical contribution is a novel technique used in our algorithm for P5-freeness to distinguish induced 5-paths from non-induced ones, which is potentially applicable to other induced subgraphs. This technique also enables the construction of a local certification of P5-freeness with certificates of size O~(n). This improves O~(n3/2) by Bousquet and Zeitoun (TCS 2025), and is nearly optimal, given our Ω(n1o(1)) lower bound on certificate size.

For general k, we establish the first 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 lower bound, which is of the form n21/Θ(k). The n1/Θ(k) factor is unavoidable, in view of the O(n22/(3k+2)) upper bound by Eden et al. (Dist. Comp. 2022). Additionally, our approach yields the first superlinear lower bound on certificate size for local certification. This partially answers the conjecture on the optimal certificate size of Pk-freeness, asked by Bousquet et al. (arXiv:2402.12148).

Finally, we propose a novel variant of the problem called ordered Pk detection. We show that in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, the round complexity of ordered Pk detection is Ω~(n) for k5, and in contrast, proving any nontrivial lower bound for ordered P3 detection implies a strong circuit lower bound. As a byproduct, we establish a circuit-complexity barrier for Ω(n1/2+ε) quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 lower bounds for induced 4-cycle detection. This is complemented by our O~(n3/4) quantum upper bound, which surpasses the classical Ω~(n) lower bound by Le Gall and Miyamoto (ISAAC 2021).

Keywords and phrases:
subgraph detection, CONGEST model, local certification
Copyright and License:
[Uncaptioned image] © Masayuki Miyamoto; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Distributed algorithms
Related Version:
Full Version: https://arxiv.org/abs/2410.20353
Acknowledgements:
The author would like to thank François Le Gall for discussions and for commenting on an earlier version of this paper. The author also acknowledges the anonymous reviewers for their helpful comments.
Editors:
Ho-Lin Chen, Wing-Kai Hon, and Meng-Tsung Tsai

1 Introduction

1.1 Background

The subgraph detection problems in the distributed setting of limited communication bandwidth, including the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, are interesting and well studied problems. For a given specific small pattern graph H (the description of H is known to all nodes of the network), the goal of H detection is to decide if the network G contains H as a subgraph (or an induced subgraph). If the network G contains H, at least one node outputs “yes”, otherwise all nodes of the network output “no”. (H detection is often referred as H freeness, where at least one node outputs “no” if the network contains H, and otherwise all nodes of the network output “yes”. In this paper we abuse these two problems since they are essentially equivalent.) There are several variants previously considered in the literature: H listing requires to output all copies of H in G (each node outputs some copies of H and the union of all output lists are equivalent to the list of all copies of H in G). In recent years, significant progress in distributed subgraph detection has been made, and we now understand well the round complexities for detection of e.g., cliques and cycles. However, much less is known about induced paths. In the full version, we also summarize the previous results in this context.

Since it is known that in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, detecting non-induced k-paths can be done in O(1) rounds for constant k [27], our focus in this paper is specifically on the induced k-path detection problem. Induced k-paths are particularly more important than non-induced paths because graphs that do not contain an induced k-path (often called Pk-free graphs) have numerous applications in algorithmic graph theory, particularly in the centralized setting. The absence of induced k-node paths imposes structural constraints on graphs, which can be leveraged to develop efficient algorithms for problems like independent set or coloring [5, 20, 22, 31, 23, 36]. Despite its importance, few results are known for Pk-freeness in the distributed setting. The only known result we are aware of is the result of Nikabadi and Korhonen [34], who explored the multicolored variant of the induced k-path detection, where each node is colored by an integer from {1,2,,k}, and the goal is to detect an induced path on k different colors. They demonstrated that multicolored induced Pk detection requires Ω(n/logn) rounds for any k7. However, their proof fails for Pk-freeness. To the best of our knowledge, there is no nontrivial result111For the trivial case of k=3, P3 detection can be done in O(1) rounds. for Pk-freeness in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model beyond the general result by [14]: for any subgraph H on k nodes, there is a randomized 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm that solves induced H detection in O~(n223k+1) rounds.

Local certification of 𝑷𝒌-freeness

Another line of research focuses on local certification of Pk-freeness. In local certification, small labels called certificates are assigned to the nodes of the graph, and each node decides if the graph satisfies some property by using its local view. There are two important parameters, that is, the size of certificates and the locality of verification. In local certification of size s and locality , each node receives an O(s)-bit certificate from the external entity called the prover, and each node decides its binary output (accept or reject) only using the unique identifiers and certificates of its -hop neighbors. The scheme (with the locality =1) is introduced by [28] under the name of proof-labeling schemes, which is now a popular setting in the community of distributed computing:

Definition 1 (proof-labeling schemes).

A proof-labeling scheme (𝖯𝖫𝖲) of size s is a pair (c,𝒜) of functions c and 𝒜 of the input graph G=(V,E) such that

  • for each node u, the function c assigns a bit string c(u) of length s called the certificate of u;

  • for each node u, the function 𝒜, called the verification algorithm, depends on the certificates of -hop neighbors of u as well as the identifiers of these nodes, and decides the output of u. More precisely, let N(u)={v1,,vd} be the -hop neighbors of u, then

    𝒜(𝗂𝖽(u),c(u),𝗂𝖽(v1),c(v1),,𝗂𝖽(vd),c(vd))

    is the output of u.

Let 𝒫 be a graph property. We say that there is a 𝖯𝖫𝖲 that certifies 𝒫 if there exists a verification algorithm 𝒜 with locality (which outputs 𝖠𝖼𝖼𝖾𝗉𝗍 or 𝖱𝖾𝗃𝖾𝖼𝗍) satisfies the following conditions.

  • If G satisfies the property 𝒫, there exists a certificate function c such that all nodes in G output 𝖠𝖼𝖼𝖾𝗉𝗍.

  • If G does not satisfy the property 𝒫, for any certificate function c, at least one node in G outputs 𝖱𝖾𝗃𝖾𝖼𝗍.

When the locality of the verification algorithm is k2, there is a trivial way to certify Pk-freeness: by assigning the list of IDs of neighbors as a certificate, each node try to find a Pk by looking edges adjacent to its k/2-hop neighbors. If there is a Pk, then its central node can now detect it. Thus, our interest lies in cases where the path length is long relative to the locality. Recently, Bousquet, Cook, Feuilloley, Pierron, and Zeitoun [2] studied this topic by analyzing the relationship among path length, certificate size, and locality of verification. They provided various nontrivial upper and lower bounds on certificate size (we discuss details in Section 1.2). While local certification of various subgraph-related problems has been studied (e.g., [3, 11, 18, 19]), the only known result for local certification of Pk-freeness prior to [2] was for k=4: there is a 𝖯𝖫𝖲1 that certifies P4-freeness with O(logn) bits [17]. Moreover, [17] proved that every 𝖬𝖲𝖮1 property Π can be certified with O(log2n) bits and locality 1 if all graphs satisfying Π have bounded clique-width. Since Pk-freeness can be expressed by 𝖬𝖲𝖮1, this suggests that the difficulty gap between certifying P4-freeness and P5-freeness can be attributed to the fact that P4-free graphs have bounded clique-width, whereas P5-free graphs do not. After that, Bousquet and Zeitoun [4] constructed a 𝖯𝖫𝖲1 that certifies P5-freeness with O(n3/2) bits.

1.2 Our results

1.2.1 Result 1: 𝑷𝒌-freeness for 𝒌=𝟒 and 𝒌=𝟓

We first show the following upper bounds for k=4 and k=5. Throughout the paper, we use “a randomized algorithm” as an algorithm that outputs the correct answer with probability at least 2/3.

Theorem 2.

There exists a randomized algorithm that solves P4-freeness in the 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, running in O(lognloglogn) rounds.

Theorem 3.

There exists a randomized algorithm that solves P5-freeness in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, running in O(nlog2n) rounds.

Technical challenges

As mentioned above, there were no nontrivial algorithms for Pk-freeness for k4 in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model beyond the general O(n22/(3k+2)) upper bound [14]. Moreover, the previous 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithms for subgraph detection have been mostly focused on non-induced cases. Most techniques from these results seem to be useless for induced subgraph detecion. Typical examples are the expander decomposition applicable to cliques [8, 9], and the BFS search with threshold applicable to (non-induced) cycles [6, 18]. We thus need to exploit structural properties of P4/P5-free graphs, resulting our algorithms quite different from the other algorithms for subgraph detection. We believe that our approach hints new algorithms for other induced subgraphs.

We turn our attention to the framework of local certification. We get the following certification scheme for P5-freeness.

Theorem 4.

There is a 𝖯𝖫𝖲1 that certifies P5-freeness with certificates of size O(nlogn).

To obtain this result, we directly use the algorithm of Theorem 3. This is somewhat interesting, as 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithms for subgraph freeness cannot be used for local certification of subgraph freeness in general (unless they are 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithms). We also show that the size of our certificates is optimal, up to a subpolynomial factor:

Theorem 5.

For any k4+1, any 𝖯𝖫𝖲 for Pk-freeness requires certificates of size Ω(neO(logn)).

This result is proved via a combination of several known reduction techniques: we first reduce the nondeterministic three-party communication complexity of set-disjointness function to triangle freeness, which is then reduced to P5-freeness.

Independent and concurrent works

After submitting the first draft of this paper, we learned about the following independent and concurrent works: Bousquet and Zeitoun [4] proved that there is a 𝖯𝖫𝖲1 that certifies P5-freeness with certificates of size O(n3/2). Additionally, as noted in [4], Chaniotis, Cook, Hajebi, and Spirkl (independently and concurrently) obtained Ω(n1o(1)) lower bound for P5-freeness.

1.2.2 Result 2: 𝑷𝒌-freeness for general 𝒌

We provide the following lower bounds for Pk-freeness in the (randomized) 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model using the two-party communication complexity, which is the standard framework for distributed subgraph detection problems.

Theorem 6.

Let d be any positive integer.

  • For d2, Pk-freeness for k11d require Ω~(n21/d) rounds in the randomized 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model.

  • For d3, Pk-freeness for k8d requires Ω~(n21/d) rounds in the randomized 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model.

The n1/d=n1/Θ(k) factor is unavoidable, in light of the upper bound of n2Θ(1/k) established by [14]. We will show these lower bounds using the standard reduction from two-party communication complexity. Although the framework used here is classic, there are several challenges that makes the proof highly non-trivial, which we discuss in detail later in Section 5.2.

Additionally, our constructions apply to the nondeterministic two-party communication setting, leading to the following.

Theorem 7.

Let and d be positive integers.

  • For d2, any 𝖯𝖫𝖲 for Pk-freeness requires certificates of size Ω~(n21/d) for k4d+7d.

  • For d3, any 𝖯𝖫𝖲 for Pk-freeness requires certificates of size Ω~(n21/d) for k4d+4d.

Comparison with the recent results by Bousquet et al. [2]

Let us now compare our results with those of Ref. [2], who considered local certification of Pk-freeness and obtained the following:

Theorem 8 ([2]).

If the locality is 1, then:

  • Pk-freeness for k31 can be certified with certificates of size O(nlog3n);

  • Pk-freeness for k1431 can be certified with certificates of size O(n3/2log2n);

  • Pk-freeness for k4+3 requires certificates of size Ω(n/).

They also conjectured the following.

Conjecture 9 ([2]).

For all α>0, the optimal size for local certification of Pα-free graphs with locality is of the form Θ(n21/f(α)), for some unbounded increasing function f.

We summarize our results and previous results in Table 1. For =1, the results of Bousquet et al. [2] provide nontrivial lower bound for P7-freeness, and Bousquet and Zeitoun [4] provide nontrivial upper bound for P5-freeness. In Theorems 4 and 5, we show the nearly optimal bound of P5-freeness for =1.

For general k, Theorem 7 shows the first superlinear lower bounds, improving the previous linear lower bounds. Moreover, Theorem 7 partially answers Conjecture 9: f is at least linearly increasing.

Table 1: Summary of our results and previous results on local certification of Pk-freeness. Here n denotes the number of nodes in the network.
Path length Certificate size Model Reference
5 O(nlogn) 𝖯𝖫𝖲1 Thm 4
5 O(n3/2) 𝖯𝖫𝖲1 [4]
4+1 Ω(n1o(1)) 𝖯𝖫𝖲 Thm 5
8+14 Ω~(n3/2) 𝖯𝖫𝖲 Thm 7
4d+4d, d3 Ω~(n21/d) 𝖯𝖫𝖲 Thm 7
31 O(nlog3n) 𝖯𝖫𝖲 [2]
4+3 Ω(n) 𝖯𝖫𝖲 [2]
1431 O(n3/2log2n) 𝖯𝖫𝖲 [2]

1.2.3 Result 3: Ordered path detection and applications

We then define and study the following problem called ordered Pk detection.

Definition 10 (ordered Pk detection).

Each node of the graph has a color from {1,,k}, and the goal is to detect an induced path that consists of edges {(pi,pi+1)}i{1,,k1} on k nodes {pi}i{1,,k} where pi is colored by i.

Motivation

The definition of ordered path detection may seem somewhat artificial; however, it can be motivated in a manner similar to that of multicolored path detection studied in [34]. Algorithms for the multicolored/ordered variants of these problems with color-coding techniques [1] are often used to address the standard version. For example, the state-of-the-art 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm for detecting 2k-node cycles [6, 16] actually detects the ordered variant of 2k-cycles. Consequently, lower bounds for the ordered variant reflect the limitations of algorithms that employ color-coding. Moreover, as demonstrated in Theorem 12, we will show that ordered path detection also provides some nontrivial insight for subgraph detection in the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, which is not restricted to algorithms using color-coding.

Contribution

We first prove the following lower bound for k=5.

Theorem 11.

Any randomized algorithm that solves ordered Pk detection for k5 requires Ω~(n) rounds in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model.

We then focus on ordered P3 detection. Our next finding is that, similarly to triangle detection, proving non-trivial lower bounds for ordered P3 detection is difficult. This result also shows a circuit complexity barrier for induced C4 detection.

Theorem 12 (Informal).

For any constant ε>0, proving any lower bound of the form Ω(nε) for ordered 3-path detection in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model or Ω(n1/2+ε) for induced C4 detection in the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model implies super-linear lower bounds on circuit complexity for an explicit family of boolean functions.

Note that our circuit complexity barrier for induced C4 detection is in the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model [30]. We then complement this result by showing a nontrivial upper bound for induced C4.

Theorem 13.

In the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, induced C4 detection can be solved in O(n3/4) rounds with high probability.

Previously, no nontrivial upper bound for induced C4 detection was known, and our bound for induced C4 detection beats a classical Ω~(n) lower bound [29].

2 Preliminaries

We write [n]={1,2,,n}. We let N(v) be the set of neighbors of v. For a graph G=(V(G),E(G)), we say that H=(V(H),E(H)) is a subgraph of G if there is an injective function ϕ:V(H)V(G) such that (u,v)E(H)(ϕ(u),ϕ(v))E(G) for any pair of nodes u,vV(H). We say that H is an induced subgraph of G if there is an injective function ϕ:V(H)V(G) such that (u,v)E(H)(ϕ(u),ϕ(v))E(G) for any pair of nodes u,vV(H). Let Pk be a path on k nodes. We say that G is Pk-free if G does not contain Pk as an induced subgraph. Throughout the paper, we assume that the input graph is connected, otherwise we can treat each connected component separately.

The 𝗖𝗢𝗡𝗚𝗘𝗦𝗧 model and variants

In the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model [35], each node of the network G=(V,E) has a distinct O(logn)-bit identifier and can communicate with its neighbors in a synchronized manner. In each round each node (1) does some local computation and (2) sends an O(logn)-bit message to each of its neighbors. In the initial state, each node only knows its own unique identifier (represented by O(logn) bits) and input, and the unique identifiers of its neighbors. The 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model is a weaker model in the sense that each node can only broadcast a O(logn)-bit message in each round. In the 𝖼𝗈𝗇𝗀𝖾𝗌𝗍𝖾𝖽𝖼𝗅𝗂𝗊𝗎𝖾 model [32], we allow all-to-all communication in each round. That is, the communication topology is always the n-node clique, and the input graph G is a subgraph of the clique. In the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model [30], each node can send O(logn)-qubit quantum message to each of its neighbors, instead of O(logn)-bit classical message.

3 Algorithm for 𝑷𝟒-freeness

In this section we prove Theorem 2.

It is well known that a graph is P4-free iff it is a cograph [10]. A cograph is defined as a graph that is constructed using the following rules:

  • A single-node graph K1 is a cograph;

  • For cographs G=(VG,EG) and H=(VH,EH), its disjoint union (VGVH,EGEH) is a cograph;

  • For cographs G=(VG,EG) and H=(VH,EH), its join (VGVH,EGEH(VG×VH)) is a cograph;

Using this characterization, we can construct a low-congestion spanning tree for any cograph as follows. A similar property on the spanning tree of cographs is also used in distributed (interactive) proofs for P4-freeness [17, 33]. We use the following results on the balls into bins problem.

Lemma 14 (Balls into bins problem [21]).

Assume that there are n bins and n balls. Each ball uniform randomly selects one bin which it places into. Then, with probability at least 11/n, all bins have at most O(lognloglogn) balls.

Lemma 15.

There exists a 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm that runs in O(1) rounds and performs the following:

  • If some node rejects, then G is not a connected cograph.

  • Otherwise, with high probability, it outputs a rooted spanning tree of depth 2, where each node, except the root, has at most O(lognloglogn) children.

Proof.

We first reject if the diameter of the graph is not O(1), since every connected cograph has diameter at most 2. This can be accomplished in O(1) rounds by running a BFS search from the node with minimum identifier as the depth of the BFS tree provides a 2-approximation of the diameter. From the definition of cographs, a connected cograph G=(V,E) is constructed from the join operation on two distinct node sets V1 and V2 where V=V1V2 and |V1||V2|. Let r be the node with the maximum degree in G. Note that r can be found in O(1) rounds since the diameter of the graph is constant. If deg(r)<n/2, the algorithm rejects (if G is a connected cograph, then the maximum degree is at least n/2 as each node in V2 has degree at least |V1|n/2). Consider the tree rooted at r where all neighbors of r are at depth 1. As below, we can construct the desired spanning tree with high probability if the graph is a connected cograph.

  • Case 1: rV2. Define V2=V2\(N(r){r}). Observe that V1×V2E and |V2||V1|. Each vV2 uniform randomly selects one node from its neighbors as its parent in the tree. Now, each depth-1 node u (bin) is selected as the parent of v (ball) with probability at most 1/|V2| as deg(v)|V1|. Then from Lemma 14, all depth-1 nodes have at most O(lognloglogn) children with high probability.

  • Case 2: rV1. Define V1=V1\(N(r){r}). Since deg(r)|V1|, we have

    |V1|=|V1|(deg(r)|V2|)1|V2|.

    Each node v in V1 selects one of its neighbors as its parent uniform randomly. Now, each depth-1 node u (bin) is selected as the parent of v (ball) with probability at most 1/|V2| as deg(v)|V2|. Then from Lemma 14, all depth-1 nodes have at most O(lognloglogn) children with high probability.

Proof of Theorem 2.

Ref. [25] demonstrated a protocol in the randomized multiparty simultaneous messages model (also known as the distributed sketching model), where each node sends a message of size O(logn) to a referee. The referee can then determine if G is a cograph with high probability. We can simulate this protocol in the constructed depth-2 spanning tree, with the root r acting as the referee. Since each depth-1 node has at most O(lognloglogn) children, sending all messages to r is completed in O(lognloglogn) rounds.

4 Algorithm and certification for 𝑷𝟓-freeness

In this section we give an overview of the proof strategy for Theorem 3 and Theorem 4. Since these theorems share the same proof strategy, we focus on our 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm for P5-freeness (Theorem 3). Full proof of Theorem 4 will be found in the full version.

Our algorithm aims to efficiently detect whether a given graph is P5-free, focusing on cases where the graph has a diameter of 2 or 3 (observe that if the diameter is at least 4, then it must not be P5-free). We treat these cases separately due to their distinct properties.

4.1 Subgraph freeness in graphs with diameter two

Here we assume that the graph G=(V,E) with n nodes and m edges has diameter 2. Let rV be the node with maximum degree. Then deg(r)=Δ=Ω(m/n). Divide the node set V={r}V1V2 where Vi is the set of nodes whose distance to r is i. We first consider that every node u broadcasts the list of its neighbors in O(n) rounds. After that r learns all the edges connected to {r}V1. In the remaining part we show that, with high probability, r can learn the edge set E(V2×V2) in another O(n) rounds. Therefore, r can locally decide if the graph is P5 free.

Assume that we divide V2 into m/n subsets V2i for i[m/n] as follows (assuming m/n is an integer): each node vV2 chooses an integer i[m/n] uniformly at random and joins V2i. Next, we label the set V1 as {vk|k[Δ]} where 𝖨𝖣(vj) is the k-th smallest among IDs of nodes in V1. The partition of V2={V2i} and the label of V1 nodes with respect to the order of IDs are informed to all the nodes in the network in O(n) rounds. We use the following lemma.

Lemma 16 ([8]).

Consider a graph with m¯ edges and n¯ vertices. We generate a subset S by letting each vertex join S independently with probability p. Suppose that the maximum degree is Δm¯p/20logn¯ and p2m¯400log2n¯. Then, with probability 11/nc for sufficiently large c, the number of edges in the subgraph induced by S is at most 6p2m¯.

Here we set n¯=n, m¯=m, p=n/m. Clearly, p2m¯400log2n¯ holds. As Δ<n, one can verify that Δm¯p/20logn¯ holds if m¯=m400nlog2n. We assume that m400nlog2n holds since otherwise a simple O(m)=O~(n)-round algorithm suffices. Then, with high probability, |EV2i×V2j|=O(p2m)=O(n) for all i,j[m/n]. Consider the node vkV1 collects the edge set EV2i×V2j for k=i+(j1)m/n. To do this, assume that (u,v)EV2i×V2j. Then there exists a node w such that (u,w),(w,vk)E since the diameter of the graph is 2. w knows the edge (u,v) because of the previous O(n) round communication and thus w can send it to vj. Since |EV2i×V2j|=O(n), this is done in O(n) rounds. Finally, vk sends EV2i×V2j to r in O(n) rounds. Now we have established the following theorem.

Theorem 17.

In graphs with diameter 2, for any subgraph H, (non-induced and induced) H-freeness can be solved in O(nlog2n) rounds in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model with high probability.

It is noteworthy that this result does not hold when the diameter is greater than 2, since for any constant ε>0, there is a pattern graph H such that H-freeness in graphs with diameter 3 requires Ω(n2ε) rounds [15, 29].

4.2 𝑷𝟓-freeness in graphs with diameter three

A High-Level Overview of Our Approach

In the case of a graph with diameter 3, the larger diameter necessitates more tricky definitions and case analysis, resulting in a more involved proof. We outline the main ideas of our proof here before giving details.

Let Vi be the set of nodes at distance i from r, for i{1,2,3}. The edges (except the ones incident to r) are partitioned as Ei,j=E(Vi×Vj) for ij. We employ a procedure 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 (Algorithm 1) that runs in O(n) rounds, enabling r to gather E1,1 and E1,2.

Algorithm 1 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍.

r: the node with maximum degree.
T: the rooted spanning tree created by the BFS traversal from r.

Step 1

Each node broadcasts the list of its neighbors in O(n) rounds.

Step 2

Consider the partition of V into m/n subsets Vi for i[m/n] as follows: Each node vV chooses an integer i[m/n] uniformly at random, joins Vi and tells it to all other nodes so that all nodes in the graph learn the partition 𝒱={Vi}i[m/n]. The node r broadcasts the set of nodes that contains all nodes in V1 in ascending order with respect to their IDs so that all nodes in the graph agree with vkV1 such that 𝖨𝖣(vk) is the k-th smallest ID in V1.

Step 3

Define E~2,2 by the set of edges (u,v)E2,2 satisfying at least one of the following:

  1. 1.

    N(vk)(N(u)N(v)) where uVi and vVj for the partition 𝒱, and vk is a neighbor of r such that the integer k is assigned in Step 2 for k=i+(j1)m/n,

  2. 2.

    (N(u)N(v))V3.

We also define Fbad by the set of edges (u,v)E2,2\E~2,2 satisfying N(u)V1=N(v)V1. Each node u tells the neighbors the list of its incident edges in E~2,2 and Fbad.

Step 4

For each edge (vk,w), w sends the edge (u,v) between Vi and Vj to vk if w received (u,v) in Step 1. After that, each vkV1 sends these edges between Vi and Vj to r. This takes O(n) rounds since the number of edges between Vi and Vj is O(n) w.h.p., due to Lemma 16.

Step 5

r computes |E2,3|,|E~2,2|,|Fbad|. This is done by using the tree T. r then rejects if the following holds:

Condition 1

There is an edge in E2,3E~2,2 that is not sent to r.

For each edge (u,v)E2,2\E~2,2, u rejects if the following holds:

Condition 2

N(u)V1N(v)V1.

For each edge (u,v)Fbad, u rejects if the following holds:

Condition 3

There is a node wN(u)V2 such that N(u)V1N(w).

For each edge (u,v)E3,3, u rejects if the following holds:

Condition 4

N(u)V2N(v)V2.

For this procedure, we have the following.

Lemma 18 (Restated in Lemma 22).

If any of the conditions in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 (Conditions 1-4) are met, then G contains an induced P5.

After the execusion of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍, if r detects any missing edge in (E2,2\Fbad)E2,3 – where Fbad is a carefully defined subset of E2,2 (defined in Step 3 of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍) – it can safely conclude that the graph is not P5-free. This is because (i) the invalidity of Condition 2 of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 ensures that E2,2\Fbad=E~2,2, and (ii) the invalidity of Condition 1 of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 ensures that all edges in E~2,2E2,3=(E2,2\Fbad)E2,3 have been sent to r. Therefore, if no node rejects in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍, r learns E\(FbadE3,3).

The following lemmas shows that, for any P5 containing an edge from FbadE3,3, detection is performed by nodes connected to the endpoints of these edges.

Lemma 19 (Restated in Lemma 23).

Assume that no node rejects in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. If G contains an induced P5 with at least one edge from Fbad, then there is a node that detects an induced P5.

Lemma 20 (Restated in Lemma 24).

Assume that no node rejects in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. If G contains an induced P5 with at least one edge from E3,3, then there is a node that detects an induced P5.

For any P5 composed solely of edges from E\(FbadE3,3), detection is performed by node r, which now has access to E\(FbadE3,3). The key challenge here is distinguishing between proper P5s (induced by the entire edge set E) and improper P5s (induced by E\(FbadE3,3) but not by E). We address this by having r count the improper patterns and subtract this count from the total number of P5s it finds. As illustrated in Figure 1, there are 17 non-isomorphic patterns for possible improper P5s. We introduce the concept of a “dangerous” induced subgraph. An induced copy of H in the input network G=(V,E) is dangerous iff it induces a P5 in E\(FbadE3,3). The following lemma shows that the number of such dangerous patterns can be counted efficiently.

Lemma 21 (Restated in Lemma 25).

Let ={Hi}i[17] be a set of five-node graphs that contain P5 as a (non-necessarily induced) subgraph as in Figure 1. Then, for any H, the number of induced copies of H in the graph that are dangerous can be counted (by the node r) in O(n) rounds.

This yields the correct count of proper P5s, since the number of P5s in G is now obtained by subtracting the number of dangerous patterns in G from the number of P5s in (V,E\(FbadE3,3)).

4.2.1 Algorithm for 𝑷𝟓-freeness in graphs with diameter three

Assume that the diameter of the input graph (with n nodes and m edges) is 3 and m400nlog2n. Similarly to the previous case, we divide the node set into V={r}V1V2V3 where Vi is the set of nodes whose distance to r is i. For all i,j{1,2,3}, ij, we also define Ei,j by the edge set between Vi and Vj. We use a subprotocol 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 described in Algorithm 1, and analyze it here.

Lemma 22.

If any of the conditions in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 (Conditions 1-4) are met, then G contains an induced P5.

Proof.
Condition 1

Assume that there is an edge in E2,3 that is not sent to r in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. Then for some i,j,k satisfying k=i+(j1)m/n, there exists an edge (u,v) between Vi and Vj such that no node in N(u)N(v) is connected to vk. W.l.o.g., we assume that uV2 and vV3. Then there exists a node wN(u)V1 that is not connected to vkV1. {vk,r,w,u,v} induces a P5.

Assume that there is an edge in E~2,2 that is not sent to r in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. First, observe that if an edge (u,v) in E~2,2 satisfies the first condition of E~2,2, it is sent to r in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. Assume that (u,v) satisfies the second condition of E~2,2 and is not sent to r in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. Then w.l.o.g., there are two nodes wN(u)V3 and xN(u)V1. Furthermore, assume that uVi, vVj, and k=i+(j1)m/n. {vk,r,x,u,w} induces a P5.

Condition 2

Suppose that, for some i,j,k satisfying k=i+(j1)m/n, there exists an edge (u,v)E2,2\E~2,2 between Vi and Vj satisfying N(u)V1\N(v)V1. Then there exist a node wV1(N(u)\N(v)) that is not connected to vkV1. We can conclude that {vk,r,w,u,v} is an induced P5.

Condition 3

Let (u,v)Fbad(Vi×Vj). For a node w(N(u)N(v))V2 such that N(u)V1N(w), if there is y(N(u)V1)\N(w), then {vk,r,y,u,w} is an induced P5 for k=i+(j1)m/n.

Condition 4

Let w(N(u)\N(v))V2. Then there exists a node xN(w)V1. {r,x,w,u,v} is a P5.

Now, assuming that 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 does not reject, the following properties hold for each edge (u,v)Fbad between Vi and Vj.

Property 1

N(u)V1=N(v)V1,

Property 2

N(vk)(N(u)N(v))= for k=i+(j1)m/n,

Property 3

(N(u)N(v))V3=,

Property 4

for each w(N(u)N(v))V2, N(u)V1=N(v)V1N(w).

Lemma 23.

Assume that no node rejects in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. If G contains an induced P5 with at least one edge from Fbad, then there is a node that detects an induced P5.

Proof.

We write {(pi,pi+1)}i{1,2,3,4} be the four edges constructing some induced P5. Here we assume that exactly one edge of them is bad. First, consider that (p2,p3)Fbad. Fix one node vN(p2)N(p3)V1. Then from Property 1 and Property 3, p4 is in V2. Moreover, p4N(v) from Property 4. Similarly, p1V2N(v). Now observe that v knows {(pi,pi+1)}i{1,2,3,4} and v can detect the path.

We then consider that (p1,p2)Fbad. Similarly to the above case, we can assume that p3V2. We distinguish the following cases.

Case 1: p4V2.

If p5V1, p3 can detect the path since p3 knows p5N(p2) and thus p5N(p1) due to Property 1. If p5V3, p3 can detect the path due to Property 3. If p5V2, for each node vN(p1)V1, vN(p2)N(p3). If (p4,v)E or (p5,v)E, v can detect the path. Otherwise {p1,v,p3,p4,p5} is an induced P5. p3 can detect this path as follows. Since p3 knows (p1,p2)Fbad and (v,P5)E, p3 can conclude that (p1,p5)E due to Property 4.

Case 2: p4V1.

If p5V1, then p3 can detect the path due to Property 1. Assume p5V2. Fix arbitrary node vN(p1)N(p2)N(p3)V1. If p5N(v), then v can detect the path. Otherwise, p3 knows that p5N(v) and (p2,p5)E. p3 can detect the path since p3 can verify that (p1,p5)E due to Property 4.

Case 3: p4V3.

If p5V3, then p3 can detect the path since p1,p2 do not have a neighbor in V3. Assume p5V2. Fix arbitrary node vN(p1)N(p2)N(p3). If p5N(v), then v can detect the path. Otherwise, p3 knows that p5N(v). From Property 4, we have p5N(p1)N(p2). p3 can detect the path.

Therefore, any induced P5 that contains exactly one edge from Fbad can be detected by some node.

Now we consider a path {(pi,pi+1)}i{1,2,3,4} containing at least two edges from Fbad. If there are two consecutive bad edges in the path, (pi,pi+1),(pi+1,pi+2), there is a node uV1N(pi)N(pi+1)N(pi+2) from Property 1. u is also connected to pi+3 when i2 or pi1 when i2. u can detect the path. The remaining case is the path containing two bad edges that are not consecutive. We have two distinct cases.

Case 4: (p1,p2),(p3,p4)Fbad.

From Property 1 and Property 4, there is a node uN(p1)N(p2)N(p3)N(p4) who can detect the path.

Case 5: (p1,p2),(p4,p5)Fbad.

This is similar to a subcase of Case 1. For each node vN(p1)V1, vN(p2)N(p3). If (p4,v)E or (p5,v)E, v can detect the path. Otherwise {p1,v,p3,p4,p5} is an induced P5. p3 can detect this path as follows. Since p3 knows (p1,p2)Fbad and (v,P5)E, p3 can conclude that (p1,p5)E due to Property 4.

Lemma 24.

Assume that no node rejects in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍. If G contains an induced P5 with at least one edge from E3,3, then there is a node that detects an induced P5.

Proof.

We write {(pi,pi+1)}i{1,2,3,4} be the four edges constructing some induced P5. Consider that (p2,p3)E3,3 for instance. p1,p4V3 since Condition 4 of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 does not hold. Similarly, since (p3,p4)E3,3 we have p5V3. Therefore there exists vN(p1)N(p2)N(p3)N(p4)N(p5)V2 that can detect the path. The analysis of the other cases is done in the same way – we can show that all path edges are from E3,3. Given Lemmas 23 and 24, we can detect any induced P5 that involves edges from FbadE3,3.

Our focus now shifts to the detection of an induced P5 solely composed of edges from E\(FbadE3,3). In this context, we need to ensure that any P5 induced by E\(FbadE3,3) does not actually arise as an artifact of the removal of FbadE3,3, but is genuinely induced by E. To address this, we consider all five-node induced subgraphs of G. As illustrated in Figure 1, there are exactly 17 distinct five-node patterns ={H1,,H17}, each containing a P5 as a subgraph. See Figure 2 for the illustration.

We introduce the concept of a “dangerous” induced subgraph. An induced copy of H in the input network G=(V,E) is dangerous iff it induces a P5 in E\(FbadE3,3).

Lemma 25.

Let ={Hi}i[17] be a set of five-node graphs that contain P5 as a (non-necessarily induced) subgraph as in Figure 1. Then, for any H, the number of induced copies of H in the graph that are dangerous can be counted (by the node r) in O(n) rounds.

Proof.

We first note that there is no dangerous copy of H including edges from both Fbad and E3,3, since two endpoints of a bad edge do not have neighbors in V3 due to Property 3 of Fbad.

We consider the case of H1 in Figure 1, i.e., 5-node cycles C5. Let

{(c1,c2),(c2,c3),(c3,c4),(c4,c5),(c5,c1)}

be five edges that form a dangerous C5 where (c5,c1)Fbad and (ci,ci+1)E(FbadE3,3) for i{1,2,3,4}. Then, from Property 1 and Property 3 of Fbad, we can assume that c2,c4V2. Moreover, from Property 4 all the nodes in N(c1)V1=N(c5)V1 are neighbors of c2 and c4. Let u be the node in N(c1)V1=N(c5)V1 with minimum identifier. Then u can detect the cycle as it knows all the neighbors of c1,c2,c4,c5. Each node counts the number of such cycles, and sends it to r. In this way, r can count the number of dangerous C5 in the graph. Note that it is impossible to have (c5,c1)E3,3 due to the fact that Condition 4 of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 does not hold: If (c5,c1)E3,3, then c2,c4V3. It contradicts the assumption that (c1,c2)E3,3.

For any other pattern graph from {H2,,H17} illustrated in Figure 1, we can see that at least one node of the pattern is connected to at least three other nodes of the pattern who can detect the pattern. This completes the proof.

Figure 1: All different 5-node graphs that contain a P5 as a subgraph.
Figure 2: An illustration of H that contains exactly one bad edge considered in Lemma 25. Thick edges represent bad edges. For instance, an induced C5 (the leftmost graph) contains exactly one edge from FbadE3,3, and the remaining edges from E\(FbadE3,3), then removing FbadE3,3 from the graph creates a new induced P5.
Proof of Theorem 3.

In O(n) rounds, we can compute the diameter of the graph [24]. If the diameter exceeds 3, the network immediately rejects. If the diameter is two, we apply Theorem 17, which establishes that P5-freeness can be verified in O(nlog2n) rounds with high probability. So we focus on the case where the diameter is exactly 3.

First, if the number of edges in the graph satisfies m400nlog2n, the node r collects all edges in O(m) rounds and decide if G is P5-free. Otherwise, the network runs 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 in O(n) rounds. Assuming 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 does not reject, r knows all the edges in E\(FbadE3,3) as E2,2\E~2,2=Fbad.

Next, we assume that there is no induced P5 that contains at least one edge from FbadE3,3 as such a P5 can be detected by some node due to Lemma 23 and Lemma 24.

For each Hi, the node r counts the number of induced copies of Hi that are dangerous as in Lemma 25. Let t(Hi) be the count and let t=i[17]t(Hi) be the sum of them.

Finally, r counts the number of P5’s induced by E(FbadE3,3). If it is equal to t, we can conclude that G is P5-free as removing FbadE3,3 from G increases the number of induced P5’s by t. Otherwise, if the count is strictly larger than t, G is not P5-free.

4.3 Local certification of 𝑷𝟓-freeness

We use the following standard technique.

Lemma 26.

Let G=(V,E) be an n-node connected graph, and T be a spanning tree rooted at arbitrary node t. For each uV, let a(u)[poly(n)] be an integer assigned to u. Then, there is a 𝖯𝖫𝖲1 that computes uVa(u) with certificates of size O(logn).

Proof.

Let Tu be a subtree of T rooted at u. The certificate to u consists of two O(logn)-bit values b and b(u). If u is a leaf, u rejects when a(u)b(u). If u is not a leaf, u rejects when b(u)a(u)+vN(u)Tub(v). The root node t rejects when bb(t). If no node rejects, b=uVa(u).

Proof of Theorem 4.

We first describe the certificates and then explain the verification process and its correctness.

The certification comprises the following components, each represented by O(nlogn) bits. We assume that m/n is an integer (otherwise, we use m/n).

  • 𝖭𝖾𝗂𝗀𝗁𝖻𝗈𝗋𝗌: For each node u, 𝖭𝖾𝗂𝗀𝗁𝖻𝗈𝗋𝗌(u) consists of a set of neighbors of u.

  • 𝖣𝗂𝖺𝗆𝖾𝗍𝖾𝗋: 𝖣𝗂𝖺𝗆𝖾𝗍𝖾𝗋(u) consists of a BFS tree starting from u.

  • 𝖲𝗉𝖺𝗇𝗇𝗂𝗇𝗀𝖳𝗋𝖾𝖾: 𝖲𝗉𝖺𝗇𝗇𝗂𝗇𝗀𝖳𝗋𝖾𝖾(u) describes a spanning tree rooted at the node with maximum degree, denoted r. Thus, it is identical to 𝖣𝗂𝖺𝗆𝖾𝗍𝖾𝗋(r).

  • 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇: 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇(u) provides a partition of nodes. The prover divides V into m/n subsets 𝒱={Vi}i[m/n] so that the number of edges between Vi and Vj is O(n) for all i,j[m/n]. This partition exists due to Lemma 16. 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇(u) includes the partition 𝒱 and a set containing all the neighbors of r in ascending order with respect to their IDs. Additionally, 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇(u) includes encoded values in O(logn)-bit: m~2,2, mbad, m2,3, m3,3, m~2,2(u), mbad(u), m2,3(u), m3,3(u). These values are used to verify the size of edge subsets. Finally, 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇(u) includes, for each H, the numbers c(H) and cu(H). These values are used to count the number of dangerous copy of H in the graph.

  • 𝖤𝖽𝗀𝖾𝗌: Let vk be the neighbor of r whose degree is the k-th largest among all the neighbors of r for k[m/n]. 𝖤𝖽𝗀𝖾𝗌(vk) then contains O(nlogn)-bit descriptions of all edges between Vi and Vj where k=i+(j1)m/n.

Verification

𝖣𝗂𝖺𝗆𝖾𝗍𝖾𝗋 is used to certificate the graph diameter is at most 3. This is done by checking the structure of the trees as in [7].

Certifying the partition of nodes and edges.

Each node u checks that the partition of nodes and the set representing all the neighbors of r in ascending order with respect to their IDs written in 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇(u), and the spanning tree written in 𝖲𝗉𝖺𝗇𝗇𝗂𝗇𝗀𝖳𝗋𝖾𝖾(u) are the same as of its neighbors, so that all nodes agree the same partition, and the distance from r for all nodes. Now, given a partition of nodes in 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇 and the spanning tree in 𝖲𝗉𝖺𝗇𝗇𝗂𝗇𝗀𝖳𝗋𝖾𝖾, endpoints of each edge e determine which subset of

E1,1E1,2E~2,2FbadE2,3E3,3

the edge e belongs to. Next, each node computes the sizes of edge sets |E1,1|, |E1,2|, |E~2,2|, |Fbad|, |E2,3|, |E3,3| using the protocol of Lemma 26 and the values written in 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇 as follows: For example, we can compute |E2,3| by setting b=|E2,3|, b(u)=m2,3(u), and a(u) be the number of edges in E2,3 incident to u.

Certifying the assigned edges.

Let vk be the neighbor of r that has the k-th minimum ID among all the neighbors of r. Since all nodes agree on the set representing all the neighbors of r in ascending order with respect to their IDs, all nodes know which is vk. For each vk, the validity of 𝖤𝖽𝗀𝖾𝗌(vk) can be checked as follows. First, each vk checks if the following holds.

  1. 1.

    𝖤𝖽𝗀𝖾𝗌(vk)E(Vi×Vj) for k=i+(j1)m/n. vk can check this since all nodes agree on the partition 𝒱.

  2. 2.

    For each edge e𝖤𝖽𝗀𝖾𝗌(vk), at least one of its endpoints is at most distance 2 from vk. vk can check this since vk knows nodes with distance 2 from vk by looking at 𝖭𝖾𝗂𝗀𝗁𝖻𝗈𝗋𝗌 of its neighbors.

If these conditions do not hold then vk rejects. Each neighbor u of vk rejects if 𝖤𝖽𝗀𝖾𝗌(vk) contains non-edge that includes u for some uN(u) by looking at 𝖭𝖾𝗂𝗀𝗁𝖻𝗈𝗋𝗌(u). This ensures that the edge set 𝖤𝖽𝗀𝖾𝗌(vk) is indeed the set of edges between Vi and Vj such that at least one of the endpoints are a 2-hop neighbor of vk. Each node then checks whether the four conditions described in 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 hold: r checks Condition 1, and each node u checks Condition 2,3, and 4 for each edge (u,v). If any condition holds, the network rejects. This part of verification ensures that r learns E\(FbadE3,3).

Detecting an induced 𝑷𝟓 containing 𝑭𝒃𝒂𝒅𝑬𝟑,𝟑.

Assuming Conditions 1-4 of 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 are not met, some node can now detect an induced P5 that contains at least one edge from FbadE3,3 if exists, as in Lemma 23 and Lemma 24 since all necessary information from 𝒫𝖼𝗈𝗅𝗅𝖾𝖼𝗍 is included in its certificate or the certificates of its neighbors.

Detecting an induced 𝑷𝟓 solely composed of 𝑬\(𝑭𝒃𝒂𝒅𝑬𝟑,𝟑).

The final components of 𝖯𝖺𝗋𝗍𝗂𝗍𝗂𝗈𝗇(u) are used to count the number of dangerous H for H, as described in Lemma 25. This is accomplished using Lemma 26 as follows. For simplicity, let us consider the case of induced C5 (H1 in Figure 1). In our 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm, as in the proof of Lemma 25, for each such cycle, the node in V1 with minimum identifier that is incident to the cycle detects it, and report the number of detected cycles to the node r through the BFS tree to compute the entire number of dangerous C5’s in the graph. Setting b=c(H1), b(u)=cu(H1), and a(u) be the number of dangerous C5’s detected by u in Lemma 26, we can certify it. r then determines the number of P5’s that increase by removing FbadE3,3 (i.e., the number of dangerous copies of H for all H) and compares this with the number of induced P5’s in (V,E\(FbadE3,3)). Based on this comparison, r can decide if G is P5-free.

5 Lower bounds for 𝑷𝒌-freeness

5.1 Proof of Theorem 5

Our proof leverages a reduction from the three-party nondeterministic communication complexity of the set-disjointness problem in the number-on-forehead (NOF) model. In this model, each player sees the inputs of the other two players but not their own. The players communicate by writing bit strings on a shared blackboard, and the communication cost of the protocol is the total number of bits written.

As a first step, we use the reduction from set-disjointness to triangle-freeness established by Drucker et al.[13]. The key graph used in their reduction is a tripartite graph GRS=(ABC,E) where (1) |A|=|B|=n and |C|=n/3 for arbitrary integer n; (2) GRS contains t=n2eO(logn) triangles 𝒯={T1,,Tt}; (3) each edge of the graph belongs to exactly one triangle. This graph construction relates to the Ruzsa–Szemerédi problem[26] in extremal graph theory and has been well studied.

For XA,XB,XC{0,1}t, we define G(XA,XB,XC) as the graph obtained by removing edges from GRS as follows. Let Ti=(eA,eB,eC) be a triangle such that eA is an edge between B and C, eB is an edge between A and C, and eC is an edge between A and B. For P{A,B,C}, we remove the edge eP iff the i-th bit of XP is 0.

We then construct another graph G(XA,XB,XC) as follows. This construction is obtained by tweaking the construction in [12, 34]. Let G¯(XA,XB,XC) be the complement graph of G(XA,XB,XC). For each node u in G¯(XA,XB,XC), we have a triangle u1,u2,u3 in G(XA,XB,XC). We add edges (ui,vi) for all i{1,2,3}. For each edge (u,v) in G¯(XA,XB,XC), we add edges (ui,vj) for all i,j{1,2,3} in G(XA,XB,XC). Additionally, there are two extra nodes x,y. x is connected to u1 and u2 for all uV(G¯(XA,XB,XC)). y is connected to u2 and u3 for all uV(G¯(XA,XB,XC)). Thus, the number of nodes in G(XA,XB,XC) is O(n).

Lemma 27.

G(XA,XB,XC)) contains an induced P5 if and only if XA, XB and XC are not disjoint.

Proof.

Suppose that there is an index i such that the i-th bit of XA, XB and XC are all 1. Then we have a triangle Ti in the graph G(XA,XB,XC) and thus have an independent set {u,v,w} of size 3 in G¯(XA,XB,XC). {u1,x,v2,y,w3} induces P5 in G(XA,XB,XC).

Conversely, assume that G(XA,XB,XC) contains an induced P5. Let us focus on an independent set of size 3 in the path. The set does not contain x and y at the same time, since each other node is connected at least one of x and y. Moreover, the set cannot contain x, as otherwise the other two nodes are written u3 and v3, but it is impossible since u3 and v3 are always connected. A similar contradiction occurs if the set contains y. Therefore three nodes in the set are written u1, v2, w3. From the construction u,v,w in G¯(XA,XB,XC) is an independent set. This means that XA, XB and XC are not disjoint.

We are now ready to prove Theorem 5.

Proof of Theorem 5.

Suppose that P5-freeness can be certified with certificates of size s. We construct a non-deterministic communication protocol for set-disjointness with communication complexity O(ns). Suppose that Alice, Bob, and Charlie, who receive the inputs XA, XB, and XC, respectively, simulate G(XA,XB,XC) as follows:

  • For all uA in G¯(XA,XB,XC), Alice simulates nodes u1,u2,u3 in G(XA,XB,XC).

  • For all uB in G¯(XA,XB,XC), Bob simulates nodes u1,u2,u3 in G(XA,XB,XC).

  • For all uC in G¯(XA,XB,XC), Charlie simulates nodes u1,u2,u3 in G(XA,XB,XC).

  • All players simulates x and y.

Since the incident edges of simulated nodes by each player are independent from its input, each player can construct the inputs of its simulated nodes locally. Then the players simulate the certification for P5-freeness. Since the certificate size is s and each player simulates O(n) nodes, the length of the certificates for each player is O(ns). In order to determine the outputs of the simulated nodes, each player writes its certificates to the shared blackboard. From Lemma 27, the players can compute set-disjointness of size n2eO(logn). We now obtain s=Ω(neO(logn))=Ω(n1o(1)) as desired by the nondeterministic multiparty communication complexity of set-disjointness in the number-on-forehead model [37]. To extend the lower bound for larger locality, we just replace edges incident to x and y by longer paths. Thus we can increase the locality by 1 at the cost of increasing the path length by 4.

5.2 Lower bounds for 𝑷𝒌-freeness: general 𝒌

Technical challenges and the general proof idea for Theorem 6

We use a standard reduction from two-party set-disjointness function for the proof of Theorem 6 (and Theorem 7). Alice and Bob jointly construct some graph which depends on their inputs so that the constructed graph is Pk-free iff the output of some function (in our case, the set-disjointness function) is 1.

The main challenge in this proof is to ensure that the cut of the constructed graph (i.e., the number of edges between nodes held by Alice and those held by Bob) is as sparse as possible. This was not an issue in the proof of Theorem 5, which uses the non-deterministic communication complexity of the NOF model or in the proof of the lower bound given in [2], which uses counting arguments. In fact, the constructions used in these proofs do not achieve a sparse cut, making them unsuitable for the proof of Theorem 6 that holds for the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model.

Fortunately, in the simplest case where d3, we can utilize the construction from Ref. [29] with a slight modification, taking advantage of the graph structure that holds for all d3. However, for the cases where d=1 and d=2, the construction from Ref. [29], which is applicable to induced k-cycles for every k4, fails for induced k-paths. This is intuitively because we lose several symmetric properties of induce k-cycles by removing one edge from the cycle. We thus construct two other graph families from scratch. The main challenge in establishing these constructions is that both edges and non-edges must be carefully chosen to avoid creating unexpected induced k-paths. Our proof requires a detailed analysis to ensure this, as the general structure valid for d3 does not apply for d2.

6 Ordered path detection and applications

In this section, we address the problem of detecting an ordered Pk. In this problem, each node of the graph is assigned a color from {1,,k}, and the objective is to detect an induced path {(pi,pi+1)}i{1,,k1} on k nodes {p1,pk}, where each node pi is colored with i. Our first result is that detecting an ordered P5 is already challenging, unlike the case of Pk-freeness where we only know nontrivial lower bound for k11 as in Theorem 6.

Theorem 28.

Any randomized algorithm that solves ordered Pk detection for k5 requires Ω~(n) rounds in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model.

The proof of this theorem leverages a reduction from two-party communication complexity, with a construction that is notably simpler compared to the proof of Theorem 6. The detailed proof is provided in the full version.

Our next insight regarding ordered path detection is the connection between proving non-trivial lower bounds for ordered P3 detection and a long-standing open problem in circuit complexity.222This kind of connection is already known for several subgraphs such as triangles and 6-cycles [15, 14]. Moreover, this result has implications for the detection of induced C4 in the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model. Specifically, the following problem has remained unresolved for decades:

Open Problem 1.

Construct an explicit family of boolean functions fn:{0,1}n{0,1} such that there exist constants δ1,δ2>0 such that any family of circuits {Cn}n, where each circuit Cn is depth O(nδ1) and consists of O(n1+δ2) gates with constant fan-in and fan-out, cannot compute {fn}n.

The following theorem formalizes this relationship:

Theorem 29 (The formal version of Theorem 12).

Let ε>0 be a constant. Then proving Ω(nε) lower bound for ordered P3 detection in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model or Ω(n1/2+ε) lower bound for induced C4 detection in the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model solves Open Problem 1.

 Remark 30.

One might think that ordered P3 detection can be solved in O(1) rounds as the case of P3-freeness. However, it is easily shown that the round complexity of ordered P3 detection in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model is actually at least the round complexity of triangle detection in the 𝖼𝗈𝗇𝗀𝖾𝗌𝗍𝖾𝖽𝖼𝗅𝗂𝗊𝗎𝖾 model: Let G be the graph resulting by deleting all edges in the input graph G between a node colored by 1 and a node colored by 3, and replacing non-edges between them by edges. Then any 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm for ordered P3 detection is simulated in the 𝖼𝗈𝗇𝗀𝖾𝗌𝗍𝖾𝖽𝖼𝗅𝗂𝗊𝗎𝖾 model with the input graph G where each ordered P3 in G corresponds to a multicolored triangle in G. Note that multicolored triangle detection can be reduced to (the standard) triangle detection in this model [34].

Proof sketch of Theorem 29.

The first part of the statement builds on the known hardness results for triangle detection (and counting) established by [14]. We demonstrate that ordered P3 detection is at least as difficult as certain variants of triangle counting or ordered P3 counting in high-conductance graphs. The full proof is deferred to the full version.

For the second part, we use the framework of distributed quantum search [30]. Consider a function f:X{0,1} defined on some finite set X, and assume a T-round 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 algorithm allows a fixed node u to output f(x) with constant probability, given input xX. In the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model, we can find an element xX such that f(x)=1 (if such an element exists) with constant probability in O~(|X|T) rounds.

Now, assume that ordered P3 detection can be solved in T rounds in the 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model. We can show that induced C4 detection can be solved in O~(nT) rounds in the quantum 𝖢𝖮𝖭𝖦𝖤𝖲𝖳 model. The statement then follows from the first part of the theorem, as an Ω(n1/2+ε) lower bound for induced C4 detection implies an Ω(nε) lower bound for ordered P3 detection.

Specifically, let u be an arbitrary node, and define M(u)={v:dist(u,v)=2}. Consider the subgraph G[N(u)M(u)] induced by N(u) and M(u). Nodes in M(u) are colored with color 2, while nodes in N(u) randomly choose their color from 1 and 3. Note that G[N(u)M(u)] contains an ordered P3 (p1,p2,p3) if G contains an induced C4 (u,p1,p2,p3). Since p1 and p3 randomly choose their colors, the probability that (p1,p2,p3) is properly colored is constant. Therefore, in O(T) rounds, we can determine if u belongs to an induced C4. By using distributed quantum search for a function f:V{0,1}, which outputs f(u)=1 if and only if u belongs to an induced C4, we can solve induced C4 detection in O~(nT) rounds.

References

  • [1] Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. Journal of the ACM (JACM), 42(4):844–856, 1995. doi:10.1145/210332.210337.
  • [2] Nicolas Bousquet, Linda Cook, Laurent Feuilloley, Théo Pierron, and Sébastien Zeitoun. Local certification of forbidden subgraphs. arXiv preprint arXiv:2402.12148, 2024. doi:10.48550/arXiv.2402.12148.
  • [3] Nicolas Bousquet, Laurent Feuilloley, and Théo Pierron. Local Certification of Graph Decompositions and Applications to Minor-Free Classes. In 25th International Conference on Principles of Distributed Systems (OPODIS 2021), pages 22:1–22:17, 2022. doi:10.4230/LIPIcs.OPODIS.2021.22.
  • [4] Nicolas Bousquet and Sébastien Zeitoun. A subquadratic certification scheme for p5-free graphs. Theoretical Computer Science, page 115091, 2025. doi:10.1016/J.TCS.2025.115091.
  • [5] Christoph Brause, Ingo Schiermeyer, Přemysl Holub, Zdeněk Ryjáček, Petr Vrána, and Rastislav Krivoš-Belluš. 4-colorability of P6-free graphs. Electronic Notes in Discrete Mathematics, 49:37–42, 2015.
  • [6] Keren Censor-Hillel, François Le Gall, and Dean Leitersdorf. On distributed listing of cliques. In Proceedings of the 39th Symposium on Principles of Distributed Computing, pages 474–482, 2020. doi:10.1145/3382734.3405742.
  • [7] Keren Censor-Hillel, Ami Paz, and Mor Perry. Approximate proof-labeling schemes. Theoretical Computer Science, 811:112–124, 2020. doi:10.1016/J.TCS.2018.08.020.
  • [8] Yi-Jun Chang, Seth Pettie, and Hengjie Zhang. Distributed triangle detection via expander decomposition. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 821–840. SIAM, 2019. doi:10.1137/1.9781611975482.51.
  • [9] Yi-Jun Chang and Thatchaphol Saranurak. Improved distributed expander decomposition and nearly optimal triangle enumeration. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 66–73, 2019. doi:10.1145/3293611.3331618.
  • [10] Derek G Corneil, Helmut Lerchs, and L Stewart Burlingham. Complement reducible graphs. Discrete Applied Mathematics, 3(3):163–174, 1981. doi:10.1016/0166-218X(81)90013-5.
  • [11] Pierluigi Crescenzi, Pierre Fraigniaud, and Ami Paz. Trade-Offs in Distributed Interactive Proofs. In 33rd International Symposium on Distributed Computing (DISC 2019), pages 13:1–13:17, 2019. doi:10.4230/LIPICS.DISC.2019.13.
  • [12] Mina Dalirrooyfard, Thuy Duong Vuong, and Virginia Vassilevska Williams. Graph pattern detection: Hardness for all induced patterns and faster non-induced cycles. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 1167–1178, 2019. doi:10.1145/3313276.3316329.
  • [13] Andrew Drucker, Fabian Kuhn, and Rotem Oshman. On the Power of the Congested Clique Model. In Proceedings of the 2014 ACM symposium on Principles of distributed computing (PODC 2014), pages 367–376, 2014. doi:10.1145/2611462.2611493.
  • [14] Talya Eden, Nimrod Fiat, Orr Fischer, Fabian Kuhn, and Rotem Oshman. Sublinear-time distributed algorithms for detecting small cliques and even cycles. Distributed Computing, pages 1–28, 2022.
  • [15] Orr Fischer, Tzlil Gonen, Fabian Kuhn, and Rotem Oshman. Possibilities and impossibilities for distributed subgraph detection. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, pages 153–162, 2018. doi:10.1145/3210377.3210401.
  • [16] Pierre Fraigniaud, Maël Luce, Frederic Magniez, and Ioan Todinca. Even-cycle detection in the randomized and quantum congest model. In Proceedings of the 43rd ACM Symposium on Principles of Distributed Computing (PODC 2024), pages 209–219, 2024.
  • [17] Pierre Fraigniaud, Frédéric Mazoit, Pedro Montealegre, Ivan Rapaport, and Ioan Todinca. Distributed certification for classes of dense graphs. In Proceedings of the 37th International Symposium on Distributed Computing (DISC 2023), pages 20:1–20:17, 2023. doi:10.4230/LIPICS.DISC.2023.20.
  • [18] Pierre Fraigniaud, Pedro Montealegre, Ivan Rapaport, and Ioan Todinca. A meta-theorem for distributed certification. Algorithmica, 86(2):585–612, 2024. doi:10.1007/S00453-023-01185-1.
  • [19] Pierre Fraigniaud and Dennis Olivetti. Distributed detection of cycles. ACM Transactions on Parallel Computing (TOPC), 6(3):1–20, 2019. doi:10.1145/3322811.
  • [20] Peter Gartland and Daniel Lokshtanov. Independent Set on Pk-Free Graphs in Quasi-Polynomial Time. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 613–624. IEEE, 2020.
  • [21] Gaston H Gonnet. Expected length of the longest probe sequence in hash code searching. Journal of the ACM (JACM), 28(2):289–304, 1981. doi:10.1145/322248.322254.
  • [22] Andrzej Grzesik, Tereza Klimošová, Marcin Pilipczuk, and Michał Pilipczuk. Polynomial-time algorithm for maximum weight independent set on P6-free graphs. ACM Transactions on Algorithms (TALG), 18(1):1–57, 2022. doi:10.1145/3414473.
  • [23] Chính T Hoàng, Marcin Kamiński, Vadim Lozin, Joe Sawada, and Xiao Shu. Deciding k-colorability of P5-free graphs in polynomial time. Algorithmica, 57:74–81, 2010. doi:10.1007/S00453-008-9197-8.
  • [24] Stephan Holzer and Roger Wattenhofer. Optimal distributed all pairs shortest paths and applications. In Proceedings of the 2012 ACM symposium on Principles of distributed computing, pages 355–364, 2012. doi:10.1145/2332432.2332504.
  • [25] Jarkko Kari, Martín Matamala, Ivan Rapaport, and Ville Salo. Solving the induced subgraph problem in the randomized multiparty simultaneous messages model. In International Colloquium on Structural Information and Communication Complexity (SIROCCO 2015), pages 370–384, 2015. doi:10.1007/978-3-319-25258-2_26.
  • [26] János Komlós and Miklós Simonovits. Szemeredi”s regularity lemma and its applications in graph theory, 1995.
  • [27] Janne H. Korhonen and Joel Rybicki. Deterministic Subgraph Detection in Broadcast CONGEST. In Proceedings of the 21st International Conference on Principles of Distributed Systems (OPODIS 2017), pages 4:1–4:16, 2018. doi:10.4230/LIPIcs.OPODIS.2017.4.
  • [28] Amos Korman, Shay Kutten, and David Peleg. Proof labeling schemes. Distributed Computing, 22(4):215–233, 2010. doi:10.1007/S00446-010-0095-3.
  • [29] François Le Gall and Masayuki Miyamoto. Lower Bounds for Induced Cycle Detection in Distributed Computing. In 32nd International Symposium on Algorithms and Computation (ISAAC 2021), pages 58:1–58:19, 2021. doi:10.4230/LIPICS.ISAAC.2021.58.
  • [30] François Le Gall and Frédéric Magniez. Sublinear-time quantum computation of the diameter in CONGEST networks. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing (PODC 2018), pages 337–346, 2018. URL: https://dl.acm.org/citation.cfm?id=3212744.
  • [31] Daniel Lokshantov, Martin Vatshelle, and Yngve Villanger. Independent set in P5-free graphs in polynomial time. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on discrete algorithms, pages 570–581. SIAM, 2014. doi:10.1137/1.9781611973402.43.
  • [32] Zvi Lotker, Elan Pavlov, Boaz Patt-Shamir, and David Peleg. Mst construction in o (log log n) communication rounds. In Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures, pages 94–100, 2003. doi:10.1145/777412.777428.
  • [33] Pedro Montealegre, Diego Ramírez-Romero, and Ivan Rapaport. Compact distributed interactive proofs for the recognition of cographs and distance-hereditary graphs. In International Symposium on Stabilizing, Safety, and Security of Distributed Systems (SSS 2021), pages 395–409, 2021. doi:10.1007/978-3-030-91081-5_26.
  • [34] Amir Nikabadi and Janne Korhonen. Beyond distributed subgraph detection: Induced subgraphs, multicolored problems and graph parameters. In 25th International Conference on Principles of Distributed Systems (OPODIS 2021), volume 217, 2022. doi:10.4230/LIPIcs.OPODIS.2021.15.
  • [35] David Peleg. Distributed computing: a locality-sensitive approach. SIAM, 2000.
  • [36] Bert Randerath and Ingo Schiermeyer. 3-Colorability P for P6-free graphs. Discrete Applied Mathematics, 136(2-3):299–313, 2004.
  • [37] Anup Rao and Amir Yehudayoff. Simplified lower bounds on the multiparty communication complexity of disjointness. In 30th Conference on Computational Complexity (CCC 2015). Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2015. doi:10.4230/LIPIcs.CCC.2015.88.