Abstract 1 Introduction 2 Preliminaries 3 Graph admissibility 4 Nadirs and Braids 5 Trimming 6 The main algorithm 7 Lower bounds for testing 𝑪𝟐𝒓+𝟏-freeness in (𝟐,𝒓𝟏)-admissible graphs for 𝒓𝟐 References

Testing Ck-Freeness in Bounded Admissibility Graphs

Christine Awofeso ORCID Birkbeck, University of London, UK Patrick Greaves ORCID Birkbeck, University of London, UK Oded Lachish ORCID Birkbeck, University of London, UK Amit Levi ORCID University of Haifa, Israel Felix Reidl ORCID Birkbeck, University of London, UK
Abstract

We study Ck-freeness in sparse graphs from a property testing perspective, specifically for graph classes with bounded r-admissibility. Our work is motivated by the large gap between upper and lower bounds in this area: Ck-freeness is known to be testable in planar graphs [4], but not in graphs with bounded arboricity for k>3 [7]. There are a large number of interesting graph classes that include planar graphs and have bounded arboricity (e.g. classes excluding a minor), calling for a more fine-grained approach to the question of testing Ck-freeness in sparse graph classes.

One such approach, inspired by the work of Nesetril and Ossona de Mendez [11], is to consider the graph measure of r-admissibility, which naturally forms a hierarchy of graph families 𝒜1𝒜2𝒜 where 𝒜r contains all graph classes whose r-admissibility is bounded by some constant. The family 𝒜1 contains classes with bounded arboricity, the class 𝒜 contains classes like planar graphs, graphs of bounded degree, and minor-free graphs. Awofeso et al[3] recently made progress in this direction. They showed that C4- and C5-freeness is testable in 𝒜2. They further showed that Ck-freeness is not testable in 𝒜k/21 and conjectured that Ck-freeness is testable in 𝒜k/2. In this work, we prove this conjecture: Ck-freeness is indeed testable in graphs of bounded k/2-admissibility.

Keywords and phrases:
Property Testing, Sparse Graphs, Cycle, Admissibility
Category:
Track A: Algorithms, Complexity and Games
Copyright and License:
[Uncaptioned image] © Christine Awofeso, Patrick Greaves, Oded Lachish, Amit Levi, and Felix Reidl; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Streaming, sublinear and near linear time algorithms
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

Finding a k-cycle (Ck) as a subgraph is a fundamental problem in graph theory with applications in network analysis, bioinformatics, and theoretical computer science. Given a graph G=(V,E), the goal is to detect cycles of exactly k vertices. Various algorithmic approaches have been proposed, including combinatorial search techniques [9], matrix-based methods using spectral graph theory [2], and more recent advancements leveraging graph sparsification and algebraic techniques [12, 5].

In this paper, the focus is on a variation of this problem, where access to the input algorithm is through an oracle that answers queries of the sort: “What is the degree of vertex v?”, “What is the i-th neighbour of the vertex v?”, and “are the vertices u and v neighbours?”. The goal in this variation is to determine whether a graph is Ck-free (it does not have a subgraph isomorphic to Ck), given the size of the input graph and oracle access to the graph, using the minimal possible number of queries.

A deterministic algorithm of this type, for determining whether a graph is Ck-free, may require a number of queries that is at least linear in the size of the graph. Therefore, to reduce the number of queries, an alternative approach is to require the algorithm only to distinguish between graphs that are Ck-free and those that are far from being Ck-free, which means that they require the removal of many edges to become Ck-free. The algorithm is also allowed to be probabilistic and is only required to produce the correct answer with high probability. Observe that if the algorithm is required to be error-free when the input graph is Ck-free (as is the case with the algorithm presented here), it must explicitly identify a Ck subgraph to conclude that the graph is not Ck-free, essentially solving the Ck detection problem for inputs that are far from being Ck-free.

This relaxation of requirements is captured by the framework of property testing. In this framework, the problem described above is referred to as testing Ck-freeness. A central goal is to show that the number of queries required for testing Ck-freeness, when the input is restricted to graphs from a specific family, is independent of the size of the input graph, yet may depend on the parameters of the class and the distance parameter (which is a fixed parameter provided with the input) that governs the ‘farness‘ of the input from Ck-freeness. If such an algorithm exists for some family of graphs, then we say that Ck-freeness is testable for this family.

The results in this work focus on testing Ck-freeness for sparse graph families, specifically those with a bounded average degree. This line of research was initiated in the seminal work of Goldriech and Ron [8] who showed that the query complexity of testing Ck-freeness in graphs with maximum degree d is O(dk). Although that result was promising, Alon et al[1] later showed that the query complexity testing triangle-freeness (C3-freeness) in graphs with constant average degree is Θ(n), where n is the number of vertices in the input graph.

This raised the question of whether Ck-freeness has a better query complexity in families of graphs that are strict subsets of graphs with a bounded average degree. Czumaj et al. [4] showed that the more general property of H-freeness (i.e. graphs that don’t have a subgraph isomorphic to H) is testable in planar graphs, which implies that Ck-freeness is also testable in this setting. Subsequently, Levi [10] showed that the special case of triangle-freeness is testable for graphs with bounded arboricity, a superset of planar graphs. However, Eden et al. [7] recently showed that this does not extend to larger cycles: testing C4-freeness and C5-freeness has query complexity Ω(n1/4) in graphs of bounded arboricity, and testing Ck-freeness for k6 has query complexity Ω(n1/3). This left open the question of which (strict) subsets of classes with bounded arboricity still allow testing of Ck-freeness.

Awofeso et al. [3] recently presented results in this direction. They considered a hierarchy of sparse graph classes defined by the r-admissibility measure, which was first introduced by Dvořák [6] and plays an important role in the study of sparse classes [11, 14]. We postpone the rather technical definition to the next section; for now let us note that graphs of bounded 1-admissibility are equivalent to graphs of bounded arboricity, and graphs whose r-admissibility is bounded for any integer r include many well-known sparse classes such as planar graphs, graphs with bounded genus, graphs of bounded degree and graphs excluding a (topological) minor. Consequently, if we define 𝒜r as the family of all graph classes whose r-admissibility is bounded by some constant, then we have an infinite hierarchy of classes 𝒜1𝒜2𝒜 between graphs of bounded arboricity (𝒜1) and classes whose r-admissibility is bounded for any value of r (𝒜, also known as bounded expansion classes [11]). Note that all these inclusions are proper, for example, the class 𝒦(r) consisting of all cliques with every edge is replaced with a length r path, is contained in 𝒜r but not in 𝒜r+1.

In this hierarchy, Awofeso et al[3] showed that testing C4- and C5-freeness is possible in 𝒜2, the graphs with bounded 2-admissibility. They supplement this with a lower bound, which shows that for all r4 the number of queries required for C2r+1-freeness testing and C2r-freeness testing of classes in 𝒜r1 is polynomial in the size of the input graph. The two results lead them to conjecture that C2r- and C2r+1-freeness should be testable for classes in 𝒜r, i.e. classes with bounded r-admissibility.

In this work, we complete the picture and prove the following.

Theorem (Informal).

For any integer r>1, there exists a testing algorithm for C2r-freeness and C2r+1-freeness of bounded r-admissible graphs with query complexity depending only on r,p and the proximity parameter ϵ, where p is the bound on the r-admissibility bound.

We prove the theorem for C2r+1-freeness, but a minor (and slightly simpler) version of our proof implies the same result C2r-freeness. We also provide the lower bounds that do not appear in [3]. Specifically, that for every r2, C2r-freeness and C2r+1-freeness are not testable in bounded (r1)-admissible graphs.

Techniques and their relation to existing work

The results in this paper are derived from the classical property tester for Ck-freeness in bounded-degree graphs. The tester selects a small initial random subset of the vertices of a graph that is far from being Ck-free; then, with sufficiently high probability, one of these vertices participates in a Ck subgraph of the input graph (a subgraph of the input graph that is isomorphic to Ck). Since the input graph has a bounded degree, starting a “bounded range” breadth-first search (BFS) from each one of the vertices in the initial set of vertices will result in the discovery of a Ck subgraph with sufficient probability. If this event occurs, then the graph is rejected and, otherwise, it is accepted. Clearly, this algorithm will always accept a Ck-free graph.

Suppose that we tried to use a variation of this tester on a graph with a bounded average degree, where the latter bound is provided as part of the input. We call this variation pseudo-BFS (PBFS). To guarantee that the PBFS’s query complexity remains low, the algorithm first computes some γ that depends only on the given average degree and then behaves like the bounded-degree tester until it encounters a vertex with a degree greater than γ (a heavy vertex). When this occurs, the search only includes a size γ randomly selected subset of the vertex’s neighbours. If the PBFS initiates a search on a vertex that is part of a Ck and this Ck contains at most one heavy vertex, then the PBFS will uncover all vertices of said Ck. However, if a Ck contains at least two non-neighbouring heavy vertices, then the PBFS might not find all the vertices of this Ck copy with sufficiently high probability since it only sees a small random subset of a heavy vertex’s neighbours. The lower bound result is based on this scenario.

This raised the question of whether the PBFS approach still works if we impose more restrictions than just the bounded average degree. In [3] it was shown that C4-freeness is testable in 𝒜2 using PBFS as the testing algorithm. The technique involves showing that if the input graph has bounded 2-admissibility and is far from being C4-free, then with sufficiently high probability the PBFS will start its search in a vertex that belongs to a C4 with at most two heavy vertices and the crucial insight is that these two heavy vertices necessarily have a large joint neighbourhood of non-heavy vertices – meaning that if the PBFS locates one of these joint neighbours, it will immediately find both heavy vertices in the next step and two steps later a C4 will be detected with high probability. This technique is sufficient to establish testability of C7-freeness in 3-admissible graphs, but not for C8-freeness in 4-admissible graphs. The reason is that a C8 might contain two heavy vertices of distance exactly four, which means that these paths include vertices that are not neighbours of either of the heavy endpoints, a case that is not covered by this technique.

We model our approach on [3] by using a process called trimming. That is, given an input graph G that is far from being Ck-free, we carve out a subgraph G^ of G by removing just enough edges from G to ensure that G^ has some required structural properties and is still far from being Ck-free (with some changes to the “farness” parameter). The graph G^ is only used for the analysis of our algorithm. Specifically, it is shown that if G^ has a Ck subgraph and one of its vertices is detected when our algorithm runs on G, then with high probability, a Ck-subgraph will be detected.

The main tools used here are various structural properties of graphs with bounded admissibility. For example, a graph G with bounded r-admissibility admits a total ordering < of its vertex set V(G) such that every heavy vertex vG can only reach a bounded number of heavy vertices u<v via paths of length at most r. This allows us to enforce that in the trimmed graph G^, if such a path of length at most r between u and v exists, then there must be a large number of such paths in G^ and therefore also in G (since otherwise they would have been removed from G^).

2 Preliminaries

We use for the set of integer numbers, for the set of real numbers, + for the set of positive reals. For an integer k, we use [k] as a shorthand for the set {1,2,,k}. For integers k1 and k2 such that k1<k2 we use [k1,k2] as a shorthand for the set {k1,k1+1,k} and (k1,k2] as a shorthand for the set {k1+1,k1+2,,k}. For a graph G, we use V(G) and E(G) to refer to its vertex and edge set, respectively. All graphs considered in this work are simple undirected graphs. The degree of a vertex vV(G), denoted degG(v), is the number of edges incident on v. N(v) is the set of neighbours of the vertex v. The distance between two vertices u,vV(G) is denoted distG(u,v). The distance between a vertex uV(G) and a subset VV(G), is the minimum of distG(u,v) over every vV.

For sequences of vertices x1x2x (and in particular, paths), we use the shorthand x1Px to represent the sequence, and use length(x1Px) to denote the length of the corresponding path. In addition, we use Px (or x1P) to refer to a subsequence of x1Px. All paths considered in this work are simple paths. Though paths here are undirected, we often treat them as directed by specifying a start and end vertex.

Property testing

A graph property (or simply property) 𝒫 is a class of graphs closed under isomorphism. We say that a graph G has the property 𝒫 if G𝒫.

We say that a graph G with n vertices and at most m edges is ϵ-far from 𝒫 if at least ϵm edge modifications are required to make the graph have the property. Otherwise, we say that it is ϵ-close to 𝒫.

Definition 1.

A property tester for a property 𝒫 of graphs is a randomized algorithm 𝒯 that receives as input parameters n,m, ϵ>0 and oracle access to a graph G with n vertices and at most m edges. If the graph G is ϵ-far from 𝒫, then 𝒯 rejects with probability at least 2/3. If the graph G𝒫, then 𝒯 accepts with probability 1 (if the tester has one-sided error) or with probability at least 2/3 (if the tester has two-sided error).

In this work, we consider the setting of sparse graphs where the oracle access to G is defined as follows.

Definition 2 (Sparse graph oracle).

Given a graph G, a sparse graph oracle can answer the following queries for vertices  u,vV(G):

  • The degree degG(v) of a vertex v (degree query).

  • The ith neighbour of v in G (neighbour query).

  • Whether {u,v} is an edge in G (adjacency query).

By combining these queries, the whole neighbourhood of a vertex v can be revealed by using 1+degG(v) queries. The oracle returns the special symbol “” when asked a query without sensible answers, e.g. when asked to return the i-th neighbour of a vertex v with degG(v)<i.

In this paper, we consider a particular set of graphs (which will be defined later on) for the property of Ck-freeness. That is, a graph that does not have a subgraph isomorphic to Ck. We refer to the problem as Ck-freeness. We call a graph Ck-free if it is contained in the Ck-freeness property.

In further sections, we use the term knowledge-graph to refer to the graph that includes all the vertices and edges revealed by the algorithm queries.

3 Graph admissibility

We start this section by presenting the definitions required for defining when a graph is (p,r)-admissible. Afterwards, we provide the actual definition and a structural lemma for (p,r)-admissible graphs. Then, we explain why this lemma is critical and provide an extra definition and two more structural claims.

An ordered graph is a pair 𝔾=(G,) where G is a graph and is a total order relation on V(G). We write 𝔾 to denote the ordering of 𝔾 and extend this notation to derive the relations <𝔾, >𝔾, 𝔾.

To define r-admissibility we need the following ideas and notations.

Definition 3.

Let 𝔾=(G,) and vV(G). A path vPx is r-admissible in 𝔾 if length(vPx)r, x<𝔾v and minwPw>𝔾v (where the minimum is with respect to the ordering in 𝔾). That is, the path goes from v to x using only vertices w such that w>𝔾v and x satisfy  v>𝔾x.

Definition 4.

For every integer i>0 we let Target𝔾i(v) be the set of all vertices in uV(G) such that u<𝔾v and u is reachable from v via an r-admissible path vPu of length at most i. We omit the subscript 𝔾 when it is clear from context.

Definition 5.

An r-admissible path packing is a collection of paths {vPixi}i with joint root v and the additional properties that every path vPixi is r-admissible and the subpaths Pixi are all pairwise vertex-disjoint. In particular, all endpoints {xi}i are distinct. We write pp𝔾r(v) to denote the maximum size of any r-admissible path packing rooted at v.

Examples of 2- and 3-admissible path packings are depicted in Figure 1.

Definition 6 (Admissibility).

The r-admissibility of an ordered graph 𝔾, denoted admr(𝔾) and the admissibility of an unordered graph G, denoted admr(G) are111Note that some authors choose to define the admissibility as 1+maxv𝔾pp𝔾r(v) as this matches with some other, related measures.

admr(𝔾)maxv𝔾pp𝔾r(v)andadmr(G)min𝔾π(G)admr(𝔾),

where π(G) is the set of all possible orderings of G.

If 𝔾 is an ordering of G such that admr(𝔾)=admr(G), then we call 𝔾 an admissibility ordering of G. The 1-admissibility of a graph coincides with its degeneracy, and therefore such orderings are easily computable in linear time. For r2, an optimal ordering can also be computed in linear time in n if the class has bounded expansion, i.e. if the graph class has bounded admissibility for every r (see [6]).

Figure 1: An example of a maximum 2-admissible (left) and 3-admissible path packing for a vertex x in an ordered graph 𝔾. The order is depicted by the height of the vertices, with the exception of the blue vertices who appear before x in 𝔾 and whose relative ordering is not important here.

Now we are ready to define the set of graphs we consider for testing C2r+1-freeness.

Definition 7.

We say that a graph G is (p,r)-admissible if admr(G)=p. Note that by definition, if a graph G is (p,r)-admissible, it is also (p,r)-admissible for all rr. We call a graph class admr-bounded if all of its members are (p,r)-admissible for some finite value p.

We state the following fact regarding (p,r)-admissible graphs.

Fact 7.

If G is (p,r)-admissible, then |E(G)|p|V(G)|.

The next lemma is a well-known structural result for admissible graphs. This lemma is the critical component of the result of this paper.

Lemma 8.

Let 𝔾=(G,) be such that admr(G)=p, then for every vV(G), and h[r] we have |Target𝔾h(v)|p(p1)h1.

We refer the reader to [3] (Proposition 3) for a proof.

Lemma 8, is the structural property of bounded admissibility graphs that enables the result in this paper. To understand the importance of the above lemma, fix ϵ>0 and suppose that we have a (2,2) admissible graph G with ordering 𝔾=(G,). Suppose also that G is ϵ-far from being C4-free and all the C4-subgraphs in it have exactly two vertices of degree 2 and 2 vertices of degree 16n1/2/ϵ. Let Q be the set of degree 16n1/2/ϵ vertices and let W be the set of degree 2 vertices. Consider any ordering placing the vertices in Q before the vertices in W. For an example of such a graph see Section 7. Note that that graph is not (2,2)-admissible, however, a (2,2)-admissible graph with such attribute exists. Suppose that we create a new graph G^ from G by doing the following: (i) for every uQ and vN(u) we remove uv if u𝔾v, and (ii) for every uQ and v, such that u𝔾v, if there exists a maximum set of less than n1/2/(16ϵ) edge disjoint paths uwv in G^, then we remove all the edges in these paths from G^.

It is easy to see that in (i) at most two edges are removed for every vertex in Q. So, after the removal of all these edges G^ is still far from C4-freeness. Regarding (ii) it can be shown, using Lemma 8, that after (i), for every uQ there are at most 4 vertices in vV(G) such that u𝔾v and u and v are connected by a path of length 2. This, along with some accounting, implies that after (ii), far fewer than ϵ|E(G)| edges were removed from G to obtain G^. Thus, G^ is ϵ/2-far from being C4-free.

Now, since G^ is ϵ/2-far from being C4-free, because of the type of C4s it has, a constant portion of the vertices of W are in such C4. So, an algorithm that randomly selects a few vertices from the graph, with high probability, will pick one of these vertices. Suppose that this was the algorithm described in the introduction. Once a vertex from W in a C4-subgraph of G^ is selected, its neighbours x,yQ are discovered. Now, since in G^, x and y are connected by a path of length 2, x and y have many common vertices (at least n1/2/(16ϵ)) and hence the algorithm will discover one of them with high probability. This will lead to the discovery of a C4-subgraph in G.

This technique cannot be generalized beyond k=7, since for k>7 the Ck-subgraphs may include more complicated subpaths. To deal with this issue, we need the following definition.

Definition 9.

Let 𝔾=(G,). A path uLv is a chain if for every wv(uL), we have w>𝔾v.

Proposition 10.

Let r,p, 𝔾=(G,) be such that admr(G)=p, uLv be an r-admissible path in G, xL and y{u,v}. The subpath xLy of uLv is a chain.

Proof.

By the definition of an r-admissible path for every zxL, we have z>𝔾max𝔾{u,v} and, in particular, x>𝔾max𝔾{u,v}. Thus, by definition, xLy is a chain. Lemma 8 describes how sets of vertices reachable through r-admissible paths from a fixed start vertex are bounded by a function of admr(G) and r. We can obtain a comparable bound for the more general cases where the vertices are now reachable via chains:

Lemma 11.

Let p,r, [r] and 𝔾 be an ordered (p,r)-admissible graph. Let P1,,Pt be a collection of chains of length at most that all start at the same vertex x𝔾 and each end with a vertex smaller than x under 𝔾.

Then the set W:={min𝔾Pi}i[t] which contains the respective minima, under 𝔾, of each of the above paths has size at most pr2.

Proof.

Let V<x:={yV(G)y<𝔾x}, and s=maxi[t]|V(Pi)V<x|. That is, s is the maximum number of vertices of V<x in a path Pi. Next, we show by induction that the set {min𝔾Pi}i[t] has size at most prs, since s< this implies the lemma.

For s=0 the statement becomes vacuous, so consider s=1. This is the case that for every i[t], the only vertex of Pi smaller than x is its other end point. Since all paths Pi are r-chains the set of these endpoints is a subset of Target𝔾r(x). Therefore, by Lemma 8, |{min𝔾Pi}i=1t|p(p1)r1<pr which provides us with the inductive base.

Assume that the statement holds for all s<s, where s, and all collections Q1,,Qh of r-chains of length , such that for every i[h], Qi starts at the vertex z𝔾 and ends at some vertex smaller than z under 𝔾, and |V(Qi)V<z|s.

Let 𝒫={Pi}i=1t and for every i[t], let yiV(Pi) be such that yi<𝔾x and Pi=xLiyiRi, where |V(xLiyi)V<yi|=1. According to the base of the induction |{yi}i=1t|pr.

Pick a vertex y{yi}i=1t and let 𝒫y be the collection of all the paths yiRi such that yi=y. Since y<Gx, every path in 𝒫y is a subpath of a path in 𝒫, we can conclude that for every yR𝒫y, we have |V(yR)V<y|<s. Thus, by the induction assumption, |{min𝔾P}P𝒫y|pr(s1). Finally, by construction, we have |{min𝔾P}P𝒫|pr|{min𝔾P}P𝒫y|prs.

4 Nadirs and Braids

Definition 12.

For every pair of vertices u,vV(G), a braid is a set of edge-disjoint r-admissible paths each having v as the start vertex and u as the end vertex, and all having the same length. A braid is a full-braid if the number of paths it has is maximum.

We use a tilde on an upper case letter to denote a braid, for example B~. In addition, we use the following notations for paths L and braids B~:

  • |B~| is the number of paths in B~.

  • length(B~) is the length of the paths in B~.

  • ends(B~), is an order pair (v,u) where v is the start vertex of every path in B~, and u is the end vertices of every path in B~ and otherwise it is undefined.

Definition 13.

The nadir of a path uLv, denoted nadir(uLv), is the vertex xV(L) such that for every yV(L) we have y𝔾x. The nadir of a braid B~, denoted nadir(B~) is the set {nadir(uLv)}uLvB~.

Definition 14.

Let B~ be a braid and δ>0, a vertex vnadir(B~) is δ-weak for B~, if the maximum size of a braid B~B~ such that nadir(B~)={v} is at most degG(v)/δ and otherwise it is δ-strong for B~.

Definition 15.

For every δ, a braid D~ is a δ-braid if every vertex in nadir(D~) is δ-weak for D~.

Proposition 16.

Let r,p, 𝔾=(G,) be such that admr(G)=p, uLv an r-admissible path in G, xL and y=nadir(uLv). Then, x𝔾y and the subpath xLy of uLv is a chain.

Proof.

By the definition of an r-admissible path for every zxL, we have z>𝔾nadir(uLv). Thus, by definition, x𝔾y and xLy is a chain.

Lemma 17.

Let p,r, [r] and G be an ordered (p,r)-admissible graph and u,vV(G) such that u>𝔾v. If there exists a braid B~ such that ends(B~)=(u,v), length(B~)r and |nadir(B~)|2r2pr2, then there exists a braid B~B~ such that |B~|2r and the only vertices that are shared between the paths of B~ are u and v.

Proof.

Let B~B~ where for every xnadir(B~), there exists exactly one path uLv such that x=nadir(uLv). By assumption, |B~|2rpr2. Let W be the set of all vertices in the paths of B~ except u and v.

We first argue that every vertex wW can be contained in at most pr2 members of B~. To that end, assume that w is contained in members uL1v,,uLtvB~. Let x1,,xt be the respective nadirs of these subgraphs under 𝔾, so xi:=nadir(uLiv). For every i[t], let wLixi be a subpath of uLiv. Since each of the paths uL1v,,uLtv is r-admissible, by Proposition 16, all the paths wL1x1,,wLtxt are chains. We can therefore invoke Lemma 11 on the paths wL1x1,,wLtxt to conclude tpr2. Thus, w can be contained in at most pr2 members of B~.

It follows immediately that every path in B~ shares vertices that are not u or v with at most rpr2 other paths of B~. Consequently, a simple greedy selection process can construct the claimed braid B~ such that |B~|2r.

5 Trimming

In this section, we present the procedure that we refer to as trimming. It is important to note that the trimming procedure is used only for the analysis of the algorithm we present. This is the reason that trimming can read the entire input graph.

Trimming receives as input a graph G that is (p,r)-admissible, an ordering 𝔾 and the parameters p,r and γ,δ+. From now on, in this section, assume that these parameters are fixed. Trimming creates a subgraph G^ of G that has a number of properties that are essential for the analysis of our algorithm. For example, with the right choice of parameters, if G is ϵ-far from C2r+1-freeness, then G^ is ϵ/2-far from C2r+1-freeness; for every v,uV(G), if there exists an full-braid B~ in G^ such that ends(B~)=(v,u) and length(B~)r, then |B~|degG(v)/γ. Note that the choice of deg𝐆(v) and not deg𝐆^(v), in the last inequality, is crucial for our proof to work and that G^ is (p,r)-admissible since it is a subgraph of G.

The trimming procedure

Initialize G^ to be equal to G and repeat the following steps until each one does not result in any edge removal:

  1. 1.

    For every distinct vV(G), uTarget𝔾r(v), [r], and full-braid B~ in G^ such that length(B~)= and ends(B~)=(v,u), if |B~|<degG(v)/γ, then all the edges participating in the paths of B~ are removed from E(G^) .

  2. 2.

    For every distinct vV(G), uTarget𝔾r(v), [r], and δ-braid D~ in G^ such that length(D~)= and ends(D~)=(v,u), if |D~|<degG(v)/γ, then all the edges participating in the paths of D~ are removed from E(G^).

Next, we show that the graph G^ created by using the trimming process is far from being C2r+1-free.

Lemma 18.

Let p,r, ϵ>0, γ+, and G be a (p,r)-admissible graph that is ϵ-far from C2r+1-freeness. If γ>8rpr/ϵ, then the graph G^ created by trimming G with parameters p, r and γ, is ϵ/2-far from C2r+1-freeness and (p,r)-admissible.

Proof.

Note first that G^ is a subgraph of G since it was created from G by taking a copy of G and removing specific edges from it. So, to prove the claim, we only need to show that at most ϵ|E(G)|/2 edges have been removed from G to obtain G^. Since G is ϵ-far from being C2r+1-free, this implies the claim.

Trimming step (1).

For every vV(G) and uTarget𝔾r(v), the maximum number of edges removed in this step is at most the number of edges in the paths of a full-braid B~ such that length(B~)= and |B~|degG(v)/γ. Thus, for every such v and u at most rdegG(v)/γ edges are removed from G^. By Lemma 8, |Target𝔾r(v)|pr, and therefore the total number of edges removed in this step is at most vV(G)rprdegG(v)/γ(ϵ/8)vV(G)degG(v)=ϵ|E(G)|/4, where the inequality follows since γ>8rpr/ϵ.

Trimming step (2).

The proof is the same as the previous step.

Proposition 19.

Let v,uV(G) be such that u>𝔾v and assume that G^ was created by trimming with parameters r,p and γ,δ+. Then

  1. 1.

    if uvE(G^) then degG(u)γ,

  2. 2.

    if G^ has an r-admissible path uLv, then there exists a full-braid B~ in G^ such that ends(B~)=(u,v), length(B~)=length(uLv) and |B~|degG(u)/γ, and

  3. 3.

    if G^ has an r-admissible path uLv, and there does not exist a B~ in G^ such that nadir(B~)=nadir(uLv), ends(B~)=(u,v), length(B~)=length(uLv), |B~|degG(v)/δ, then there exists a δ-braid D~ in G^ such that ends(D~)=(u,v), length(D~)=length(uLv) and |D~|degG(v)/γ.

Proof.

(1) By definition, the braid B~ that contains only the pass vu is a full-braid, such that ends(B~)=vu and length(B~)=1. Since |B~|=1degG(v)/γ, the edge vu is removed from G^ during step 1 of Trimming.
(2) The proof follows directly from the definition of a full-braid and Step (1) of trimming.
(3) The proof follows directly from Step (2) of trimming.

6 The main algorithm

Algorithm 1

We refer to the loop on line 3 of Algorithm 1, as Selection Loop, to the loop on line 5 of Algorithm 1, as Outer Loop, to the loop on line 7 as middle loop  and to the loop on line 12 as Inner Loop. In all of this section, G is a (p,r)-admissible graph G with n vertices, 𝔾 is an ordering of G, G^ is a subgraph of G obtained by trimming with the parameters r,p, γ,δ+ and when we use the parameter ξ3 we assume that it is a multiple of 2r. Note that by Fact 3, Algorithm 1 receives a bound on the number of edges implicitly, since the value of p is part of its input.

The following lemma follows directly from Algorithm 1.

Lemma 20.

The query complexity of Algorithm 1 is O(ξ1ξ3ξ2).

In this section, we use the term knowledge-graph to refer to the subgraph of G that is discovered by queries of Algorithm 1.

6.1 Proof overview of the main theorem

Bounding the query complexity of Algorithm 1 and proving that it returns ACCEPT given a C2r+1-free input graph is rather simple. The challenge is to prove that Algorithm 1 returns REJECT with probability at least 2/3 on an input graph that is (p,r)-admissible and ϵ-far from being C2r+1-free. This is the focus of this subsection; suppose that Algorithm 1 is given as input p,r, ϵ>0, oracle access to a (p,r)-admissible graph G that is ϵ-far from being C2r+1-free.

The proof is divided into three phases: (i) showing that, with high probability, Algorithm 1 discovers a set of vertices D in some C2r+1-subgraph H of G^, such that every path in H that does not include a vertex from D has a length of at most r. (ii) showing that if (i) occurred, then with high probability, all the vertices of a C2r+1-subgraph H of G^ are discovered; and (iii) showing that if (ii) occurred, then with high probability, the knowledge-graph includes a C2r+1-subgraph. Note that since G^ is a subgraph of G, this implies that, with high probability, a C2r+1-subgraph H of G^ is discovered. We next further explain each phase starting with (iii), proceeding to (i), and ending with (ii).
(iii) The knowledge-graph includes all the vertices of a C2r+1-subgraph H of G^. By (1) of Proposition 19, every edge uv in E(H) is incident to a vertex with degree at most γ in G. Thus, line 9 of Algorithm 1 ensures that the edge uv is discovered.
(i) By using (1) of Proposition 19, it is shown that a significant portion of the vertices x in G are such that: degG(x)γ, x is in some C2r+1-subgraph H of G^, distH(x,y)=r, where y is the smallest vertex in H according to 𝔾. The value of ξ1 is set so that with high probability a vertex such as x is discovered. As in (iii), both of x’s neighbours in H will also be discovered. Now, notice that x and y are selected so that the shorter path in H between x and y is a chain of length r. Since H is in G^, so is the chain above. Thus, by Lemma 22 (appears later), with high probability, y will also be discovered. Thus, a set of vertices as D above is discovered with high probability. The formal proof appears later in Lemma 23.
(ii) Suppose that a set of vertices like D above was discovered by Algorithm 1 and H is the C2r+1-subgraph of G^ such that DV(H). Note that if we can ensure that, with high probability, we find a vertex w such that D{w} satisfies the same attributes as D (though not necessarily with the initial subgraph H), then we are guaranteed that, with high probability, all the vertices of some C2r+1 subgraph of G^ will eventually be discovered.

Let y be the smallest vertex in V(H)D according to 𝔾. Assume that y is on a subpath uLv of H such that V(uLv)D={u,v}. If y<𝔾max𝔾{u,v}, then, as in (i), there exists a chain of length at most r in G^ between one of the vertices u and v and the vertex y. By the same reasoning as in (i), with high probability this vertex is discovered. Next, we describe what happens if y>𝔾max𝔾{u,v}.

Suppose that u>𝔾v. If there exists a braid B~ such that |B~|degG(u)/δ and y=nadir(B~), then by future choice of the value of ξ3, with high probability, a vertex vN(v) and one of the paths in B~ is selected. The fact that y=nadir(B~) ensures that there exists a chain of length at most r in G^ between v and v, containing the vertex y. As in (i), this ensures that y is discovered with high probability. This is proved later on formally in Lemma 24. Hence, it remains to deal with the case that a braid like B~ does not exist.

The nonexistence of a braid like B~ implies that there exists a δ-braid D~ such that |D~|degG(u)/γ. Notice that according to the definition of a δ-braid, this implies that |nadir(D~)|δ/γ. The value of δ is selected to be significantly higher than γ, thus ensuring that |nadir(D~)| is sufficiently large. Using a similar proof to the previous case with some significant extra work and the help of a sufficiently large value of ξ3, it is shown later in Lemma 25, that eventually a sufficiently large portion of the vertices in nadir(D~) are discovered. Once this happens, we are guaranteed by Lemma 17, that there are paths uL1v,,uL2rv in G^ such that the Li are vertex-disjoint. This implies that one of these paths together with H includes a C2r+1-subgraph H of G^ such that D{y}V(H), where y is the nadir of the path, and D{y} satisfies the same attributes as D. The formal proof for this case appears later in Lemma 26.

6.2 Proof of the main theorem

Lemma 21.

Let τ<ξ2 and S be the set of all vertices of an C2r+1-subgraph of G^. If ξ3γ and all the vertices of S are in the knowledge-graph at the end of iteration τ of Outer Loop, then with probability 1, at the end of iteration τ of Outer Loop, the knowledge-graph is not C2r+1-free.

Proof.

Let H be a C2r+1 subgraph of G^ and let uvE(H), be such that u>𝔾v and uvE(G^). By Proposition 19, degG(u)γ. Thus, according to Inner Loop, if both u and v are in the knowledge-graph at the end of iteration τ of Outer Loop, then because ξ3γ with probability 1, at the end of iteration τ of Outer Loop, the edge uv is also in the knowledge-graph. The same applies to all other edges.

Lemma 22.

Let [r], p22()=(1γ1)ξ3, τξ2 and u,vV(G) be such that u>𝔾v. Assume that there exists a chain uLv in G^ such that length(uLv)=. If u is in the knowledge-graph at the end of iteration τ of Outer Loop, then with probability at least 1p22(), at the end of iteration τ+ of Outer Loop, v is also in the knowledge-graph.

Proof.

We proceed by induction on the value of . Suppose that =1. Thus, u and v are neighbours in G^ and therefore, by Proposition 19, degG(u)γ. Thus, according to the contents of Outer Loop, with probability 1, at the end of iteration τ+1 of Outer Loop, v is also in the knowledge-graph.

Assume by induction that the lemma holds for every chain L in G^, where length(L)<. Let x be the closest vertex to u in uLv such that x<𝔾u and let uLx be a subpath of uLv. By construction, for every huL, we have h>𝔾x and therefore uLx is an r-admissible path in G^.

Suppose that xv. As 1<length(uLx)<, according to the induction assumption, given that u is in the knowledge-graph at the end of iteration τ of Outer Loop, with probability at least 1length(uLx)(1γ1)ξ2, at the end of iteration τ+length(uLx) of Outer Loop, x is also in the knowledge-graph.

Let xLv be a subpath of uLv. Since xLv is a subpath of uLV, we conclude that length(xLv)<. Thus, by the induction assumption, given that x is in the knowledge-graph at the end of iteration τ+length(uLx) of Outer Loop, with probability at least 1length(xLv)(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, the vertex v is also in the knowledge-graph. Consequently, together with the previous, given that u is in the knowledge-graph at the end of iteration τ of Outer Loop, with probability at least 1(length(uLx)+length(xLv))(1γ1)ξ3=1(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, the vertex v is also in the knowledge-graph.

Suppose that x=v. By definition, uLv is an r-admissible path in G^. Therefore, by Proposition 19, G^ has a full-braid B~, such that ends(B~)=(u,v) and length(B~)=.

Since there are at least degG(u)/γ edge-disjoint paths in B~, at least degG(u)/γ of the neighbours of the vertex u are in one of the paths of B~. Thus, according to the contents of Inner Loop, given that u is in the knowledge-graph at the end of iteration τ of Outer Loop, with probability at least 1(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, a vertex y adjacent to u in a path of B~ is also in the knowledge-graph.

Let uyLv, be the specific r-admissible path of B~ that includes y. By Proposition 10, yLv is a chain. Since length(yLv)=1, by the induction assumption, given that y is in the knowledge-graph at the end of iteration τ+1 of Outer Loop, with probability at least 1(1)(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, the vertex v is also in the knowledge-graph. Consequently, in this case, given that u is in the knowledge-graph at the end of iteration τ of Outer Loop, with probability at least 1(1+1)(1γ1)ξ3=1(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, the vertex v is also in the knowledge-graph.

Lemma 23.

Let p23(r)=p22(r)+(1ϵp/(2γ))ξ1. If G^ is ϵ-far from C2r+1-freeness, then with probability at least 1p23(r), at the end of iteration r of Outer Loop the knowledge-graph includes a set of vertices D, such that for some C2r+1-subgraph H of G^ we have DV(H) and for every path uLv in H, where DV(uLv)={u,v}, we have length(uLv)r.

Proof.

Let T be the set of all vertices xV(G) such that degG(x)γ. For every C2r+1-subgraph H of G^ let anchor(H) be the vertex in V(H) such that for every yV(H) we have y𝔾anchor(H), and let stern(H) be an arbitrary vertex in V(H)T such that distH(anchor(H),stern(H))=r. Let K be the set of all vertices x such that x=stern(H) for some C2r+1-subgraph H of G^.

Let H be an arbitrary C2r+1-subgraph of G^. stern(H) exists since there are exactly two vertices xV(H) such that distH(anchor(H),x)=r, they are neighbours in G^ and therefore, by Proposition 19, at least one of them is in T.

We first show that, with probability at least 1(1ϵp/(2γ))ξ1, after Selection Loop and prior to the first iteration of Outer Loop, the set S of Algorithm 1 includes a vertex from K. Afterwards, we show that if the above occurred, then with probability at least 1p22(r), at the end of iteration r of Outer Loop, the knowledge-graph includes a set D as required.

Suppose that we removed from G^ every edge adjacent to a vertex in K, then by construction, the graph we get is C2r+1-free. Since, by Lemma 18, G^ is ϵ/2-far from being C2r+1-free it holds that |K|γϵnp/2. Thus, according to Selection Loop, with probability at least 1(1ϵp/(2γ))ξ1, after Selection Loop and prior to the first iteration of Outer Loop, the set S of Algorithm 1 includes a vertex from K.

Suppose that H is a C2r+1-subgraph of G^ and that v=stern(H) is set S of Algorithm 1 after Selection Loop and prior to the first iteration of Outer Loop. Let D be the set of vertices that includes v, v’s neighbours in H and anchor(H). We note that the set D satisfies the requirements of the lemma.

According to the Inner Loop, with probability 1, at the end of the first iteration of Outer Loop, the neighbours of v in H will also be in the knowledge-graph. By construction v>𝔾anchor(H) and G^ has a chain of length r whose end-vertices are v and anchor(H). By Lemma 22, with probability at least 1p22(r), at the end of iteration r of Outer Loop, the knowledge-graph includes the vertex anchor(H). Applying a union bound finishes the proof.

Lemma 24.

Let [r], τξ2, p24()=(1δ1)ξ3(1γ1)ξ3, and u,v,wV(G) be such that u>𝔾v. Suppose that there exists a braid B~ in G^ such that nadir(B~)=w, ends(B~)=(u,v), length(B~)= and |B~|degG(v)/δ. If u is in the knowledge-graph at the end of iteration τ of Outer Loop, then with probability at least 1p24(), at the end of iteration τ+ of Outer Loop, w is also in the knowledge-graph.

Proof.

Since |B~|degG(v)/δ, by the same reasoning as in Lemma 22, if u is in the knowledge-graph at the end of iteration τ of Outer Loop, then with probability at least 1(1δ1)ξ3, at the end of iteration τ+1 of Outer Loop, the knowledge-graph includes a vertex y that is adjacent to u in a path of B~. Assume that ynadir(B~)=w, since otherwise we are done. Let vyLu, be the specific r-admissible path of B~ that includes y. Let yLw we a subpath of vyLu. By Proposition 16 y>𝔾w and yLw is a chain.

Using the same reasoning as in Lemma 22, if u is in the knowledge-graph at the end of iteration τ+1 of Outer Loop, then with probability at least 1(1)(1γ1)ξ3, at the end of iteration τ+1 of Outer Loop, the knowledge-graph includes w. An application of a union bound concludes the proof.

Lemma 25.

Let [r], u,vinV(G) and p25()=(δ/(8γ2))(1γ1)ξ3+eδ16γ3. Assume that G^ has a δ-braid D~ such that ends(D~)=(v,u), length(D~)=, and |D~|degG(v)/γ. Assume also that v is in the knowledge-graph at the end of iteration τξ2 of Outer Loop. With probability at least 1p25(), at the end of iteration τ+ of Outer Loop, the knowledge-graph also includes a size γ/(8δ2) subset of nadir(D~).

Proof.

Let SNG(v) be the set of all vertices adjacent to v in a path of D~. The following analysis is for iteration τ+1 of Outer Loop.

For every i[ξ3], let Si be the set of all vertices of S that are in the knowledge-graph at the end of iteration i of the inner loop, and let Wi=(Si,nadir(D~),F) be a bipartite graph such that xyF if and only if xSi, ynadir(D~) and there exists a path L in D~ such that xV(L) and y=nadir(L). Also, let Y be the size of a maximum matching in Wξ3.

We show first that if Yδ/(8γ2), then with probability at least 1(δ/(8γ2))(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, the knowledge-graph includes at least δ/(8γ2) vertices from nadir(D~). Afterwards, we show that with probability at least 1eδ16γ3, we have Yδ/(8γ2), which together with the previous implies the lemma.

Suppose that Yδ/(8γ2), and let MF be a size δ/(8γ2) matching in Wξ3. For every edge xyM, where xS and ynadir(D~), the braid D~ has an r-admissible path uLv in D~ such that xV(L) and y=nadir(L). By Proposition 16 the subpath xL′′y of uLv is a chain in G^. Thus, by Lemma 22, for every edge xyM, if x is in the knowledge-graph at the end of iteration τ+1 of Outer Loop, then with probability at least 1(1γ1)ξ3, at the end of iteration τ+ of Outer Loop, the vertex y is also in the knowledge-graph. Thus, by the union bound, given that Yδ/(8γ2), with probability at least 1(δ/(8γ2))(1γ1)ξ3, the knowledge-graph includes at least δ/(8γ2) vertices of nadir(D~).

Now, we show that, with probability at least 1eδ16γ3, at the end of iteration τ+ of Outer Loop, Yδ/(8γ2). We use martingales for this purpose.

For every i[ξ3], let Xi be the identity of the neighbour of v selected in iteration i of Inner Loop, and let Wi=(Si,nadir(D~),Fi) be a bipartite graph such that Si={Xj}j=1i and xyFi if and only if xSi, ynadir(D~), x𝔾y and G^ has a chain xLy such that length(xLy).

For every i[ξ3], let Yi be the size of a maximum matching in Wi minus the size of a maximum matching in Wi1 and Y=i=1δ/(2γ)𝔼[Yi]. Since ξ3δ/(2γ), by using the linearity of expectation, we have 𝔼[Y]𝔼[Y]. Next, we show that for every i[δ/(2γ)], we have 𝔼[Yi]1/(2γ). Together, with the above, this implies that 𝔼[Y]δ/(4γ2).

Fix j[2,ξ3], and Mj1 to be a maximum matching in Wj1. Note that if Xj, is a vertex in S that is not in Wj1 and there exists a vertex ynadir(D~), such that y is not in Mj1 and G^ has a chain xLy such that length(xLy), then Wj has an edge that does not share a vertex with any edge in Mj1 and therefore Yj=1. We next lower bound the probability that this happens.

By the definition of a δ-braid, for every ynadir(D~), there are at most degG(u)/δ paths in D~. So, if we remove from D~ all the the paths in which the vertices of V(Mj1)nadir(D~) participate, then we removed from D~ at most jdegG(u)/δ(δ/2γ)degG(u)/δ=degG(u)/(2γ) edges. Thus, after the removal of these edges the resulting braid D~ includes at least degG(u)/(2γ) paths. So, the probability that a vertex selected in Inner Loop will be in one of the paths of D~ is at least 1/(2γ). Note that if this event occurs for some i[2,δ/(2γ)], then YiYi1=1. Thus, we conclude that, indeed, for every i[δ/(2γ)], we have 𝔼[Yi]1/(2γ).

We use the vertex exposure martingale to show that with sufficiently high probability the maximum matching in the random graph W:=Wξ3 is sufficiently large. We reveal the whole graph W, iterating over i=1,,δ/(2γ), and for every i[δ/(2γ)] we reveal the identity of Xi and the identities of its neighbours in W and the edges of W incident on Xi. For every t[δ/(2γ)], we let Zt=𝔼[YXt,,X1] and let Z0=𝔼[Y].

We note that for every t[δ/(2γ)], we have |Zt+1Zt|1 since Zt+1 has only one fixed vertex more than Zt. Thus, by the Azuma Hoeffding tail bound, with probability at least 1e2(δ/(8γ2))2δ/(2γ)=1eδ16γ3, we have |Zδ/(2γ)Z0|δ/(8γ2). This implies that with probability at least 1eδ16γ3, we have Yδ/(8γ2).

Lemma 26.

Let τ[ξ2r2], H a C2r+1-subgraph of G^ and DV(H) such that every subpath of H that does not include vertices from D has length at most r. If δ/(8γ2)2r2pr2, and all vertices of D were in the knowledge-graph at the end of iteration τ of Outer Loop, then with probability at least 12rmax{p22(),p24(),p25()}, at the end of iteration τ+r2 of Outer Loop, all the vertices of some C2r+1-subgraph of G^ are in the knowledge-graph.

Proof.

We shall show that if all vertices of D are in the knowledge-graph at the end of iteration τ of Outer Loop, then with probability at least 1max{p22(),p24(),p25()}, at the end of iteration τ+ of Outer Loop, the knowledge-graph also includes a vertex w such that D{w} are all vertices of some C2r+1-subgraph of G^ (not necessarily H). By the union bound, this implies the lemma.

Let wV(H)D be such that for every yV(H)D we have y𝔾v. Let u,vD be such that u>𝔾v and the path uLv in H satisfies V(L)D=.

Suppose first that w<𝔾v, by construction the path vLw of H is a chain of length at most r in G^. Thus, by Lemma 22, with probability at least 1p22(), at the end of iteration τ+ of Outer Loop, the knowledge-graph also includes w. Note that D{w}V(H).

Assume now that w>𝔾u, then by the definition of a nadir, w=nadir(uLv). Suppose also that there exists a braid B~ in G^ such that nadir(B~)=w, ends(B~)=(u,v), length(B~)r and |B~|degG(v)/δ. By Lemma 24, with probability at least 1p24(), at the end of iteration τ+ of Outer Loop, w is also in the knowledge-graph. Note that D{w}V(H).

Finally, suppose that there does not exist a braid B~ in G^ such that nadir(B~)=w, ends(B~)=(u,v), length(B~)r and |B~|degG(v)/δ. Then, by (3) of Proposition 19, there exists a δ-braid D~ such that ends(D~)=(u,v), length(D~)length(uLv) and |D~|degG(v)/γ. By Lemma 25, with probability at least 1p25(), at the end of iteration τ+ of Outer Loop, the knowledge-graph also includes a size δ/(8γ2) subset W of nadir(D~). Since we assumed that δ/(8γ2)2r2pr2, by Lemma 17, the graph G^ includes a braid D~D~ such that nadir(D~)W, |D~|2r and the only vertices that are shared between the paths of D~ are u and v. Thus, D~ has a path uLv such that length(uLv)=length(D~), nadir(uLv)W, V(L)V(H){nadir(uLv)}. Consequently, H together with uLv includes a C2r+1-subgraph H of G^, such that D{nadir(uLv)}V(H).

Theorem 27.

Given oracle access to a C2r+1-free graph, Algorithm 1 returns “ACCEPT”. Let p,r and ϵ+. There exist values for ξ1, ξ2 and ξ3, that depend only on p,r and ϵ for which the following holds: on input p,r and ϵ, oracle access to a graph G and |V(G)|, Algorithm 1 computes the values of ξ1, ξ2 and ξ3, and if G is (p,r) admissible and ϵ-far from being C2r+1-free, with probability at least 2/3, Algorithm 1 returns “REJECT”. The query complexity of Algorithm 1 depends only on p,r and ϵ.

Proof.

The first part of the claim follows from the fact that Algorithm 1 returns “REJECT” only if its knowledge-graph is not C2r+1-free. Since the knowledge-graph of Algorithm 1 is a subgraph of the input graph, then it will not return “REJECT” if the input graph is C2r+1-free.

So, suppose that G is (p,r) admissible and ϵ-far from being C2r+1-free. Let γ=16rpr/ϵ, δ=256r2pr2γ3, ξ1=16γ/(ϵp), ξ2=r2+r+1 and ξ3=12γδr. The last part of the claim follows from Lemma 20, since ξ1, ξ2 and ξ3 depend only on p,r and ϵ.

Let G^ be the graph created by Trimming with parameters γ and δ. Since γ>8rpr/ϵ, by Lemma 18, G^ is ϵ/2-far from being Cr+1-free.

Since, G^ is ϵ/2-far from being C2r+1-free, by Lemma 23, with probability at least

1r(1γ1)ξ3(1ϵp/(2γ))ξ1>19/20

(the inequality follows because ξ1=16γ/(ϵp) and ξ3=12γδr), at the end of iteration r of Outer Loop, the knowledge-graph includes a set of vertices D, such that for some C2r+1-subgraph H of G^ we have DV(H) and every subpath of H that does not include a vertex from D has length at most r.

By Lemma 26, given that a set of vertices such as D is in the knowledge-graph at the end of iteration r of Outer Loop, with probability at least

12rmax{r(1γ1)ξ3,(1δ1)ξ3+(r1)(1γ1)ξ3,r(δ/(8γ2))(1γ1)ξ3+eδ16γ3}>19/20

(the inequality follows because γ=16rpr/ϵ and ξ3=12γδr), at the end of iteration r2+r of Outer Loop, all the vertices of some C2r+1-subgraph of G^ are in the knowledge-graph.

By Lemma 17, if at the end of iteration r2+r of Outer Loop all the vertices of some C2r+1-subgraph of G^ are in the knowledge-graph, then at the end of iteration ξ2=r2+r+1 of Outer Loop, the knowledge-graph is not C2r+1-free.

The previous three observations imply that, in this case, Algorithm 1 rejects with probability at least 2/3.

7 Lower bounds for testing 𝑪𝟐𝒓+𝟏-freeness in (𝟐,𝒓𝟏)-admissible graphs for 𝒓𝟐

For this section we need some extra notation. For an ordered graph 𝔾 we write N𝔾(u){vN(u)v<𝔾u} for the before neighbourhood and N𝔾+(u){vN(u)v>𝔾u} for the after neighbourhood of a vertex u𝔾. We also use deg𝔾(u)|N𝔾(u)| and deg𝔾+(u)|N𝔾+(u)|, as well as Δ(𝔾)maxu𝔾deg𝔾(u) and Δ+(𝔾)maxu𝔾deg𝔾+(u). We omit the subscript if the context implies it.

Theorem 28.

For every integers p,r2 and sufficiently large n, every two-sided property tester for C2r+1-freeness has query complexity Ω(n1/4), on (2,r1)-admissible input graphs of size n.

The following proofs are for testing C4-freeness in (2,1)-admissible graphs. In the end, we explain how this implies the above theorem.

We prove the lower-bound theorems using Yao’s minimax principle [13], which allows us to prove lower bounds for randomized property testers in the following manner.

In the theorem for the one-sided error case, we show that there exists a distribution 𝒟 over input graphs that are 1/4-far from C4-freeness which further satisfy the following property: every deterministic one-sided property-tester for C4-freeness – when given the parameters n, oracle access to a random graph picked from 𝒟 – with probability at least 2/3 its knowledge-graph will not have a C4-subgraph after using up to n1/4/16 queries. Since a one-sided property tester can only accept if its knowledge-graph has a C4-subgraph and must reject with probability at least 2/3, the above implies that it has query complexity Ω(n1/4).

In the theorem for the two-sided error case, we design two distributions 𝒟𝒫, 𝒟𝒩 over (2,1)-admissible graphs. The support of 𝒟𝒫 is over C4-free graphs, and the support of 𝒟𝒩 graphs is over graphs that are 1/6-far being C4-free. These distributions are constructed so that any two-sided deterministic error tester that uses o(n1/4) queries, has a very low probability of computing which one of the two distributions is the origin of the input graph. If such a property tester rejects a graph from 𝒟𝒩 with probability at least 2/3, then it accepts graphs from 𝒟𝒫 with probability strictly less than 2/3.

We now define the graph Tn that is used to construct the distribution 𝒟 and later, with some modifications, 𝒟𝒫 and 𝒟𝒩. For every n, we define Tn to be the graph that consists of the following vertices and edges: V(Tn) is the union of two sets Q=[n] and W=([n]×[n]){(i,i)}i[n]. E(Tn) consists of all edges {{u,(u,v)}uQ,(u,v)W}{{u,(v,u)}uQ,(v,u)W}.

We sample a graph from 𝒟 by applying a random permutation of the names of the vertices in Q and the same for the names of the in W.

Proposition 29.

For every n, the graph Tn is (2,1)-admissible and 1/6-far from C4-freeness.

Proof.

To see that Tn is (2,1)-admissible, pick an arbitrary ordering 𝕋n of the vertices of Tn where all the vertices of W are larger than all the vertices of Q. Clearly Δ(𝕋n)2, so every vertex in Tn participates in at most two 1-admissible paths.

For every pair of vertices u,vQ, Tn has exactly one C4 that includes both vertices. We note that all these C4 subgraphs are edge disjoint and E(Tn) is the union of all their edges. So, to turn Tn into a C4-free graph, at least (n2) edges must be removed from Tn. Since (2,1)-admissible graphs have less than 2n edges, for a large enough value of n, Tn is 1/6-far from being C4-free. In order to simplify the proofs, we will make the following assumptions about the oracle: if the algorithm queries the identity of a vertex vN(u) for uQ, then it also receives the identity of v’s neighbour that is different from u (since vW it has degree two) for the price of one query. Similarly, if it queries the neighbour of a vertex vW, it receives the identity of both of v’s neighbours. Furthermore, we let the algorithm know all degrees (and thus the sets Q and W) in advance. Therefore, we can assume that the algorithm never asks for edges between vertices u,v if either u,vQ or u,vW and never uses degree queries. Thus, we may also assume that algorithm uses only adjacency queries between a vertex in w and a vertex in Q. In the case that the algorithm asks such a query between the vertices uW and vQ and that uv exists, then we also give the algorithm the identity of the second neighbour of W. Since all these modified queries provide at least as much information as queries in the original setting, the lower bound naturally applies to the latter.

Theorem 30.

For every sufficiently large integer n, every one-sided property tester for C4-freeness of adm1-graphs on n vertices has query complexity Ω(n1/4).

Proof.

Fix 𝒯 to be an arbitrary deterministic one-sided property tester for C4-freeness and suppose that it receives as input n, p=2, ϵ>0 and oracle access to a random graph generated by 𝒟.

For i[t], let pi be the probability that after i queries the knowledge-graph does not contain a C4 subgraph conditioned on the event that the same holds after i1 queries, where the probabilities are over the choice of the input graph according to 𝒟. The probability that after t queries 𝒯’s knowledge-graph does not contain a C4 is i=1tpi. Thus, to conclude the proof, we only need to show that for tn1/4/16 and every i[t], we have pi12i/n, since this implies that

i[t]pii[t](12in)>(12tn)t>(12t2n)>2/3.

Let Gi1 be the knowledge-graph after i1 queries and suppose that Gi1 does not contain a C4. Note that after a neighbour query at most 3 vertices are added to the knowledge-graph, one of degree 2 and two of degree n. The same holds for adjacency queries. Therefore, Gi1 contains at most 2(i1) vertices from the set Q and at most i1 vertices of the set W.
Observe that for every vV(Gi)W, all edges incident on v are in E(Gi). So, the only type of queries that can result in Gi not being C4-free are:

  • a neighbour query to one of the vertices in V(Gi1)Q that returns a vertex in WV(Gi1) and a vertex in V(Gi1)Q,

  • a neighbour query to a vertex in WV(Gi1) that returns two vertices in V(Gi1)Q, and

  • an adjacency query with a vertex in V(Gi1)Q and a vertex in WV(Gi1) that returns a vertex in V(Gi1)Q.

The probability that the first case does not occur is at least 12(i1)/n. The probability that the second case does not occur is at least 14(i1)2/n, which is strictly larger than 12i/n, when it. Finally, the probability that the third case occurs is at least 12(i1)/(nn), which is strictly larger than 12i/n, when it. We now define the graphs TPn and TNn and the distributions 𝒟𝒫 and 𝒟𝒩. TNn consists of two disjoint copies of the graph Tn, and we distinguish the vertices of the copy with a prime (e.g. i and W). We construct TPn by first setting TPn=TNn and then for every distinct i,j[n], we remove from TPn the edge {i,(i,j)} and {i,(i,j)} and add the edges {i,(i,j)} and {i,(i,j)}. We sample a graph from 𝒟𝒫 by applying a random permutation of the names of the degree-n vertices and doing the same for the names of the degree-two vertices. The distribution 𝒟𝒩 is defined in the same way.

Note that here we also set Q to be the set of all degree n vertices and W to be the rest of the vertices. for a two sided-error algorithm we make the same assumptions on the queries as we did for a one-sided error algorithm.

Proposition 31.

For every n, the graphs TNn and TPn are (2,1)-admissible. Moreover, TNn is 1/6 far from C4 freeness, and TPn is C4 free.

Proof.

The proof of the first part of the claim follows from Proposition 29. For the second part, note that in Tn all the C4 subgraphs are of the form (i,(i,j),j,(j,i),i). Thus, when we constructed TPn from TNn, we “disconnected” all the C4s that were in TNn, and we did not introduce any new C4s.

Theorem 32.

For every sufficiently large integer n, every two-sided property tester for C4-freeness of adm1-bounded graphs on n vertices has query complexity Ω(n1/4).

Proof.

Fix 𝒯 to be an arbitrary deterministic two-sided property tester for C4-freeness and suppose that it receives as input n, p=2, ϵ>0 and oracle access to a random graph generated by selecting u.a.r. one of the distributions 𝒟𝒫 and 𝒟𝒩 and then selecting a graph according to the chosen distribution.

Next, we show that, given that the input is selected from only one of the two distributions, with probability at least 9/10 the answers that are vertices of degree n were drawn one by one u.a.r. without returns (after a vertex is drawn, it cannot be drawn again) from the vertices of degree n. The same applies to vertices of degree 2.

This means that 𝒯, with probability 9/10, will behave exactly the same way, regardless of whether the input was drawn from 𝒟𝒫 or 𝒟𝒩. So, if given an input that is drawn from 𝒟𝒩, 𝒯 rejects, with probability at least 2/3, then with probability at least 2/31/10 it rejects an input that is drawn from 𝒟𝒫, that is, it accepts such an input with probability strictly less than 2/3. This is a contradiction to 𝒯 being a property tester, since it must reject an input drawn from 𝒟𝒩 with probability at least 2/3 and must accept an input drawn from 𝒟𝒫 with probability at least 2/3.

We note that to prove this, we only need to show with probability at least 9/10, every query 𝒯 uses returns only vertices that are not already in its knowledge-graph. The proof follows the exact same lines as in the one-sided case. Fix r to be an integer larger than or equal to 2. To get the one-sided error lower bounds for C2r freeness and C2r freeness (2,r1)-admissible for graphs work by creating a new graph Tn from the graph Tn as follows:

  • For C2r, for every vertex (x,y)W, where W is as in the construction of Tn, replace the edge between (x,y) and min{x,y} with a path of length r1.

  • For C2r+1, for every vertex (x,y)W, where W is as in the construction of Tn, if x<y replace the edge {x,(x,y)} with a path of length r and otherwise replace the edge {y,(x,y)} with a path of length r1.

Note that the graph created according to the first item is (2,r1)-admissible and has cycles of length 2r where the graph Tn had cycles of length 4. The graph created according to the second item is also (2,r1)-admissible and has cycles of length 2r+1 where the graph Tn had cycles of length 4.

To obtain the lower two-sided error lower bound for C2r freeness and C2r+1 freeness of (2,r1) admissible graphs, apply the same changes used for originally constructing these graphs, to construct the new TNn and TPn from Tn with the following change: treat the paths that were added to Tn as if they were edges. Thus, Theorem 28 holds.

References

  • [1] Noga Alon, Tali Kaufman, Michael Krivelevich, and Dana Ron. Testing triangle-freeness in general graphs. SIAM Journal on Discrete Mathematics, 22(2):786–819, 2008. doi:10.1137/07067917X.
  • [2] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209–223, 1997. doi:10.1007/BF02523189.
  • [3] Christine Awofeso, Patrick Greaves, Oded Lachish, and Felix Reidl. H-freeness testing in graphs of bounded r-admissibility. In 42nd International Symposium on Theoretical Aspects of Computer Science, STACS 2025, March 4–7, 2025, Jena, Germany, volume 327 of LIPIcs. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025.
  • [4] Artur Czumaj and Christian Sohler. A characterization of graph properties testable for general planar graphs with one-sided error (it’s all about forbidden subgraphs). In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 1525–1548. IEEE, 2019. doi:10.1109/FOCS.2019.00089.
  • [5] Mina Dalirrooyfard, Thuy Duong Vuong, and Virginia Vassilevska Williams. Graph pattern detection: Hardness for all induced patterns and faster non-induced cycles. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 1167–1178, 2019. doi:10.1145/3313276.3316329.
  • [6] Zdeněk Dvořák. Constant-factor approximation of the domination number in sparse graphs. European Journal of Combinatorics, 34(5):833–840, July 2013. doi:10.1016/j.ejc.2012.12.004.
  • [7] Talya Eden, Reut Levi, and Dana Ron. Testing c_k-freeness in bounded-arboricity graphs. In Karl Bringmann, Martin Grohe, Gabriele Puppis, and Ola Svensson, editors, 51st International Colloquium on Automata, Languages, and Programming, ICALP 2024, July 8-12, 2024, Tallinn, Estonia, volume 297 of LIPIcs, pages 60:1–60:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.ICALP.2024.60.
  • [8] Oded Goldreich and Dana Ron. Property testing in bounded degree graphs. In Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, pages 406–415, 1997. doi:10.1145/258533.258627.
  • [9] Donald B Johnson. Finding all the elementary circuits of a directed graph. SIAM Journal on Computing, 4(1):77–84, 1975. doi:10.1137/0204007.
  • [10] Reut Levi. Testing triangle freeness in the general model in graphs with arboricity O(n). In 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12-16, 2021, Glasgow, Scotland (Virtual Conference), volume 198 of LIPIcs, pages 93:1–93:13. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.ICALP.2021.93.
  • [11] Jaroslav Nešetřil and Patrice Ossona de Mendez. Sparsity: Graphs, Structures, and Algorithms, volume 28 of Algorithms and Combinatorics. Springer, 2012. doi:10.1007/978-3-642-27875-4.
  • [12] Virginia Vassilevska and Ryan Williams. Finding, minimizing, and counting weighted subgraphs. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 455–464, 2009. doi:10.1145/1536414.1536477.
  • [13] Andrew Chi-Chin Yao. Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symposium on Foundations of Computer Science (sfcs 1977), pages 222–227. IEEE Computer Society, 1977.
  • [14] Xuding Zhu. Colouring graphs with bounded generalized colouring number. Discrete Mathematics, 309(18):5562–5568, 2009. doi:10.1016/J.DISC.2008.03.024.