Finding Cliques in Social Networks: A New Distribution-Free Model

We propose a new distribution-free model of social networks. Our definitions are motivated by one of the most universal signatures of social networks, triadic closure: the property that pairs of vertices with common neighbors tend to be adjacent. Our most basic definition is that of a "$c$-closed" graph, where for every pair of vertices $u,v$ with at least $c$ common neighbors, $u$ and $v$ are adjacent. We study the classic problem of enumerating all maximal cliques, an important task in social network analysis. We prove that this problem is fixed-parameter tractable with respect to $c$ on $c$-closed graphs. Our results carry over to "weakly $c$-closed graphs", which only require a vertex deletion ordering that avoids pairs of non-adjacent vertices with $c$ common neighbors. Numerical experiments show that well-studied social networks tend to be weakly $c$-closed for modest values of $c$.


Introduction
There has been an enormous amount of important work over the past 15 years on models for capturing the special structure of social networks. This literature is almost entirely driven by the quest for generative (i.e., probabilistic) models. Well-known examples of such models include preferential attachment [7], the copying model [37], Kronecker graphs [13,38], and the Chung-Lu random graph model [14,15]. There is little consensus about which generative model is the "right" one. For example, already in 2006, the survey by Chakrabarti and Faloutsos [12] compares 23 different probabilistic models of social networks, and multiple new such models are proposed every year.
Generative models articulate a hypothesis about what "real-world" social networks look like, how they are created, and how they will evolve in the future. They are directly useful for generating synthetic data and can also be used as a proxy to study the effect of random processes on a network [4,41,43]. However, the plethora of models presents a quandary for the design of algorithms for social networks with rigorous guarantees: which of these models should one tailor an algorithm to? One idea is to seek algorithms that are tailored to none of them, and to instead assume only deterministic combinatorial conditions that share the spirit of the prevailing generative models. This is the approach taken in this paper.
There is empirical evidence that many NP-hard optimization problems are often easier to solve in social networks than in worst-case graphs. For example, lightweight heuristics are unreasonably effective in practice for finding the maximum clique of a social network [53]. Similar success stories have been repeatedly reported for the problem of recovering dense subgraphs or communities in social networks [61,55,42,60]. To define our notion of "social-network-like" graphs, we turn to one of the most widely agreed-upon properties of social networks: triadic closure, the property that when two members of a social network have a friend in common, they are likely to be friends themselves.

Properties of social networks
There is wide consensus that social networks have relatively predictable structure and features, and accordingly are not well modeled by arbitrary graphs. From a structural viewpoint, the most well studied and empirically validated statistical properties of social networks include heavy-tailed degree distributions [7,11,23], a high density of triangles [65,54,64] and other dense subgraphs or "communities" [26,31,47,48,40], low diameter and the small world property [34,35,36,46], and triadic closure [54,64,57].
For the problem of finding cliques in networks, it does not help to assume that the graph has small diameter (every network can be rendered small-diameter by adding one extra vertex connected to all other vertices). Similarly, merely assuming a power-law degree distribution does not seem to make the clique problem easier [24]. On the other hand, as we show, the clique problem is tractable on graphs with strong triadic closure properties.

Our model: c-closed graphs
Motivated by the empirical evidence for triadic closure in social networks, we define the class of c-closed graphs. Figure 1 shows the triadic closure of the network of email communications at Enron [1], and other social networks have been shown to behave similarly [8]. In particular, the more common neighbors two vertices have, the more likely they are to be adjacent to each other. The definition of c-closed graphs is a coarse version of this property: we assert that every pair of vertices with c or more common neighbors must be adjacent to each other.
Figure 1: Triadic closure properties of the Enron email graph (36K nodes and 183K edges). Nodes of this network are Enron employees, and there is an edge connecting two employees if one sent at least one email to the other. Given an x value, the y-axis shows the cumulative closure rate: the fraction of pairs of nodes with at least x common neighbors that are themselves connected by an edge.
Definition 1.1 (c-closed graph). For a positive integer c, an undirected graph G = (V, E) is c-closed if, whenever two distinct vertices u, v ∈ V have at least c common neighbors, (u, v) is an edge of G.
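The cumulative closure rate plotted in Figure 1 can be computed directly from its definition. The sketch below is only an illustration: the adjacency-set representation, the function name `cumulative_closure_rate`, and the toy graph are assumptions and are not taken from the paper or from the Enron data.

```python
from itertools import combinations

def cumulative_closure_rate(adj, x):
    """Fraction of vertex pairs with >= x common neighbors that are adjacent.

    `adj` maps each vertex to the set of its neighbors (simple, undirected).
    Returns None when no pair has at least x common neighbors.
    """
    qualifying = 0
    closed = 0
    for u, v in combinations(adj, 2):
        if len(adj[u] & adj[v]) >= x:
            qualifying += 1
            if v in adj[u]:
                closed += 1
    return closed / qualifying if qualifying else None

# Hypothetical toy graph: a triangle 0-1-2 with a pendant path 2-3-4.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
rate = cumulative_closure_rate(toy, 1)
```

In this toy graph six pairs share at least one neighbor and three of them are edges. A c-closed graph is one where the rate equals 1 for every x ≥ c at which it is defined.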
The parameter c interpolates between disjoint unions of cliques (when c = 1) and all graphs (when c = |V| − 1). The class of 2-closed graphs is already non-trivial. These are exactly the graphs that contain neither $K_{2,2}$ nor a diamond ($K_4$ minus an edge) as an induced subgraph. For example, graphs with girth at least 5, e.g. constant-degree expanders, are 2-closed. For every c, membership in the class of c-closed graphs can be checked by squaring the adjacency matrix in $O(n^{\omega})$ time, where ω < 2.373 is the matrix multiplication exponent.
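The membership test via the squared adjacency matrix can be sketched as follows. This is a minimal illustration that uses naive cubic-time multiplication rather than fast matrix multiplication; the helper names `common_neighbor_counts`, `is_c_closed`, and `cycle_matrix` are assumptions introduced here.

```python
def common_neighbor_counts(a):
    """Return A^2 for a 0/1 adjacency matrix A given as a list of lists.

    Entry (i, j) of A^2 is the number of common neighbors of i and j.
    """
    n = len(a)
    return [[sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_c_closed(a, c):
    """True if every non-adjacent pair has fewer than c common neighbors."""
    sq = common_neighbor_counts(a)
    n = len(a)
    return all(a[i][j] == 1 or sq[i][j] < c
               for i in range(n) for j in range(i + 1, n))

def cycle_matrix(n):
    """Adjacency matrix of the cycle on n vertices (illustrative helper)."""
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        a[i][j] = a[j][i] = 1
    return a
```

For instance, the 4-cycle (which is $K_{2,2}$) is 3-closed but not 2-closed, while the 5-cycle has girth 5 and is therefore 2-closed.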
While the definition of c-closed captures important aspects of triadic closure, it is fragile in the sense that a single pair of non-adjacent vertices with many common neighbors prevents the graph from being c-closed for a low value of c. To address this, we define the more robust notion of weakly c-closed graphs and show that our results carry over to these graphs. Well-studied social networks with thousands of vertices are typically weakly c-closed for modest values of c (see Table 1).
Definition 1.2. Given a graph and a value of c, a bad pair is a non-adjacent pair of vertices with at least c common neighbors.
Definition 1.3 (Weakly c-closed graph). A graph is weakly c-closed if there exists an ordering of the vertices {v_1, v_2, ..., v_n} such that for all i, v_i is in no bad pairs in the graph induced by {v_i, v_{i+1}, ..., v_n}.
A graph can be c-closed only for large c but weakly c-closed for much smaller c. Consider the graph G that is a clique of size k with one edge (u, v) missing. G is not c-closed for any c < k − 2.
The only bad pair in G is (u, v). The vertex ordering that places u and v at the end demonstrates that G is weakly 1-closed. Also, the properties of being c-closed and weakly c-closed are hereditary, meaning that they are closed under taking induced subgraphs. We will use this basic fact often.
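The clique-minus-an-edge example can be checked mechanically against Definition 1.3. The following sketch verifies one given ordering; the function names and the choice of removing the edge (0, 1) are assumptions made for illustration.

```python
def is_valid_ordering(adj, order, c):
    """Check that each v_i is in no bad pair of the graph induced by v_i..v_n."""
    remaining = set(order)
    for v in order:
        for u in remaining:
            if u == v or u in adj[v]:
                continue
            # u, v are non-adjacent: count common neighbors still present.
            if len(adj[u] & adj[v] & remaining) >= c:
                return False
        remaining.discard(v)
    return True

def clique_minus_edge(k):
    """Clique on vertices 0..k-1 with the single edge (0, 1) removed."""
    adj = {i: set(range(k)) - {i} for i in range(k)}
    adj[0].discard(1)
    adj[1].discard(0)
    return adj

g = clique_minus_edge(6)
good_order = [2, 3, 4, 5, 0, 1]   # the missing edge's endpoints come last
bad_order = [0, 1, 2, 3, 4, 5]    # vertex 0 starts inside the bad pair
```

With c = 1 the first ordering is valid and the second is not, which matches the observation that placing u and v at the end certifies weak 1-closure.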

Our contributions
One can study a number of computational problems on c-closed (and weakly c-closed) graphs. We focus on the problem of enumerating all maximal cliques, an important problem in social network analysis [16,59,18,21,58]. We study fixed-parameter tractability with respect to c. There is a rich literature on fixed-parameter tractability for other graph parameters including treewidth, arboricity, and the size of the output [17].
In a graph G, a clique is a subgraph of G in which there is an edge between every pair of vertices. A maximal clique in G is a clique that cannot be made any larger by the addition of some other vertex in G. In any graph, all maximal cliques can be listed in O(mn) time per maximal clique [62]. We focus on the following two problems:
1. Determining the maximum possible number of maximal cliques in a c-closed graph on n vertices.
2. Finding algorithms to enumerate all maximal cliques in c-closed graphs (that run faster than O(mn) time per maximal clique).
Our main result is that for constant c the number of maximal cliques in a c-closed graph on n vertices is $O(n^{2-2^{1-c}})$. More specifically, we prove the following bound.
Theorem 1.4. Any c-closed graph on n vertices has at most $\min\{3^{(c-1)/3} n^2,\ 4^{(c+4)(c-1)/2} n^{2-2^{1-c}}\}$ maximal cliques.
The proof of the first bound listed in Theorem 1.4 extends to weakly c-closed graphs, giving the following result.
Theorem 1.5. Any weakly c-closed graph on n vertices has at most $3^{(c-1)/3} n^2$ maximal cliques.
In Appendix B, we give experimental results showing that well-studied social networks are weakly c-closed for modest values of c. Note that Theorem 1.5 is exponential in the even smaller value of (c − 1)/3.
Since in any graph all maximal cliques can be listed in O(mn) time per maximal clique, Theorem 1.4 proves that listing all maximal cliques in a c-closed graph is fixed-parameter tractable (i.e. has running time $f(c) n^{\alpha}$ for constant α). We give an algorithm for listing all maximal cliques in a c-closed graph that runs faster than applying the O(mn)-per-clique algorithm as a black box. Our algorithm follows naturally from the proof of Theorem 1.4 and gives the following theorem, where p(n, c) denotes the time to list all wedges (induced 2-paths) in a c-closed graph on n vertices. A result of Gąsieniec, Kowaluk, and Lingas [30] implies that $p(n, c) = O(n^{2+o(1)} c + c^{(3-\omega-\alpha)/(1-\alpha)} n^{\omega} + n^{\omega} \log n)$, where ω is the matrix multiplication exponent and α > 0.29.
Non-trivial lower bounds for the number of maximal cliques in a c-closed graph were previously known only for extreme values of c. A 2-closed graph can have $n^{3/2}$ maximal cliques [22]. The classic Moon-Moser graph (with additional isolated vertices) is (n − 2)-closed and has $3^{n/3}$ maximal cliques [45]. This graph consists of the complete multipartite graph with n/3 parts of size 3, and possibly additional isolated vertices. By taking a disjoint union of n/(c + 2) Moon-Moser graphs on c + 2 vertices, we can construct a c-closed graph on n vertices with $\Omega(c^{-1} 3^{c/3} n)$ maximal cliques for all n ≥ c. We give improved lower bounds for intermediate values of c. It is an open problem to determine the exact exponent of n (between 3/2 and $2 - 2^{1-c}$) in the expression for the maximum number of maximal cliques in a c-closed graph.
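The count for the disjoint union of Moon-Moser graphs can be verified with a short calculation. The sketch below assumes, purely for simplicity, that c + 2 is a multiple of 3 and that n is a multiple of c + 2; neither assumption is stated in the text.

```latex
% Each copy is a complete multipartite graph with (c+2)/3 parts of size 3.
% Two non-adjacent vertices lie in the same part, so they have exactly
% (c+2) - 3 = c - 1 < c common neighbors; hence each copy is c-closed.
% Vertices in different copies have no common neighbors, so the disjoint
% union is c-closed as well, and its maximal cliques are those of the copies.
\[
  \#\{\text{maximal cliques}\}
  \;=\; \frac{n}{c+2}\cdot 3^{(c+2)/3}
  \;=\; \frac{3^{2/3}}{c+2}\, 3^{c/3}\, n
  \;=\; \Omega\!\left(c^{-1}\, 3^{c/3}\, n\right).
\]
```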

Related work
There are only a few algorithmic results for graph classes motivated by social networks. Although a number of NP-hard problems remain NP-hard on graphs with a power-law degree distribution [25], several problems in P have been shown to be easier on such graphs. Brach, Cygan, Lacki, and Sankowski [10] give faster algorithms for transitive closure, maximum matching, determinant, PageRank and matrix inverse. Borassi, Crescenzi, and Trevisan [9] assume several axioms satisfied by real-world graphs, one being a power-law degree distribution, and give faster algorithms for diameter, radius, distance oracles, and computing the most "central" vertices. Motivated by triadic closure, Gupta, Roughgarden, and Seshadhri [32] define triangle-dense graphs and prove relevant structural results. Intuitively, they prove that if a constant fraction of two-hop paths are closed into triangles, then the graph must contain many dense clusters.
For general graphs, Moon and Moser prove that the maximum possible number of maximal cliques in a graph on n vertices is $3^{n/3}$ (realized by a complete n/3-partite graph) [45]. Tomita, Tanaka, and Takahashi prove that the time to generate all maximal cliques in any n-vertex graph is also $O(3^{n/3})$ [59].
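The Moon-Moser bound is easy to observe on a small instance. The sketch below uses a textbook Bron-Kerbosch enumeration with pivoting, which is in the same family as the algorithm of [59] but is not claimed to reproduce that paper's exact procedure or running time; all names are illustrative assumptions.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch and pivoting.

    `adj` maps each vertex to its neighbor set. Returns a list of frozensets.
    """
    out = []

    def expand(r, p, x):
        if not p and not x:
            out.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda w: len(adj[w] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return out

# Complete 3-partite graph with parts {0,1,2}, {3,4,5}, {6,7,8}.
moon_moser = {v: {w for w in range(9) if w // 3 != v // 3} for v in range(9)}
cliques = maximal_cliques(moon_moser)
```

Here n = 9, and each maximal clique picks one vertex from each of the three parts, giving $3^{9/3} = 27$ maximal cliques.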
The clique problem has been studied on 2-closed graphs (under a different name). Eschen, Hoang, Spinrad, and Sritharan [22] show that the maximum number of maximal cliques in a 2-closed graph is $O(n^{3/2})$. They also show a matching lower bound via a projective planes construction. Suppose $n = p^2 + p + 1$ for a positive integer p and consider a finite projective plane on n points (and hence with n lines, see e.g. [3]). Let G denote the bipartite graph representing the point-line incidence matrix. The defining properties of finite projective planes imply that no two vertices have two common neighbors, so the 2-closed condition is vacuously satisfied. Every vertex of G has degree p + 1, so the graph has $\Theta(n^{3/2})$ edges, each a maximal clique.
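The smallest instance of this construction is the Fano plane (p = 2, n = 7). The sketch below builds its point-line incidence graph and checks the common-neighbor property; the particular labeling of the seven lines and the helper names are assumptions chosen for illustration.

```python
from itertools import combinations

# One standard labeling of the 7 lines of the Fano plane on points 0..6.
FANO_LINES = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
              (1, 4, 6), (2, 3, 6), (2, 4, 5)]

def incidence_graph(lines, num_points):
    """Bipartite graph with a vertex per point and per line."""
    adj = {("p", i): set() for i in range(num_points)}
    for j, line in enumerate(lines):
        adj[("l", j)] = set()
        for i in line:
            adj[("p", i)].add(("l", j))
            adj[("l", j)].add(("p", i))
    return adj

G = incidence_graph(FANO_LINES, 7)
# Largest number of common neighbors over all pairs of distinct vertices.
max_common = max(len(G[u] & G[v]) for u, v in combinations(G, 2))
num_edges = sum(len(nbrs) for nbrs in G.values()) // 2
```

No pair of vertices has two common neighbors, so the graph is 2-closed, and since it is bipartite each of its n(p + 1) = 21 edges is a maximal clique.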
The clique problem has also been studied on other special classes of graphs such as graphs embeddable on a surface [19] and graphs of bounded degeneracy [20]. Degeneracy is a measure of everywhere sparsity. More formally, the degeneracy of a graph G is the smallest value d such that every nonempty subgraph of G contains a vertex of degree at most d. Eppstein et al. show that the maximum number of maximal cliques in a graph of degeneracy d is $O(n 3^{d/3})$. The degeneracy of a graph, however, can be much larger than its c-closure. For example, the degeneracy of a graph is at least one less than the size of a maximum clique, while even in 1-closed graphs, the size of the maximum clique can be arbitrarily large.
Clique counting is a classical problem in extremal combinatorics. One fundamental question is to count the minimum number of cliques in graphs with a fixed number of edges, i.e. to show that graphs with few cliques must have few edges. This simple question turns out to be a complex problem, and is settled for $K_3$ by Razborov [51] by flag algebra, $K_4$ by Nikiforov [49] by a combination of combinatorics and analytical arguments, and all $K_t$ by Reiher [52] by generalizing the argument of flag algebra to all sizes of cliques.
There has also been a long line of work in combinatorics on counting (not necessarily maximal) cliques in graphs with certain excluded subgraphs, subdivisions, or minors. Most recently, Fox and Wei give an asymptotically tight bound on the maximum number of cliques in graphs with forbidden minors [28], and an upper bound on the maximum number of cliques in graphs with forbidden subdivisions or immersions [27].
Many problems in combinatorics can be phrased as counting the number of cliques or independent sets in a (hyper)graph. For example, the problems of finding the volume of the metric polytope and counting the number of n-vertex H-free graphs (for any fixed subgraph H) can be translated into clique counting problems. The recently developed "container method" [6,56] is a powerful tool to bound the number of cliques in (hyper)graphs and can be used to tackle a great range of problems.

Organization
In Section 2 we prove the first bound listed in Theorem 1.4, state Theorem 1.5, and introduce the proof of Theorem 1.6. In Section 3 we prove the second bound listed in Theorem 1.4 (which has improved dependence on n). In Section 4 we prove Theorem 1.7.
In Appendix A we give the full proof of Theorem 1.6. In Appendix B we further discuss weakly c-closed graphs including relevant experimental results. In Appendix C, we give generalizations of c-closed and weakly c-closed graphs and extensions of and Theorem 1.4. In Appendix D we give a preliminary result regarding the number of maximal cliques in c-closed $K_k$-free graphs.

Notation
All graphs G = (V, E) are simple, undirected and unweighted. For any v ∈ V, let N(v) denote the neighborhood of v. When the current graph is ambiguous, $N_G(v)$ will denote the neighborhood of v in G. For any S ⊆ V, let G[S] denote the subgraph of G induced by S.

Bound on number of maximal cliques
In this section, we prove the following bound on the number of maximal cliques in a c-closed graph and show that this bound carries over to weakly c-closed graphs. Let F(n, c) denote the maximum possible number of maximal cliques in a c-closed graph on n vertices. The following theorem uses a natural peeling process to obtain an initial upper bound on the number of maximal cliques. A more involved analysis (Theorem 3.1), which gives a tighter upper bound, is deferred to Section 3.
Theorem 2.1. For all positive integers c and n, $F(n, c) \le 3^{(c-1)/3} n^2$.
Proof. Let G be a c-closed graph on n vertices and let v ∈ V(G) be an arbitrary vertex. Every maximal clique K ⊆ G is of one of the following types:
1. K does not contain v.
2. K contains v, and K \ {v} is a maximal clique in G \ {v}.
3. K contains v, and K \ {v} is not a maximal clique in G \ {v}.
Bounding the number of maximal cliques of type 1 and 2 is straightforward because every such clique can be obtained by starting with a clique maximal in G \ {v} and extending it to include vertex v if possible. Therefore, the number of maximal cliques of types 1 and 2 combined is at most F(n − 1, c).
Type 3 cliques are maximal in N(v), but not in G \ {v}. We will prove that the number of maximal cliques of type 3 is at most $3^{(c-1)/3} n$, crucially using the c-closed property. Figure 2 shows a maximal clique K of type 3.
We claim that each type 3 maximal clique K satisfies the following three properties.
, then K can also be extended to include w which contradicts the fact that K is maximal. To see property C, note that since K \ {v} is not a maximal clique in G \ {v} we can extend the clique K \ {v} to include some vertex in G \ {v}. By property B, we can extend K \ {v} to include some vertex u not in N (v).
Let u be as in property C. Then K \ {v} must be a maximal clique in G[N (v) ∩ N (u)] because otherwise we could extend K \{v} to some other vertex in N (v)∩N (u), which contradicts property B.
Thus, the number of type 3 maximal cliques is at most the sum, over all vertices u ∉ N(v) ∪ {v}, of the number of maximal cliques in G[N(v) ∩ N(u)]. Since G is c-closed and u, v are non-adjacent, |N(v) ∩ N(u)| ≤ c − 1. Then since any k-vertex graph has at most $3^{k/3}$ maximal cliques [45], each term of this sum is at most $3^{(c-1)/3}$. Thus, the number of type 3 maximal cliques in G is at most $3^{(c-1)/3} n$. Counting all three types of maximal cliques, we have the following recursive inequality: $F(n, c) \le F(n-1, c) + 3^{(c-1)/3} n$. By induction on n with the base case F(1, c) = 1, this gives $F(n, c) \le 3^{(c-1)/3} n^2$.
Note that v was chosen arbitrarily and the proof is valid as long as |N(u) ∩ N(v)| < c for all vertices u ∉ N(v) ∪ {v}. Thus, in each recursive level, we only require the existence of a vertex v in no bad pairs. Equivalently, it suffices to have an ordering of the vertices {v_1, v_2, ..., v_n} such that for all i, v_i is in no bad pairs in the graph induced by {v_i, v_{i+1}, ..., v_n}. This is exactly the definition of a weakly c-closed graph. Thus, we get the following theorem.
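For completeness, the induction in the proof above can be written out by unrolling the recurrence, with types 1 and 2 contributing at most F(n − 1, c) and type 3 contributing at most $3^{(c-1)/3} n$. The derivation below is a sketch that uses only the base case F(1, c) = 1 and the fact that c ≥ 1.

```latex
\begin{align*}
F(n,c) &\le F(n-1,c) + 3^{(c-1)/3}\, n \\
       &\le F(1,c) + 3^{(c-1)/3} \sum_{i=2}^{n} i \\
       &\le 3^{(c-1)/3} \sum_{i=1}^{n} i
          && \text{since } F(1,c) = 1 \le 3^{(c-1)/3} \\
       &=   3^{(c-1)/3}\, \frac{n(n+1)}{2}
        \;\le\; 3^{(c-1)/3}\, n^2 .
\end{align*}
```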

Algorithm to generate all maximal cliques
Recall that p(n, c) denotes the time to list all wedges (induced 2-paths) in a c-closed graph on n vertices. A result of Gąsieniec, Kowaluk, and Lingas [30] implies that $p(n, c) = O(n^{2+o(1)} c + c^{(3-\omega-\alpha)/(1-\alpha)} n^{\omega} + n^{\omega} \log n)$, where ω is the matrix multiplication exponent and α > 0.29. The algorithm follows naturally from the proof of Theorem 2.1 with two additional ingredients:
• A preprocessing step to enumerate all wedges in the graph speeds up the later process of finding the intersection of the neighborhoods of two vertices (i.e. N(u) ∩ N(v) from the proof of Theorem 2.1).
• An algorithm of Tomita, Tanaka, and Takahashi [59] generates all maximal cliques in any n-vertex graph in time $O(3^{n/3})$. We apply this to the recursive calls on the small induced subgraphs G[N(u) ∩ N(v)], which have fewer than c vertices, that arise in handling the type 3 cliques in the proof of Theorem 2.1.
We defer the full algorithm description and runtime analysis to Appendix A.
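The peeling argument in the proof of Theorem 2.1 translates into a simple recursive procedure. The sketch below is an illustration only and is not Algorithm 1: it computes N(u) ∩ N(v) by set intersection instead of a wedge-listing preprocessing step, uses a plain Bron-Kerbosch routine without pivoting on the small subgraphs, and removes non-maximal candidates with a brute-force check at every level. All function names are assumptions.

```python
def bron_kerbosch(adj, vertices):
    """All maximal cliques of the subgraph induced by `vertices`.

    By convention the empty vertex set yields the single empty clique.
    """
    out = []

    def expand(r, p, x):
        if not p and not x:
            out.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(vertices), set())
    return out

def is_maximal(adj, vertices, k):
    """True if no vertex of `vertices` outside k is adjacent to all of k."""
    return not any(k <= adj[w] for w in vertices - k)

def peel_cliques(adj, vertices=None):
    """Maximal cliques of G[vertices], found by peeling one vertex at a time."""
    vertices = set(adj) if vertices is None else set(vertices)
    if not vertices:
        return {frozenset()}
    v = min(vertices)            # any vertex works; min is an arbitrary choice
    rest = vertices - {v}
    nv = adj[v] & rest
    candidates = set()
    # Types 1 and 2: cliques maximal in G \ {v}, extended by v when possible.
    for k in peel_cliques(adj, rest):
        candidates.add(k | {v} if k <= nv else k)
    # Type 3: v plus a maximal clique of G[N(v) & N(u)] for a non-neighbor u.
    for u in rest - nv:
        for k in bron_kerbosch(adj, nv & adj[u]):
            candidates.add(k | {v})
    return {k for k in candidates if is_maximal(adj, vertices, k)}

# Hypothetical toy graph: a triangle 0-1-2 with a pendant path 2-3-4.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
```

The classification into types 1 to 3 does not itself depend on c. The c-closed property is what keeps each subgraph G[N(u) ∩ N(v)] small (fewer than c vertices), which is where the running-time bound comes from.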

Improved Bound
Recall that F (n, c) is the maximum number of maximal cliques in a c-closed graph on n vertices. The structure of the proof is similar to that of Theorem 2.1. We get an improved bound by a separate analysis depending on whether G has a vertex of "high" degree. This idea appears in the result of Eschen et al. [22], who prove the result for the c = 2 case.
We will require the following simple lemma.  , v is in at most F(∆(G), c − 1) maximal cliques. It remains to bound the number of maximal cliques that contain some vertex in N(v) but not v itself. Such a clique must contain some vertex u ∈ V \ (N(v) ∪ {v}) (otherwise, it would not be maximal). Let $\mathcal{K}$ be the set of such cliques. We will bound $|\mathcal{K}|$ by grouping the maximal cliques K in $\mathcal{K}$ based on which vertices of N(v) are in K. For nonempty S ⊆ N(v), let N(S) denote $\bigcap_{u \in S} N(u)$. Also, let $N_2(v)$ denote the set of vertices of distance exactly 2 from v. Let us bound the number of cliques K ∈ $\mathcal{K}$ such that K ∩ N(v) = S. The other vertices in K must be in $N(S) \cap N_2(v)$. By Lemma 3.2, For all u ∈ $N_2(v)$, since u and v are not adjacent, We want to determine for all S the value of $|N(S) \cap N_2(v)|$ that maximizes the upper bound for $|\mathcal{K}|$ in Inequality (2) subject to the constraint in Inequality (3). Later, we prove our bound on F (from the theorem statement) by induction on n and c. In fact, we show that F(n, c) is bounded by $F_0(n, c) = 4^{(c+4)(c-1)/2} n^{2-2^{1-c}}$, the desired upper bound for F(n, c) that we are trying to prove by induction. Since $F_0(n, c)$ is convex in n, by the inductive hypothesis we can apply Jensen's inequality on Inequality (2). Jensen's inequality implies that the upper bound on $|\mathcal{K}|$ is maximized by setting $|N(S) \cap N_2(v)|$ to be as large as possible (note that it cannot exceed ∆(G)) for as many S as possible until the bound in Inequality (3) is met and setting the rest to be 0. By Inequality (3), the number of non-zero terms Thus, we have the following continuation of Inequality (2).
Recall |K| is the number of maximal cliques that contain some vertex in N (v) but not v itself, so we combine Inequality (4) with the observation (from the beginning of case 2) that v is in at most F (∆(G), c − 1) maximal cliques to conclude that the number of maximal cliques containing at least one vertex in N (v) ∪ {v} is at most Then, recursing on G \ (N (v) ∪ {v}), we have: Combining the low and high degree bounds on F (n, c), we get the following recurrence.
Proof. For any y ∈ (0, 1), $1 - y \le e^{-y}$ and $e^{-y} \le 1 - y/2$. Thus, Applying the claim with x = ∆(G)/n and $k = 2 - 2^{1-c}$, it suffices to show that or equivalently that Simplifying the left-hand side of the above inequality and using the fact that c ≥ 1: The last inequality holds because $\Delta \ge n^{1/2}$.
Like the proof of the initial bound (Theorem 2.1), the proof of the improved bound (Theorem 3.1) also suggests an algorithm for generating the set of maximal cliques involving the preprocessing step of listing the set of all wedges in the graph. However, this algorithm is not asymptotically faster than Algorithm 1 since its dependence on n still includes p(n, c), and we omit it.
Construction. We suppose that c is even and n is a multiple of c. We can do this with only an absolute constant factor loss in the bound, which is allowable. We start with a graph H on v = 2n/c vertices with girth 5 and the maximum possible number of edges, which is $\Omega(v^{3/2})$ [29].

Lower bound
We construct our c-closed graph G on n vertices from H in the following way. For each vertex x ∈ V(H), we replace it with a vertex set $U_x$ with c/2 vertices. Therefore, there are |V(H)| · c/2 = n vertices in G. The adjacency relation of G is as follows.
• Add all edges within each $U_x$ so that $U_x$ is a clique for all x ∈ V(H).
• For any edge (x, y) of H, we place edges between the vertex sets $U_x$ and $U_y$ such that the bipartite graph between $U_x$ and $U_y$ consists of a complete bipartite graph minus a perfect matching.
• For any distinct and nonadjacent x, y ∈ V(H), there are no edges between $U_x$ and $U_y$.
Proof. It suffices to check that any two non-adjacent vertices in G have at most c − 1 common neighbors. By the construction, there are only two types of non-adjacent vertices:
Case 1: The non-adjacent pair u, v ∈ V(G) is such that u ∈ $U_x$, v ∈ $U_y$ and x, y ∈ V(H) are distinct and non-adjacent in H.
In this case, there are no edges between $U_x$ and $U_y$, and every common neighbor of u, v lies in $U_z$ for some vertex z ∈ V(H) such that (x, z), (y, z) are both edges in H. Since H has girth 5, there is at most one such z ∈ V(H) given x, y, as otherwise H would contain a $C_4$. Vertex u ∈ V(G) is adjacent to exactly $|U_z| - 1 = (c/2) - 1$ vertices in $U_z$, so u, v can have at most (c/2) − 1 common neighbors.
Case 2: The non-adjacent pair u, v ∈ V(G) is such that u ∈ $U_x$, v ∈ $U_y$ and x, y ∈ V(H) are adjacent in H.
In this case, u and v are adjacent to all other vertices in $U_x \cup U_y$, so they have c − 2 common neighbors in $U_x \cup U_y$. Suppose for contradiction that u, v have some other common neighbor w and w ∈ $U_z$ for some z ≠ x, y. This implies that (z, x), (z, y) are both edges in H. However, (x, y) is already an edge in H by the assumption of this case. This implies that H contains a triangle, which contradicts the fact that H has girth 5.
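The construction and the two cases of the proof can be checked on a tiny instance. The sketch below takes H to be the 5-cycle, which has girth 5 but is not the extremal graph used in the text, and c = 4; the function names and the indexing of each $U_x$ by 0..c/2−1 (so that the removed perfect matching pairs equal indices) are assumptions.

```python
from itertools import combinations

def blow_up(h_adj, c):
    """Replace each vertex x of H by a clique U_x on c/2 vertices (c even)."""
    half = c // 2
    verts = [(x, i) for x in h_adj for i in range(half)]
    adj = {w: set() for w in verts}
    for (x, i), (y, j) in combinations(verts, 2):
        same_block = x == y
        # Across an edge of H: complete bipartite minus the matching i == j.
        across_edge = y in h_adj[x] and i != j
        if same_block or across_edge:
            adj[(x, i)].add((y, j))
            adj[(y, j)].add((x, i))
    return adj

def max_common_nonadjacent(adj):
    """Largest common neighborhood over all non-adjacent pairs."""
    return max((len(adj[u] & adj[v])
                for u, v in combinations(adj, 2) if v not in adj[u]),
               default=0)

five_cycle = {x: {(x - 1) % 5, (x + 1) % 5} for x in range(5)}
G = blow_up(five_cycle, 4)
```

For this instance the maximum is attained in Case 2 and equals c − 2 = 2, which is below c, so G is c-closed as the proof predicts.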

Open problems and future directions
Direct improvement of our results
• Determine the exact dependence on n for the maximum possible number of maximal cliques in a c-closed graph. We have proven (up to constant dependence on c) that this number is between $n^{3/2}$ and $n^{2-2^{1-c}}$.
• Find a faster algorithm for listing the set of all wedges (induced 2-paths) in a c-closed graph (this would improve the runtime of Algorithm 1).

Further exploration of c-closed graphs
• Study the densest k-subgraph problem, a generalization of the clique problem, on c-closed graphs. The input to the problem is a graph G and a parameter k, and the goal is to find the subgraph of G on k vertices with the most edges. Unlike the clique problem, densest k-subgraph is NP-hard even for 2-closed graphs (more specifically, for graphs of girth 6) [50]. For general graphs, the best-known approximation algorithm has approximation ratio roughly $O(n^{1/4})$ [2] and under certain average-case hardness assumptions (concerning the planted clique problem), constant-factor approximation algorithms do not exist [5].
• Determine which other NP-hard problems are fixed-parameter tractable with respect to c.
• Determine which problems in P have faster algorithms on c-closed graphs.
Other model-free definitions of social networks
• Explore other graph classes motivated by the well-established signatures of social networks (described in the introduction): heavy-tailed degree distributions, high triangle density, dense "communities", low diameter and the small world property, and triadic closure.
• Determine other model-free definitions of social networks, for example, those motivated by 4-vertex subgraph frequencies. Ugander et al. [63] and subsequently Seshadhri [33] computed 4-vertex subgraph counts in a variety of social networks and the frequencies observed are far different than what one would expect from a random graph. In particular, social networks tend to have far fewer induced 4-cycles than random graphs.

A Appendix: Algorithm to generate all maximal cliques
In this section we prove Theorem 2.3, restated below. Recall that p(n, c) denotes the time to list all wedges (induced 2-paths) in a c-closed graph on n vertices. Before proving Theorem A.1, we give a bound on p(n, c), which follows from a result of Gąsieniec, Kowaluk, and Lingas [30] about computing witnesses of boolean matrix multiplication. If C is the boolean matrix product of A and B, a witness of entry C[i, j] is an index l such that A[i, l] = B[l, j] = 1.
Lemma A.2 (Theorem 1 in [30]). If C is the boolean matrix product of two n × n matrices, we can report all witnesses of all entries of C that have at most k witnesses in expected time $O(n^{2+o(1)} k + k^{(3-\omega-\alpha)/(1-\alpha)} n^{\omega} + n^{\omega} \log n)$, where ω is the matrix multiplication exponent and α is the supremum of the set of r ∈ [0, 1] such that multiplying an $n \times n^r$ matrix by an $n^r \times n$ matrix takes time $O(n^{2+o(1)})$.
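To make the connection between witnesses and wedges concrete, the sketch below lists, for every non-adjacent pair (i, j), the witnesses of entry (i, j) of the boolean product of the adjacency matrix with itself. It is a naive cubic-time illustration and not the algorithm of [30]; the function names are assumptions.

```python
def witnesses(a, b, i, j):
    """All indices l with a[i][l] == b[l][j] == 1."""
    return [l for l in range(len(b)) if a[i][l] and b[l][j]]

def list_wedges(adj_matrix):
    """Map each non-adjacent pair i < j to the midpoints l of wedges i - l - j."""
    n = len(adj_matrix)
    wedges = {}
    for i in range(n):
        for j in range(i + 1, n):
            if not adj_matrix[i][j]:
                mids = witnesses(adj_matrix, adj_matrix, i, j)
                if mids:
                    wedges[(i, j)] = mids
    return wedges

# The 4-cycle 0-1-2-3-0: both diagonals are non-adjacent pairs.
c4 = [[0, 1, 0, 1],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [1, 0, 1, 0]]
```

In a c-closed graph every list returned by `list_wedges` has length at most c − 1, which is why the witness-reporting bound applies with k = c − 1.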
Let A be the adjacency matrix of a c-closed graph on n vertices and let C be the boolean matrix product $A^2$. Then C[i, j] = 1 if and only if vertices i and j have at least one common neighbor. Since G is c-closed, all C[i, j] for non-adjacent i, j have at most c − 1 witnesses. Thus, we have the following corollary.
Proof of Theorem A.1. The algorithm follows naturally from the proof of Theorem 2.1 with two additional ingredients:
• A preprocessing step to enumerate all wedges in the graph speeds up the later process of finding the intersection of the neighborhoods of two vertices (i.e. N(u) ∩ N(v) from the proof of Theorem 2.1).
• An algorithm of Tomita, Tanaka, and Takahashi [59] that generates all maximal cliques in any graph in time $O(3^{n/3})$. We apply this to the recursive calls on the small graphs G[N(u) ∩ N(v)] from the proof of Theorem 2.1. Let Cliques(G) be a call to this algorithm.
The output of Cliques(G) does not explicitly list every maximal clique, as this could take $\Omega(3^{n/3} n)$ time (e.g. for a complete n/3-partite graph). Instead, the output of Cliques(G) is a forest F where each node represents a vertex in G and the collection of nodes on any path from root to leaf in F forms a maximal clique in G. The output of our algorithm will be of the same form. For any leaf l ∈ V(F) let K(l) be the maximal clique in G on the set of vertices along the path from l to the root of its tree in F. See Algorithm 1 for pseudocode describing our algorithm that generates a superset of the maximal cliques in a c-closed graph. The correctness of Algorithm 1 follows from the proof of Theorem 2.1.
Algorithm 1 (CClosedCliques), partial listing:
  fix an arbitrary vertex v ∈ V(G)
  for each leaf l in F with K(l) ⊆ N(v) do
    add v to F as a child of l
  add v to $F_u$ with an edge between v and every current root in $F_u$
Runtime analysis. The reason that CClosedCliques lists a superset of the maximal cliques rather than the exact set is the following. There could be a clique K in N(v) ∩ N(u) ∩ N(w) for some u and w such that K is maximal in N(v) ∩ N(u) but not maximal in N(v) ∩ N(w). In this case K ∪ {v} will be reported in the output even though it is not a maximal clique. To list the exact set of maximal cliques, we make the following addition to the procedure CClosedCliques right before returning. Add every clique in $\bigcup_u F_u$ to a hash set S. Then iterate through every clique K in $\bigcup_u F_u$ in order from largest to smallest, and check whether any subset of K is in S. Note that the number of vertices in K is at most c. This increases the runtime to $O(p(n, c) + 3^{c/3} 2^c c n^2)$.
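The hash-set post-processing step described above can be sketched as follows. This is an illustration under the assumption that the input is a flat list of vertex sets rather than the forest representation; the function name `keep_maximal` is hypothetical.

```python
from itertools import combinations

def keep_maximal(cliques):
    """Drop every clique that is a proper subset of another listed clique."""
    live = {frozenset(k) for k in cliques}
    # Visit cliques from largest to smallest, as in the text.
    for k in sorted(live, key=len, reverse=True):
        if k not in live:
            continue  # already removed as a subset of a larger clique
        # Enumerate all proper subsets of k (at most 2^|k| of them).
        for size in range(len(k)):
            for sub in combinations(k, size):
                live.discard(frozenset(sub))
    return live

reported = [{0, 1, 2}, {0, 2}, {0}, {2, 3}, {3, 4}]
exact = keep_maximal(reported)
```

Because the reported collection contains every maximal clique, a reported clique fails to be maximal exactly when it is a proper subset of another reported clique, so the filter leaves the exact set. Each clique has at most c vertices, which gives the $2^c$ factor in the running time.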

B Appendix: Weakly c-closed graphs
In this section we give experimental results regarding the c-closure and weak c-closure of well-studied social networks, an algorithm to compute the smallest value c such that a given graph is weakly c-closed, and an equivalent definition of weakly c-closed graphs.
Recall the definition of a weakly c-closed graph.
Table 1: The c-closure and weak c-closure of well-studied social networks. The data sets are from the Stanford Network Analysis Platform (SNAP) [39] and are each categorized as either a social network, communication network, collaboration network, or internet peer-to-peer network. For the networks with directed edges, we analyze the underlying undirected graphs. For each network G, n is the number of vertices, m is the number of edges, c is the smallest value c such that G is c-closed, and "weak c" is the smallest value c such that G is weakly c-closed.

B.2 Computing weak c-closure
To get an algorithm for computing the weak c-closure of a graph, we first show that the ordering of the vertices in the definition of weakly c-closed can be chosen greedily. In the following definition, we define a valid ordering of the vertices as one that satisfies the definition of a weakly c-closed graph. The following lemma will be useful.
Lemma B.5. If a vertex v is in no bad pairs with respect to a graph G, then v is also in no bad pairs with respect to any induced subgraph H ⊆ G.
Proof. Remark. We argue that this definition of weakly c-closed is tight in the following sense. Note that if a graph G is such that every subgraph H ⊆ G has at most |V (H)|−1 2 bad pairs, then all H ⊆ G must contain a vertex in no bad pairs so G must be weakly c-closed. One might hope that the clique listing problem remains fixed-parameter tractable for some new definition of weakly c-closed of the form "every subgraph of G has at most k bad pairs" for some k ≥ |V (H)| 2 . To see why this is impossible, let G be a complete n/2-partite graph i.e. the complement of a perfect matching. Then, the only bad pairs in G are the endpoints of the n/2 non-edges so every subgraph H ⊆ G has at most V (H) 2 bad pairs, yet G has 2 n/2 maximal cliques (choose one endpoint of each non-edge).
From the other side, such a definition is trivial for $k < \frac{|V(H)|}{2+c}$. That is, if a graph G is not c-closed, then there exists a subgraph H ⊆ G with at least $\frac{|V(H)|}{2+c}$ bad pairs. Specifically, consider the graph induced by a non-adjacent pair of vertices and c of their common neighbors.
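The greedy choice of ordering discussed in Section B.2 can be sketched in Python as follows. This is a naive illustration, assuming the graph is given as a dictionary of neighbor sets and that c is at least 1; the name `weak_closure` and the simple polynomial-time loop are hypothetical choices for this example and are not taken from the paper's algorithm statement. At each step the sketch deletes a vertex whose largest common neighborhood with a remaining non-neighbor is smallest. By Lemma B.5, deleting other vertices never puts a vertex into a new bad pair, which is what makes a greedy choice safe.

```python
def weak_closure(adj):
    """Smallest c >= 1 such that the graph is weakly c-closed (greedy sketch)."""
    # Copy the neighbor sets so the caller's graph is left untouched.
    adj = {v: set(ns) for v, ns in adj.items()}
    c = 1

    def worst(v):
        # Largest common neighborhood of v with a remaining non-neighbor
        # (0 if v is adjacent to every other remaining vertex).
        return max(
            (len(adj[v] & adj[u]) for u in adj if u != v and u not in adj[v]),
            default=0,
        )

    while adj:
        v = min(adj, key=worst)          # greedy choice of the next vertex
        c = max(c, worst(v) + 1)         # v must be in no bad pairs when deleted
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return c


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(weak_closure(path), weak_closure(cycle))
```

On the three-vertex path the sketch returns 1 (delete the middle vertex first), even though the path is only 2-closed. On the 4-cycle it returns 3, since every vertex starts in a non-adjacent pair with two common neighbors.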
C Appendix: Generalizations of c-closed graphs and weakly c-closed graphs

C.1 A-bounded graphs
In this subsection, we define A-bounded graphs, a generalization of weakly c-closed graphs. In Definition B.6 of weakly c-closed graphs we require that every subgraph has at least one vertex v in no bad pairs, while here we do allow v to be in bad pairs but restrict the sizes of the common neighborhoods of v with its non-neighbors.
Note that weakly c-closed graphs are $(n-2)3^{c/3}$-bounded. Like weakly c-closed graphs, A-bounded graphs also have two equivalent interpretations: one in terms of subgraphs of G (Definition C.1) and another in terms of an ordering of the vertices.
Lemma C.2. A graph G is A-bounded if and only if there is an ordering of the vertices of G: $v_1, v_2, \ldots, v_n$ such that $\{v_i, v_{i+1}, \ldots, v_n\}$ is A-bounded for each i.
The proof of Lemma C.2 is analogous to that of Lemma B.7 and we omit it.
Theorem C.3. The number of maximal cliques in an A-bounded graph with n vertices is at most nA.
Theorem C.3 follows as a corollary from the following theorem which generalizes Theorem 2.1.
Theorem C.4. Let $v_1, v_2, \ldots, v_n$ be an arbitrary ordering of the vertices of G.
i.e. the set of pairs of non-adjacent vertices whose common neighborhood contains at least one vertex that comes later than i in the ordering. Then the number of maximal cliques in G is at most

C.2 Graphs with given common neighborhood statistics
In the definition of a c-closed graph, we require that no pair of non-adjacent vertices has c common neighbors, while here we allow such pairs and consider the number of pairs with exactly i common neighbors for all i.
Definition C.5. For any integer 0 ≤ i ≤ n − 2, let p(i) be the number of pairs of non-adjacent vertices with exactly i common neighbors.
Note that c-closed graphs have p(i) = 0 for all i > c, and $\sum_{i>c} p(i)$ counts the total number of bad pairs in G.
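To make Definition C.5 concrete, the following Python sketch tabulates p(i) for a small graph. The adjacency-dictionary representation and the name `common_neighbor_stats` are assumptions for this example; the quadratic loop over vertex pairs is only meant for small inputs.

```python
from collections import Counter
from itertools import combinations


def common_neighbor_stats(adj):
    """Return a Counter p with p[i] = number of non-adjacent pairs
    that have exactly i common neighbors."""
    p = Counter()
    for u, v in combinations(adj, 2):
        if v not in adj[u]:
            p[len(adj[u] & adj[v])] += 1
    return p


# 4-cycle 0-1-2-3-0: the two non-adjacent pairs (0, 2) and (1, 3)
# each have exactly two common neighbors.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(dict(common_neighbor_stats(cycle)))
```

For the 4-cycle the only nonzero entry is p(2) = 2.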
Theorem C.6. The number of maximal cliques in any graph is at most $\sum_{i>0} \frac{8\,p(i)\,3^{i/3}}{i+2}$.
Proof. Directly from Theorem C.4, we have that the number of maximal cliques in any graph G is at most
We can do better by applying Theorem C.4 using a uniformly random ordering of the vertices in G. The number of maximal cliques in G is at most
where the expectation is over a uniformly random ordering of the vertices in G.
By linearity of expectation, for any two vertices u, v which are not adjacent, we want to compute $\mathbb{E}\left[3^{|N(v) \cap N(u) \setminus \{\text{vertices that come before } u, v \text{ in the ordering}\}|/3} \cdot \mathbb{1}_{u, v \text{ not adjacent}}\right]$.
Let s = |N(u) ∩ N(v)|. This expectation can simply be written as $\sum_{k=0}^{s} 3^{(s-k)/3} \Pr(\text{exactly } k \text{ vertices come before } u, v \text{ in the ordering})$.
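For small s, this expectation can be evaluated exactly by enumerating orderings. The Python sketch below does so and compares the result with the per-pair quantity $8 \cdot 3^{s/3}/(s+2)$ appearing in Theorem C.6. It assumes that only the relative order of u, v, and their s common neighbors matters, so it permutes just those s + 2 items; the function names are hypothetical and the check is a numerical sanity test, not part of the proof.

```python
from itertools import permutations


def expected_weight(s):
    """Exact average of 3^((s - k) / 3) over all orderings of u, v and
    their s common neighbors, where k common neighbors precede u and v."""
    total = 0.0
    count = 0
    # Items 0 and 1 stand for u and v; items 2..s+1 are the common neighbors.
    for order in permutations(range(s + 2)):
        # Everything before the first of u, v is a common neighbor.
        k = min(order.index(0), order.index(1))
        total += 3 ** ((s - k) / 3)
        count += 1
    return total / count


def per_pair_bound(s):
    """Per-pair term of the bound in Theorem C.6."""
    return 8 * 3 ** (s / 3) / (s + 2)


for s in range(6):
    print(s, expected_weight(s), per_pair_bound(s))
```

The enumeration is limited to small s because it visits all (s + 2)! orderings.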
The probability that there are exactly k vertices before u, v in the random ordering is the probability that, when we permute the |N(u) ∩ N(v)| = s vertices together with u, v, exactly k vertices from N(u) ∩ N(v) come before u and v. In other words, in the random permutation of size s + 2, one of u, v comes at the (k + 1)-th position, and the other one comes after the (k + 1)-th position. This probability is $\frac{2(s+2-(k+1))}{(s+1)(s+2)}$. The denominator is the number of ways to place u, v in the s + 2 positions, while the numerator is the number of ways to place one of u, v at the (k + 1)-th position and the other in one of the last (s + 2) − (k + 1) positions. Therefore
Note that Theorem C.6 implies Theorem 2.1. This is because if G is c-closed, p(i) = 0 for all i > c, so by Theorem C.6 the number of maximal cliques in G is at most $\sum_{i \le c} p(i)\,3^{i/3} \le n^2\,3^{c/3}$.
G can have no more than $n^2$ non-edges, so $nd^2 \in O(n^2)$, or equivalently $d = O(n^{1/2})$, so $n_2(G) = O(n^{3/2})$.
$= \Omega(n_j(G)\,s^2)$.

We
G can have no more than $n^2$ non-edges, so $n_j(G)\,s^2 = O(n^2)$, or equivalently,
An upper bound on $n_{j+1}(G)$ is the sum over all $J \in \mathcal{J}$ of the number of $K_{j+1}$'s that $J$ is in; that is, $n_{j+1}(G) \le \sum_{J \in \mathcal{J}} |N(J)| = n_j(G)\,s = O(n_j(G)^{1/2}\,n)$ by Equation 6.