Efficient Enumeration of -Plexes and -Defective Cliques
Abstract
We investigate the enumeration of dense subgraphs under two well-known relaxations of cliques: -plexes and -defective cliques. Our main contribution is a family of algorithms with improved worst-case and output-sensitive complexities, driven by a decomposition technique based on graph degeneracy. We first propose a worst-case output-size near-optimal algorithm to enumerate all maximal -plexes of size at least , achieving a total time complexity of , where is the degeneracy and the maximum degree of the input graph. We then refine this result to obtain a fixed-parameter tractable output-sensitive algorithm with complexity , where is the number of solutions, is an arbitrary function of , and is a polynomial. We then extend this framework to the enumeration of -defective cliques and also show a linear-time algorithm for the enumeration of -plexes for graphs with bounded degeneracy. To the best of our knowledge, these complexities are competitive with or better than the current state of the art.
Keywords and phrases:
Parameterized complexity, enumeration algorithms, maximal cliques enumerationCopyright and License:
2012 ACM Subject Classification:
Theory of computation Parameterized complexity and exact algorithmsEditors:
Akanksha Agrawal and Erik Jan van LeeuwenSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl β Leibniz-Zentrum fΓΌr Informatik
1 Introduction
Finding dense substructures in graphs is a fundamental problem in data mining [9, 19], social network analysis [16, 23], bio-informatics [15, 14], and database theory, where uncovering dense groups of entities reveals critical insights into the organization of complex systems. In data mining, dense patterns often signify meaningful associations or frequent co-occurrences; in social networks, they correspond to densely connected communities; in biological networks, they may represent protein complexes or functional modules; and in database systems, they inform efficient indexing and query optimization over graph-structured data.
Among the most well-known dense patterns is the clique, a subset of vertices in which every pair is mutually adjacent. Cliques represent idealized notions of cohesion but are rarely found in real-world networks, which are typically large, sparse, and noisy. The requirement of full connectivity makes cliques overly restrictive for practical applications. As a result, various relaxed clique models have been proposed to capture dense β but not necessarily complete β subgraphs that better reflect real-world connectivity.
Among these, the -plex and -defective clique models have been widely used [2, 19, 10, 20, 24]. A -plex allows each vertex to be non-adjacent to up to k other vertices in the subgraph, while a -defective clique requires the induced subgraph to miss at most k edges. These definitions offer a balance between density and noise tolerance, enabling the discovery of dense structures even in imperfect data. These models have demonstrated practical utility in numerous domains: detecting non-clique-like but cohesive communities in social networks [20, 23], identifying biologically relevant modules in protein-protein interaction networks [22], and mining patterns in incomplete or uncertain data sources [8].
The versatility of relaxed clique models makes them an important tool in modern graph analysis, where the goal is often to find structures that are both dense and resilient to noise. Consequently, the development of efficient algorithms to enumerate such subgraphs has become a central research problem.
2 State of the art
In this paper, we focus on the enumeration of two families of graph-theoretic structures: -defective cliques and -plexes. Both serve as models of relaxed cliques. Although the enumeration of -plexes has been extensively studied, less attention has been paid to -defective cliques. In this section, we review existing techniques and results concerning the enumeration of maximal -plexes and -defective cliques. In both the related work and throughout this paper, we encounter two common classes of output-sensitive algorithms for enumeration problems: Polynomial Total Time and Polynomial Delay. A problem is said to be solvable in polynomial total time if the total time required to enumerate all solutions is bounded by a polynomial in the size of the input and the number of solutions. This notion is typically used when the user is interested in generating the full output. However, Polynomial Delay ensures that the time between any two consecutive outputs is bounded by a polynomial in the input size. This is particularly valuable in practice, as it allows the user to process or visualize each solution incrementally without waiting for the enumeration to complete.
To our knowledge, there exist only two enumeration algorithms specifically designed for maximal -defective cliques, which are given in [10]. The first is a polynomial delay algorithm, meaning that the time elapsed between two consecutive outputs is bounded by a polynomial function. Specifically, the delay is , where is the number of vertices in the graph, is the size of the largest -defective clique. The second algorithm, which follows a more brute force approach and whose complexity is not output sensitive, uses a branch-and-bound approach with an optimized pivoting technique. Its time complexity is , where is a real constant strictly less than 2. An optimization based on the degeneracy ordering of the graph reduces this complexity to , where is the degeneracy of the graph, a parameter that is typically small in real-world networks.
For the problem of enumerating all maximal -plexes, several algorithms have been proposed. Some of these are adaptations of the BronβKerbosch algorithm [3], which was originally designed to enumerate maximal cliques. Wu and Pei [20] extended the BronβKerbosch algorithm to handle maximal -plexes by introducing a set of pruning rules to eliminate unnecessary branches in the search space. The time complexity of their algorithm is . Zhou et al. [24] introduced a novel branching heuristic that improves the running time to , where is a constant depending on and is strictly less than 2. In a related approach, Conte et al. [1] proposed a decomposition-based BronβKerbosch algorithm to enumerate -plexes with diameter at most 2. Although no explicit time complexity is provided, a brief analysis suggests that the algorithm runs in , where denotes the degeneracy and the maximum degree of the graph.
Other approaches have also been proposed. In Wang et al. [19], the authors present an algorithm to enumerate all -plexes with a time complexity of . In Dai et al. [11], they describe an algorithm to enumerate all maximal -plexes of size at least , with a time complexity of , where and . It is also possible to apply the more general framework of Cohen et al. [7], which provides algorithms to enumerate maximal induced subgraphs satisfying hereditary properties. In fact, in a subsequent paper of Berlowitz et al. [2], they applied this framework to the specific case of -plexes, resulting in a fixed-parameter tractable polynomial-time delay algorithm with time complexity ) with a polynomial function of and an arbitrary function of .
Closely related to our problems, the computation of the maximum -defective clique has also been studied. It is known to be NP-hard [21], and several non-trivial algorithms have been developed to address it [4, 6, 5, 13]. These methods achieve exponential time complexities with base constants strictly less than 2, often derived from the roots of characteristic polynomials. Some offer tighter theoretical bounds, while others emphasize practical performance through preprocessing and reduction techniques. Certain approaches also exploit structural properties of the graph, such as degeneracy, to improve practical runtime. However, even optimized methods become impractical for large values of , particularly on real-world datasets [13].
3 Our contributions
In this paper, we investigate the enumeration of dense subgraphs under the models of -plexes and -defective cliques. More specifically, we focus on connected -plexes and -defective cliques of diameter at most 2, as they better capture the notion of a community. To achieve this, Lemma 4 ensures that it suffices to enumerate -plexes of size at least and -defective cliques of size at least .
Our main goal is to develop algorithms to enumerate all maximal -plexes of size at least . We then leverage this approach to enumerate maximal -defective cliques by exploiting the structural relationship between the two concepts.
We begin by presenting general combinatorial results in Section 4.2. Then, in Section 4.3, we focus on the enumeration of maximal -plexes. As a starting point, we build on the decomposition introduced in [9]. We first prove that every maximal -plex of size at least is contained within a small subgraph of size . Using this structural insight, we design an enumeration algorithm that explores each subgraph in time , ensuring that all maximal -plexes of size at least are reported without duplication. A subsequent filtering step based on suffix trees guarantees a total enumeration time bounded by . This result is formalized in Theorem 16. We also establish an upper bound on the number of such -plexes, namely , and demonstrate the near-optimality of this bound by constructing a family of graphs in which the number of maximal -plexes of size at least is . It follows that our algorithm achieves near-optimal runtime in the worst case.
In Berlow et al. [2], the authors introduced an algorithm for the enumeration of maximal -plexes of size at least , with fixed-parameter tractable polynomial delay , where is an arbitrary function of , and is a polynomial in the number of vertices . The considered parameter is . We improve upon this result by applying the enumeration to the subgraphs , leveraging the fact that each contains at most vertices. Combined with our upper bound on the total number of solutions across all subgraphs in Lemma 12, this enables the design of a polynomial total algorithm with runtime , where is the total number of output solutions. To the best of our knowledge, this is the first algorithm with total runtime polynomial both in the output size and in the degeneracy and maximum degree of the input graph.
By applying the same technique to the polynomial delay algorithm of Dai et al. [10] for the enumeration of maximal -defective cliques of size at least , mentioned earlier, we also derive a second output sensitive algorithm with polynomial total runtime. These results are presented in Theorems 17 and 19, respectively.
Finally, we propose an algorithm that lists all maximal 2-plexes in time for graphs of bounded degeneracy. This result is presented in Theorem 23. An overview of the existing results, as well as our claimed complexities, can be found in Table 1.
| Algorithm | Output | Complexity |
|---|---|---|
| Wu and Pei [20] | Maximal -plexes | |
| Zhou et al. [24] | Maximal -plexes | |
| Conte et al. [1] | Maximal -plexes with diameter | |
| Wang et al. [19] | Maximal -plexes | |
| Dai et al. [11] | Maximal -plexes of size | , where , |
| Dai et al. [10] | Maximal -defective cliques | (polynomial-time delay) |
| Dai et al. [10] (Pivoting) | Maximal -defective cliques | |
| Dai et al. [10] (Degeneracy-optimized) | Maximal -defective cliques | |
| Berlowitz et al. [2] | Maximal -plexes | , where is polynomial in and is any function of (fixed-parameter tractable polynomial delay) |
| This paper | Maximal -plexes of size | |
| This paper | Maximal -defective cliques of size | (polynomial total time) |
| This paper | Maximal -plexes | (when ) |
| This paper | Maximal -plexes of size | , where is polynomial in and is any function of (fixed-parameter tractable polynomial total time) |
4 Results
4.1 Notations
We consider graphs of the form , which are simple, undirected, with vertices and edges. If , the subgraph of induced by is denoted by . If , then is a proper subgraph of . When not clear from the context, the vertex set of will be denoted by . The set is called the open neighborhood of the vertex and consists of the vertices adjacent to in .
Given an arbitrary ordering of the vertices of , the set consists of the vertices following in this ordering, including itself, that is, . In the ordering , the rank of , denoted by , is its position in the ordering (here, ). By , we denote the set of integers . The distance between two vertices and is the length of the shortest path from to . Let , where is the set of vertices at distance from (in this paper, we consider ).
4.2 Preliminaries
In this section, we present a set of preliminary definitions and combinatorial results that constitute the theoretical foundation of our algorithms. We begin by introducing a decomposition framework based on subgraphs , which plays a central role throughout the paper. We then formally define the two central notions of subgraph density considered in this work: -plexes and -defective cliques. These definitions enable a rigorous formulation of the enumeration problems that we aim to solve.
The section proceeds with a series of lemmas that establish the essential properties of these structures. In particular, Lemma 5 and Lemma 6 guarantee that any maximal -plex of size at least in a graph is entirely contained in a unique subgraph of the decomposition, Lemma 11 provides a procedure to verify whether a -plex is maximal in its corresponding subgraph in time , making the approach efficient in practice. These results are central in justifying the correctness and non-redundancy of our enumeration strategy. Further, Lemmas 9 and 10 provide tight bounds on the size and number of such -plexes in terms of the graphβs degeneracy and maximum degree . In addition, Lemma 12 establishes an upper bound on the total number of such -plexes in all subgraphs , showing that their sum is proportional to the number of maximal -plexes in . This result is crucial in the design and analysis of our output-sensitive algorithms.
Definition 1.
Let be a graph, and let be an ordering of its vertices. For each , we define the graph as the subgraph of induced by the vertex set , that is, the vertex itself, its neighbors in , and the vertices at distance two from within .
Definition 2.
Let be an undirected graph and let be a non-negative integer. A subset is called a -plex if, in the induced subgraph , each vertex has at most non-neighbors (equivalently, at least neighbors) within .
A -plex is said to be maximal in if there does not exist a strict superset such that is also a -plex in ; that is, is inclusion-maximal among the -plexes of .
Definition 3.
Let be an undirected graph and let be a non-negative integer. A subset is called a -defective clique if the induced subgraph is missing at most edges to be a complete clique. A -defective clique is said to be maximal in if there does not exist a strict superset such that is also a -defective clique in ; that is, is inclusion-maximal among the -defective cliques of .
Lemma 4 (Two-hop property [18]).
If is a -plex in with , then the subgraph is connected and has diameter at most two.
Lemma 5.
Let be a graph, and let be a degeneracy ordering of its vertices. Let be a maximal -plex such that the size of is at least . Then, there exists a unique index such that and .
Proof.
Let be a maximal -plex in with , and let be the smallest index such that . By Lemma 4, the subgraph has diameter at most two and contains , which implies that every vertex is at distance at most two from . Therefore, each such vertex is either a neighbor of or is at distance two from , so by the definition of , it follows that .
To show uniqueness, suppose for contradiction that there exists an index such that and . If , then , which contradicts the minimality of . If , then since , we must have . By definition, , so . However, this is impossible, since all vertices in have indices strictly greater than , whereas . Therefore, such an index cannot exist, proving the uniqueness.
Lemma 6.
Let be a graph, and let be a degeneracy ordering of its vertices. Let be a maximal -plex in subgraph , for some , such that the size of is at least . Then is not maximal in if and only if there exists a maximal -plex of size at least , maximal in , such that for some .
Proof.
Suppose that is a maximal -plex of size at least in , for some , but that is not maximal in . Then there exists a set such that the subgraph is a -plex of size at least , and is maximal in . Let be the vertex with the smallest rank according to the ordering .
If , then since has a diameter of at most 2, the vertex must be at distance at most 2 from every vertex in . By the definition of , this means , so forms a -plex of size at least in , contradicting the maximality of in . Therefore, it must be that . Now set , which by hypothesis is a -plex of size at least , maximal in . Since , we have , and with . This completes the first implication.
Conversely, suppose that there exists a maximal -plex of size at least in , included in a subgraph for some , such that . Then, by definition, cannot be maximal in , which completes the proof.
Corollary 7.
Let be a -degenerate graph and let be a maximal -plex of size at least of an induced subgraph , , such that is not maximal in . Let be a maximal -plex of size at least of , which is a subgraph of some graph , , and such that is a subgraph of . Let and be the words obtained from the vertices of the -plexes and ordered according to . Then is a proper suffix of .
Proof.
First observe that, by Lemma 6, the -plex is well defined and . Let be the vertex of with the smallest index in ; thus, appears first in . Since , we also have .
Assume for contradiction that is not a proper suffix of . This means that there exists at least one vertex which appears after in . Otherwise, would indeed be a proper suffix of . Let denote the vertex of with the smallest index in ; according to Lemma 6, we have . Since any subset of a -plex is itself a -plex, it follows that is a -plex in . Moreover, as has size at least , so does . By Lemma 4, must have diameter at most , implying . Thus, is a -plex of size at least in , which contradicts the maximality of in .
Lemma 8.
Let be a graph of degeneracy . Then the size of any maximal -plex in of size at least is at most .
Proof.
Let be a maximal -plex in of size at least , and let be a degeneracy ordering of with degeneracy . Let be the vertex of with the smallest rank in . By Lemma 5, . Therefore, can contain at most vertices in ; otherwise, the number of βs non-neighbors would exceed the allowed , contradicting the definition of a -plex. Moreover, by the definition of degeneracy, has at most neighbors in , so at most vertices in can belong to . Since and are disjoint, we obtain the following inequality: .
Lemma 9.
Let be a graph, and its degeneracy. Then the number of maximal -defective cliques of size at least k+2 is .
Proof.
Let be a -degenerate graph and a degeneracy ordering of its vertices. Denote by the set of all subsets of , and by the set of all subsets of of size at most . Let denote the number of maximal -plex of size at least in the graph , for , containing , and let be the number of maximal -plex of size at least in the graph . According to Lemma 5, for every maximal -plex of size at least in , there exists such that and . Furthermore, by Lemma 6, there may exist a maximal -plex of size at least in containing (for some ), which are not maximal in . It follows that .
Let . By definition of , the vertex has at most neighbors of higher rank (i.e., greater index) in ; in other words, the cardinality of is at most . Each vertex in is adjacent to at most vertices in , so . Moreover, by arguments analogous to those in the proof of Lemma 8, any maximal -plex of size at least in containing contains at most vertices from and at most vertices from . Thus, for -plex , there exist two sets and such that . Therefore, we obtain the following inequality: , with and . As , it follows that , and thus . Consequently, which is bounded by .
Lemma 10.
For any tuple such that , we can construct a graph of order with maximum degree and degeneracy such that has maximal -plexes of size at least .
Proof.
Let be a simple, undirected, -degenerate graph with maximum degree , and let be a degeneracy ordering. Set , , and consider a partition such that , is a clique, and is an independent set.
For every , let be the subgraph induced by the vertex set , where is adjacent to all vertices in , and is not adjacent to any vertex in . Let denote the number of maximal -plexes of size at least in , and let denote the number of such -plexes in . For , Consider , where is a maximal clique of and is a subset of cardinality . We show that is a maximal -plex of size at least in . First, , and since , we have by Lemma 8. Moreover, since induces a clique and is not adjacent to any vertex in , the subgraph missing exactly edges, namely those between and each vertex of . Thus, is indeed a -plex in .
Assume for contradiction that is not maximal in ; then there exists such that is a -defective clique. Since , then and are not adjacent; hence, has exactly non-neighbors in , contradicting the definition of a -plex. is therefore maximal both in and in .
For every choice of a pair , we obtain a distinct maximal -plex. There are exactly maximal cliques in and subsets of size in (since ). Thus, we obtain maximal -plexes of the form . Therefore, for each , we have . Moreover, since is an independent set, every maximal -plex of size at least in is also maximal in , for each such .
Finally, we obtain: Now, since , it follows that
Lemma 11.
Let be a -degenerate graph with maximum degree , a degeneracy ordering of its vertices, and the subgraph defined as in Definition 1 for some . If is a -plex of size at least , then one can verify whether is maximal in in time .
Proof.
Let be a -plex of size at least . To check whether is maximal in , it suffices, by Definition 2, to verify whether there exists a vertex such that is still a -plex in . The cardinality of is bounded by (i.e., ), since contains at most direct neighbors of and vertices at distance two. Therefore, the size of is also . For each , one needs to check whether , for each , which can be done in time, since by Lemma 4. If this condition holds for every ), then is not maximal in . Therefore, the maximality of a -plex can be verified in time.
Lemma 12.
Let be a -degenerate graph, and let , , be the family of induced subgraphs defined in Definition 1. Let denote the number of maximal -plexes of size at least in , and let denote the number of such -plexes in . We have that .
Proof.
Let denote the number of maximal -plexes of size at least in , that are also maximal in , and let be the number of such -plexes in that are not maximal in . Thus, .
Let be a maximal -plex of size at least in . By Lemma 8, the number of vertices in is at most . Consequently, , the word formed by the vertices of ordered according to , can have at most proper suffixes.
Moreover, by Corollary 7, every maximal -plex of size at least in some subgraph , , that is not maximal in , satisfies being a proper suffix of for some maximal -plex in . Therefore, , and it follows that .
4.3 Algorithms for the enumeration of -plexes
We now describe our enumeration framework for computing all maximal -plexes of size at least in a graph . This section presents two algorithms: a base algorithm that enumerates all such -plexes in each subgraph , and a refined version that guarantees uniqueness of output using a generalized suffix tree. The first algorithm operates locally in each and its enumeration time depends solely on the degeneracy of , ensuring efficient exploration of the search space. The second algorithm builds upon the first by incorporating a filtering mechanism based on suffix tree rejection to avoid duplicate outputs. Both algorithms are described in detail and formally analyzed. The correctness and non-redundancy of the enumerated solutions are established in Theorems 13 and 14, while their complexities are proved in Theorems 15 and 16. Finally, in Theorem 17, we derive an output-sensitive result by applying the polynomial delay algorithm from [12] to the subgraphs defined in Definition 1. By exploiting the fact that each has size , and combining this with the bound established in Lemma 12 on the total number of maximal -plexes across all βs, we design an output-sensitive algorithm with a total running time polynomial in , and linear in the number of solutions.
4.3.1 Proof of correctness
Theorem 13.
Given a subgraph as defined in Definition 1, Algorithm 1 outputs exactly all maximal -plexes of size at least , containing , without duplication.
Proof.
Assume for contradiction that there exists a -plex of size at least , containing and maximal in , which is not reported by Algorithm 1. Let and , so that . Observe that since , the subset is examined at line 4 of the algorithm. Moreover, as has no neighbors in , it follows that . Thus, the subset is included in the enumeration at line 5. Consequently, the set is constructed at line 6, and by assumption, both the size and -plex constraints at line 7, as well as the maximality criterion at line 8, are satisfied. Hence, Algorithm 1 must output , a contradiction. It follows that every -plex of size at least , containing and maximal in , is indeed generated by the algorithm.
We now show that Algorithm 1 does not produce any duplicates. Suppose, to the contrary, that a -plex is generated more than once, i.e., there exist distinct pairs such that . By construction, , , with . Thus, the decomposition is unique, as the sets and are disjoint and correspond to non-overlapping neighborhoods. Therefore, if , the resulting sets are necessarily distinct, contradicting our assumption. This establishes that each clique is output at most once by the algorithm.
Theorem 14.
Given a subgraph , and an integer , Algorithm 2 outputs exactly all maximal -plexes of size at least , without duplication.
Proof.
Assume for contradiction, that there exists a maximal -plex of size at least in that is not output by Algorithm 2. By Lemma 5, there exists a unique index such that is maximal in and . According to Theorem 13, Algorithm 1, which is called at line 4 of Algorithm 2, enumerates all maximal -plexes in without duplication. Thus, must necessarily be produced at line 5. It follows that can only be rejected by the suffix-tree rejection test (lines 7β9) if it is detected as a proper suffix of some already produced -plex that is maximal in , with , as ensured by Corollary 7. However, the inclusion contradicts the maximality of in . Therefore, must indeed be output by Algorithm 2. We conclude that Algorithm 2 outputs all maximal -plexes of size at least in .
We now show that there is no duplication. Assume for contradiction, that this is not the case; that is, there exists a maximal -plex of size at least in that is output at least twice by Algorithm 2. Since Algorithm 1 enumerates, without duplication, all maximal -plexes of each subgraph , , at line 4, the only possibility is that is enumerated in two different subgraphs and with . However, this contradicts Lemma 5, which guarantees that a maximal -plex of size at least in is included in exactly one such subgraph .
Hence, by contradiction, Algorithm 2 outputs exactly all maximal -plexes of size at least in , without duplication.
4.3.2 Proofs of complexity
This subsection establishes the computational complexity of our enumeration algorithms for maximal k-plexes of size at least . We analyze both the local enumeration within individual subgraphs and then the enumeration across the entire graph . These bounds exploit the structural properties of degeneracy orderings and leverage the combinatorial limits established in Lemma 9, combined with efficient suffix tree operations to ensure uniqueness of the output.
Theorem 15.
Given a graph , , and an integer , the time complexity of Algorithm 1 is in .
Proof.
Let be a -degenerate graph, and let be a degeneracy ordering of its vertices. For each vertex , the algorithm operates on the subgraph , as defined in Definition 1. The for loop at line 4 enumerates all subsets , yielding iterations. The loop at line 5 iterates over all with , corresponding to exactly iterations, which is . The checks at line 7 are performed in time, and the maximality verification at line 8 requires time according to Lemma 11. Therefore, the body of the inner loop runs in time. In total, the algorithm has a complexity of , that is, .
Theorem 16.
Given a graph with degeneracy and maximum degree , and an integer , the time complexity of Algorithm 2 is in .
Proof.
Let be a -degenerate graph and let be a degeneracy ordering of its vertices. The computation of the degeneracy and the ordering at line 1 can be performed in time, where is the number of edges. For lines 3β4, we consider each subgraph for , so this step is executed for subgraphs in total. Enumerating all maximal -plexes of size at least in each takes time by Theorem 15. According to Lemma 9, each subgraph contains maximal -plexes of size at least . Thus, the for loop at line 5 runs times. Regarding lines 5β12: Ordering the vertices of according to can be done in time, by Lemma 8. Suffix tree operations (search and insertion) for words of length at most can be performed in time (see [17]). Thus, the body of the inner loop at line 5 has complexity , and so the outer loop at line 3 has total complexity Since all other steps require at most operations, it follows that the overall complexity of the algorithm is .
Theorem 17.
Given a graph of degeneracy and maximum degree , and an integer , one can enumerate all maximal -plexes of size at least , without duplication, in time where denotes the number of maximal -plexes of size at least in , is a polynomial function of , and is an arbitrary function of . The algorithm is output-sensitive, with total running time proportional to the number of solutions.
Proof.
Let denote the number of maximal -plexes of size at least in , and let denote the number of such -plexes in the subgraph , for . In [2], the authors applied this framework to the specific case of -plexes, resulting in a fixed-parameter tractable algorithm with polynomial delay and time complexity , where is the order of the graph, is a polynomial function of , and is an arbitrary function of . This algorithm can be adapted to our decomposition into subgraphs , which yields a significant improvement in complexity: . More concretely, the approach consists of applying this algorithm to each subgraph , for . For each , the order of is , where is the maximum degree of . Consequently, the enumeration in each can be performed with a delay between two outputs in . Since there are such cliques in , for , the enumeration in takes time.
Moreover, by applying the same arguments used in the proof of Lemma 12, it can be shown that . It follows that the enumeration over all subgraphs can be performed in total time To ensure that only cliques which are maximal in are retained, a similar filtering procedure to that of Algorithm 2 can be used. This step requires time, as established in the proof of Theorem 16. Therefore, by Lemma 5, Lemma 6, and Corollary 7, this approach yields an exhaustive, duplication-free enumeration of all maximal -defective cliques of size at least in , with total time complexity
4.4 Application to the enumeration of -defective cliques
In this section, we show how to effectively extend our analytical framework to the enumeration of -defective cliques. We leverage the fact that a -defective clique is, by definition, a -plex, which allows us to reuse the algorithms previously developed for the enumeration of -plexes. By inserting an additional verification step into the enumeration algorithm, we ensure that only those -plexes satisfying the -defectiveness condition are retained. This refinement enables the duplication-free generation of all maximal -defective cliques of size at least in a given graph, as established in Theorem 16.
We then introduce an alternative approach based on an output-sensitive algorithm that fully exploits the degeneracy and the maximum degree of the graph to improve the computational complexity. Theorem 19 formalizes this optimization by providing a bound on the total running time in terms of the number of solutions. To the best of our knowledge, these are the first output-sensitive bounds for this problem that are expressed solely in terms of the parameters , , , and the number of maximal -defective cliques. Since the proofs of Theorems 18 and 19 are similar to the proofs presented in Section 4.3.
Theorem 18.
Given a graph of degeneracy and maximum degree , and an integer , one can enumerate all maximal -defective cliques of size at least , without duplication, in time
Proof.
Since every k-defective clique is a -plex, the enumeration algorithm for maximal -plexes can be leveraged to enumerate all maximal k-defective cliques. Specifically, Algorithm 2 is applied to enumerate all maximal -plexes of size at least . Immediately after Line 11 of Algorithm 2, an additional verification step is incorporated to test whether the current -plex is also a -defective clique. This verification consists in counting the number of missing edges in the subgraph induced by the current -plex. It can be performed in time. According to Theorem 16, Algorithm 2 enumerates all maximal -plexes of size at least in time. Since the additional verification for the -defective clique condition is performed once per candidate and requires at most time, the total time per candidate becomes . Finally, by Lemma 9, Lemma 10, and Corollary 7, the suffix-tree-based filtering guarantees that only the k-defective cliques which are maximal in are retained, and that no duplicates are produced. Hence, this approach yields an exhaustive, duplication-free enumeration of all maximal k-defective cliques of size at least in , with overall time complexity .
Theorem 19.
Given a graph of degeneracy and maximum degree , and an integer , one can enumerate all maximal -defective cliques of size at least , without duplication, in time where denotes the number of maximal -defective cliques of size at least in . The algorithm is output-sensitive, with total running time proportional to the number of solutions.
Proof.
Let denote the number of maximal -defective cliques of size at least of , and let denote the number of maximal -defective cliques of size at least in the subgraph , . In [10], Algorithm 3 enumerates all maximal -defective cliques of size at least in a graph , with a delay of between two outputs, where is the order of the graph and denotes the size of the largest maximal -defective clique in . This algorithm can be adapted to our decomposition into subgraphs , which allows for a significant complexity improvement to , where is the degeneracy of the input graph and its order. More concretely, the approach consists in applying Algorithm 3 to each subgraph , . For every , the order of is , where is the maximum degree of . Moreover, Lemma 4 ensures that the size of the largest maximal -defective clique of size at least in is at most . Thus, the enumeration in each can be performed with a polynomial delay of . Consequently, the enumeration in each can be performed with a polynomial delay between two outputs in . Since there are such cliques in , for , the enumeration in takes time. Moreover, by applying exactly the same arguments used in the proof of Lemma 12, it can be shown that . It follows that the enumeration over all subgraphs can be performed in total time . To ensure that only cliques which are maximal in are retained, one can use a similar filtering procedure as in Algorithm 2, which requires time, as established in the proof of Theorem 16. Therefore, by Lemma 5, Lemma 6, and Corollary 7, this approach yields an exhaustive, duplication-free enumeration of all maximal -defective cliques of size at least in with time.
4.5 Application to the enumeration of 2-plexes
In this section, we focus on the enumeration of maximal 2-plexes. We build upon some of the combinatorial results presented in the previous sections and apply a variation of a result by Eppstein [12] concerning the enumeration of non-induced maximal bicliques. This allows us to derive a linear-time algorithm, with complexity , to enumerate maximal 2-plexes in graphs of constant degeneracy.
Recall that a non-induced biclique is a bipartite subgraph such that every vertex in is adjacent to every vertex in , without the requirement that and form independent sets. Such a biclique is said to be maximal if it is not properly contained in any other biclique of the same form. Moreover, given an undirected graph , a -bounded orientation of is any orientation of the edges such that each vertex has out-degree at most . In this section, the term biclique always refer to non-induced bicliques.
Lemma 20.
Let be an undirected -degenerate graph, and let be a 2-plex of size at least 3 and maximal in . Then there exists a maximal non-induced biclique in such that .
Proof.
Let be an undirected -degenerate graph, and let be a maximal 2-plex of size at least 3 in . By Lemma 5, there exists such that and . Let us define , and , so that . By construction, , and thus . Since is a 2-plex, can contain at most one vertex in , ensuring that has at most one non-neighbor in . Hence, . Furthermore, by the definition of , the vertex is adjacent to every vertex in , and every is adjacent to every vertex in as well; otherwise, would have more than one non-neighbor in , contradicting the fact that is a 2-plex. Consequently, forms a biclique. If is maximal, the lemma holds, otherwise it is included in a maximal non-induced biclique containing .
Lemma 21 (Folklore).
If a graph G has degeneracy d, it has a d-bounded orientation.
Lemma 22 (Eppstein [12]).
Given a graph with a -bounded orientation, and a collection of sets , where each is a 2-plex of size at most in , it is possible to compute the corresponding non-induced bicliques in total time , where denotes the total size of all sets .
Proof.
We rely on a result established in [12], which shows that, given a graph equipped with a -bounded orientation and a collection of sets , where each is a subset of vertices, all non-induced bicliques can be listed in total time , where denotes the total size of sets . In our case, we consider as candidates the sets defined as 2-plexes of size at most in the subgraphs , , according to Definition 1. More specifically, we restrict to sets , since contains at most vertices due to the degeneracy of the graph. Therefore, each contains at most such subsets. Summing over all vertices , the total number of sets is thus bounded by . All these sets can therefore be listed in time . Then, by applying the result from [12] to this collection, all non-induced bicliques can be listed in total time since .
The proof of the following theorem is similar to that given by Eppstein [12] for their linear-time algorithm to enumerate maximal non-induced bicliques, with a few additional simple steps.
Theorem 23.
If an undirected graph has degeneracy , then all maximal 2-plexes of can be listed in time.
Proof.
We rely on Theorem 1 from [12], which establishes that for any graph with arboricity , all maximal non-induced bicliques of can be listed without duplication in total time , where denotes the order of the graph. In their proof, the arboricity is used to ensure the existence of a -bounded acyclic orientation of the graph. However, in our context, Lemma 21 guarantees that any -degenerate graph admits a -bounded orientation. Moreover, Lemma 3 from [12] ensures that such a -bounded orientation implies the existence of a -bounded acyclic orientation, which can be computed in linear time. It follows that a graph with degeneracy admits a -bounded acyclic orientation, thus allowing the application of Theorem 1 from [12] within our framework. To make use of this result, we adapt the technical lemmas from [12] by replacing their Lemmas 2 and 4 with our Lemmas 21 and 22, respectively. Indeed, in the proof of Lemma 21, we define a collection of sets as the 2-plexes of size at most in the subgraphs , which enables the efficient listing of non-induced bicliques of the form . Then, Lemma 20 ensures that every maximal 2-plex in is included in some maximal biclique of .
However, this is not sufficient to recover exactly all maximal 2-plexes. Recall that in Lemma 20, it is shown that a maximal 2-plex is composed of a vertex (the vertex with the smallest rank in the 2-plex, with respect to a degeneracy ordering), a set of its neighbors with higher rank at distance one, and a single vertex with the highest rank at distance two. Therefore, given a maximal non-induced biclique , to find all 2-plexes included in that biclique, it suffices to identify the vertices in that are not neighbors of the vertex . This can be done in time . According to Theorem 1 of Eppstein [12], the computation of the sets requires time, where is the arboricity of the graph. Recalling that , the total time complexity becomes , which yields the claimed result.
References
- [1] A.Conte, T. De Matteis, D. De Sensi, R. Grossi, A. Marino, and L. Versari. D2k: Scalable community detection in massive networks via small-diameter k-plexes. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1272β1281. ACM, 2018. doi:10.1145/3219819.3220093.
- [2] D. Berlowitz, S. Cohen, and B. Kimelfeld. Efficient enumeration of maximal k-plexes. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 431β444. ACM, 2015. doi:10.1145/2723372.2746484.
- [3] C. Bron and J. Kerbosch. Algorithm 457: Finding all cliques of an undirected graph. Communications of the ACM, 16(9):575β577, 1973. doi:10.1145/362342.362367.
- [4] Y. Cai and M. Xiao. A new exact algorithm for the maximum k-defective clique problem, 2023. arXiv:2309.02635.
- [5] Y. Cai and M. Xiao. A simple yet effective branching algorithm for the maximum -defective clique problem, 2024. arXiv:2407.16588.
- [6] X. Chen, Y. Zhou, J.-K. Hao, and M. Xiao. Computing maximum k-defective cliques in massive graphs. Computers & Operations Research, 127:105131, 2021. doi:10.1016/J.COR.2020.105131.
- [7] S. Cohen, B. Kimelfeld, and Y. Sagiv. Generating all maximal induced subgraphs for hereditary and connected-hereditary graph properties. Journal of Computer and System Sciences, 74(7):1147β1159, 2008. doi:10.1016/j.jcss.2007.12.006.
- [8] A. Conte, P. Foggia, and M. Vento. Exploring community structures using relaxed cliques. European Journal of Operational Research, 226(1):103β113, 2013. doi:10.1016/j.ejor.2012.10.031.
- [9] A. Conte, T. De Matteis, D. De Sensi, R. Grossi, A. Marino, and L. Versari. D2k: Scalable community detection in massive networks via small-diameter k-plexes. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1272β1281. ACM, 2018. doi:10.1145/3219819.3220101.
- [10] Q. Dai, R.-H. Li, M. Liao, and G. Wang. Maximal defective clique enumeration. Proceedings of the ACM on Management of Data, 1(1):Article 77, 2023. doi:10.1145/3588931.
- [11] Q. Dai, R.-H. Li, and G. Wang. Output-sensitive enumeration of maximal k-plexes of minimum size in massive graphs, 2024. arXiv:2402.13008.
- [12] D. Eppstein. Arboricity and bipartite subgraph listing algorithms. Information Processing Letters, 51(4):207β211, 1994. doi:10.1016/0020-0190(94)90121-X.
- [13] T. Gao, Z. Xu, R. Li, and M. Yin. An exact algorithm with new upper bounds for the maximum k-defective clique problem in massive sparse graphs. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 10174β10183, 2022.
- [14] M. GrbiΔ, A. Kartelj, S. JankoviΔ, D. MatiΔ, and V. FilipoviΔ. Variable neighborhood search for partitioning sparse biological networks into the maximum edge-weighted k-plexes. arXiv preprint arXiv:1807.01160, 2018. arXiv:1807.01160.
- [15] X. Li, H. Chen, and H. Yan. The identification of protein complexes from weighted ppi networks using a novel core-attachment method. BMC Bioinformatics, 11(1):1β13, 2010. doi:10.1186/1471-2105-11-595.
- [16] Y. Liu, L. Zhou, and Q. Liu. Community detection based on k-plexes in complex networks. Physica A: Statistical Mechanics and its Applications, 503:999β1009, 2018. doi:10.1016/j.physa.2018.02.159.
- [17] G. Manoussakis. A new decomposition technique for maximal clique enumeration for sparse graphs. Theoretical Computer Science, 770:25β33, 2019. doi:10.1016/j.tcs.2018.10.014.
- [18] C. Verma and A. K. Tripathi. Enumeration of k-defective cliques, 2024. arXiv:2407.16588.
- [19] Z. Wang, Y. Zhou, M. Xiao, and B. Khoussainov. Listing maximal k-plexes in large real-world graphs. In Proceedings of the ACM Web Conference 2022, WWW β22, pages 1517β1527, New York, NY, USA, 2022. Association for Computing Machinery. doi:10.1145/3485447.3512198.
- [20] B. Wu and X. Pei. A parallel algorithm for enumerating all the maximal k-plexes. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 476β483. Springer, 2007. doi:10.1007/978-3-540-71701-0_46.
- [21] M. Yannakakis. Node- and edge-deletion np-complete problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing (STOC), pages 253β264. ACM, 1978. doi:10.1145/800133.804355.
- [22] H. Yu, X. Zhao, and H. Huang. Discovering protein complexes based on k-plexes in protein interaction networks. PLOS ONE, 9(7):e101541, 2014. doi:10.1371/journal.pone.0101541.
- [23] J. Zhou, X. Chen, J. Huang, and Q. Fang. Efficient community detection based on k-plexes in social networks. Information Sciences, 512:1172β1192, 2020. doi:10.1016/j.ins.2019.10.057.
- [24] Y. Zhou, J. Xu, Z. Guo, M. Xiao, and Y. Jin. Enumerating maximal k-plexes with worst-case time guarantee. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 2442β2449, 2020.