Deterministically Counting -Paths and Trees Parameterized by Treewidth in Single-Exponential Time
Abstract
In this paper, we give new and faster deterministic algorithms to count the number of -paths and trees in host graphs of bounded treewidth. Our algorithms use time that is single-exponential in the treewidth, and employ the determinant method from [4]. Modifications of the algorithms count in single-exponential time the number of -paths between specified end-points, the number of -cycles, and the number of trees with k vertices that are a subgraph of the host graph.
Keywords and phrases:
Parameterized Complexity, Counting Subgraphs, #k-path, Dynamic Programming, Tree Decomposition, Determinant MethodCopyright and License:
2012 ACM Subject Classification:
Theory of computation Parameterized complexity and exact algorithmsEditors:
Akanksha Agrawal and Erik Jan van LeeuwenSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
In this paper, we consider the problem of counting subgraphs isomorphic to a graph in a given graph . This well-studied problem has several applications, e.g., in bio-informatics when modeling biological networks [1, 12, 13] or when analyzing social networks [8, 17]. The frequency of specific subgraphs may give valuable insights into the functionality of a network and can be used to test the quality of network models.
Counting problems are generally much harder than their decision counterparts, asking whether a subgraph occurs in at all. Indeed, decision problems with a trivial solution may have a counting variant that is as hard as the counting variant of some NP-complete problems, and many counting problems are #-Complete [14]. For this reason, most research focuses on the parameterized complexity of subgraph counting problems.
To categorize the complexity of intractable parameterized counting problems, a complexity class hierarchy analogous to the -hierarchy of Downey and Fellows was described independently by Flum and Grohe [9] and McCartin [11]. Notably, Flum and Grohe showed that, while deciding whether there is a path of length in a given graph is , counting -paths parameterized by is -complete. This makes it unlikely that there is a polynomial time algorithm that counts -paths even when is small.
The main contributions of this paper are new and faster algorithms to count respectively paths and trees in graphs of bounded treewidth. First, an time Dynamic Program for counting the number of -paths in a host graph is presented. Although the counting of -paths parameterized by path length has been researched a lot [2, 3, 5, 10], exact algorithms parameterized by treewidth have seen little attention. The currently fastest exact algorithm for counting -paths parameterized by treewidth is a well-known elementary Dynamic Program using a tree decomposition of to solve the problem in time. The presented Dynamic Program improves on this by providing a single-exponential time algorithm for the problem, meaning that a solution is found in time for a constant .
The second contribution of this paper is an time algorithm for #-Tree, the problem of counting the number of copies of a tree with vertices and leaves in a host graph . This result generalizes the -path counting algorithm in the first part of the paper. Since -paths are trees, the counting of labeled trees is at least as hard as counting -paths, showing #-hardness when parameterized by . Verstraete and Mubayi proved a lower bound for the number of labeled copies of tree in host graph of , where is the number of vertices in , is the average degree of vertices in , is the number of edges in and , provided that [16]. The current fastest exact algorithm to solve #-Tree parameterized by treewidth is an elementary Dynamic Program that takes time. The presented algorithm, therefore, improves on this running time when is significantly smaller than and even provides a time algorithm when is constant.
The techniques presented in this paper build upon the determinant method introduced by Bodlaender et al. in 2015 [4]. This method counts classes of subgraphs that are isomorphic to trees and include a specific vertex, such as Hamiltonian Paths and Steiner Trees. To develop single-exponential time algorithms for #-Path and #-Tree, we relax the constraint of including a specific vertex. This modification enables the counting of subgraphs without restrictions on which vertices are included. Additionally, we demonstrate that the determinant method can be extended to count subgraphs formed by “gluing together” simpler graphs.
Furthermore, the constructed Dynamic Programs can be relatively straightforwardly modified to count the number of fixed end-point -paths and -cycles, and to solve #-Tree (asking for the number of occurrences of all trees with edges in host graph ) in time. This last problem is proven to be #-hard when parameterized by by Brand and Roth [6].
2 Preliminaries
Nice Tree decompositions
In this paper, we utilize nice tree decompositions, following the definition provided by Cygan et al. [7].
Counting trees using determinants
This section explores a determinant-based method for counting subgraphs, developed by Bodlaender et al. [4].
Given an undirected graph and a nice path or tree decomposition of with width or respectively, define the incidence matrix of as follows:
Definition 1 (incidence matrix).
The incidence matrix of a graph is an matrix whose rows are indexed by the vertices in and whose columns are indexed by the edges in . Its entries are:
| (1) |
where we use the inherent ordering of vertices in . Let be the set of end-points of edges in ; that is, . Now, consider the reduced incidence matrix of an edge set :
Definition 2 (reduced incidence matrix).
Let , and . The matrix is the submatrix of whose rows are those of with indices in , and whose columns are those of with indices in .
Leveraging the definition of reduced incidence matrices, we can explore a central lemma in the determinant-based counting method; this lemma is used as a step in the proof for the Matrix Tree Theorem (see, for instance, the combination of Lemmas 2.2 and 2.5 in [15]).
Lemma 3.
Let such that . For any vertex in , the following holds: if is a tree, then . Otherwise, .
Corollary 4.
Let such that . If is a tree, then . Otherwise, .
Corollary 4 allows us to count the number of trees within a collection of edge sets by calculating the value of
| (2) |
provided that that for all .
3 Counting k-paths
Let be an undirected graph, and let . We address the following problem:
#-Path Input: A graph and a parameter Output: ; the number of (not necessarily induced) subgraphs of that are isomorphic to a simple (non-overlapping) path with exactly vertices.
This section details a Dynamic Programming algorithm for solving the #-Path problem. Our method first identifies a specific collection of edge sets, , such that the set of all trees in is identical to the set of all simple paths within the graph . We then utilize within Corollary 4 to establish a partial sum constraint that all Dynamic Programming entries must satisfy.
3.1 Partial sum
In line with Corollary 4, we define the collection of edge sets as follows:
An edge set constitutes a simple path if and only if the graph is a tree where every vertex has a degree of at most two. Consequently, the set of trees in is identical to the set of simple paths within the graph . The condition that for all is inherently satisfied by all trees and thus does not alter the number of trees, but does ensure the compatibility of with Corollary 4. The number of simple paths with length in graph is now given by
| (3) |
The condition that , while perhaps intuitive, complicates our Dynamic Programming design by requiring us to track both the number of edges and vertices within any edge set. A more computationally efficient strategy involves counting only the total number of vertices and the number of degree-one vertices. This is particularly advantageous because a simple path invariably has exactly two degree-one vertices. Consequently, we will employ the following proposition:
Proposition 5.
Suppose is a nonempty set such that for all . Then the number of degree one vertices in is equal to two if and only if .
With the equivalence from Proposition 5, we can rewrite our definition of (without changing itself):
| (4) |
We can expand the determinant in Equation 3.1 to obtain
| (5) | ||||
| (6) |
The bijection maps each edge in to a unique vertex in . Observe that if is distinct from both and , the corresponding term in the sum becomes zero due to the zero entry in the incidence matrix. Utilizing the property that , we have .
To compute Equation 6, we will utilize Dynamic Programming on a given tree decomposition of . For this purpose, we define the following parameters:
-
For every bag , denotes the degree of every vertex in in
-
For every bag , denotes for every vertex in whether it has been used as a value in the bijection
-
For every bag , denotes for every vertex in whether it has been used as a value in the bijection
-
Let denote the number of forgotten vertices (in ) that have degree one in
-
Let denote the number of edges in
Furthermore, we define the following useful sets:
-
Let be the collection of all edge sets with exactly edges, with vertices of degree one in , and such that the degree of a vertex in is equal to and the degree of all vertices is at most two.
-
Consider a set of edges , and a set of vertices . For , we define as the set of all bijective functions that map each edge in to a unique vertex in .
Intuitively, represents the possible bijections between the edges in and a selection of vertices from . Since Proposition 5 states that , not all vertices in can be part of this bijection. The sets and both identify the vertices that are excluded from this mapping. A key distinction is that contains vertices within the bag , whereas contains vertices that are strictly outside of .
For a valid -path we will have that as in Equation 6. However, during the construction of this -path, we might encounter intermediate states with multiple connected components. In such cases, might need to be greater than one to ensure a valid bijection exists, as more vertices need to be excluded from . Ultimately, we are guaranteed that . This is because if , the resulting edge sets form cycles, leading to a determinant of zero. Conversely, if , the resulting edge sets become disconnected, implying more than two vertices with degree one.
For every entry of the Dynamic Program in bag we will require that the following partial sum holds:
| (7) |
where we have labeled the ’s and for later reference. Filling in variables, we see that in a root bag where and , we have that , the number of -paths in .
3.2 Dynamic program
This section details the computation of tables for the functions (defined in Equation 7) for each node type in nice tree decompositions.
Leaf bag
| (8) |
In a leaf bag, we have that . Filling this into Equation 7 we see that (1) has no summands unless . In this case, (1) has one summand with , (2) has one summand with , (3) has one summand with (so that ), and (4) has no elements.
Introduce vertex bag with child
| (9) |
Since vertex is introduced in this bag, no edge in is incident to . Therefore for all , and (1) in Equation 7 has no summands if . Similarly, all summands in (3) vanish whenever or if .
If , then reduces to
.
Forget vertex bag with child
When , we set
If , we set
In a forget vertex bag , the vertex is excluded from . We determine the number of satisfying edge sets by summing over the possible degrees of .
When , the edge sets that satisfy Equation 7 are the same as the edge sets in .
When , the edge sets satisfying Equation 7 in bag are the same as the edge sets in child that have one less degree one vertex in . If , reduces to if , and zero otherwise. When , reduces to when , and zero otherwise.
Finally, when we see that reduces to when . When , reduces to .
Introduce edge bag with child
For and we set
where is equal to , except that , and . If or , we can not have that . In this case, we set
In an introduce edge bag we have two options: either is included in , or is not included in . If is not included in , then reduces to .
If is included in , we have to account for all possible bijections and . First note that for all bijections where , we have that and the term vanishes. Therefore we fix or .
The inclusion of into the bijections might change their sign. Since for all and we have that , no vertex in can contribute to a change in sign. Also, since is introduced in bag , we have that for all . The change in sign of is therefore equal to .
Join bag with children and
| (10) |
where we use coordinate-wise addition.
3.3 Running time analysis
Given a host graph together with a nice path or tree decomposition, we can traverse the decomposition in post-ordering, calculating entries according to the Dynamic Program described above. This procedure gives rise to the following theorem:
Theorem 6.
There exists a deterministic algorithm that, given a graph , a parameter and a nice tree decomposition of with treewidth , can solve #-Path in time. Given a nice path decomposition with pathwidth , there exists an algorithm that can solve #-Path in time.
4 Counting trees
In this section, we will discuss a generalization of the -path problem where we count subgraphs isomorphic to a tree .
Let be an undirected graph. Then the #-Tree problem on is defined as:
#-Tree Input: A graph and a tree that has leaf nodes and vertices Output: ; the number of (not necessarily induced) subgraphs of that are isomorphic to , not accounting for internal symmetries in .
To count occurrences of a target tree using Dynamic Programming, we will decompose it into path components. This approach allows us to build upon the path-counting methods detailed in Section 3. However, vertices with a degree of three or more, acting as convergence points for multiple paths, will pose a specific challenge requiring dedicated attention.
4.1 Tree labeling
To count occurrences of a specific tree , we first need to establish a unique labeling for it. This labeling is derived by decomposing into path segments, a process that considers three cases based on the degree of each vertex in :
-
deg; is a leaf.
-
deg; is a path node.
-
deg; is an internal junction.
We define the path segments as follows:
-
Starting in every leaf node with an unlabeled edge, we follow the unique path along the (possibly empty) set of connected path nodes, until we encounter an internal junction or another leaf. The paths traced in this way are called the leaf paths of . Note that if a leaf path terminates in two leaf nodes, is isomorphic to a simple path, and its occurrences can be counted using the method in Section 3. For the remainder of this paper, we assume all leaf paths end in one leaf and one internal junction.
-
For each internal junction, trace the paths starting in its unlabeled edges along the (possibly empty) set of connected path nodes until we encounter another internal junction (note that we can not encounter a leaf node, as per the previous step). These path segments are called internal paths.
For an illustration of this procedure, see Figure 1. By design, all edges along paths originating from leaves or internal junctions are labeled. Consequently, since any path not starting at such a point must form a cycle, all edges in the (acyclic) tree are labeled.
.png)
.png)
Let be the set of internal junctions in , let the set of internal paths in denoted by their vertices, and the set of leaf paths in denoted by their vertices. We will call the labeling of . Using this labeling, we can prove the following theorem, which will help us count the distinct graphs isomorphic to :
Theorem 7.
Let be a tree with labeling , let be a graph, and let be a set of edges in . Then the subgraph is isomorphic to if and only if:
-
is a tree
-
There exists a labeling function , such that:
-
1)
Internal junctions are the image of exactly one vertex in
-
–
i.e. for all , we have that
-
–
-
2)
Path length is preserved
-
–
i.e. for all internal paths , we have that
-
–
and, for all leaf paths , we have that
-
–
-
3)
Vertex degrees in paths are preserved
-
–
i.e. for all internal paths , and for all vertices , we have that .
-
–
and, for all leaf paths , there is exactly one vertex such that . For all vertices , we have that .
-
–
-
4)
Neighborhoods are preserved
-
–
i.e. for all internal paths , and for all vertices , let denote the neighborhood of in . We have that .
-
–
and, for all leaf paths , and for all vertices , we have that .
-
–
-
1)
4.2 Partial sum
Consider a graph and a subset of its edges . Let be a tree with vertices, leaves, and labeling . Define as the set of functions satisfying the conditions of Theorem 7. If denotes the number of distinct labeled subgraphs of isomorphic to , then Theorem 7 implies that
| (11) |
where the second step follows from Corollary 4. Consider any edge set . If there exists a function , then by its defining properties (1) and (2), we have:
-
-
for all internal paths
-
for all leaf paths
Since every vertex in is either an internal junction or belongs to an internal/leaf path, it follows that . Consequently, the condition in the first summation simplifies to . This allows us to expand the determinant in Equation 4.2 analogously to Section 3, resulting in
| (12) |
We now construct the partial sum that defines each entry in our Dynamic Program. The definitions of , and are retained from Section 3. Furthermore, we will define new parameters:
-
For every bag , let denote the label assigned to every vertex in bag . Relating to Theorem 7, can be interpreted as when restricted to the vertices in .
-
Let denote for every internal junction in whether some vertex should be labeled with in the construction of a function . The vector is indexed by the internal junctions in , and we will denote by the value corresponding to internal junction . Conversely, we will denote by all internal junctions in for which is equal to one.
-
Let denote for every internal path in and leaf path in how many vertices in should be labeled with this path in the construction of a function . The vector is indexed by the internal and leaf paths in , and we will denote by the value corresponding to the path . Conversely, we will denote by all paths in for which is at least one.
-
Let denote for every leaf path in the number of forgotten vertices with degree one for which should map to in the construction of some .
-
Let be the collection of all edge sets such that there exists a labeling function that obeys the following requirements:
-
0)
The functions and are identical when restricted to bag vertices
-
–
i.e. for all bag vertices , we have that
-
–
-
1)
All internal junctions in are the image of exactly one vertex in under
-
–
i.e. for all internal junctions , we have that
-
–
-
2)
All paths in are the image of exactly vertices in under
-
–
i.e. for all paths , we have that
-
–
-
3)
Vertex degrees are preserved in forgotten vertices and are equal to in bag vertices
-
–
i.e. for all internal junctions , if the vertex (there is exactly one such vertex as per (1)) is a forgotten vertex, then deg. If is a bag vertex, then deg.
-
–
for all internal paths , we have that all forgotten vertices have degree two in . All bag vertices have degree in .
-
–
and, for all leaf paths , we have that exactly forgotten vertices in have degree one in , and all other vertices in have degree two in . All bag vertices have degree in .
-
–
-
4)
Neighborhoods are preserved
-
–
i.e. for all internal paths , and for all vertices , let denote the neighborhood of in . We have that .
-
–
and, for all leaf paths , and for all vertices , we have that .
-
–
-
0)
With the parameters defined above, we will determine the Dynamic Program entries for each bag such that
| (13) |
Consider a vector indexed by all internal paths () and leaf paths () of . For each internal path , the corresponding value in is len, and for each leaf path , the value is len. Then, in a bag with and , the expression precisely equals Equation 12, thus yielding the number of subgraphs in isomorphic to .
4.3 Dynamic program
In this section, we will describe a Dynamic Program that iteratively calculates all partial sums of Equation 13. The entry calculations are accompanied by a brief explanation.
Leaf Bag
| (14) |
Introduce vertex bag with child
| (15) |
Because vertex is introduced in this bag, no edge in is incident to , and therefore the partial sum reduces to zero if or . Depending on the label there are slightly different cases.
Forget vertex bag with child
| (16) |
To obtain the total number of partial solutions in a forget bag we sum over all possible degrees of in (denoted as ). The case that accounts for the edge sets such that . The case that accounts for the edge sets where is a leaf, the case that , accounts for the edge sets where is a path node, and the case where , accounts for the edge sets where is an internal junction.
Introduce edge bag with child
Due to the preservation of neighborhoods requirement (4) for , the edge can only be included in if there is some internal path such that , or if there is some leaf path such that . Let be the indicator function that is equal to one if there exists such a path, and zero otherwise. If , , or if , we set
| (17) |
Otherwise, we set
| (18) |
In an introduce edge bag we have two options: either is included in , or it is not. In the case that , then reduces to . If we have to correctly account for the change in sign of bijections and . This is done in the same way as in Section 3.2.
Join bag with children and
| (19) |
where we use coordinate-wise addition.
4.4 Running time analysis
Using the Dynamic Program above we can prove the following theorem:
Theorem 8.
There exists a deterministic algorithm that, given a graph together with a nice tree decomposition of with treewidth and given a tree with leaf nodes and vertices with , can solve #-Tree in time. Given a nice path decomposition of with pathwidth , there exists an algorithm that solves #-Tree time.
5 Conclusion
In this paper, we have given deterministic algorithms for the exact counting of -paths and trees. These algorithms are single-exponential, parameterized by the tree- or pathwidth of the host graph , and are linear in the number of vertices in . Both algorithms utilize the determinant-based approach introduced by Bodlaender et al. [4] to construct a Dynamic Program. The algorithms in this paper can also be straightforwardly modified to count the number of cycles of length , the number of fixed-endpoint -paths, and the number of -Trees in single-exponential time when parameterized by the treewidth of the host graph.
Before our work, no algorithms were known for the studied problems with single-exponential dependency on the treewidth. However, the base of the exponents of the algorithm to count trees is rather large. An interesting question is whether these bases can be improved. One possible way in which the algorithm could be improved is if a more efficient way of counting the length of the internal and leaf paths is found. Keeping track of how many edges are in each path quickly blows up the running time, as there are possible combinations. A more efficient way of counting would significantly reduce the scaling of the running time with .
The techniques of this paper can be used to count even more general graph patterns. Suppose for example that some graph is ’almost’ acyclic. That is, there is some small set of edges that, when removed, becomes acyclic. Then we can treat the set of end-points of those edges in as internal junctions (so we can label vertices as end-points of such edges), and we can verify the existence of edges in between those end-points without including them in the bijections. The running time of such an algorithm would be similar to that of the tree counting algorithm but with replaced with .
References
-
[1]
Noga Alon, Phuong Dao, Iman Hajirasouliha, Fereydoun Hormozdiari, and S. Cenk
Sahinalp.
Biomolecular network motif counting and discovery by color coding.
Bioinformatics, 24(13):
i241–i249, 2008. doi:10.1093/bioinformatics/btn163. - [2] Noga Alon and Shai Gutner. Balanced hashing, color coding and approximate counting. In Proceedings 4th International Workshop on Parameterized and Exact Computation, IWPEC 2009, pages 1–16. Springer, 2009. doi:10.1007/978-3-642-11269-0_1.
- [3] V. Arvind and Venkatesh Raman. Approximation algorithms for some parameterized counting problems. In Proceedings 13th International Symposium on Algorithms and Computation, ISAAC 2002, pages 453–464. Springer, 2002. doi:10.1007/3-540-36136-7_40.
- [4] Hans L. Bodlaender, Marek Cygan, Stefan Kratsch, and Jesper Nederlof. Deterministic single exponential time algorithms for connectivity problems parameterized by treewidth. Information and Computation, 243:86–111, 2015. doi:10.1016/j.ic.2014.12.008.
- [5] Cornelius Brand, Holger Dell, and Thore Husfeldt. Extensor-coding. In ACM SIGACT 50th Annual Symposium on Theory of Computing, STOC 2018, pages 151–164, 2018. doi:10.1145/3188745.3188902.
- [6] Cornelius Brand and Marc Roth. Parameterized counting of trees, forests and matroid bases. In 12th International Computer Science Symposium in Russia: Computer Science – Theory and Applications, CSR 2017, pages 85–98. Springer, 2017. doi:10.1007/978-3-319-58747-9_10.
- [7] Marek Cygan, Jesper Nederlof, Marcin Pilipczuk, Michal Pilipczuk, Johan M.M. van Rooij, and Jakub Onufry Wojtaszczyk. Solving connectivity problems parameterized by treewidth in single exponential time. ACM Transactions on Algorithms, 18(2):17:1–17:31, 2022. doi:10.1145/3506707.
- [8] Alexandra Duma and Alexandru Topirceanu. A network motif based approach for classifying online social networks. In Proceedings 9th IEEE International Symposium on Applied Computational Intelligence and Informatics, SACI 2014,, pages 311–315. IEEE, 2014. doi:10.1109/SACI.2014.6840083.
- [9] Jörg Flum and Martin Grohe. The parameterized complexity of counting problems. SIAM Journal on Computing, 33(4):892–922, 2004. doi:10.1137/S0097539703427203.
- [10] Daniel Lokshtanov, Andreas Björklund, Saket Saurabh, and Meirav Zehavi. Approximate counting of k-paths: Simpler, deterministic, and in polynomial space. ACM Trans. Algorithms, 17(3), 2021. doi:10.1145/3461477.
- [11] Catherine McCartin. Parameterized counting problems. Annals of Pure and Applied Logic, 138(1):147–182, 2006. doi:10.1016/j.apal.2005.06.010.
- [12] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network Motifs: Simple building blocks of complex networks. Science, 298(5594):824–827, 2002. doi:10.1126/science.298.5594.824.
- [13] N. Pržulj, D. G. Corneil, and I. Jurisica. Modeling interactome: scale-free or geometric? Bioinformatics, 20(18):3508–3515, July 2004. doi:10.1093/bioinformatics/bth436.
- [14] L.G. Valiant. The complexity of computing the permanent. Theoretical Computer Science, 8(2):189–201, 1979. doi:10.1016/0304-3975(79)90044-6.
- [15] Janneke van den Boomen. The Matrix Tree Theorem, 2007. Bachelor thesis, Radbout Universiteit Nijmegen. URL: https://www.math.ru.nl/˜bosma/Students/jannekebc3.pdf.
- [16] Jacques Verstraete and Dhruv Mubayi. Counting Trees in Graphs. The Electronic Journal of Combinatorics, 23(3):P3.39, 2016. doi:10.37236/6084.
- [17] Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications, volume 8. Cambridge University Press, 1994. doi:10.1017/CBO9780511815478.
