Packing Cycles Faster Than Erd\H{o}s-P\'osa

The Cycle Packing problem asks whether a given undirected graph $G=(V,E)$ contains $k$ vertex-disjoint cycles. Since the publication of the classic Erd\H{o}s-P\'osa theorem in 1965, this problem received significant scientific attention in the fields of Graph Theory and Algorithm Design. In particular, this problem is one of the first problems studied in the framework of Parameterized Complexity. The non-uniform fixed-parameter tractability of Cycle Packing follows from the Robertson-Seymour theorem, a fact already observed by Fellows and Langston in the 1980s. In 1994, Bodlaender showed that Cycle Packing can be solved in time $2^{\mathcal{O}(k^2)}\cdot |V|$ using exponential space. In case a solution exists, Bodlaender's algorithm also outputs a solution (in the same time). It has later become common knowledge that Cycle Packing admits a $2^{\mathcal{O}(k\log^2k)}\cdot |V|$-time (deterministic) algorithm using exponential space, which is a consequence of the Erd\H{o}s-P\'osa theorem. Nowadays, the design of this algorithm is given as an exercise in textbooks on Parameterized Complexity. Yet, no algorithm that runs in time $2^{o(k\log^2k)}\cdot |V|^{\mathcal{O}(1)}$, beating the bound $2^{\mathcal{O}(k\log^2k)}\cdot |V|^{\mathcal{O}(1)}$, has been found. In light of this, it seems natural to ask whether the $2^{\mathcal{O}(k\log^2k)}\cdot |V|^{\mathcal{O}(1)}$ bound is essentially optimal. In this paper, we answer this question negatively by developing a $2^{\mathcal{O}(\frac{k\log^2k}{\log\log k})}\cdot |V|$-time (deterministic) algorithm for Cycle Packing. In case a solution exists, our algorithm also outputs a solution (in the same time). Moreover, apart from beating the bound $2^{\mathcal{O}(k\log^2k)}\cdot |V|^{\mathcal{O}(1)}$, our algorithm runs in time linear in $|V|$, and its space complexity is polynomial in the input size.


Introduction
The Cycle Packing problem asks whether a given undirected graph $G = (V, E)$ contains $k$ vertex-disjoint cycles. Since the publication of the classic Erdős-Pósa theorem in 1965 [16], this problem has received significant scientific attention in the fields of Graph Theory and Algorithm Design. In particular, Cycle Packing is one of the first problems studied in the framework of Parameterized Complexity. In this framework, each problem instance is associated with a parameter $k$ that is a non-negative integer, and a problem is said to be fixed-parameter tractable (FPT) if the combinatorial explosion in the time complexity can be confined to the parameter $k$. More precisely, a problem is FPT if it can be solved in time $f(k) \cdot |I|^{\mathcal{O}(1)}$ for some function $f$, where $|I|$ is the input size. For more information, we refer the reader to recent monographs such as [15] and [11].
In this paper, we study the Cycle Packing problem from the perspective of Parameterized Complexity. In the standard parameterization of Cycle Packing, the parameter is the number $k$ of vertex-disjoint cycles. The non-uniform fixed-parameter tractability of Cycle [...] directed graphs was the first W[1]-hard problem shown to admit such a scheme. Krivelevich et al. [31] obtained a factor $\mathcal{O}(|V|^{1/2})$ approximation algorithm for Cycle Packing in directed graphs and showed that this problem is quasi-NP-hard to approximate within a factor of $\mathcal{O}(\log^{1-\epsilon} |V|)$ for any $\epsilon > 0$.
Several variants of Cycle Packing have also received significant scientific attention. For example, the variant of Cycle Packing where one seeks k odd vertex-disjoint cycles has been widely studied [37,41,36,30,28,29]. Another well-known variant, where the cycles need to contain a prescribed set of vertices, has also been extensively investigated [25,34,26,24,27]. Furthermore, a combination of these two variants has been considered in [24,23].
Finally, we briefly mention that, inspired by the Erdős-Pósa theorem, a class of graphs $\mathcal{H}$ is said to have the Erdős-Pósa property if there is a function $f(k)$ such that every graph $G$ either contains $k$ vertex-disjoint subgraphs, each isomorphic to a graph in $\mathcal{H}$, or contains a set of $f(k)$ vertices that hits each of its subgraphs that is isomorphic to a graph in $\mathcal{H}$. A fundamental result in Graph Theory by Robertson and Seymour [38] states that the class of all graphs that can be contracted to a fixed planar graph $H$ has the Erdős-Pósa property. Recently, Chekuri and Chuzhoy [7] presented a framework that leads to substantially improved functions $f(k)$ in the context of results in the spirit of the Erdős-Pósa theorem. Among other results, these two works are also related to the recent breakthrough result by Chekuri and Chuzhoy [8], which states that every graph of treewidth at least $f(k) = \mathcal{O}(k^{98} \cdot \mathrm{polylog}(k))$ contains the $k \times k$ grid as a minor (the constant 98 has been improved to 36 in [9] and to 19 in [10]). Following the seminal work by Robertson and Seymour [38], numerous papers (whose survey is beyond the scope of this paper) investigated which other classes of graphs have the Erdős-Pósa property, which are the "correct" functions $f$ associated with them, and which generalizations of this property lead to interesting discoveries.

Our Contribution
In this paper, we show that the running time of the algorithm that is a consequence of the Erdős-Pósa theorem is not essentially tight. For this purpose, we develop a $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$-time (deterministic) algorithm for Cycle Packing. In case a solution exists, our algorithm also outputs a solution (in time $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$). Moreover, apart from beating the bound $2^{\mathcal{O}(k\log^2 k)} \cdot |V|^{\mathcal{O}(1)}$, our algorithm runs in time linear in $|V|$, and its space complexity is polynomial in the input size. Thus, we also improve upon the classical $2^{\mathcal{O}(k^2)} \cdot |V|$-time algorithm by Bodlaender [3]. Our result is summarized in the following theorem.

Theorem 1. There exists a (deterministic) polynomial-space algorithm that solves Cycle Packing in time $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$. In case a solution exists, it also outputs a solution.
Our technique relies on several combinatorial arguments that might be of independent interest, and whose underlying ideas might be relevant to the design of other parameterized algorithms. Let us now outline the structure of our proof, specifying the main ingredients that we develop along the way.
• First, we show that in time linear in $|V|$, it is easy to bound $|E|$ by $\mathcal{O}(k\log k \cdot |V|)$ (Assumption 1).
• Second, we give an algorithmic version of the Erdős-Pósa theorem that runs in time linear in $|V|$ and which outputs either a solution or a small feedback vertex set (Theorem 2).
• Then, we show that given a graph $G = (V, E)$ and a feedback vertex set $F$, a shortest cycle in $G$ can be found in time $\mathcal{O}(|F| \cdot (|V| + |E|))$ (Lemma 3).
• We proceed by interleaving an application of a simple set of reduction rules (Rules A1, A2 and A3) with a computation of a "short" cycle. Thus, given some $g > 6$, we obtain a set $S$ of size smaller than $gk$ such that the girth of the "irreducible component" of $G - S$ is larger than $g$ (Lemma 5). Here, the irreducible component of $G - S$ is the graph obtained from $G - S$ by applying our reduction rules.
• Next, we show that the number of vertices in the above-mentioned irreducible component is actually "small": for some fixed constant $c$, it can be bounded by $(2ck\log k)^{1+\frac{6}{g-6}} + 3ck\log k$ (Lemma 6). The choice of $g = \frac{48\log k}{\log\log k} + 6$ results in the bound $3ck\log k + 2ck\log^{1.5} k$ (Corollary 7).²
• Now, we return to examine the graph $G - S$ rather than only its irreducible component. The necessity of this examination stems from the fact that our reduction rules, when applied to $G - S$ rather than $G$, do not preserve solutions. We first give a procedure which, given any set $X$, modifies the graph $G - X$ in a way that both preserves solutions and gets rid of many leaves (Lemma 8). We then use this procedure to bound the number of leaves, as well as other "objects", in the reducible component of $G - S$ (Lemma 9).
• At this point, the graph $G$ may still contain many vertices: the reducible component of $G - S$ may contain "long" induced paths (which are not induced paths in $G$). We show that the length of these paths can be shortened by "guessing" permutations that provide enough information describing the relations between these paths and the vertices in $S$. Overall, we are thus able to bound the entire vertex-set of $G$ by $\mathcal{O}(k\log^{1.5} k)$ in time $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$ and polynomial space (Lemma 10).
• Finally, we apply a DP scheme (Lemma 11). Here, to ensure that the space complexity is polynomial in the input size, we rely on the principle of inclusion-exclusion.

Preliminaries
We use standard terminology from the book of Diestel [13] for those graph-related terms that are not explicitly defined here. We only consider finite graphs possibly having self-loops and multi-edges. Moreover, we restrict the maximum multiplicity of an edge to be 2. For a graph $G$, we use $V$ and $E$ to denote the vertex and edge sets of $G$, respectively. For a vertex $v \in V$, we use $\deg_G(v)$ to denote the degree of $v$, i.e., the number of edges incident on $v$, in the (multi) graph $G$. We also use the convention that a self-loop at a vertex $v$ contributes 2 to its degree. For a vertex subset $S \subseteq V$, we let $G[S]$ and $G - S$ denote the graphs induced on $S$ and $V \setminus S$, respectively. For a vertex subset $S \subseteq V$, we use $N_G(S)$ and $N_G[S]$ to denote the open and closed neighborhoods of $S$ in $G$, respectively. That is, $N_G(S) = \{v \mid \{u, v\} \in E, u \in S\} \setminus S$ and $N_G[S] = N_G(S) \cup S$. For a graph $G = (V, E)$ and an edge $e \in E$, we let $G/e$ denote the graph obtained by contracting $e$ in $G$. For $E' \subseteq \binom{V}{2}$, i.e., a subset of edges, we let $G + E'$ denote the (multi) graph obtained after adding the edges in $E'$ to $G$, and we let $G/E'$ denote the (multi) graph obtained after contracting the edges of $E'$ in $G$. The girth of a graph $G$ is denoted by $\mathrm{girth}(G)$, its minimum degree by $\delta(G)$, and its maximum degree by $\Delta(G)$. A graph with no cycles has infinite girth.
² We found these constants as the most natural ones to obtain a clean proof of any bound of the form $\mathcal{O}(\frac{k\log^2 k}{\log\log k})$ (that is, the constants were not optimized to obtain the bound $3ck\log k + 2ck\log^{1.5} k$).
A path in a graph is a sequence of distinct vertices $v_0, v_1, \ldots, v_\ell$ such that $\{v_i, v_{i+1}\}$ is an edge for all $0 \le i < \ell$. A cycle in a graph is a sequence of distinct vertices $v_0, v_1, \ldots, v_\ell$ such that $\{v_i, v_{(i+1) \bmod (\ell+1)}\}$ is an edge for all $0 \le i \le \ell$. Both a double edge and a self-loop are cycles. If $P$ is a path from a vertex $u$ to a vertex $v$ in the graph $G$ then we say that $u$ and $v$ are the end vertices of the path $P$ and $P$ is a $(u, v)$-path. For a path $P$, we use $V(P)$ and $E(P)$ to denote the sets of vertices and edges in the path $P$, respectively, and the length of $P$ is denoted by $|P|$ (i.e., $|P| = |V(P)|$). For a cycle $C$, we use $V(C)$ and $E(C)$ to denote the sets of vertices and edges in the cycle $C$, respectively, and the length of $C$, denoted by $|C|$, is $|V(C)|$. For a path or a cycle $Q$ we use $N_G(Q)$ and $N_G[Q]$ to denote the sets $N_G(V(Q))$ and $N_G[V(Q)]$, respectively. For a collection of paths/cycles $\mathcal{Q}$, we use $|\mathcal{Q}|$ to denote the number of paths/cycles in $\mathcal{Q}$ and $V(\mathcal{Q})$ to denote the set $\bigcup_{Q \in \mathcal{Q}} V(Q)$. We say a path $P$ in $G$ is a degree-two path if all vertices in $V(P)$, including the end vertices of $P$, have degree exactly 2 in $G$. We say $P$ is a maximal degree-two path if no proper superset of $P$ also forms a degree-two path. We note that the notions of walks and closed walks are defined exactly as paths and cycles, respectively, except that their vertices need not be distinct. Finally, a feedback vertex set is a subset $F$ of vertices such that $G - F$ is a forest.
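To make the definition of a feedback vertex set concrete, the following Python sketch checks whether removing a candidate set leaves a forest. The edge-list representation and the name `is_feedback_vertex_set` are illustrative assumptions of ours and are not part of the paper; in line with the conventions above, a self-loop or a double edge counts as a cycle.

```python
def is_feedback_vertex_set(vertices, edges, F):
    """Return True if deleting F from the multigraph leaves a forest.

    vertices: iterable of hashable vertex names.
    edges: list of pairs (u, v); repeated pairs are parallel edges,
           and (v, v) is a self-loop.
    F: candidate feedback vertex set (hypothetical input format).
    """
    removed = set(F)
    parent = {v: v for v in vertices if v not in removed}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in removed or v in removed:
            continue  # the edge disappears together with its endpoint
        ru, rv = find(u), find(v)
        if ru == rv:
            # The edge closes a cycle (this also covers self-loops
            # and the second copy of a double edge).
            return False
        parent[ru] = rv
    return True
```

For example, in a triangle on vertices 0, 1, 2 with an additional double edge between 2 and 3, the set {2} is a feedback vertex set while {0} is not.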
Below we formally state some of the key results that will be used throughout the paper, starting with the classic Erdős-Pósa theorem [16].

Proposition 1 ([16]). There exists a constant $c'$ such that every (multi) graph either contains $k$ vertex-disjoint cycles or has a feedback vertex set of size at most $c' k\log k$.
Observe that any (multi) graph $G = (V, E)$ whose feedback vertex set number is bounded by $c' k\log k$ has fewer than $(2c' k\log k + 1) \cdot |V|$ edges (recall that we restrict the multiplicity of an edge to be 2). Indeed, letting $F$ denote a feedback vertex set of minimum size, the worst case (in terms of $|E|$) is obtained when $G - F$ is a tree, which contains $|V| - |F| - 1$ edges, and between every pair of vertices $v \in F$ and $u \in V$, there exists an edge of multiplicity 2. Thus, by Proposition 1, in case $|E| > (2c' k\log k + 1) \cdot |V|$, the input instance is a yes-instance, and after we discard an arbitrary set of $|E| - (2c' k\log k + 1) \cdot |V|$ edges, it remains a yes-instance. A simple operation which discards at least $|E| - (2c' k\log k + 1) \cdot |V|$ edges and can be performed in time $\mathcal{O}(k\log k \cdot |V|)$ is described in Appendix A.

Assumption 1. We assume that $|E| = \mathcal{O}(k\log k \cdot |V|)$.

Now, we state our algorithmic version of Proposition 1. The proof partially builds upon the proof of the Erdős-Pósa theorem in the book [13], and it is given in Appendix B.

Theorem 2. There exists a constant $c$ and a polynomial-space algorithm such that, given a (multi) graph $G$ and a non-negative integer $k$, in time $k^{\mathcal{O}(1)} \cdot |V|$ it outputs either $k$ vertex-disjoint cycles or a feedback vertex set of size at most $ck\log k = r$.
Next, we state two results relating to cycles of average and short lengths. Itai and Rodeh [22] showed that given a (multi) graph $G = (V, E)$, an "almost" shortest cycle (if there is any) in $G$ can be found in time $\mathcal{O}(|V|^2)$. To obtain a linear dependency on $|V|$ (given a small feedback vertex set), we prove the following result in Appendix C.

Lemma 3. Given a (multi) graph $G = (V, E)$ and a feedback vertex set $F$ of $G$, a shortest cycle in $G$ (if there is any) can be found in time $\mathcal{O}(|F| \cdot (|V| + |E|))$.

Finally, we state a result that will be used (in Lemma 6) to bound the size of a graph we obtain after performing simple preprocessing operations as well as repetitive removal of short cycles.
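A running time of the form $\mathcal{O}(|F| \cdot (|V| + |E|))$ for shortest-cycle computation is what one obtains by running one breadth-first search per vertex of a feedback vertex set, since every cycle must pass through that set. The Python sketch below illustrates this idea for simple graphs only and returns just the length of a shortest cycle. It is our own illustration under these assumptions and is not claimed to be the procedure of Appendix C; the name `girth_via_fvs` and the adjacency-dictionary format are hypothetical.

```python
from collections import deque


def girth_via_fvs(adj, F):
    """Length of a shortest cycle of a simple graph, or None if acyclic.

    adj: dict mapping each vertex to an iterable of its neighbours.
    F:   a feedback vertex set of the graph (assumed to be correct).
    """
    best = None
    for s in F:
        dist = {s: 0}
        parent = {s: None}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    parent[y] = x
                    queue.append(y)
                elif parent[x] != y:
                    # Non-tree edge {x, y}: together with the two tree
                    # paths to s it contains a cycle of length at most
                    # dist[x] + dist[y] + 1.
                    cand = dist[x] + dist[y] + 1
                    if best is None or cand < best:
                        best = cand
    return best
```

Every candidate value is at least the girth, and for a shortest cycle through a root $s \in F$ some non-tree edge on that cycle yields a candidate equal to its length, so the minimum over all roots in $F$ is the girth.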

Removing Leaves, Induced Paths, and Short Cycles
As is usually the case when dealing with cycles in a graph, we first define three rules which help getting rid of vertices of degree at most 2 as well as edges of multiplicity larger than 2. It is not hard to see that all three Reduction Rules A1, A2, and A3 are safe, i.e. they preserve solutions in the reduced graph.
Reduction Rule A1. Delete vertices of degree at most 1.

Reduction Rule A2. If there is a vertex v of degree exactly 2 that is not incident to a self-loop, then delete v and connect its two (not necessarily distinct) neighbors by a new edge.
Reduction Rule A3. If there is a pair of vertices u and v in V such that {u, v} is an edge of multiplicity larger than 2, then reduce the multiplicity of the edge to 2.
Observe that the entire process that applies these rules exhaustively can be done in time linear in $|V| + |E|$. To this end, we first remove the vertex-set of each maximal path between a leaf and a degree-two vertex. No subsequent application of Rule A2 or Rule A3 creates vertices of degree at most one. Now, we iterate over the set of degree-two vertices. For each degree-two vertex that is not incident to a self-loop, we apply Rule A2. Next, we iterate over $E$, and for each edge of multiplicity larger than two, we apply Rule A3. At this point, the only new degree-two vertices that can be created are vertices incident to exactly one edge, whose multiplicity is two. Therefore, during one additional phase where we exhaustively apply Rule A2, the only edges of multiplicity larger than two that can be created are self-loops. Thus, after one additional iteration over $E$, we can ensure that no rule among Rules A1, A2 and A3 is applicable.
Since these rules will be applied dynamically and iteratively, we define an operator, denoted by $\mathrm{reduce}(G)$, that takes as input a graph $G$ and returns the (new) graph $G'$ that results from an exhaustive application of Rules A1, A2 and A3.

Definition 4. For a (multi) graph $G$, we let $G' = \mathrm{reduce}(G)$ denote the graph obtained after an exhaustive application of Reduction Rules A1, A2 and A3. $|\mathrm{reduce}(G)|$ denotes the number of vertices in $\mathrm{reduce}(G)$. Moreover, $\mathrm{img}(\mathrm{reduce}(G))$ denotes the pre-image of $\mathrm{reduce}(G)$, i.e., $\mathrm{img}(\mathrm{reduce}(G))$ is the set of vertices in $G$ which are not deleted in $\mathrm{reduce}(G)$.
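As an illustration of the reduce operator, the following Python sketch applies Rules A1, A2 and A3 until none is applicable. It is a simple worklist-free loop and makes no attempt to achieve the linear running time discussed above. The nested-dictionary representation, the name `reduce_graph`, and the choice to also cap the multiplicity of self-loops at 2 are assumptions made for this sketch.

```python
def reduce_graph(mult):
    """Exhaustively apply Rules A1-A3 and return the reduced multigraph.

    mult: dict vertex -> dict neighbour -> multiplicity, stored
          symmetrically; mult[v][v] is the number of self-loops at v.
    """
    g = {v: dict(nb) for v, nb in mult.items()}

    def degree(v):
        # A self-loop contributes 2 to the degree.
        return sum(2 * m if u == v else m for u, m in g[v].items())

    def add_edge(u, v):
        # Adding an edge and capping at 2 combines Rule A2 with Rule A3.
        g[u][v] = min(2, g[u].get(v, 0) + 1)
        g[v][u] = g[u][v]

    def delete_vertex(v):
        for u in list(g[v]):
            if u != v:
                del g[u][v]
        del g[v]

    # Rule A3 on the input graph.
    for v in g:
        for u in g[v]:
            g[v][u] = min(2, g[v][u])

    changed = True
    while changed:
        changed = False
        for v in list(g):
            d = degree(v)
            if d <= 1:                      # Rule A1
                delete_vertex(v)
                changed = True
            elif d == 2 and v not in g[v]:  # Rule A2
                ends = [u for u, m in g[v].items() for _ in range(m)]
                delete_vertex(v)
                add_edge(ends[0], ends[1])
                changed = True
    return g
```

With this representation, a tree reduces to the empty graph, a triangle reduces to a single vertex carrying a self-loop, and a graph of minimum degree 3 without edges of multiplicity larger than 2 is left unchanged.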
The first step of our algorithm consists of finding, in time linear in |V |, a set S satisfying the conditions specified in Lemmata 5 and 6. Intuitively, S will contain vertices of "short" cycles in the input graph, where short will be defined later.

Lemma 5. Given a (multi) graph $G = (V, E)$ and two integers $k > 0$ and $g > 6$, there exists a $k^{\mathcal{O}(1)} \cdot |V|$-time algorithm that either finds $k$ vertex-disjoint cycles in $G$ or finds a (possibly empty) set $S \subseteq V$ such that $\mathrm{girth}(\mathrm{reduce}(G - S)) > g$ and $|S| < gk$.
Proof. We proceed by constructing such an algorithm. First, we apply the algorithm of Theorem 2, which outputs either $k$ vertex-disjoint cycles or a feedback vertex set $F$ of size at most $ck\log k = r$. In the former case we are done. In the latter case, i.e., the case where a feedback vertex set $F$ is obtained, we apply the following procedure iteratively (initially, we set $S = \emptyset$):
(1) Apply Lemma 3 to find a shortest cycle $C$ in $\mathrm{reduce}(G)$.
(2) If no cycle was found or $|C| > g$ then return $S$.
(3) Otherwise, i.e., if $|C| \le g$, then add the vertices of $C$ to $S$, delete those vertices from $G$ to obtain $G'$, set $G = G'$, and repeat from Step (1).

Note that if Step (3) is applied $k$ times then we can terminate and return the corresponding $k$ vertex-disjoint cycles in $G$. Hence, when the condition of Step (2) is satisfied, i.e., when the described procedure terminates, the size of $S$ is at most $g(k - 1) < gk$ and $\mathrm{girth}(\mathrm{reduce}(G - S)) > g$. Since the algorithm of Theorem 2 runs in time $k^{\mathcal{O}(1)} \cdot |V|$, and each iteration of Steps (1)-(3) runs in time $k^{\mathcal{O}(1)} \cdot |V|$ (by Lemma 3 and Assumption 1) and is performed at most $k$ times, we obtain the desired time complexity.

Lemma 6. Given a (multi) graph $G = (V, E)$ and two integers $k > 0$ and $g > 6$, let $S$ denote the set obtained after applying the algorithm of Lemma 5 (assuming no $k$ vertex-disjoint cycles obtained). Then, $|\mathrm{reduce}(G - S)| \le (2ck\log k)^{1+\frac{6}{g-6}} + 3ck\log k$.
Proof. Let $G' = \mathrm{reduce}(G - S)$ and $|V'| = n'$. First, recall that $G$ admits a feedback vertex set of size at most $ck\log k = r$. Since Reduction Rules A1, A2 and A3 do not increase the feedback vertex set number of the graph (see, e.g., [35], Lemma 1), $G'$ also admits a feedback vertex set $F$ of size at most $r$. Let $T$ denote the induced forest on the remaining $N = n' - r$ vertices in $G'$. Moreover, from Lemma 5, we know that $\mathrm{girth}(G') > g > 6$.
Next, we apply Proposition 3 to $T$ to get $W$. Now with every element $a \in W$ we associate an unordered pair of vertices of $F$ as follows. Assume $a \in L$, i.e., $a$ is a vertex of degree 0 or 1 in $T$. Since the degree of $a$ is at least 3 in $G'$, $a$ has at least two neighbors in $F$. We pick two of these neighbors arbitrarily and associate them with $a$. We use $\{x_a, y_a\}$ to denote this pair. If $a = \{u, v\}$ is an edge from $M$ then each of $u$ and $v$ has degree at least 3 in $G'$ and each has at least one neighbor in $F$. We pick one neighbor for each and associate the pair $\{x_u, x_v\}$ with $a$.
We now construct a new multigraph $G^\star = (V^\star, E^\star)$ with vertex set $V^\star = F$ as follows. For every vertex $a \in W$ we include an edge in $E^\star$ between $x_a$ and $y_a$, and for every edge $a = \{u, v\} \in W$ we include an edge in $E^\star$ between $x_u$ and $x_v$. By Proposition 3, we know that $W$ is of size at least $\frac{N}{4}$. It follows that $G^\star$ has at least $\frac{N}{4}$ edges and hence its average degree is at least $\frac{N}{2r}$ as $|V^\star| = |F| \le ck\log k = r$. Note that if $G^\star$ has a cycle of length at most $\ell$, then $G'$ has a cycle of length at most $3\ell$, as any edge of the cycle in $G^\star$ can be replaced by a path of length at most 3 in $G'$. Combining this with the fact that $\mathrm{girth}(G') > g > 6$, we conclude that $G^\star$ contains no self-loops or parallel edges. Hence $G^\star$ is a simple graph with average degree at least $\frac{N}{2r}$. By Proposition 2, $G^\star$ must have a cycle of length at most $\frac{2\log r}{\log(\frac{N}{2r} - 1)}$, which implies that $G'$ must have a cycle of length at most $\frac{6\log r}{\log(\frac{N}{2r} - 1)}$. Finally, by using the fact that $\mathrm{girth}(G') > g$ and substituting $N$ and $r$, we get $g < \frac{6\log r}{\log(\frac{N}{2r} - 1)}$, which rearranges to the bound claimed in the lemma. This completes the proof.
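For completeness, one way to carry out the final rearrangement is sketched below. The intermediate steps are our own and only assume $r \ge 1$, $N = n' - r$, and the inequality $g < \frac{6\log r}{\log(\frac{N}{2r} - 1)}$; the case split on $N$ is introduced here so that the logarithm in the denominator is positive.

```latex
\begin{align*}
\text{If } N > 4r:\quad
  \log\Bigl(\frac{N}{2r} - 1\Bigr) < \frac{6}{g}\log r
  &\;\Longrightarrow\; \frac{N}{2r} - 1 < r^{6/g}
   \;\Longrightarrow\; N < 2r^{1 + 6/g} + 2r, \\
n' = N + r &< 2r^{1 + \frac{6}{g}} + 3r
   \;\le\; (2r)^{1 + \frac{6}{g-6}} + 3r. \\
\text{If } N \le 4r:\quad
n' = N + r &\le 5r \;\le\; (2r)^{1 + \frac{6}{g-6}} + 3r.
\end{align*}
```

Substituting $r = ck\log k$ gives the bound $(2ck\log k)^{1+\frac{6}{g-6}} + 3ck\log k$ on $|\mathrm{reduce}(G - S)|$.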
The usefulness of Lemma 6 comes from the fact that by setting $g = \frac{48\log k}{\log\log k} + 6$, we can guarantee that $|\mathrm{reduce}(G - S)| < 3ck\log k + 2ck\log^{1.5} k$, and therefore we can beat the $\mathcal{O}(k\log^2 k)$ bound. That is, we have the following consequence.
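Corollary 7. Given a (multi) graph $G = (V, E)$ and an integer $k > 0$, let $S$ denote the set obtained after applying the algorithm of Lemma 5 with $g = \frac{48\log k}{\log\log k} + 6$ (assuming no $k$ vertex-disjoint cycles obtained). Then, $|\mathrm{reduce}(G - S)| \le 3ck\log k + 2ck\log^{1.5} k$.
Proof. Substituting $g = \frac{48\log k}{\log\log k} + 6$ into the bound of Lemma 6 yields the claim. This completes the proof.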
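The substitution behind Corollary 7 can be spelled out as follows. This is a sketch of ours rather than the paper's own calculation: it assumes logarithms to base 2 and that $k$ is large enough for $\log(2c) + \log\log k \le 3\log k$ to hold.

```latex
\begin{align*}
g - 6 = \frac{48\log k}{\log\log k}
  &\;\Longrightarrow\; \frac{6}{g - 6} = \frac{\log\log k}{8\log k}, \\
(2ck\log k)^{\frac{6}{g-6}}
  &= 2^{\bigl(\log(2c) + \log k + \log\log k\bigr)\cdot\frac{\log\log k}{8\log k}}
   = (\log k)^{\frac{1}{8}} \cdot
     2^{\bigl(\log(2c) + \log\log k\bigr)\cdot\frac{\log\log k}{8\log k}} \\
  &\le (\log k)^{\frac{1}{8}} \cdot (\log k)^{\frac{3}{8}}
   = \log^{0.5} k, \\
(2ck\log k)^{1 + \frac{6}{g-6}} + 3ck\log k
  &\le 2ck\log^{1.5} k + 3ck\log k.
\end{align*}
```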

Bounding the Core of the Remaining Graph
At this point, we assume, without loss of generality, that we are given a graph $G = (V, E)$, a positive integer $k$, $g = \frac{48\log k}{\log\log k} + 6$, and a set $S \subseteq V$ such that $\mathrm{girth}(\mathrm{reduce}(G - S)) > g$, $|S| < gk$, and $|\mathrm{reduce}(G - S)| \le 3ck\log k + 2ck\log^{1.5} k$.
Even though the number of vertices in $\mathrm{reduce}(G - S)$ is bounded, the number of vertices in $G - S$ is unbounded. In what follows, we show how to bound the number of "objects" in $G - S$, where an object is either a vertex in $G - S$ or a degree-two path in $G - S$. The next lemma is a refinement extending a lemma by Lokshtanov et al. [32] (Lemma 5.2). We give a full proof in Appendix D.

Lemma 8. Let $G = (V, E)$ be a (multi) graph and let $X \subseteq V$ be any subset of the vertices of $G$. Suppose there are more than $|X|^2(2|X| + 1)$ vertices in $G - X$ whose degree in $G - X$ is at most one. Then, there is either an isolated vertex $w$ in $G - X$ or an edge $e \in E$ such that $(G, k)$ is a yes-instance of Cycle Packing if and only if either $(G - \{w\}, k)$ or $(G/e, k)$ is a yes-instance. Moreover, there is an $\mathcal{O}(|X|^2 \cdot k\log k \cdot |V|)$-time algorithm that, given $G$ and $X$, outputs a graph $G'$ such that $(G, k)$ is a yes-instance of Cycle Packing if and only if $(G', k)$ is a yes-instance of Cycle Packing, and $G' - X$ contains at most $|X|^2(2|X| + 1)$ vertices whose degree in $G' - X$ is at most one.
Armed with Lemma 8, we are now ready to prove the following result. For a forest $T$, we let $T_{\le 1}$, $T_2$, and $T_{\ge 3}$ denote the sets of vertices in $T$ having degree at most one in $T$, degree exactly two in $T$, and degree larger than two in $T$, respectively. Moreover, we let $\mathcal{P}$ denote the set of all maximal degree-two paths in $T$.

Lemma 9. Let $R = \mathrm{img}(\mathrm{reduce}(G - S))$. Then, $T = G - (S \cup R)$ is a forest such that for every maximal degree-two path in $\mathcal{P}$ there are at most two vertices on the path having neighbors in $R$ (in the graph $G - S$). Moreover, we can guarantee that $|T_{\le 1}|$, $|T_{\ge 3}|$ and $|\mathcal{P}|$ are bounded by $k^{\mathcal{O}(1)}$.
Proof. To see why $T = G - S - R$ must be a forest it is sufficient to note that for any cycle in $G - S$ at least one vertex from that cycle must be in $R = \mathrm{img}(\mathrm{reduce}(G - S))$ (see Figure 1). Recall that, since $\mathrm{girth}(\mathrm{reduce}(G - S)) > 6$, every vertex in $R$ has degree at least 3 in $G - S$. Now assume there exists some path $P \in \mathcal{P}$ having exactly three (the same argument holds for any number) distinct vertices $u$, $v$ and $w$ (in that order) each having at least one neighbor in $R$ (possibly the same neighbor). We show that the middle vertex $v$ must have been in $R$, contradicting the fact that $T = G - S - R$. Consider the graph $G - S$ and apply Reduction Rules A1, A2 and A3 exhaustively (in $G - S$) on all vertices in the tree containing $P$ except for $u$, $v$ and $w$. Regardless of the order in which we apply the reduction rules, the path $P$ will eventually reduce to a path on three vertices, namely $u$, $v$, and $w$. To see why $v$ must be in $R$ observe that even if the other two vertices have degree two in the resulting graph, after reducing them, $v$ will have degree at least three (into $R$) and is therefore non-reducible.
Next, we bound the size of $T_{\le 1}$, which implies a bound on the sizes of $T_{\ge 3}$ and $\mathcal{P}$. To do so, we simply invoke Lemma 8 by setting $X = S \cup R$. Since $|S| < gk$, $g = \frac{48\log k}{\log\log k} + 6$ and $|R| \le 3ck\log k + 2ck\log^{1.5} k$, we get that $|T_{\le 1}| \le |S \cup R|^2 (2|S \cup R| + 1) = k^{\mathcal{O}(1)}$. Since in a forest, it holds that $|T_{\ge 3}| < |T_{\le 1}|$, the bound on $|T_{\ge 3}|$ follows. Moreover, in a forest, it also holds that $|\mathcal{P}| < |T_{\le 1}| + |T_{\ge 3}|$ (if we arbitrarily root each tree in the forest at a leaf, one end vertex of a path in $\mathcal{P}$ will be a parent of a different vertex from $T_{\le 1} \cup T_{\ge 3}$), so the bound on $|\mathcal{P}|$ follows as well.

Guessing Permutations
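This section is devoted to proving the following lemma.

Lemma 10. There exists a polynomial-space algorithm that, given an instance $(G, k)$ of Cycle Packing, in time $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$ returns at most $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})}$ instances $(G', k)$ of Cycle Packing such that each $G'$ has $\mathcal{O}(k\log^{1.5} k)$ vertices, and $(G, k)$ is a yes-instance if and only if at least one of the returned instances is a yes-instance.

Note that assuming the statement of the lemma, the only remaining task (to prove Theorem 1) is to develop an algorithm running in time $\mathcal{O}(2^{|V|} \cdot \mathrm{poly}(|V|))$ and using polynomial space, which we present in Section 6.
Proof. We fix $g = \frac{48\log k}{\log\log k} + 6$. Using Lemma 5, we first compute a set $S$ in time $k^{\mathcal{O}(1)} \cdot |V|$. Then, we guess which vertices to delete from $S$ (that is, which vertices do not participate in a solution) in time $\mathcal{O}(2^{gk}) = 2^{\mathcal{O}(\frac{k\log k}{\log\log k})}$. Here, guesses refer to different choices which lead to the construction of different instances of Cycle Packing that are returned at the end (recall that we are allowed to return up to $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})}$ different instances). Combining Lemma 5 and Corollary 7, we now have a set $S \subseteq V$ such that $|S| = \mathcal{O}(\frac{k\log k}{\log\log k})$, and $|\mathrm{reduce}(G - S)| = \mathcal{O}(k\log^{1.5} k)$.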
Applying Lemma 9 with $R = \mathrm{img}(\mathrm{reduce}(G - S)) \subseteq (V \setminus S)$, we get a forest $T = G - (S \cup R)$ such that for every maximal degree-two path in $\mathcal{P}$ there are at most two vertices on the path having neighbors in $R$ (in the graph $G - S$). In addition, the size of $R$ is bounded by $\mathcal{O}(k\log^{1.5} k)$, and the sizes $|T_{\le 1}|$, $|\mathcal{P}|$ and $|T_{\ge 3}|$ are bounded by $k^{\mathcal{O}(1)}$ (see Figure 1).
For every vertex in $S$ (which is assumed to participate in a solution), we now guess its two neighbors in a solution (see Figure 2). Note however that we only have a (polynomial in $k$) bound for $|S|$, $|R|$, $|T_{\le 1}|$, $|\mathcal{P}|$ and $|T_{\ge 3}|$, but not for the length of paths in $\mathcal{P}$ and therefore not for the entire graph $G$. We let $Z_{\mathcal{P}}$ denote the set of vertices in $V(\mathcal{P})$ having neighbors in $R$. The size of $Z_{\mathcal{P}}$ is at most $2|\mathcal{P}|$. Moreover, we let $\mathcal{P}'$ denote the set of paths obtained after deleting $Z_{\mathcal{P}}$ from $\mathcal{P}$. Note that the size of $\mathcal{P}'$ is upper bounded by $|\mathcal{P}| + |Z_{\mathcal{P}}| \le 3|\mathcal{P}|$, and that vertices in $V(\mathcal{P}')$ are adjacent only to vertices in $V(\mathcal{P}') \cup Z_{\mathcal{P}} \cup S$. Now, we create a set of "objects", $O = S \cup R \cup T_{\le 1} \cup T_{\ge 3} \cup Z_{\mathcal{P}} \cup \mathcal{P}'$. We also denote $O' = O \setminus \mathcal{P}'$. We then guess, for each vertex $v \in S$, which two objects in $O$ constitute its neighbors, denoted by $\ell(v)$ and $r(v)$, in a solution. It is possible that $\ell(v) = r(v)$. Since $|O| = k^{\mathcal{O}(1)}$, we can perform these guesses in $k^{\mathcal{O}(\frac{k\log k}{\log\log k})}$, or equivalently $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})}$, time. We can assume that if $\ell(v) \in O'$, then $\ell(v)$ is a neighbor of $v$, and otherwise $v$ has a neighbor on the path $\ell(v)$; if this is not the case, the current guess is not correct, and we need not try finding a solution subject to it. The same claim holds for $r(v)$. If $\ell(v) = r(v) \in O'$, then $\{v, \ell(v)\}$ is an edge of multiplicity two, and otherwise if $\ell(v) = r(v)$, then $v$ has (at least) two neighbors on the path $\ell(v)$.
Next, we fix some arbitrary order on $\mathcal{P}'$, and for each path in $\mathcal{P}'$, we fix some arbitrary orientation. We let $S'$ denote the multiset containing two occurrences of every vertex $v \in S$, denoted by $v_\ell$ and $v_r$. We guess an order of the vertices in $S'$. The time spent for guessing such an ordering is bounded by $|S'|!$, which in turn is bounded by $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})}$. The ordering, assuming it is guessed correctly, satisfies the following conditions. For each path $P \in \mathcal{P}'$, we let $\ell(P)$ and $r(P)$ denote the sets of vertices $v \in S$ such that $\ell(v) \in V(P)$ and $r(v) \in V(P)$, respectively. Now, for any two vertices $u, v \in \ell(P)$, if $u_\ell < v_\ell$ according to the order that we guessed, then the neighbor $\ell(u)$ of $u$ appears before the neighbor $\ell(v)$ of $v$ on $P$. Similarly, for any two vertices $u, v \in r(P)$, if $u_r < v_r$, then $r(u)$ appears before $r(v)$ on $P$. Finally, for any two vertices $u \in \ell(P)$ and $v \in r(P)$, if $u_\ell < v_r$, then $\ell(u)$ appears before $r(v)$ on $P$, and otherwise $r(v)$ appears before $\ell(u)$ on $P$.
Given a correct guess of $\ell(v)$ and $r(v)$, for each $v$ in $S$, as well as a correct guess of a permutation of $S'$, for each path in $\mathcal{P}'$, we let $\{x_v, y_v\}$ denote the two guessed neighbors of a [...] $r(v))$. Otherwise, we assign neighbors to a vertex by a greedy procedure which agrees with the guessed permutation on $S'$; that is, for every path $P \in \mathcal{P}'$, we iterate over $\ell(P) \cup r(P)$ according to the guessed order, and for each vertex in it, assign its first neighbor on $P$ that is after the last vertex that has already been assigned (if such a vertex does not exist, we determine that the current guess is incorrect and proceed to the next one). We let [...] We also let $E_S$ be the set of edges incident on a vertex in $S$, and we let $E' = \{\{x_v, y_v\} \mid v \in S\}$ denote the set of all pairs of guesses. Finally, to obtain an instance $(G', k)$, we delete the vertex set $W = S \setminus (X \cup Y)$ from $G$, we delete the edge set $E_S$ from $G$, we add instead the set of edges $E'$, and finally we apply the reduce operator, i.e., $G' = \mathrm{reduce}((G - W - E_S) + E')$.

Claim 1. $|\mathrm{reduce}((G - W - E_S) + E')| = \mathcal{O}(k\log^{1.5} k)$.

Proof. Recall that by Corollary 7, we know that $|\mathrm{reduce}(G - S)| = \mathcal{O}(k\log^{1.5} k)$. [...]
Hence, we conclude that $|\mathrm{reduce}((G - W - E_S) + E')| = \mathcal{O}(k\log^{1.5} k)$, as needed.

Claim 2. (G, k) is a yes-instance if and only if at least one of the generated instances (G , k) is a yes-instance.
Proof. Assume that $(G, k)$ is a yes-instance and let $\mathcal{C} = \{C_1, C_2, \ldots\}$ be an optimal cycle packing, i.e., a set of maximum size of vertex-disjoint cycles, in $G$. Note that if no cycle in $\mathcal{C}$ intersects with $S$ then $\mathcal{C}$ is also an optimal cycle packing in $G - S$. By the safeness of our reduction rules, $\mathcal{C}$ is also an optimal cycle packing in $\mathrm{reduce}(G - S)$. Since we generate one instance for every possible intersection between an optimal solution and $S$, the case where no vertex from $S$ is picked corresponds to the instance $(G', k)$, with $G' = \mathrm{reduce}(G - S)$. Hence, in what follows we assume that some cycles in $\mathcal{C}$ intersect with $S$. Consider any cycle $C$ which intersects with $S$ and let $P_C = \{u_0, u_1, \ldots, u_f\}$ denote any path on this cycle such that $u_0, u_f \notin S$ but $u_i \in S$ for $0 < i < f$. We claim that, for some $G'$, all such paths will be replaced by edges of the form $\{u_0, u_f\}$ in $\mathrm{reduce}((G - W - E_S) + E')$. Again, due to our exhaustive guessing, for some $G'$ we would have guessed, for each $i$, $\ell(u_i) = u_{i-1}$ and $r(u_i) = u_{i+1}$. Consequently, $P_C \setminus \{u_0, u_f\}$ is a degree-two path in $(G - W - E_S) + E'$ and therefore an edge in $\mathrm{reduce}((G - W - E_S) + E')$. Using similar arguments, it is easy to show that if $C$ is completely contained in $S$ then this cycle is contained in $G'$ as a loop on some vertex of the cycle.
For the other direction, let $(G', k)$ be a yes-instance and let $\mathcal{C} = \{C_1, C_2, \ldots\}$ be an optimal cycle packing in $G'$. We assume, without loss of generality, that $\mathcal{C}$ is a cycle packing in $(G - W - E_S) + E'$, as one can trace back all reduction rules to obtain the graph $(G - W - E_S) + E'$. If no cycle in $\mathcal{C}$ uses an edge $\{u_0, u_f\} \in E'$ then we are done, as [...] Otherwise, we claim that all such edges either exist in $G$ or can be replaced by vertex-disjoint paths $P = \{u_0, u_1, \ldots, u_f\}$ (on at least three vertices) in $G$ such that $u_i \in S$ for $0 < i < f$. If either $u_0$ or $u_f$ is in $X \cup Y \subseteq S$ then the former case holds. It remains to prove the latter case. Recall that for every vertex in $S$ we guess its two [...] can easily find a path (or singleton) in $G[S]$ to replace this edge by simply backtracking the neighborhood guesses. Now assume that $\{u_0, u_f\} \subseteq O$ and recall that no vertex in a path in $\mathcal{P}'$ can have neighbors in $R$. Hence, any cycle containing such an edge must intersect with $S$ (in $G$). Assuming we have correctly guessed the neighbors of vertices in $S$ (as well as a permutation for $\mathcal{P}'$), we can again replace this edge with a path in $S$.
Combining Claims 1 and 2 concludes the proof of the lemma.
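The greedy step used when constructing $G'$ (walking along an oriented path and giving each occurrence, in the guessed order, its first neighbor after the last assigned vertex) can be sketched in Python as follows. The function name `greedy_assign`, the token encoding of the occurrences $v_\ell$ and $v_r$, and the input format are hypothetical choices made for this illustration.

```python
def greedy_assign(path, order, neighbors):
    """Assign path vertices to occurrences following a guessed order.

    path:      list of the vertices of an oriented path P, in order.
    order:     occurrences attached to P (e.g. ("u", "l") for u_l),
               listed according to the guessed permutation.
    neighbors: dict occurrence -> set of vertices of P adjacent to the
               corresponding vertex of S.
    Returns a dict occurrence -> assigned vertex, or None if the guess
    cannot be realised (the guess is then discarded).
    """
    assignment = {}
    pos = 0  # index of the first path vertex that is still available
    for token in order:
        # Skip forward to the first admissible neighbour of this token.
        while pos < len(path) and path[pos] not in neighbors[token]:
            pos += 1
        if pos == len(path):
            return None  # no neighbour left after the last assigned one
        assignment[token] = path[pos]
        pos += 1  # later tokens must be assigned strictly further along
    return assignment
```

Choosing the earliest admissible neighbor for each occurrence leaves as much of the path as possible for the occurrences that come later in the guessed order.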

Dynamic Programming and Inclusion-Exclusion
Finally, we give an exact exponential-time algorithm for Cycle Packing. For this purpose, we use DP and the principle of inclusion-exclusion, inspired by the work of Nederlof [33]. Due to space constraints, the details are given in Appendix E.

Lemma 11. There exists a (deterministic) polynomial-space algorithm that in time $\mathcal{O}(2^{|V|} \cdot \mathrm{poly}(|V|))$ solves Cycle Packing. In case a solution exists, it also outputs a solution.
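For experimenting with tiny instances, an exhaustive-search baseline can be handy. The Python sketch below computes the maximum number of vertex-disjoint cycles in a simple graph by branching on the smallest remaining vertex. It is not the inclusion-exclusion algorithm of Appendix E and does not meet the time bound of Lemma 11; the name `max_cycle_packing` and the adjacency-dictionary input (with comparable vertex labels) are assumptions of this illustration.

```python
def max_cycle_packing(adj):
    """Maximum number of vertex-disjoint cycles in a simple graph.

    adj: dict mapping each vertex to the set of its neighbours.
    Exhaustive search; intended for very small graphs only.
    """
    def cycles_through(v, allowed):
        # Vertex sets of all simple cycles that contain v and otherwise
        # use only vertices from `allowed`.
        found = set()

        def extend(path, used):
            last = path[-1]
            for w in adj[last]:
                if w == v and len(path) >= 3:
                    found.add(frozenset(path))
                elif w in allowed and w not in used:
                    extend(path + [w], used | {w})

        extend([v], {v})
        return found

    def solve(allowed):
        if not allowed:
            return 0
        v = min(allowed)
        rest = allowed - {v}
        best = solve(rest)  # v lies on no cycle of the packing
        for cyc in cycles_through(v, rest):  # v lies on the cycle cyc
            best = max(best, 1 + solve(allowed - cyc))
        return best

    return solve(frozenset(adj))
```

An instance $(G, k)$ with a simple graph $G$ is then a yes-instance exactly when the returned value is at least $k$.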

Conclusion
In this paper we have beaten the best known $2^{\mathcal{O}(k\log^2 k)} \cdot |V|$-time algorithm for Cycle Packing that is a consequence of the Erdős-Pósa theorem. For this purpose, we developed a deterministic algorithm that solves Cycle Packing in time $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$. Two additional advantageous properties of our algorithm are that its space complexity is polynomial in the input size and that in case a solution exists, it outputs a solution (in time $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})} \cdot |V|$). Our technique relies on combinatorial arguments that may be of independent interest. These arguments allow us to translate any input instance of Cycle Packing into $2^{\mathcal{O}(\frac{k\log^2 k}{\log\log k})}$ instances of Cycle Packing whose sizes are small and can therefore be solved efficiently.
It remains an intriguing open question to discover the "true" running time, under reasonable complexity-theoretic assumptions, in which one can solve Cycle Packing on general graphs. In particular, we would like to pose the following question: Does there exist a $2^{\mathcal{O}(k\log k)}\cdot |V|^{\mathcal{O}(1)}$-time algorithm for Cycle Packing? This is true for graphs of bounded maximum degree, as one can easily bound the number of vertices by $\mathcal{O}(k\log k)$ and then apply Lemma 11. Moreover, Bodlaender et al. [6] proved that this is also true in case one seeks $k$ edge-disjoint cycles rather than $k$ vertex-disjoint cycles. On the negative side, recall that (for general graphs) the bound $f(k) = \mathcal{O}(k\log k)$ in the Erdős-Pósa theorem is essentially tight, and that it is unlikely that Cycle Packing is solvable in time $2^{o(\mathrm{tw}\log \mathrm{tw})}\cdot |V|^{\mathcal{O}(1)}$ [12].

A Discarding Edges
In this appendix we describe the procedure, mentioned in Section 2 to justify Assumption 1, which, given a graph $G = (V, E)$, discards at least $|E| - (2ck\log k + 1)\cdot |V|$ edges in time $\mathcal{O}(k\log k\cdot |V|)$. We examine the vertices in $V$ in some arbitrary order $v_1, v_2, \ldots, v_{|V|}$, and initialize a counter $x$ to 0. For each vertex $v_i$, if $x < (2ck\log k + 1)\cdot |V|$, then we iterate over the set of edges incident to $v_i$, and for each edge whose other endpoint is $v_j$ for $j \ge i$, we increase $x$ by 1. Let $\ell$ be the largest index for which we iterated over the set of edges incident to $v_\ell$. We copy $V$, and initialize the adjacency lists to be empty. Then, we copy the adjacency lists of the vertices $v_1, v_2, \ldots, v_\ell$, where for each adjacency involving vertices $v_i$ and $v_j$ with $i \le \ell < j$, we update the adjacency list of $v_j$ to include $v_i$.
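The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is assumed to be given as a list of adjacency lists over the vertices 0, ..., |V| - 1, with every edge {u, w} recorded in both lists, and the names `discard_edges` and `threshold` (standing for the bound (2ck log k + 1) · |V|) are hypothetical.

```python
def discard_edges(adj, threshold):
    """Keep only the edges touching the scanned prefix of vertices.

    adj       -- list of adjacency lists (assumed input format)
    threshold -- numeric stand-in for (2ck log k + 1) * |V|
    """
    n = len(adj)
    x = 0          # counter of edges seen so far
    last = -1      # 0-based index of the last scanned vertex
    for i in range(n):
        if x >= threshold:
            break
        for j in adj[i]:
            if j >= i:          # count each edge once, from its smaller endpoint
                x += 1
        last = i

    new_adj = [[] for _ in range(n)]
    for i in range(last + 1):
        new_adj[i] = list(adj[i])       # copy the full list of a scanned vertex
        for j in adj[i]:
            if j > last:                # unscanned endpoint learns about vertex i
                new_adj[j].append(i)
    return new_adj
```

In this sketch every edge with at least one endpoint among the scanned vertices is kept, and every edge between two unscanned vertices is dropped.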

B Proof of Theorem 2
We fix $c$ as the smallest integer such that $c \ge 150\log_2 c$. Let $G = (V, E)$ be a (multi) graph, and let $k$ be a non-negative integer. The objective is to show that in time $k^{\mathcal{O}(1)}\cdot |V|$ we can either output $k$ vertex-disjoint cycles or a feedback vertex set of size at most $ck\log k = r$. We remark that the first part of this proof, which ends at the statement of Lemma 4, follows the proof of the Erdős-Pósa theorem [16] given in the book [13]. We may assume that $G$ contains at least one cycle, since this fact can clearly be checked in time $\mathcal{O}(|V| + |E|)$, and if it is not true, we output an empty set as a feedback vertex set. Now, we construct a maximal subgraph $H$ of $G$ such that each vertex in $H$ is of degree 2 or 3 (in $H$). This construction can be done in time $\mathcal{O}(|V| + |E|)$ (see [2]). Let $V_2$ and $V_3$ be the degree-2 and degree-3 vertices in $H$, respectively. We also compute (in time $\mathcal{O}(|V| + |E|)$) the set $\mathcal{S}$ of connected components of $G - V(H)$. Observe that for each connected component $S \in \mathcal{S}$, there is at most one vertex $v_S \in V_2$ such that there is at least one vertex in $S$ adjacent to $v_S$, else we obtain a contradiction to the maximality of $H$, as it could have been extended by adding a path from $S$. We compute (in time $\mathcal{O}(|V| + |E|)$) the vertices $v_S$, where for each component for which $v_S$ is undefined (since it does not exist), we set $v_S = \mathrm{nil}$. Let $V_2' \subseteq V_2$ be the set of vertices $v_S \neq \mathrm{nil}$ such that $v_S$ has at least two neighbors in $S$, which is easily found in time $\mathcal{O}(|V| + |E|)$. Observe that if $|V_2'| \ge k$, we can output $k$ vertex-disjoint cycles in time $\mathcal{O}(|V| + |E|)$. Thus, we next assume that $|V_2'| < k$. Moreover, observe that $V_2' \cup V_3$ is a feedback vertex set. Thus, if $|V_2' \cup V_3| \le ck\log k$, we are done. We next assume that $|V_2' \cup V_3| > ck\log k$. In particular, it holds that $|V_3| > ck\log k - k \ge (c-1)k\log k$.
Let $H^*$ be the graph obtained from $H$ by contracting, for each vertex in $V_2$, an edge incident to it. We remark that here we permit the multiplicity of edges to be 3. Then, $H^*$ is a cubic graph whose vertex-set is $V_3$. To find $k$ vertex-disjoint cycles in $G$ in time $k^{\mathcal{O}(1)}\cdot |V|$, it is sufficient to find $k$ vertex-disjoint cycles in $H^*$ in time $k^{\mathcal{O}(1)}\cdot |V|$, since the cycles in $H^*$ can be translated into cycles in $G$ in time $\mathcal{O}(|V| + |E|)$. We need to rely on the following claim, whose proof is given in the book [13]. We remark that the original claim refers to graphs, but it also holds for multigraphs.
Thus, we know that $H^*$ contains $k$ vertex-disjoint cycles, and it remains to find them in time $k^{\mathcal{O}(1)}\cdot |V|$. We now modify $H^*$ to obtain a cubic graph $H'$ on at least $q$ vertices but at most $\mathcal{O}(k\log k)$ vertices, such that given $k$ vertex-disjoint cycles in $H'$, we can translate them into $k$ vertex-disjoint cycles in $H^*$ in time $\mathcal{O}(|V|)$, which will complete the proof. To this end, we initially let $H'$ be a copy of $H^*$. Now, as long as $|V(H')| > (c-1)k\log k + 2$, we perform the following procedure:
1. Choose arbitrarily a vertex $v \in V(H')$.
2. If $v$ has exactly one neighbor $u$ (that is, $\{v, u\}$ is an edge of multiplicity 3), remove $v$ and $u$ from the graph.
3. Else if $v$ has a neighbor $u$ such that $u$, in turn, has a neighbor $w$ (which might be $v$) such that the edge $\{u, w\}$ is of multiplicity 2, then remove $u$ and $w$ from $H'$ and connect the remaining neighbor of $u$ to the remaining neighbor of $w$ by a new edge (which might be a self-loop).
4. Else, let $x, y, z$ be the three distinct neighbors of $v$. Then, remove $v$ and add an edge between $x$ and $y$. Now, each vertex is of degree 3, except for $z$, which is of degree 2 and has two distinct neighbors. Remove $z$, and connect its two neighbors by an edge.
Since this procedure runs in time $\mathcal{O}(1)$ and each call decreases the number of vertices in the graph, the entire process runs in time $\mathcal{O}(|V|)$. It is also clear that the procedure outputs a cubic graph, and at its end, $(c-1)k\log k \le |V(H')| \le (c-1)k\log k + 2$. Thus, to prove the correctness of the process, it is now sufficient to consider graphs $H_1$ and $H_2$, where $H_2$ is obtained from $H_1$ by applying the procedure once, and show that given a set $\mathcal{C}_2$ of $k$ vertex-disjoint cycles in $H_2$, we can modify them to obtain a set $\mathcal{C}_1$ of $k$ vertex-disjoint cycles in $H_1$. Let $v$ be the vertex chosen in the first step. If the condition in the second step was true, we simply let $\mathcal{C}_1 = \mathcal{C}_2$. If the condition in the third step was true, we examine whether the newly added edge belongs to a cycle in the solution in time $\mathcal{O}(1)$ (as we assume that each element in the graph, if it belongs to the solution, has a pointer to its location in the solution), and if it does, we replace it by the path between its endpoints whose only internal vertices are $u$ and $w$. Finally, suppose the procedure reached the last case. Then, if the first newly added edge is used, replace it by the path between its endpoints, $x$ and $y$, whose only internal vertex is $v$, and if the second newly added edge is used, replace it by the path between its endpoints whose only internal vertex is $z$.
We are now left with the task of finding $k$ vertex-disjoint cycles in $H'$. We initialize a set $\mathcal{C}$ of vertex-disjoint cycles to be empty. As long as $|\mathcal{C}| < k$, we find a shortest cycle in $H'$ in time $\mathcal{O}(|V(H')|\cdot |E(H')|) = k^{\mathcal{O}(1)}$ (see [22]), insert it into $\mathcal{C}$ and remove all of the edges incident to its vertices from $H'$. Thus, to conclude the proof, it remains to show that for each $i \in \{0, 1, \ldots, k-1\}$, after we remove the edges incident to the $i$th cycle from $H'$, it still contains a cycle.
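The greedy loop just described can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it assumes a simple graph given as a dictionary from vertices to neighbor collections (the graph in the proof is a cubic multigraph), it finds a shortest cycle by a plain BFS from every vertex rather than by the method of [22], and the function names are hypothetical.

```python
from collections import deque


def _close_cycle(parent, u, w):
    """Join the BFS-tree paths of u and w at their lowest common ancestor."""
    path_u, path_w = [], []
    x = u
    while x is not None:
        path_u.append(x)
        x = parent[x]
    x = w
    while x is not None:
        path_w.append(x)
        x = parent[x]
    # Drop shared ancestors strictly above the lowest common ancestor.
    while len(path_u) >= 2 and len(path_w) >= 2 and path_u[-2] == path_w[-2]:
        path_u.pop()
        path_w.pop()
    return path_u + path_w[-2::-1]


def shortest_cycle(adj):
    """Return a shortest cycle of a simple graph as a vertex list, or None."""
    best = None
    for root in adj:
        parent, dist = {root: None}, {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w and parent[w] != u:
                    # Non-tree edge {u, w}: it closes a cycle with the tree paths.
                    cycle = _close_cycle(parent, u, w)
                    if best is None or len(cycle) < len(best):
                        best = cycle
    return best


def greedy_cycle_packing(adj, k):
    """Repeatedly extract a shortest cycle and delete all edges at its vertices."""
    work = {v: set(nb) for v, nb in adj.items()}   # private copy of the graph
    packing = []
    while len(packing) < k:
        cycle = shortest_cycle(work)
        if cycle is None:
            break
        packing.append(cycle)
        for v in cycle:
            for w in work[v]:
                work[w].discard(v)
            work[v] = set()
    return packing
```

The sketch stops early when the remaining graph is acyclic; the edge-count argument of the proof is exactly what rules this out for the graph considered there.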
By using induction on $i$, we show that after removing the edges incident to the $i$th cycle from $H'$, the number of edges in $H'$ is at least $p(i) = \frac{3}{2}(c-1)k\log_2 k - 12\cdot i\cdot \log_2(ck\log_2 k)$. This would imply that the average degree of a vertex of $H'$ is at least 2 (we later also explicitly show that $2p(i) \ge (\sqrt{2}+1)ck\log_2 k$), and therefore it contains a cycle (since the average degree of a forest is smaller than 2). Initially, $H'$ is a cubic graph, and therefore $|E(H')| = \frac{3}{2}|V(H')| \ge \frac{3}{2}(c-1)k\log_2 k$, and the claim is true. Now, suppose that it is true for some $i \in \{0, 1, \ldots, k-2\}$, and let us prove that it is true for $i+1$. Proposition 2 bounds the length of a shortest cycle in $H'$, so that such a cycle is incident to at most $6\log_{d-1}(ck\log_2 k)$ edges. Therefore, after removing from $H'$ the edges incident to a shortest cycle in it, it contains at least
\[ p(i) - 6\log_{d-1}(ck\log_2 k) \;\ge\; p(i) - 6\,\frac{\log_2(ck\log_2 k)}{\log_2(d-1)} \;=\; p(i) - 6\,\frac{\log_2(ck\log_2 k)}{\log_2\!\left(\frac{2p(i)}{ck\log_2 k} - 1\right)} \]
edges. Thus, by the induction hypothesis, it remains to prove that $\log_2\!\left(\frac{2p(i)}{ck\log_2 k} - 1\right) \ge 1/2$ (so that the last expression is at least $p(i) - 12\log_2(ck\log_2 k) = p(i+1)$), to which end we need to show that $\frac{2p(i)}{ck\log_2 k} - 1 \ge \sqrt{2}$, that is, $2p(i) \ge (\sqrt{2}+1)ck\log_2 k$. For this purpose, it is sufficient to show that $4p(i) \ge 5ck\log_2 k$. By the induction hypothesis and since $i \le k-1$,
\[ 4p(i) \;\ge\; 6(c-1)k\log_2 k - 48k\log_2(ck\log_2 k) \;=\; 5ck\log_2 k + \left(ck\log_2 k - 6k\log_2 k - 48k\log_2 k - 48k\log_2 c - 48k\log_2\log_2 k\right) \;\ge\; 5ck\log_2 k + \left(ck\log_2 k - 150(\log_2 c)\,k\log_2 k\right). \]
Thus, we need to show that $c \ge 150\log_2 c$, which holds by our choice of $c$. This concludes the proof.
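The arithmetic at the end of this proof can be sanity-checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes $p(i) = \frac{3}{2}(c-1)k\log_2 k - 12\, i \log_2(ck\log_2 k)$ as read from the proof, requires $k \ge 2$ so that all logarithms are defined, and starts the search for $c$ at 2, because $c = 1$ satisfies $c \ge 150\log_2 c$ trivially ($\log_2 1 = 0$) and is presumably not the intended value. The function names are hypothetical.

```python
import math


def smallest_c(start=2):
    """Smallest integer c >= start with c >= 150 * log2(c) (start=2 is an assumption)."""
    c = start
    while c < 150 * math.log2(c):
        c += 1
    return c


def p(i, k, c):
    """Lower bound on the number of edges after removing i cycles (k >= 2)."""
    log_k = math.log2(k)
    return 1.5 * (c - 1) * k * log_k - 12 * i * math.log2(c * k * log_k)


def log_term(i, k, c):
    """The quantity log2(2p(i) / (c k log2 k) - 1) that the proof bounds by 1/2."""
    return math.log2(2 * p(i, k, c) / (c * k * math.log2(k)) - 1)


def check(k_max=60):
    """Check 4p(i) >= 5ck log2 k and the 1/2 bound for 2 <= k < k_max, 0 <= i < k."""
    c = smallest_c()
    for k in range(2, k_max):
        for i in range(k):
            if 4 * p(i, k, c) < 5 * c * k * math.log2(k):
                return False
            if log_term(i, k, c) < 0.5:
                return False
    return True
```

Calling `check()` only tests a finite range of $k$; it complements, and does not replace, the inequality chain in the proof.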

C Proof of Lemma 3
We can clearly detect self-loops and edges of multiplicity 2 in time O(|V | + |E|), and return a cycle of length 1 or 2 accordingly, and therefore we next assume that G is a simple graph.
Since F is a feedback vertex set, to prove the lemma it is sufficient to present a procedure that given a vertex v ∈ F , finds in time O(|V | + |E|) a cycle that is at least as short as the shortest cycle in G that contains v. Indeed, then we can iterate over F and invoke this procedure, returning the shortest cycle among those returned by the procedure. Thus, we next fix some vertex v ∈ F . Let H be the connected component of G containing v.
From the vertex $v$, we run a breadth-first search (BFS). Thus, we obtain a BFS tree $T$ rooted at $v$, and each vertex in $V$ gets a level $i$, indicating the distance between this vertex and $v$ (the level of $v$ is 0). By iterating over the neighborhood of each vertex, we identify the smallest index $i_1$ such that there exists an edge with both endpoints, $u_1$ and $v_1$, at level $i_1$ (if such an index exists), and the smallest index $i_2$ such that there exists a vertex $w_2$ at level $i_2$ adjacent to two vertices, $u_2$ and $v_2$, at level $i_2 - 1$ (if such an index exists). For $i_1$, the edge $\{u_1, v_1\}$ and the paths between $v_1$ and $u_1$ and their lowest common ancestor result in a cycle of length at most $2i_1 + 1$. For $i_2$, the edges $\{w_2, u_2\}$ and $\{w_2, v_2\}$ and the paths between $u_2$ and $v_2$ and their lowest common ancestor result in a cycle of length at most $2i_2$. We return the shorter cycle among the two (if such a cycle exists).
Suppose that there exists a cycle containing $v$, and let $C$ be a shortest such cycle. We need to show that the above procedure returns a cycle at least as short as $C$. Every edge of $H$ either connects two vertices of the same level, or a vertex of level $i-1$ with a vertex of level $i$. Thus, if there does not exist an index $i_1$ such that there exists an edge in $E(C)$ with both endpoints, $u_1$ and $v_1$, at level $i_1$, there must exist an index $i_2$ such that there exists a vertex $w_2$ at level $i_2$ adjacent to two vertices, $u_2$ and $v_2$, at level $i_2 - 1$, and the edges $\{w_2, u_2\}$ and $\{w_2, v_2\}$ belong to $E(C)$. First, suppose that the first case is true. Then, the procedure returns a cycle of length at most $2i_1 + 1$. The length of $C$ cannot be shorter than $2i_1 + 1$, since it consists of a path from $v$ to $u_1$ (whose length is at least $i_1$ since $u_1$ belongs to level $i_1$), a path from $v$ to $v_1$ whose only common vertex with the previous path is $v$ (whose length is at least $i_1$ since $v_1$ belongs to level $i_1$), and the edge $\{u_1, v_1\}$. Now, suppose that the second case is true. Then, the procedure returns a cycle of length at most $2i_2$. The length of $C$ cannot be shorter than $2i_2$, since it consists of two internally vertex-disjoint paths from $v$ to $w_2$ (each of length at least $i_2$ since $w_2$ belongs to level $i_2$).
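The BFS procedure of this appendix can be sketched in Python as follows. It is a minimal illustration, not the authors' implementation: the graph is assumed to be simple (self-loops and parallel edges are handled separately, as in the first step of the proof) and given as a dictionary from vertices to lists of distinct neighbors, and the names `short_cycle_from` and `shortest_cycle_via_fvs` are hypothetical.

```python
from collections import deque


def short_cycle_from(adj, v):
    """Return a cycle at least as short as a shortest cycle through v, or None."""
    level, parent, order = {v: 0}, {v: None}, [v]
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in level:
                level[w] = level[u] + 1
                parent[w] = u
                order.append(w)
                queue.append(w)

    def join(a, b):
        """Tree path from a up to the lowest common ancestor and down to b.

        Assumes a and b are distinct vertices on the same level.
        """
        left, right = [a], [b]
        while a != b:
            a, b = parent[a], parent[b]
            left.append(a)
            right.append(b)
        return left + right[-2::-1]

    odd = even = None
    for x in order:  # BFS order, so levels are non-decreasing
        if odd is None:
            same = [w for w in adj[x] if level[w] == level[x]]
            if same:  # edge inside one level: cycle of length at most 2*level + 1
                odd = join(x, same[0])
        if even is None:
            below = [w for w in adj[x] if level[w] == level[x] - 1]
            if len(below) >= 2:  # two neighbors one level up: length at most 2*level
                even = join(below[0], below[1]) + [x]
    candidates = [c for c in (odd, even) if c is not None]
    return min(candidates, key=len) if candidates else None


def shortest_cycle_via_fvs(adj, fvs):
    """Run the procedure from every vertex of a feedback vertex set, keep the best."""
    cycles = [c for c in (short_cycle_from(adj, v) for v in fvs) if c is not None]
    return min(cycles, key=len) if cycles else None
```

Since every cycle meets the feedback vertex set, taking the minimum over its vertices yields a shortest cycle of the whole graph, which is the reduction used at the start of the proof.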

D Proof of Lemma 8
For (u, v) ∈ X × X, let L(u, v) be the set of vertices of degree at most one in G − X such that each x ∈ L(u, v) is adjacent to both u and v (if u = v, then L(u, u) is the set of vertices which have degree at most one in G − X and an edge of multiplicity two to u). For each pair (u, v) ∈ X × X, we arbitrarily mark 2|X| + 1 vertices from L(u, v) if |L(u, v)| > 2|X| + 1,

E Proof of Lemma 11
First, we recall the principle of inclusion-exclusion.
Proposition 5 (Folklore, [33]). Let $U$ and $R$ be sets, and for every $v \in R$ let $P_v$ be a subset of $U$. Use $\overline{P}_v$ to denote $U \setminus P_v$. With the convention $\bigcap_{v\in\emptyset}\overline{P}_v = U$, the following holds:
\[ \Bigl|\bigcap_{v\in R} P_v\Bigr| \;=\; \sum_{F\subseteq R} (-1)^{|F|}\,\Bigl|\bigcap_{v\in F}\overline{P}_v\Bigr|. \]
It is straightforward to verify that the calculations are correct. The order of the computation is an ascending order with respect to $j$, which ensures that when an entry is calculated, the entries on which it relies have already been calculated. To output a solution, we apply a simple self-reduction from the decision to the search variant of the problem. In particular, we repeatedly remove edges until no more edges can be removed from the graph while preserving a yes-instance.
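The inclusion-exclusion principle recalled in this appendix can be checked by brute force on small sets. The Python sketch below assumes the principle in its standard form, namely that $|\bigcap_{v\in R} P_v|$ equals the sum over all $F \subseteq R$ of $(-1)^{|F|}\,|\bigcap_{v\in F}\overline{P}_v|$, with the empty intersection equal to $U$. It is only an illustration of the identity, not the algorithm of Lemma 11, and the function names are hypothetical.

```python
from itertools import combinations


def intersection_size_direct(universe, family):
    """Count the elements of the universe that lie in every set P_v."""
    return sum(1 for x in universe if all(x in P for P in family.values()))


def intersection_size_ie(universe, family):
    """Compute the same count via the inclusion-exclusion sum over subsets F."""
    keys = list(family)
    total = 0
    for size in range(len(keys) + 1):
        for F in combinations(keys, size):
            # Elements avoiding every P_v with v in F (all of U when F is empty).
            avoiding = sum(
                1 for x in universe if all(x not in family[v] for v in F)
            )
            total += (-1) ** size * avoiding
    return total
```

The right-hand side only ever counts elements that avoid a chosen collection of sets, which is what makes the principle convenient for polynomial-space counting in the style of [33].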