Improved FPT Algorithms for Deletion to Forest-like Structures

The Feedback Vertex Set problem is undoubtedly one of the most well-studied problems in Parameterized Complexity. In this problem, given an undirected graph $G$ and a non-negative integer $k$, the objective is to test whether there exists a subset $S\subseteq V(G)$ of size at most $k$ such that $G-S$ is a forest. After a long line of improvements, Li and Nederlof [SODA, 2020] recently designed a randomized algorithm for the problem running in time $\mathcal{O}^{\star}(2.7^k)$. In the Parameterized Complexity literature, several problems around Feedback Vertex Set have been studied. Some of these include Independent Feedback Vertex Set (where the set $S$ should be an independent set in $G$), Almost Forest Deletion, and Pseudoforest Deletion. In Pseudoforest Deletion, the objective is to find a set $S$ of size at most $k$ such that each connected component of $G-S$ has at most one cycle. In Almost Forest Deletion, the input is a graph $G$ and non-negative integers $k,\ell \in \mathbb{N}$, and the objective is to test whether there exists a vertex subset $S$ of size at most $k$ such that $G-S$ is $\ell$ edges away from a forest. In this paper, using the methodology of Li and Nederlof [SODA, 2020], we obtain the current fastest algorithms for all these problems. In particular, we obtain the following randomized algorithms. 1) Independent Feedback Vertex Set can be solved in time $\mathcal{O}^{\star}(2.7^k)$. 2) Pseudoforest Deletion can be solved in time $\mathcal{O}^{\star}(2.85^k)$. 3) Almost Forest Deletion can be solved in time $\mathcal{O}^{\star}(\min\{2.85^k \cdot 8.54^\ell,\ 2.7^k \cdot 36.61^\ell,\ 3^k \cdot 1.78^\ell\})$.


Introduction
Feedback Vertex Set (FVS) is a classical NP-complete problem and has been extensively studied in all subfields of algorithms and complexity. In this problem, we are given an undirected graph G and a non-negative integer k as input, and the goal is to check whether there exists a subset S ⊆ V(G) (called a feedback vertex set, or fvs for short) of size at most k such that G − S is a forest. The problem originated in combinatorial circuit design and found its way into diverse applications, such as deadlock prevention in operating systems, constraint satisfaction, and Bayesian inference in artificial intelligence. We refer to the survey by Festa et al. [14] for further details on the algorithmic study of feedback set problems in a variety of areas, such as approximation algorithms, linear programming, and polyhedral combinatorics.
FVS has been extensively studied in Parameterized Algorithms and has played a pivotal role in the development of the field of Parameterized Complexity. The earliest known FPT algorithms for FVS go back to the late 80s and the early 90s [4,13] and used the seminal Graph Minor Theory of Robertson and Seymour. These algorithms are quite impractical because of large hidden constants in the running times. Raman et al. [31] designed an algorithm with running time $\mathcal{O}^{\star}(2^{\mathcal{O}(k \log \log k)})$, which essentially branched on short cycles in a bounded search tree approach. The first deterministic $\mathcal{O}^{\star}(c^k)$ algorithm for FVS was designed only in 2005, independently by Dehne et al. [12] and Guo et al. [16]. It is worth noting that a randomized algorithm for FVS with running time $\mathcal{O}^{\star}(4^k)$ was known as early as 1999 [3]. The deterministic algorithms led to a race to improve the base of the exponent, and several algorithms [6,7,8,9,17,22,23], both deterministic and randomized, have been designed. Until a few months ago, the best known deterministic algorithm for FVS ran in time $\mathcal{O}^{\star}(3.619^k)$ [22], while the Cut & Count technique of Cygan et al. [9] gave the best known randomized algorithm, running in time $\mathcal{O}^{\star}(3^k)$. However, both of these algorithms have been improved in the last few months: Iwata and Kobayashi [17, IPEC 2019] designed the fastest known deterministic algorithm, with running time $\mathcal{O}^{\star}(3.460^k)$, and Li and Nederlof [23, SODA 2020] designed the fastest known randomized algorithm, with running time $\mathcal{O}^{\star}(2.7^k)$. The success on FVS has led to the study of many variants of FVS in the literature, such as Connected FVS [9,28], Independent FVS [1,24,27], Simultaneous FVS [2,32], Subset FVS [11,18,19,20,26], Pseudoforest Deletion [5,29], Generalized Pseudoforest Deletion [29], and Almost Forest Deletion [30,25].
Given an instance of FVS, subdividing every edge yields an instance of Independent FVS, which shows that Independent FVS generalizes FVS. On the other hand, setting $\ell = 0$ in Almost Forest Deletion results in FVS. The best known algorithms for Independent FVS, Almost Forest Deletion, and Pseudoforest Deletion run in time $\mathcal{O}^{\star}(3.619^k)$ [24], $\mathcal{O}^{\star}(5^k \cdot 4^\ell)$ [25], and $\mathcal{O}^{\star}(3^k)$ [5], respectively. Our main objective is to improve these running times. In a nutshell, our contribution is as follows.
Motivated by the methodology developed by Li and Nederlof [23] for FVS, we relook at several problems around FVS, such as Independent FVS, Almost Forest Deletion, and Pseudoforest Deletion, and design the current fastest randomized algorithm for these problems. Our results show that the method of Li and Nederlof [23] is extremely broad and should be applicable to more problems.
To achieve these improvements and tackle Independent FVS and Almost Forest Deletion at once, we propose the following generalization of the Almost Forest Deletion problem.
Restricted Independent Almost Forest Deletion (RIAFD)
Input: A graph $G$, a vertex set $R \subseteq V(G)$, and non-negative integers $k$ and $\ell$.
Parameter: $k$ and $\ell$.
Question: Does there exist a vertex set $S \subseteq V(G)$ of size at most $k$ that contains no element of $R$, is an independent set in $G$, and such that $G - S$ is an $\ell$-forest?

Setting $\ell = 0$ and $R = \emptyset$, we get the Independent FVS problem. A simple polynomial-time reduction, where we subdivide every edge and add all the subdivision vertices to $R$, turns an instance of Almost Forest Deletion into an instance of RIAFD; the reduction leaves $k$ and $\ell$ unchanged. It is worth mentioning lower bounds for the above problems: assuming the Exponential Time Hypothesis, no algorithm running in time $2^{o(k)}$ exists for either Pseudoforest Deletion (PDS) or RIAFD (for any fixed $\ell$). To describe our results, we first summarize the method of Li and Nederlof [23] (for FVS), which we adapt accordingly. The main observation guiding the method is that after some simple preprocessing of the graph, we can ensure that a large fraction of the edges is incident to every solution of the problem. This leads to two-step algorithms, one for the dense case and one for the sparse case. In particular, if we are aiming for an algorithm with running time $\mathcal{O}^{\star}(\alpha^k)$, we proceed as follows. Dense case: the number of edges incident to any fvs is superlinear (in $k$), and we can select a vertex into our solution with probability at least $\frac{1}{\alpha}$. Sparse case: once the dense case is done, we have selected some number of vertices, say $k_1$, with probability at least $(\frac{1}{\alpha})^{k_1}$. Now the number of edges incident to any fvs of the graph is $\mathcal{O}(k)$, and the existence of a solution $S$ of size at most $k$ implies that the input graph has treewidth at most $k + 1$. Using this fact, together with the fact that deleting the solution leaves a graph of constant treewidth, one can actually show that the graph has treewidth at most $(1 - \Omega(1))k = \gamma k$. This implies that if we have an algorithm on graphs of treewidth $\mathsf{tw}$ with running time $\beta^{\mathsf{tw}}$ such that $\beta^{\gamma} \le \alpha$, then we get the desired algorithm with running time $\mathcal{O}^{\star}(\alpha^k)$.
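To make this two-step structure concrete, the following self-contained Python sketch instantiates it for plain FVS on a toy input. The dense threshold, the brute-force stand-in for the sparse-case treewidth routine, and all function names are our own illustrative choices, not the actual algorithm of [23].

```python
import itertools
import random

def is_forest(n, edges, removed):
    """Check that the graph minus `removed` is acyclic, via union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        if u in removed or v in removed:
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

def solve_sparse(n, edges, k):
    """Stand-in for the sparse case: in the real method this is a beta^tw
    treewidth DP; here we simply try all subsets of size at most k."""
    for size in range(k + 1):
        for S in itertools.combinations(range(n), size):
            if is_forest(n, edges, set(S)):
                return set(S)
    return None

def one_trial(n, edges, k):
    """One random trial of the dense/sparse scheme for FVS."""
    deleted = set()
    while k > 0:
        deg = [0] * n
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        weights = [max(deg[v] - 2, 0) for v in range(n)]
        if sum(weights) <= 4 * k:        # illustrative "sparse" threshold
            break
        # Dense phase: sample v with probability proportional to deg(v) - 2;
        # if every solution has large total degree, v hits one often enough.
        v = random.choices(range(n), weights=weights)[0]
        deleted.add(v)
        edges = [e for e in edges if v not in e]
        k -= 1
    rest = solve_sparse(n, edges, k)
    return None if rest is None else deleted | rest

# Two triangles sharing vertex 2; repeating trials boosts success probability.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
for _ in range(100):
    solution = one_trial(5, list(edges), 1)
    if solution is not None:
        print("fvs found:", solution)
        break
```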
A natural approach for our problems, which are parameterized by solution size, is therefore to devise an algorithm that uses another algorithm parameterized by treewidth with an appropriate base in the exponent, together with probabilistic reductions that have good success probability. However, to get the best out of the method of Li and Nederlof [23], it is important that the treewidth-based algorithm is based on the Cut & Count method [10], whereas for all the problems we consider, only non-Cut & Count algorithms were known. Thus, our first result is a Cut & Count algorithm for RIAFD parameterized by treewidth (Theorem 1.1). Note that a yes-instance of RIAFD has treewidth at most $k + \ell + 1$. Based on Theorem 1.1 and iterative compression, we design a randomized algorithm for RIAFD with running time $\mathcal{O}^{\star}(3^k \cdot 3^\ell)$. This yields $\mathcal{O}^{\star}(3^k)$ and $\mathcal{O}^{\star}(3^k \cdot 3^\ell)$ time polynomial-space algorithms for Independent FVS and Almost Forest Deletion, respectively (such algorithms do not appear in the literature). Next, we devise probabilistic reduction rules to implement the first step of the method of Li and Nederlof [23]. We analyze these rules by adapting the analysis of their lemmas, and obtain an $\mathcal{O}^{\star}(2.85^k \cdot 8.54^\ell)$ time algorithm that takes polynomial space and an $\mathcal{O}^{\star}(2.7^k \cdot 36.61^\ell)$ time algorithm that takes exponential space for solving RIAFD. While these algorithms progressively improve the dependence on $k$ slightly, they significantly worsen the dependence on $\ell$. Therefore, to obtain an algorithm with an improved dependence on $\ell$, we describe a procedure that, given a riafd-set of size $k$, constructs a tree decomposition of width $k + \frac{3\ell}{5.769} + \mathcal{O}(\log \ell)$. Combined with an iterative compression routine, this yields an $\mathcal{O}^{\star}(3^k \cdot 1.78^\ell)$ algorithm for RIAFD. For Pseudoforest Deletion, although a deterministic $\mathcal{O}^{\star}(3^k)$ exponential-space algorithm is known due to Bodlaender et al. [5], to make use of the techniques from [23] we develop our own Cut & Count algorithm with the same asymptotic running time; applying Cut & Count here requires some work. However, even with our Cut & Count algorithm, we cannot make full use of the method of Li and Nederlof [23] and obtain only the $\mathcal{O}^{\star}(2.85^k)$ improvement stated in Theorem 1.2. Due to paucity of space, we prove only Theorem 1.2 (2) and (4) in this paper; the reader is referred to the full version of the paper [15] for the details that have been skipped.

Preliminaries
For a set $A$, $A^{\cdot,\cdot,\cdot}$ denotes the set of all partitions of $A$ into three subsets. Let $G(V, E)$ or $G = (V, E)$ be an undirected graph, where $V$ is the set of vertices and $E$ is the set of edges. We also write $V(G)$ for the vertex set and $E(G)$ for the edge set of the graph $G$, and set $|V| = n$ and $|E| = m$. For a vertex subset $S \subseteq V(G)$, $G - S$ denotes the graph obtained from $G$ by deleting the vertices of $S$ together with all edges incident to them. Given an edge $e = (u, v)$, the subdivision of $e$ is the operation that replaces $e$ by two edges $(u, w)$ and $(w, v)$, where $w$ is a newly added vertex, called a "subdivision vertex".
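As an illustration of subdivision, the following short Python sketch (with our own naming conventions) performs the reduction from Almost Forest Deletion to RIAFD mentioned in the introduction: every edge is subdivided and the new vertices are placed in R.

```python
def afd_to_riafd(n, edges):
    """Reduce Almost Forest Deletion to RIAFD by subdividing every edge.

    Subdivision vertices (all placed in R) may never be deleted, and since
    original vertices are pairwise non-adjacent in the subdivided graph,
    any set of original vertices is automatically independent.
    k and ell stay unchanged.
    """
    new_edges, R = [], set()
    next_id = n                        # fresh ids for subdivision vertices
    for u, v in edges:
        w = next_id
        next_id += 1
        R.add(w)                       # forbid deleting the subdivision vertex
        new_edges.append((u, w))
        new_edges.append((w, v))
    return next_id, new_edges, R       # new vertex count, edges, restricted set

# Example: a triangle plus a pendant edge.
n2, e2, R = afd_to_riafd(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
print(n2, e2, R)                       # 8 vertices, 8 edges, R = {4, 5, 6, 7}
```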
A tree decomposition of a graph $G$ is a pair $(\{B_x \mid x \in I\}, T = (I, F))$, where $T$ is a tree and every node $x \in I$ is assigned a bag $B_x \subseteq V(G)$, such that (i) every vertex of $G$ appears in some bag, (ii) for every edge of $G$ some bag contains both endpoints, and (iii) for every vertex $v$ of $G$, the nodes whose bags contain $v$ induce a connected subtree of $T$. The width of a tree decomposition is the maximum bag size minus one, and the treewidth of a graph $G$, denoted by tw(G), is the minimum width over all tree decompositions of $G$.
We sometimes abuse notation and use tw(T) to denote the width of the tree decomposition T. For the definition above, if there are parallel edges or self loops we can just ignore them, i.e., a tree decomposition of a graph with parallel edges and self loops is just the tree decomposition of the underlying simple graph (obtained by keeping only one set of parallel edges and removing all self loops).
There is also the notion of a nice tree decomposition, which is used in this paper. In the literature, there are a few variants of this notion that differ in details; we use the one with introduce edge nodes and with root and leaf bags of size zero. A nice tree decomposition is a tree decomposition $(\{B_x \mid x \in I\}, T = (I, F))$ where $T$ is a rooted tree and every node is of one of the following five types. With each bag in the tree decomposition, we also associate a subgraph of $G$; the subgraph associated with bag $x$ is denoted $G_x = (V_x, E_x)$. We give each type together with how the corresponding subgraph is formed.
Leaf nodes $x$. $x$ is a leaf of $T$ with $B_x = \emptyset$, and $G_x$ is the empty graph.
Introduce vertex nodes $x$. $x$ has one child, say $y$. There is a vertex $v$ such that $B_x = B_y \cup \{v\}$, and $G_x$ is obtained from $G_y$ by adding the isolated vertex $v$.
Introduce edge nodes $x$. $x$ has one child, say $y$. $B_x = B_y$ contains two vertices $u$ and $v$, and $G_x$ is obtained from $G_y$ by adding an edge between these two vertices in $B_x$. If we have parallel edges, we have one introduce edge node for each parallel edge. A self loop with endpoint $v$ is handled in the same way, i.e., there is an introduce edge node with $v \in B_x$ and $G_x$ is obtained from $G_y$ by adding the self loop on $v$.
Forget vertex nodes $x$. $x$ has one child, say $y$. There is a vertex $v$ such that $B_x = B_y \setminus \{v\}$, and $G_x$ and $G_y$ are the same graph.
Join nodes $x$. $x$ has two children, say $y$ and $z$. $B_x = B_y = B_z$, and $G_x$ is the union of $G_y$ and $G_z$, where the vertex set $B_x$ is the intersection of the vertex sets of these two graphs.
In this paper, we deal with randomized algorithms with one-sided error probability, i.e., only false negatives are possible. The success probability of an algorithm is the probability that the algorithm finds a solution, given that at least one solution exists. We define high probability to be probability at least $1 - \frac{1}{2^{c|x|}}$, or sometimes $1 - \frac{1}{|x|^c}$, where $|x|$ is the input size and $c$ is a constant. Given an algorithm with constant success probability, we can boost it to high probability by performing $\mathcal{O}^{\star}(1)$ independent trials. We cite the following folklore observation (see [23]): if a problem can be solved with success probability $\frac{1}{S}$ and in expected time $T$, and its solutions can be verified for correctness in polynomial time, then it can also be solved in $\mathcal{O}^{\star}(S \cdot T)$ time with high probability.
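A minimal sketch of this boosting step in Python (the trial function and the repetition factor are illustrative; for high probability one repeats $\mathcal{O}^{\star}(S)$ times):

```python
import random

def boost(trial, S, repetition_factor=50):
    """Repeat a one-sided-error `trial` (returns a solution or None) about
    S * repetition_factor times; the failure probability drops to roughly
    (1 - 1/S)^(S * factor) <= e^(-factor)."""
    for _ in range(S * repetition_factor):
        solution = trial()
        if solution is not None:
            return solution
    return None

# Toy trial that succeeds with probability 1/8.
print(boost(lambda: "hit" if random.random() < 1 / 8 else None, S=8))
```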
We will use the following notion of separation in a graph from [23]: a partition $(A, B, S)$ of $V$ is a separation if there are no edges between $A$ and $B$.
A β-separator for a graph $G(V, E)$ is a set of vertices whose removal from $G$ leaves no connected component with more than $\frac{|V|}{\beta}$ vertices, where $\beta > 0$ is some constant; thus, a β-separator is a balanced separator of the graph. More generally, one can define a β-separator with respect to a weight function on the vertices. We give a method to construct a β-separator of a graph $G$ from a given tree decomposition (Lemma 2.4); its proof can be found in the full version of the paper [15]. In [23], the authors presented a method involving randomized reductions and small separators to obtain faster randomized algorithms for FVS. It turns out that this method can be generalized to a certain class of "vertex-deletion problems". We now describe the basic structure of this method and will follow this outline wherever the method is used in the rest of the paper.
Throughout this outline, assume that we are working on some vertex-deletion problem P. Let G(V, E) be the graph involved in a given instance of P. A valid solution S ⊆ V is a set of vertices of G which solves the given problem instance of P.
The method is divided into two cases: A dense case and a sparse case.
Dense Case. The algorithm enters this case when every solution set of the given instance has high average degree. In formal terms, every set $S \subseteq V$ of size $k$ that is a valid solution satisfies $\deg(S) > c \cdot k$, where $c = \Theta(1)$. To handle this case, the algorithm samples a vertex $v \in V$ at random with probability proportional to a weight function $\omega(v)$ depending on $\deg(v)$, deletes $v$, and updates the parameters appropriately. In this paper, we use $\omega(v) = \deg(v) - 2$ for all the problems discussed. This process acts as a probabilistic reduction rule for the problem, as it may fail with a certain probability.
Sparse Case. The algorithm enters this case when there exists a solution set with low average degree. In formal terms, there exists a vertex subset $S \subseteq V$ of size $k$ that is a valid solution of the given instance and satisfies $\deg(S) \le c \cdot k$, where $c = \mathcal{O}(1)$. For this reason, the number of edges in the given graph can be bounded, so the input graph $G$ is sparse.
The proof of the small separator lemma in [23] does not require the graph obtained by deleting the solution set to be a forest: as long as $G - S$ has a good β-separator, the proof goes through. Lemma 2.4 helps to construct such a β-separator of size $\beta(\mathsf{tw} + 1)$ for a graph given together with a tree decomposition of width $\mathsf{tw}$.
The small separator helps to construct a tree decomposition of small width, given a solution set of bounded degree. The idea suggested in [23] is to use iterative compression to construct a solution utilizing the small separator. This also requires solving a bounded-degree version of the problem, which can be done using Cut & Count based algorithms. Specific details for each problem are explained in the corresponding sections in due course.

Proof of Theorem 1.2 (2) and (4)
We use the method of [23], following the outline described in Section 2. We use the term riafd-set for a solution to the given instance of RIAFD, and the term afd-set for a solution to the given instance of Almost Forest Deletion (AFD). The following is a simple reduction rule for RIAFD.

Definition 3.1 (Reduction 1).
Apply the following rules exhaustively, until the remaining graph has minimum vertex degree at least 2:
1. Delete all vertices of degree at most one in the input graph.
2. If $k = 0$ and $G$ is not an $\ell$-forest, then we have a no-instance. If $k \ge 0$ and $G$ is an $\ell$-forest, we have a yes-instance.
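A sketch of Reduction 1 on a simple-graph adjacency representation (multiedges and the bookkeeping of R are omitted); rule 2 uses the fact that a graph is an $\ell$-forest if and only if $m - n + c \le \ell$, where $c$ is the number of connected components.

```python
def excess_edges(adj):
    """Cycle rank m - n + c: the graph is an ell-forest iff this is <= ell."""
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    seen, components = set(), 0
    for v in adj:
        if v in seen:
            continue
        components += 1
        stack = [v]
        while stack:
            x = stack.pop()
            if x in seen:
                continue
            seen.add(x)
            stack.extend(adj[x])
    return m - len(adj) + components

def reduction_one(adj, k, ell):
    """Rule 1: repeatedly delete vertices of degree at most one.
    Rule 2: decide the instance outright when possible."""
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) <= 1:
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
    if excess_edges(adj) <= ell:
        return "yes"                    # G is already an ell-forest
    if k == 0:
        return "no"                     # k = 0 but G is not an ell-forest
    return adj                          # reduced instance, min degree >= 2

adj = {0: {1}, 1: {0, 2}, 2: {1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
print(reduction_one(adj, k=0, ell=0))   # triangle 2-3-4 remains -> "no"
```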

Dense Case
Now we give a probabilistic reduction for RIAFD that capitalizes on the fact that a large number of edges is incident to the riafd-set. In particular, for a yes-instance we aim for a probabilistic reduction that succeeds with probability strictly greater than 1/3, so as to achieve a randomized algorithm running in time $\mathcal{O}^{\star}((3-\epsilon)^k)$ with high probability.
Definition 3.2 (Reduction 2 (P)). Assume that Reduction 1 does not apply and $G$ has a vertex of degree at least 3. Sample a vertex $v$ with probability proportional to $\omega(v) = \deg(v) - 2$. Delete $v$ and add its neighbours to $R$. Decrease $k$ by 1.
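A sketch of one application of Reduction 2 in Python, under our reading that the vertex is sampled with probability proportional to $\omega(v) = \deg(v) - 2$ (as stated in the outline of Section 2); restricting the sampling to vertices outside R is our own simplification.

```python
import random

def reduction_two(adj, R, k):
    """Sample v (not in R) with probability proportional to deg(v) - 2,
    delete it into the solution, and forbid its neighbours: the solution
    must be independent, so neighbours of a chosen vertex join R.
    Assumes Reduction 1 was applied (all degrees >= 2, so weights are
    non-negative) and that some candidate has degree >= 3."""
    candidates = [v for v in adj if v not in R]
    weights = [len(adj[v]) - 2 for v in candidates]
    v = random.choices(candidates, weights=weights)[0]
    R = R | adj[v]                      # neighbours become forbidden
    for u in adj[v]:
        adj[u].discard(v)
    del adj[v]
    return adj, R, k - 1, v

# Two triangles sharing vertex 0: the only positive-weight vertex is 0.
adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
print(reduction_two(adj, R=set(), k=2))
```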

Proof. Let $F \subseteq V(G)$ be a riafd-set of $G$ of size exactly $k$. For Reduction 2 to succeed with probability at least $\frac{1}{3-\epsilon}$, we need $\frac{\omega(F)}{\omega(V)} \ge \frac{1}{3-\epsilon}$. The value of $\omega(F)$ can be bounded via Claim 3.3 (as a riafd-set is also an afd-set).

Sparse Case
For the sparse case, we first construct a small separator. Due to the presence of two parameters ($k$ and $\ell$), we have to modify the small separator lemma of [23] with a bivariate analysis. Also, although we are discussing RIAFD, we will show how to construct a small separator assuming that we are given an afd-set, as a riafd-set is also an afd-set.
Small Separator. The main idea, as presented in [23], is to convert an afd-set of small average degree into a good tree decomposition. In particular, suppose a graph $G$ has an afd-set $F$ of size $k$ with $\deg(F) \le d(k + \ell)$, where $d = \mathcal{O}(1)$. We show how to construct a tree decomposition of width $(1 - \Omega(1))k + (2 - \Omega(1))\ell$. Note that $d$ is not exactly the average degree of $F$; this definition helps us bound the width of the tree decomposition well. Before constructing this separator, we first see a construction of a β-separator of an $\ell$-forest. We could use Lemma 2.4, but the size of the separator obtained would be of order $\ell \cdot o(k)$, which is too large (an $\ell$-forest can have treewidth up to $\ell + 1$). We now give a method to construct a β-separator of size $\ell + o(k)$.

Lemma 3.5. Given an $\ell$-forest $T(V, E)$ on $n$ vertices with vertex weights $\omega(v)$, for any $\beta > 0$, we can delete a set $S$ of at most $\beta + \ell$ vertices in polynomial time so that every connected component of $T - S$ has total weight at most $\frac{\omega(V)}{\beta}$.
Proof. Construct a spanning tree for each connected component of $T$; call the resulting forest $T'$. Let $X$ be the set of remaining edges, i.e., those not in $T'$. For each edge in $X$, delete one of its endpoints from $T'$. As $|X| \le \ell$, we delete at most $\ell$ vertices, and the resulting graph is still a forest; call it $T''$. Now, root every component of the forest $T''$ at an arbitrary vertex. Iteratively select a vertex $v$ of maximal depth whose subtree has total weight more than $\frac{\omega(V)}{\beta}$, add $v$ to $S$, and then remove $v$ together with its subtree. The subtrees rooted at the children of $v$ have total weight at most $\frac{\omega(V)}{\beta}$, since otherwise $v$ would not satisfy the maximal-depth condition. Moreover, by removing the subtree rooted at $v$, we remove more than $\frac{\omega(V)}{\beta}$ total weight, and this can happen at most $\beta$ times. Thus, we delete at most $\beta + \ell$ vertices overall.
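The following Python sketch implements this proof directly (vertex and edge lists as input; all names are ours): a union-find pass peels off one endpoint of each non-tree edge, and a bottom-up sweep cuts the deepest vertices whose remaining subtree weight exceeds $\omega(V)/\beta$.

```python
from collections import deque

def ell_forest_separator(vertices, edges, weight, beta):
    """Delete at most beta + ell vertices from an ell-forest so that every
    remaining component has total weight at most sum(weight)/beta."""
    # Step 1: union-find spanning forest; each extra edge loses one endpoint.
    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree_edges, S = [], set()
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            S.add(u)                    # at most ell such non-tree edges
        else:
            parent[ru] = rv
            tree_edges.append((u, v))
    # Step 2: root the remaining forest and record a BFS order.
    adj = {v: [] for v in vertices if v not in S}
    for u, v in tree_edges:
        if u not in S and v not in S:
            adj[u].append(v)
            adj[v].append(u)
    limit = sum(weight.values()) / beta
    seen, order, par = set(), [], {}
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        par[root] = None
        queue = deque([root])
        while queue:
            x = queue.popleft()
            order.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    par[y] = x
                    queue.append(y)
    # Step 3: bottom-up, cut any vertex whose surviving subtree is too heavy.
    acc = {v: weight[v] for v in order}
    for v in reversed(order):           # children are processed before parents
        if acc[v] > limit:
            S.add(v)                    # happens at most beta times
            acc[v] = 0
        if par[v] is not None:
            acc[par[v]] += acc[v]
    return S

vertices = list(range(7))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 0)]  # 7-cycle
print(ell_forest_separator(vertices, edges, {v: 1 for v in vertices}, 2.0))
```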
With the help of Lemma 3.5, we will now proceed to the small separator lemma. (G, k, ) and an afd-set F of G of size k, define d := deg(F ) k+ , and suppose that d = O(1). There is a randomized algorithm running in expected polynomial time that computes a separation (A, B, S) of G such that:

Lemma 3.6 (Small Separator). Given an instance
The proof is similar to [23, Lemma 4]. Throughout the proof, we fix a parameter $\epsilon := (k+\ell)^{-0.01}$. Apply Lemma 3.5 to the $\ell$-forest $G - F$ with $\beta = \epsilon(k+\ell)$ and each vertex $v$ weighted by $|E[v, F]|$, and let $S'$ be the output. By Lemma 3.5, $|S'| \le \epsilon(k+\ell) + \ell$, and every connected component $C$ of $G - F - S'$ satisfies $|E[C, F]| \le \frac{\deg(F)}{\epsilon(k+\ell)} = \frac{d}{\epsilon}$. Now form a bipartite graph $H$, as in [23], on the vertex bipartition $F \uplus R$, where $F$ is the afd-set and $R$ contains two types of vertices: the component vertices and the subdivision vertices. For every connected component $C$ of $G - F - S'$, there is a component vertex $v_C$ in $R$ that represents that component, and it is connected to all vertices of $F$ adjacent to at least one vertex of $C$. For every edge $e = (u, v)$ in $E[F, F]$, there is a vertex $v_e$ in $R$ with $u$ and $v$ as its neighbours. The algorithm that finds a separation $(A, B, S)$ works as follows. Color each vertex of $R$ red or blue uniformly and independently at random. Every component $C$ of $G - F - S'$ whose vertex $v_C$ is colored red is added to $A$, and every component whose vertex $v_C$ is colored blue is added to $B$. Every vertex of $F$ whose neighbors in $H$ are all colored red joins $A$, and every vertex of $F$ whose neighbors are all colored blue joins $B$. The remaining vertices of $F$, together with the vertices of $S'$, comprise $S$. It is easy to see that $(A, B, S)$ is a separation.
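A sketch of one coloring round of this procedure in Python (names are ours; the real algorithm repeats rounds until conditions (1) and (2) hold, and we conservatively place vertices of F with no colored neighbour into S):

```python
import random
from collections import deque

def one_colouring_round(adj, F, S_prime):
    """One random red/blue colouring round from the proof of Lemma 3.6."""
    F, S_prime = set(F), set(S_prime)
    A, B, S = set(), set(), set(S_prime)
    touches = {v: set() for v in F}       # colours adjacent to each f in F
    # Colour every component of G - F - S' and spread its colour to F.
    seen = F | S_prime
    for v in adj:
        if v in seen:
            continue
        comp, queue = [], deque([v])
        seen.add(v)
        while queue:
            x = queue.popleft()
            comp.append(x)
            for u in adj[x]:
                if u not in seen:
                    seen.add(u)
                    queue.append(u)
        colour = random.choice("rb")
        (A if colour == "r" else B).update(comp)
        for x in comp:
            for u in adj[x]:
                if u in F:
                    touches[u].add(colour)
    # Each edge inside F gets its own colour (the vertex v_e of H).
    for u in F:
        for w in adj[u]:
            if w in F and u < w:
                c = random.choice("rb")
                touches[u].add(c)
                touches[w].add(c)
    for v in F:
        if touches[v] == {"r"}:
            A.add(v)
        elif touches[v] == {"b"}:
            B.add(v)
        else:
            S.add(v)                      # mixed colours, or no neighbour at all
    return A, B, S

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
print(one_colouring_round(adj, F={2}, S_prime=set()))
```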
We now show that with good probability both conditions (1) and (2) hold; the algorithm can then repeat the process until both conditions hold. First, notice that $F$ has at most $\epsilon(k+\ell)$ vertices of degree at least $\frac{d}{\epsilon}$. These can be ignored, as they affect condition (1) only negligibly for large enough $k$.
By a union bound over all the at most $k^{0.1}$ color classes $F_i$ with $|F_i| \ge k^{0.9}$, the probability that condition (1) fails is small; the key estimate follows from the convexity of the function $2^{-x}$. Recall that $|F'| \ge k - o(k+\ell)$, and observe that $\frac{\deg(F')}{|F'|+\ell} \le \frac{\deg(F)}{k+\ell} = d$, since the vertices in $F \setminus F'$ are exactly the vertices with degree above the threshold. This proves condition (1) for $A$; the argument for $|B \cap F|$ is symmetric, which completes the proof. Since there is no edge connecting $A$ to $B$, we can construct a tree decomposition $T$ of $G$ by simply adding an edge between an arbitrary node of $T_1$ and an arbitrary node of $T_2$. It is evident from the construction procedure that $T$ is a valid tree decomposition of $G$, and it takes polynomial time to compute it.
As we are in the sparse case, there exists a riafd-set $F$ of size $k$ with bounded degree, i.e., $\deg(F) \le dk$; we call this bounded version of the problem BRIAFD. As we saw, the small separator helps in constructing a tree decomposition of small width, but it requires an afd-set of size $k$ with bounded degree. To obtain one, we use an iterative compression based procedure which, at every iteration, maintains a riafd-set of size at most $k$ with bounded degree and uses it to construct the small separator. Using this small separator, we construct a tree decomposition of small width and run a Cut & Count based procedure to solve the bounded RIAFD problem on the current induced subgraph, i.e., to get a riafd-set of size at most $k$ with bounded degree. Further, we use the observation that each bag of the tree decomposition consists primarily of vertices from the given afd-set, as seen in the proof of Lemma 3.9. If we fix the states of these vertices in the Cut & Count algorithm, then only a constant number of vertices per bag remain for which we need to compute all states. This reduces the space requirement considerably, from exponential (all states of $\mathsf{tw}$ vertices per bag) to polynomial (all states of a constant number of vertices per bag). Due to paucity of space, the details of the algorithm are deferred to the full version of the paper [15]. We state the result here.

Combining dense and sparse cases: Algorithm for RIAFD
Having described the Dense and the Sparse Cases, we now combine them to give the final randomized algorithm RIAFD1$(G, k, \ell)$: while the dense case applies, the algorithm applies Reduction 2 to the current instance $(G', R', k', \ell)$ to get a vertex $v \in V'$ and the instance $(G', R', k'-1, \ell)$; once the instance becomes sparse, it runs the small-separator based routine, returning Infeasible if no riafd-set is found. Now, we prove that RIAFD1$(G, k, \ell)$ succeeds with the claimed probability. For simplicity of calculations, we replace $k'$ with $k$; moreover, as each iteration is an independent trial, $k$ is an upper bound for any $k'$ that succeeds. We use induction on $k$. The statement is trivial when $k = 0$, since no probabilistic reduction is used, and hence the algorithm succeeds with probability 1. For the inductive step, consider an instance RIAFD1$(G, k+1, \ell)$, and let $(G', k', \ell)$ be the reduced instance after Line 3. Suppose that every riafd-set $F'$ of $G'$ of size $k'$ satisfies the condition $\deg(F') \le dk'$; here, we only need the existence of one such $F'$. In this case, if Line 7 is executed, then it will correctly output a riafd-set $F'$ of size at most $k'$, with high probability, by Lemma 3.10.

Improving the Dependence on $\ell$
In this subsection, we reduce the dependence on $\ell$ in the Cut & Count algorithm. To achieve this, we construct a tree decomposition with reduced dependence on $\ell$ (Lemma 3.13). We use the following result to prove Lemma 3.13: a graph with $m$ edges has treewidth at most $\frac{m}{5.769} + \mathcal{O}(\log n)$, and a corresponding tree decomposition can be computed in polynomial time (Proposition 3.12). Towards Lemma 3.13, let $F$ be the given riafd-set of size $k$ and let $G' := G - F$; then $G'$ is an $\ell$-forest by the definition of a riafd-set. We apply the following reduction rules exhaustively on $G'$: ($R_0$) delete a vertex of degree zero; ($R_1$) delete a vertex of degree one; ($R_2$) for a vertex $v$ of degree two, remove $v$ and insert a new edge between its two neighbors, if no such edge exists.
For the safeness of these reduction rules, refer to [21]. Let the reduced graph be called $G''(V'', E'')$. It is easy to see that after applying these rules, $G''$ is still an $\ell$-forest. Therefore, after removing at most $\ell$ edges from $G''$, we are left with at most $|V''| - 1$ edges (since the remaining graph is a forest), so $|E''| \le |V''| + \ell - 1$. Since the degree of each vertex in $G''$ is at least 3, we also have $|E''| \ge \frac{3|V''|}{2}$. Therefore, $1.5|V''| \le |V''| + \ell - 1$, from which we obtain the bounds $|V''| \le 2\ell$ and $|E''| \le 3\ell$. Proposition 3.12 then implies that $G''$ has a tree decomposition of width at most $\frac{3\ell}{5.769} + \mathcal{O}(\log \ell)$, which can be computed in polynomial time.
Claim 3.14 ([21, Lemma 4.2]). Given a connected graph $G$ with $\mathrm{tw}(G) > 2$, let $G''$ be a graph obtained from $G$ by applying $R_0$, $R_1$, and $R_2$; then $\mathrm{tw}(G) = \mathrm{tw}(G'')$. Moreover, from the proof of Lemma 4.2 of [21], it is easy to see that this also works on graphs that are not necessarily connected. Given these facts, we can obtain a tree decomposition of $G'$ of width at most $\frac{3\ell}{5.769} + \mathcal{O}(\log \ell)$ in polynomial time from the tree decomposition of $G''$. Now, to get a tree decomposition of the given graph $G$, add $F$ (the riafd-set of size $k$ that we removed) to all the bags of the tree decomposition of $G'$. This finally gives the required tree decomposition of $G$ of width at most $k + \frac{3\ell}{5.769} + \mathcal{O}(\log \ell)$. We combine the treewidth bound obtained from Lemma 3.13 with iterative compression, together with the $\mathcal{O}^{\star}(3^{\mathsf{tw}})$ algorithm, to obtain an $\mathcal{O}^{\star}(3^k \cdot 1.78^\ell)$ algorithm for RIAFD.
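A sketch of the exhaustive application of $R_0$, $R_1$, $R_2$ on a simple-graph adjacency map (our own representation; parallel-edge subtleties are ignored):

```python
def reduce_low_degree(adj):
    """Exhaustively apply R0 (degree 0), R1 (degree 1) and R2 (degree 2:
    remove the vertex and connect its neighbours unless already adjacent)."""
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            degree = len(adj[v])
            if degree == 0:                       # R0
                del adj[v]
                changed = True
            elif degree == 1:                     # R1
                (u,) = adj[v]
                adj[u].discard(v)
                del adj[v]
                changed = True
            elif degree == 2:                     # R2
                u, w = adj[v]
                adj[u].discard(v)
                adj[w].discard(v)
                adj[u].add(w)                     # no-op if already adjacent
                adj[w].add(u)
                del adj[v]
                changed = True
    return adj

# A 5-cycle with a pendant vertex has treewidth <= 2, so it reduces to
# nothing; graphs of treewidth > 2 are left with a min-degree-3 core.
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0, 5}, 5: {4}}
print(reduce_low_degree(adj))                     # {}
```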
We now describe the working of the routine RIAFD_IC3. The iterative compression routine proceeds as follows. We start with an empty graph and add the vertices of $G$ one by one, always maintaining a riafd-set of size at most $k$ of the current graph. Maintaining a riafd-set of the current graph lets us use Lemma 3.13 to obtain a small tree decomposition (of width $k + \frac{3\ell}{5.769} + \mathcal{O}(\log \ell)$). We then add the next vertex in the ordering to all the bags of the tree decomposition and compute a new riafd-set of size at most $k$ in time $\mathcal{O}^{\star}(3^{\mathsf{tw}})$. If we are unable to find such a riafd-set in a particular iteration, we can terminate the algorithm early. Now we restate Theorem 1.2 (4) and prove it. Proof. Suppose that there exists a riafd-set $F$ of size at most $k$. Let $(v_1, v_2, \ldots, v_n)$ be the ordering from Line 2, and define $V_i := \{v_1, \ldots, v_i\}$. We note that $F \cap V_i$ is a riafd-set of $G[V_i]$, so the RIAFD problem on Line 7 is feasible in each iteration (and is solved correctly with high probability in every iteration). Therefore, with high probability, a riafd-set is returned successfully (by a union bound).
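The following Python sketch mirrors the structure of RIAFD_IC3 (all names are ours): vertices are inserted one at a time, and a placeholder brute force stands in for the tree-decomposition construction of Lemma 3.13 plus the $\mathcal{O}^{\star}(3^{\mathsf{tw}})$ Cut & Count step.

```python
import itertools

def cycle_rank_ok(vertices, edges, ell):
    """Union-find check that the graph is an ell-forest (m - n + c <= ell)."""
    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    extra = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            extra += 1
        else:
            parent[ru] = rv
    return extra <= ell

def compress(vertices, edges, R, k, ell):
    """Placeholder for the Cut & Count step on a small tree decomposition:
    brute force over candidate independent sets avoiding R."""
    candidates = [v for v in vertices if v not in R]
    for size in range(k + 1):
        for S in itertools.combinations(candidates, size):
            Sset = set(S)
            if any(u in Sset and v in Sset for u, v in edges):
                continue                 # S must be independent in G
            remaining = [e for e in edges if Sset.isdisjoint(e)]
            if cycle_rank_ok([v for v in vertices if v not in Sset],
                             remaining, ell):
                return Sset
    return None

def riafd_ic3(order, edges, R, k, ell):
    """Iterative compression: maintain a riafd-set of G[V_i] as i grows."""
    current, F = [], set()
    for v in order:
        current.append(v)
        sub_edges = [e for e in edges if e[0] in current and e[1] in current]
        F = compress(current, sub_edges, R, k, ell)
        if F is None:
            return None                  # early termination: infeasible
    return F

edges = [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4), (4, 1)]
print(riafd_ic3([0, 1, 2, 3, 4], edges, R=set(), k=1, ell=0))   # {1}
```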
We now bound the running time. On Line 5, the current set $F$ is a riafd-set of $G[V_i]$, so Lemma 3.13 guarantees a tree decomposition of width at most $k + \frac{3\ell}{5.769} + \mathcal{O}(\log \ell)$, and adding $v_i$ to each bag on Line 6 increases the width by at most one. By the Cut & Count algorithm for RIAFD from Theorem 1.1, each iteration then takes time $\mathcal{O}^{\star}(3^{k + \frac{3\ell}{5.769} + \mathcal{O}(\log \ell)}) = \mathcal{O}^{\star}(3^k \cdot 1.78^\ell)$.

Conclusion
In this paper, we applied the technique of Li and Nederlof [23] to several problems around Feedback Vertex Set, namely Independent FVS, Pseudoforest Deletion, and Almost Forest Deletion, and obtained the current fastest randomized algorithms for these problems.