Parameterized (Approximate) Defective Coloring

In Defective Coloring we are given a graph $G = (V, E)$ and two integers $\chi_d, \Delta^*$ and are asked if we can partition $V$ into $\chi_d$ color classes, so that each class induces a graph of maximum degree $\Delta^*$. We investigate the complexity of this generalization of Coloring with respect to several well-studied graph parameters, and show that the problem is W-hard parameterized by treewidth, pathwidth, tree-depth, or feedback vertex set, if $\chi_d = 2$. As expected, this hardness can be extended to larger values of $\chi_d$ for most of these parameters, with one surprising exception: we show that the problem is FPT parameterized by feedback vertex set for any $\chi_d \ge 3$, and hence 2-coloring is the only hard case for this parameter. In addition to the above, we give an ETH-based lower bound for treewidth and pathwidth, showing that no algorithm can solve the problem in time $n^{o(pw)}$, essentially matching the complexity of an algorithm obtained with standard techniques. We complement these results by considering the problem's approximability and show that, with respect to $\Delta^*$, the problem admits an algorithm which for any $\epsilon>0$ runs in time $(tw/\epsilon)^{O(tw)}$ and returns a solution with exactly the desired number of colors that approximates the optimal $\Delta^*$ within $(1 + \epsilon)$. We also give a $(tw)^{O(tw)}$ algorithm which achieves the desired $\Delta^*$ exactly while 2-approximating the minimum value of $\chi_d$. We show that this is close to optimal, by establishing that no FPT algorithm can (under standard assumptions) achieve a better than $3/2$-approximation to $\chi_d$, even when an extra constant additive error is also allowed.


Introduction
Defective Coloring is the following problem: we are given a graph $G = (V, E)$ and two integer parameters $\chi_d, \Delta^*$, and are asked whether there exists a partition of $V$ into at most $\chi_d$ sets (color classes), such that each set induces a graph with maximum degree at most $\Delta^*$. Defective Coloring, which is also sometimes referred to in the literature as Improper Coloring, is a natural generalization of the classical Coloring problem, which corresponds to the case $\Delta^* = 0$. The problem was introduced more than thirty years ago [2,16], and since then has attracted a great deal of attention [1, 4, 6, 13, 14,
Table 1 Summary of results. Hardness results for tree-depth imply the same bounds for treewidth and pathwidth. Conversely, algorithms which apply to treewidth apply also to all other parameters.
matches asymptotically the exponent given in the algorithm of [9].
To complement the above results, we also consider the problem from the point of view of (parameterized) approximation. Here things become significantly better: we give an algorithm using a technique of [36] which for any $\chi_d$ and error $\epsilon > 0$ runs in time $(tw/\epsilon)^{O(tw)} n^{O(1)}$ and approximates the optimal value of $\Delta^*$ within a factor of $(1 + \epsilon)$. Hence, despite the problem's W-hardness, we produce a solution arbitrarily close to optimal in FPT time.
Motivated by this algorithm we also consider the complementary approximation problem: given $\Delta^*$, find a solution that comes as close to the minimum number of colors needed as possible. By building on the approximation algorithm for $\Delta^*$, we are able to present a $(tw)^{O(tw)} n^{O(1)}$ algorithm that achieves a 2-approximation for this problem. One can observe that this is not far from optimal, since an FPT algorithm with approximation ratio better than 3/2 would contradict the problem's W-hardness for $\chi_d = 2$. However, this simple argument is unsatisfying, because it does not rule out algorithms with a ratio significantly better than 3/2, if one also allows a small additive error; indeed, we observe that when parameterized by feedback vertex set the problem admits an FPT algorithm that approximates the optimal $\chi_d$ within an additive error of just 1. To resolve this problem we present a gap-introducing version of our reduction which, for any $i$, produces an instance for which the optimal value of $\chi_d$ is either $2i$ or at least $3i$. In this way we show that, when parameterized by tree-depth, pathwidth, or treewidth, approximating the optimal value of $\chi_d$ better than 3/2 is "truly" hard, and this is not an artifact of the problem's hardness for 2-coloring.
The Exponential Time Hypothesis (ETH) states that 3-SAT on instances with $n$ variables and $m$ clauses cannot be solved in time $2^{o(n+m)}$ [29]. We define the k-Multi-Colored Clique problem as follows: we are given a graph $G = (V, E)$ and a partition of $V$ into $k$ independent sets $V_1, \ldots, V_k$, such that for all $i \in \{1, \ldots, k\}$ we have $|V_i| = n$, and we are asked if $G$ contains a $k$-clique. It is well-known that this problem is W[1]-hard parameterized by $k$, and that it does not admit any $n^{o(k)}$ algorithm, unless the ETH is false [18].
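To make the problem definition concrete, the following minimal Python sketch decides k-Multi-Colored Clique by trying all $n^k$ selections of one vertex per part. The function name and the input encoding (a list of parts and a list of edges) are illustrative assumptions and do not come from the paper; the ETH-based bound quoted above says that the exponent $k$ of such a search cannot be improved to $o(k)$.

```python
from itertools import product


def has_multicolored_clique(parts, edges):
    """Return True if one vertex can be picked from each part so that
    all picked vertices are pairwise adjacent.

    parts: list of k disjoint lists of vertices (hypothetical encoding).
    edges: iterable of vertex pairs.
    """
    adjacent = {frozenset(e) for e in edges}
    for choice in product(*parts):  # n^k candidate selections
        if all(frozenset((u, v)) in adjacent
               for a, u in enumerate(choice)
               for v in choice[a + 1:]):
            return True
    return False
```

For example, with parts `[[0, 1], [2, 3], [4, 5]]` the edge set `{02, 24, 04}` contains a multi-colored triangle, while removing the edge `04` destroys it.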

W-hardness for Feedback Vertex Set and Tree-depth
The main result of this section states that deciding if a graph admits a $(2, \Delta^*)$-coloring, where $\Delta^*$ is part of the input, is W[1]-hard parameterized by either fvs or td. Because of standard relations between graph parameters (Lemma 1), this also implies the problem's W-hardness for parameters pw and tw. As might be expected, it is not hard to extend our proof to give hardness for deciding if a $(\chi_d, \Delta^*)$-coloring exists, for any constant $\chi_d$, parameterized by tree-depth (and hence, also treewidth and pathwidth). What is perhaps more surprising is that this cannot be done in the case of feedback vertex set. Superficially, the reason we cannot extend the reduction in this case is that one of the gadgets we use in many copies in our construction has large fvs if $\chi_d > 2$. However, we give a much more convincing reason in Theorem 20 of Section 5 where we show that Defective Coloring is FPT parameterized by fvs for $\chi_d \ge 3$, and therefore, if we could extend our reduction in this case it would prove that FPT=W[1].
The main theorem of this section is stated below. We then present the reduction in Sections 3.1, 3.2, and give the Lemmata that imply Theorem 2 in Section 3.3.

Theorem 2.
Deciding if a graph $G$ admits a $(2, \Delta^*)$-coloring, where $\Delta^*$ is part of the input, is W[1]-hard parameterized by $fvs(G)$. Deciding if a graph $G$ admits a $(\chi_d, \Delta^*)$-coloring, where $\chi_d \ge 2$ is any fixed constant and $\Delta^*$ is part of the input, is W[1]-hard parameterized by $td(G)$.

Basic Gadgets
Before we proceed, we present some basic gadgets that will be useful in all the reductions of this paper (Theorems 2, 14, 26). We first define a building block $T(i, j)$ which is a graph that can be properly colored with $i$ colors, but admits no $(i - 1, j)$-coloring (similar constructions appear in [28]). We then use this graph to build two gadgets: the Equality Gadget and the Palette Gadget (Definitions 5 and 8). Informally, for given $\chi_d, \Delta^*$, the equality gadget allows us to express the constraint that two vertices $v_1, v_2$ of a graph must receive the same color in any valid $(\chi_d, \Delta^*)$-coloring. The palette gadget will be used to express the constraint that, among three vertices $v_1, v_2, v_3$, there must exist two with the same color. For both gadgets we first prove formally that they express these constraints (Lemmata 6 and 9). We then show that, under certain conditions, these gadgets can be added to any graph without significantly increasing its tree-depth or feedback vertex set (Lemmata 7 and 10).

Definition 3.
Given two integers $i > 0$, $j \ge 0$, we define the graph $T(i, j)$ recursively as follows: $T(1, j) = K_1$ for all $j$; for $i > 1$, $T(i, j)$ is the graph obtained by taking $(j + 1)$ disjoint copies of $T(i - 1, j)$ and adding to the graph a new universal vertex.
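The recursive definition can be checked on small cases. The Python sketch below builds $T(i, j)$ as an edge list and tests by exhaustive search whether a $(\chi_d, \Delta^*)$-coloring exists. All function names and the vertex numbering are illustrative assumptions, and the brute-force search is only practical for very small $i$ and $j$.

```python
from itertools import product


def build_T(i, j):
    """Return (n, edges) for T(i, j) on vertices 0..n-1.
    The last vertex is the universal vertex added at the top level."""
    if i == 1:
        return 1, []                      # T(1, j) = K_1
    sub_n, sub_edges = build_T(i - 1, j)
    edges = []
    for copy in range(j + 1):             # (j + 1) disjoint copies of T(i - 1, j)
        shift = copy * sub_n
        edges.extend((u + shift, v + shift) for u, v in sub_edges)
    universal = (j + 1) * sub_n
    edges.extend((universal, v) for v in range(universal))
    return universal + 1, edges


def is_defective_coloring(n, edges, coloring, max_same):
    """Check that every vertex has at most max_same neighbors of its own color."""
    same = [0] * n
    for u, v in edges:
        if coloring[u] == coloring[v]:
            same[u] += 1
            same[v] += 1
    return max(same) <= max_same


def admits_coloring(n, edges, colors, max_same):
    """Exhaustively search for a (colors, max_same)-coloring."""
    return any(is_defective_coloring(n, edges, c, max_same)
               for c in product(range(colors), repeat=n))
```

With these helpers, `build_T(3, 1)` yields a graph on 7 vertices for which `admits_coloring(n, edges, 3, 0)` holds while `admits_coloring(n, edges, 2, 1)` does not, in line with the property claimed for $T(i, j)$.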

Construction
We are now ready to present a reduction from k-Multi-Colored Clique. In this section we describe a construction which, given an instance $(G, k)$ of this problem as well as an integer $\chi_d \ge 2$, produces an instance of Defective Coloring. Recall that we assume that in the initial instance $G = (V, E)$ is given to us partitioned into $k$ independent sets $V_1, \ldots, V_k$, all of which have size $n$. We will produce a graph $H(G, k, \chi_d)$ and an integer $\Delta^*$ with the property that $H$ admits a $(\chi_d, \Delta^*)$-coloring if and only if $G$ has a $k$-clique. In the next section we prove the correctness of the construction and give bounds on the values of $td(H)$ and $fvs(H)$ to establish Theorem 2.
In our new instance we set $\Delta^* = |E| - \binom{k}{2}$. Let us now describe the graph $H$. Since we will repeatedly use the gadgets from Definitions 5 and 8, we will use the following convention: whenever $v_1, v_2$ are two vertices we have already introduced to $H$, when we say that we add an equality gadget $Q(v_1, v_2)$, this means that we add to $H$ a copy of $Q(u_1, u_2, \chi_d, \Delta^*)$ and then identify $u_1, u_2$ with $v_1, v_2$ respectively (similarly for palette gadgets). To ease presentation we will gradually build the graph by describing its different conceptual parts.
Palette Part: Informally, the goal of this part is to obtain two vertices $(p_A, p_B)$ which are guaranteed to have different colors. This part contains the following:
Transfer Part: Informally, the goal of this part is to transfer the choices of the previous part to the rest of the graph. For each color class of the original instance we make $(k - 1)$ "low" transfer vertices, whose deficiency will equal the choice made in the previous part, and $(k - 1)$ "high" transfer vertices, whose deficiency will equal the complement of the same value. Formally, this part of $H$ contains the following:

11. For $i, j \in \{1, \ldots, k\}$, $i \ne j$, the vertex $h_{i,j}$ and the vertex $l_{i,j}$. We call these the high and low transfer vertices.
Finally, once we have added a gadget (as described above) for each $e \in E$, we add the following structure to $H$ in order to ensure that we have a sufficient number of edges included in our clique:

19. A vertex $c_U$ (universal checker) connected to all $c_e$ for $e \in E$.
20. An equality gadget $Q(p_A, c_U)$.

Budget-Setting:
Our construction is now almost done, except for the fact that some crucial vertices have degree significantly lower than $\Delta^*$ (and hence are always trivially colorable). To fix this, we will effectively lower their deficiency budget by giving them some extra neighbors. Formally, we add the following:

21. For each guard vertex $g^i_j$, with $j \in \{A, B\}$, we construct an independent set $G^i_j$ of size $\Delta^* - n$ and connect it to $g^i_j$. For each $v \in G^i_j$ we add an equality gadget $Q(p_j, v)$.
22. For each transfer vertex $l_{i,j}$ (respectively $h_{i,j}$), we construct an independent set of size $\Delta^* - n$ and connect all its vertices to $l_{i,j}$ (or respectively to $h_{i,j}$). For each vertex $v$ of this independent set we add an equality gadget $Q(p_A, v)$.
23. For each vertex $c_e$ we add an independent set of size $\Delta^*$ and connect all its vertices to $c_e$. For each vertex $v$ of this independent set we add an equality gadget $Q(p_B, v)$.
This completes the construction of the graph H.

Correctness
To establish Theorem 2 we need to establish three properties of the graph $H(G, k, \chi_d)$ described in the preceding section: that the existence of a $k$-clique in $G$ implies that $H$ admits a $(\chi_d, \Delta^*)$-coloring; that a $(\chi_d, \Delta^*)$-coloring of $H$ implies the existence of a $k$-clique in $G$; and that the tree-depth and feedback vertex set of $H$ are bounded by some function of $k$. These are established in the Lemmata below.
Lemma 11. For any $\chi_d \ge 2$, if $G$ contains a $k$-clique, then the graph $H(G, k, \chi_d)$ described in the previous section admits a $(\chi_d, \Delta^*)$-coloring.
Proof. Consider a clique of size $k$ in $G$ that includes exactly one vertex from each $V_i$. We will denote this clique by a function $f : \{1, \ldots, k\} \to \{1, \ldots, n\}$, that is, we assume that the clique contains the vertex with index $f(i)$ from $V_i$. We produce a $(\chi_d, \Delta^*)$-coloring of $H$ as follows: vertex $p_A$ receives color 1, while vertex $p_B$ receives color 2. All vertices for which we have added an equality gadget with one endpoint identified with $p_A$ (respectively $p_B$) take color 1 (respectively 2). We use Lemma 6 to properly color the internal vertices of the equality gadgets.
We have still left uncolored the choice vertices $c^i_j$ as well as the internal vertices
For all other edges we use the opposite coloring: we color all vertices of the sets $L^1_e, H^1_e, L^2_e, H^2_e$ with color 2, and $c_e$ with color 1. We use Lemma 9 to properly color the internal vertices of palette gadgets, since all palette gadgets that we add use either color 1 or color 2 twice in their endpoints. This completes the coloring.
To see that the coloring we described is a $(\chi_d, \Delta^*)$-coloring, first we note that by Lemmata 6, 9 internal vertices of equality and palette gadgets are properly colored. Vertices $p_A, p_B$ have exactly $\Delta^*$ neighbors with the same color; guard vertices $g^i_j$ have exactly $n$ neighbors with the same color among the choice vertices, hence exactly $\Delta^*$ neighbors with the same color overall; choice vertices have at most $k$ neighbors of the same color, and we can assume that $k < |E| - \binom{k}{2}$; the vertex $c_U$ has exactly $\Delta^* = |E| - \binom{k}{2}$ neighbors with color 1, since the clique contains exactly $\binom{k}{2}$ edges; all internal vertices of edge gadgets have at most one neighbor of the same color. Finally, for the transfer vertices $l_{i,j}$ and $h_{i,j}$, we note that $l_{i,j}$ (respectively $h_{i,j}$) has exactly $f(i)$ (respectively $n - f(i)$) neighbors with color 1 among the choice vertices. Furthermore, when $i < j$, $l_{i,j}$ (respectively $h_{i,j}$) has $|L^1_e|$ (respectively $|H^1_e|$) neighbors with color 1 in the edge gadgets, those corresponding to the edge $e$ that belongs in the clique between $V_i$ and $V_j$. But by construction $|L^1_e| = n - f(i)$ and $|H^1_e| = f(i)$, and with similar observations for the case $j < i$ we conclude that all vertices have deficiency at most $\Delta^*$.
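As a sanity check of this last step, the count for a low transfer vertex can be written out explicitly. This is a sketch that assumes $l_{i,j}$ itself receives color 1 and that its only neighbors with color 1 are the three groups named in the proof and the construction (choice vertices, the edge gadget of the clique edge $e$, and the independent set of Step 22):

```latex
\underbrace{f(i)}_{\text{choice vertices}}
\;+\; \underbrace{|L^1_e|}_{=\, n - f(i)}
\;+\; \underbrace{(\Delta^* - n)}_{\text{Step 22}}
\;=\; \Delta^* .
```

The same computation with $n - f(i)$ choice neighbors and $|H^1_e| = f(i)$ gives exactly $\Delta^*$ for $h_{i,j}$ as well.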
Theorem 2 now follows directly from the reduction we have described and Lemmata 11,12,13.

ETH-based Lower Bounds for Treewidth and Pathwidth
In this section we present a reduction which strengthens the results of Section 3 for the parameters treewidth and pathwidth. In particular, the reduction we present here establishes that, under the ETH, the known algorithm for Defective Coloring for these parameters is essentially best possible. We use a similar presentation order as in the previous section, first giving the construction and then the Lemmata that imply the result. Where possible, we re-use the gadgets we have already presented. The main theorem of this section states the following:

Basic Gadgets
We use again the equality and palette gadgets of Section 3 (Definitions 5, 8). Before proceeding, let us show that adding these gadgets to the graph does not increase the pathwidth too much. For the two types of gadget $Q$, $P$, we will call the vertices $u_1, u_2 (, u_3)$ the endpoints of the gadget.
Lemma 15. Let $G = (V, E)$ be a graph and let $G'$ be the graph obtained from $G$ by repeating the following operation: find a copy of $Q(u_1, u_2, \chi_d, \Delta^*)$ or $P(u_1, u_2, u_3, \chi_d, \Delta^*)$; remove all its internal vertices from the graph; and add all edges between its endpoints which are not already connected. Then $tw(G) \le \max\{tw(G'), \chi_d\}$ and $pw(G) \le pw(G') + \chi_d$.

Construction
We now describe a construction which, given an instance $G = (V, E)$, $k$ of k-Multi-Colored Clique and a constant $\chi_d$, returns a graph $H(G, k, \chi_d)$ and an integer $\Delta^*$ such that $H$ admits a $(\chi_d, \Delta^*)$-coloring if and only if $G$ has a $k$-clique, and the pathwidth of $H$ is $O(k + \chi_d)$. We use $m$ to denote $|E|$, and we set $\Delta^* = m - \binom{k}{2}$. As in Section 3 we present the construction in steps to ease presentation, and we use the same conventions regarding adding $Q$ and $P$ gadgets to the graph.
Palette Part: This part repeats steps 1-5 of the construction of Section 3. We recall that this creates two main palette vertices $p_A, p_B$ (which are eventually guaranteed to have different colors).
Choice Part: In this part we construct a sequence of independent sets, arranged in what can be thought of as a k × 2m grid. The idea is that the choice we make in coloring the first independent set of every row will be propagated throughout the row. We can therefore encode k choices of a number between 1 and n, which will encode the clique.

Edge Representation:
In the $k \times 2m$ grid of independent sets we have constructed we devote two columns to represent each edge of $G$. In the remainder we assume some numbering of the edges of $E$ with the numbers $\{1, \ldots, m\}$, as well as a numbering of each
We perform the following steps for each such edge.

13. We add a checker vertex $c_j$ and connect it to all vertices of H 1
Validation and Budget-Setting: Finally, we add a vertex that counts how many edges we have included in our clique, as well as appropriate vertices to diminish the deficiency budget of various parts of our construction.

14. We add a universal checker vertex $c_U$ and connect it to all vertices $c_j$ added in step 13. We add an equality gadget $Q(p_A, c_U)$.
15. For every vertex $c_j$ added in step 13 we construct an independent set of size $\Delta^*$ and connect all its vertices to $c_j$. For each vertex $v$ in this set we add an equality gadget $Q(p_B, v)$.
16. For each vertex constructed in step 10 ($h^1_j, l^1_j, h^2_j, l^2_j$), we construct an independent set of size $\Delta^* - n$ and connect it to the vertex. For each vertex $v$ of this independent set we add an equality gadget $Q(p_A, v)$.
17. For each backbone vertex $b^l_{i,j}$, with $l \in \{A, B\}$, we construct an independent set of size $\Delta^* - n$ and connect it to $b^l_{i,j}$. For each vertex $v$ of this independent set we add an equality gadget $Q(p_l, v)$.
18. If $\chi_d \ge 3$, for each vertex $v$ added in steps 6-17 we add a palette gadget $P(p_A, p_B, v)$.
The proof of Theorem 14 now follows directly from Lemmata 16, 17, 18.

Exact Algorithms for Treewidth and Other Parameters
In this section we present several exact algorithms for Defective Coloring. Theorem 19 gives a treewidth-based algorithm which can be obtained using standard techniques. Essentially the same algorithm was already sketched in [9], but we give another version here for the sake of completeness and because it is a building block for the approximation algorithm of Theorem 23. Theorem 20 uses a win/win argument to show that the problem is FPT parameterized by fvs when $\chi_d \ne 2$ and therefore explains why the reduction presented in Section 3 only works for 2 colors. Theorem 21 uses a similar argument to show that the problem is FPT parameterized by vc (for any $\chi_d$).

Approximation Algorithms and Lower Bounds
In this section we present two approximation algorithms which run in FPT time parameterized by treewidth. The first algorithm (Theorem 23) is an FPT approximation scheme which, given a desired number of colors $\chi_d$, is able to approximate the minimum feasible value of $\Delta^*$ for this value of $\chi_d$ arbitrarily well (that is, within a factor $(1 + \epsilon)$). The second algorithm, which also runs in FPT time parameterized by treewidth, given a desired value for $\Delta^*$, produces a solution that approximates the minimum number of colors $\chi_d$ within a factor of 2.
These results raise the question of whether it is possible to approximate $\chi_d$ as well as we can approximate $\Delta^*$, that is, whether there exists an algorithm which comes within a factor $(1 + \epsilon)$ (rather than 2) of the optimal number of colors. As a first response, one could observe that such an algorithm probably cannot exist, because the problem is already hard when $\chi_d = 2$, and therefore an FPT algorithm with multiplicative error less than 3/2 would imply that FPT=W[1]. However, this does not satisfactorily settle the problem as it does not rule out an algorithm that achieves a much better approximation ratio, if we allow it to also have a small additive error in the number of colors. Indeed, as we observe in Corollary 28, it is possible to obtain an algorithm which runs in FPT time parameterized by feedback vertex set and has an additive error of only 1, as a consequence of the fact that the problem is FPT for $\chi_d \ge 3$. This poses the question of whether we can design an FPT algorithm parameterized by treewidth which, given a $(\chi_d, \Delta^*)$-colorable graph, produces a coloring with $\rho \chi_d + O(1)$ colors, for $\rho < 3/2$.
In the second part of this section we settle this question negatively by showing, using a recursive construction that builds on Theorem 2, that such an algorithm cannot exist. More precisely, we present a gap-introducing version of our reduction: the ratio between the number of colors needed to color Yes and No instances remains 3/2, even as the given χ d increases. This shows that the "correct" multiplicative approximation ratio for this problem really lies somewhere between 3/2 and 2, or in other words, that there are significant barriers impeding the design of a better than 3/2 FPT approximation for χ d , beyond the simple fact that 2-coloring is hard.

Approximation Algorithms
Our first approximation algorithm, which is an approximation scheme for the optimal value of $\Delta^*$, relies on a method introduced in [36] (see also [3]), and a theorem of [11]. The high-level idea is the following: intuitively, the obstacle that stops us from obtaining an FPT running time with the dynamic programming algorithm of Theorem 19 is that the dynamic program is forced to store some potentially large values for each vertex. More specifically, to characterize a partial solution we need to remember not just the color of each vertex in a bag, but also how many neighbors with the same color this vertex has already seen (which is a value that can go up to $\Delta^*$). The main trick now is to "round" these values in order to decrease the number of possible states a vertex can be found in. To do this, we select an appropriate value $\delta$ (polynomial in $\epsilon / \log n$), and try to replace every value that the dynamic program would calculate with the next higher integer power of $(1 + \delta)$. This has the advantage of limiting the number of possible values from $\Delta^*$ to $\log_{(1+\delta)} \Delta^* \approx \frac{\log \Delta^*}{\delta}$, and this is sufficient to obtain the promised running time. The problem is now that the rounding we applied introduces an approximation error, which is initially a factor of at most $(1 + \delta)$, but may increase each time we apply an arithmetic operation as part of the algorithm. To show that this error does not get out of control we show that in any bag of the tree all values stored are within a factor $(1 + \delta)^h$ of the correct ones, where $h$ is the height of the bag. We then use a theorem of Bodlaender and Hagerup [11] which states that any tree decomposition can be balanced in such a way that its height is at most $O(\log n)$, and as a result we obtain that all values are sufficiently close to being correct.
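The rounding step on its own can be illustrated in a few lines of Python. This is only a sketch of the idea of snapping counters to powers of $(1 + \delta)$; the function names are hypothetical, and the actual algorithm applies the rounding inside the dynamic program over a balanced tree decomposition, which is not reproduced here.

```python
def round_up(value, delta):
    """Round a positive count up to the next integer power of (1 + delta).
    A count of 0 is kept as 0."""
    if value <= 0:
        return 0.0
    power = 1.0
    while power < value:
        power *= 1.0 + delta
    return power


def rounded_states(max_value, delta):
    """All distinct rounded values that the counts 0..max_value can take."""
    return sorted({round_up(v, delta) for v in range(max_value + 1)})
```

For instance, `rounded_states(1000, 0.5)` contains at most 20 values instead of 1001, and each rounded value exceeds the true count by a factor of at most $(1 + \delta)$.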
The second algorithm we present in this section (Theorem 25) uses the approximation scheme for $\Delta^*$ to obtain an FPT 2-approximation for $\chi_d$. The idea here is that, given a $(\chi_d, \Delta^*)$-colorable graph, we first produce a $(\chi_d, (1 + \epsilon)\Delta^*)$-coloring using the algorithm of Theorem 23, and then apply a procedure which uses 2 colors for each color class of this solution but manages to divide by two the number of neighbors with the same color of every vertex. This is achieved with a simple polynomial-time local search procedure.
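The text only states that a simple polynomial-time local search is used. The Python sketch below shows one standard way to realize such a step, a max-cut style flipping rule, and is offered as an assumption rather than as the authors' exact procedure; the function name and input encoding are illustrative.

```python
def halve_defects(n, edges, coloring):
    """Split every color class c into sub-classes (c, 0) and (c, 1) so that
    each vertex keeps at most half of its same-colored neighbors.

    n: number of vertices (0..n-1); edges: list of pairs;
    coloring: list giving the current color of each vertex.
    """
    # Only edges inside a color class matter for the split.
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        if coloring[u] == coloring[v]:
            neighbors[u].append(v)
            neighbors[v].append(u)

    side = [0] * n
    improved = True
    while improved:
        improved = False
        for v in range(n):
            same = sum(1 for u in neighbors[v] if side[u] == side[v])
            if 2 * same > len(neighbors[v]):
                # Flipping v strictly decreases the number of edges whose
                # endpoints share both color and side, so the loop terminates.
                side[v] ^= 1
                improved = True
    return [(coloring[v], side[v]) for v in range(n)]
```

When the loop stops, no vertex has more than half of its former same-colored neighbors on its own side, so a $(\chi_d, \Delta)$-coloring becomes a $(2\chi_d, \lfloor \Delta / 2 \rfloor)$-coloring.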
Theorem 22. [11] There is a polynomial-time algorithm which, given a graph G = (V, E) and a tree decomposition of G of width tw, produces a tree decomposition of G of width at most 3tw + 2 and height O(log n).

Hardness of Approximation
The main result of this section is that $\chi_d$ cannot be approximated with a factor better than 3/2 in FPT time (for parameters tree-depth, pathwidth, or treewidth), even if we allow the algorithm to also have a constant additive error. We remark that an FPT algorithm with additive error 1 is easy to obtain for feedback vertex set (Corollary 28).

Conclusions
In this paper we classified the complexity of Defective Coloring with respect to some of the most well-studied graph parameters, gave essentially tight ETH-based lower bounds for pathwidth and treewidth, and explored the parameterized approximability of the problem. Though this gives a good first overview of the problem's parameterized complexity landscape, there are several questions worth investigating next. First, is it possible to make the lower bounds of Section 4 even tighter, by precisely determining the base of the exponent in the algorithm's dependence? This would presumably rely on a stronger complexity assumption such as the SETH, as in [37]. Second, can we determine the complexity of the problem with respect to other structural parameters, such as clique-width [15], modular-width [24], or neighborhood diversity [35]? For some of these parameters the existence of FPT algorithms is already ruled out by the fact that Defective Coloring is NP-hard on cographs [9]; however, the complexity of the problem is unknown if we also add $\chi_d$ or $\Delta^*$ as a parameter. Finally, it would be very interesting to close the gap between 2 and 3/2 on the performance of the best treewidth-parameterized FPT approximation for $\chi_d$.

A.1 Omitted Preliminaries
We recall here some standard definitions for the reader's convenience. A tree decomposition of a graph $G = (V, E)$ is a (rooted) tree $T = (X, I)$ such that each node of $T$ is a subset of $V$. We call the elements of $X$ bags. $T$ must obey the following constraints: $\forall v \in V$ $\exists B \in X$ such that $v \in B$; $\forall (u, v) \in E$ $\exists B \in X$ such that $u, v \in B$; $\forall v \in V$ the bags of $X$ that contain $v$ induce a connected sub-tree. The width of a tree decomposition is $\max_{B \in X} |B| - 1$, and $tw(G)$ is the minimum width of a tree decomposition of $G$. Pathwidth is defined similarly, except the decomposition is required to be a path instead of a tree.
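The three constraints translate directly into a checker. The Python sketch below is a minimal illustration; the encoding (a dictionary from tree nodes to bags, plus a list of tree edges that is assumed, not verified, to form a tree) and the function names are assumptions made for this example.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """bags: dict mapping each tree node to a set of graph vertices.
    tree_edges: pairs of tree nodes, assumed (not checked) to form a tree."""
    # 1. Every vertex appears in some bag.
    if any(all(v not in bag for bag in bags.values()) for v in vertices):
        return False
    # 2. Both endpoints of every edge appear together in some bag.
    if any(all(not {u, v} <= bag for bag in bags.values()) for u, v in edges):
        return False
    # 3. The bags containing a given vertex induce a connected sub-tree.
    tree_adj = {node: set() for node in bags}
    for a, b in tree_edges:
        tree_adj[a].add(b)
        tree_adj[b].add(a)
    for v in vertices:
        holders = {node for node, bag in bags.items() if v in bag}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in tree_adj[node] & holders:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if seen != holders:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1
```

For the path on vertices 0, 1, 2, 3, the bags `{0, 1}`, `{1, 2}`, `{2, 3}` arranged along a path form a valid decomposition of width 1, which is also a path decomposition.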
For a rooted tree $T$ we define its height as the number of vertices in the longest path from the root to a leaf, and its completion as the graph obtained by connecting each node to all of its ancestors. For a graph $G$ we define $td(G)$ as the minimum height of any tree whose completion contains $G$ as a subgraph. An equivalent recursive definition is the following: $td(K_1) = 1$; if $G$ is disconnected then $td(G)$ is equal to the maximum tree-depth of $G$'s connected components; otherwise $td(G) = 1 + \min_{v \in V} td(G[V \setminus v])$.
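The recursive definition can be evaluated directly on small graphs. The Python sketch below follows it literally with memoization over vertex subsets; it takes exponential time and is meant only as an illustration, with a hypothetical function name and input encoding.

```python
from functools import lru_cache


def tree_depth(vertices, edges):
    """Tree-depth via the recursive definition (exponential time)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def components(vs):
        """Connected components of the subgraph induced by vs."""
        vs, comps = set(vs), []
        while vs:
            stack = [vs.pop()]
            comp = set(stack)
            while stack:
                for w in adj[stack.pop()] & vs:
                    vs.remove(w)
                    comp.add(w)
                    stack.append(w)
            comps.append(frozenset(comp))
        return comps

    @lru_cache(maxsize=None)
    def td(vs):
        if len(vs) <= 1:
            return len(vs)              # td(K_1) = 1, and 0 for the empty graph
        comps = components(vs)
        if len(comps) > 1:              # disconnected: maximum over components
            return max(td(c) for c in comps)
        return 1 + min(td(vs - {v}) for v in vs)

    return td(frozenset(vertices))
```

On a path with 7 vertices this returns 3 (remove the middle vertex, then recurse on two paths with 3 vertices), while on the complete graph $K_4$ it returns 4.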
A graph's feedback vertex set (respectively vertex cover) is the smallest set of vertices whose removal leaves the graph acyclic (respectively edge-less).
Proof of Lemma 1. All stated relations are standard but we recall here the proofs for the sake of completeness. To obtain tw(G) − 1 ≤ fvs(G), if S ⊆ V is a feedback vertex set, we can construct a tree decomposition of G by including all vertices of S in a tree decomposition (of width 1) of G[V \ S]. fvs(G) ≤ vc(G) follows because every vertex cover is also a feedback vertex set. tw(G) ≤ pw(G) because all path decompositions are also valid tree decompositions. pw(G) ≤ td(G) − 1 can be seen by recalling that, if G is connected ∃v ∈ V such that td(G) = 1 + td(G[V \ v]). We can now take a path decomposition of G[V \ v] and add v to every bag. To see that td(G) ≤ vc(G) + 1 we observe that G is a subgraph of the rooted tree we construct if we connect all the vertices of a vertex cover in a path, and attach all the other vertices to the path's last vertex.
For the coloring statements, we recall that a graph with treewidth $tw$ is $tw$-degenerate, that is, there exists an ordering of its vertices such that each vertex has at most $tw$ neighbors among the vertices that precede it [12]. To see that $td(G)$ colors suffice to color $G$ if it is connected, we recall that $\exists v \in V$ such that $td(G) = 1 + td(G[V \setminus v])$, use a unique color for $v$ and $td(G) - 1$ colors for the rest of the graph. $fvs(G) + 2$ colors are always sufficient to properly color a graph because we can use distinct colors for the feedback vertex set, and two-color the remaining forest.
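The degeneracy argument corresponds to a simple greedy procedure. The Python sketch below computes a degeneracy ordering by repeatedly removing a minimum-degree vertex and then colors the vertices in reverse removal order; the function name and encoding are illustrative assumptions. In a $d$-degenerate graph it uses at most $d + 1$ colors.

```python
def degeneracy_coloring(n, edges):
    """Return (degeneracy, coloring) for a graph on vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(range(n))
    degree = [len(a) for a in adj]
    removal_order = []
    degeneracy = 0
    while remaining:
        v = min(remaining, key=lambda x: degree[x])   # minimum remaining degree
        degeneracy = max(degeneracy, degree[v])
        remaining.remove(v)
        removal_order.append(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1

    # In reverse removal order every vertex has at most `degeneracy`
    # already-colored neighbors, so one of degeneracy + 1 colors is free.
    color = {}
    for v in reversed(removal_order):
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(degeneracy + 1) if c not in used)
    return degeneracy, [color[v] for v in range(n)]
```

On a cycle with 5 vertices the procedure reports degeneracy 2 and produces a proper coloring with at most 3 colors.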

A.2 Omitted Proofs from Section 3
Proof of Lemma 4. We begin with the last statement: clearly $td(T(1, j)) = pw(T(1, j)) + 1 = tw(T(1, j)) + 1 = 1$, while it can be seen that $tw(T(i, j)) + 1 \le pw(T(i, j)) + 1 \le td(T(i, j)) \le 1 + td(T(i - 1, j))$ by removing the universal vertex. We also observe that $td(T(i, j)) \ge pw(T(i, j)) + 1 \ge tw(T(i, j)) + 1 \ge i$ because $T(i, j)$ contains a clique of size $i$. The third statement implies the first by Lemma 1. Finally, to see that $T(i, j)$ does not admit an $(i - 1, j)$-coloring, we do induction on $i$. Clearly, $T(1, j)$ requires at least one color. Suppose now that $T(i, j)$ does not admit an $(i - 1, j)$-coloring but, for the sake of contradiction, $T(i + 1, j)$ admits an $(i, j)$-coloring. By assumption, each of the $j + 1$ copies of $T(i, j)$ contained in $T(i + 1, j)$ must be using all $i$ available colors. Hence, each color appears at least $j + 1$ times, which implies that there is no available color for the universal vertex.
Proof of Lemma 6. For the first statement, consider a $(\chi_d, \Delta^*)$-coloring of $G'$ and examine the copies of $T(\chi_d - 1, \Delta^*)$ contained in the equality gadget added to $G$. For a set $C \subseteq \{1, \ldots, \chi_d\}$ with size $|C| = \chi_d - 1$ we say that $C$ is contained in a copy of $T(\chi_d - 1, \Delta^*)$ if all the colors of $C$ appear in this copy in the coloring of $G'$. There are $\binom{\chi_d}{\chi_d - 1} = \chi_d$ such sets of colors $C$, and every copy of $T(\chi_d - 1, \Delta^*)$ contains at least one by Lemma 4. Hence, the set of colors $C$ that is contained in the largest number of copies is contained in at least $\Delta^* + 1$ copies, therefore all its colors appear at least $\Delta^* + 1$ times. This means that $v_1, v_2$ cannot take any of the colors in $C$, and therefore must use the same color.
For the second statement, recall that by Lemma 4, $T(\chi_d - 1, \Delta^*)$ can be properly colored with $\chi_d - 1$ colors, and $\chi_d - 1$ colors are available if $v_1, v_2$ use the same color.

Proof of Lemma 7.
For the first inequality, we begin by observing that $td(G') \le td(G' \setminus S) + |S|$, so it suffices to show that $td(G' \setminus S) \le td(G \setminus S) + \chi_d - 1$. Observe now that in $G' \setminus S$, in every copy of $Q$ one of the vertices $u_1, u_2$ has been removed.
By definition, there must exist a rooted tree $T_1$ with $td(G \setminus S)$ levels such that if we complete the tree (that is, connect each node of $T_1$ to all its descendants), $G \setminus S$ is a subgraph of the resulting graph. Similarly, there exists a rooted tree $T_2$ with $\chi_d - 1$ levels such that $T(\chi_d - 1, \Delta^*)$ is a subgraph of its completion. We now observe that if we take $T_1$ and attach to each of its nodes a copy of $T_2$ we have a tree with $td(G \setminus S) + \chi_d - 1$ levels whose completion contains $G' \setminus S$ as a subgraph.
For the final statement, if $\chi_d = 2$ the equality gadgets we have added to $G$ contain copies of $T(1, \Delta^*) = K_1$. If we remove $S$ from $G'$, and therefore remove one endpoint of each equality gadget, these vertices become leaves, and hence do not affect the size of the graph's minimum feedback vertex set. Deleting them gives us the graph $G \setminus S$, so we conclude that $fvs(G' \setminus S) = fvs(G \setminus S)$ which, together with the fact that $fvs(G') \le fvs(G' \setminus S) + |S|$, completes the proof.
Proof of Lemma 9. For the first statement, consider a $(\chi_d, \Delta^*)$-coloring of $G'$ and examine the copies of $T(\chi_d - 2, \Delta^*)$ contained in the palette gadget added to $G$. For a set $C \subseteq \{1, \ldots, \chi_d\}$ with size $|C| = \chi_d - 2$ we say that $C$ is contained in a copy of $T(\chi_d - 2, \Delta^*)$ if all the colors of $C$ appear in this copy in the coloring of $G'$. There are $\binom{\chi_d}{\chi_d - 2} = \binom{\chi_d}{2}$ such sets of colors $C$, and every copy of $T(\chi_d - 2, \Delta^*)$ contains at least one by Lemma 4. Hence, the set of colors $C$ that is contained in the largest number of copies is contained in at least $\Delta^* + 1$ copies, therefore all its colors appear at least $\Delta^* + 1$ times. This means that $v_1, v_2, v_3$ cannot take any of the colors in $C$, and therefore have only two colors available for them. By the pigeonhole principle, two of them must share a color. For the second statement, recall that by Lemma 4, $T(\chi_d - 2, \Delta^*)$ can be properly colored with $\chi_d - 2$ colors, and $\chi_d - 2$ colors are available if $v_1, v_2, v_3$ use at most two colors.
Proof of Lemma 10. The proof follows along the same lines as the proof of Lemma 7. First, we observe that td(G') ≤ td(G' \ S) + |S| and then show that td(G' \ S) ≤ td(G \ S) + χ d − 2 by taking a tree T 1 with td(G \ S) levels whose completion contains G \ S and attaching to each node a tree T 2 with χ d − 2 levels whose completion contains T (χ d − 2, ∆ * ).
Step 5 we then know that p A , p B each has at least ∆ * neighbors with the same color. Therefore, because of the edge connecting them, we conclude that c(p A ) ≠ c(p B ). Without loss of generality we will assume below that c(p A ) = 1 and c(p B ) = 2.
Because of the equality gadget of Step 20 we have c(c U ) = 1. Because c U has degree |E|, we conclude that it has at least $\binom{k}{2}$ neighbors with color 2. These correspond to a set E' ⊆ E of edges of the original graph with $|E'| \ge \binom{k}{2}$. We will prove that, in fact, E' induces a k-clique in G.
Let e ∈ E' be an edge such that c(c e ) = 2. This implies that all the vertices of L 1 e ∪ H 1 e ∪ L 2 e ∪ H 2 e must take color 1, because by Step 23 c e already has ∆ * neighbors with color 2. In case χ d ≥ 3 we have also used here the fact that, by Step 18, every internal vertex of the gadget representing e must take color 1 or 2.

Suppose that e ∈ E' connects the vertex with index i 1 in V j1 to the vertex with index i 2 in V j2 , j 1 < j 2 . We first show that, for an edge e' ≠ e of E also connecting V j1 to V j2 , it must be that e' ∉ E'. Suppose for contradiction that e' ∈ E', and let i' 1 , i' 2 be the indices of the endpoints of e'. We observe that l j1,j2 has at least |L 1 e | + |L 1 e' | = 2n − i 1 − i' 1 neighbors with color 1 in the edge gadgets, while h j1,j2 has at least |H 1 e | + |H 1 e' | = i 1 + i' 1 such neighbors. Both l j1,j2 and h j1,j2 had ∆ * − n neighbors of color 1 added in Step 22. Finally, among the 2n choice vertices c j1 j which are neighbors of either l j1,j2 or h j1,j2 there are at least n which received color 1, because all the choice vertices have colors 1 or 2 (Step 10) and g j1 B , which has color 2 (Step 9), is connected to all of them and also has ∆ * − n other neighbors of color 2 (Step 21). Hence, the total number of vertices in N (l j1,j2 ) ∪ N (h j1,j2 ) with color 1 is at least 2n + 2(∆ * − n) + n > 2∆ * , so one of these two vertices has deficiency higher than ∆ * , a contradiction. We conclude that e' ∉ E'.
To complete the proof, let us show that the $\binom{k}{2}$ edges of E', each of which connects a different pair of parts of V , are incident on the same endpoints. Take e ∈ E' as in the previous paragraph, and e' ∈ E' connecting vertices with indices i' 1 , i 3 from the parts V j1 , V j3 , for j 3 ≠ j 2 . It suffices to show that i 1 = i' 1 . Suppose for contradiction that i 1 ≠ i' 1 . Consider now the vertices l j1,j2 , h j1,j2 , l j1,j3 , h j1,j3 , which, by similar reasoning as before, have n − i 1 , i 1 , n − i' 1 , i' 1 color-1 neighbors in the edge gadgets respectively. If there are strictly more than i 1 vertices with color 1 among the choice vertices c j1 j , j ∈ {1, . . . , n}, then l j1,j2 would have deficiency more than ∆ * . If there are strictly more than n − i 1 vertices with color 1 among the choice vertices c j1 j , j ∈ {n + 1, . . . , 2n}, then h j1,j2 would have deficiency more than ∆ * . Since, by the same reasoning as previously, there are at least n vertices with color 1 among the choice vertices c j1 j , we conclude that there are exactly i 1 vertices with color 1 among the c j1 j for j ∈ {1, . . . , n}, and exactly n − i 1 such vertices in the rest. We can now conclude that the only way not to violate the deficiency of l j1,j3 or h j1,j3 is for i 1 = i' 1 .
Proof of Lemma 13. We first observe that all equality and palette gadgets added to the graph (Steps 3, 9, 10, 14, 18, 20-23)
For both parameters we start by removing from the graph all the guard and transfer vertices, which are $2k + 2k(k - 1) = 2k^2$ in total. We now have that all vertices p i j , as well as all choice vertices, are isolated. Furthermore, all vertices added to represent edges, as well as the budget-setting vertices, form a tree with root at c U and 3 levels. We conclude that H has $td(H) \le 2k^2 + 4$ and $fvs(H) \le 2k^2$.

A.3 Omitted Proofs from Section 4
Proof of Lemma 15. First, we observe that there is a path decomposition of Q(u 1 , u 2 , χ d , ∆ * ) with width χ d , as by Lemma 4 there is a path decomposition of T (χ d − 1, ∆ * ) of width χ d − 2, and we can add to all its bags the vertices u 1 , u 2 . Call this path decomposition T Q . In the same way, there is a path decomposition of width χ d for P (u 1 , u 2 , u 3 , χ d , ∆ * ), call it T P .
We now take an optimal tree or path decomposition of G', call it T', and construct from it a decomposition of G. Consider a gadget H ∈ {Q, P } that appears in G with endpoints u 1 , u 2 (, u 3 ). Since in G' these endpoints form a clique, there is a bag in T' that contains all of them. Let B be the smallest such bag. Now, if T' is a tree decomposition, we take T H and attach it to B. If T' is a path decomposition, we insert in the decomposition immediately after B the decomposition T H where we have added all vertices of B in all bags of T H . It is not hard to see that in both cases the decompositions remain valid, and we can repeat this process for every H until we have a decomposition of G.
Proof of Lemma 16. Suppose that G has a k-clique, given by a function σ : {1, . . . , k} → {1, . . . , n}, meaning that the clique contains vertex σ(i) from the set V i . We color H as follows: p A receives color 1, p B receives color 2, and all vertices on which we have attached equality gadgets receive the appropriate color, according to Lemma 6. By Lemmata 6, 9 we can extend this coloring to the internal vertices of equality and palette gadgets. For every independent set C i,j , we color σ(i) of its vertices with 1 if j is odd, otherwise we color n − σ(i) of its vertices with 1; we color the remaining vertices of independent sets C i,j with 2. For the j-th edge of E, if it is contained in the clique then we color c j with 2 and H 1 j , L 1 j , H 2 j , L 2 j with 1, otherwise we color c j with 1 and H 1 j , L 1 j , H 2 j , L 2 j with 2. This completes the coloring. To see that this coloring is valid, observe that the vertices in the palette part have each at most ∆ * neighbors of the same color; the backbone vertices b l i,j have exactly ∆ * neighbors of the same color (σ(i) in one grid independent set and n − σ(i) in the other, plus ∆ * − n from Step 17); the vertices l 1 j , h 1 j , l 2 j , h 2 j if the j-th edge belongs to the clique have exactly ∆ * neighbors with the same color; the same vertices for an edge that does not belong to the clique have strictly fewer than ∆ * neighbors of the same color; all vertices c j have at most ∆ * neighbors with the same color; and vertex c U has $m - \binom{k}{2} = \Delta^*$ neighbors with the same color.

by removing from the graph the vertices p A , p B , c U . This does not decrease the pathwidth by more than 3, since these vertices can be added to all bags. In the remaining graph we remove all leaves and isolated vertices. It is not hard to see that this does not decrease pathwidth by more than 1, since if we find a path decomposition of the remaining graph, we can reinsert the leaves as follows: for each leaf v we find the smallest bag in the decomposition that contains its neighbor and insert after it a copy of the same bag with v added. We note that removing all leaves deletes from the graph all vertices added for budget-setting, as well as the remaining vertices of the palette part.
What remains then is to bound the pathwidth of the graph induced by the backbone vertices b l i,j , the choice vertices in sets C i,j , and the edge representation vertices. We construct a backbone of a path decomposition as follows: for each j ∈ {1, . . . , m} we construct a bag that contains all b l i,2j−1 , b l i,2j , and b l i,2j+1 (if they exist), as well as h 1 j , l 1 j , h 2 j , l 2 j , c j . We connect these bags in a path in increasing order of j. All these bags have width O(k).
We now observe that for every remaining vertex of the graph, there is a bag in the path decomposition that we have constructed that contains all its neighbors. We therefore do the following: for every remaining vertex v, we find the smallest bag of the path decomposition that contains its neighborhood, and insert after it a copy of this bag with v added. This process results in a valid path decomposition, and it does not increase the size of the largest bag by more than 1.
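The reinsertion step used above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a path decomposition is represented as a list of sets, and the helper names `insert_after_smallest_bag` and `is_path_decomposition` are hypothetical and not part of the construction in the paper.

```python
def insert_after_smallest_bag(bags, v, neighbors):
    """Return a new bag list in which a copy of the smallest bag containing
    all of v's neighbors, with v added, is inserted right after that bag."""
    candidates = [i for i, bag in enumerate(bags) if neighbors <= bag]
    if not candidates:
        raise ValueError("no bag contains the whole neighborhood of v")
    i = min(candidates, key=lambda j: len(bags[j]))
    return bags[: i + 1] + [bags[i] | {v}] + bags[i + 1 :]


def is_path_decomposition(bags, edges):
    """Check that every edge is covered by some bag and that every vertex
    occupies a consecutive run of bags."""
    for u, w in edges:
        if not any(u in bag and w in bag for bag in bags):
            return False
    for x in set().union(*bags):
        positions = [i for i, bag in enumerate(bags) if x in bag]
        if positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


# Path a - b - c, to which we add a vertex v adjacent to b and c.
bags = [{"a", "b"}, {"b", "c"}]
new_bags = insert_after_smallest_bag(bags, "v", {"b", "c"})
```

Since the new bag is a copy of an existing bag plus one vertex, the largest bag grows by at most one, which matches the bound claimed in the text.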

A.4 Omitted Proofs from Section 5
Proof of Theorem 19. The algorithm uses standard dynamic programming techniques, so we sketch some of the details. We assume we are given a nice tree decomposition, as defined in [12]. For each bag B t of the decomposition we denote by B ↓ t the set of vertices included in bags in the sub-tree of the decomposition rooted at B t . We will maintain in each bag B t a dynamic programming table D t of signatures, where a signature s contains, for each vertex u ∈ B t , a pair (c u , d u ) giving the color of u and the number of its neighbors with the same color in B ↓ t \ B t .
For an Introduce node B t with child B t' such that B t = B t' ∪ {u}, for any s' ∈ D t' , and for any c u ∈ {1, . . . , χ d }, we add to D t a signature s which agrees with s' on B t' and contains the pair (c u , 0) for vertex u.
For a Forget node B t with child B t' such that B t = B t' \ {u}, for every signature s' ∈ D t' we do the following: let (c u , d u ) be the pair contained in s' corresponding to vertex u. Let S u ⊆ B t be the set of vertices of B t which are given color c u according to s' and which are neighbors of u. We check two conditions: first, that d u + |S u | ≤ ∆ * ; second, that for all v ∈ S u such that s' contains the pair (c u , d v ) we have d v ≤ ∆ * − 1. If both conditions hold, we add to D t a signature s that agrees with s' on B t \ S u , and that for each v ∈ S u such that s' returns (c u , d v ), returns the pair (c u , d v + 1).
For a Join node B t with children B t1 , B t2 (such that B t = B t1 = B t2 ) we do the following: for each s 1 ∈ D t1 and each s 2 ∈ D t2 we check the following two conditions for all u ∈ B t : if s 1 returns (c u1 , d u1 ) for u and s 2 returns (c u2 , d u2 ) we check if c u1 = c u2 ; and we check if d u1 + d u2 ≤ ∆ * . If both conditions hold for all u ∈ B t we say that s 1 , s 2 are compatible, and we add to D t a signature s which for u ∈ B t contains the pair (c u1 , d u1 + d u2 ).
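The Introduce, Forget and Join rules above can be illustrated by the following minimal Python sketch. It is not the authors' implementation: the function names, the encoding of a signature as a frozenset of triples (vertex, color, d), and the assumption that the leaf and root bags are empty are all choices made here for illustration.

```python
def leaf():
    """Table of an empty bag: it holds only the empty signature."""
    return {frozenset()}


def introduce(table, u, num_colors):
    """Introduce u: try every color, with no forgotten same-color neighbors."""
    return {s | {(u, c, 0)} for s in table for c in range(num_colors)}


def forget(table, u, adj, max_defect):
    """Forget u: check its final defect and charge it to same-color bag neighbors."""
    out = set()
    for s in table:
        sig = {v: (c, d) for v, c, d in s}
        c_u, d_u = sig.pop(u)
        same = [v for v in sig if sig[v][0] == c_u and v in adj[u]]
        if d_u + len(same) > max_defect:
            continue  # u would have too many neighbors of its own color
        if any(sig[v][1] > max_defect - 1 for v in same):
            continue  # some bag neighbor cannot absorb one more
        for v in same:
            sig[v] = (c_u, sig[v][1] + 1)
        out.add(frozenset((v, c, d) for v, (c, d) in sig.items()))
    return out


def join(table1, table2, max_defect):
    """Join: keep compatible pairs of signatures and add their counters."""
    out = set()
    for s1 in table1:
        a = {v: (c, d) for v, c, d in s1}
        for s2 in table2:
            b = {v: (c, d) for v, c, d in s2}
            if all(a[v][0] == b[v][0] and a[v][1] + b[v][1] <= max_defect
                   for v in a):
                out.add(frozenset((v, a[v][0], a[v][1] + b[v][1]) for v in a))
    return out


# Illustration on a triangle a, b, c with the chain of bags
# {} -> {a} -> {a,b} -> {a,b,c} -> {b,c} -> {c} -> {}.
TRIANGLE = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}


def triangle_colorable(num_colors, max_defect):
    table = leaf()
    for u in "abc":
        table = introduce(table, u, num_colors)
    for u in "abc":
        table = forget(table, u, TRIANGLE, max_defect)
    return bool(table)  # a non-empty root table means a coloring exists
```

For instance, `triangle_colorable(2, 0)` asks whether the triangle is properly 2-colorable, while `triangle_colorable(2, 1)` allows one same-color neighbor per vertex. The sketch stores whole tables explicitly, so it is meant for small examples only.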

2. If there exists a signature s ∈ D t , then there exists a coloring c of B ↓ t such that all vertices of B ↓ t \ B t have at most (1 + ε)∆ * neighbors with the same color; all vertices of B t take in c the colors described in s; if s dictates that a vertex u ∈ B t has d u neighbors with the same color in B ↓ t \ B t , then u has at most d u neighbors with the same color in B ↓ t \ B t according to coloring c.
The first of the two properties above implies that, if there exists a (χ d , ∆ * )-coloring of G, the algorithm will be able to find some entry in the table of the root bag that will allow us to construct a $(\chi_d, (1+\delta)^H \Delta^*)$-coloring, where H is the height of the tree decomposition. We recall now that H = O(log n), therefore $(1+\delta)^H \le e^{\delta H} \le e^{O(\epsilon/\log n)} \le 1 + \epsilon$. Hence, if we establish the first property, we know that if a (χ d , ∆ * )-coloring exists, the algorithm will be able to find a (χ d , (1 + ε)∆ * )-coloring. Conversely, the second property assures us that, if the algorithm places a signature s in a DP table, there must exist a coloring that matches this signature.
In order to establish these invariants we must make a further modification to the algorithm of Theorem 19. We recall that the algorithm makes some arithmetic calculations in Forget nodes (where the value d v of neighbors of the forgotten node with the same color is increased by 1) and in Join nodes (where values d u1 , d u2 corresponding to the same node are added). The problem here is that even if the values stored are integer powers of (1 + δ), the results of these additions are not necessarily such integer powers. Hence, our algorithm will simply "round up" the result of these additions to the closest integer power of (1 + δ). Formally, instead of the value d v + 1 we use the value $(1+\delta)^{\lceil \log_{1+\delta}(d_v + 1) \rceil}$, and instead of the value d u1 + d u2 we use the value $(1+\delta)^{\lceil \log_{1+\delta}(d_{u_1} + d_{u_2}) \rceil}$.
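The rounding step can be made concrete with a short Python sketch. The function name `round_up_to_power` is hypothetical, and the loop that corrects the floating-point logarithm is a choice made here to avoid precision issues; the text only specifies the ceiling of the logarithm.

```python
import math


def round_up_to_power(x, delta):
    """Smallest non-negative integer power of (1 + delta) that is >= x.

    A value of 0 (no same-color neighbors yet) is left unchanged.
    """
    if x <= 0:
        return 0.0
    base = 1.0 + delta
    # Start slightly below the floating-point estimate of the exponent,
    # then move upward until the power is large enough.
    k = max(0, math.floor(math.log(x, base)) - 1)
    while base ** k < x:
        k += 1
    return base ** k


# Forget node: d_v + 1 is replaced by its rounded-up value.
rounded_forget = round_up_to_power(4 + 1, 1.0)
# Join node: d_u1 + d_u2 is replaced by its rounded-up value.
rounded_join = round_up_to_power(4 + 4, 1.0)
```

With δ = 1 the admissible values are powers of 2, so 5 is rounded up to 8 while 8 is already admissible. Each rounding overestimates the true count by a factor of at most (1 + δ).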
We can now establish the two properties by induction.
For the second property, observe that since we always round up, the value stored in the table will always be at least as high as the true number of neighbors of a vertex in the coloring c. Calculations are similar for Forget nodes.
Because of the above we have an algorithm that runs in time polynomial in $|D_t| = (\chi_d |\Sigma|)^{O(tw)}$. We can assume without loss of generality that χ d ≤ tw + 1, otherwise by Lemma 1 the graph can easily be properly colored. By the observations on |Σ| we therefore have that the running time is $(tw \log n/\epsilon)^{O(tw)}$. A well-known win/win argument allows us to obtain the promised bound as follows: if $tw \le \sqrt{\log n}$, this running time is in fact polynomial in n, 1/ε, so we are done; if $\sqrt{\log n} \le tw$ then $\log n \le tw^2$ and the running time is upper bounded by $(tw/\epsilon)^{O(tw)}$.

Proof of Lemma 24.
We run what is essentially a local search algorithm for Max Cut. Initially, color all vertices with color 1. Then, as long as there exists a vertex u such that the majority of its neighbors have the same color as u, we change the color of u. We continue with this process until all vertices have a majority of their neighbors with a different color. In that case the claim follows. To see that this procedure terminates in polynomial time, observe that in each step we increase the number of edges that connect vertices of different colors.
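The local search described above is short enough to state as code. The following Python sketch assumes the graph is given as a dictionary mapping each vertex to the set of its neighbors; the function name is hypothetical.

```python
def local_search_two_coloring(adj):
    """Two-color the graph so that no vertex has a strict majority of its
    neighbors in its own color class."""
    color = {v: 0 for v in adj}  # start with a single color
    improved = True
    while improved:
        improved = False
        for u in adj:
            same = sum(1 for w in adj[u] if color[w] == color[u])
            if 2 * same > len(adj[u]):
                # Flipping u strictly increases the number of edges
                # between the two classes, so the loop terminates.
                color[u] = 1 - color[u]
                improved = True
    return color


# Complete graph on four vertices: every vertex has degree 3.
K4 = {i: {j for j in range(4) if j != i} for i in range(4)}
coloring = local_search_two_coloring(K4)
```

When the procedure stops, every vertex u has at most deg(u)/2 neighbors of its own color. Each flip adds at least one edge to the cut, so there are at most |E| flips in total.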
Proof of Theorem 25. We assume without loss of generality that ∆ * is sufficiently large (e.g. ∆ * ≥ 20), otherwise we can solve the problem exactly by using the fact that χ d is bounded by tw (by Lemma 1) and the algorithm of Theorem 19. We invoke the algorithm of Theorem 23, setting ε = 1/10. The algorithm runs in the promised running time. If it reports that G does not admit a (χ d , ∆ * )-coloring, we output the same answer and we are done.
Suppose that the algorithm of Theorem 23 returned a $(\chi_d, \frac{11}{10}\Delta^*)$-coloring of G. We transform this to a (2χ d , ∆ * )-coloring by using Lemma 24.
We consider each color class in the returned coloring of G separately. Each class induces a graph with maximum degree at most $\frac{11}{10}\Delta^*$. According to Lemma 24, we can two-color this graph so that no vertex has more than $\frac{11}{20}\Delta^* \le \Delta^*$ neighbors with the same color. We produce such a two-coloring for the graph induced by each color class using two new colors. Hence, the end result is a $(2\chi_d, \frac{11}{20}\Delta^*)$-coloring of G, which is also a valid (2χ d , ∆ * )-coloring.
Proof of Theorem 26. First, observe that the theorem already follows for χ d = 1 by Theorem 2, which states that it is W[1]-hard parameterized by td(G) to decide if a graph admits a (2, ∆ * )-coloring. Let G 1 be the graph produced in the reduction of Theorem 2. By repeated composition we will construct, for any χ d , a graph G χ d such that either G χ d admits a (2χ d , ∆ * )-coloring, or it does not admit a (3χ d − 1, ∆ * )-coloring, depending on whether G 1 admits a (2, ∆ * )-coloring. Suppose that we have constructed the graph G χ d , for some χ d . We describe how to build the graph G χ d +1 . We start with a copy of G 1 , which we call the main part of our construction. We will add to this many disjoint copies of G χ d and appropriately connect them to G 1 to obtain G χ d +1 .
Recall that the graph G 1 contains two palette vertices p A , p B , each connected to ∆ * neighbors p i j , i ∈ {1, . . . , ∆ * }, j ∈ {A, B} with both edges and equality gadgets. Furthermore, recall that for two colors, an equality gadget with endpoints p j , p i j is an independent set on 2∆ * + 1 vertices which are common neighbors of p j and p i j . For each j ∈ {A, B}, each i ∈ {1, . . . , ∆ * }, and each internal vertex v of the equality gadget Q(p j , p i j ) added in Step 3 we add to the main graph $\binom{3\chi_d + 2}{3\chi_d}\Delta^* + 1$ disjoint copies of G χ d and connect all their vertices to p j , p i j , and v. Now, for every vertex v of G 1 that is not part of the palette (that is, every vertex that was not constructed in Steps 1-5), we add another $\binom{3\chi_d + 2}{3\chi_d}\Delta^* + 1$ disjoint copies of G χ d and connect all their vertices to p A , p B , and v.
This completes the construction. We now need to establish three properties: that if G 1 admits a (2, ∆ * )-coloring then G χ d +1 admits a (2χ d + 2, ∆ * )-coloring; that if G 1 does not admit a (2, ∆ * )-coloring then G χ d +1 does not admit a (3χ d + 2, ∆ * )-coloring; and that the tree-depth of G χ d +1 did not increase too much.
We proceed by induction and assume that all the above have been shown for G χ d . For the first property, if G 1 admits a (2, ∆ * )-coloring and G χ d admits a (2χ d , ∆ * )-coloring, then we can construct a coloring of G χ d +1 by taking the same coloring with 2χ d colors for all the copies of G χ d , and using two new colors to color the main graph G 1 .
For the second property, suppose that we know that a (3χ d − 1, ∆ * )-coloring of G χ d implies the existence of a (2, ∆ * )-coloring of G 1 . We want to show that a (3χ d + 2, ∆ * )-coloring of G χ d +1 also implies a (2, ∆ * )-coloring of G 1 . Suppose then that we have such a (3χ d + 2, ∆ * )-coloring of G χ d +1 . If a copy of G χ d included in G χ d +1 uses at most 3χ d − 1 colors, we are done, since this implies the existence of a (2, ∆ * )-coloring of G 1 . Therefore, assume that all copies of G χ d use at least 3χ d colors.
Consider now two vertices p j , p i j , for some j ∈ {A, B}, i ∈ {1, . . . , ∆ * }. We claim that they must receive the same color. To see this, take an internal vertex v of the equality gadget Q(p j , p i j ) and recall that we have added $\binom{3\chi_d + 2}{3\chi_d}\Delta^* + 1$ disjoint copies of G χ d connected to