Independent Set Reconfiguration Thresholds of Hereditary Graph Classes

Traditionally, reconfiguration problems ask the question whether a given solution of an optimization problem can be transformed to a target solution in a sequence of small steps that preserve feasibility of the intermediate solutions. In this paper, rather than asking this question from an algorithmic perspective, we analyze the combinatorial structure behind it. We consider the problem of reconfiguring one independent set into another, using two different processes: (1) exchanging exactly $k$ vertices in each step, or (2) removing or adding one vertex in each step while ensuring the intermediate sets contain at most $k$ fewer vertices than the initial solution. We are interested in determining the minimum value of $k$ for which this reconfiguration is possible, and bound these threshold values in terms of several structural graph parameters. For hereditary graph classes we identify structures that cause the reconfiguration threshold to be large.


Introduction
Over the past decade, reconfiguration problems have attracted considerable attention from researchers in algorithms and combinatorics [4,5,8,12,14,15,17,21,24]. In this framework, one asks the following question: Given two solutions I, J of a fixed optimization problem, can I be transformed into J by a sequence of small steps that maintain feasibility for all intermediate solutions? Such problems are practically motivated by the fact that it may be impossible to adopt a new production strategy instantaneously if it differs too much from the strategy that is currently in use; changes have to be made in small steps, but production has to keep running throughout. From a theoretical perspective, the study of reconfiguration problems provides deep insights into the structure of the solution space. One of the well-studied examples is when the solution space consists of all the independent sets of a graph (optionally all having a prescribed size). In this case, three types of reconfiguration rules have been considered. These are naturally explained using tokens on vertices of the graph. In Token Addition Removal (TAR) [15,21], there is a token on every vertex of the initial independent set, and there is a buffer of tokens, initially empty. A step consists of removing a token from a vertex and placing it in the buffer, or placing a buffer token onto a vertex of the graph. The set of vertices with tokens must form an independent set at all times, and the goal is to move the tokens from the initial to the target independent set while ensuring the buffer size never exceeds a given threshold. In Token Sliding (TS) [14,17], a step consists of replacing one vertex v in the independent set by a neighbor of v (the token slides along an edge). In Token Jumping (TJ) [17], a step also consists of replacing a single vertex, but the newly added vertex need not be adjacent to the replaced vertex (the token jumps).
Token jumping reconfiguration is equivalent to TAR reconfiguration with a buffer of size one.
As mentioned, the goal in reconfiguring independent sets is to go from one given independent set I to another one J by a sequence of small steps. In the TS and TJ models, a step involves moving only a single token. This is ideal, but unfortunately reconfiguration is often impossible in the TS or TJ model. Reconfiguration in the TAR model is always possible if one makes the buffer size sufficiently large. However, having a large buffer size is undesirable. We are interested in determining the minimum buffer size that is sufficient to ensure any independent set in a given graph G can be reconfigured to any target independent set of the same size. We call this minimum the TAR reconfiguration threshold (precise definitions in Section 2). Our aim is to bound the threshold in terms of properties of the graph, and to identify the structures contained in hereditary graph classes that cause the thresholds to be large. We also generalize the TJ model to Multiple Token Jumping (MTJ), where in each step a prescribed number of tokens may be moved simultaneously. In the MTJ model, the question becomes: What is the minimum number of simultaneously jumping tokens needed to ensure any reconfiguration is possible? This quantity is called the MTJ reconfiguration threshold.
Our contribution. We provide upper and lower bounds on the MTJ and TAR reconfiguration thresholds in terms of several graph parameters. In general, our bounds apply to the reconfiguration thresholds of hereditary graph classes. The threshold of a graph class is the supremum of the threshold values of the graphs in that class: it is the smallest value k such that for any graph in the class, any source independent set I in that graph can be reconfigured into any target independent set J using steps of size k (for MTJ) or a buffer of size k (for TAR).
Figure 1: The bipartite structures responsible for large MTJ and TAR reconfiguration thresholds, respectively. (a) A pumpkin of size 18. (b) A graph of treewidth two with a complete binary tree T of depth two as a bipartite topological double minor. A pumpkin consists of odd-length vertex-disjoint paths between two vertices. The special form of topological minor represents each vertex of the tree T by an edge or even cycle in G, and each edge of T by two odd-length paths connecting vertices in opposite partite sets in G.
The MTJ reconfiguration threshold of structurally very simple graphs may nevertheless be very large. For example, an even cycle with 2n vertices can be partitioned into two independent sets I and J of size n each. Any MTJ reconfiguration of I into J requires a jump of n vertices, and this is trivially sufficient. Since a cycle has a feedback vertex set (FVS, see Section 2) of size one, the MTJ threshold cannot be bounded in terms of the size of a minimum feedback vertex set. However, we prove that the threshold is upper-bounded by the size of a minimum vertex cover of G. Although this bound is tight in the worst case, there are many graph classes with a small MTJ threshold even though they require a large vertex cover. Trees, for example, have MTJ threshold at most one. We therefore introduce the notion of a pumpkin, which consists of two vertices connected by at least two vertex-disjoint paths of odd length (Figure 1a). The size of a pumpkin is the total number of vertices in the structure. We characterize the MTJ reconfiguration threshold of a hereditary graph class Π in terms of the size of the largest pumpkin it contains: the MTJ reconfiguration threshold is upper- and lower-bounded in terms of the largest pumpkin contained in a bipartite graph in Π.
TAR reconfiguration is more versatile than MTJ reconfiguration. In the concrete example of a 2n-cycle discussed above, its MTJ threshold is n while any pair of independent sets can be reconfigured in the TAR model using a buffer of size two. Moreover, we show that any graph that has a feedback vertex set of size k has TAR reconfiguration threshold at most k + 1, and reconfiguring one side of the complete bipartite graph $K_{n,n}$ to the other side shows that this is tight. Our main result concerning TAR reconfiguration states that the TAR reconfiguration threshold of any graph is upper-bounded by its pathwidth.
Somewhat surprisingly, there are graphs of constant treewidth (treewidth 2 suffices) for which the TAR reconfiguration threshold is arbitrarily large. We also introduce the concept of bipartite topological double minor (BTD-minor), see Figure 1b, and show using an isoperimetric inequality that any hereditary graph class containing a graph having a complete binary tree of depth d as a BTD-minor, has TAR reconfiguration threshold Ω(d). We conjecture that the TAR reconfiguration threshold can also be upper-bounded in terms of the depth of the largest complete binary tree BTD-minor, but we have not been able to prove this (see Section 6).
We require the restriction to hereditary graph classes in some of our statements to be able to develop meaningful lower bounds on reconfiguration thresholds, as explained next. Let G be the disjoint union of $K_{n,n}$ and a graph H, and let I and J be the two partite sets of $K_{n,n}$. One can verify that I can be reconfigured to J by jumps of size at most one if and only if H has an independent set of size n − 1. Similarly, I can be TAR reconfigured to J using a buffer of size k if and only if H has an independent set of size n − k. Since determining the size of a maximum independent set is NP-hard, there are no good characterizations of this quantity. When developing lower bounds on the threshold of a hereditary graph class Π, this issue disappears since the reconfiguration threshold of any class containing the graph G above is at least as high as the threshold of $K_{n,n}$ (which must be contained in Π if G is), which is n. The restriction to hereditary graph classes therefore enables us to focus our attention on reconfiguration problems where all vertices in the graph are contained in either the source or target independent set, thereby avoiding the obstacle that the reconfiguration threshold matches the size of a maximum independent set.
Applications. The MTJ and TAR reconfiguration thresholds play an important role in statistical physics and wireless communication networks. To understand the importance of the TAR reconfiguration threshold, consider the following process: In a graph G, nodes are trying to become active (transmit information) at some rate, independently of each other in a distributed manner. When a potential activation occurs at a node, it can only become active if none of its neighboring nodes are active at that moment (as otherwise the transmissions would interfere). An active node deactivates at some rate independent of the other processes. At any point in time, the set of active nodes in this process forms an independent set of the graph. In statistical physics, this process is known as Glauber dynamics with hard-core interaction. This activity process on graphs has many applications in different fields of study. Loosely speaking, when the activation rate is large, in the long run the above process always tries to stay in a maximum independent set. For the graphs with more than one maximum independent set, it is interesting to study the time this process takes to reach a target independent set, starting from some specific independent set. This time has been shown to depend crucially upon what we call the TAR reconfiguration threshold of the underlying graph [22]. In particular, the mixing time of the Glauber dynamics on a graph increases exponentially with its TAR reconfiguration threshold, and hence the Glauber dynamics on the graph is fast mixing if and only if the TAR reconfiguration threshold is small.
The MTJ reconfiguration threshold of a graph G can be interpreted in the following way. Consider the auxiliary graph whose vertices correspond to size-s independent sets in G for some fixed s, with an edge between vertices representing sets I, J if $|I \setminus J| \le k$. Then the MTJ reconfiguration threshold is at most k if and only if this auxiliary graph is connected for all s. The MTJ reconfiguration threshold therefore has applications in the parallel Glauber dynamics (PGD) [16,18], where the MTJ reconfiguration threshold provides the jump size required to make the underlying Markov process ergodic.
Organization. The succeeding sections are organized as follows. In Section 2 we provide graph-theoretic preliminaries. In Section 3 we provide a formal description of the two types of reconfiguration. In Section 4 we analyze MTJ reconfiguration. Section 5 deals with TAR reconfiguration.

Preliminaries
In this section we give the most important graph-theoretic definitions. Notions not defined here can be found in one of the textbooks [7,9]. A graph is a pair G = (V, E), where V is the set of vertices, and E is the set of edges. We also use V(G) and E(G) to refer to the vertex and edge set of G, when convenient. All graphs we consider are finite, simple, and undirected. For U ⊆ V we denote by G − U the graph obtained from G by removing the vertices in U and their incident edges. A set U ⊆ V is called an independent set of G if $\{u, v\} \notin E$ for any u, v ∈ U. The symmetric difference of two sets U and $U'$ is $U \Delta U' := (U \setminus U') \cup (U' \setminus U)$. A set U ⊆ V is a vertex cover of G if every edge of G has at least one endpoint in U. The minimum cardinality of a vertex cover of G is denoted by vc(G). A set U ⊆ V is a feedback vertex set if G − U is acyclic (a forest). The minimum cardinality of a feedback vertex set of G is denoted by fvs(G). For a vertex v, denote by $N_G(v)$ the set of its neighbors (excluding v itself). The open and closed neighborhood of a set U ⊆ V are $N_G(U) := \bigcup_{s \in U} N_G(s) \setminus U$ and $N_G[U] := \bigcup_{s \in U} N_G(s) \cup U$, respectively. We omit the subscript when it is clear from the context.
A graph class is a (possibly infinite) collection of graphs. A graph class Π is said to be hereditary if given any graph G ∈ Π, any induced subgraph of G belongs to the class Π as well. A graph is bipartite if its vertex set can be partitioned into two independent sets I and J, which are also called the partite sets. We sometimes denote such a bipartite graph by G = (I ∪ J, E). A bipartite graph is balanced if |I| = |J|. A matching is a set of edges that do not share any endpoints. A matching covers a vertex v if it contains an edge incident on v. A matching is perfect if it covers all vertices. We will utilize the following well-known consequence of Kőnig's theorem.
Fact 1. A bipartite graph G = (I ∪ J, E) has a matching covering I if and only if $|N_G(S)| \ge |S|$ for all S ⊆ I.
Definition 1 ([7, §7.2]). A path decomposition of a graph G = (V, E) is a sequence $P = (X_1, X_2, \ldots, X_r)$ of subsets of V called bags, satisfying the following conditions:
(P1) $\bigcup_{i=1}^{r} X_i = V$. In other words, every vertex of G is in at least one bag.
(P2) For every edge $\{u, v\} \in E$, there exists $\ell \in \{1, 2, \ldots, r\}$ such that the bag $X_\ell$ contains both u and v.
(P3) For every u ∈ V, if $u \in X_i \cap X_k$ for some $i \le k$, then $u \in X_j$ also for each j such that $i \le j \le k$. In other words, the indices of the bags containing u form an interval in $\{1, 2, \ldots, r\}$.
The width of a path decomposition $(X_1, \ldots, X_r)$ is $\max_{1 \le i \le r} |X_i| - 1$. The pathwidth of G, denoted by pw(G), is the minimum possible width of a path decomposition of G. A path decomposition $(X_1, X_2, \ldots, X_r)$ of a graph G is nice if the following holds: $X_1 = X_r = \emptyset$, and for every $i \in \{1, \ldots, r-1\}$ there is either a vertex $v \notin X_i$ such that $X_{i+1} = X_i \cup \{v\}$, or a vertex $w \in X_i$ such that $X_{i+1} = X_i \setminus \{w\}$. It is well-known (cf. [7, Lemma 7.2]) that every graph admits a nice path decomposition of width pw(G). For any path decomposition $P = (X_1, X_2, \ldots, X_r)$ of G = (V, E), and any vertex v ∈ V, define $l_P(v) = \min\{i : v \in X_i\}$ and $r_P(v) = \max\{i : v \in X_i\}$, i.e., $l_P(v)$ and $r_P(v)$ respectively denote the index of the first and last bag containing v. Note that if P is nice, then $l_P(\cdot)$ and $r_P(\cdot)$ are injective maps over the set of vertices.
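The conditions (P1) to (P3) of Definition 1 can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a vertex list and an edge list and the decomposition as a list of Python sets; the function names `is_path_decomposition` and `width` are illustrative and not part of the paper.

```python
def is_path_decomposition(vertices, edges, bags):
    """Check conditions (P1)-(P3) for a sequence of bags (sets of vertices)."""
    bags = [set(b) for b in bags]
    # (P1) the union of all bags is exactly the vertex set
    if set().union(*bags) != set(vertices):
        return False
    # (P2) every edge has both endpoints together in some bag
    for u, v in edges:
        if not any(u in b and v in b for b in bags):
            return False
    # (P3) the indices of the bags containing a vertex form an interval
    for v in vertices:
        idx = [i for i, b in enumerate(bags) if v in b]
        if idx and idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True


def width(bags):
    """Width of a decomposition: size of the largest bag minus one."""
    return max(len(b) for b in bags) - 1


# A path on four vertices has a path decomposition of width one.
path_vertices = [0, 1, 2, 3]
path_edges = [(0, 1), (1, 2), (2, 3)]
good_bags = [{0, 1}, {1, 2}, {2, 3}]
bad_bags = [{0, 1}, {2, 3}, {1, 2}]  # vertex 1 violates (P3)
```

Here `good_bags` satisfies all three conditions, while in `bad_bags` the bags containing vertex 1 are not consecutive.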

Definitions and Basic Facts for Reconfiguration
In this section we formally define the two notions of reconfiguration and establish some basic facts.
Multiple Token Jump (MTJ). Given any two independent sets I and J, with |I| = |J|, we say that I can be k-MTJ reconfigured to J, if there exists a finite sequence of sets $(I = W_0, W_1, W_2, \ldots, W_n, W_{n+1} = J)$ for some $n \ge 0$, such that for all $i \in \{0, \ldots, n+1\}$ the set $W_i$ is independent in G with $|W_i| = |I|$, and $|W_i \setminus W_{i+1}| \le k$ for all $i \in \{0, \ldots, n\}$. A step from $W_i$ to $W_{i+1}$ in the reconfiguration process with $|W_i \setminus W_{i+1}| = k$ is called a k-TJ move. Given a graph G = (V, E), define mtj(G, s) as the minimum value of k such that any two independent sets of size s in G can be k-MTJ reconfigured to each other. Now define $\mathrm{mtj}(G) := \max_{1 \le s \le |V|} \mathrm{mtj}(G, s)$. Our goal is to characterize the value of mtj(G) in terms of certain parameters of the graph G. We call mtj(G) the MTJ reconfiguration threshold of the graph G. The MTJ reconfiguration threshold of a graph class Π is defined as $\mathrm{mtj}(\Pi) := \sup_{G \in \Pi} \mathrm{mtj}(G)$.
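For small graphs, mtj(G) can be computed by brute force directly from this definition: for each size s, find the smallest k for which the auxiliary graph on the size-s independent sets (two sets adjacent when they differ in at most k vertices) is connected. The sketch below assumes vertices are numbered 0..n−1 and adopts the convention that mtj(G, s) = 0 when there are fewer than two independent sets of size s; all function names are illustrative. The running time is exponential, so it is only meant for sanity checks on tiny instances.

```python
from itertools import combinations


def independent_sets(n, edges, s):
    """All independent sets of size s in the graph on vertices 0..n-1."""
    adj = {frozenset(e) for e in edges}
    return [frozenset(c) for c in combinations(range(n), s)
            if all(frozenset(p) not in adj for p in combinations(c, 2))]


def connected_with_jumps(sets, k):
    """Is the auxiliary graph (A ~ B iff A and B differ in <= k vertices) connected?"""
    if len(sets) <= 1:
        return True
    seen, stack = {sets[0]}, [sets[0]]
    while stack:
        a = stack.pop()
        for b in sets:
            if b not in seen and len(a - b) <= k:
                seen.add(b)
                stack.append(b)
    return len(seen) == len(sets)


def mtj_s(n, edges, s):
    """Smallest jump size that connects all independent sets of size s."""
    sets = independent_sets(n, edges, s)
    if len(sets) <= 1:
        return 0  # convention of this sketch: nothing to reconfigure
    # k = s always works, since two sets of size s differ in at most s vertices
    return next(k for k in range(1, s + 1) if connected_with_jumps(sets, k))


def mtj(n, edges):
    return max(mtj_s(n, edges, s) for s in range(1, n + 1))


def cycle(m):
    """Edge list of the cycle on vertices 0..m-1."""
    return [(i, (i + 1) % m) for i in range(m)]
```

On the even cycles with 6 and 8 vertices this returns 3 and 4, matching the value n for a 2n-cycle stated in the introduction.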
Token Addition Removal (TAR). Given any two independent sets I and J, with |I| = |J|, we say that I can be k-TAR reconfigured to J, if there exists a finite sequence of sets $(I = W_0, W_1, W_2, \ldots, W_n, W_{n+1} = J)$ for some $n \ge 0$, such that $W_i$ is independent in G and $|I| - |W_i| \le k$ for all $i \in \{0, \ldots, n+1\}$, and $|W_{i-1} \Delta W_i| \le 1$ for all $i \in \{1, \ldots, n+1\}$. We refer to the quantity $B_i := |I| - |W_i|$ as the buffer size at step i: the tokens that were on the initial independent set, and are not on the current independent set $W_i$, are placed in the buffer. Define tar(G, s) to be the smallest buffer size k such that any two independent sets of size s can be k-TAR reconfigured to each other. Define $\mathrm{tar}(G) := \max_{1 \le s \le |V|} \mathrm{tar}(G, s)$. As before, we call tar(G) the TAR reconfiguration threshold of the graph G, and extend the same terminology to graph classes Π by defining $\mathrm{tar}(\Pi) := \sup_{G \in \Pi} \mathrm{tar}(G)$.
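The TAR threshold of a small graph can likewise be explored by exhaustive search over independent sets. The sketch below makes one assumption beyond the formal definition, taken from the token description in the introduction: the buffer is never negative, so intermediate sets have between s − k and s vertices. Vertices are numbered 0..n−1, tar(G, s) is set to 0 when fewer than two independent sets of size s exist, and the function names are illustrative.

```python
from itertools import combinations


def all_independent_sets(n, edges):
    """All independent sets (of every size) of the graph on vertices 0..n-1."""
    adj = {frozenset(e) for e in edges}
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)
            if all(frozenset(p) not in adj for p in combinations(c, 2))]


def tar_s(n, edges, s):
    """Smallest buffer size connecting all independent sets of size s."""
    ind = set(all_independent_sets(n, edges))
    targets = [w for w in ind if len(w) == s]
    if len(targets) <= 1:
        return 0  # convention of this sketch: nothing to reconfigure
    for k in range(1, s + 1):
        # search over independent sets W with s - k <= |W| <= s,
        # where one step adds or removes a single vertex
        seen, stack = {targets[0]}, [targets[0]]
        while stack:
            w = stack.pop()
            for v in range(n):
                nxt = w - {v} if v in w else w | {v}
                if nxt in ind and s - k <= len(nxt) <= s and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if all(t in seen for t in targets):
            return k
    raise AssertionError("unreachable: a buffer of size s always suffices")


def tar(n, edges):
    return max(tar_s(n, edges, s) for s in range(1, n + 1))


def cycle(m):
    """Edge list of the cycle on vertices 0..m-1."""
    return [(i, (i + 1) % m) for i in range(m)]
```

For the cycles on 6 and 8 vertices the search returns 2, in line with the remark in the introduction that a buffer of size two suffices on a 2n-cycle.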

Facts on Reconfiguration.
Observe that for any graph G, it holds that mtj(G) = 1 if and only if tar(G) = 1. In general, the TAR reconfiguration threshold is at most the MTJ reconfiguration threshold. Indeed, to see this, observe that each k-TJ move can be thought of as a sequence of 2k steps with maximum buffer size k. First, sequentially remove the k vertices that are jumping away, placing their tokens in the buffer; then sequentially place the buffer tokens on the k new vertices in the independent set.
Proposition 1. Let G be a graph with independent sets I and J of equal size. If I \ J can be k-TAR reconfigured (resp. k-MTJ reconfigured) to J \ I in the graph G[I∆J], then I can be k-TAR reconfigured (resp. k-MTJ reconfigured) to J in G.
Proof. Consider a sequence of independent sets $(I \setminus J = W_0, \ldots, W_{n+1} = J \setminus I)$ in G[I∆J] that reconfigures I \ J to J \ I. Since I and J are independent in G, no vertex of I∆J is adjacent to a vertex of I ∩ J. Hence $W'_i := W_i \cup (I \cap J)$ is an independent set in G for all i, and the sequence $(W'_0, \ldots, W'_{n+1})$ reconfigures (I \ J) ∪ (I ∩ J) = I to (J \ I) ∪ (I ∩ J) = J in G. The step size and buffer size of this sequence in G are not greater than the corresponding values for the sequence in G[I∆J], which completes the proof.
Proposition 1 shows that to upper-bound the TAR or MTJ reconfiguration threshold, it suffices to do so in balanced bipartite graphs where the source and target configurations are disjoint; note that G[I∆J] is balanced bipartite and I \ J and J \ I are disjoint. We will frequently exploit this in our proofs. For any graph class Π, let $\Pi_{\mathrm{bip}}$ denote the set of bipartite graphs in Π. The following proposition shows that the reconfiguration threshold of a hereditary graph class is determined by the behavior of the bipartite graphs in the class. Note that for hereditary classes Π, the class $\Pi_{\mathrm{bip}}$ is also hereditary.
Proposition 2. For any hereditary graph class Π, we have $\mathrm{mtj}(\Pi) = \mathrm{mtj}(\Pi_{\mathrm{bip}})$ and $\mathrm{tar}(\Pi) = \mathrm{tar}(\Pi_{\mathrm{bip}})$.
Proof. The definitions of the thresholds imply that $\mathrm{mtj}(\Pi) \ge \mathrm{mtj}(\Pi_{\mathrm{bip}})$ and $\mathrm{tar}(\Pi) \ge \mathrm{tar}(\Pi_{\mathrm{bip}})$, since $\Pi \supseteq \Pi_{\mathrm{bip}}$. For the reverse direction, assume that the reconfiguration threshold of $\Pi_{\mathrm{bip}}$ (in one of the models) is at most k and consider any graph G ∈ Π with independent sets I and J of equal size. By Proposition 1 the cost of reconfiguring I to J is bounded by the cost of reconfiguring I \ J to J \ I in G[I∆J]. Since G[I∆J] is bipartite and Π is hereditary, we have $G[I \Delta J] \in \Pi_{\mathrm{bip}}$, and hence the cost of reconfiguring in G[I∆J] is at most k. So reconfiguring I to J can be done with cost at most k in this model.
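The observation at the start of this subsection, that a k-TJ move can be simulated by 2k TAR steps with buffer size at most k, can be made concrete as follows. This is a minimal sketch; the function name `jump_as_tar_steps` is illustrative, vertex labels are assumed to be sortable, and both arguments are assumed to be independent sets of equal size.

```python
def jump_as_tar_steps(current, target):
    """Expand one jump current -> target into single-token TAR steps.

    Returns the list of intermediate sets, starting with current and
    ending with target."""
    current, target = set(current), set(target)
    seq = [frozenset(current)]
    # first the k removals: the jumping tokens are placed in the buffer
    for v in sorted(current - target):
        current.remove(v)
        seq.append(frozenset(current))
    # then the k additions: buffer tokens are placed on the new vertices
    for v in sorted(target - current):
        current.add(v)
        seq.append(frozenset(current))
    return seq


# One jump of size 3 on the 6-cycle becomes 6 single-token steps.
steps = jump_as_tar_steps({0, 2, 4}, {1, 3, 5})
max_buffer = len(steps[0]) - min(len(w) for w in steps)
```

Every intermediate set is a subset of either the source or the target set of the jump, so independence is preserved throughout.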

Threshold for Multiple Token Jump Reconfiguration
We start our discussion of token jump reconfiguration by recalling the following known result.
Theorem 3. If G is a forest, then $\mathrm{mtj}(G) \le 1$.
The intuition behind this result is that since a forest does not contain any cycle, one can start reconfiguring from the leaf nodes or the isolated vertices, each of which has at most one neighbor from the target configuration. For arbitrary graphs, the above procedure does not work since there may not be any leaves or isolated vertices. But if a graph G has a small vertex cover, then its MTJ reconfiguration threshold is again small.
Theorem 4. For any graph G, we have $\mathrm{mtj}(G) \le \max(\mathrm{vc}(G), 1)$.
Proof. We prove the theorem using induction on the number n of vertices in G. For n = 1 the claim is trivially true, so consider a graph G = (V, E) with source and target independent sets I and J of equal size and |V| > 1. Our induction hypothesis is that any graph $G'$ with fewer than |V| vertices has MTJ reconfiguration threshold upper-bounded by $\max(\mathrm{vc}(G'), 1)$.
By Proposition 1 it is enough to show that in the graph G[I∆J] induced by I∆J, starting from $I' = I \setminus J$, one can construct a sequence of MTJ moves to reach the configuration $J' = J \setminus I$ with step size at most max(vc(G), 1). Let $V' := I \Delta J$. Note that $\mathrm{vc}(G[I \Delta J]) \le \mathrm{vc}(G)$, and let $S \subseteq V'$ be a vertex cover of G[I∆J] of cardinality at most vc(G). Since S is a vertex cover of G[I∆J], there is no edge between any two vertices of $V' \setminus S$. Assume without loss of generality that $|I' \cap S| \ge |J' \cap S|$ (otherwise swap the roles of $I'$ and $J'$, which does not affect reconfigurability). We distinguish three cases.
Case 1. If the vertex cover is empty (S = ∅), then $G[I' \Delta J']$ has no edges. Consequently, all vertex subsets in the graph are independent, and we can reconfigure $I'$ to $J'$ by jumping one token at a time. By Proposition 1, this implies I can be reconfigured to J in G using jumps of size 1.
Case 2. Suppose that $|I'| = |J'| \le \mathrm{vc}(G)$. Then we can jump the tokens from $I'$ onto $J'$ in a single step of size at most vc(G), and complete the argument using Proposition 1.

Case 3. If the previous cases do not apply, we claim that $s := |I' \cap S| > 0$. Indeed, if s = 0, then since $|I' \cap S| \ge |J' \cap S|$ we would have $I' \cap S = J' \cap S = \emptyset$, implying that S = ∅ and that the first case applies. Moreover, we have $|J' \setminus S| \ge s$, since otherwise $|J'| = |J' \cap S| + |J' \setminus S| < |J' \cap S| + |I' \cap S| \le |S| \le \mathrm{vc}(G)$ and we are in the previous case. Now let Z be an arbitrary set of s vertices from $J' \setminus S$. Choose all the vertices in $I' \cap S$ and jump their tokens to Z, i.e., remove the vertices $I' \cap S$ from the independent set $I'$ and replace them by Z to obtain $I'' := (I' \setminus S) \cup Z$. The set $I''$ is independent because the fact that S is a vertex cover implies that the only neighbors of $Z \subseteq J' \setminus S$ belong to the set S, while $I''$ contains no vertex from S. Since $|I' \cap S| \le |S| \le \mathrm{vc}(G)$, the step size of this move is at most vc(G).
Consider the graph $G'$ which is obtained from G[I∆J] by removing Z and $I' \cap S$, which is again a balanced bipartite graph. Note that since the only neighbors of Z belong to $I'$ (since G[I∆J] is bipartite and $Z \subseteq J'$), and belong to S (since S is a vertex cover and Z ∩ S = ∅), it follows that $G'$ contains no vertex that is a neighbor of Z in G[I∆J]. Consequently, the union of Z with any independent set in $G'$ is independent in G[I∆J]. Since $G'$ is smaller than G, by induction one can reconfigure $I' \setminus S$ to $J' \setminus Z$ in the graph $G'$ with steps of size at most $\max(\mathrm{vc}(G'), 1) \le \mathrm{vc}(G)$, where the inequality uses that $G'$ is an induced subgraph of G and that $\mathrm{vc}(G) \ge s \ge 1$ in this case. Adding Z to each set in the corresponding reconfiguration sequence produces a sequence that reconfigures $(I' \setminus S) \cup Z$ to $J'$ in G[I∆J]. By inserting the step from $I'$ to $(I' \setminus S) \cup Z$ at the front of this sequence, we obtain a reconfiguration from $I'$ to $J'$ with steps of size at most vc(G) in G[I∆J]. By Proposition 1 this implies that I can be reconfigured to J with steps of size at most vc(G) in G, completing the proof for this case.
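The case analysis in this proof is constructive. The following sketch turns it into a recursive procedure, assuming the inputs are already reduced as in Proposition 1: A and B are disjoint independent sets of equal size, and S is any vertex cover of the graph induced by A ∪ B (not necessarily a minimum one, so the step size is bounded by max(|S|, 1)). The function name `reconfigure` is illustrative, and vertex labels are assumed to be sortable.

```python
def reconfigure(A, B, S):
    """Jump sequence from A to B following the three cases of the proof.

    A, B: disjoint independent sets of equal size.
    S:    a vertex cover of the graph induced by A | B."""
    A, B = frozenset(A), frozenset(B)
    S = frozenset(S) & (A | B)
    if not A:
        return [A]
    if len(A & S) < len(B & S):
        # w.l.o.g. |A & S| >= |B & S|: solve the swapped problem and reverse it
        return reconfigure(B, A, S)[::-1]
    if not S:
        # Case 1: no edges at all, so tokens can jump one at a time
        seq, cur = [A], set(A)
        for a, b in zip(sorted(A), sorted(B)):
            cur = (cur - {a}) | {b}
            seq.append(frozenset(cur))
        return seq
    if len(A) <= len(S):
        # Case 2: a single jump of size |A| <= |S|
        return [A, B]
    # Case 3: jump the tokens on A & S onto a set Z of equally many
    # target vertices outside the cover, then recurse on the rest
    Z = frozenset(sorted(B - S)[:len(A & S)])
    rest = reconfigure(A - S, B - Z, S - A)
    return [A] + [W | Z for W in rest]


# Example: vertex 0 is adjacent to 4..7 and vertex 4 is adjacent to 0..3,
# so {0, 4} is a vertex cover of size two.
example_edges = [(0, b) for b in (4, 5, 6, 7)] + [(4, a) for a in (1, 2, 3)]
example_seq = reconfigure({0, 1, 2, 3}, {4, 5, 6, 7}, {0, 4})
```

Reversing the sequence of the swapped problem is valid because all sets in a sequence have the same size, so a jump and its reverse move the same number of tokens.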
An even cycle of length 2n has MTJ reconfiguration threshold n. Since its vertex cover number is n, Theorem 4 is best-possible. Long cycles are not the only graphs whose MTJ reconfiguration threshold equals half the size of the vertex set. Bistable graphs (defined below), of which the pumpkin structure defined in the introduction is a special case, also have this property.

MTJ Reconfiguration Threshold in Terms of Bistable Rank
In this section we introduce the notion of a bistable graph, derive several properties of bistable graphs, and use these to bound the MTJ reconfiguration threshold in terms of the size of the largest induced bistable subgraph. The resulting bounds on the MTJ reconfiguration threshold are tight, but can be hard to apply to specific graph classes: it may be difficult to estimate the size of the largest induced bistable graph, or even to determine whether a given graph is bistable or not. In Section 4.2 we will therefore relate the size of the largest induced bistable subgraph to the size of the largest pumpkin subgraph. This will result in upper and lower bounds on the MTJ reconfiguration threshold in terms of the largest pumpkin structure contained in the graph (class), which is arguably a more insightful parameter. The resulting bound will not be best-possible, however.

Definition 2 (Bistable graphs).
A graph is called bistable if it is connected, bipartite, and has exactly two distinct maximum independent sets formed by the two partite sets in its unique bipartition. The rank of a bistable graph is defined as the size of its maximum independent sets.
Let bi(G) denote the rank of the largest induced bistable subgraph of G. If G contains no induced bistable subgraphs (which can only occur if G has no edges), then we define bi(G) to be one. For a graph class Π we define bi(Π) := sup G∈Π bi(G).
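Definition 2 can be tested by brute force on small graphs. The sketch below assumes vertices are numbered 0..n−1 with n ≥ 1 and enumerates all vertex subsets, so it is only suitable for tiny instances; the function name `is_bistable` is illustrative.

```python
from itertools import combinations


def is_bistable(n, edges):
    """Brute-force test of Definition 2 on the graph with vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # 2-colour the component of vertex 0; this detects odd cycles
    colour, stack = {0: 0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in colour:
                colour[w] = 1 - colour[u]
                stack.append(w)
            elif colour[w] == colour[u]:
                return False  # odd cycle: not bipartite
    if len(colour) < n:
        return False  # not connected
    sides = {frozenset(v for v in colour if colour[v] == c) for c in (0, 1)}
    # find the independent sets of maximum size, trying the largest size first
    for r in range(n, 0, -1):
        found = {frozenset(c) for c in combinations(range(n), r)
                 if all(b not in adj[a] for a, b in combinations(c, 2))}
        if found:
            # bistable iff the maximum independent sets are exactly the two sides
            return found == sides
    return False


six_cycle = [(i, (i + 1) % 6) for i in range(6)]
four_path = [(0, 1), (1, 2), (2, 3)]
```

The 6-cycle is bistable, whereas the path on four vertices is not: besides its two partite sets it has a third maximum independent set consisting of its two endpoints.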
The pumpkin shown in Figure 1a forms an example of a bistable graph. Lemma 5 connects bistable graphs to independent set reconfiguration. Consider the task of reconfiguring the J-partite set to the I-partite set in a balanced bipartite graph G = (I ∪ J, E). If we have a set S ⊆ I such that $|S| \ge |N(S)|$, then one way to make progress in the reconfiguration is to select |S| vertices from N(S) ⊆ J and jump their tokens onto the vertices in S, resulting in a new independent set of the same size. The following lemma shows that when we consider a set S that is minimal with respect to being at least as large as its neighborhood, then the induced subgraph G[N[S]] is bistable. Hence the cost of such a jump of |S| vertices is bounded by bi(G), which will allow us to bound the MTJ reconfiguration threshold.
Lemma 5. Let G = (I ∪ J, E) be a bipartite graph without isolated vertices, and let S ⊆ I be an inclusion-wise minimal nonempty set with $|S| \ge |N_G(S)|$. Then $G' := G[N_G[S]]$ is a bistable graph of rank |S|.
Proof. We have to show that the graph $G'$ induced by the vertices from S and their neighborhood satisfies all conditions for being bistable. Since G is bipartite, $G'$ is as well. Before proving the remaining properties, we establish the following claim.
Claim 1. The graph $G'$ has a matching covering S.
Proof. Assume for a contradiction that no such matching exists. By Fact 1, there is a set $S' \subseteq S$ with $|N_{G'}(S')| = |N_G(S')| < |S'|$. As G has no isolated vertices, we have $|S'| > 1$. Removing an arbitrary vertex from $S'$ to obtain $S''$ then decreases the size of the set by one without increasing the neighborhood size. Hence $|N_G(S'')| \le |S''|$ for the nonempty proper subset $S''$ of S, a contradiction to the minimality of S.
We now show that $G'$ has all properties of a bistable graph.

Connectivity. Assume for a contradiction that $G'$ is not connected. If $G'$ has a connected component with vertex set C that contains at least as many I-vertices as J-vertices, then $S' := C \cap I$ is a strict subset of S with $|S'| \ge |C \cap J| \ge |N_{G'}(S')| = |N_G(S')|$, contradicting minimality of S. Otherwise, all connected components of $G'$ have strictly more J-vertices than I-vertices. Since the J-vertices in $G'$ form the neighborhood of S, this implies that $|N_G(S)| > |S|$, a contradiction to the choice of S. Hence $G'$ is connected.
Balance. Since $G'$ is connected, it has a unique bipartition and it is easy to verify that S is one of the partite sets: $G' = (S \cup J', E')$. Since there is a matching covering S (Claim 1) and all matching partners of vertices in S are distinct and belong to $J'$, we have $|J'| \ge |S|$. We have $|J'| = |N_G(S)| \le |S|$ by assumption on S, establishing that $|J'| = |S|$, which proves that $G'$ is balanced. This implies that a matching in $G'$ that saturates S (which exists by Claim 1) is in fact a perfect matching in $G'$.
Two maximum independent sets. Assume for a contradiction that $G'$ has at least three maximum independent sets. Then there is a maximum independent set in $G'$ that is not equal to either of the two partite sets $J'$ or S; let X be such a maximum independent set. Since $G'$ is bipartite and has a perfect matching M, the set X contains exactly one vertex from each matching edge in M. Now let $\hat{S} := X \cap I$. Since $X \ne J'$ by assumption, it follows that $\hat{S}$ is not empty; since $X \ne S$, it is a proper subset of S. We show that $|N_G(\hat{S})| \le |\hat{S}|$, contradicting minimality of S.
Let $M' \subseteq M$ denote the matching edges intersected by $\hat{S}$. Since X contains one vertex from each edge of M, for all edges in $M \setminus M'$ the J-endpoint of the edge belongs to the independent set X. So the J-endpoint of an edge in $M \setminus M'$ is not in the neighborhood of $\hat{S}$, as X is independent. Consequently, only the matching partners of $\hat{S}$ can be in the neighborhood of $\hat{S}$, implying there are at most $|\hat{S}|$ such neighbors. Hence $|N_G(\hat{S})| \le |\hat{S}|$; a contradiction. It follows that $G'$ has at most two maximum independent sets. To see that it has exactly two, it suffices to observe that since $G'$ has a perfect matching, both its partite sets are maximum independent sets. This establishes that $G'$ satisfies all conditions for being bistable and concludes the proof of Lemma 5.
Now, in the lemma below, we prove two key properties of bistable graphs. They will later be useful to relate the quantities pum(G) and bi(G).
Lemma 6. Let G = (I ∪ J, E) be a bistable graph. Then the following hold:
1. G has a perfect matching covering I (and hence J).
2. G is biconnected.
Proof. (1) By Kőnig's theorem (cf. [23,Thm. 16.2]), the size of a maximum matching in the bipartite graph G equals the size of a minimum vertex cover in G. By Definition 2, the partite sets I and J are maximum independent sets and therefore have equal size. Since the complement of a maximum independent set is a minimum vertex cover, it follows that V(G) \ I = J is a minimum vertex cover. Hence there is a matching of size |J| = |I| in G, which is a perfect matching since it covers 2|J| = |V(G)| vertices.
(2) Assume for a contradiction that G is not biconnected. Let v be a cutvertex and let M be a perfect matching in G, which exists by the previous property. Assume that v ∈ I; the argument for v ∈ J is symmetric. Let u be the matching partner of v under M. Since v is a cutvertex, the graph G − {v} consists of multiple connected components $C_1, \ldots, C_\ell$. Without loss of generality, assume that u is contained in component $C_1$.
For all components $C_i$ with $i \ge 2$, the component contains the same number of I- and J-vertices: for each vertex its matching partner in the opposite partite set belongs to the same component. For component $C_1$ the number of I-vertices is one smaller than the number of J-vertices, since the matching partner of u does not belong to $C_1$. Consider the set S consisting of the J-vertices from $C_1$ along with the I-vertices of all other components $C_2, \ldots, C_\ell$ of G − {v}. The set S is independent in G − {v} since it consists of entire partite sets of different components of the bipartite graph. Since $v \notin S$ it follows that S is also independent in G. As S contains exactly one endpoint from each edge in M (the I-endpoint for matching edges intersecting a component $C_i$ for $i \ge 2$, and the J-endpoint for the remaining matching edges) it follows that S is a maximum independent set in G that differs from I and J; a contradiction to Definition 2.
Proof. We first prove the lower bound on mtj(G). Note that if $G \ne K_1$ contains no induced bistable subgraphs, then G is a collection of isolated vertices, and in that case mtj(G) = bi(G) = 1. Assume then that G contains a nonempty induced bistable subgraph $G' = (I' \cup J', E')$ of rank bi(G). By Definition 2, the sets $I'$ and $J'$ are the only independent sets of size bi(G) in $G'$. It follows that in any MTJ reconfiguration sequence from $I'$ to $J'$, the set $I'$ is immediately followed by $J'$, which requires a jump of $|I'| = |J'| = \mathrm{bi}(G)$ tokens simultaneously. Hence $\mathrm{mtj}(G') \ge \mathrm{bi}(G)$.
We prove the upper bound on mtj(G) by induction on the size of the graph. If G consists of a single vertex, then there is a unique nonempty independent set, so mtj(G) = 0. In the remainder, assume G has more than one vertex and let I and J be two independent sets in G of equal size. By Proposition 1 it suffices to prove that $I' := I \setminus J$ can be MTJ-reconfigured to J \ I in the graph $G' := G[I \Delta J]$ with jumps of size at most bi(G). Assume first that $G'$ has no isolated vertices, and let S ⊆ I \ J be an inclusion-wise minimal nonempty subset of I \ J with the property that $|S| \ge |N_{G'}(S)|$. Such a set exists since $G'$ is a balanced bipartite graph with partite sets I \ J and J \ I, so the set I \ J satisfies the stated condition (but may not yet be minimal). It is easy to verify that since $G'$ has no isolated vertices and S is minimal, we have $|N_{G'}(S)| = |S|$. Since a reconfiguration sequence can be traversed in reverse, we may describe the sequence starting from J \ I: move all tokens from $N_{G'}(S)$ onto S in a single jump of size |S|. By Lemma 5, the graph $G'[N_{G'}[S]]$ is a bistable induced subgraph of G of rank |S|, and therefore $\mathrm{bi}(G) \ge |S|$, which shows that the size of the jump is sufficiently small. We may then invoke induction similarly as in the proof of Theorem 4 to complete the argument. If $G'$ has an isolated vertex, then instead one can jump a token onto this isolated vertex and induct. This concludes the proof of Theorem 7.
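The set S used in this argument can be found by exhaustive search on small instances. The sketch below returns a set of smallest cardinality with $|S| \ge |N(S)|$, which is in particular inclusion-wise minimal; it assumes the graph is given as an adjacency dictionary, that I is an independent set with sortable labels, and the function name `minimal_tight_set` is illustrative.

```python
from itertools import combinations


def minimal_tight_set(I, adj):
    """Return (S, N(S)) for a smallest nonempty S within I with |S| >= |N(S)|.

    Returns None if no such set exists. Because I is independent, the union
    of the neighbourhoods of the vertices in S is disjoint from S."""
    I = sorted(I)
    for r in range(1, len(I) + 1):
        for S in combinations(I, r):
            nbrs = set().union(*(adj[v] for v in S))
            if len(S) >= len(nbrs):
                return set(S), nbrs
    return None


# On the 6-cycle only the whole side {0, 2, 4} qualifies.
cycle_adj = {v: {(v - 1) % 6, (v + 1) % 6} for v in range(6)}
# On the path 0-1-2-3 the leaf 0 alone already qualifies.
path_adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

In the cycle example the resulting jump moves all three tokens at once, while in the path example a single token moves, in line with the thresholds of these two graphs.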
The following corollary characterizes the MTJ reconfiguration threshold of hereditary graph classes. It follows directly from Theorem 7. It applies to all graph classes except the one consisting only of the single-vertex graph $K_1$, for which the reconfiguration threshold is zero but $\mathrm{bi}(K_1) = 1$ by definition.

Proof. By Theorem 7 we have $\mathrm{mtj}(G) \leq \mathrm{bi}(G) \leq \mathrm{bi}(\Pi)$ for all graphs $G \in \Pi$, hence $\mathrm{mtj}(\Pi) \leq \mathrm{bi}(\Pi)$. To prove the converse, consider an arbitrary graph $G \in \Pi$ having at least one edge. Then $G$ contains an induced bistable subgraph $H = (I \cup J, E)$ of rank $\mathrm{bi}(G)$, and since $\Pi$ is hereditary we have $H \in \Pi$. Reconfiguring $I$ to $J$ in $H$ requires a jump of size $|I| = |J| = \mathrm{bi}(G)$ since those are the only two independent sets of that size. Hence $\mathrm{mtj}(\Pi) \geq \mathrm{bi}(G)$ for all $G \in \Pi$ with at least one edge, showing that $\mathrm{mtj}(\Pi) \geq \mathrm{bi}(\Pi)$ if $\Pi$ contains at least one bistable graph. In the exceptional setting that $\Pi$ contains no bistable graph, all graphs in $\Pi$ are edgeless, causing $\mathrm{bi}(\Pi)$ to be one. Since $\Pi \neq \{K_1\}$, the graph consisting of two isolated vertices is contained in $\Pi$, and it has reconfiguration threshold one. Hence the lower bound also holds in this case.

MTJ Reconfiguration Threshold in Terms of Pumpkin Size
In this section we formally introduce the pumpkin structure described in the introduction. We relate pumpkins to bistable graphs to obtain bounds on the MTJ reconfiguration threshold in terms of the size of the largest pumpkin subgraph.

Definition 3 (Pumpkin).
A pumpkin is a graph consisting of two terminal vertices u and v linked by two or more vertex-disjoint paths with an odd number of edges, having no edges or vertices other than those on the paths. A path can consist of the single edge {u, v}. The size of the pumpkin is the total number of vertices.
For a graph $G$ we denote by $\mathrm{pum}(G)$ the size of the largest (not necessarily induced) subgraph isomorphic to a pumpkin that is contained in $G$, or zero if $G$ contains no pumpkin. For a graph class $\Pi$ we define $\mathrm{pum}(\Pi) := \sup_{G \in \Pi} \mathrm{pum}(G)$.
An example of a pumpkin structure is shown in Figure 1a. Observe that a pumpkin is a bipartite graph, since all cycles consist of two uv-paths of odd length and are therefore even. Furthermore, a pumpkin is a balanced bipartite graph: vertices u and v belong to different partite sets since their distance is odd, and on every (odd-length) uv-path in the structure there is an even number of interior vertices, which alternate between the two partite sets. It is not difficult to verify that the two partite sets are the only maximum independent sets in a pumpkin, leading to the following observation.

Observation 1. Every pumpkin graph is bistable.
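Observation 1 can be checked by brute force on small instances. The following is a minimal sketch, assuming that "bistable" means the two partite sets are the only maximum independent sets (our reading of Definition 2, which lies outside this section). The helper names `pumpkin` and `maximum_independent_sets` are hypothetical and introduced only for illustration.

```python
from itertools import combinations

def pumpkin(path_lengths):
    """Build a pumpkin with terminals 0 (u) and 1 (v).

    path_lengths lists the number of edges of each uv-path; every entry
    must be odd, and at most one path may be the single edge {u, v}.
    Returns (number of vertices, list of edges).
    """
    assert len(path_lengths) >= 2
    assert all(length % 2 == 1 for length in path_lengths)
    assert list(path_lengths).count(1) <= 1
    edges, next_vertex = [], 2
    for length in path_lengths:
        prev = 0
        for _ in range(length - 1):          # interior vertices of this path
            edges.append((prev, next_vertex))
            prev, next_vertex = next_vertex, next_vertex + 1
        edges.append((prev, 1))              # close the path at terminal v
    return next_vertex, edges

def maximum_independent_sets(n, edges):
    """Brute force: all independent sets of maximum size (small n only)."""
    for size in range(n, -1, -1):
        found = [set(c) for c in combinations(range(n), size)
                 if not any(a in c and b in c for a, b in edges)]
        if found:
            return found

n, edges = pumpkin([1, 3, 5])
best = maximum_independent_sets(n, edges)
print(n, [sorted(s) for s in best])
```

For the pumpkin with paths of 1, 3 and 5 edges, the search should report exactly two maximum independent sets, each containing half of the vertices, matching the balanced bipartition described above.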
The next theorem shows that the rank of the largest bistable induced subgraph of $G$ can be upper-bounded in terms of the size of $G$'s largest pumpkin subgraph.

Proof. Consider a bistable graph $G = (I \cup J, E)$. If $G$ is acyclic, then any biconnected subgraph of $G$ contains at most two vertices. There is a unique bistable graph with at most two vertices, which consists of a single edge and has rank one. Since any bistable graph is biconnected by Lemma 6, an acyclic bistable graph is therefore a single edge of rank one. In the remainder we assume that $G$ contains a cycle, which implies that $\mathrm{pum}(G) \geq 1$ since any cycle in the bipartite graph $G$ is even and forms a pumpkin.
For ease of notation, define $L := \mathrm{pum}(G)$. Construct a depth-first search (DFS) tree $T$ of $G$, starting at an arbitrary vertex $r$ which becomes the root of the tree. By the structure of the DFS process, we obtain the following property: if $u$ and $v$ are vertices that are adjacent in $G$, then $u$ is an ancestor of $v$ in $T$, or $v$ is an ancestor of $u$. For $v \in T$, we use $T_v$ to denote the subtree of $T$ rooted at $v$. We will often use $T_v$ to refer to the vertices in the tree as well.
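The ancestor property of DFS trees used throughout this proof can be illustrated with a short sketch. The function names below are hypothetical, and the example graph (a 6-cycle with one chord) is our own choice rather than one taken from the paper.

```python
def dfs_parents(adj, root):
    """Return the parent map of a DFS tree of the graph, rooted at root."""
    parent = {root: None}

    def visit(v):
        for w in adj[v]:
            if w not in parent:      # w is discovered from v: tree edge
                parent[w] = v
                visit(w)

    visit(root)
    return parent

def is_ancestor(parent, a, b):
    """True if a lies on the tree path from b up to the root (or a == b)."""
    while b is not None:
        if b == a:
            return True
        b = parent[b]
    return False

def edges_respect_tree(adj, parent):
    """Check that every edge of the graph joins an ancestor to a descendant."""
    return all(is_ancestor(parent, u, v) or is_ancestor(parent, v, u)
               for u in adj for v in adj[u])

# A 6-cycle 0-1-2-3-4-5-0 with the chord {0, 3}.
adj = {0: [1, 5, 3], 1: [0, 2], 2: [1, 3], 3: [2, 4, 0], 4: [3, 5], 5: [4, 0]}
parent = dfs_parents(adj, 0)
print(edges_respect_tree(adj, parent))
```

Because an undirected DFS only backtracks from a vertex after exploring all of its neighbors, no edge can join two vertices in different branches, which is the property the check confirms on this example.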
Claim 2. The depth of $T$ is at most $L^2$.
Proof. Assume for a contradiction that there is a path from the root $r$ of $T$ to a leaf consisting of more than $L^2$ edges. By Lemma 6, graph $G$ is biconnected. The existence of a path of more than $L^2$ edges in a biconnected graph $G$ is known [10, Theorem 1] to imply that $G$ contains a simple cycle of length more than $L$. Since $G$ is bipartite the cycle is even and forms a pumpkin: it splits into two odd paths. So $\mathrm{pum}(G) > L$, a contradiction.
If each vertex in $T$ has at most $L^3 + L^2$ children, then the bound on the depth given by the previous claim implies that $T$ (and therefore $G$) has at most $\sum_{i=0}^{L^2} (L^3 + L^2)^i \leq (L^3 + L^2)^{L^2+1} + 1$ vertices. The rank of $G$, being half its number of vertices, is therefore bounded by a function of $L$ alone; this is the bound referred to as (2). To complete the proof, it therefore suffices to show that no vertex of $T$ has more than $L^3 + L^2$ children. Assume for a contradiction that some vertex $v$ exists with a larger number of children $u_1, \ldots, u_m$, for some $m > L^3 + L^2$. By switching the roles of $I$ and $J$ if needed, we may assume that $v \in I$. For a vertex $w$, let $M_w$ denote the set of its proper ancestors in $T$. We classify the children $u_i$ for $i \in [m]$ into three types:
Type A: Some vertex of $T_{u_i}$ has an edge in $G$ to a vertex in $M_{u_i} \cap J$.
Type B: Vertex $u_i$ is not of type A and $|I \cap T_{u_i}| \neq |J \cap T_{u_i}|$.
Type C: Vertex $u_i$ is not of type A and $|I \cap T_{u_i}| = |J \cap T_{u_i}|$.
Observe that any child of v belongs to exactly one type.

Claim 3. There are fewer than $L^3$ type-A children of $v$.
Proof. Suppose there are at least $L^3$ type-A children of $v$ and assume these are numbered $u_1, \ldots, u_{L^3}$. Each subtree $T_{u_i}$ for $i \in [L^3]$ contains a vertex that has an edge in $G$ to a proper ancestor of $u_i$ in $J$, by our definition of types. Since $v \in I$, this ancestor cannot be $v$, so that in fact it is also a proper ancestor of $v$ and belongs to $M_v$. Since $T$ has depth at most $L^2$ by Claim 2, by the pigeonhole principle there is a vertex $w \in M_v \cap J$ such that $L$ subtrees among $T_{u_1}, \ldots, T_{u_{L^3}}$ contain a vertex that is adjacent to $w$ in $G$. From each such subtree $T_{u_i}$, we obtain a path in $G$ from $v$ to $w$ whose internal vertices belong to $T_{u_i}$, by going from $v$ to $u_i$, then to a neighbor of $w$ in the subtree $T_{u_i}$ using the tree edges, and ending with the edge to $w$. Applying this procedure to each of the $L$ subtrees that connect to $w$ yields at least $L$ internally vertex-disjoint paths from $v$ to $w$. Since $G$ is bipartite and $v$ and $w$ belong to different partite sets, each path connecting $v$ and $w$ is of odd length. Hence this collection of $L$ internally vertex-disjoint paths between $v$ and $w$ forms a pumpkin of size more than $L$: each of the $L$ paths has at least one internal vertex, and together with $v$ and $w$ this gives size at least $L + 2$. This contradicts our choice of $L = \mathrm{pum}(G)$.

Proof. Since $G$ is bistable, it has a perfect matching $M$ by Lemma 6. By the properties of a DFS tree, for each vertex in $T$ its neighbors in $G$ are among its ancestors and descendants in $T$. The numbers of $I$-nodes and $J$-nodes in a subtree $T_{u_i}$ rooted at a type-B child $u_i$ are not equal. Since each vertex in $T_{u_i}$ is assigned a unique neighbor in the other partite set by the perfect matching $M$, it follows that the matching partner of some vertex in $T_{u_i}$ does not belong to $T_{u_i}$, and must therefore be a proper ancestor of $u_i$ by the properties of DFS trees.
Since there are at most depth(v) + 1 ancestors of v to use as matching partners, and each type-B child uses a different ancestor as a matching partner for one of its vertices, the number of type-B subtrees is at most depth(v) + 1.
Since the depth of $T$ is at most $L^2$ by Claim 2, any vertex $v$ that is not a leaf has depth at most $L^2 - 1$. Hence the number of type-B children of $v$ is at most $L^2$.

Claim 5. No child of v is of type C.
Proof. Suppose there exists a type-C child $u_i$ of $v$. Subtree $T_{u_i}$ does not contain a vertex adjacent to $M_{u_i} \cap J$, else $u_i$ would have been of type A. Any $G$-neighbor of a vertex in $T_{u_i}$ that is not contained in $T_{u_i}$ is a proper ancestor of $u_i$, by the properties of DFS trees. Hence the set $J' = (J \setminus T_{u_i}) \cup (I \cap T_{u_i})$ forms an independent set in $G$. By the definition of type-C vertices, $|J \cap T_{u_i}| = |I \cap T_{u_i}|$, so that $|J'| = |J|$. This shows that $J'$ is a maximum independent set distinct from $I$ and $J$, contradicting the assumption that $G$ is bistable.
The preceding claims show that no vertex of $T$ has more than $L^2 + L^3$ children, which completes the proof of Theorem 8 using (2).
The following theorem is our main result on the MTJ reconfiguration threshold. It bounds the MTJ reconfiguration threshold of a hereditary graph class $\Pi$ in terms of the maximum size of a pumpkin subgraph of a graph in $\Pi_{\mathrm{bip}}$. Recall that $\Pi_{\mathrm{bip}}$ contains the bipartite graphs in $\Pi$.
Proof. We combine the bounds on the MTJ reconfiguration threshold of Theorem 7, with the relation between pumpkins and bistable graphs of Theorem 8.

Lower bound on MTJ.
Consider a bipartite graph $G$ in $\Pi_{\mathrm{bip}}$. Then $G$ contains a pumpkin subgraph on a set $S \subseteq V(G)$ of $\mathrm{pum}(G)$ vertices. Then $G[S] = (I \cup J, E)$ is a bipartite supergraph of a pumpkin, which is contained in $\Pi_{\mathrm{bip}}$. Since any pumpkin is bistable by Observation 1, it follows that reconfiguring $I$ to $J$ in the pumpkin subgraph of $G[S]$ requires a jump of size $|I| = |J| = \mathrm{pum}(G)/2$. It is clearly no easier to reconfigure $I$ to $J$ in the supergraph $G[S] \in \Pi_{\mathrm{bip}}$. Hence $\mathrm{mtj}(\Pi) \geq \mathrm{pum}(G)/2$ for all graphs $G$ in $\Pi_{\mathrm{bip}}$, giving the lower bound $\mathrm{mtj}(\Pi) \geq \mathrm{pum}(\Pi_{\mathrm{bip}})/2$.
While the upper bound of Theorem 9 has room for improvement, the following lemma shows that the exponential dependency on the pumpkin size in the upper bound is unavoidable.

Proof. For each $k \geq 1$ we will construct a graph $G_k$ belonging to the class $\Pi_{\mathrm{pum}}(24k + 6)$ with $\mathrm{mtj}(G_k) = 2^{\Omega(k)}$. We will call $G_k$ a super-pumpkin. It is defined recursively, as explained next. Define a (4,2)-pumpkin, denoted $P_{4,2}$, to be a pumpkin whose two terminal vertices are connected by four paths with two interior vertices each. A super-pumpkin is now defined as follows. Like a regular pumpkin, it has two designated terminal vertices. The super-pumpkin $G_1$ consists of just a single edge, whose endpoints are its terminal vertices. The super-pumpkin $G_k$ is obtained by gluing two copies of a super-pumpkin $G_{k-1}$, denoted $G^1_{k-1}$ and $G^2_{k-1}$, into a (4,2)-pumpkin $P_{4,2}$. This is done by identifying the terminal vertices of $G^1_{k-1}$ and $G^2_{k-1}$ with specific vertices of the (4,2)-pumpkin, as indicated in Fig. 3.
Proof. The proof is by induction on $k$. It will be convenient to prove the following stronger claim on $\mathcal{I}(G_k)$, the set of all independent sets of $G_k$: $\mathcal{I}(G_k)$ contains no independent set of size more than $|G_k|/2$ and exactly two independent sets of size $|G_k|/2$. These independent sets are disjoint, and one of them contains one terminal vertex of $G_k$ while the other contains the other terminal vertex.
This claim trivially holds for $\mathcal{I}(G_1)$, so now consider $\mathcal{I}(G_k)$ for $k > 1$. Let $s, t$ be the two terminal vertices of $G_k$, and label the other vertices of the pumpkin $P_{4,2}$ as $u_1, \ldots, u_4$ and $v_1, \ldots, v_4$ (see Fig. 3). We define $W := \{s, t, u_1, u_3, v_2, v_4\}$ to be the set of vertices in $G_k$ that do not occur in $G^1_{k-1}$ or $G^2_{k-1}$. We distinguish two types of independent sets in $\mathcal{I}(G_k)$.

Type 1. Independent sets $I$ such that both $G^1_{k-1}$ and $G^2_{k-1}$ have $|G_{k-1}|/2$ vertices in $I$. Note that by the induction hypothesis we have $|\{u_2, v_1\} \cap I| = |\{u_4, v_3\} \cap I| = 1$ for any Type 1 independent set $I$. Moreover, the total number of vertices from $I$ inside $G^1_{k-1}$ and $G^2_{k-1}$ is $2(|G_{k-1}|/2) = |G_k|/2 - 3$. We will argue that $G_k$ has two Type 1 independent sets with $|G_k|/2$ vertices and with the required properties, and that all other Type 1 independent sets have fewer than $|G_k|/2$ vertices. To this end we distinguish three subtypes of Type 1.
• Type 1(i). Independent sets $I$ with $u_2 \in I$ and $u_4 \in I$. By the induction hypothesis such independent sets $I$ exist, and the choice of vertices of $I$ inside $G^1_{k-1}$ and $G^2_{k-1}$ is fixed. Moreover, there is only one way to obtain an independent set $I^*$ with $|G_k|/2$ vertices, namely by adding $\{u_1, u_3, t\}$ from $W$; all other selections from $W$ give smaller independent sets.
• Type 1(ii). Independent sets $I$ with $v_1 \in I$ and $v_3 \in I$. Again, the choice of vertices for $I$ inside $G^1_{k-1}$ and $G^2_{k-1}$ is fixed, and there is only one way to obtain an independent set of $|G_k|/2$ vertices, this time by adding $\{s, v_2, v_4\}$. This independent set $I^{**}$ is disjoint from $I^*$ (this follows from the induction hypothesis and the fact that $\{u_1, u_3, t\} \cap \{s, v_2, v_4\} = \emptyset$), and it contains $s$ while $I^*$ contains $t$.
• Type 1(iii). Independent sets $I$ with $u_2 \in I$ and $v_3 \in I$, or $v_1 \in I$ and $u_4 \in I$. Now at most two of the vertices from $W$ can be in $I$, and so $|I| < |G_k|/2$.

Type 2. Independent sets $I$ such that at least one of $G^1_{k-1}$ and $G^2_{k-1}$ has fewer than $|G_{k-1}|/2$ vertices in $I$. We will argue that all such independent sets have fewer than $|G_k|/2$ vertices.
Assume without loss of generality that $G^1_{k-1}$ has fewer than $|G_{k-1}|/2$ vertices in $I$. If $G^2_{k-1}$ also has fewer than $|G_{k-1}|/2$ vertices in $I$, then the total number of vertices from $I$ in $G^1_{k-1}$ and $G^2_{k-1}$ is at most $2(|G_{k-1}|/2 - 1) = |G_k|/2 - 5$, and since at most four vertices can be selected from $W$ we have $|I| < |G_k|/2$. If $G^2_{k-1}$ has $|G_{k-1}|/2$ vertices in $I$, then $|\{u_4, v_3\} \cap I| = 1$. Assume without loss of generality that $u_4 \in I$. Then $s$ and $v_4$ are not in $I$. Since $v_2$ and $t$ cannot both be in $I$, we conclude that we can select at most three vertices from $W$ into $I$. This again implies that $|I| < |G_k|/2$.
Note that each $I \in \mathcal{I}(G_k)$ is of Type 1 or Type 2, since $G^1_{k-1}$ and $G^2_{k-1}$ cannot have more than $|G_{k-1}|/2$ vertices in $I$ by the induction hypothesis. This finishes the proof of the claim.

Claim 6 implies that $\mathrm{mtj}(G_k) = |G_k|/2 = 2^{\Omega(k)}$: since there are only two independent sets of size $|G_k|/2$, say $I$ and $J$, and these are disjoint, the only way to go from $I$ to $J$ is to remove all tokens from $I$ and place them onto $J$. Next we bound the size of the largest pumpkin in $G_k$.
Proof. Define $d_{\max}$ to be the maximum degree in $G_k$ and $C_k$ to be the maximum length of any simple cycle in $G_k$. Then $\mathrm{pum}(G_k) \leq d_{\max} \cdot C_k / 2$.
The following statement is easy to prove by induction: the degree of the terminal vertices in $G_k$ is four, and the maximum degree of any other vertex in $G_k$ is six. Hence, $d_{\max} = 6$ and so $\mathrm{pum}(G_k) \leq 3C_k$.
Next we argue that $C_k \leq 8k + 2$. To this end, define $L_k$ to be the length (measured in number of vertices) of a longest simple path in $G_k$ that ends at the two terminal vertices of $G_k$. Then $L_k = L_{k-1} + 4$ with $L_1 = 2$, and thus $L_k = 4k - 2$. We now prove that $C_k \leq 8k + 2$ by induction on $k$. We have $C_1 = 0$, so the statement is true for $k = 1$. Now suppose $k > 1$. Let $C$ be a simple cycle in $G_k$. If $C$ stays within one of the copies of $G_{k-1}$ we have $|C| \leq C_{k-1}$ by induction. Otherwise the maximum possible length for $C$ is obtained by taking a longest path from $u_2$ to $v_1$ in the first copy of $G_{k-1}$, a longest path from $u_4$ to $v_3$ in the second copy, and connecting them into a cycle using all six vertices in $W$, where $W$ is defined as before. Hence, $|C| \leq 2L_{k-1} + 6 = 2(4(k-1) - 2) + 6 = 8k - 6$. It follows that $C_k \leq 8k + 2$. Hence, $\mathrm{pum}(G_k) \leq 3C_k \leq 24k + 6$.
This concludes the proof of Proposition 10.
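The recursive construction and the statement of Claim 6 can be explored computationally for small $k$. The sketch below rests on an assumption about Fig. 3, which is not reproduced here: we take the four paths of $P_{4,2}$ to be $s, u_i, v_i, t$, and we identify the terminals of the first copy with $(u_2, v_1)$ and those of the second copy with $(u_4, v_3)$. This matches the adjacencies used in the proof, but it is inferred from the text. The function names are hypothetical.

```python
from itertools import count

def super_pumpkin(k):
    """Adjacency sets of G_k and its terminals (s, t), under the assumed gluing."""
    fresh = count()
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def build(level):
        if level == 1:                       # G_1 is a single edge
            s, t = next(fresh), next(fresh)
            add_edge(s, t)
            return s, t
        u2, v1 = build(level - 1)            # terminals of the first copy
        u4, v3 = build(level - 1)            # terminals of the second copy
        s, t, u1, u3, v2, v4 = (next(fresh) for _ in range(6))
        for u, v in [(u1, v1), (u2, v2), (u3, v3), (u4, v4)]:
            add_edge(s, u)                   # path s - u_i - v_i - t
            add_edge(u, v)
            add_edge(v, t)
        return s, t

    s, t = build(k)
    return adj, s, t

def max_independent_sets(adj):
    """Return (alpha, number of independent sets of size alpha) by branching."""
    def rec(vertices):
        if not vertices:
            return 0, 1
        v = max(vertices, key=lambda x: len(adj[x] & vertices))
        if not adj[v] & vertices:            # no edges left: take everything
            return len(vertices), 1
        size_out, count_out = rec(vertices - {v})            # v excluded
        size_in, count_in = rec(vertices - {v} - adj[v])     # v included
        size_in += 1
        if size_in != size_out:
            return max((size_in, count_in), (size_out, count_out))
        return size_in, count_in + count_out
    return rec(frozenset(adj))

for k in (1, 2, 3):
    adj, s, t = super_pumpkin(k)
    print(k, len(adj), max_independent_sets(adj))
```

Under this reading, the recurrence $|G_k| = 2|G_{k-1}| + 6$ with $|G_1| = 2$ gives $|G_k| = 2^{k+2} - 6$, so the single jump forced by Claim 6 moves $2^{k+1} - 3$ tokens.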

Threshold for Token Addition Removal Reconfiguration
In this section we study the model of token addition removal. First observe that when $G$ is a forest, we have $\mathrm{mtj}(G) \leq 1$ and therefore $\mathrm{tar}(G) \leq 1$ as well. Also, from Theorem 4 we get $\mathrm{tar}(G) \leq \max(\mathrm{vc}(G), 1)$. But the inequality $\mathrm{tar}(G) \leq \mathrm{mtj}(G)$ tells us nothing about the behavior of the TAR reconfiguration threshold when the MTJ reconfiguration threshold is large. The next simple proposition addresses this situation. Indeed, observe that a large pumpkin (with large MTJ reconfiguration threshold) can have a small feedback vertex set; this happens for even cycles, for example.

Proposition 11. Let $G = (V, E)$ be a graph. Then $\mathrm{tar}(G) \leq \mathrm{fvs}(G) + 1$.
Proof. Let $G = (V, E)$ be a graph with a minimum feedback vertex set $S \subseteq V$ of size $k$, and let $I, J \subseteq V$ be independent sets of equal size. By Proposition 1 we can assume that $V = I \cup J$ and $I \cap J = \emptyset$. If $|I| = |J| \leq k$, then it is trivial to reconfigure $I$ to $J$ with a buffer of size at most $k$, by first moving all tokens from $I$ into the buffer, and then onto $J$. In the remainder we assume $|I| = |J| > k$. Let $S_I \subseteq I$ be a superset of $I \cap S$ of size $k$, and let $S_J \subseteq J$ be a superset of $J \cap S$ of size $k$. Then the graph $G' := G - (S_I \cup S_J)$ is a subgraph of $G - S$ and is therefore acyclic since $S$ is a feedback vertex set. By Theorem 3 it follows that $I' := I \setminus S_I$ can be MTJ reconfigured to $J' := J \setminus S_J$ in $G'$ by jumps of size 1, which easily implies that $I'$ can be TAR reconfigured to $J'$ in $G'$ using a buffer of size at most 1; let $\mathcal{S}$ be a corresponding reconfiguration sequence. To reconfigure $I$ to $J$ in $G$, start by removing the tokens from the $k$ vertices in $S_I$ and place them in the buffer. Then apply the reconfiguration sequence $\mathcal{S}$ to reconfigure $I'$ to $J'$, using at most 1 extra buffer token. Finish by moving the $k$ buffer tokens onto $S_J$ to arrive at the independent set $J' \cup S_J = J$.
One can see that the above bound is tight by considering the TAR reconfiguration threshold of a complete balanced bipartite graph. Indeed, for $K_{n,n}$ the minimum size of a feedback vertex set is $n - 1$, and in order to include any one of the vertices of the target independent set the reconfiguration must pass through the empty set. This shows that the TAR reconfiguration threshold of $K_{n,n}$ is $n = \mathrm{fvs}(K_{n,n}) + 1$.
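For small graphs the buffer needed to reconfigure one given pair of independent sets can be found by exhaustive search. The sketch below assumes the buffer model described in the introduction: the number of tokens is fixed, a step removes or adds one token, and at most $k$ tokens may be in the buffer at any time. It computes the threshold for a single pair $(I, J)$ only, whereas $\mathrm{tar}(G)$ presumably ranges over all pairs. The function names are hypothetical.

```python
from collections import deque

def tar_reachable(adj, start, target, k):
    """Breadth-first search over independent sets with at most k buffered tokens."""
    start, target = frozenset(start), frozenset(target)
    tokens = len(start)
    seen, queue = {start}, deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return True
        moves = []
        if tokens - len(cur) < k:            # room for one more buffered token
            moves += [cur - {v} for v in cur]
        if len(cur) < tokens:                # a buffered token is available
            moves += [cur | {v} for v in adj
                      if v not in cur and not (adj[v] & cur)]
        for nxt in moves:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def tar_threshold(adj, start, target):
    """Smallest buffer size that allows reconfiguring start into target."""
    return next(k for k in range(len(start) + 1)
                if tar_reachable(adj, start, target, k))

n = 3
k_nn = {i: set(range(n, 2 * n)) for i in range(n)}
k_nn.update({j: set(range(n)) for j in range(n, 2 * n)})
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}

print(tar_threshold(k_nn, range(n), range(n, 2 * n)))
print(tar_threshold(cycle6, {0, 2, 4}, {1, 3, 5}))
```

On $K_{3,3}$ the search needs the full buffer of three tokens, and on the 6-cycle it needs two, in both cases one more than the size of a minimum feedback vertex set.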

TAR Reconfiguration Threshold in Terms of Pathwidth
As the main result of this section, we will show that the TAR reconfiguration threshold of a graph is upper-bounded in terms of its pathwidth. Before proving that statement, we present a structural lemma about path decompositions that will be useful in the proof.

Lemma 12. Let $G = (I \cup J, E)$ be a bipartite graph with a nice path decomposition $P = (X_1, \ldots, X_r)$ of width $k$. Let $S \subseteq J$ be such that $|N(S)| \leq |S|$ while no nonempty strict subset of $S$ has this property. If we order the vertices in $S$ as $i_1, \ldots, i_t$ such that $r_P(i_1) < r_P(i_2) < \ldots < r_P(i_t)$, then $|N(\{i_1, \ldots, i_{t'}\})| < t' + k$ for all $1 \leq t' \leq t$.
Intuitively, the lemma says the following. Suppose a set $S \subseteq J$ is inclusion-wise minimal with respect to being no smaller than its neighborhood. Then ordering $S$ according to the right endpoints of the intervals representing $S$ in the path decomposition, we are guaranteed that every prefix of $S$ has a fairly small neighborhood compared to its size: the neighborhood size exceeds the size of the prefix by less than the pathwidth. Note that since the lemma deals with bipartite graphs only, no vertex of $S$ can belong to the neighborhood of any prefix of $S$. The ordering of the vertices is uniquely defined since the path decomposition is nice. The bound of Lemma 12 is best possible. Consider a complete bipartite graph $K_{n,n}$, with pathwidth $n$. In any optimal path decomposition, for $t' = 1$ the first vertex in the ordering has a neighborhood of size $n$ and so $n < t' + n = 1 + n$, but a better bound is not possible.
Proof of Lemma 12. First observe that in a graph with a path decomposition of width $k = 0$ there can be no edges. Then the only vertex-minimal set $S$ satisfying the assumptions is an isolated vertex, for which the claim trivially holds. In the remainder we assume $k \geq 1$. For $t' = t$ we have $\{i_1, \ldots, i_t\} = S$, and by assumption $|N(S)| \leq |S|$. So for $t' = t$ the claim in the lemma holds trivially for any $k \geq 1$. Assume for a contradiction that there is some $t' < t$ such that:
$|N(\{i_1, \ldots, i_{t'}\})| \geq t' + k$. (4)
We partition $T := S \cup N(S)$ into three disjoint subsets to derive some structural properties that will lead to a contradiction.
(i) $T_1 := \{v \in S \cup N(S) : r_P(v) \leq r_P(i_{t'})\}$, the set of all vertices in $S \cup N(S)$ that are not contained in any of the bags after the bag with index $r_P(i_{t'})$.
(ii) $T_2 := \{v \in S \cup N(S) : l_P(v) > r_P(i_{t'})\}$, the set of all vertices in $S \cup N(S)$ that are not contained in any of the bags before or including the bag with index $r_P(i_{t'})$.
(iii) $T_3 := \{v \in S \cup N(S) : l_P(v) \leq r_P(i_{t'}) < r_P(v)\}$, the set of all vertices in $S \cup N(S)$ that are contained in some bag before or including the bag with index $r_P(i_{t'})$ and also in some bag after it.
Proof. From property (P3) in the definition of path decomposition we know that $T_3 \subseteq X_\ell$ for $\ell = r_P(i_{t'})$. Now $|X_\ell| \leq k + 1$ since the width of $P$ is at most $k$, and we know that $i_{t'} \in X_\ell \setminus T_3$, so that $|T_3| \leq k$. For the remainder of the proof we distinguish two cases.
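Case 1: $T_2 \cap S = \emptyset$. Then $(T_1 \cap S, T_3 \cap S)$ is a partition of $S$. By Claim 8 we have $|T_3 \cap S| \leq k$, and therefore $|S| = |T_1 \cap S| + |T_3 \cap S| \leq t' + k$.

Proof. Suppose that $|T_3 \cap S| < k$. Then $|S| = |T_1 \cap S| + |T_3 \cap S| < t' + k \leq |N(\{i_1, \ldots, i_{t'}\})| \leq |N(S)|$, where the last inequality holds since $\{i_1, \ldots, i_{t'}\} \subseteq S$ and $S$ is independent, and where $|N(\{i_1, \ldots, i_{t'}\})| \geq k + t'$ by (4), contradicting the precondition to the lemma. It follows that $|T_3 \cap S| \geq k$. Since $|T_3 \cap S| \leq |T_3| \leq k$ by Claim 8, it follows that $|T_3 \cap S| = k$ and that all vertices of $T_3$ belong to $S$.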
Let $\ell := r_P(i_{t'})$. Since the path decomposition is nice there is only one vertex (namely $i_{t'}$) that occurs in $X_\ell$ but not after $X_\ell$. So $X_\ell = \{i_{t'}\} \cup T_3$, and Claim 10 implies that no vertex of $N(S)$ occurs in $X_\ell$ since $X_\ell = \{i_{t'}\} \cup T_3 \subseteq S$. Claim 9 shows that all neighbors of $i_{t'+1}, \ldots, i_t$ are also neighbors of some vertex of the prefix $i_1, \ldots, i_{t'}$. Since $i_1, \ldots, i_t$ are ordered by increasing right endpoint of the intervals representing them in the decomposition, all neighbors of $i_{t'+1}, \ldots, i_t$ therefore have to occur in a bag with index at most $r_P(i_{t'})$, and since $X_\ell$ contains no vertex of $N(S)$, by (P3) it follows that no vertex of $N(S)$ occurs in a bag with index $\ell$ or later. Since $X_\ell = \{i_{t'}\} \cup T_3$ and $|T_3 \cap S| = k$ by the previous claim, there are $k + 1$ vertices in $X_\ell$. Since the size difference of consecutive bags in a nice path decomposition is exactly one, and no bag has size more than $k + 1$ since the width is $k$, it follows that $X_{\ell-1} = X_\ell \setminus \{v\}$ for some vertex $v \in \{i_{t'}\} \cup T_3 \subseteq S$. Since no vertex of $N(S)$ occurs in bag $X_\ell$ or after, and $v$ does not occur in $X_{\ell-1}$ or earlier, it follows that $v$ does not occur in a bag together with a vertex of $N(S)$. By the definition of path decomposition, this implies that $v$ has no neighbor in $N(S)$; since $v \in S$ and $S$ is an independent set (it is a subset of a partite set of a bipartite graph), this implies that $v$ is an isolated vertex in $G$. But since $1 \leq t' < t = |S|$, the set $S' := \{v\}$ is a nonempty strict subset of $S$ for which $0 = |N(S')| \leq |S'| = 1$, contradicting the precondition to the lemma. This concludes the proof of Case 1.
Case 2: $T_2 \cap S \neq \emptyset$. We continue the proof of Lemma 12 for the case that $T_2 \cap S \neq \emptyset$. We will show that $T_2 \cap S$ is a nonempty strict subset of $S$ with $|T_2 \cap S| \geq |N(T_2 \cap S)|$. This will contradict our assumption that $S$ is inclusion-wise minimal with the property that $|S| \geq |N(S)|$. Now let us denote $|T_3 \cap I| = k_I$ and $|T_3 \cap J| = k_J$. Note that $T_3 \cap J = T_3 \cap S$, and observe from Claim 8 that $k_I + k_J \leq k$.
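Recall from the choice of $r_P(i_{t'})$ that $|T_1 \cap S| = |\{i_1, \ldots, i_{t'}\}| = t'$. Since $S = (T_1 \cup T_2 \cup T_3) \cap S$, and the $T_i$'s are mutually disjoint, we have $|S| = t' + |T_2 \cap S| + k_J$. Therefore, $|T_2 \cap S| = |S| - t' - k_J$. Also note that $N(T_2 \cap S) \setminus N(T_1 \cap S) \subseteq N(S) \setminus N(T_1 \cap S)$. Now observe that any vertex which is a neighbor of some vertex in $T_1 \cap S$ and some vertex in $T_2 \cap S$ must be both in some bag with index at most $r_P(i_{t'})$ (to meet $T_1 \cap S$) and in some bag with index strictly more than $r_P(i_{t'})$ (to meet $T_2 \cap S$). This implies that every such vertex lies in $T_3 \cap I$, so that $|N(T_1 \cap S) \cap N(T_2 \cap S)| \leq k_I$. Hence $|N(T_2 \cap S)| \leq |N(S)| - |N(T_1 \cap S)| + k_I$. Therefore, combining these bounds with $|N(S)| \leq |S|$ from our hypothesis about $S$, with $|N(T_1 \cap S)| \geq t' + k$ from (4), and with $k_I + k_J \leq k$, we get $|T_2 \cap S| - |N(T_2 \cap S)| \geq (|S| - t' - k_J) - (|S| - t' - k + k_I) = k - k_I - k_J \geq 0$.
So $T_2 \cap S$ is a nonempty strict subset of $S$ satisfying the key property, contradicting that $S$ is inclusion-wise minimal. This completes the proof of Lemma 12.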
Using Lemma 12 we bound the TAR reconfiguration threshold in terms of pathwidth.
Proof. We prove this theorem using induction on the number of vertices. As before, it is enough to consider $G = (V, E)$ and assume that the initial and target independent sets $I$ and $J$ respectively are such that $|I| = |J|$, $I \cup J = V$ and $I \cap J = \emptyset$. We will show that $\mathrm{pw}(G) \leq k$ implies that $\mathrm{tar}(G) \leq k$, using induction on the number of vertices $n$. For $n = 1$, the statement is trivially true. Now fix any $k \geq 1$, and assume the induction hypothesis that any graph $G$ with at most $n$ vertices satisfying $\mathrm{pw}(G) \leq k$ has $\mathrm{tar}(G) \leq k$. Assume $G$ is a graph of $n + 1$ vertices having pathwidth at most $k$. Let $S$ be an inclusion-minimal nonempty subset of $J$ for which $|S| \geq |N(S)|$. Such a set exists since $|J| = |I| \geq |N(J)|$. We will show that if we reconfigure the set $S$ in a suitable order by moving tokens from $N(S)$ onto $S$, then the buffer size will not grow beyond $k$. There are enough vertices in $S$ to accommodate all tokens on $N(S)$, and afterward we will invoke induction.
We first deal with a special case. If $S = \{v\}$ is a singleton set, then $v$ has degree at most one since $|S| \geq |N(S)|$. Move the token from the neighbor $u$ of $v$ (or from an arbitrary vertex $u$, if $v$ has no neighbors) into the buffer, and then onto $v$. By induction there exists a TAR reconfiguration from $I \setminus \{u\}$ to $J \setminus \{v\}$ in $G - \{u, v\}$ using a buffer of size at most $\max(\mathrm{pw}(G - \{u, v\}), 1) \leq \max(\mathrm{pw}(G), 1)$. When inserting the token move from $u$ onto $v$ at the beginning of this sequence, we get a TAR reconfiguration from $I$ to $J$ with the desired buffer size. In the remainder of the proof we can therefore assume $|S| \geq 2$. This implies that $|S| = |N(S)|$: if $|S| > |N(S)|$ and $|S| \geq 2$, then we can remove a vertex $v$ from $S$ to obtain $|S \setminus \{v\}| \geq |N(S \setminus \{v\})|$ for the nonempty set $S \setminus \{v\}$, contradicting minimality.
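On small instances, a set $S$ with the required minimality property can be found by exhaustive search. The sketch below relies on a simple observation of ours: a nonempty subset of $J$ of minimum cardinality satisfying $|S| \geq |N(S)|$ is automatically inclusion-minimal, because each of its nonempty strict subsets is smaller and was rejected earlier. The names `neighborhood` and `minimal_tight_set` are hypothetical.

```python
from itertools import combinations

def neighborhood(adj, vertices):
    """Union of the neighborhoods of the given vertices."""
    result = set()
    for v in vertices:
        result |= adj[v]
    return result

def minimal_tight_set(adj, J):
    """A smallest nonempty S within J with |S| >= |N(S)|, or None if none exists."""
    candidates = sorted(J)
    for size in range(1, len(candidates) + 1):
        for S in combinations(candidates, size):
            if len(S) >= len(neighborhood(adj, S)):
                return set(S)
    return None

# 6-cycle with I = {0, 2, 4} and J = {1, 3, 5}: only all of J qualifies.
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(minimal_tight_set(cycle6, {1, 3, 5}))

# Two disjoint edges {0, 1} and {2, 3}: a single vertex of J already qualifies.
matching = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
print(minimal_tight_set(matching, {1, 3}))
```

The second example corresponds to the singleton special case treated above, while the first yields a set with $|S| = |N(S)| = 3$.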
Let $P = (X_1, X_2, \ldots, X_r)$ be a nice path decomposition of width at most $k$. If $G$ has no edges, then $S$ is a singleton set containing an isolated vertex. Since we already covered that case, we know $G$ has at least one edge, so any path decomposition has width $k \geq 1$. Enumerate the vertices of $S$ as $i_1, \ldots, i_m$ such that $r_P(i_1) < \ldots < r_P(i_m)$. Hence the vertices are ordered by increasing rightmost endpoint of the interval of bags containing them.
In order to describe the reconfiguration procedure we suitably group several TAR reconfiguration steps together as one step in the algorithm. In particular, one reconfiguration step in the algorithm described below will consist of a run of successive removals of nodes, followed by a single node addition.
We use the notion of a buffer set $B_t$ at the $t$-th step of the reconfiguration, such that $|B_t|$ will correspond to the number of tokens in the buffer at any particular time, and $\max_t |B_t| + 1$ will correspond to the maximum buffer size of the corresponding TAR reconfiguration sequence. The buffer set is a subset of vertices, showing where the tokens in the buffer came from. At time step $t = 0$, define $W_0 = I$ to be the independent set of vertices with a token, and let the buffer set $B_0$ be empty. We will define intermediate independent sets $W_i$ and buffer sets $B_i$ representing the grouped reconfiguration steps. The algorithm stops when $W_m$ contains all vertices in $S$; we will then invoke the induction hypothesis to finish the sequence. From the sequence $(W_0, W_1, \ldots, W_m)$ one obtains a formal reconfiguration sequence as defined in Section 3 by inserting "transitioning independent sets" in between $W_i$ and $W_{i+1}$ for all $i$. From $W_i$, repeatedly remove one vertex until arriving at $W_i \cap W_{i+1}$, and then add the single vertex of $W_{i+1} \setminus W_i$ to the resulting set.
For $t \geq 1$, the transition from $t - 1$ to $t$ is obtained as follows. Let $u_t$ be an arbitrary vertex from $B_{t-1} \cup (N(i_t) \cap W_{t-1})$. Intuitively, at step $t$ we take the token from $u_t$ (in the buffer set or on a neighbor of $i_t$) and move it onto vertex $i_t$, causing $u_t$ to disappear from the buffer and adding $i_t$ to the independent set. To ensure the resulting set is independent, tokens on neighbors of $i_t$ are moved into the buffer beforehand. Observe that the above step is valid only if $B_{t-1} \cup (N(i_t) \cap W_{t-1})$ is nonempty. Below in Claim 11 we show that due to the choice of $S$, this is indeed the case for all $t \leq m$. Formally, we obtain the following:

Algorithm (Reconfiguring graphs with small pathwidth). Initialize with $B_0 = \emptyset$ and $W_0 = I$. We recursively define $B_t$ and $W_t$ for $t \geq 1$.
1. The neighbors of $i_t$ that have tokens (i.e. that are in the current independent set) are removed from the previous independent set $W_{t-1}$, making room to add $i_t$ to the new independent set: $W_t := (W_{t-1} \setminus N(i_t)) \cup \{i_t\}$.
2. The neighbors of $i_t$ belonging to the previous independent set $W_{t-1}$ move to the buffer, while $u_t$ is removed from the buffer since its token has moved onto $i_t$: $B_t := (B_{t-1} \cup (N(i_t) \cap W_{t-1})) \setminus \{u_t\}$.
As mentioned earlier, a step from $W_t$ to $W_{t+1}$ can be thought of as a sequence of successive removals of the nodes $N(i_{t+1}) \cap W_t$, followed by the addition of the node $i_{t+1}$.
During the TAR reconfiguration sequence corresponding to the step from $W_t$ to $W_{t+1}$, the maximum buffer size is given by $|B_{t+1}| + 1$, since the buffer size will be $|B_t \cup (N(i_{t+1}) \cap W_t)|$ just before the buffer token from $u_{t+1}$ is moved onto $i_{t+1}$. Therefore, the maximum buffer size in the entire TAR reconfiguration sequence starting from $W_0$ and ending at $W_m$ is given by $\max_{0 \leq t \leq m} |B_t| + 1$. Also, at the end of the algorithm, all vertices from the set $S$ will be in the independent set, and no vertex will be in the buffer set. This can be seen by observing the following. Initially all tokens were on the vertices belonging to the set $N(S) \subseteq I$, since $S \subseteq J$. At each step of the algorithm essentially one token is selected from $N(S)$ as long as the number of such tokens is positive, and is placed on some vertex in $S$. Now since $|S| \geq |N(S)|$, all the tokens in $N(S)$ must eventually be exhausted before the algorithm terminates, having placed one token on each vertex of $S$. For the validity of the above algorithm we claim the following, which in turn also characterizes the size of the buffer set at all intermediate time steps.
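The grouped steps can be simulated directly from their description: tokens on neighbors of $i_t$ go to the buffer set, and one token from the resulting pool moves onto $i_t$. The following is a minimal sketch of that bookkeeping. The function name is hypothetical, the "arbitrary" vertex $u_t$ is chosen as the smallest element of the pool purely for determinism, and the ordering of $S$ is supplied by the caller instead of being derived from a path decomposition.

```python
def grouped_tar_steps(adj, start, order):
    """Simulate the grouped steps; return the list of (W_t, B_t) for t = 0..m."""
    W, B = set(start), set()
    history = [(set(W), set(B))]
    for i_t in order:
        displaced = adj[i_t] & W          # N(i_t) ∩ W_{t-1}: tokens that must leave
        pool = B | displaced              # candidates for u_t
        if not pool:
            raise ValueError(f"no token available to place on {i_t}")
        u_t = min(pool)                   # any choice is allowed; min is an assumption
        W = (W - displaced) | {i_t}       # W_t
        B = pool - {u_t}                  # B_t
        history.append((set(W), set(B)))
    return history

# 6-cycle with I = {0, 2, 4}, S = J = {1, 3, 5}, processed in the order 1, 3, 5.
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
history = grouped_tar_steps(cycle6, {0, 2, 4}, [1, 3, 5])
for t, (W, B) in enumerate(history):
    print(t, sorted(W), sorted(B))
print("peak buffer:", max(len(B) for _, B in history) + 1)
```

In this example the buffer set never holds more than one vertex between steps, so the peak buffer size is two, which equals the pathwidth of the 6-cycle.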
Proof. Suppose on the contrary that there exists $t' \leq m$ such that $B_{t'-1} \cup (N(i_{t'}) \cap W_{t'-1})$ is empty, and choose $t'$ to be the smallest such index. If $t' = 1$, then in particular $N(i_{t'}) = \emptyset$, so that $i_{t'} = i_1$ is an isolated vertex. But since $|S| \geq 2$ by our argument above, it follows that $S' = \{i_1\}$ is a nonempty strict subset with $|S'| \geq |N(S')|$; a contradiction. So in the remainder we consider $t' > 1$. We show that, for all $t < t'$, $|B_t| = |N(\{i_1, \ldots, i_t\})| - t$. Using this, we prove that $2 \leq t' \leq m$ leads to a contradiction. Observe that for any $t < t'$, after the $t$-th step of the algorithm, the total number of distinct vertices that have been added to the buffer set is given by $|N(\{i_1, \ldots, i_t\})|$. Furthermore, for every step up to $t$ with $t < t'$, the set $B_{t-1} \cup (N(i_t) \cap W_{t-1})$ has always been nonempty. This implies that at each step, precisely one token has been removed from the buffer, thus reducing the size of the buffer set by moving a buffer token onto a vertex that is added to the independent set. Therefore, in total $t$ times the size of the buffer set reduces by one. Since initially the buffer set was empty, for any $t < t'$ we have $|B_t| = |N(\{i_1, \ldots, i_t\})| - t$.
Since we have assumed that $B_{t'-1} \cup (N(i_{t'}) \cap W_{t'-1})$ is empty, we know $B_{t'-1}$ is empty, and therefore from the above argument $|N(\{i_1, \ldots, i_{t'-1}\})| = t' - 1$. Since $t' \geq 2$ the set $S' := \{i_1, \ldots, i_{t'-1}\}$ is a nonempty strict subset of $S$ with $|S'| \geq |N(S')|$, contradicting the minimality of $S$. This proves the first part of the claim. Since the buffer does not become empty until after step $t$, the given argument then also proves the second part of the claim.

It remains to show that throughout the process the buffer size will not grow beyond $k$, i.e. $|B_t| \leq k - 1$ for all $t \leq m$. Claim 11 (ii) implies that $\max_{t \leq m} |B_t| \geq k$ if and only if there exists $t \leq m$ such that $|N(\{i_1, \ldots, i_t\})| - t \geq k$, which is not possible due to Lemma 12. This then ensures that throughout the algorithm, the buffer size will never exceed $k$.
Since the buffer set empties out after reconfiguring the set $S$, after the execution of the algorithm we have $W_m \cap J = S$ and $W_m \cap I \subseteq V \setminus (S \cup N(S))$. Now define $G' = G - (S \cup N(S))$, $I' = I \cap W_m$, and $J' = J \setminus S$. Observe that $G'$ has pathwidth at most $k$, and $|I'| = |I \cap W_m| = |I| - |S| = |J'|$. Furthermore, since $S$ is nonempty, $|V(G')| \leq n$. By the induction hypothesis, there exists a TAR reconfiguration sequence from $I'$ to $J'$ in $G'$ using a buffer of size at most $k$. Since $N(S)$ is not in $G'$, any independent set in $G'$ remains an independent set in $G$ when augmented with the set $S$. Therefore we can first apply the given reconfiguration from $N(S)$ to $S$, followed by the reconfiguration from $I'$ to $J'$, to reconfigure $I$ to $J$ with a buffer of size at most $k$.
By considering the complete balanced bipartite graph $K_{n,n}$ on $2n$ vertices, one observes that in general the above bound is tight. Indeed, from [3] we know that $K_{n,n}$ has pathwidth equal to $n$, and as explained earlier, the TAR reconfiguration threshold is also $n$.

Obstructions to TAR Reconfigurability
Having proved Theorem 13, it is natural to ask whether pathwidth in some sense characterizes the TAR reconfiguration threshold: does large pathwidth of a graph imply that its TAR reconfiguration threshold is large? This is not the case: the pathwidth of a complete binary tree is proportional to its depth [19], but its reconfiguration threshold is one by Theorem 3.
We now identify a graph structure which forces the TAR reconfiguration threshold to be large. First we formally introduce the special type of minor illustrated in Figure 1b.
Definition 4 (Bipartite topological double minor). Let $G = (I \cup J, E)$ be a bipartite graph and let $H$ be an arbitrary graph. Then $H$ is a bipartite topological double minor of $G$ if one can assign to every $v \in V(H)$ a subgraph $\varphi(v)$ of $G$, which is either an edge or an even cycle in $G$, and to each edge $e = \{u, v\} \in E(H)$ a pair of odd-length paths $\psi_1(e), \psi_2(e)$ in $G$, such that the following holds:
• For any $u, v \in V(H)$ with $u \ne v$ the subgraphs $\varphi(u)$ and $\varphi(v)$ are vertex-disjoint.
• For any $v \in V(H)$ no vertex of $\varphi(v)$ occurs as an interior vertex of a path $\psi_1(e)$ or $\psi_2(e)$, for any $e \in E(H)$.
• For any $v \in V(H)$ and edge $e = \{u, v\} \in E(H)$, the attachment points of $\psi_1(e)$ and $\psi_2(e)$ in $\varphi(v)$ belong to different partite sets.
The triple (ϕ, ψ 1 , ψ 2 ) is a BTD-minor model of H in G. For an edge e ∈ E(H) we define ψ 1 (e), ψ 2 (e) ⊆ V(G) as the interior vertices of the paths ψ 1 (e) and ψ 2 (e), which may be ∅ if the path consists of a single edge.
Intuitively, H occurs as a bipartite topological double minor (or BTD-minor) if each vertex of H can be realized by an edge or even cycle, and every edge of H can be realized by two odd-length paths that connect an I-vertex of ϕ(v) to a J-vertex of ϕ(u) and the other way around, in such a way that these structures are vertex-disjoint except for the attachment of paths to cycles. The definition easily extends to bipartite graphs whose bipartition is not given, since a BTD-minor is contained within a single connected component of the graph, which has a unique bipartition.
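To make the definition concrete, the sketch below builds the smallest kind of BTD-minor model we can think of: every $\varphi(v)$ is a single edge and every path $\psi_1(e)$, $\psi_2(e)$ is a single edge, which has odd length one and no interior vertices. This construction is our own illustration rather than one taken from the paper, and the function name `simplest_btd_model` is hypothetical; it assumes $H$ is a simple graph.

```python
def simplest_btd_model(h_vertices, h_edges):
    """Build a bipartite graph G that has H as a BTD-minor.

    Each vertex v of H becomes the edge (v, "I") - (v, "J").
    Each edge e = (u, v) of H becomes two length-one paths:
      psi_1(e): (u, "I") - (v, "J")
      psi_2(e): (u, "J") - (v, "I")
    so that inside phi(u) and inside phi(v) the two paths attach
    to vertices on different sides of the bipartition.
    """
    phi = {v: ((v, "I"), (v, "J")) for v in h_vertices}
    psi1 = {(u, v): ((u, "I"), (v, "J")) for u, v in h_edges}
    psi2 = {(u, v): ((u, "J"), (v, "I")) for u, v in h_edges}
    g_edges = set(phi.values()) | set(psi1.values()) | set(psi2.values())
    return phi, psi1, psi2, g_edges

# H is the complete binary tree of depth 1: a root with two children.
phi, psi1, psi2, g_edges = simplest_btd_model(
    ["r", "x", "y"], [("r", "x"), ("r", "y")]
)
g_vertices = {endpoint for edge in g_edges for endpoint in edge}
```

Here the images $\varphi(v)$ are pairwise vertex-disjoint by construction, the paths have no interior vertices, and every edge of the resulting graph joins an I-vertex to a J-vertex, so the graph is bipartite and the edges $\varphi(v)$ form a perfect matching.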

Proposition 14.
Let $G = (I \cup J, E)$ be a bipartite graph having a connected graph $H$ as a BTD-minor with model $(\varphi, \psi_1, \psi_2)$, such that each vertex of $G$ is in the image of $\varphi$, $\psi_1$, or $\psi_2$. Then $G$ has a perfect matching with $|I| = |J|$ edges, and for any independent set $W$ in $G$:

Proof. As before, we consider a balanced bipartite graph $G \in \Pi_{\mathrm{bip}}$ with bipartition $V(G) = I \cup J$ that has a complete binary tree $T$ of depth $d$ as a BTD-minor. Since the graph class is hereditary, for the lower bound we consider only the subgraph of $G$ induced by $\bigcup_{v \in V(T)} \{\varphi(v)\} \cup \bigcup_{e \in E(T)} \{\psi_1(e) \cup \psi_2(e)\}$, and without loss of generality we refer to it as $G$ itself. The above implies that there exists $i_0 \le |V(T)|$ such that any size-$i_0$ subset of $V(T)$ has a neighborhood of size at least $c_1 \cdot d$. Let $I \cup J$ be the unique bipartition of the connected graph $G$, and consider an arbitrary TAR reconfiguration sequence from $I$ to $J$. In this sequence $(I = W_0, W_1, \ldots, W_t = J)$ of independent sets in $G$, look at the first reconfiguration step at which there exists $S \subseteq V(T)$ with $|S| = i_0$ such that the intermediate independent set $W$ at that step contains $\bigcup_{v \in S} (\varphi(v) \cap J)$, and for all $v \notin S$ it satisfies $(\varphi(v) \cap W \cap J) \subsetneq (\varphi(v) \cap J)$. We will prove that $|J| - |W| \ge c_1 \cdot d$, implying that from the initial independent set of $|I| = |J|$ tokens, at least $c_1 \cdot d$ tokens must reside in the buffer.
To prove the theorem, consider the intermediate independent set W, and the set S ⊆ V(T ) with |S| = i 0 satisfying the above criteria. The following claim shows that for each vertex in N T (S), the independent set W uses at least one vertex fewer than the maximum independent set J does.

Claim 12.
Consider an edge $e = \{u, v\} \in E(T)$ with $u \in S$ and $v \notin S$, and let $Q_{e,v} \subseteq V(G)$ denote the vertices in $\varphi(v) \cup \psi_1(e) \cup \psi_2(e)$. The following holds: $|W \cap Q_{e,v}| < |J \cap Q_{e,v}|$.

Proof. By Proposition 14, the maximum independent set $J$ contains exactly half the vertices of $Q_{e,v}$. If $|W \cap \psi_i(e)| < |\psi_i(e)|/2$ for some $i \in \{1, 2\}$, then we are done: by Proposition 14 the set $W$ contains fewer vertices from $\psi_i(e)$ than the maximum independent set $J$ does, and this cannot be compensated within the other parts of the structure since $J$ contains half the vertices there and no independent set contains more. In the remainder, we can assume that $W$ contains exactly half the vertices from $\psi_1(e)$ and $\psi_2(e)$. Then the following are true:
(i) All J-nodes of $\varphi(u)$ are in $W$ (by our choice of $W$ and since $u \in S$).
(ii) Some J-node of $\varphi(v)$ is not in $W$ (by our choice of $W$ and since $v \notin S$).
(iii) Some I-node of $\varphi(v)$ is not in $W$. To see this, let $i \in \{1, 2\}$ be such that $\psi_i(e)$ is an odd-length path from a J-node in $\varphi(u)$ to an I-node in $\varphi(v)$, which exists by Definition 4, and orient it in that direction. Since the first vertex on the path is a J-node in $\varphi(u)$, it is contained in $W$ as shown above. Hence the second vertex on the path, the first interior vertex, is not in $W$. Since exactly half the interior vertices from $\psi_i(e)$ belong to $W$, every other interior vertex from $\psi_i(e)$ is in $W$. Since the path has an even number of interior vertices and the first interior vertex is not in $W$, the last interior vertex must be in $W$. But this prevents its I-node neighbor in $\varphi(v)$ from being in $W$.
Therefore, since ϕ(v) is either an edge or an even cycle, we have |W ∩ ϕ(v)| < |ϕ(v)|/2 by observing the following: the only independent sets in ϕ(v) of size |ϕ(v)|/2 are ϕ(v) ∩ I and ϕ(v) ∩ J, but ϕ(v) ∩ W is not equal to either of these sets since it avoids a J-node and an I-node. Hence |W ∩ ϕ(v)| < |ϕ(v)|/2 = |J ∩ ϕ(v)|, and Proposition 14 shows that this cannot be compensated in other parts of the minor model, implying |W ∩ Q e,v | < |J ∩ Q e,v |.
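The structural fact used in this step, that an even cycle has exactly two independent sets containing half of its vertices (its two partite classes), is easy to confirm by enumeration for small cycles. The check below is only a sanity test with a hypothetical helper name; it labels the cycle vertices $0, \ldots, n-1$ in cyclic order.

```python
from itertools import combinations

def half_size_independent_sets(n):
    """All independent sets of size n/2 in the cycle on 0..n-1 (n even, n >= 4)."""
    edges = [(i, (i + 1) % n) for i in range(n)]
    found = []
    for subset in combinations(range(n), n // 2):
        chosen = set(subset)
        # Keep the subset only if no cycle edge lies inside it.
        if not any(u in chosen and v in chosen for u, v in edges):
            found.append(chosen)
    return found

results = {n: half_size_independent_sets(n) for n in (4, 6, 8, 10)}
```

For each tested length the only sets found are the even-numbered and the odd-numbered vertices, so a set that misses a vertex from each class cannot reach size $|\varphi(v)|/2$.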
Using Claim 12 we finish the proof of Theorem 15. For each $v \in N_T(S)$, pick an edge $e = \{u, v\}$ such that $u \in S$. By Claim 12 the set $W$ contains fewer than half the vertices of $Q_{e,v}$, while the maximum independent set $J$ contains exactly half. Since the sets $Q_{e,v}$ considered for different vertices $v \in N_T(S)$ are disjoint, while Proposition 14 shows that from the other pieces of the minor model $W$ cannot use more vertices than $J$ does, it follows that $|W| \le |J| - |N_T(S)| \le |J| - c_1 \cdot d$. Hence the buffer contains at least $c_1 \cdot d$ tokens.
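The final count can also be written as a single chain. In this summary, $e_v$ denotes the edge picked for $v \in N_T(S)$ and $R$ denotes the set of vertices of $G$ outside all sets $Q_{e_v,v}$; both symbols are introduced here only for illustration.

```latex
\begin{align*}
|J| - |W|
  &= \sum_{v \in N_T(S)} \bigl( |J \cap Q_{e_v,v}| - |W \cap Q_{e_v,v}| \bigr)
     + \bigl( |J \cap R| - |W \cap R| \bigr) \\
  &\ge \sum_{v \in N_T(S)} 1 \; + \; 0
   \;=\; |N_T(S)| \;\ge\; c_1 \cdot d .
\end{align*}
```

The equality uses that the sets $Q_{e_v,v}$ are pairwise disjoint, the first inequality combines Claim 12 (each summand is a positive integer) with Proposition 14 applied to the remaining pieces, and the last inequality is the choice of $i_0$.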

Conclusion
In this paper we considered two types of reconfiguration rules for independent sets, involving simultaneously jumping tokens and reconfiguration with a buffer. For both models, we derived tight bounds on the corresponding reconfiguration thresholds in terms of several graph parameters such as the minimum vertex cover size, the minimum feedback vertex set size, and the pathwidth. Many results in the literature concerning the parameter pathwidth can be extended to hold for the parameter treewidth as well. This is not the case here; the upper bound on the TAR reconfiguration threshold in terms of pathwidth (Theorem 13) cannot be strengthened to treewidth, since one can make arbitrarily deep complete binary trees as BTD-minors in bipartite graphs of treewidth only two (see Figure 1b). On the other hand, there are bipartite graphs of large treewidth with TAR reconfiguration threshold two (Figure 2b). To characterize the TAR reconfiguration threshold one therefore needs to combine graph connectivity (as measured by the width parameters) with notions that constrain the parity of the connections in the graph. This is precisely why we introduced BTD-minors. We conjecture that the converse of Theorem 15 holds, in the sense that any hereditary graph class having a large TAR reconfiguration threshold must contain a graph having a complete binary tree of large depth as a BTD-minor. Our belief is based partially on the fact that a BTD-minor model of a deep complete binary tree is arguably the simplest graph of large pathwidth and feedback vertex number. Resolving this conjecture is our main open problem.