All-Subsets Important Separators with Applications to Sample Sets, Balanced Separators and Vertex Sparsifiers in Directed Graphs

Anand, Aditya; Lee, Euiwoong; Li, Jason; Saranurak, Thatchaphol

doi:10.4230/LIPIcs.ICALP.2025.12

All-Subsets Important Separators with Applications to Sample Sets, Balanced Separators and Vertex Sparsifiers in Directed Graphs

Aditya Anand

University of Michigan Ann Arbor, MI, USA Euiwoong Lee

University of Michigan Ann Arbor, MI, USA Jason Li

Carnegie Mellon University, Pittsburgh, PA, USA Thatchaphol Saranurak

University of Michigan Ann Arbor, MI, USA

Abstract

Given a directed graph $G$ with $n$ vertices and $m$ edges, a parameter $k$ and two disjoint subsets $S,T\subseteq V(G)$ , we show that the number of all-subsets important separators, which is the number of $A$ - $B$ important vertex separators of size at most $k$ over all $A\subseteq S$ and $B\subseteq T$ , is at most $\beta(|S|,|T|,k)=4^{k}\binom{|S|}{\leq k}\binom{|T|}{\leq 2k}$ , where $\binom{x}{\leq c}=\sum_{i=1}^{c}\binom{x}{i}$ , and that they can be enumerated in time $\mathcal{O}(\beta(|S|,|T|,k)k^{2}(m+n))$ . This is a generalization of the folklore result stating that the number of $A$ - $B$ important separators for two fixed sets $A$ and $B$ is at most $4^{k}$ (first implicitly shown by Chen, Liu and Lu Algorithmica ’09). From this result, we obtain the following applications:

1.

We give a construction for detection sets and sample sets in directed graphs, generalizing the results of Kleinberg (Internet Mathematics’ 03) and Feige and Mahdian (STOC’ 06) to directed graphs.
2.

Via our new sample sets, we give the first FPT algorithm for finding balanced separators in directed graphs parameterized by $k$ , the size of the separator. Our algorithm runs in time $2^{\mathcal{O}(k)}\cdot(m+n)$ .
3.

Additionally, we show a $\mathcal{O}(\sqrt{\log k})$ approximation algorithm for finding balanced separators in directed graphs in polynomial time. This improves the best known approximation guarantee of $\mathcal{O}(\sqrt{\log n})$ and matches the known guarantee in undirected graphs by Feige, Hajiaghayi and Lee (SICOMP’ 08).
4.

Finally, using our algorithm for listing all-subsets important separators, we give a deterministic construction of vertex cut sparsifiers in directed graphs when we are interested in preserving min-cuts of size upto $c$ between bipartitions of the terminal set. Our algorithm constructs a sparsifier of size $\mathcal{O}\left(\binom{t}{\leq 3c}2^{\mathcal{O}(c)}\right)$ and runs in time $\mathcal{O}\left(\binom{t}{\leq 3c}2^{\mathcal{O}(c)}(m+n)\right)$ , where $t$ is the number of terminals, and the sparsifier additionally preserves the set of important separators of size at most $c$ between bipartitions of the terminals.

Keywords and phrases:

directed graphs, important separators, sample sets, balanced separators

Category:

Track A: Algorithms, Complexity and Games

Funding:

Euiwoong Lee: Supported in part by NSF grant CCF-2236669 and Google.

Thatchaphol Saranurak: Supported by NSF Grant CCF-2238138. Partially funded by the Ministry of Education and Science of Bulgaria’s support for INSAIT, Sofia University “St. Kliment Ohridski” as part of the Bulgarian National Roadmap for Research Infrastructure.

Copyright and License:

2012 ACM Subject Classification:

Theory of computation

\rightarrow

Fixed parameter tractability ; Theory of computation

\rightarrow

Graph algorithms analysis

Editors:

Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

The study of parameterized algorithms, or fixed-parameter tractability (FPT) (see [9, 12]) along with the classical area of approximation algorithms, has emerged as one of the most promising ways to cope with NP-completeness. Given a decision problem $Q$ with input size $n$ , together with a parameter $\ell$ , a parameterized (or FPT) algorithm is an algorithm running in time $f(\ell)n^{\mathcal{O}(1)}$ that decides $Q$ . Over the last few years, parameterized algorithms have been studied for most well-studied NP-complete problems. A particular focus has been on graph cut optimization problems, including Multiway Cut, Multicut, Minimum Bisection, Feedback Vertex Set, Balanced Separator [28, 15, 5, 30, 11, 10, 22]. One of the most important tools for graph cut problems has been the technique of important separators (see [29] for a brief survey), which has formed a building block in parameterized algorithms for many of these and other problems [28, 5, 6, 7, 24, 25].

In this work, our contribution is two-fold. First, we show a new structural result on important separators. Next, we show that using this result, combined with other techniques, leads to interesting consequences. We show that one can compute sample sets in directed graphs. This in turn allows us to obtain both an FPT algorithm and an improved approximation algorithm for finding small balanced separators in directed graphs. We also show a deterministic construction of vertex cut sparsifiers in directed graphs. All our algorithms are simple and concise modulo standard results in approximation and parameterized algorithms.

Figure 1: Summary of contribution of our paper.

1.1 All-Subsets Important Separators

Important separators have proved to be a very important tool in the design of many FPT algorithms, including fundamental problems such as Multiway Cut, Multicut, Directed Feedback Arc Set. We refer to Chapter 8 of [9] for various applications of important separators to design parameterized algorithms. Given two disjoint subsets of vertices $A, B$ and another set of vertices $X$ (which may intersect $A, B$ ) in a directed graph, we say that $X$ is an $A$ - $B$ separator if there is no directed path from any vertex of $A$ to any vertex of $B$ in $G\setminus X$ ¹¹1 $G\setminus X$ denotes the graph obtained from $G$ by deleting the vertex set $X$ along with its incident edges. For such a separator $X$ , define $R(X)$ to be the set of vertices reachable from $A$ in $G\setminus X$ . Then an $A$ - $B$ separator $X$ is called an important $A$ - $B$ separator if it is (a) inclusion-wise minimal - so that for any $v\in X$ , $X\setminus\{v\}$ is not an $A$ - $B$ separator and (b) for any other $A$ - $B$ separator $Y\subseteq V(G)$ with $|Y|\leq|X|$ , we do not have $R(X)\subset R(Y)$ .²²2Throughout the paper, we use the notation $P\subset Q$ to mean that $P$ is a proper subset of $Q$ . Informally, important separators are those for which there is no other separator which is further away from $A$ without increasing the cut-size. Marx [28] first introduced the notion of important separators and showed a bound of $4^{k^{2}}$ on the number of important $A$ - $B$ separators of size at most $k$ . Later this bound was improved to $4^{k}$ (implicit in the FPT algorithm for Vertex Multiway Cut by Chen, Liu and Lu [6])³³3While these results focused mostly on the undirected version, the same proof works in directed graphs.. Over the last few years, this result has been used extensively to obtain FPT algorithms for various central parameterized problems [30, 7, 11, 25]. In this work, we show a new structural result on important separators in directed graphs: Given a directed graph $G$ and two disjoint subsets $S,T\subseteq V(G)$ , we show that the number of $A$ - $B$ important separators across all $A\subseteq S$ and $B\subseteq T$ is at most $\beta(|S|,|T|,k)=4^{k}\binom{|S|}{\leq k}\binom{|T|}{\leq 2k}$ , where $\binom{n}{\leq a}=\sum_{i=1}^{a}\binom{n}{i}$ . Note that the trivial bound is $4^{k}2^{|S|+|T|}$ , which follows directly from the fact that there are at most $4^{k}$ important $A$ - $B$ separators for any fixed $A\subseteq S,B\subseteq T$ . Previously, no such result is known even for undirected graphs.

Throughout this paper, given a graph, we will denote by $n$ the number of vertices and by $m$ the number of edges/arcs.

Theorem 1 (All-Subsets Important Separators).

Let $G$ be a digraph, $k$ be a positive integer and let $S\subseteq V(G)$ and $T\subseteq V(G)$ be disjoint sets of source and sink vertices. Then there are at most $\beta(|S|,|T|,k)=4^{k}\binom{|S|}{\leq k}\binom{|T|}{\leq 2k}$ $A$ - $B$ important separators of size $\leq k$ across all $A\subseteq S$ and $B\subseteq T$ . Further, they can be enumerated in time $\mathcal{O}(\beta(|S|,|T|,k)\cdot k^{2}\cdot(m+n))$ .

It is not difficult to show that this result is essentially tight: we show this formally in the next lemma, whose proof is deferred to the appendix.

Lemma 2.

For any positive integer $k$ , the following two statements hold.

1.

There exist infinitely many positive integers $c$ , such that for each $c$ , there is a directed graph $G_{c,k}$ and disjoint subsets $S,T\subseteq V(G)$ of vertices with $|T|=c$ and $|S|=1$ so that there are at least $\binom{|T|}{\leq k}$ $A$ - $B$ important separators of size at most $k$ across all choices $A\subseteq S$ , $B\subseteq T$ .
2.

There exist infinitely many positive integers $c$ , such that for each $c$ , there is a directed graph $G_{c,k}$ and disjoint subsets $S,T\subseteq V(G)$ of vertices with $|S|=c$ and $|T|=k+1$ so that there are at least $\binom{|S|}{\leq k}$ $A$ - $B$ important separators of size at most $k$ across all choices $A\subseteq S$ , $B\subseteq T$ .

In the next few subsections, we show that this simple structural result has various interesting consequences, allowing us to obtain many new results which were previously only known for undirected graphs.

1.2 Detection Sets and Sample Sets in Directed Graphs

Kleinberg [18] introduced the concept of a detection set in a graph. The principal motivation for this concept was that of a network failure: given a network, can we compute a small set of representative nodes so that for any small set of failure nodes that cuts communication between two large subsets of nodes, there are two representatives that cannot communicate? Formally, Kleinberg defined an $(\epsilon,k)$ detection set for an undirected graph $G$ as a set of terminals $T\subseteq V(G)$ which satisfies the following property. First, define a network failure as a set of vertices $X$ with $|X|\leq k$ so that $G\setminus X$ can be partitioned into $A\cup B$ where $|A|,|B|\geq\epsilon n$ and there are no edges between $A$ and $B$ . Then for every such (vertex) failure set $X$ , the set $T$ must intersect with at least two components of $G\setminus X$ . Kleinberg [18] showed that there is a detection set of size $\mathcal{O}(\frac{k^{3}}{\epsilon}\log\frac{1}{\epsilon})$ . For the edge failure version, he showed a bound of $\mathcal{O}(\frac{k}{\epsilon}\log\frac{1}{\epsilon})$ which was subsequently improved to $\mathcal{O}(\frac{k}{\lambda}\frac{1}{\epsilon}\log\frac{1}{\epsilon})$ by Kleinberg et al. [19], where $\lambda$ is the size of the global minimum cut in the graph. Feige and Mahdian [15] showed an improved bound on (vertex) detection sets: they show a bound of $\mathcal{O}(\frac{k}{\epsilon}\log\frac{1}{\epsilon})$ . Further, they showed a bound of $\mathcal{O}(\frac{k}{\epsilon})$ for the edge failure version, removing the dependence on $\log\frac{1}{\epsilon}$ .

Feige and Mahdian [15] also studied a strengthening of this notion of detection sets called sample sets for undirected graphs: on a high level, these are a small set of terminals $T$ which represent all the small cuts of the graph in proportion to their size, up to some additive error. Concretely, given an undirected graph $G$ and a parameter $k$ , they show that there exists a set of terminals $T$ with $|T|=\mathcal{O}(\frac{k}{\epsilon^{2}}\log\frac{1}{\epsilon})$ , such that for any vertex set $X$ of size $|X|\leq k$ and every connected component $C$ of $G\setminus X$ , we have $||C|-\frac{n}{|T|}|C\cap T||\leq\epsilon n$ . Further, the set $T$ can be obtained by simple random sampling, and [15] shows that any random subset $T\subseteq V(G)$ of size $\mathcal{O}(\frac{k}{\epsilon^{2}}\log\frac{1}{\epsilon})$ is a sample set with constant probability.

The most important feature of these results is that the size of the detection set (or sample set) $T$ does not depend on $n$ . It is then natural to ask if such results are possible for directed graphs. Given a digraph $G$ , let us define a network failure as a set of vertices $X$ with $|X|\leq k$ , so that $G\setminus X$ can be partitioned into two parts $A$ and $B$ , each of size at least $\epsilon n$ , such that there is no arc from $B$ to $A$ . The analogous question for detection sets is then: is there a set of terminals $T$ with $|T|=f(k,\epsilon)$ , such that for any network failure $X$ there exists a pair $t_{1},t_{2}\in T$ so that there is no path from $t_{1}$ to $t_{2}$ in $G\setminus X$ ? Similarly, the analogous question for sample set becomes: is there a set of terminals $T$ with $|T|=g(k,\epsilon)$ for some function $f$ , so that for any set of vertices $X$ with $|X|\leq k$ , and any strongly connected component (SCC) $C$ of $G\setminus X$ , we have $||C|-\frac{n}{|T|}|C\cap T||\leq\epsilon n$ ? We answer both these questions in the affirmative, showing that one can in fact asymptotically match the same bound as in the undirected case, with $f(k,\epsilon)=\mathcal{O}(\frac{k}{\epsilon}\log\frac{1}{\epsilon})$ and $g(k,\epsilon)=\mathcal{O}(\frac{k}{\epsilon^{2}}\log\frac{1}{\epsilon})$ .

Theorem 3.

For any directed graph $G$ , given parameters $\epsilon\in(0,1)$ and $k\in\mathbb{N}$ , there is an absolute constant $c$ such that there is an $(\epsilon,k)$ detection set of size $f(k,\epsilon)=c\frac{k}{\epsilon}\log\frac{1}{\epsilon}$ . Further, a random set of $f(k,\epsilon)$ vertices must be an $(\epsilon,k)$ detection set with probability at least $\frac{2}{3}$ .

Theorem 4.

For any directed graph $G$ , given parameters $\epsilon\in(0,1)$ and $k\in\mathbb{N}$ , there is an absolute constant $c$ such that there is an $(\epsilon,k)$ sample set of size $f(k,\epsilon)=c\frac{k}{\epsilon^{2}}\log\frac{1}{\epsilon}$ . Further, a random set of $f(k,\epsilon)$ vertices must be an $(\epsilon,k)$ sample with probability at least $\frac{2}{3}$ .

While we believe that this is of independent interest, we also show a similar connection to parameterized and approximation algorithms for finding balanced cuts in directed graphs along the lines of the undirected case as in [15], as discussed in the following subsections. In fact, by using a slightly more nuanced analysis, our results generalize that of [15] even for undirected graphs.

Finally, we note that one can easily prove that there exists an absolute constant $c$ such that a random subset of size $c\frac{k\log n}{\epsilon^{2}}$ is a sample set with constant probability in a directed graph. This follows from a simple application of Chernoff bounds and a union bound noting that the number of vertex sets of size at most $k$ is at most $\binom{n}{k}$ ( $\binom{m}{k}$ in the case of edge sets). However, for most applications this bound is too weak. For instance, our parameterized algorithm for Directed Balanced Separator has an exponential dependency in the size of the sample set, and hence to show results parameterized by $k$ , it is essential that the size of the sample set does not depend on $n$ .

1.3 Directed Balanced Cuts

One of the most well-studied graph partitioning problems, both from the parameterized and approximation algorithms point of view, is the problem of Minimum Bisection. Given an undirected graph $G$ and a parameter $k$ , the goal of the Minimum Bisection problem is to obtain a partition of the graph into two equal sized parts, such that the number of cut edges is at most $k$ . Minimum Bisection was shown to be fixed-parameter tractable by Cygan et al. [11] and the current best parameterized algorithm is due to [10] who show an algorithm with running time $2^{\mathcal{O}(k\log k)}n^{\mathcal{O}(1)}$ . Räcke [33] gave an $\mathcal{O}(\log n)$ approximation for (the optimization version of) Minimum Bisection. If only an approximate bisection where both sides have $\Omega(n)$ vertices is desired, then this problem is essentially the Balanced Separator problem which is known to have an FPT algorithm running in time $2^{\mathcal{O}(k)}(m+n)$ due to Feige and Mahdian [15] and an $\mathcal{O}(\sqrt{\log n})$ approximation algorithm in polynomial time using the seminal result of Arora, Rao and Vazirani [3]. Feige et al. [14] showed that in fact this guarantee can be made $\mathcal{O}(\sqrt{\log k})$ , and also showed that one can compute vertex separators with the same approximation ratio. In directed graphs, the Minimum Bisection problem has been typically studied as Directed Bisection: Given a directed graph $G$ with even number of vertices, is it possible to partition the vertex set into two equal parts $A$ and $B$ so that the number of arcs from $A$ to $B$ is at most $k$ ? This question was first raised by Feige and Yahalom [16]. When $k=0$ , they referred to this problem as Oneway Cuts, and showed that even this problem is NP-hard. However, given a Oneway Cuts instance which admits a solution, if one relaxes the requirement so that the algorithm can output a partition $(A^{\prime},B^{\prime})$ of $V(G)$ such that there are no arcs directed from $A^{\prime}$ to $B^{\prime}$ and $||A^{\prime}|-|B^{\prime}||\leq\epsilon n$ where $\epsilon=\Omega(\frac{1}{\log n})$ , the problem now becomes tractable, and they show a polynomial time algorithm for Oneway Cuts. Madathil et al. [27] showed that the Directed Bisection problem is FPT with respect to $k$ , even when one requires $|A|=|B|$ exactly, on a subclass of directed graphs called semi-complete digraphs, which are the class of directed graphs where for every pair of vertices $u$ and $v$ , there is an arc from $u$ to $v$ or an arc from $v$ to $u$ . In terms of approximation algorithms, Agarwal et al. [1] showed an $\mathcal{O}(\sqrt{\log n})$ approximation for Directed Balanced Separator , the directed analogue of Balanced Separator, while Even et al. [13] showed an $\mathcal{O}(\log k)$ approximation, which is the best known approximation guarantee depending only on $k$ .

However, there is no prior work on the fixed-parameter tractability of Directed Balanced Separator or Directed Bisection for general $k$ on general directed graphs, even when we relax the requirement of finding a bisection to that of finding an approximate bisection, that is, find a partition $(A^{\prime},B^{\prime})$ with $|A^{\prime}|,|B^{\prime}|=\Omega(n)$ so that the number of arcs from $A^{\prime}$ to $B^{\prime}$ is at most $k$ .

Our result makes the first progress on this problem. Before we state our results, we define the Directed Balanced Separator problem formally. Since all our results work for both the vertex and edge versions, we state them together as one problem. We adapt our definition from the definition of Balanced Separator in [15], whose results we generalize. For completeness, we recall the definition of Balanced Separator in [15].

Balanced Separator Parameter: $k, b$ Input: Undirected graph $G=(V,E)$ Question: Is there a set of vertices (edges) $F$ with $|F|\leq k$ , so that in $G\setminus F$ , every connected component has size at most $b n$ ?

Directed Balanced Separator Parameter: $k, b$ Input: Directed graph $G=(V,E)$ Question: Is there a set of vertices (arcs) $F$ with $|F|\leq k$ , so that in $G\setminus F$ , every strongly connected component has size at most $b n$ ?

For the sake of clarity and comparison, we state the main result of [15]. Given an undirected graph $G$ , we say that a set of vertices/edges $F$ is a $b$ -balanced separator if every connected component of $G\setminus F$ has size at most $b n$ .

Theorem 5 ([15]).

Given an instance of Balanced Separator with $\frac{2}{3}\leq b\leq 1$ there is a randomized algorithm, that for any $\epsilon>0$ , runs in time $2^{\mathcal{O}\left(k\log\left(\frac{1}{\epsilon}\right)/{\epsilon^{2}}\right)% }(m+n)$ and with constant probability outputs either (a) a set of vertices (edges) $F^{\prime}$ of size at most $k$ such that every connected component of $G\setminus F^{\prime}$ has size at most $(b+\epsilon)n$ or (b) concludes correctly that there is no $b$ -balanced separator of size at most $k$ .

The following theorem, which directly generalizes the result of Theorem 5 is our main result. Given a directed graph $G$ , we say that a set of vertices/arcs $F$ is a $b$ -balanced separator if every strongly connected component (SCC) of $G\setminus F$ has size at most $b n$ .

Theorem 6.

Given an instance of Directed Balanced Separator there is a randomized algorithm, that for any $\epsilon>0$ , runs in time $2^{\mathcal{O}\left(k\min\{\log\frac{1}{b},\log k\}\log\frac{1}{\epsilon}/% \epsilon^{2}\right)}(m+n)$ and with constant probability outputs either (a) a set of vertices (arcs) $F^{\prime}$ of size at most $k$ such that every strongly connected component of $G\setminus F^{\prime}$ has size at most $(b+\epsilon)n$ or (b) concludes correctly that there is no $b$ -balanced separator of size at most $k$ .

We observe that our algorithm has a running time of $2^{\mathcal{O}\left(k\log\left(\frac{1}{\epsilon}\right)/{\epsilon^{2}}\right)% }(m+n)$ for any $b=\Omega(1)$ , matching the run-time of Theorem 5 for $\frac{2}{3}\leq b\leq 1$ while also extending to any parameter $b\in(0,1)$ . We also observe that Theorem 6 implies Theorem 5, since given any undirected graph $G$ , one can create a directed graph $H$ on the same vertex set, so that for every edge $\{u,v\}\in E(G)$ , we have the two arcs $(u,v),(v,u)\in E(H)$ and we can apply Theorem 6 to obtain Theorem 5.

Our algorithm for Directed Balanced Separator can be used to solve (approximate) Directed Bisection as well. To see this, given a graph $G$ , observe that in any bisection $(A,B)$ so that the number of arcs going from $A$ to $B$ is at most $k$ , the set $F$ of these at most $k$ arcs form a $\frac{1}{2}$ -balanced separator, so that in $G\setminus F$ , every strongly connected component is of size at most $\frac{n}{2}$ . Using Theorem 6 with $b=\frac{1}{2}$ , we can find a set of arcs $F^{\prime}$ with $|F^{\prime}|\leq k$ that forms a $(\frac{1}{2}+\epsilon)$ balanced separator. Finally, note that the strongly connected components of the graph $G\setminus F^{\prime}$ form a Directed Acyclic Graph (DAG), and each strongly connected component of $G\setminus F^{\prime}$ has size at most $(\frac{1}{2}+\epsilon)n$ . It follows that there is a topological ordering $\{C_{1},C_{2}\ldots C_{\ell}\}$ of these strongly connected components of $G\setminus F^{\prime}$ , such that there is no arc from $C_{j}$ to $C_{i}$ for $i,j\in[\ell]$ with $j>i$ . Therefore we can pick some prefix of strongly connected components in the topological ordering of $G\setminus F^{\prime}$ to obtain a set $A^{\prime}$ , so that both $|A^{\prime}|,|V(G)\setminus A^{\prime}|\geq(1-2\epsilon)\frac{n}{4}$ , and in $G\setminus F^{\prime}$ , there are no arcs from $V(G)\setminus A^{\prime}$ to $A^{\prime}$ . It follows that the set of arcs $F^{\prime}$ forms an (approximate) directed bisection. Note that the approximation is only in the balance, not in the number of arcs cut.

Next, we show a $\mathcal{O}(\sqrt{\log k})$ approximation for Directed Balanced Separator. This improves both the $\mathcal{O}(\sqrt{\log n})$ approximation of [1] and the $\mathcal{O}(\log k)$ approximation given by Even et al. [13] for approximating Directed Balanced Separator in polynomial time.

Theorem 7.

There is an $\mathcal{O}(\sqrt{\log k})$ approximation to Directed Balanced Separator in polynomial time. Formally, given an instance of Directed Balanced Separator with $b=\Omega(1)$ , suppose there is a set of vertices (arcs) $F$ with $|F|\leq k$ , so that every strongly connected component of $G\setminus F$ has at most $b n$ vertices. Then there is a polynomial time randomized algorithm that with constant probability finds a set of vertices (arcs) $F^{\prime}$ with $|F^{\prime}|\leq\mathcal{O}(k\sqrt{\log k})$ so that in $G\setminus F^{\prime}$ , every strongly connected component has size at most $b^{\prime}n$ for some $b^{\prime}<1$ depending on $b$ .

Note that for this theorem, we do not optimize $b^{\prime}$ , unlike our FPT result where we were able to show that $b^{\prime}\leq b+\epsilon$ for some suitably chosen $\epsilon$ . Still, we remark that is not a limitation of our framework, but inherent in [1] due to the use of the ARV separation theorem [3].

1.4 Vertex Cut Sparsifiers

Vertex sparsification is a fundamental problem in various settings. Broadly, given a graph and a terminal set $T$ , a vertex sparsifier is a smaller graph (with size typically depending only on $|T|$ ) that preserves some (cut based) property of the terminals $T$ . Moitra [32] first introduced a version of vertex sparsification in undirected graphs. Chuzhoy [8] generalized this notion by allowing Steiner nodes in the sparsifier. Both these notions preserve edge cuts between bi-partitions of the terminal set.

For our setting, we focus on directed graphs and vertex cuts. Given a directed graph $G$ and a set of terminals $T$ along with an integer $c$ , a $(c,T)$ vertex cut sparsifier for $G$ is another graph $G^{\prime}$ with $T\subseteq V(G^{\prime})$ , so that for every partition $\{A,B\}$ of $T$ (so that $A\cup B=T$ and $A\cap B=\emptyset$ ), if the size of an $A$ - $B$ vertex min-cut is at most $c$ in $G$ , then the size of an $A$ - $B$ vertex min-cut is the same in both in $G$ and $G^{\prime}$ . In other words, the goal is to find a vertex sparsifier that preserves vertex min-cuts of size at most $c$ between sets of terminals. Here we allow the deletion of terminals as well.

Kratsch and Wahlström [20, 21] first studied this notion of vertex sparsification for vertex cuts (without the parameter $c$ ) and showed that given $G, T$ , there is a randomized polynomial time algorithm that obtains a $(|T|,T)$ vertex sparsifier $G^{\prime}$ for $G$ with $|V(G^{\prime})|\leq\mathcal{O}(|T|^{3})$ . This bound was improved to $\mathcal{O}(|T|^{2})$ by [17] for the special case of directed acyclic graphs. Their algorithm runs in linear time for fixed $|T|$ , but is still randomized as it needs to compute a representation of gammoids.

One can then ask the question as to what is known about deterministic algorithms. Recently, Misra et al. [31] showed that a representation of a gammoid with rank $r$ over a ground set of $m$ elements can be constructed in time $\mathcal{O}(\binom{m}{r}m^{\mathcal{O}(1)})$ deterministic time. The technique of [20] needs to compute a representation of a gammoid whose ground set has size $n$ , the number of vertices and rank $|T|$ , the number of terminals. We therefore obtain the following result.

Theorem 8 ([20], [31]).

Given a $n$ -vertex graph $G$ and a terminal set $T$ , there is a deterministic algorithm that runs in time $\mathcal{O}(\binom{n}{|T|}n^{\mathcal{O}(1)})$ and computes a $(|T|,T)$ vertex cut sparsifier for $G$ of size at most $\mathcal{O}(|T|^{3})$ .

However, this algorithm (which is an XP algorithm in the notation of parameterized complexity) can be easily improved to a (still deterministic) FPT algorithm. The reason for this is simple: there are at most $2^{|T|}$ partitions of the terminal set $T$ . For each partition $A\cup B$ , we can compute an $(A,B)$ directed min-vertex cut $M$ . Note that this min-cut is of size at most $|T|$ , since one can simply delete all the terminals to obtain a cut. This gives a set $X$ of $2^{|T|}\cdot|T|$ vertices. It can now be shown that we can apply the closure operation to the set $V(G)\setminus X$ , where we simply delete all vertices of $V(G)\setminus X$ , and for each vertex pair $(u,w)$ such that there are vertices $(v_{1},v_{2})$ with $v_{1},v_{2}\in V(G)\setminus X$ with $(u,v_{1}),(v_{2},w)\in E(G)$ we add an arc $(u,w)$ . This results in a sparsifier of size $\mathcal{O}(|T|2^{|T|})$ . One can now set $n=\mathcal{O}(|T|2^{|T|})$ in Theorem 8 to obtain a sparsifer of size $2^{\mathcal{O}(|T|^{2})}$ in time $2^{\mathcal{O}(|T|^{2})}$ . The total running time is $2^{\mathcal{O}(T)}(m+n)+2^{\mathcal{O}(|T|^{2})}$ .

A similar line of research considers edge and vertex cuts in undirected graphs. For edge cuts in undirected graphs, Chalermsook et al. [4] showed an upper bound of $\mathcal{O}(|T|c^{4})$ which was later improved to $\mathcal{O}(|T|c^{3})$ by Liu [23]. Both these algorithms are randomized. Saranurak and Yingchareonthawornchai [34] considered the vertex version in undirected graphs, and gave a deterministic algorithm to compute a sparsifier of size $\mathcal{O}(|T|2^{\mathcal{O}(c^{2})})$ in time $\mathcal{O}(m^{1+o(1)}2^{\mathcal{O}(c^{2})})$ .

However, no improvement on the FPT algorithm with running time $2^{O(|T|)}(m+n)+2^{O(|T|^{2})}$ based on Theorem 8 is known for deterministic algorithms for vertex cut sparsifiers in directed graphs. We show the following result, which shows that if one is interested in preserving cuts of size at most $c$ , then a better deterministic algorithm is possible.

Theorem 9.

Given a directed graph $G$ , a set of terminals $T$ and integer $c$ , there is a deterministic algorithm that runs in time $\mathcal{O}(\psi(|T|,c)\cdot(m+n))$ and computes a $(c,T)$ vertex sparsifier $G^{\prime}$ for $G$ of size at most $\psi(|T|,c)$ , where $\psi(|T|,c)=\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}$ . Additionally, $G^{\prime}$ satisfies $T\subseteq V(G^{\prime})\subseteq V(G)$ , with the property that for every partition $A\cup B$ of $T$ and every subset $X\subseteq V(G)$ with $|X|\leq c$ , $X$ is an $A$ - $B$ important separator in $G$ if and only if $X\subseteq V(G^{\prime})$ and $X$ is an $A$ - $B$ important separator in $G^{\prime}$ .

While the dependence on $c$ in the exponent for the size of the sparsifier seems undesirable, Theorem 9 shows that the vertex sparsifier $G^{\prime}$ constructed by our algorithm is more powerful: it in-fact “preserves” all $A$ - $B$ important separators for any partition $(A,B)$ of $T$ . Additionally, if one wants only a vertex sparsifier, we note that simply running the algorithm in Theorem 8 after applying our result in Theorem 9 gives a vertex sparsifier of size $\mathcal{O}(|T|^{3})$ , but now the running time is $\left(\frac{|T|}{c}\right)^{\mathcal{O}(c|T|)}$ . Thus, together, we obtain a vertex cut sparsifier of size $\mathcal{O}(|T|^{3})$ in time $\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}(m+n)+\left(\frac{|T|}{c}\right)^{% \mathcal{O}(c|T|)}$ .

Corollary 10.

Given a graph $G$ and a terminal set $T\subseteq V(G)$ , there is a deterministic algorithm that runs in time $\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}(m+n)+\left(\frac{|T|}{c}\right)^{% \mathcal{O}(c|T|)}$ and obtains a $(c,T)$ vertex cut sparisifer for $G$ with $\mathcal{O}(|T|^{3})$ vertices.

1.5 Techniques and Overview

In this section we give a high-level overview of our techniques. For the sake of simplicity and clarity we sometimes omit small details in this section.

All-Subset Important separators

Our result on counting all-subset important separators is obtained using the connection between important separators and closest sets. For simplicity, in this overview, suppose that we are given a single source $s$ and a set of target vertices $T$ . Our goal in this simplified setting shall be to bound the number of $(s,B)$ important separators of size exactly $k$ across all $B\subseteq T$ , and we will show a weaker bound of $4^{k}\binom{|T|}{\leq{(k+1)^{2}}}$ . Before we describe the main idea, we remark that our eventual goal will be to show that every $(s,B)$ important separator of size $k$ for some $B\subseteq T$ is an $(s,B^{\prime})$ important separator for some $B^{\prime}\subseteq B$ with $|B^{\prime}|\leq(k+1)^{2}$ . It is easy to see that this suffices, since there are only $\binom{|T|}{\leq(k+1)^{2}}$ choices for $B^{\prime}$ , and there are at most $4^{k}$ such $(s,B^{\prime})$ important separators of size $k$ for a fixed choice of $B^{\prime}$ .

Fix $B\subseteq T$ , and consider an $(s,B)$ important separator $X$ of size $k$ . Consider the (vertex) min-cut $X^{\prime}$ between $X$ and $B$ ⁴⁴4Throughout this paper, we allow deletion of source and sink vertices in vertex cuts. If this min-cut $X^{\prime}$ has size strictly less than $|X|=k$ , $X$ cannot be important: $X^{\prime}$ would allow more vertices to be reachable from $s$ while having a size strictly less than $k$ . This means that $X$ by itself must be a $(X,B)$ min-cut. Using simple cut-flow duality, it must be the case that there are $|X|=k$ vertex disjoint paths from $X$ to (some) vertices in $B$ .

However, this flow property can be strengthened: it is easy to see that we can in fact assume that every such important separator $X$ is the $(X,B)$ min-cut “closest” to $B$ . (We say $X$ is closest to $B$ if $X$ is the unique $(X,B)$ min-cut.) But it can be shown that closest sets have a stronger flow property: we can now conclude that for every vertex $v\in X$ , there are $k+1$ paths from $X$ to $B$ , which are vertex disjoint except that two of these paths both start at $v$ (Lemma 12).

Why does this help? Fix a $v\in X$ , and fix the $k+1$ paths from $X$ to $B$ . Suppose the $k+1$ endpoints of these paths are $B^{*}_{v}\subseteq B$ . Consider any set of vertices $X^{\prime}$ that is a $(s,B^{*}_{v})$ separator of size $\leq k$ that is “closer” to $B^{*}_{v}$ than $X$ . (Here “closer” means that the set of vertices reachable from $s$ in $G\setminus X^{\prime}$ is a superset of that in $G\setminus X$ ). Then $X^{\prime}$ must delete $v$ ! For if not, since there are $k+1$ paths from $X$ to $B^{*}_{v}$ which are vertex disjoint except $2$ paths which begin at $v$ , and $|X^{\prime}|\leq k$ , $X^{\prime}$ cannot be a $(s,B^{*}_{v})$ separator.

Applying the above argument for every $v\in X$ and letting $B^{\prime}:=\cup_{v\in X}B^{*}_{v}$ allows us to conclude that there is no $(s,B^{\prime})$ separator $X^{\prime}$ that is closer to $B^{\prime}$ than $X$ . Thus, in fact, among all the sets of size at most $k$ separating $s$ from $B^{\prime}$ , $X$ is a set which is closest to $B^{\prime}$ . This means $X$ must be an important $(s,B^{\prime})$ separator. But $|B^{\prime}|=|\cup_{v\in X}B^{*}_{v}|\leq k(k+1)\leq(k+1)^{2}$ , and hence we are done. Our actual result uses a slightly more subtle argument using augmenting paths to obtain such a $B^{\prime}$ with $|B^{\prime}|\leq 2k$ .

Here we remark that Lemma 4.10 of Lokshtanov et al. [26] showed a similar result: it claims that there is a set $B^{\prime}\subseteq B$ with $|B^{\prime}|\leq k+1$ such that any $(s,B)$ important separator is an $(s,B^{\prime})$ important separator. However, there is a gap in the proof which leads to an unavoidable factor of $2$ loss. Indeed, as we show in Lemma 30, there is a graph $G$ , source vertex $s\in V(G)$ and $B\subseteq V(G)$ for which $|B^{\prime}|=2k$ is necessary, and hence our result is tight. Our high-level idea is flow-based, similar to [26]. But as described above, the direct application of this flow-based idea gives a bound of $k(k+1)$ , which we refine using an augmenting path based argument to obtain the tight bound of $2k$ .

Detection sets and Sample sets for Directed Graphs

To obtain detection sets and sample sets for directed graphs, on a high level, we follow the approach of Feige and Mahdian [15]. We focus on sample sets in this brief overview. First, we consider the family of sets that arise as a strongly connected component after deleting an arbitrary $k$ vertices in the given digraph. Our key contribution is to show that this family has VC-dimension $\mathcal{O}(k)$ . Using the standard results on $\epsilon$ -samples, similar to that in [15], it then follows that a random set of $\mathcal{O}(\frac{k}{\epsilon^{2}}\log\frac{1}{\epsilon})$ vertices is a sample set with constant probability. Thus the question becomes: how can we bound the VC-dimension of the family of sets $\SS$ formed by strongly connected components after deleting some $k$ vertices? We note that the techniques in [15] to bound the VC-dimension do not extend to directed graphs, and hence a different approach is needed. We accomplish this by showing a connection to our result on all-subset important separators.

Recall the definition of VC-dimension: given a set family $\SS$ on a universe $U$ , the VC-dimension is the size of the largest set $U^{\prime}\subseteq U$ shattered by $\SS$ . Here $U^{\prime}$ is shattered by $\SS$ if for every subset $U^{\prime\prime}\subseteq U^{\prime}$ there exists an $S\in\SS$ with $|U^{\prime}\cap S|=U^{\prime\prime}$ . Thus to prove a bound of $\mathcal{O}(k)$ for VC-dimension for our case, it is enough to show that there exists some constant $c$ so that any set of terminals $T\subseteq V(G)$ with $|T|\geq ck$ cannot be shattered by the set family $\SS$ which consists of the strongly connected components $C$ in $G\setminus F$ across all sets $F\subseteq V(G)$ with $|F|\leq k$ . Fix any $T$ of size $\geq ck$ for large enough $c$ which we choose later. For the sake of contradiction, assume that the set $T$ can be shattered. Fix a “pattern” $P\subseteq T$ , $P\neq\emptyset$ . Since $T$ can be shattered by $\SS$ , there exists a set $F$ of at most $k$ vertices, so that in $G\setminus F$ , there is a strongly connected component $C$ that satisfies $C\cap T=P$ . Fix some terminal $t\in P$ . Then it must be the case that $P$ is the exact set of terminals that can both reach $t$ and can be reached from $t$ in $G\setminus F$ . Equivalently, it is the exact set of terminals that can be reached from $t$ in both $G\setminus F$ and $G^{R}\setminus F$ , where $G^{R}$ is obtained by reversing every arc of $G$ . This motivates us to define $\mathrm{Reach}(t)$ for each $t\in T$ , the collection of subsets of terminals that can be reached from $t$ after deleting a set $F$ of size at most $k$ . Formally,

	$\displaystyle\mathrm{Reach}(t)=\{Q\subseteq T\mid\,$	there exists $F\subseteq V(G)$ , $\|F\|\leq k$ such that $Q$ is reachable from $t$
		$\displaystyle\text{while $T\setminus Q$ is unreachable from $t$ in $G\setminus F% $}\}$

If we can bound $|\mathrm{Reach}(t)|$ for any graph by a function $B(|T|,k)$ , then applying this result twice, once on $G$ and once on $G^{R}$ , will bound the number of such patterns $P$ with $t\in P$ by $(B(|T|,k))^{2}$ . Then a simple union bound over all $t\in T$ will give that the number of patterns $P\subseteq T$ is at most $|T|B(|T|,k))^{2}$ . If this is less than $2^{|T|}$ whenever $|T|\geq ck$ , then we have a contradiction to the fact that $T$ can be shattered. Thus the key question is to obtain $B(|T|,k)$ , a good bound on $|\mathrm{Reach}(t)|$ .

We do this as follows. Suppose $T^{\prime}\in\mathrm{Reach}(t)$ some $T^{\prime}\subseteq T$ . Then there exists a set $F$ of at most $k$ vertices after whose deletion the set of reachable terminals from $t$ is exactly $T^{\prime}$ . We then show that there is a $(t,T\setminus T^{\prime})$ important separator $X$ of size at most $k$ whose deletion gives the same reachability set $T^{\prime}$ as that after deleting $F$ . But now using our bound on important separators, we can show that $B(|T|,k)\leq 4^{k}\binom{|T|}{\leq 2k}$ . It is then easy to see that $|T|B(|T|,k)^{2}<2^{|T|}$ whenever $|T|\geq ck$ for some large enough (absolute) constant $c$ .

FPT algorithm for Directed Balanced Separators

To obtain the algorithm for Directed Balanced Separator, we follow the approach of [15] in the undirected case. For simplicity, in this overview, we work with vertex separators, assume that $b=\frac{1}{2}$ , and we will only return a $b^{\prime}=(\frac{3}{4}+2\epsilon)$ -balanced separator (as opposed to the $(b+\epsilon)$ balance guaranteed in our result). We are given that there is a set of vertices $F$ with $|F|\leq k$ , so that every strongly connected component of $G\setminus F$ has at most $\frac{1}{2}n$ vertices. We first compute an $(\epsilon,k)$ sample set $T$ of size $\mathcal{O}\left(\frac{k\log\frac{1}{\epsilon}}{\epsilon^{2}}\right)$ using our result on sample sets, Theorem 4. Since $T$ is a sample set, each SCC of $G\setminus F$ has at most $(\frac{1}{2}+\epsilon)|T|$ terminals. We will try to find a set of vertices $F^{\prime}$ such that $|F^{\prime}|\leq k$ and every SCC of $G\setminus F^{\prime}$ has at most $(\frac{3}{4}+\epsilon)|T|$ terminals. The property of sample sets will then again imply that $F^{\prime}$ is a $(\frac{3}{4}+2\epsilon)$ -balanced separator.

Consider the toplogical ordering $C_{1},C_{2}\ldots C_{\ell}$ of the SCC’s of $G\setminus F$ , so that there is no arc from $C_{j}$ to $C_{i}$ for $i,j\in[\ell]$ with $j>i$ . Consider the smallest index $i^{*}$ so that the union $L=\bigcup_{i=1}^{i^{*}}C_{i}$ contains at least $(\frac{1}{4}-\epsilon)|T|$ terminals. Since no component contains more than $(\frac{1}{2}+\epsilon)|T|$ terminals, it follows that $L$ contains at most $\frac{3}{4}|T|$ terminals. Also, by definition, $V(G)\setminus L$ contains at most $(\frac{3}{4}+\epsilon)|T|$ terminals.

The algorithm proceeds as follows. First, guess $T\cap L$ and $T\cap(V(G)\setminus L)$ . This can be done in time $2^{\mathcal{O}\left(\frac{k\log\frac{1}{\epsilon}}{\epsilon^{2}}\right)}$ since $|T|=\mathcal{O}\left(\frac{k\log\frac{1}{\epsilon}}{\epsilon^{2}}\right)$ . Next, we compute a directed min-vertex cut between $T\cap(V(G)\setminus L)$ and $T\cap L$ . Since $F$ is a vertex cut between these sets of size at most $k$ , this min-cut $F^{\prime}$ must be of size at most $k$ as well. But then the set of vertices $F^{\prime}$ is a $(\frac{3}{4}+\epsilon)$ -balanced separator with respect to the set $T$ . Using the property of sample sets, it follows that $F^{\prime}$ is a $(\frac{3}{4}+2\epsilon)$ -balanced separator in $G$ .

In order to make the loss in the balance factor only an additive $\mathcal{O}(\epsilon)$ we need a slightly more sophisticated argument. We accomplish this using a reduction to the Skew Separator problem, where we are given pairs $(s_{i},t_{i}),i\in[\ell]$ , and the goal is to separate $s_{i}$ from all $t_{j}$ , $j\leq i$ , for each $i$ , by deleting minimum number of vertices. This problem, which is a special case of Directed Multicut, was first defined by [5] in their FPT algorithm for Directed Feedback Arc Set.

Approximation algorithm for Directed Balanced Separators

Our $\mathcal{O}(\sqrt{\log k})$ approximation algorithm proceeds similar to the FPT algorithm. The only key technical difference is that to compute a balanced separator with respect to the terminal set, we cannot guess where the terminals of the sample set are (this requires FPT time); instead we use the algorithm of [1] to obtain a balanced separator with respect to the sample set. While the algorithm of [1] has an approximation ratio of $\mathcal{O}(\sqrt{\log n})$ , when we need a balanced separator with respect to a terminal set $T$ , this algorithm can be made to work with an approximation ratio of $\mathcal{O}(\sqrt{\log|T|})$ . On a high level, the reason is quite simple: the structure theorem of Arora, Rao and Vazirani [3] is a statement about vectors. As long as we apply the structure theorem to only $|T|$ vectors, we get an approximation ratio of $\mathcal{O}(\sqrt{\log|T|})$ .

Vertex Sparsifiers

Our result on vertex sparsifiers is similar in spirit to the result by Kratsch and Wahlström [20] and proceeds by identifying irrelevant vertices. However instead of using techniques based on matroids, we use our result on all-subsets important separators. Given a terminal set $T$ , we are interested in preserving cuts across partitions of the terminal set of size up to $c$ . We will define an irrelevant vertex as a vertex that is (a) not in $T$ and (b) not in any $A$ - $B$ important separator of size $\leq c$ for any partition $A\cup B$ of $T$ . By our result on all-subsets important separators, we can bound the number of relevant (non-irrelevant) vertices by $\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}$ , and identify this set in time $\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}(m+n)$ . Thus at least $|V(G)|-|T|-\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}$ vertices are irrelevant. Let $I$ be the set of irrelevant vertices. We now apply the standard operation of “closing” the irrelevant vertices $I$ , which is to delete each $v\in I$ , and for every pair $(w,y)$ where $w$ is an in-neighbour of $v_{1}$ and $y$ is an out-neighbour of $v_{2}$ for some $v_{1},v_{2}\in I$ , to add the arc $(w,y)$ . This operation is the vertex equivalent of edge contraction. The key observation here is that the closure operation can be applied to the entire set $I$ at once as opposed to applying it to one vertex of $I$ and restarting the whole algorithm. This helps us achieve a linear-time algorithm. Thus we obtain another graph $G^{\prime}$ with $V(G^{\prime})\subseteq V(G)$ , where $|V(G^{\prime})|\leq|V(G)\setminus I|\leq\binom{|T|}{\leq 3c}2^{\mathcal{O}(c)}$ .

In this short version, in the interest of space, we present our result on all-subsets important separators and our result on sample sets. Our other results can be found in the linked full version.

2 All-Subsets Important Separators

The goal of this section is to prove our result on all-subsets important separators. See 1

As described in the introduction, our proof proceeds by capturing the relationship between important separators and closest sets. We then use well-known connectivity properties of closest sets to obtain our result.

Definition 11 (Closest set).

For sets of vertices $X,T\subseteq V(G)$ , $X$ is closest to $T$ if $X$ is the unique vertex mincut between $X$ and $T$ .

We use the following well-known fact about closest cuts.

Lemma 12 (Lemma 18 of [17]).

$X$ is closest to $T$ if and only if for each vertex $v\in X$ , there are $|X|+1$ paths from $X$ to $T$ that are vertex-disjoint except at $v$ , which appears in exactly two of these paths.

Definition 13 (Minimal separators).

Given $X,S,T\subseteq V(G)$ such that $S\cap T=\emptyset$ , $X$ is called an $S$ - $T$ (inclusion-wise) minimal separator if in $G\setminus X$ , there is no path from any $s\in S$ to any $t\in T$ , and further, for every $v\in X$ , there is a path between some $s\in S$ and $t\in T$ in $G\setminus(X\setminus\{v\})$ .

Definition 14.

Given $X,S\subseteq V(G)$ , the reachability set of the source vertices $S$ after removing $X$ , that is, the set of all vertices $v$ reachable from some vertex of $S$ via a directed path after removing $X$ , is denoted by $R^{G}_{S}(X)$ .

Definition 15.

Given $X,S,T\subseteq V(G)$ with $S\cap T=\emptyset$ , a minimal $S$ - $T$ separator $X$ is called an $S$ - $T$ important separator if for every $S$ - $T$ separator $X^{\prime}\subseteq V(G)$ with $|X^{\prime}|\leq|X|$ , $R^{G}_{S}(X^{\prime})\supset R^{G}_{S}(X)$ does not hold.

We will drop the subscript or the superscript in $R^{G}_{S}(X)$ when the graph or the set of source vertices is clear from the context.

Informally, the idea of important separators is similar to that of a closest set: they are separators which cannot be pushed “closer” to the sink, without increasing the cut-size. The next lemma captures this relationship formally.

Lemma 16.

Given $X,S,T\subseteq V(G)$ , $X$ is an important $S$ - $T$ separator if and only if (a) $X$ is a minimal $S$ - $T$ separator and (b) $X$ is closest to $T$ .

Proof.

Notice that by definition, any important $S$ - $T$ separator must be a minimal $S$ - $T$ separator. First, suppose to the contrary that $X$ is an important $S$ - $T$ separator but $X$ is not closest to $T$ . Let $Y$ be a vertex min-cut between $X$ and $T$ with $Y\neq X$ . Since $X$ is not closest to $T$ , such a $Y$ indeed does exist. Clearly, $|Y|\leq|X|$ . We claim that $R(X)\subset R(Y)$ , which will contradict that $X$ is an important separator. First, let us show that $R(X)\subseteq R(Y)$ . Suppose for some vertex $v$ , $v\in R(X)$ but $v\notin R(Y)$ . Since $Y$ is a min-cut, there are vertex disjoint paths between each vertex $y\in Y$ and some vertex $t(y)\in T$ such that no internal vertex of these paths is from $X$ . Consider a path $P$ from some vertex $s\in S$ to $v$ in $G\setminus X$ . There must be such a path, since $v\in R(X)$ . Let $y$ be a vertex of $Y$ on this path: there has to be such a vertex $y$ since $v\notin R(Y)$ . Then $y$ is reachable from $s$ in $G\setminus X$ . Also, there is a path from $y$ to $t(y)$ which does not contain any vertex of $X$ . It follows that there is a path from $s$ to $t(y)$ in $G\setminus X$ , which is a contradiction since $X$ is an $S$ - $T$ separator. Since $X$ is a minimal $S$ - $T$ separator and $Y\neq X$ , it follows that there exists a $v\in X\cap R(Y)$ , and hence $R(X)\subset R(Y)$ .

Next, suppose that $X$ is both closest to $T$ and is also a minimal $S$ - $T$ separator. Assume for the sake of contradiction that $X$ is not important. Since $X$ is not important, there is another separator $X^{\prime}$ with $|X^{\prime}|\leq|X|$ which is a minimal $S-T$ separator, and further $R(X)\subset R(X^{\prime})$ . Since $X^{\prime}\neq X$ , there exists a $v\in X\cap(V(G)\setminus X^{\prime})$ . By Lemma 12, for every vertex $v\in X$ there exists a collection of $|X|+1$ paths from $X$ to $T$ which are vertex disjoint except at $v$ , which appears twice. Since $X^{\prime}$ does not delete $v$ , it follows that there is at least one such path $P$ from some vertex $x\in X$ to some vertex $t\in T$ which survives in $G\setminus X^{\prime}$ . We next show that there must be a path from some vertex of $S$ to $x$ in $G\setminus X^{\prime}$ and this will imply a path from $S$ to $t$ in $G\setminus X^{\prime}$ , a contradiction.

Consider any path from some vertex $s\in S$ to $x$ , all of whose internal vertices are not in $X$ . Such a path must exist, for otherwise $X$ will not be a minimal $S$ - $T$ separator since $X\setminus\{x\}$ will also be an $S$ - $T$ separator. Every internal vertex on this path is reachable from $s$ in $G\setminus X$ . But $R(X)\subseteq R(X^{\prime})$ , which means every internal vertex on this path is reachable from $s$ in $G\setminus X^{\prime}$ as well. This in turn means that there is a path from $s$ to $x$ in $G\setminus X^{\prime}$ , a contradiction. $\hfill\blacktriangleleft$

Lemma 17 (Theorem 8.51 of [9]).

Given a digraph $G$ and two disjoint sets $A,B\subseteq V(G)$ , there are at most $4^{k}$ $A$ - $B$ important separators of size $\leq k$ .

Equipped with these results, we are now ready to prove our main theorem in this section which bounds the number of important separators between across all subsets $A\subseteq S$ and $B\subseteq T$ .

See 1

Proof.

Let $X$ be an important $A$ - $B$ separator for some $A\subseteq S$ and $B\subseteq T$ with $|X|\leq k$ . We will show that $S$ is also an $A^{*}$ - $B^{*}$ important separator for some $A^{*}\subseteq S$ and $B^{*}\subseteq T$ with $|A^{*}|\leq k$ and $|B^{*}|\leq 2k$ .

By Lemma 16, it must be the case that $X$ is closest to $B$ . Also, there are vertex disjoint paths from each vertex $x$ of $X$ to some vertex $t(x)$ of $B$ . Also, we know that for every $v$ there exists $|X|+1$ disjoint paths from $X$ to $B$ which are vertex disjoint except at $v$ which appears twice.

Consider the vertex-flow network $H$ obtained by adding a super-source $s^{*}$ and a super-sink $t^{*}$ . We add outgoing arcs from $s^{*}$ to each vertex of $X$ , and incoming arcs to $t^{*}$ from each vertex of $B$ . The capacity of each vertex is $1$ . Let $f$ be the (maximum) flow corresponding to any set of $|X|$ vertex disjoint paths between $X$ and $B$ , and let the endpoints of these paths in $B$ be the set $Y$ . Now we do the following operation for each vertex $v$ . Consider the modified flow network $H_{v}$ obtained from $H$ where the underlying graph is same, but the vertex capacity of $v$ is raised to $2$ . $f$ is not a maximum flow in $H_{v}$ , since there exists $|X|+1$ paths from $X$ to $B$ which are vertex disjoint except at $v$ which appears in two of these paths. Thus, there is an augmenting path for $f$ in $H_{v}$ . We augment the flow $f$ along this augmenting path to obtain a new flow $f_{v}$ . The new flow has one additional endpoint in $B$ , so that the endpoints of the paths of this new flow $f_{v}$ are either in $Y$ or at the newly added vertex $u\in B\setminus Y$ . Let us denote by $Y_{v}$ the set of the new endpoints.

Let $B^{*}=\bigcup_{v\in X}Y_{v}$ . Notice that $|B^{*}|\leq 2k$ , since each $Y_{v}=Y\cup\{u\}$ for some vertex $u$ for each $v\in X$ . Also since $X$ is minimal, there must exist paths $P_{v}$ for each $v\in X$ from some vertex $s(v)$ of $A$ to $v$ which do not contain any other vertex of $X$ . Let $A^{*}=\bigcup_{v\in X}s(v)$ . We now claim that $X$ is also an important $A^{*}-B^{*}$ separator.

First, observe that $X$ must be a minimal $A^{*}$ - $B^{*}$ separator: there exists a path from some vertex of $A^{*}$ to each vertex $v\in X$ containing no other vertex of $X$ , and similarly there exists a path from each vertex $v$ of $X$ to some vertex in $Y\subseteq B^{*}$ containing no other vertex of $X$ . Now suppose that $X$ is not an $A^{*}$ - $B^{*}$ important separator. Then there exists another $A^{*}$ - $B^{*}$ minimal separator $X^{\prime}$ with $|X^{\prime}|\leq|X|$ with $R(X^{\prime})\supset R(X)$ . Since $X^{\prime}\neq X$ , there exists a vertex $v\in X\cap(V(G)\setminus X^{\prime})$ . Consider the set $Y_{v}$ and the set of $|X|+1$ paths from $X$ to $Y_{v}$ which are vertex disjoint except at $v$ , which appears twice. Since $X^{\prime}$ does not delete $v$ , at least one of these paths from some vertex $x\in X$ to $t\in Y_{v}$ survives in $G\setminus X^{\prime}$ . By construction, there is a path $P$ from some vertex $s\in A^{*}$ to $x\in X$ , whose every internal vertex is not in $X$ . Since every internal vertex of $P$ is in $R(X)$ , it must also be in $R(X^{\prime})$ . It follows that there is a path from $s$ to $x$ in $G\setminus X^{\prime}$ . Appending this to the path from $x$ to $t$ we obtain a path from $s$ to $t$ in $G\setminus X^{\prime}$ , contradicting the fact that $X^{\prime}$ is an $A^{*}$ - $B^{*}$ separator.

Finally, it is clear that the number of $A^{*}$ - $B^{*}$ important separators where $A^{*}\subseteq S$ and $B^{*}\subseteq T$ with $|A^{*}|\leq k$ and $|B^{*}|\leq 2k$ is at most $4^{k}$ times the number of ways of choosing $A^{*}$ and $B^{*}$ , and hence is at most $4^{k}\binom{|S|}{\leq k}\binom{|T|}{\leq 2k}$ . Furthermore, once we fix $A^{*}$ and $B^{*}$ , all the important $A^{*}$ - $B^{*}$ separators of size $\leq k$ can be enumerated in time $\mathcal{O}(4^{k}\cdot k^{2}\cdot(m+n))$ , see Theorem 8.51 of [9]. The set of all subsets of $A$ of size at most $k$ and the set of all subsets of $B$ of size at most $k$ can be enumerated in time $\mathcal{O}(k\binom{|S|}{k})$ and $\mathcal{O}(k\binom{|T|}{k})$ respectively. This completes the proof. $\hfill\blacktriangleleft$

3 Detection Sets and Sample Sets in Directed Graphs

The goal of this section is to prove our result on detection sets and sample sets in directed graphs. The following theorems are our main results in this section. See 3 See 4

We start by defining a notion of $(\epsilon,k)$ nets and recalling the definitions for $(\epsilon,k)$ detection sets and $(\epsilon,k)$ sample sets from the overview. These definitions naturally extend those in [15] for undirected graphs.

Definition 18 (Net).

Given a directed graph $G$ , an $(\epsilon,k)$ net is a set of terminals $T\subseteq V(G)$ satisfying the following property: for every set of vertices $F$ with $|F|\leq k$ , the following two conditions are met:

1.

For every SCC $C$ of $G\setminus F$ with $|C|\geq\epsilon n$ we have

$|C\cap T|\geq 1.$
2.

Let $C^{*}$ be the union of all except one SCC in $G\setminus F$ such that $|C^{*}|\geq\epsilon n$ , then we have

$|C^{*}\cap T|\geq 1.$

Definition 19 (Detection Set).

Given a directed graph $G$ , an $(\epsilon,k)$ detection set is a set of terminals $T\subseteq V(G)$ satisfying the following property: for every set of vertices $F$ with $|F|\leq k$ such that $V(G\setminus F)$ can be partitioned into $(A,B)$ with $|A|,|B|\geq\epsilon n$ and there are no arcs from $B$ to $A$ , there exists $t_{1},t_{2}\in T$ such that there is no $t_{1}$ - $t_{2}$ path in $G\setminus F$ .

Definition 20 (Sample Set).

Given a directed graph $G$ , an $(\epsilon,k)$ sample set is a set of terminals $T\subseteq V(G)$ satisfying the following property: for every set of vertices $F$ with $|F|\leq k$ , and every SCC $C$ of $G\setminus F$ , we have

\left|\frac{|C\cap T|}{|T|}-\frac{|C|}{n}\right|\leq\epsilon.

We note that these definitions are almost identical to the one in [15], with the only difference being that we deal with directed graphs and strongly connected components in lieu of undirected graphs and connected components. Before proving the main results of this section, we show that $(\epsilon,k)$ -nets are also $(\epsilon,k)$ -detection sets. The proof is similar to that in undirected graphs.

Lemma 21.

Every $(\epsilon,k)$ -net is also an $(\epsilon,k)$ detection set.

Proof.

Let $T$ be an $(\epsilon,k)$ -net for $G$ . Consider a set of vertices $F$ of size at most $k$ , so that $V(G\setminus F)$ admits a partition $A, B$ with $|A|,|B|\geq\epsilon n$ such that there are no arcs from $B$ to $A$ . Consider an SCC $C$ of $G\setminus F$ , and let $C^{\prime}$ be the union of all SCC’s of $G\setminus F$ except $C$ . Then clearly $|C^{\prime}|\geq\epsilon n$ , hence there exists a terminal $t_{1}\in C^{\prime}\cap T$ . Let $C_{t_{1}}$ be the SCC of $G\setminus F$ containing $t_{1}$ . Now consider $C^{\prime\prime}$ which is the union of all SCC’s of $G\setminus F$ except $C_{t_{1}}$ . Again $|C^{\prime\prime}|\geq\epsilon n$ , therefore there is a terminal $t_{2}\in C^{\prime\prime}\cap T$ . But clearly $t_{1}$ and $t_{2}$ lie in different SCC’s of $G\setminus F$ , which means there is either no $t_{1}$ - $t_{2}$ path or no $t_{2}$ - $t_{1}$ path in $G\setminus F$ , hence satisfying our detection set property. $\hfill\blacktriangleleft$

Thus we henceforth focus on obtaining $(\epsilon,k)$ -nets and samples.

First, we observe that showing a VC-dimension bound directly implies $(\epsilon,k)$ samples. This is similar to the approach in [15]. We start with a few definitions.

Definition 22 (Shattering a set of elements).

Suppose we are given a set system $(\SS,U)$ consisting of a family of sets $\SS$ , where each set in $\SS$ consists of elements from a universe $U$ . We say that a subset $W\subseteq U$ is shattered by $\SS$ , if for every subset $Y\subseteq W$ , there exists a set $S\in\SS$ so that $S\cap W=Y$ .

Definition 23 (VC-dimension).

Given a set system $(\SS,U)$ , its VC-dimension is the size of the largest subset $W\subseteq U$ that can be shattered by $\SS$ .

Theorem 24 ( $\epsilon$ -net theorem, see [15]).

Let $(\SS,U)$ be a set system with VC-dimension $d$ , and universe size $|U|=n$ . Then for every $\epsilon>0$ , there exists an absolute constant $c$ such that a random subset $T\subseteq U$ of size $c{d}\frac{1}{\epsilon}\log\frac{1}{\epsilon}$ is an $\epsilon$ -net with probability at least $\frac{2}{3}$ . Concretely, $T$ satisfies $|S\cap T|\geq 1$ for every $S\in\SS$ satisfying $|S|\geq\epsilon n$ with probability at least $\frac{2}{3}$ .

Theorem 25 ( $\epsilon$ -sample theorem, see [15]).

Let $(\SS,U)$ be a set system with VC-dimension $d$ , and universe size $|U|=n$ . Then for every $\epsilon>0$ , there exists an absolute constant $c$ such that a random subset $T\subseteq U$ of size ${cd}\frac{1}{\epsilon^{2}}\log\frac{1}{\epsilon}$ is an $\epsilon$ -sample with probability at least $\frac{2}{3}$ . Concretely, $T$ satisfies $\left|\frac{|S|}{n}-\frac{|S\cap T|}{|T|}\right|\leq\epsilon$ for every $S\in\SS$ with probability at least $\frac{2}{3}$ .

Let $\SS$ be the family of sets consisting of all possible sets $C$ which are either (a) a strongly connected component after deleting some vertex set $F$ of size at most $k$ or (b) the union of all but one strongly connected components after deleting some vertex set $F$ of size at most $k$ . Then it suffices to prove a VC-dimension bound of $\mathcal{O}(k)$ for this family of sets with the universe as the vertex set - this would then directly imply Theorems 3 and 4 using Theorems 24 and 25. Thus we will henceforth focus on upper-bounding the VC-dimension of the set system $\SS$ .

Towards this, we consider a slightly different problem of reachability from a single source after deleting $k$ vertices.

Definition 26 (Single source reachability profile).

Given a graph $G$ , a source vertex $s$ and a set of sink vertices $T^{\prime}$ with $s\notin T^{\prime}$ , we define the single source reachability profile of $s$ as $\mathrm{Reach}^{k}(s,T^{\prime})=\{R^{G}_{\{s\}}(F)\cap T^{\prime}\mid F% \subseteq V(G),|F|\leq k\}$ In other words, $\mathrm{Reach}^{k}(s,T^{\prime})$ is the collection of subsets of $T^{\prime}$ reachable from $s$ after deleting some set of at most $k$ vertices from $G$ .

The next theorem bounds the size of $\mathrm{Reach}^{k}(s,T^{\prime})$ .

Theorem 27.

Given a graph $G$ , source vertex $s$ and sink vertices $T^{\prime}$ , $|\mathrm{Reach}^{k}(s,T^{\prime})|\leq 4^{k}\binom{|T^{\prime}|}{\leq 2k}$ .

We prove Theorem 27 using our result on all-subset important separators, Theorem 1. We do this as follows. Given $s$ and $T^{\prime}$ , suppose there exists a set $X$ of $k$ vertices so that $P\subseteq T^{\prime}$ is the subset of $T^{\prime}$ reachable from $s$ in $G\setminus X$ . Then in the following lemma, Lemma 28, we show that there is in fact an $s$ - $(T^{\prime}\setminus P)$ important separator $X^{\prime}$ of size $\leq k$ , so that the subset of $T^{\prime}$ reachable from $s$ in $G\setminus X$ remains $P$ . The bound on all-subset important separators will then imply the bound on the reachability profile.

Lemma 28.

Suppose there exists a set $X$ of $\leq k$ vertices so that $P\subseteq T^{\prime}$ is the subset of $T^{\prime}$ reachable from $s$ in $G\setminus X$ . Then there is an $s$ - $(T^{\prime}\setminus P)$ important separator $X^{\prime}$ of size $\leq k$ , so that the subset of $T^{\prime}$ reachable from $s$ in $G\setminus X$ remains $P$ .

Proof.

Without loss of generality, we assume that $X$ is a minimal $s-(T^{\prime}\setminus P)$ separator (otherwise we consider the subset of $X$ that is minimal, this still has the same reachability set $P$ ). Let $X^{\prime}$ be the vertex mincut between $X$ and $T^{\prime}\setminus P$ that is closest to $T^{\prime}\setminus P$ . We have $|X^{\prime}|\leq|X|$ since $X$ itself is a vertex cut between $X$ and $T^{\prime}\setminus P$ .

We show that the reachability profile after removing $X^{\prime}$ is also $P$ . In other words, a vertex $t\in T^{\prime}$ is reachable from $s$ after removing $X$ iff it is reachable after removing $X^{\prime}$ .

Consider some sink vertex $t\in T^{\prime}\setminus P$ , so that $t$ is not reachable after removing $X$ . Then, every $s$ – $t$ path contains a vertex in $X$ . Since $X^{\prime}$ is a cut between $X$ and $T^{\prime}\setminus P$ , this path must also contain a vertex in $X^{\prime}$ . This proves one direction.

Suppose that some vertex $t\in T^{\prime}$ is not reachable after removing $X^{\prime}$ . Then, every $s$ – $t$ path contains a vertex in $X^{\prime}$ . Suppose for contradiction that some $s$ – $t$ path does not contain a vertex in $X$ . Consider the prefix of the path leading to a vertex $v$ in $X^{\prime}$ . Append to it a path from $v$ to $T^{\prime}\setminus P$ that only contains one vertex in $X^{\prime}$ (at the start) and no vertex of $X$ . Such a path must exist since $X^{\prime}$ is a min-cut between $X$ and $T^{\prime}\setminus P$ . The result is a path from $s$ to $T\setminus P$ that does not contain a vertex in $X$ . This contradicts the set $P$ being the reachability profile after removing $X$ .

Finally, we show that $X^{\prime}$ is an important $s$ - $(T^{\prime}\setminus P)$ separator. By Lemma 16 it is enough to show that $X^{\prime}$ is closest to $T^{\prime}\setminus P$ and that $X^{\prime}$ is a minimal $s$ - $(T^{\prime}\setminus P)$ separator. It follows from the construction of $X^{\prime}$ that $X^{\prime}$ must be closest to $T^{\prime}\setminus P$ . Thus it is enough to show that $X^{\prime}$ is a minimal $s$ - $(T^{\prime}\setminus P)$ separator. Since $X^{\prime}$ is a min-cut between $X$ and $T^{\prime}\setminus P$ , for every vertex $x^{\prime}\in X^{\prime}$ , there exists a path $Q$ from some vertex $x\in X$ to some vertex $t\in T^{\prime}\setminus P$ through $x^{\prime}$ that does not contain any other vertex of $X^{\prime}$ . Since $X$ is a minimal $s$ - $(T^{\prime}\setminus P)$ separator, there is a path $P^{\prime}$ from $s$ to each vertex $x\in X$ that does not include any other vertex from $X$ . Further $P^{\prime}$ does not contain any vertex from $X^{\prime}$ : if it contains some vertex $y^{\prime}\in X^{\prime}$ , it must be the case that $y^{\prime}\notin X$ . Then it follows that there is a path from $s$ to $y^{\prime}$ in $G\setminus X$ . Since $X^{\prime}$ is a min-cut between $X$ and $T^{\prime}\setminus P$ , there is a path from each vertex of $X^{\prime}$ to some vertex of $T^{\prime}\setminus P$ that does not have any internal vertex from $X$ . In particular, there is a path from $y^{\prime}$ to some vertex of $T^{\prime}\setminus P$ which does not include any vertex of $X$ . This path, when appended to the path from $s$ to $y^{\prime}$ in $G\setminus X$ , results in a path from $s$ to some vertex of $T^{\prime}\setminus P$ which does not contain any vertex of $X$ , which in turn contradicts the fact that $X$ is a cut between $s$ and $T^{\prime}\setminus P$ . Finally, appending $P^{\prime}$ with $Q$ gives a path from $s$ to $t\in T^{\prime}\setminus P$ that contains $x^{\prime}$ , while containing no other vertex of $X^{\prime}$ . Since such a path exists for any $x^{\prime}\in X^{\prime}$ , it follows that $X^{\prime}$ is minimal, and hence $X^{\prime}$ is an important $s$ - $T^{\prime}\setminus P$ separator. $\hfill\blacktriangleleft$

Proof of Theorem 27.

Suppose $P\in\mathrm{Reach}^{k}(s,T^{\prime})$ . Lemma 28 implies that there is an important $s$ - $(T^{\prime}\setminus P)$ separator $X$ so that the reachability set of $s$ in $G\setminus X$ is exactly $P\subseteq T^{\prime}$ . Using Theorem 1 the number of such important separators is at most $4^{k}\binom{|T^{\prime}|}{\leq 2k}$ , and hence the number of such $P$ is at most $4^{k}\binom{|T^{\prime}|}{\leq 2k}$ . $\hfill\blacktriangleleft$

Next, we show why Theorem 27 implies a VC-dimension bound.

Theorem 29.

Let $\SS$ be the family of sets over consisting of all possible sets $C$ which are either (a) a strongly connected component after deleting some vertex set $F$ of size at most $k$ or (b) the union of all but one strongly connected components after deleting some vertex set $F$ of size at most $k$ . Then the VC-dimension of $\SS$ is $\mathcal{O}(k)$ .

Proof.

From the definition of VC-dimension, it is enough to show that there no set of size $\geq dk$ that can be shattered, for some constant $d$ . Suppose a set of terminals $T$ can be shattered. We will show that $|T|=\mathcal{O}(k)$ .

Consider some $P\subseteq T$ , with $P\neq\emptyset$ , and suppose that after deleting some subset $F$ of vertices of size at most $k$ , some strongly connected component $C$ of $G\setminus F$ satisfies $C\cap T=P$ . Fix any $s\in C\cap T$ , and let $T^{\prime}=T\setminus\{s\}$ . Let $G^{R}$ denote the graph $G$ with arcs reversed. Then by definition of an SCC, $P\setminus s$ is the exact subset of $T^{\prime}$ reachable from $s$ in both $G\setminus F$ and $G^{R}\setminus F$ . But this means $P\setminus s=P_{1}\cap P_{2}$ for some $P_{1}\in\mathrm{Reach}_{G}^{k}(s,T^{\prime})$ and $P_{2}\in\mathrm{Reach}_{G^{R}}^{k}(s,T^{\prime})$ .

Together, we obtain that if $P\subseteq T$ is $C\cap T$ for some SCC $C$ after deleting some $k$ vertices, there must exist a vertex $s\in P$ such that $P\setminus\{s\}=P_{1}\cap P_{2}$ for some $P_{1}\in\mathrm{Reach}_{G}^{k}(s,T\setminus\{s\})$ and $P_{2}\in\mathrm{Reach}_{G^{R}}^{k}(s,T\setminus\{s\})$ . But then the number of choices for $P$ is at most $|T|(4^{k}\binom{|T|}{\leq 2k})^{2}$ .

Alternatively, suppose that $P=C^{\prime}\cap T$ where $C^{\prime}$ is the union of all but one strongly connected component, say $C^{*}$ , after deleting some set $F$ of $k$ vertices. Suppose $P\neq T$ , so that $C^{*}\cap T\neq\emptyset$ . Consider some $s^{\prime}\in C^{*}\cap T$ . Then it follows that $(T\setminus P)\setminus\{s^{\prime}\}=Q_{1}\cap Q_{2}$ where $Q_{1}\in\mathrm{Reach}^{k}_{G}(s^{\prime},T\setminus\{s^{\prime}\})$ and $Q_{2}\in\mathrm{Reach}_{G^{R}}^{k}(s^{\prime},T\setminus\{s^{\prime}\})$ . But then the number of choices for $T\setminus P$ is at most $|T|(4^{k}\binom{|T|}{\leq 2k})^{2}$ , and hence the number of choices for $P$ is at most $|T|(4^{k}\binom{|T|}{\leq 2k})^{2}$ .

Thus the total number of choices for $P$ across both cases is at most $2|T|(4^{k}\binom{|T|}{\leq 2k})^{2}+2$ (accounting for $P=\emptyset$ in the first case, and $P=T$ in the second) which is not less than $2^{|T|}$ only when $|T|=\mathcal{O}(k)$ , finishing the proof. $\hfill\blacktriangleleft$

References

[1] Amit Agarwal, Moses Charikar, Konstantin Makarychev, and Yury Makarychev. $\mathcal{O}(\sqrt{\log n})$ approximation algorithms for min uncut, min 2cnf deletion, and directed cut problems. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 573–581, 2005.
[2] Aditya Anand, Euiwoong Lee, Jason Li, and Thatchaphol Saranurak. All-subsets important separators with applications to sample sets, balanced separators and vertex sparsifiers in directed graphs. CoRR, 2025. arXiv:2504.20027.
[3] Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):1–37, 2009. doi:10.1145/1502793.1502794.
[4] Parinya Chalermsook, Syamantak Das, Yunbum Kook, Bundit Laekhanukit, Yang P Liu, Richard Peng, Mark Sellke, and Daniel Vaz. Vertex sparsification for edge connectivity. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1206–1225. SIAM, 2021. doi:10.1137/1.9781611976465.74.
[5] J Chen, Y Liu, S Lu, B O’Sullivan, and Igor Razgon. A fixed-parameter algorithm for the directed feedback vertex set problem. Journal of the ACM, 55(5):1–19, 2008. doi:10.1145/1411509.1411511.
[6] Jianer Chen, Yang Liu, and Songjian Lu. An improved parameterized algorithm for the minimum node multiway cut problem. Algorithmica, 55(1):1–13, 2009. doi:10.1007/S00453-007-9130-6.
[7] Rajesh Chitnis, MohammadTaghi Hajiaghayi, and Dániel Marx. Fixed-parameter tractability of directed multiway cut parameterized by the size of the cutset. SIAM Journal on Computing, 42(4):1674–1696, 2013. doi:10.1137/12086217X.
[8] Julia Chuzhoy. On vertex sparsifiers with steiner nodes. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 673–688, 2012. doi:10.1145/2213977.2214039.
[9] Marek Cygan, Fedor V Fomin, Łukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Parameterized algorithms, volume 4(8). Springer, 2015. doi:10.1007/978-3-319-21275-3.
[10] Marek Cygan, Paweł Komosa, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, Saket Saurabh, and Magnus Wahlström. Randomized contractions meet lean decompositions. ACM Transactions on Algorithms (TALG), 17(1):1–30, 2020. doi:10.1145/3426738.
[11] Marek Cygan, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Minimum bisection is fixed-parameter tractable. SIAM Journal on Computing, 48(2):417–450, 2019. doi:10.1137/140988553.
[12] Rodney G Downey, Michael R Fellows, et al. Fundamentals of parameterized complexity, volume 4. Springer, 2013.
[13] Guy Even, Joseph Naor, Satish Rao, and Baruch Schieber. Fast approximate graph partitioning algorithms. SIAM Journal on Computing, 28(6):2187–2214, 1999. doi:10.1137/S0097539796308217.
[14] Uriel Feige, Mohammadtaghi Hajiaghayi, and James R Lee. Improved approximation algorithms for minimum weight vertex separators. SIAM Journal on Computing, 38(2):629–657, 2008. doi:10.1137/05064299X.
[15] Uriel Feige and Mohammad Mahdian. Finding small balanced separators. In Proceedings of the thirty-eighth annual ACM symposium on Theory of computing, pages 375–384, 2006. doi:10.1145/1132516.1132573.
[16] Uriel Feige and Orly Yahalom. On the complexity of finding balanced oneway cuts. Information processing letters, 87(1):1–5, 2003. doi:10.1016/S0020-0190(03)00251-5.
[17] Zhiyang He, Jason Li, and Magnus Wahlström. Near-linear-time, optimal vertex cut sparsifiers in directed acyclic graphs. In 29th Annual European Symposium on Algorithms (ESA 2021). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021.
[18] Jon Kleinberg. Detecting a Network Failure. Internet Mathematics, 1(1):37–55, 2003. doi:10.1080/15427951.2004.10129077.
[19] Jon Kleinberg, Mark Sandler, and Aleksandrs Slivkins. Network failure detection and graph connectivity. SIAM Journal on Computing, 38(4):1330–1346, 2008. doi:10.1137/070697793.
[20] Stefan Kratsch and Magnus Wahlström. Representative sets and irrelevant vertices: New tools for kernelization. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 450–459. IEEE, 2012. doi:10.1109/FOCS.2012.46.
[21] Stefan Kratsch and Magnus Wahlström. Representative sets and irrelevant vertices: New tools for kernelization. Journal of the ACM (JACM), 67(3):1–50, 2020. doi:10.1145/3390887.
[22] Jason Li and Jesper Nederlof. Detecting feedback vertex sets of size k in $\mathcal{O}(2.7^{k})$ time. ACM Transactions on Algorithms (TALG), 18(4):1–26, 2022.
[23] Yang P Liu. Vertex sparsification for edge connectivity in polynomial time. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023), volume 251, page 83. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023.
[24] Daniel Lokshtanov and Dániel Marx. Clustering with local restrictions. Information and Computation, 222:278–292, 2013. doi:10.1016/J.IC.2012.10.016.
[25] Daniel Lokshtanov, Pranabendu Misra, MS Ramanujan, Saket Saurabh, and Meirav Zehavi. Fpt-approximation for fpt problems. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 199–218. SIAM, 2021.
[26] Daniel Lokshtanov, Maadapuzhi-Sridharan Ramanujan, Saket Saurabh, Roohani Sharma, and Meirav Zehavi. Wannabe bounded treewidth graphs admit a polynomial kernel for directed feedback vertex set. ACM Trans. Comput. Theory, 17(1), February 2025. doi:10.1145/3711669.
[27] Jayakrishnan Madathil, Roohani Sharma, and Meirav Zehavi. A sub-exponential fpt algorithm and a polynomial kernel for minimum directed bisection on semicomplete digraphs. Algorithmica, 83:1861–1884, 2021. doi:10.1007/S00453-021-00806-X.
[28] Dániel Marx. Parameterized graph separation problems. Theoretical Computer Science, 351(3):394–406, 2006. doi:10.1016/J.TCS.2005.10.007.
[29] Dániel Marx. Important separators and parameterized algorithms. In International Workshop on Graph-Theoretic Concepts in Computer Science, pages 5–10. Springer, 2011. doi:10.1007/978-3-642-25870-1_2.
[30] Daniel Marx and Igor Razgon. Fixed-parameter tractability of multicut parameterized by the size of the cutset. SIAM Journal on Computing, 43(2):355, 2014.
[31] Pranabendu Misra, Fahad Panolan, MS Ramanujan, and Saket Saurabh. Linear representation of transversal matroids and gammoids parameterized by rank. Theoretical Computer Science, 818:51–59, 2020. doi:10.1016/J.TCS.2018.02.029.
[32] Ankur Moitra. Approximation algorithms for multicommodity-type problems with guarantees independent of the graph size. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 3–12. IEEE, 2009. doi:10.1109/FOCS.2009.28.
[33] Harald Räcke. Optimal hierarchical decompositions for congestion minimization in networks. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 255–264, 2008. doi:10.1145/1374376.1374415.
[34] Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai. Deterministic small vertex connectivity in almost linear time. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 789–800. IEEE, 2022. doi:10.1109/FOCS54457.2022.00080.

Appendix A Appendix

See 2

Proof.

Essentially, we will construct undirected graphs: these graphs can be made directed simply by adding for every undirected edge $(u,v)$ , the directed edges $(u,v)$ and $(v,u)$ .

For the first part, consider the star on $c+1$ vertices for some $c\gg k$ . Let $S$ consist of the single center vertex, and let $T$ consist of all the leaves. Observe that for any $T^{\prime}\subseteq T$ of size at most $k$ , the separator which consists of exactly the vertices $T^{\prime}$ is an important $S$ - $T^{\prime}$ separator of size at most $k$ . Thus there are at least $\binom{|T|}{\leq k}$ such separators.

For the second part, consider again a star, but this time we blow up the center vertex to a clique of size $k+1$ . Formally, consider the graph $G$ on $c+k+1$ vertices, where $T\subseteq V(G)$ is a clique of size $k+1$ , and $S\subseteq V(G)$ forms the remaining $c$ vertices. Each vertex of $S$ is adjacent to each vertex of $T$ , and not adjacent to any vertex of $S$ . Thus the graph is a star with a core $T$ of size $k+1$ and $c$ leaves which form the set $S$ . Now for every subset $S^{\prime}\subseteq S$ with $|S^{\prime}|\leq k$ , $S^{\prime}$ itself forms an $S^{\prime}$ - $T$ important separator of size at most $k$ . Thus there are at least $\binom{|S|}{\leq k}$ such separators. $\hfill\blacktriangleleft$

Lemma 30 (Tight example for important separator preservation).

There is a graph $G$ , source vertex $s\in V(G)$ , integer $k$ , sink vertices $B\subseteq V(G)$ with $|B|=2k$ and an $s$ - $B$ important separator $X\subseteq V(G)$ with $|X|=k$ , such that $X$ is not an $s$ - $B^{\prime}$ important separator for any $B^{\prime}\subset B$ .

Proof.

Consider the graph in Figure 2. $X$ is an important $s$ - $B$ separator of size $3$ . However, clearly $X$ is not an important $s$ - $B^{\prime}$ important separator for any $B^{\prime}\subset B$ , since we must have $B^{\prime}\cap\{b_{2i-1},b_{2i}\}\leq 1$ for some $i\in[3]$ and this means that $(X\setminus\{u_{i}\})\cup v_{2i-1}$ or $(X\setminus\{u_{i}\})\cup v_{2i}$ is an $s$ - $B^{\prime}$ separator closer to $B^{\prime}$ than $X$ , contradicting the importance of $X$ .

Figure 2: Tight example for important separator preservation.

$\hfill\blacktriangleleft$

[bib.bib1] [1] Amit Agarwal, Moses Charikar, Konstantin Makarychev, and Yury Makarychev. $\mathcal{O}(\sqrt{\log n})$ approximation algorithms for min uncut, min 2cnf deletion, and directed cut problems. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 573–581, 2005.

[bib.bib2] [2] Aditya Anand, Euiwoong Lee, Jason Li, and Thatchaphol Saranurak. All-subsets important separators with applications to sample sets, balanced separators and vertex sparsifiers in directed graphs. CoRR, 2025. arXiv:2504.20027.

[bib.bib3] [3] Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):1–37, 2009. doi:10.1145/1502793.1502794.

[bib.bib4] [4] Parinya Chalermsook, Syamantak Das, Yunbum Kook, Bundit Laekhanukit, Yang P Liu, Richard Peng, Mark Sellke, and Daniel Vaz. Vertex sparsification for edge connectivity. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1206–1225. SIAM, 2021. doi:10.1137/1.9781611976465.74.

[bib.bib5] [5] J Chen, Y Liu, S Lu, B O’Sullivan, and Igor Razgon. A fixed-parameter algorithm for the directed feedback vertex set problem. Journal of the ACM, 55(5):1–19, 2008. doi:10.1145/1411509.1411511.

[bib.bib6] [6] Jianer Chen, Yang Liu, and Songjian Lu. An improved parameterized algorithm for the minimum node multiway cut problem. Algorithmica, 55(1):1–13, 2009. doi:10.1007/S00453-007-9130-6.

[bib.bib7] [7] Rajesh Chitnis, MohammadTaghi Hajiaghayi, and Dániel Marx. Fixed-parameter tractability of directed multiway cut parameterized by the size of the cutset. SIAM Journal on Computing, 42(4):1674–1696, 2013. doi:10.1137/12086217X.

[bib.bib8] [8] Julia Chuzhoy. On vertex sparsifiers with steiner nodes. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 673–688, 2012. doi:10.1145/2213977.2214039.

[bib.bib9] [9] Marek Cygan, Fedor V Fomin, Łukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Parameterized algorithms, volume 4(8). Springer, 2015. doi:10.1007/978-3-319-21275-3.

[bib.bib10] [10] Marek Cygan, Paweł Komosa, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, Saket Saurabh, and Magnus Wahlström. Randomized contractions meet lean decompositions. ACM Transactions on Algorithms (TALG), 17(1):1–30, 2020. doi:10.1145/3426738.

[bib.bib11] [11] Marek Cygan, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Minimum bisection is fixed-parameter tractable. SIAM Journal on Computing, 48(2):417–450, 2019. doi:10.1137/140988553.

[bib.bib12] [12] Rodney G Downey, Michael R Fellows, et al. Fundamentals of parameterized complexity, volume 4. Springer, 2013.

[bib.bib13] [13] Guy Even, Joseph Naor, Satish Rao, and Baruch Schieber. Fast approximate graph partitioning algorithms. SIAM Journal on Computing, 28(6):2187–2214, 1999. doi:10.1137/S0097539796308217.

[bib.bib14] [14] Uriel Feige, Mohammadtaghi Hajiaghayi, and James R Lee. Improved approximation algorithms for minimum weight vertex separators. SIAM Journal on Computing, 38(2):629–657, 2008. doi:10.1137/05064299X.

[bib.bib15] [15] Uriel Feige and Mohammad Mahdian. Finding small balanced separators. In Proceedings of the thirty-eighth annual ACM symposium on Theory of computing, pages 375–384, 2006. doi:10.1145/1132516.1132573.

[bib.bib16] [16] Uriel Feige and Orly Yahalom. On the complexity of finding balanced oneway cuts. Information processing letters, 87(1):1–5, 2003. doi:10.1016/S0020-0190(03)00251-5.

[bib.bib17] [17] Zhiyang He, Jason Li, and Magnus Wahlström. Near-linear-time, optimal vertex cut sparsifiers in directed acyclic graphs. In 29th Annual European Symposium on Algorithms (ESA 2021). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021.

[bib.bib18] [18] Jon Kleinberg. Detecting a Network Failure. Internet Mathematics, 1(1):37–55, 2003. doi:10.1080/15427951.2004.10129077.

[bib.bib19] [19] Jon Kleinberg, Mark Sandler, and Aleksandrs Slivkins. Network failure detection and graph connectivity. SIAM Journal on Computing, 38(4):1330–1346, 2008. doi:10.1137/070697793.

[bib.bib20] [20] Stefan Kratsch and Magnus Wahlström. Representative sets and irrelevant vertices: New tools for kernelization. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 450–459. IEEE, 2012. doi:10.1109/FOCS.2012.46.

[bib.bib21] [21] Stefan Kratsch and Magnus Wahlström. Representative sets and irrelevant vertices: New tools for kernelization. Journal of the ACM (JACM), 67(3):1–50, 2020. doi:10.1145/3390887.

[bib.bib22] [22] Jason Li and Jesper Nederlof. Detecting feedback vertex sets of size k in $\mathcal{O}(2.7^{k})$ time. ACM Transactions on Algorithms (TALG), 18(4):1–26, 2022.

[bib.bib23] [23] Yang P Liu. Vertex sparsification for edge connectivity in polynomial time. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023), volume 251, page 83. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023.

[bib.bib24] [24] Daniel Lokshtanov and Dániel Marx. Clustering with local restrictions. Information and Computation, 222:278–292, 2013. doi:10.1016/J.IC.2012.10.016.

[bib.bib25] [25] Daniel Lokshtanov, Pranabendu Misra, MS Ramanujan, Saket Saurabh, and Meirav Zehavi. Fpt-approximation for fpt problems. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 199–218. SIAM, 2021.

[bib.bib26] [26] Daniel Lokshtanov, Maadapuzhi-Sridharan Ramanujan, Saket Saurabh, Roohani Sharma, and Meirav Zehavi. Wannabe bounded treewidth graphs admit a polynomial kernel for directed feedback vertex set. ACM Trans. Comput. Theory, 17(1), February 2025. doi:10.1145/3711669.

[bib.bib27] [27] Jayakrishnan Madathil, Roohani Sharma, and Meirav Zehavi. A sub-exponential fpt algorithm and a polynomial kernel for minimum directed bisection on semicomplete digraphs. Algorithmica, 83:1861–1884, 2021. doi:10.1007/S00453-021-00806-X.

[bib.bib28] [28] Dániel Marx. Parameterized graph separation problems. Theoretical Computer Science, 351(3):394–406, 2006. doi:10.1016/J.TCS.2005.10.007.

[bib.bib29] [29] Dániel Marx. Important separators and parameterized algorithms. In International Workshop on Graph-Theoretic Concepts in Computer Science, pages 5–10. Springer, 2011. doi:10.1007/978-3-642-25870-1_2.

[bib.bib30] [30] Daniel Marx and Igor Razgon. Fixed-parameter tractability of multicut parameterized by the size of the cutset. SIAM Journal on Computing, 43(2):355, 2014.

[bib.bib31] [31] Pranabendu Misra, Fahad Panolan, MS Ramanujan, and Saket Saurabh. Linear representation of transversal matroids and gammoids parameterized by rank. Theoretical Computer Science, 818:51–59, 2020. doi:10.1016/J.TCS.2018.02.029.

[bib.bib32] [32] Ankur Moitra. Approximation algorithms for multicommodity-type problems with guarantees independent of the graph size. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 3–12. IEEE, 2009. doi:10.1109/FOCS.2009.28.

[bib.bib33] [33] Harald Räcke. Optimal hierarchical decompositions for congestion minimization in networks. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 255–264, 2008. doi:10.1145/1374376.1374415.

[bib.bib34] [34] Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai. Deterministic small vertex connectivity in almost linear time. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 789–800. IEEE, 2022. doi:10.1109/FOCS54457.2022.00080.

All-Subsets Important Separators with Applications to Sample Sets, Balanced Separators and Vertex Sparsifiers in Directed Graphs

Abstract

Keywords and phrases:

Category:

Funding:

Copyright and License:

2012 ACM Subject Classification:

Related Version:

DOI:

Event:

Editors:

Series and Publisher:

1 Introduction

1.1 All-Subsets Important Separators

Theorem 1 (All-Subsets Important Separators).

Lemma 2.

1.2 Detection Sets and Sample Sets in Directed Graphs

Theorem 3.

Theorem 4.

1.3 Directed Balanced Cuts

Theorem 5 ([15]).

Theorem 6.

Theorem 7.

1.4 Vertex Cut Sparsifiers

Theorem 8 ([20], [31]).

Theorem 9.

Corollary 10.

1.5 Techniques and Overview

All-Subset Important separators

Detection sets and Sample sets for Directed Graphs

FPT algorithm for Directed Balanced Separators

Approximation algorithm for Directed Balanced Separators

Vertex Sparsifiers

2 All-Subsets Important Separators

Definition 11 (Closest set).

Lemma 12 (Lemma 18 of [17]).

Definition 13 (Minimal separators).

Definition 14.

Definition 15.

Lemma 16.

Proof.

Lemma 17 (Theorem 8.51 of [9]).

Proof.

3 Detection Sets and Sample Sets in Directed Graphs

Definition 18 (Net).

Definition 19 (Detection Set).

Definition 20 (Sample Set).

Lemma 21.

Proof.

Definition 22 (Shattering a set of elements).

Definition 23 (VC-dimension).

Theorem 24 (ϵ-net theorem, see [15]).

Theorem 25 (ϵ-sample theorem, see [15]).

Definition 26 (Single source reachability profile).

Theorem 27.

Lemma 28.

Proof.

Proof of Theorem 27.

Theorem 29.

Proof.

References

Appendix A Appendix

Proof.

Lemma 30 (Tight example for important separator preservation).

Proof.

Theorem 24 ( $\epsilon$ -net theorem, see [15]).

Theorem 25 ( $\epsilon$ -sample theorem, see [15]).