On Deleting Vertices to Reduce Density in Graphs and Supermodular Functions
Abstract
We consider deletion problems in graphs and supermodular functions where the goal is to reduce density. In Graph Density Deletion (GraphDD), we are given a graph with non-negative vertex costs and a non-negative parameter and the goal is to remove a minimum cost subset of vertices such that the densest subgraph in has density at most . This problem has an underlying matroidal structure and generalizes several classical problems such as vertex cover, feedback vertex set, and pseudoforest deletion set for appropriately chosen and all of these classical problems admit a -approximation. In sharp contrast, we prove that for every fixed integer , GraphDD is hard to approximate to within a logarithmic factor via a reduction from SetCover, thus showing a phase transition phenomenon. Next, we investigate a generalization of GraphDD to monotone supermodular functions, termed Supermodular Density Deletion (SupmodDD). In SupmodDD, we are given a monotone supermodular function via an evaluation oracle with element costs and a non-negative integer and the goal is remove a minimum cost subset such that the densest subset according to in has density at most . We show that SupmodDD is approximation equivalent to the well-known Submodular Cover problem; this implies a tight logarithmic approximation and hardness for SupmodDD; it also implies a logarithmic approximation for GraphDD, thus matching our inapproximability bound. Motivated by these hardness results, we design bicriteria approximation algorithms for both GraphDD and SupmodDD.
Keywords and phrases:
Combinatorial Optimization, Approximation Algorithms, Randomized Algorithms, Hardness of Approximation, Densest Subgraph, Supermodular Functions, Submodular Set CoverCategory:
Track A: Algorithms, Complexity and GamesCopyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Theory of computation Graph algorithms analysis ; Theory of computation Approximation algorithms analysisAcknowledgements:
Shubhang Kulkarni thanks Kishen Gowda for engaging in preliminary discussions about approximations for feedback vertex set and pseudoforest deletion set. We thank anonymous reviewers for pointing out an error, and for suggesting improvements to a previous version of this work.Funding:
This work was supported in part by NSF grant CCF-2402667.Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele PuppisSeries and Publisher:

1 Introduction
The densest subgraph problem in graphs (DSG) is a core primitive in graph and network mining applications. In DSG, we are given a graph and the goal is to find , where is the set of edges with both end vertices in . DSG is not only interesting for its applications but is a fundamental problem in algorithms and combinatorial optimization with several connections to graph theory, matroids, and submodularity. Many recent works have explored various aspects of DSG and related problems from both theoretical and practical perspectives [23, 4, 28, 30, 9, 20, 12, 13, 8, 26]. A useful feature of DSG is its polynomial-time solvability. This was first seen via a reduction to network flow [19, 27] but another way to see it is by considering a more general problem, namely the densest supermodular subset problem (DSS): Given a supermodular function via evaluation oracle, the goal is to find . One can easily see that DSG is a special case of DSS by noting that for any graph , the function defined by for every is a supermodular function. It is well-known and easy to see that DSS and DSG can be solved in polynomial-time by a simple reduction to submodular function minimization. Several other problems that are studied in graph and network mining can be seen as special cases of DSS. Recent work has demonstrated the utility of the supermodularity lens in understanding greedy heuristics and approximation algorithms for DSG and these problems [9, 30, 20, 21].
Density Deletion Problems.
In this work we consider several interrelated vertex deletion problems that aim to reduce the density. We start with the graph density deletion problem.
Definition 1 ().
For a fixed constant , the -graph density deletion problem, denoted , is defined as follows:
Input: | Graph and vertex costs |
Goal: | . |
This deletion problem naturally generalizes to supermodular functions as defined below. We recall that a set function is (i) submodular if for every , (ii) supermodular if is submodular, (iii) non-decreasing if for every , and (iv) normalized if . We observe that non-negative normalized supermodular functions are non-decreasing. For a function and , we define as the function restricted to the ground set . The evaluation oracle for the function takes a subset as input and returns the function value of the set .
Definition 2 ().
For a fixed constant , the -supermodular density deletion problem, denoted , is defined as follows:
Input: | Integer-valued normalized supermodular function via an |
---|---|
evaluation oracle and element costs | |
Goal: | . |
When the density threshold is part of input, we use GraphDD and SupmodDD to refer to these problems. It is easy to see that GraphDD (and hence SupmodDD) is NP-Hard from a general result on vertex deletion problems [25]. Our goal is to understand the approximability of these problems.
Motivations and Connections.
While the deletion problems are natural in their formulation, to the best of our knowledge, GraphDD has only recently been explicitly defined and explored. Bazgan, Nichterlein and Vazquez Alferez [2] defined and studied this problem from an FPT perspective. As pointed out in their work, given the importance of DSG and DSS in various applications to detect communities and sub-groups of interest, it is useful to consider the robustness (or sensitivity) of the densest subgraph to the removal of vertices. In this context, we mention the classical work of Cunningham on the attack problem [11] which can be seen as the problem of deleting edges to reduce density; this edge deletion problem can be solved in polynomial time for integer parameters via matroidal and network flow techniques. In addition to their naturalness and the recent work, we are motivated to consider GraphDD and SupmodDD owing to their connections to several classical vertex deletion problems as well as a matroidal structure underlying GraphDD that we articulate next.
We observe that is equivalent to the vertex cover problem: requiring density of after deleting is equivalent to being a vertex cover of . One can also see, in a similar fashion, that is equivalent to the pseudoforest deletion set problem, denoted PFDS– where the goal is to delete vertices so that every connected component in the remaining graph has at most one cycle, and is equivalent to the feedback vertex set problem, denoted FVS– where the goal is to delete vertices so that the remaining graph is acyclic (see [5] for formal definitions of both problems). Vertex cover, PFDS, and FVS admit -approximations, and moreover this bound cannot be improved under the Unique Games Conjecture (UGC) [22]. We note that while -approximations for vertex cover are relatively easy, -approximations for FVS and PFDS are non-obvious [1, 3, 10]. Until very recently there was no polynomial-time solvable linear program (LP) that yielded a -approximation for FVS and PFDS. In fact, the new and recent LP formulations [5] for FVS and PFDS are obtained via connections to Charikar’s LP-relaxation for DSG [7]. Fujito [18] unified the -approximations for vertex cover, FVS, and PFDS via primal-dual algorithms by considering a more general class of matroidal vertex deletion problem on graphs that is relevant to our work. This abstract problem, denoted MatroidFVS 111We use the feedback vertex set terminology in our naming of the MatroidFVS problem since the goal is to pick a min-cost subset of vertices to cover all circuits of the matroid defined on the edges of a graph. This generalizes FVS which is MatroidFVS where the matroid of interest is the graphic matroid on the input graph., is defined below.
Definition 3 (MatroidFVS).
The Matroid Feedback Vertex Set problem, denoted MatroidFVS, is defined as follows:
Input: | Graph , vertex costs , and |
---|---|
Matroid with being the collection of independent sets | |
(via an independence testing oracle) | |
Goal: | . |
Fujito [18] obtained a -approximation for MatroidFVS for the class of “uniformly sparse” matroids [24]. It is not difficult to show that vertex cover, FVS, and PFDS can be cast as special cases of MatroidFVS where the associated matroids are “uniformly sparse”. Consequently, Fujito’s result unifies the -approximations for these three fundamental problems.
We now observe some non-trivial connections between -GraphDD, MatroidFVS and . We can show that -GraphDD is a special case of MatroidFVS for every integer : indeed, -GraphDD corresponds to MatroidFVS where the matroid is the -fold union of the -cycle matroid defined on the edge set of the input graph. Although it is not obvious, we can show that MatroidFVS is a special case of . We refer the reader to Appendix A of the full version [6] of this work for details regarding these connections, and the problems in the right column in Figure 1(b) for a pictorial representation of the reductions discussed so far. Given these connections and the existence of a -approximation for vertex cover, FVS, and PFDS, we are led to the following questions.
Question 1.
What is the approximability of -GraphDD, MatroidFVS, and -SupmodDD? Do these admit constant factor approximations?
1.1 Results
In this section, we give an overview of our technical results that resolve Question 1 up to a constant factor gap.
1.1.1 Connections between SubmodCover and SupmodDD
We obtain a logarithmic approximation for , MatroidFVS, and via a reduction to the submodular cover problem and using the Greedy algorithm for it due to Wolsey [31]. First, we recall the submodular cover problem.
Definition 4 (SubmodCover).
The submodular cover problem, denoted SubmodCover, is defined as follows:
Input: | Integer-valued normalized non-decreasing submodular function |
---|---|
via evaluation oracle and element costs | |
Goal: | . |
For a function , we define the marginal for every and . We show the following result.
Theorem 5.
Let be an integer-valued normalized supermodular function and be a rational number. Then, there exists a normalized non-decreasing submodular function such that for , we have that if and only if . Moreover,
-
1.
if is an integer, then is integer-valued,
-
2.
for all , and
-
3.
evaluation queries for the function can be answered in polynomial time by making polynomial number of evaluation queries to the function .
We discuss the consequences of Theorem 5 for . We recall that SubmodCover admits a -approximation for input function via the Greedy algorithm of Wolsey [31]. Consider for integer-valued . By Theorem 5, we have a reduction to SubmodCover and consequently, we have a -approximation. In particular, we note that for integer-valued and MatroidFVS admit -approximation, where is the number of vertices in the input graph.
Corollary 6.
for integer-valued admits an -approximation, where is the input integer-valued, normalized supermodular function. Consequently, for integer valued and MatroidFVS admit -approximations, where is the number of vertices in the input graph.
Remark 7.
The reduction from SupmodDD to SubmodCover is in some sense implicit in prior literature (see [29, 18] for certain special cases of supermodular functions). We note that the reduction from FVS to SubmodCover which follows from this connection does not seem to be well-known in the literature, and the authors of this paper were not aware of it until recently.
From a structural point of view we also prove that SubmodCover reduces to , thus essentially showing the equivalence of SubmodCover and SupmodDD. We believe that it is useful to have this equivalence explicitly known given that vertex deletion problems arise naturally but seem different from covering problems on first glance.
Theorem 8.
Let be an integer-valued normalized non-decreasing submodular function. Then, there exists a normalized supermodular function such that for , we have that if and only if . Moreover,
-
1.
for all , and
-
2.
evaluation queries for the function can be answered in polynomial time by making a constant number of evaluation queries to the function .
1.1.2 Hardness of Approximation
A starting point for our attempt to answer Question 1 was our belief that -GraphDD for integer admits a -approximation via the primal-dual approach suggested by Fujito for MatroidFVS [18]. This belief stems from Fujito’s work which showed a -approximation for vertex cover, FVS, PFDS, and MatroidFVS for “uniformly sparse” matroids and our reduction showing that -GraphDD for integral is a special case of MatroidFVS (see Appendix A of the full version of this work [6]). We note that the matroid that arises in the reduction is not a “uniformly sparse” matroid but has lot of similarities with it, so our initial belief was that a more careful analysis would lead to a constant factor approximation. However, to our surprise, after several unsuccessful attempts to prove a constant factor upper bound, we were able to show that for every integer , -GraphDD is -hard to approximate via a reduction from Set Cover.
Theorem 9.
For every integer , there is no approximation for assuming , where is the number of vertices in the input instance.
Thus, -GraphDD exhibits a phase transition: it admits a -approximation for (via Fujito’s results [18]) and becomes -hard for every integer . To conclude our hardness results, we note that since GraphDD is a special case of MatroidFVS, which itself is a special case of SupmodDD, both MatroidFVS and SupmodDD are -inapproximable. However, both these problems are also -approximable via Corollary 6. Thus, we resolve the approximability of all these problems to within a small constant factor. We refer the reader to Figure 1 for an illustration of problems considered in this work and approximation-factor preserving reductions between them.
1.1.3 Bicriteria Approximations
The hardness result for -GraphDD (and -GraphDD) motivates us to consider bicriteria approximation algorithms. Can we obtain constant factor approximation by relaxing the requirement of meeting the density target exactly? We show that this is indeed possible. We consider an orientation based LP that was used recently to obtain a polynomial-time solvable LP to approximate FVS and PFDS [5]. We observed that this LP has an integrality gap when considering -GraphDD. Nevertheless, the LP is useful in obtaining the following bicriteria approximation.
Theorem 10.
There exists a polynomial time algorithm which takes as input a graph , vertex deletion costs , a target density , and an error parameter , and returns a set such that:
-
1.
,
-
2.
,
where OPT denotes the cost of an optimum solution to GraphDD on the instance .
Next, we consider -SupmodDD. Unlike the case of graphs, it is not clear how to write an integer programming formulation for SupmodDD whose LP-relaxation is polynomial-time solvable. Instead, we take inspiration from the very recent work of [32] on vertex deletion to reduce treewidth. We design a combinatorial randomized algorithm that yields a bicriteria approximation for SupmodDD, where the bicriteria approximation bounds are based on a parameter that depends on the input supermodular function . For a normalized supermodular function , we define
This parameter was defined in a recent work on DSS to unify the analysis of the greedy peeling algorithm for DSG [9]. We note that and moreover, if and only if the function is modular. If is the induced edge function of a graph (i.e., is the number of edges with all its end-vertices in for every subset of vertices), then . This follows from the observation that the sum of degrees is at most twice the number of edges in a graph. Similarly, if is the induced edge function of a hypergraph with rank (i.e., all hyperedges have size at most ), then . We show the following bicriteria approximation for SupmodDD.
Theorem 11.
There exists a randomized polynomial time algorithm which takes as input a normalized monotone supermodular function (given by oracle access), element deletion costs , a target density , and an error parameter , and returns a set such that:
-
1.
, and
-
2.
,
where OPT denotes the cost of an optimum solution to on the instance .
As a consequence of Theorem 11, we obtain a bicriteria approximation for density deletion problems in graphs and -rank hypergraphs. We note that the bicriteria guarantee that we get from this theorem for graphs is weaker than the guarantee stated in Theorem 10. We discuss another special case of SupmodDD where the supermodular function of interest has bounded to illustrate the significance of Theorem 11. Given a graph and a parameter , the -mean density of is defined as , where is the number of edges in incident to the vertex . The -mean density of graphs was introduced and studied by Veldt, Benson, and Kleinberg [30]. Subsequent work by Chekuri, Quanrud, and Torres [9] showed that -mean density is a special case of the densest supermodular set problem (i.e., DSS) where the supermodular function of interest is given by for every and moreover . We note that the natural vertex deletion problem is the -mean density deletion problem: given a graph with vertex deletion costs, find a min-cost subset of vertices to delete so that the -mean density of the remaining graph is at most a given threshold. Since for the function of interest here, Theorem 11 implies a bicriteria approximation for -mean density deletion for integer-valued . An interesting open question is to obtain better bicriteria approximation for – in particular, can we remove the dependence on ?
Organization.
In Section 2, we show an approximation-preserving reduction from SetCover to and prove Theorem 9. In Section 3, we give a bicriteria approximation for and prove Theorem 11. Finally, in Section 4 we show the connections between SupmodDD and Submodular Cover by proving Theorems 5 and 8. We defer the proof of Theorem 10 to the full version of this work [6].
2 Hardness of Approximation
In this section, we show Theorem 9, i.e., we show an approximation preserving reduction from SetCover to . We recall the set cover problem and its inapproximability.
Definition 12 (SetCover).
The set cover problem, denoted SetCover, is defined as follows:
Input: | Finite Universe , Family with costs |
Goal: | . |
Theorem 13 ([15, 14]).
For every , there does not exist a -approximation for SetCover assuming P NP, where is the size of the input instance.
We will also need the following characterization of density via orientations. We recall that an orientation of a graph assigns each edge to either or . For notational convenience, we use to denote an orientation of the graph and to denote the indegree of a vertex (i.e., the number of edges assigned to ) in the oriented graph . The following connection between density and orientations will be used in our hardness reduction. This characterization is implied by a result of Hakimi [16] and also by the dual of Charikar’s LP to solve DSG [7].
Proposition 14.
Let be a graph and let be an integer value. Then, we have that if and only if there exists an orientation of the graph such that for every .
We now restate and prove Theorem 9.
See 9
Proof.
We show the theorem when the target density via a reduction from SetCover. At the end of the proof, we remark on how to modify the reduction to obtain hardness for all integers as claimed in the theorem. Let be a SetCover instance. We will assume (without loss of generality) that all elements in have the same frequency which is a power of – we note that this assumption is not a technical requirement and is only for ease of exposition. For this instance, we construct an instance of -GraphDD as follows:
-
1.
Add vertices representing sets: For each set , add a set-vertex to .
-
2.
Add binary trees representing elements: For each element , add a complete binary tree with the leaves as the set-vertices corresponding to the sets containing . We denote this tree as and its root as .
-
3.
Add self-loops: For each element , add a self-loop to the root vertex of the tree . For each set , add self-loops to the vertex .
-
4.
Define cost function: We define the cost function as follows: for all and for all .
We refer the reader to Figure 2(a) for pictorial depictions of the instance constructed via the reduction above. The next claim shows the correctness of our reduction and also implies that our reduction is approximation-preserving. The approximation hardness guarantee of the theorem when the target density then follows by Theorem 13 and the observation that the number of vertices is a constant factor of the size of the input set cover instance.
Claim 15.
The instance has a feasible solution to -GraphDD of finite cost if and only if has a SetCover of cost .
Proof.
We first show the forward direction of the claim. Let be a feasible solution to -GraphDD of finite cost . By construction, we have that . Let denote the corresponding sets. By way of contradiction, suppose is not a set cover. Consequently, there exists an element not covered by . For convenience, we use to denote the sets that contain the element . We note that since the element is not covered by , we have that . Let denote the set of vertices obtained by including all vertices of the binary tree . Then, the following gives us a contradiction:
Here, the first inequality is because is a feasible solution to -GraphDD. The second inequality is by definition of graph density. The equality is because there are self loop edges, non-self loop edges, and vertices in the tree by construction.
We now show the reverse direction. Let be a set cover. Let . Then, we show that the graph has density at most . By Proposition 14(2), it suffices to exhibit an orientation of the graph in which the indegree of every vertex is at most . We first consider the following intermediate orientation of . For each element , we do the following: for vertex , we denote as the (unique) parent of in the (rooted) tree, and we orient the edge towards the parent . All self-loops are assumed to be trivially oriented. Let denote this intermediate orientation restricted to the graph , and denote the set of all root vertices. Refer to Figure 2(b) for a pictorial depiction of orientation . We now make three important observations regarding the indegrees in the orientation .
Observation 16.
We have the following:
-
1.
for all , ,
-
2.
for all , , and
-
3.
for each element , there exists a set such that .
Proof.
We show each statement separately below.
-
1.
We note that the statement easily follows by construction for the set vertices . Let be an arbitrary element and let be a non-root internal vertex of the binary tree. Since has exactly two child nodes and one parent node in , we have that . We note that the inequality may be strict if any children of belong to the set .
-
2.
Let . We note that has exactly children and self loop, and consequently as claimed. Here, we note that bo child of belongs to the set because of our simplifying assumption that .
-
3.
Let . Since is a set cover, there exists a set such that . Consequently, , and so by construction.
We now use the orientation and Observation 16 to construct the orientation which certifies that the graph has density at most . We note that by Observation 16(1) and (2), it suffices to modify the orientation to reduce the indegree of all root vertices in to while keeping all other indegrees at most . Consider an arbitrary element . Let be an arbitrary vertex of such that . We note that such a vertex exists by Observation 16(3). We consider the unique path from to in . We note that by the construction of the orientation , every edge along this path is oriented in the direction of the path. Consider the orientation obtained by reorienting these edges in the reverse direction of the path. Refer to Figure 2(c) for a pictorial depiction of the modified orientation. We note that only the indegrees of and change due to this reorientation. In particular, for all , we have that and . This concludes the proof. The preceding reduction can be modified to show approximation hardness for all integral . Suppose that , where . Then, we construct the same graph as above with additional self-loops on each vertex. The proof generalizes.
Remark 17.
The addition of self-loops is not technically necessary for the construction; they simply make it more straightforward. Specifically, a vertex with self-loops can be replaced by a subgraph with density exactly . In this subgraph, one vertex, say , is identified with and its cost is defined as . All other vertices have infinite cost. All edges incident to are then redirected to connect to instead.
3 Bicriteria Approximation for GraphDD
In this section, we describe a randomized combinatorial bicriteria approximation algorithm for -SupmodDD and prove Theorem 11. The algorithm is inspired by the ideas of the recent work of Włodarczyk [32]. Our algorithm is based on the following idea. Suppose that we had non-negative potentials for the elements of the ground set such that the potential value of an optimal solution is large, say at least . Then, a natural algorithm – at least when the vertex deletion costs are uniform – would be to compute the potentials, sample an element in proportion to the potentials and delete it, define a residual instance, and repeat. This would ensure an -approximation for the problem in expectation via a martingale argument.
Unfortunately, our hardness result suggests that we are unlikely to obtain good potentials. In fact, the hard instances seem to be the functions that have density very close to the target density. However, we leverage supermodularity to show that if the density of the input function is at least times the target density, then we can indeed find such good vertex potentials. In order to ensure that the density of the input function is at least times the target density, we perform a preprocessing step to prune certain elements from the ground set without changing the cost of an optimal solution. Overall, this gives us an -bicriteria guarantee, where the values and are as given in Theorem 11. We also note that the cost function to delete vertices may be arbitrary – we overcome this by using the natural bang-per-buck sampling strategy, i.e. we sample a vertex in proportion to .
The rest of the section is organized as follows. In Section 3.1 we give our preprocessing step based on the dense decomposition of supermodular functions. In Section 3.2 we describe our element potentials based on marginal gains of supermodular functions. In Section 3.3 we present our algorithm and complete the proof of Theorem 11.
3.1 Preprocessing via Dense Decomposition
We discuss the dense decomposition of a normalized non-negative supermodular set function [20, 17] and prove a lemma that will enable us to use it as a preprocessing step.
Definition 18 ([20]).
Let be a non-negative normalized supermodular function. A sequence is the dense decomposition of if
-
1.
is a partition of obtained iteratively as follows: for , is the inclusion-wise maximal set that maximizes
-
2.
the values are obtained as
We note that the dense decomposition can also be viewed algorithmically as the output of a recursive process which computes the unique inclusion-wise maximal set that maximizes the ratio and recurses on the contracted function defined as for all . It can be shown that this decomposition is unique for a supermodular . The following lemma allows us to use the dense decomposition as an algorithmic preprocessing step in the next section. In particular, the lemma says that the dense decomposition can be used to find a set such that it suffices to focus on solving the problem on the function restriction ; and additionally, the ground set elements of the restricted function have large marginal gains.
Lemma 19.
Let be a normalized non-negative supermodular function, be a cost function, and be a positive real value. Moreover, let be the dense decomposition of and for , let . Then, we have that
-
1.
every feasible solution to for the function is also a feasible solution to for the function ,
-
2.
, where and denote the costs of optimal solutions to for the functions and respectively,
-
3.
for all , and
-
4.
the set can be computed in polynomial time given access to a function evaluation oracle for .
Proof.
We show all four properties separately below.
-
1.
Let be a feasible solution to for the function and by way of contradiction suppose that is not a feasible solution to for the function . Thus, there exists a set such that . We note that this set cannot be contained in since otherwise would not be a feasible solution to for . Then, the following gives us the required contradiction:
Here, the second inequality is by supermodularity of the function . The third inequality is by the observation that for non-negative numbers . For the final inequality, observe that because is a feasible for . Furthermore, we have that by definition of the dense decomposition and .
-
2.
Let be an optimal for w.r.t. cost function . Then, we note that is a feasible for . This can be easily observed as follows: by way of contradiction, suppose that is not a feasible for . Then, there exists a set such that . Consequently, we have that , a contradiction to being a feasible for . Then, follows by non-negativity of .
-
3.
By way of contradiction, suppose that there exists a vertex such that . We recall that . Let be such that . For convenience, we will let and . We note that and are contained in and by definition of the dense decomposition. Moreover, by the definition of the set , and so by supermodularity we have that . Then, the following sequence of inequalities gives us the required contradiction.
where the second inequality is by the observation that for non-negative numbers . Here, we note that since otherwise, and so by supermodularity we have that , a contradiction.
-
4.
It is well-known that the dense decomposition (and consequently the set ) can be computed in polynomial time (given access to the function evaluation oracle for ) via supermodular maximization. This is implicit in Fujishige’s work on principle partitions [17], and is also explicitly considered in more recent works on dense decompositions for supermodular functions [20]. We omit the details of a formal proof here for brevity.
3.2 Element Potentials via Marginal Gains
We now show that the marginal gains of elements relative to the entire ground set are good potentials for a sampling-based algorithm for -SupmodDD when all the marginal gains of the input function are large enough, in particular, at least .
Lemma 20.
Let and . Let be a normalized monotone supermodular function such that for all , and let be a for . Then, we have that
Proof.
By supermodularity of , we have that
(1) |
By way of contradiction, suppose that the lemma is false. Then, we have the following.
Here, the first inequality is by definition of the parameter . The second inequality is because is a feasible for the function . The third inequality is by (1). The final inequality is by our contradiction assumption. Then, on rearranging the terms, we obtain the following contradiction.
where the final inequality is because for all .
3.3 Random Deletion Algorithm
We now describe our bicriteria algorithm for -SupmodDD and analyze its approximation factor. Algorithm 1, Lemma 23 and Lemma 24 together complete the proof of Theorem 11.
Algorithm.
Our algorithm takes as input (1) a normalized non-negative supermodular function , (2) element deletion costs , (3) target density , and (4) error parameter . The algorithm returns a set which starts off as the empty-set and is then constructed element-by-element. This is done iteratively as follows. Let . If the function has density at most , then the algorithm breaks and returns the current set . Otherwise, the algorithm first computes the dense decomposition of the function , defines the set , and redefines the function to be the restricted function – we use DenseDecompositionPreprocess to denote a subroutine that computes the set and returns the tuple . Next, the algorithms samples a random element from the (modified) set in proportion to the ratio . The algorithm then adds the vertex to the set , restricts to the ground set , and repeats the previous steps. We give a formal description of the algorithm in Algorithm 1.
Algorithm:
-
1.
-
2.
while :
-
(a)
Redefine DenseDecompositionPreprocess
-
(b)
vertex sampled from according to the following distribution:
-
(c)
and
-
(a)
-
3.
return .
Martingales.
For the analysis of our randomized algorithm, we will require the following concepts from probability theory.
Definition 21.
-
1.
A sequence of random variables is called a supermartingale w.r.t. the sequence of random variables if for each it holds that (i) is a function of , (ii) and (iii) .
-
2.
A random variable is called a stopping time with respect to the sequence of random variables if for each , the event depends only on .
The following result shows that the expected value of a random variable in the supermartingale process only decreases with time. This will be crucial in analyzing the performance of Algorithm 1.
Theorem 22 (Doob’s Optional-Stopping Theorem).
Let be a supermartingale w.r.t. the sequence of random variables and be a stopping time with respect to the process . Suppose that for some integer . Then, we have that .
Algorithm Analysis.
Henceforth, we consider the execution of Algorithm 1 on a fixed input instance . Let be the number of iterations of the while-loop – we note that is a random variable with value at most since at every iteration of the while-loop, the size of the ground set decreases by at least . Throughout the analysis, we will index the (random) variables at the iteration of the algorithm with the subscript for all . In particular, we let denote the set at the start of the iteration (so , and is defined by Step 2(c)), denote the preprocessed function after step 2(a), and denote the sampled vertex after step 2(b) of the iteration of the algorithm. For simplicity, we define , and to be the empty-function for all . The next lemma shows that the density of the function after deleting the set is at most , i.e. is a feasible solution to for the function . The proof easily follows by considering any fixed execution of the algorithm and leveraging Lemma 19(1) while inducting on . We omit details of the proof here for brevity.
Lemma 23 (Approximate Feasibility).
.
The next lemma shows that the expected cost of the solution returned by the algorithm is at most times the cost of the optimal of the function . For any restriction of the function , we use to denote the value of an optimal for with respect to the cost function .
Lemma 24 (Approximate Cost).
.
Proof.
For ease of exposition, we will use . We consider the sequence of random variables , where for all . Our strategy will be to first show that this sequence of random variables is a supermartingale, and then apply Doob’s Optional-Stopping Theorem with stopping time to bound the expected cost of the set returned by the algorithm (note that since with each iteration of the while-loop, the size of the ground set decreases by at least ). Before showing that the sequence is a supermartingale, we first show that the expected cost of a vertex chosen in step 2(b) of an iteration of the algorithm is at most an -fraction of the expected decrease in the optimum value during the iteration.
Claim 25.
for all .
Proof.
Let be an optimal for . We have the following:
where the first inequality is by Lemma 20 and the fact that by our preprocessing (Step 2(a)) and Lemma 19(3). We now show that because is an optimal solution for , the final expression in the above can be upper bounded by , thereby completing the proof of the claim. This can be seen as follows:
Here, the second inequality is by Step 2(b) of Algorithm 1 which says that the function is defined to be and Lemma 19(2). The third inequality is because is an optimal solution for and is a feasible solution for .
We now show that the sequence is a supermartingale w.r.t. the sequence of random variables chosen by the algorithm.
Claim 26.
The sequence of random variables is a supermartingale w.r.t. the sequence of random variables .
Proof.
Let be arbitrary. We note that has finite expectation and also is fully determined by the subsequence . Thus, our goal is to show that . This is equivalent to showing . We note that this inequality indeed holds because
where the inequality is by Claim 25.
By Claim 26, the sequence is a supermartingale w.r.t. the sequence of random variables. Consider the stopping time . By Theorem 22, we have that . The following then completes the proof of the lemma:
Here, the final inequality follows by observing that and applying Lemma 19(2).
4 SubmodCover and SupmodDD
See 5
Proof.
For simplicity, we define an intermediate function , and use it to define the function of interest. The functions and are as follows: for every ,
The following claim shows that the function is normalized, non-decreasing, and submodular.
Claim 27.
The function is normalized, non-decreasing, and submodular.
Proof.
We note that is normalized by definition, and is non-decreasing because the is non-decreasing. To show that is submodular, it suffices to show that the is supermodular. Let be arbitrary. Let and . Then, we have the following:
Here, the first inequality follows from supermodularity of and the second inequality is by definition of .
Let . We now show that if and only if as claimed. We have the following sequence of equivalences.
(2) | ||||
Here, the second equivalence can be seen by the following. For the forward direction, we suppose that . By way of contradiction, suppose that . By definition of the function , there exists a set such that . Thus, . Equivalently, we have that , a contradiction to our hypothesis. Here we note that as otherwise we would have that since our function is normalized, contradicting our choice of . For the reverse direction, suppose that . By way of contradiction, suppose that there exists a non-empty set such that . Then, we equivalently have that , a contradiction.
We now show that the function satisfies properties (1)-(3) of the theorem. We note that if is an integer, then is integer-valued by definition. Moreover, we can answer evaluation queries for using polynomial many evaluation queries to (via supermodular maximization). Thus satisfies properties (1) and (3). We now show property (2).
Let . We have the following:
where the final inequality is because for all such that by supermodularity of the function .
See 8
Proof.
We consider the function defined as follows: for every ,
We note that the function is normalized, non-decreasing, integer-valued and supermodular. Moreover, we can answer evaluation queries for the function using two queries to the evaluation oracle for . We note that for , if and only if can be observed by following the steps of (2) in reverse order, and so we omit the formal details here for brevity. Property (1) of the theorem can be observed as follows: for , we have the following:
where the final equality is because is normalized.
5 Conclusion
We considered several interrelated density deletion problems motivated by the question of understanding the robustness of densest subgraph. We showed tight logarithmic approximations for these problems. We showed inapproximability of graph density deletion by reduction from set cover, and approximation algorithms by exhibiting the equivalence of supermodular density deletion and submodular cover. Motivated by the hardness results, we designed bicriteria approximations. Our bicriteria approximation for graph density deletion is LP-based, and that for supermodular density deletion is a randomized combinatorial one and relies on the notion of dense decomposition of supermodular functions. We mention two open questions. First, our bicriteria approximation for supermodular density deletion depends on the parameter related to the input supermodular function (see Theorem 11). Is it possible to design a bicriteria approximation without dependence on the parameter ? Second, our hardness reduction shows that -GraphDD is -hard for every fixed constant integer . We were able to adapt our reduction to conclude that it is -hard for every fixed constant (not necessarily integers). Is -GraphDD -hard for every fixed constant ?
References
- [1] Vineet Bafna, Piotr Berman, and Toshihiro Fujito. Constant ratio approximations of the weighted feedback vertex set problem for undirected graphs. In Algorithms and Computations, pages 142–151, 1995. doi:10.1007/BFB0015417.
- [2] Cristina Bazgan, André Nichterlein, and Sofia Vazquez Alferez. Destroying Densest Subgraphs Is Hard. In 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024), pages 6:1–6:17, 2024. doi:10.4230/LIPICS.SWAT.2024.6.
- [3] Ann Becker and Dan Geiger. Optimization of Pearl’s method of conditioning and greedy-like approximation algorithms for the vertex feedback set problem. Artificial Intelligence, 83:167–188, 1996. doi:10.1016/0004-3702(95)00004-6.
- [4] Digvijay Boob, Yu Gao, Richard Peng, Saurabh Sawlani, Charalampos Tsourakakis, Di Wang, and Junxing Wang. Flowless: Extracting densest subgraphs without flow computations. In Proceedings of The Web Conference 2020, pages 573–583, 2020. doi:10.1145/3366423.3380140.
- [5] Karthekeyan Chandrasekaran, Chandra Chekuri, Samuel Fiorini, Shubhang Kulkarni, and Stefan Weltge. Polyhedral aspects of feedback vertex set and pseudoforest deletion set. Mathematical Programming, pages 1–48, 2025.
- [6] Karthekeyan Chandrasekaran, Chandra Chekuri, and Shubhang Kulkarni. On deleting vertices to reduce density in graphs and supermodular functions, 2025. doi:10.48550/arXiv.2503.08828.
- [7] Moses Charikar. Greedy Approximation Algorithms for Finding Dense Components in a Graph. In Approximation Algorithms for Combinatorial Optimization, pages 84–95, 2000. doi:10.1007/3-540-44436-X_10.
- [8] Chandra Chekuri, Aleksander Bjørn Christiansen, Jacob Holm, Ivor van der Hoog, Kent Quanrud, Eva Rotenberg, and Chris Schwiegelshohn. Adaptive out-orientations with applications. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3062–3088, 2024.
- [9] Chandra Chekuri, Kent Quanrud, and Manuel R Torres. Densest subgraph: Supermodularity, iterative peeling, and flow. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1531–1555, 2022. doi:10.1137/1.9781611977073.64.
- [10] Fabián A. Chudak, Michel X. Goemans, Dorit S. Hochbaum, and David P. Williamson. A primal–dual interpretation of two 2-approximation algorithms for the feedback vertex set problem in undirected graphs. Operations Research Letters, 22(4):111–118, 1998. doi:10.1016/S0167-6377(98)00021-2.
- [11] William H Cunningham. Optimal attack and reinforcement of a network. Journal of the ACM (JACM), 32(3):549–561, 1985. doi:10.1145/3828.3829.
- [12] Laxman Dhulipala, Quanquan C Liu, Sofya Raskhodnikova, Jessica Shi, Julian Shun, and Shangdi Yu. Differential privacy from locally adjustable graph algorithms: k-core decomposition, low out-degree ordering, and densest subgraphs. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 754–765, 2022. doi:10.1109/FOCS54457.2022.00077.
- [13] Michael Dinitz, Satyen Kale, Silvio Lattanzi, and Sergei Vassilvitskii. Almost tight bounds for differentially private densest subgraph. In Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2908–2950, 2025. doi:10.1137/1.9781611978322.94.
- [14] Irit Dinur and David Steurer. Analytical approach to parallel repetition. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, STOC, pages 624–633, 2014. doi:10.1145/2591796.2591884.
- [15] Uriel Feige. A threshold of ln n for approximating set cover. Journal of the ACM (JACM), 45(4):634–652, 1998. doi:10.1145/285055.285059.
- [16] A. Frank. Connections in combinatorial optimization. Oxford University Press, Oxford, 2011.
- [17] Satoru Fujishige. Lexicographically optimal base of a polymatroid with respect to a weight vector. Mathematics of Operations Research, 5(2):186–196, 1980. doi:10.1287/MOOR.5.2.186.
- [18] Toshihiro Fujito. Approximating node-deletion problems for matroidal properties. Journal of Algorithms, 31(1):211–227, 1999. doi:10.1006/JAGM.1998.0995.
- [19] A. V. Goldberg. Finding a maximum density subgraph. Technical report, University of California at Berkeley, USA, 1984.
- [20] Elfarouk Harb, Kent Quanrud, and Chandra Chekuri. Faster and scalable algorithms for densest subgraph and decomposition. In Advances in Neural Information Processing Systems, 2022.
- [21] Elfarouk Harb, Kent Quanrud, and Chandra Chekuri. Convergence to lexicographically optimal base in a (contra)polymatroid and applications to densest subgraph and tree packing. In 31st Annual European Symposium on Algorithms (ESA), pages 56:1–56:17, 2023. doi:10.4230/LIPICS.ESA.2023.56.
- [22] Subhash Khot and Oded Regev. Vertex cover might be hard to approximate to within 2- . Journal of Computer and System Sciences, 74(3):335–349, 2008.
- [23] Tommaso Lanciano, Atsushi Miyauchi, Adriano Fazzone, and Francesco Bonchi. A survey on the densest subgraph problem and its variants. ACM Comput. Surv., 56(8), April 2024. doi:10.1145/3653298.
- [24] Audrey Lee and Ileana Streinu. Pebble game algorithms and sparse graphs. Discrete Mathematics, 308(8):1425–1437, April 2008. doi:10.1016/J.DISC.2007.07.104.
- [25] John M. Lewis and Mihalis Yannakakis. The node-deletion problem for hereditary properties is np-complete. Journal of Computer and System Sciences, 20(2):219–230, April 1980. doi:10.1016/0022-0000(80)90060-4.
- [26] Ta Duy Nguyen and Alina Ene. Multiplicative weights update, area convexity and random coordinate descent for densest subgraph problems. In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024.
- [27] Jean-Claude Picard and Maurice Queyranne. A network flow solution to some nonlinear 0-1 programming problems, with applications to graph theory. Networks, 12(2):141–159, 1982. doi:10.1002/NET.3230120206.
- [28] Saurabh Sawlani and Junxing Wang. Near-optimal fully dynamic densest subgraph. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 181–193, 2020. doi:10.1145/3357713.3384327.
- [29] Shuichi Ueno, Yoji Kajitani, and Shin’ya Gotoh. On the nonseparating independent set problem and feedback set problem for graphs with no vertex degree exceeding three. Discrete Mathematics, 72(1):355–360, December 1988. doi:10.1016/0012-365X(88)90226-9.
- [30] Nate Veldt, Austin R Benson, and Jon Kleinberg. The generalized mean densest subgraph problem. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 1604–1614, 2021. doi:10.1145/3447548.3467398.
- [31] Laurence A Wolsey. An analysis of the greedy algorithm for the submodular set covering problem. Combinatorica, 2(4):385–393, 1982. doi:10.1007/BF02579435.
- [32] Michał Włodarczyk. Losing treewidth in the presence of weights. In Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3743–3761, 2025.