The Densest Subgraph Problem with a Convex/Concave Size Function

In the densest subgraph problem, given an edge-weighted undirected graph $G=(V,E,w)$, we are asked to find $S\subseteq V$ that maximizes the density, i.e., $w(S)/|S|$, where $w(S)$ is the sum of weights of the edges in the subgraph induced by $S$. This problem has often been employed in a wide variety of graph mining applications. However, the problem has a drawback; it may happen that the obtained subset is too large or too small in comparison with the size desired in the application at hand. In this study, we address the size issue of the densest subgraph problem by generalizing the density of $S\subseteq V$. Specifically, we introduce the $f$-density of $S\subseteq V$, which is defined as $w(S)/f(|S|)$, where $f:\mathbb{Z}_{\geq 0}\rightarrow \mathbb{R}_{\geq 0}$ is a monotonically non-decreasing function. In the $f$-densest subgraph problem ($f$-DS), we aim to find $S\subseteq V$ that maximizes the $f$-density $w(S)/f(|S|)$. Although $f$-DS does not explicitly specify the size of the output subset of vertices, we can handle the above size issue using a convex/concave size function $f$ appropriately. For $f$-DS with convex function $f$, we propose a nearly-linear-time algorithm with a provable approximation guarantee. On the other hand, for $f$-DS with concave function $f$, we propose an LP-based exact algorithm, a flow-based $O(|V|^3)$-time exact algorithm for unweighted graphs, and a nearly-linear-time approximation algorithm.


Introduction
Finding dense components in a graph is an active research topic in graph mining. Techniques for identifying dense subgraphs have been used in various applications. For example, in Web graph analysis, they are used for detecting communities (i.e., sets of web pages dealing with the same or similar topics) [9] and spam link farms [13]. As another example, in bioinformatics, they are used for finding molecular complexes in protein-protein interaction networks [4] and identifying regulatory motifs in DNA [11]. Furthermore, they are also used for expert team formation [6,20] and real-time story identification in micro-blogging streams [2].
To date, various optimization problems have been considered to find dense components in a graph. The densest subgraph problem is one of the most well-studied among them. Let $G=(V,E,w)$ be an edge-weighted undirected graph with $n=|V|$ vertices, $m=|E|$ edges, and a weight function $w:E\rightarrow \mathbb{Q}_{>0}$, where $\mathbb{Q}_{>0}$ is the set of positive rational numbers. For a subset of vertices $S\subseteq V$, let $G[S]$ denote the subgraph induced by $S$, i.e., $G[S]=(S,E(S))$, where $E(S)=\{\{i,j\}\in E \mid i,j\in S\}$. The density of $S\subseteq V$ is defined as $w(S)/|S|$, where $w(S)=\sum_{e\in E(S)}w(e)$. In the (weighted) densest subgraph problem, given an edge-weighted undirected graph $G=(V,E,w)$, we are asked to find $S\subseteq V$ that maximizes the density $w(S)/|S|$.
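In code, the quantities $w(S)$ and $w(S)/|S|$ can be computed directly from an edge list (a minimal sketch; the toy graph is illustrative):

```python
def induced_weight(edges, S):
    """Sum of weights of edges with both endpoints in S, i.e., w(S)."""
    S = set(S)
    return sum(wt for (u, v, wt) in edges if u in S and v in S)

def density(edges, S):
    """Density w(S)/|S| of a non-empty subset S."""
    return induced_weight(edges, S) / len(S)

# Toy graph: a unit-weight triangle {0,1,2} plus a pendant vertex 3.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0), (2, 3, 1.0)]
print(density(edges, {0, 1, 2}))     # triangle: 3 edges / 3 vertices = 1.0
print(density(edges, {0, 1, 2, 3}))  # whole graph: 4 edges / 4 vertices = 1.0
```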
The densest subgraph problem has received significant attention recently because it can be solved exactly in polynomial time and approximately in nearly linear time. In fact, there exist a flow-based exact algorithm [14] and a linear-programming-based (LP-based) exact algorithm [7]. Charikar [7] demonstrated that the greedy algorithm designed by Asahiro et al. [3], called the greedy peeling, obtains a 2-approximate solution for any instance. This algorithm runs in $O(m+n\log n)$ time for weighted graphs and $O(m+n)$ time for unweighted graphs.
However, the densest subgraph problem has a drawback; it may happen that the obtained subset is too large or too small in comparison with the size desired in the application at hand. To overcome this issue, some variants of the problem have often been employed. The densest $k$-subgraph problem (DkS) is a straightforward size-restricted variant of the densest subgraph problem [10]. In this problem, given an additional input $k$ being a positive integer, we are asked to find $S\subseteq V$ of size $k$ that maximizes the density $w(S)/|S|$. Note that in this problem, the objective function can be replaced by $w(S)$ since $|S|$ is fixed to $k$. Unfortunately, it is known that this size restriction makes the problem much harder to solve. In fact, Khot [16] proved that DkS has no PTAS under some reasonable computational complexity assumption. The current best approximation algorithm has an approximation ratio of $O(n^{1/4+\epsilon})$ for any $\epsilon>0$ [5].
Furthermore, Andersen and Chellapilla [1] introduced two relaxed versions of DkS. The first problem, the densest at-least-k-subgraph problem (DalkS), asks for S ⊆ V that maximizes the density w(S)/|S| under the size constraint |S| ≥ k. For this problem, Andersen and Chellapilla [1] adopted the greedy peeling, and demonstrated that the algorithm yields a 3-approximate solution for any instance. Later, Khuller and Saha [17] investigated the problem more deeply. They proved that DalkS is NP-hard, and designed a flow-based algorithm and an LP-based algorithm. These algorithms have an approximation ratio of 2, which improves the above approximation ratio of 3. The second problem is called the densest at-most-k-subgraph problem (DamkS), which asks for S ⊆ V that maximizes the density w(S)/|S| under the size constraint |S| ≤ k. The NP-hardness is immediate since finding a maximum clique can be reduced to it. Khuller and Saha [17] proved that approximating DamkS is as hard as approximating DkS, within a constant factor.

Our Contribution
In this study, we address the size issue of the densest subgraph problem by generalizing the density of $S\subseteq V$. Specifically, we introduce the $f$-density of $S\subseteq V$, which is defined as $w(S)/f(|S|)$, where $f:\mathbb{Z}_{\geq 0}\rightarrow\mathbb{R}_{\geq 0}$ is a monotonically non-decreasing function with $f(0)=0$. Note that $\mathbb{Z}_{\geq 0}$ and $\mathbb{R}_{\geq 0}$ are the sets of nonnegative integers and nonnegative real numbers, respectively. In the $f$-densest subgraph problem ($f$-DS), we aim to find $S\subseteq V$ that maximizes the $f$-density $w(S)/f(|S|)$. For simplicity, we assume that $E\neq\emptyset$. Hence, any optimal solution $S^*\subseteq V$ satisfies $|S^*|\geq 2$. Although $f$-DS does not explicitly specify the size of the output subset of vertices, we can handle the above size issue using a convex size function $f$ or a concave size function $f$ appropriately. In fact, we can show that any optimal solution to $f$-DS with convex (resp. concave) function $f$ has a size smaller (resp. larger) than or equal to that of any densest subgraph (i.e., any optimal solution to the densest subgraph problem). For details, see Sections 2 and 3.
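The size effect of a convex versus a concave $f$ can be checked exhaustively on a toy instance (a brute-force sketch; the graph and the particular choices $f(x)=x^2$ and $f(x)=\sqrt{x}$ are illustrative):

```python
from itertools import combinations
from math import sqrt

# Two unit-weight triangles {0,1,2} and {3,4,5} joined by the edge {2,3}.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
V = range(6)

def w(S):
    S = set(S)
    return sum(1 for (u, v) in edges if u in S and v in S)

def best(f):
    """Maximize the f-density w(S)/f(|S|) over all subsets of size >= 2."""
    return max((set(c) for k in range(2, 7) for c in combinations(V, k)),
               key=lambda S: w(S) / f(len(S)))

S_dense = best(lambda x: x)        # ordinary density
S_convex = best(lambda x: x * x)   # convex f: favors small subsets
S_concave = best(sqrt)             # concave f: favors large subsets
print(len(S_convex), len(S_dense), len(S_concave))  # 3 6 6
```

Here the convex size function picks out a single triangle, while the densest subgraph and the concave objective both select the whole graph, matching the size ordering claimed above.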
Here we mention the relationship between our problem and DkS. Any optimal solution $S^*\subseteq V$ to $f$-DS is a maximum weight subset of size $|S^*|$, i.e., $S^*\in\operatorname{argmax}\{w(S)\mid S\subseteq V,\ |S|=|S^*|\}$, which implies that $S^*$ is also optimal to DkS with $k=|S^*|$. Furthermore, the iterative use of a $\gamma$-approximation algorithm for DkS leads to a $\gamma$-approximation algorithm for $f$-DS. Using the above $O(n^{1/4+\epsilon})$-approximation algorithm for DkS [5], we can obtain an $O(n^{1/4+\epsilon})$-approximation algorithm for $f$-DS.
In what follows, we summarize our results for both the cases where f is convex and where f is concave.
The case where f is convex. We first describe our results for the case where f is convex.
We first prove the NP-hardness of f -DS with a certain convex function f by constructing a reduction from DamkS. Thus, for f -DS with convex function f , one of the best possible ways is to design an algorithm with a provable approximation guarantee.
To this end, we propose a $\min\left\{\frac{|S^*|(|S^*|-1)}{2}\cdot\frac{f(2)}{f(|S^*|)},\ \frac{2f(n)/n}{f(|S^*|)-f(|S^*|-1)}\right\}$-approximation algorithm, where $S^*\subseteq V$ is an optimal solution to $f$-DS with convex function $f$. Our algorithm consists of the following two procedures, and outputs the better solution found by them. The first one is based on the brute-force search, which obtains a $\frac{|S^*|(|S^*|-1)}{2}\cdot\frac{f(2)}{f(|S^*|)}$-approximate solution in $O(m+n)$ time. The second one adopts the greedy peeling, which obtains a $\frac{2f(n)/n}{f(|S^*|)-f(|S^*|-1)}$-approximate solution in $O(m+n\log n)$ time. Thus, the total running time of our algorithm is $O(m+n\log n)$. Our analysis of the approximation ratio of the second procedure extends the analysis by Charikar [7] for the densest subgraph problem.
At the end of our analysis, we observe the behavior of the approximation ratio of our algorithm for three concrete size functions. We consider size functions between linear and quadratic because, as we will see later, $f$-DS with any super-quadratic size function is a trivial problem; in fact, it only produces constant-size optimal solutions. The first example is $f(x)=x^\alpha$ ($\alpha\in[1,2]$). We show that the approximation ratio of our algorithm is $2\cdot n^{(\alpha-1)(2-\alpha)}$, where the worst-case performance of $2\cdot n^{1/4}$ is attained at $\alpha=1.5$. The second example is $f(x)=\lambda x+(1-\lambda)x^2$ ($\lambda\in[0,1)$). For this case, the approximation ratio of our algorithm is $(2-\lambda)/(1-\lambda)$, which is a constant for any fixed $\lambda$. The third example is $f(x)=x^2/(\lambda x+(1-\lambda))$ ($\lambda\in[0,1]$). Note that this size function is derived from the density function $\lambda\cdot\frac{w(S)}{|S|}+(1-\lambda)\cdot\frac{w(S)}{|S|^2}$. The approximation ratio of our algorithm for this case is $4/(1+\lambda)$, which is at most 4.
The case where f is concave. We next describe our results for the case where f is concave.
Unlike the above convex case, f -DS in this case can be solved exactly in polynomial time.
In fact, we present an LP-based exact algorithm, which extends Charikar's exact algorithm for the densest subgraph problem [7] and Khuller and Saha's 2-approximation algorithm for DalkS [17]. It should be emphasized that our LP-based algorithm obtains not only an optimal solution to $f$-DS but also some attractive subsets of vertices. Let us see an example in Figure 1. The graph consists of 8 vertices and 11 unweighted edges (i.e., $w(e)=1$ for every $e\in E$). For this graph, we plotted all the points contained in $P=\{(|S|,w(S))\mid S\subseteq V\}$. We refer to the extreme points of the upper convex hull of $P$ as the dense frontier points. The (smallest) densest subgraph is a typical subset of vertices corresponding to a dense frontier point. Our LP-based algorithm obtains a corresponding subset of vertices for every dense frontier point. It should be noted that the algorithm SSM designed by Nagano, Kawahara, and Aihara [18] can also be used to obtain a corresponding subset of vertices for every dense frontier point. The difference between their algorithm and ours is that their algorithm is based on the computation of a minimum norm base, whereas ours solves linear programming problems.
Moreover, in this concave case, we design a combinatorial exact algorithm for unweighted graphs. Our algorithm is based on the standard technique for fractional programming. By using the technique, we can reduce $f$-DS to a sequence of submodular function minimizations. However, the direct application of a submodular function minimization algorithm leads to a computationally expensive algorithm that runs in $O(n^5(m+n)\cdot\log n)$ time. To reduce the computation time, we replace the submodular function minimization algorithm with a much faster flow-based algorithm that substantially extends the technique of Goldberg's flow-based algorithm for the densest subgraph problem [14].
The total running time of our algorithm is $O(n^3)$. Modifying this algorithm, we also present an $O\!\left(\frac{n^3}{\log n}\cdot\log\frac{\log n}{\epsilon}\right)$-time $(1+\epsilon)$-approximation algorithm for weighted graphs. Although our flow-based algorithm is much faster than the reduction-based algorithm, the running time is still long for large graphs. To design an algorithm with much higher scalability, we adopt the greedy peeling. As mentioned above, this algorithm runs in $O(m+n\log n)$ time for weighted graphs and $O(m+n)$ time for unweighted graphs. We prove that the algorithm yields a 3-approximate solution for any instance.

Related Work
Tsourakakis et al. [20] introduced a general optimization problem to find dense subgraphs, which is referred to as the optimal $(g,h,\alpha)$-edge-surplus problem. In this problem, given an unweighted undirected graph $G=(V,E)$, we are asked to find $S\subseteq V$ that maximizes $\text{edge-surplus}_\alpha(S)=g(|E(S)|)-\alpha h(|S|)$, where $g$ and $h$ are strictly monotonically increasing functions, and $\alpha>0$ is a constant. The intuition behind this optimization problem is the same as that of $f$-DS. In fact, the first term $g(|E(S)|)$ prefers $S\subseteq V$ that has a large number of edges, whereas the second term $-\alpha h(|S|)$ penalizes $S\subseteq V$ with a large size. Tsourakakis et al. [20] were motivated by finding near-cliques (i.e., relatively small dense subgraphs), and they derived the function $\mathrm{OQC}_\alpha(S)=|E(S)|-\alpha\binom{|S|}{2}$, which is called the OQC function, by setting $g(x)=x$ and $h(x)=x(x-1)/2$. For OQC function maximization, they adopted the greedy peeling and a simple local search heuristic.
Recently, Yanagisawa and Hara [21] introduced the density function $|E(S)|/|S|^\alpha$ for $\alpha\in(1,2]$, which they called the discounted average degree. For discounted average degree maximization, they designed an integer-programming-based exact algorithm, which is applicable only to graphs with a maximum of a few thousand edges. They also designed a local search heuristic, which is applicable to web-scale graphs but has no provable approximation guarantee. As mentioned above, our algorithm for $f$-DS with convex function $f$ runs in $O(m+n\log n)$ time, and for the corresponding size function $f(x)=x^\alpha$ it has a provable approximation ratio of $2\cdot n^{(\alpha-1)(2-\alpha)}\ (\leq 2\cdot n^{1/4})$.

Convex Case
In this section, we investigate $f$-DS with convex function $f$. A function $f:\mathbb{Z}_{\geq 0}\rightarrow\mathbb{R}_{\geq 0}$ is said to be convex if $f(x)-2f(x+1)+f(x+2)\geq 0$ holds for any $x\in\mathbb{Z}_{\geq 0}$. We remark that $f(x)/x$ is monotonically non-decreasing for $x$ since we assume that $f(0)=0$. It should be emphasized that any optimal solution to $f$-DS with convex function $f$ has a size smaller than or equal to that of any densest subgraph. To see this, let $S^*\subseteq V$ be any optimal solution to $f$-DS and $S^*_{\mathrm{DS}}\subseteq V$ be any densest subgraph. Then we have
$$\frac{w(S^*)}{f(|S^*|)}\;\geq\;\frac{w(S^*_{\mathrm{DS}})}{f(|S^*_{\mathrm{DS}}|)}\quad\text{and}\quad\frac{w(S^*_{\mathrm{DS}})}{|S^*_{\mathrm{DS}}|}\;\geq\;\frac{w(S^*)}{|S^*|}. \tag{1}$$
Multiplying the two inequalities yields $\frac{f(|S^*_{\mathrm{DS}}|)}{|S^*_{\mathrm{DS}}|}\geq\frac{f(|S^*|)}{|S^*|}$. This implies that $|S^*|\leq|S^*_{\mathrm{DS}}|$ holds because $f(x)/x$ is monotonically non-decreasing.

Hardness
We first prove that f -DS with convex function f contains DamkS as a special case.
Theorem 1. $f$-DS with convex function $f$ is NP-hard. Specifically, for a DamkS instance with parameter $k$, the reduction uses a size function $f$ of the form $f(x)=\max\{x,\ M(x-k)+k\}$, where the slope $M$ is chosen sufficiently large in terms of $w(E)$ and $w(e)$, and $e$ is an arbitrary edge.

Proof. Since the maximum of linear functions is convex, the function $f$ is convex. For any $S\subseteq V$ with $|S|\leq k$, we have $f(|S|)=|S|$, so the $f$-density of $S$ equals its ordinary density. On the other hand, for any $S\subseteq V$ with $|S|>k$, the value $f(|S|)$ is so large that the $f$-density of $S$ is smaller than that of the single edge $e$, which implies that $S$ is not optimal to $f$-DS. Hence $f$-DS with this function $f$ is equivalent to DamkS. Thus, we have the theorem.

Our Algorithm
In this subsection, we provide an algorithm for $f$-DS with convex function $f$. Our algorithm consists of the following two procedures, and outputs the better solution found by them. Let $S^*\subseteq V$ be an optimal solution to the problem. The first one is based on the brute-force search, which obtains a $\frac{|S^*|(|S^*|-1)}{2}\cdot\frac{f(2)}{f(|S^*|)}$-approximate solution. The second one adopts the greedy peeling [3], which obtains a $\frac{2f(n)/n}{f(|S^*|)-f(|S^*|-1)}$-approximate solution. Combining these results, both of which will be proved later, we have the following theorem.
Theorem 2. Let $S^*\subseteq V$ be an optimal solution to $f$-DS with convex function $f$. For the problem, our algorithm runs in $O(m+n\log n)$ time, and has an approximation ratio of $\min\left\{\frac{|S^*|(|S^*|-1)}{2}\cdot\frac{f(2)}{f(|S^*|)},\ \frac{2f(n)/n}{f(|S^*|)-f(|S^*|-1)}\right\}$.

Brute-Force Search
As will be shown below, to obtain a $\frac{|S^*|(|S^*|-1)}{2}\cdot\frac{f(2)}{f(|S^*|)}$-approximate solution, it suffices to find the heaviest edge (i.e., $\operatorname{argmax}\{w(e)\mid e\in E\}$), which can be done in $O(m+n)$ time. Here we present a more general algorithm, which is useful in some cases. Our algorithm examines all the subsets of vertices of size at most $k$, and then returns an optimal subset among them, where $k\geq 2$ is a constant. For reference, we describe the procedure in Algorithm 1. This algorithm can be implemented to run in $O((m+n)n^k)$ time because the number of subsets with at most $k$ vertices is $\sum_{i=0}^{k}\binom{n}{i}=O(n^k)$ and the value of $w(S)/f(|S|)$ for each $S\subseteq V$ can be computed in $O(m+n)$ time.
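The brute-force search of Algorithm 1 can be sketched as follows (a hypothetical implementation over an explicit weighted edge list; with $k=2$ it reduces to scanning for the heaviest edge):

```python
from itertools import combinations

def brute_force(V, edges, f, k=2):
    """Algorithm 1 (sketch): return the best subset of size at most k
    under the f-density w(S)/f(|S|); O((m+n) n^k) time overall."""
    def w(S):
        return sum(wt for (u, v, wt) in edges if u in S and v in S)
    best, best_val = None, float("-inf")
    for size in range(2, k + 1):
        for c in combinations(V, size):
            S = frozenset(c)
            val = w(S) / f(len(S))
            if val > best_val:
                best, best_val = S, val
    return best

# With k = 2 this just finds the heaviest edge.
edges = [(0, 1, 5.0), (1, 2, 1.0), (0, 2, 1.0)]
S = brute_force(range(3), edges, f=lambda x: x ** 2, k=2)
print(sorted(S))  # [0, 1]
```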
We analyze the approximation ratio of the algorithm. Let $S^*_i\subseteq V$ denote a maximum weight subset of size $i\geq 2$, i.e., $S^*_i\in\operatorname{argmax}\{w(S)\mid S\subseteq V,\ |S|=i\}$. We refer to $w(S^*_i)/\binom{i}{2}$ as the edge density of $i$ vertices. The following lemma gives a fundamental property of the edge density.

Lemma 1. The edge density is monotonically non-increasing for the number of vertices, i.e., $\frac{w(S^*_i)}{\binom{i}{2}}\geq\frac{w(S^*_{i+1})}{\binom{i+1}{2}}$ holds for $i=2,\dots,n-1$.

Proof. Summing $w(S^*_{i+1}\setminus\{v\})$ over all $v\in S^*_{i+1}$, each edge in $E(S^*_{i+1})$ is counted exactly $i-1$ times. Hence there exists $v\in S^*_{i+1}$ with $w(S^*_{i+1}\setminus\{v\})\geq\frac{i-1}{i+1}\,w(S^*_{i+1})$, and thus $w(S^*_i)\geq w(S^*_{i+1}\setminus\{v\})\geq\frac{\binom{i}{2}}{\binom{i+1}{2}}\,w(S^*_{i+1})$, as desired.
Using the above lemma, we can provide the approximation ratio.
Lemma 2. Let $S^*\subseteq V$ be an optimal solution to $f$-DS with convex function $f$. If $|S^*|\leq k$, then Algorithm 1 obtains an optimal solution. If $|S^*|\geq k$, then it holds that
$$\frac{w(S^*_k)}{f(k)}\;\geq\;\frac{\binom{k}{2}}{\binom{|S^*|}{2}}\cdot\frac{f(|S^*|)}{f(k)}\cdot\frac{w(S^*)}{f(|S^*|)}.$$

Proof. If $|S^*|\leq k$, then Algorithm 1 obtains an optimal solution because $S^*\in\{S^*_2,\dots,S^*_k\}$. If $|S^*|\geq k$, then we have
$$\frac{w(S^*_k)}{f(k)}\;\geq\;\frac{\binom{k}{2}}{\binom{|S^*|}{2}}\cdot\frac{w(S^*)}{f(k)}\;=\;\frac{\binom{k}{2}}{\binom{|S^*|}{2}}\cdot\frac{f(|S^*|)}{f(k)}\cdot\frac{w(S^*)}{f(|S^*|)},$$
where the inequality follows from Lemma 1, which is applicable since $k\geq 2$.
From this lemma, we see that Algorithm 1 with $k=2$ has an approximation ratio of $\frac{|S^*|(|S^*|-1)}{2}\cdot\frac{f(2)}{f(|S^*|)}$.

Algorithm 2: Greedy peeling
1 $S_n\leftarrow V$;
2 for $i=n,n-1,\dots,2$ do
3   Find $v_i\in\operatorname{argmin}_{v\in S_i}d_{S_i}(v)$ and let $S_{i-1}\leftarrow S_i\setminus\{v_i\}$;
4 return $S\in\{S_1,\dots,S_n\}$ that maximizes $w(S)/f(|S|)$;

Greedy Peeling
Here we adopt the greedy peeling. The following lemma provides the approximation ratio.
Lemma 3. Let $S^*\subseteq V$ be an optimal solution to $f$-DS with convex function $f$. Algorithm 2 returns $S\subseteq V$ that satisfies
$$\frac{w(S)}{f(|S|)}\;\geq\;\frac{f(|S^*|)-f(|S^*|-1)}{2f(n)/n}\cdot\frac{w(S^*)}{f(|S^*|)}.$$

Proof. Choose an arbitrary vertex $v\in S^*$. By the optimality of $S^*$, we have
$$\frac{w(S^*)}{f(|S^*|)}\;\geq\;\frac{w(S^*\setminus\{v\})}{f(|S^*|-1)}.$$
By using the fact that $w(S^*\setminus\{v\})=w(S^*)-d_{S^*}(v)$, the above inequality can be transformed to
$$d_{S^*}(v)\;\geq\;\frac{f(|S^*|)-f(|S^*|-1)}{f(|S^*|)}\cdot w(S^*). \tag{2}$$
Let $l$ be the smallest index that satisfies $S_l\supseteq S^*$, where $S_l$ is the subset of vertices of size $l$ that appears in Algorithm 2. Note that $v_l\ (\in\operatorname{argmin}_{v\in S_l}d_{S_l}(v))$ is contained in $S^*$. Then we have
$$\frac{w(S_l)}{f(l)}\;\geq\;\frac{l\cdot d_{S_l}(v_l)}{2f(l)}\;\geq\;\frac{l\cdot d_{S^*}(v_l)}{2f(l)}\;\geq\;\frac{l}{2f(l)}\cdot\frac{f(|S^*|)-f(|S^*|-1)}{f(|S^*|)}\cdot w(S^*)\;\geq\;\frac{f(|S^*|)-f(|S^*|-1)}{2f(n)/n}\cdot\frac{w(S^*)}{f(|S^*|)},$$
where the first inequality follows from the greedy choice of $v_l$, the second inequality follows from $S_l\supseteq S^*$, the third inequality follows from inequality (2), and the last inequality follows from the monotonicity of $f(x)/x$. Since Algorithm 2 considers $S_l$ as a candidate subset of the output, we have the lemma.
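The greedy peeling itself is short to implement (a sketch using a simple quadratic-time scan rather than the heap-based implementation that achieves the stated $O(m+n\log n)$ bound; the example graph is illustrative):

```python
def greedy_peeling(V, edges, f):
    """Algorithm 2 (sketch): repeatedly remove a minimum-weighted-degree
    vertex and return the best intermediate subset under w(S)/f(|S|)."""
    S = set(V)
    deg = {v: 0.0 for v in S}
    wS = 0.0
    for u, v, wt in edges:
        deg[u] += wt
        deg[v] += wt
        wS += wt
    best, best_val = set(S), wS / f(len(S))
    while len(S) > 1:
        vmin = min(S, key=deg.get)       # greedy choice v_i
        S.remove(vmin)
        wS -= deg[vmin]                  # w(S \ {v}) = w(S) - d_S(v)
        for u, v, wt in edges:           # update degrees within S
            if u == vmin and v in S:
                deg[v] -= wt
            elif v == vmin and u in S:
                deg[u] -= wt
        val = wS / f(len(S))
        if val > best_val:
            best, best_val = set(S), val
    return best

# K4 on {0,1,2,3} with a pendant vertex 4: peeling removes 4 first.
edges = [(a, b, 1.0) for a in range(4) for b in range(a + 1, 4)] + [(3, 4, 1.0)]
print(sorted(greedy_peeling(range(5), edges, f=lambda x: x)))  # [0, 1, 2, 3]
```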

Examples
Here we observe the behavior of the approximation ratio of our algorithm for three concrete convex size functions. We consider size functions between linear and quadratic because $f$-DS with any super-quadratic size function is a trivial problem; in fact, it only produces constant-size optimal solutions. This follows from the inequality $\frac{w(S)}{f(|S|)}\leq\frac{\binom{|S|}{2}}{f(|S|)}\cdot w(S^*_2)$ (a consequence of Lemma 1), whose right-hand side vanishes as $|S|$ grows when $f$ is super-quadratic. The following corollaries provide the approximation ratio of our algorithm.
Corollary 1. For $f(x)=x^\alpha$ ($\alpha\in[1,2]$), our algorithm has an approximation ratio of $2\cdot n^{(\alpha-1)(2-\alpha)}\ (\leq 2\cdot n^{1/4})$.

Proof. Let $s=|S^*|$. By Theorem 2, the approximation ratio is
$$\min\left\{\frac{s(s-1)}{2}\cdot\frac{2^\alpha}{s^\alpha},\ \frac{2n^{\alpha-1}}{s^\alpha-(s-1)^\alpha}\right\}\;\leq\;\min\left\{2\,s^{2-\alpha},\ \frac{2\,n^{\alpha-1}}{s^{\alpha-1}}\right\}\;\leq\;2\cdot n^{(\alpha-1)(2-\alpha)}.$$
The first inequality follows from the facts that $2^{\alpha-1}\leq 2$ and $s^\alpha-(s-1)^\alpha\geq s^{\alpha-1}$, where the latter holds by the monotonicity of $f(x)/x$. The last inequality follows from the fact that the first term and the second term of the minimum function are, respectively, monotonically non-decreasing and non-increasing for $s$, and they have the same value at $s=n^{\alpha-1}$.
Corollary 2. For $f(x)=x^2/(\lambda x+(1-\lambda))$ ($\lambda\in[0,1]$), our algorithm has an approximation ratio of $4/(1+\lambda)\ (\leq 4)$.

Proof. Let $s=|S^*|$. By Theorem 2, the approximation ratio is
$$\min\left\{\frac{s(s-1)}{2}\cdot\frac{f(2)}{f(s)},\ \frac{2f(n)/n}{f(s)-f(s-1)}\right\}\;\leq\;\frac{4}{1+\lambda},$$
where the inequality follows from the fact that the first term and the second term of the minimum function are, respectively, monotonically non-decreasing and non-increasing for $s$, and they have the same value at $s=\frac{(1+\lambda)n}{\lambda n+(1-\lambda)}$.

Concave Case
In this section, we investigate $f$-DS with concave function $f$. A function $f:\mathbb{Z}_{\geq 0}\rightarrow\mathbb{R}_{\geq 0}$ is said to be concave if $f(x)-2f(x+1)+f(x+2)\leq 0$ holds for any $x\in\mathbb{Z}_{\geq 0}$. We remark that $f(x)/x$ is monotonically non-increasing for $x$ since we assume that $f(0)=0$. It should be emphasized that any optimal solution to $f$-DS with concave function $f$ has a size larger than or equal to that of any densest subgraph. This follows from inequality (1) and the monotonicity of $f(x)/x$.

Dense Frontier Points
Here we define the dense frontier points and prove some basic properties. We denote by $P$ the set $\{(|S|,w(S))\mid S\subseteq V\}$. A point in $P$ is called a dense frontier point if it is a unique maximizer of $y-\lambda x$ over $(x,y)\in P$ for some $\lambda>0$. In other words, the extreme points of the upper convex hull of $P$ are dense frontier points. The (smallest) densest subgraph is a typical subset of vertices corresponding to a dense frontier point. We prove that (i) for any dense frontier point, there exists some concave function $f$ such that any optimal solution to $f$-DS with the function $f$ corresponds to the dense frontier point, and conversely, (ii) for any strictly concave function $f$ (i.e., $f$ that satisfies $f(x)-2f(x+1)+f(x+2)<0$ for any $x\in\mathbb{Z}_{\geq 0}$), any optimal solution to $f$-DS with the function $f$ corresponds to a dense frontier point. We first prove (i). Note that each dense frontier point can be written as $(i,w(S^*_i))$ for some $i\in\{0,1,\dots,n\}$, where $S^*_i\subseteq V$ is a maximum weight subset of size $i$. Let $(k,w(S^*_k))$ be a dense frontier point, and assume that it is a unique maximizer of $y-\hat{\lambda}x$ over $(x,y)\in P$ for $\hat{\lambda}>0$.
Consider the concave function $f$ such that $f(x)=\hat{\lambda}(x-k)+w(S^*_k)$ for $x>0$ and $f(0)=0$ (see Figure 2). Then, any optimal solution $S^*\subseteq V$ to $f$-DS with the function $f$ corresponds to the dense frontier point (i.e., $(|S^*|,w(S^*))=(k,w(S^*_k))$ holds) because $w(S)/f(|S|)$ is greater than or equal to $1$ if and only if $w(S)-\hat{\lambda}|S|\geq w(S^*_k)-\hat{\lambda}k$ holds. We next prove (ii). Let $f$ be any strictly concave function. Let $S^*_k\subseteq V$ be any optimal solution to $f$-DS with the function $f$, where $k=|S^*_k|$, and let $\beta^*=w(S^*_k)/f(k)$ denote the optimal $f$-density. Take $\hat{\lambda}$ that satisfies $\beta^*(f(k+1)-f(k))<\hat{\lambda}<\beta^*(f(k)-f(k-1))$ (see Figure 2). Note that the strict concavity of $f$ guarantees the existence of such $\hat{\lambda}$. Since $f$ is strictly concave, we have
$$w(S)-\hat{\lambda}|S|\;\leq\;\beta^*f(|S|)-\hat{\lambda}|S|\;\leq\;\beta^*f(k)-\hat{\lambda}k\;=\;w(S^*_k)-\hat{\lambda}k$$
for any $S\subseteq V$, and the inequalities hold as equalities only when $(|S|,w(S))=(k,w(S^*_k))$. Thus, $(k,w(S^*_k))$ is a unique maximizer of $y-\hat{\lambda}x$ over $(x,y)\in P$, and hence is a dense frontier point.
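For a small graph, the dense frontier points can be enumerated directly by computing $w(S^*_i)$ for every size $i$ by brute force and keeping the extreme points of the upper convex hull (an exhaustive sketch; the example graph is illustrative):

```python
from itertools import combinations

# K4 on {0,1,2,3} with a path 3-4-5 attached (unit weights).
edges = [(a, b) for a in range(4) for b in range(a + 1, 4)] + [(3, 4), (4, 5)]
n = 6

def w(S):
    S = set(S)
    return sum(1 for (u, v) in edges if u in S and v in S)

# w_max[i] = max{w(S) : |S| = i}, the maximum weight of a size-i subset.
w_max = [max(w(c) for c in combinations(range(n), i)) for i in range(n + 1)]
points = list(enumerate(w_max))

# Upper convex hull by a monotone scan; only strict turns survive, so
# collinear points (non-unique maximizers) are discarded.
hull = []
for p in points:
    while len(hull) >= 2:
        (x1, y1), (x2, y2) = hull[-2], hull[-1]
        if (x2 - x1) * (p[1] - y1) >= (p[0] - x1) * (y2 - y1):
            hull.pop()  # hull[-1] lies on or below the chord hull[-2] -> p
        else:
            break
    hull.append(p)
print(hull)  # dense frontier points: [(0, 0), (4, 6), (6, 8)]
```

Here $(4,6)$ corresponds to the densest subgraph K4, and the collinear point $(5,7)$ is correctly excluded because it is not a unique maximizer of $y-\lambda x$ for any $\lambda>0$.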

LP-Based Algorithm
We provide an LP-based polynomial-time exact algorithm. We introduce a variable $x_e$ for each $e\in E$ and a variable $y_v$ for each $v\in V$. For $k=1,\dots,n$, we construct the following linear programming problem:
$$\mathrm{LP}_k:\quad\text{maximize}\ \sum_{e\in E}w(e)\,x_e\quad\text{subject to}\ \ x_e\leq y_u,\ x_e\leq y_v\ \ (\forall e=\{u,v\}\in E),\quad\sum_{v\in V}y_v=k,\quad x_e,y_v\in[0,1].$$
For an optimal solution $(x^k,y^k)$ to $\mathrm{LP}_k$ and a real parameter $r$, we define a sequence of subsets $S_k(r)=\{v\in V\mid y^k_v\geq r\}$. For $k=1,\dots,n$, our algorithm first solves $\mathrm{LP}_k$ to obtain an optimal solution $(x^k,y^k)$, and then computes $r^*_k$ that maximizes $w(S_k(r))/f(|S_k(r)|)$. Note here that to find such $r^*_k$, it suffices to check all the distinct sets $S_k(r)$ by simply setting $r=y^k_v$ for every $v\in V$. The algorithm returns $S\in\{S_1(r^*_1),\dots,S_n(r^*_n)\}$ that maximizes $w(S)/f(|S|)$. For reference, we describe the procedure in Algorithm 3. Clearly, the algorithm runs in polynomial time.

Algorithm 3: LP-based algorithm
1 for $k=1,\dots,n$ do
2   Solve $\mathrm{LP}_k$ and obtain an optimal solution $(x^k,y^k)$;
3   Compute $r^*_k$ that maximizes $w(S_k(r))/f(|S_k(r)|)$;
4 return $S\in\{S_1(r^*_1),\dots,S_n(r^*_n)\}$ that maximizes $w(S)/f(|S|)$;
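Algorithm 3 can be prototyped with an off-the-shelf LP solver (a sketch using `scipy.optimize.linprog`; the formulation coded here, with constraints $x_e\leq y_u$, $x_e\leq y_v$, $\sum_v y_v=k$ and variables in $[0,1]$, is one natural reading consistent with the feasible solutions constructed in Lemma 4, and the graph and the choice $f(x)=\sqrt{x}$ are illustrative):

```python
from itertools import combinations
from math import sqrt
import numpy as np
from scipy.optimize import linprog

edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]  # joined triangles
n, m = 6, len(edges)
f = sqrt

def w(S):
    S = set(S)
    return sum(1 for (u, v) in edges if u in S and v in S)

def solve_lp(k):
    """LP_k: maximize sum_e x_e s.t. x_e <= y_u, x_e <= y_v, sum_v y_v = k."""
    c = np.concatenate([-np.ones(m), np.zeros(n)])  # maximize sum of x_e
    A_ub, b_ub = [], []
    for idx, (u, v) in enumerate(edges):
        for endpoint in (u, v):
            row = np.zeros(m + n)
            row[idx] = 1.0
            row[m + endpoint] = -1.0
            A_ub.append(row)
            b_ub.append(0.0)  # x_e - y_endpoint <= 0
    A_eq = np.array([np.concatenate([np.zeros(m), np.ones(n)])])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=[float(k)],
                  bounds=[(0.0, 1.0)] * (m + n), method="highs")
    return res.x[m:]  # the y-part of the optimal solution

best, best_val = None, 0.0
for k in range(1, n + 1):
    y = solve_lp(k)
    for r in sorted(set(np.round(y, 9))):  # distinct thresholds give S_k(r)
        S = {v for v in range(n) if y[v] >= r - 1e-9}
        if S and w(S) / f(len(S)) > best_val:
            best, best_val = S, w(S) / f(len(S))

# Compare with brute force over all subsets of size >= 2.
bf = max(w(set(c)) / f(len(c)) for s in range(2, n + 1)
         for c in combinations(range(n), s))
print(best_val, bf)  # the rounded LP solution matches the exact optimum
```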
In what follows, we demonstrate that Algorithm 3 obtains an optimal solution to $f$-DS with concave function $f$. The following lemma provides a lower bound on the optimal value of $\mathrm{LP}_k$.

Lemma 4. For any $S\subseteq V$, the optimal value of $\mathrm{LP}_{|S|}$ is at least $w(S)$.
Proof. For $S\subseteq V$, we construct a solution $(x,y)$ of $\mathrm{LP}_{|S|}$ as follows: set $y_v=1$ for each $v\in S$ and $y_v=0$ for each $v\in V\setminus S$, and set $x_e=1$ for each $e\in E(S)$ and $x_e=0$ for each $e\in E\setminus E(S)$. Then we can easily check that $(x,y)$ is feasible for $\mathrm{LP}_{|S|}$ and its objective value is $w(S)$. Thus, we have the lemma.
We prove the following key lemma.
Lemma 5. Let $S^*\subseteq V$ be an optimal solution to $f$-DS with concave function $f$, and let $k^*=|S^*|$. Furthermore, let $(x^*,y^*)$ be an optimal solution to $\mathrm{LP}_{k^*}$. Then, there exists a real number $r$ such that $S_{k^*}(r)$ is optimal to $f$-DS with concave function $f$.

Proof.
For each $e=\{u,v\}\in E$, we have $x^*_e=\min\{y^*_u,y^*_v\}$ from the optimality of $(x^*,y^*)$. Without loss of generality, we relabel the indices of $(x^*,y^*)$ so that $y^*_1\geq\dots\geq y^*_n$. Then we have
$$\int_0^{y^*_1}w(S_{k^*}(r))\,dr=\sum_{\{u,v\}\in E}w(\{u,v\})\int_0^{y^*_1}[\,y^*_u\geq r\text{ and }y^*_v\geq r\,]\,dr=\sum_{e\in E}w(e)\,x^*_e\;\geq\;w(S^*), \tag{3}$$
where $[\,y^*_u\geq r\text{ and }y^*_v\geq r\,]$ is the function of $r$ that takes $1$ if the condition in the square brackets is satisfied and $0$ otherwise, and the last inequality follows from Lemma 4. Moreover, we have
$$\int_0^{y^*_1}f(|S_{k^*}(r)|)\,dr=\sum_{i=1}^{n}(y^*_i-y^*_{i+1})\,f(i)=\sum_{i=1}^{n}y^*_i\,(f(i)-f(i-1))\;\leq\;f(k^*), \tag{4}$$
where $y^*_{n+1}$ is defined to be $0$ for convenience, and the inequality holds by the concavity of $f$ (i.e., $f(i)-f(i-1)$ is non-increasing in $i$) together with $y^*_i\in[0,1]$ and $\sum_{i=1}^{n}y^*_i=k^*$. Using inequalities (3) and (4), we have
$$\max_{r\in[0,\,y^*_1]}\frac{w(S_{k^*}(r))}{f(|S_{k^*}(r)|)}\;\geq\;\frac{\int_0^{y^*_1}w(S_{k^*}(r))\,dr}{\int_0^{y^*_1}f(|S_{k^*}(r)|)\,dr}\;\geq\;\frac{w(S^*)}{f(k^*)}.$$
This completes the proof.
Algorithm 3 considers $S_{k^*}(r^*_{k^*})$ as a candidate subset of the output. Therefore, we have the following result.

Theorem 3. Algorithm 3 obtains an optimal solution to $f$-DS with concave function $f$ in polynomial time.

By Lemma 5, for any concave function $f$, an optimal solution to $f$-DS with the function $f$ is contained in $\{S_k(r)\mid k=1,\dots,n,\ r\in[0,1]\}$, whose cardinality is at most $n^2$. As shown above, for any dense frontier point, there exists some concave function $f$ such that any optimal solution to $f$-DS with the function $f$ corresponds to the dense frontier point. Thus, we have the following result.
Theorem 4. We can find a corresponding subset of vertices for every dense frontier point in polynomial time.

Flow-Based Algorithm
We provide a combinatorial exact algorithm for unweighted graphs (i.e., $w(e)=1$ for every $e\in E$). We first show that using the standard technique for fractional programming, we can reduce $f$-DS with concave function $f$ to a sequence of submodular function minimizations. The critical fact is that $\max_{S\subseteq V}w(S)/f(|S|)$ is at least $\beta$ if and only if $\min_{S\subseteq V}(\beta\cdot f(|S|)-w(S))$, taken over non-empty subsets, is at most $0$. Note that for $\beta\geq 0$, the function $\beta\cdot f(|S|)-w(S)$ is submodular because $\beta\cdot f(|S|)$ and $-w(S)$ are submodular [12]. Thus, we can calculate $\min_{S\subseteq V}(\beta\cdot f(|S|)-w(S))$ in $O(n^5(m+n))$ time using Orlin's algorithm [19], which implies that we can determine whether $\max_{S\subseteq V}w(S)/f(|S|)\geq\beta$ holds or not in $O(n^5(m+n))$ time. Hence, we can obtain the value of $\max_{S\subseteq V}w(S)/f(|S|)$ by binary search. Note that the objective function of $f$-DS on unweighted graphs has at most $O(mn)$ distinct values since $w(S)$ is a nonnegative integer at most $m$. Thus, the procedure yields an optimal solution in $O(\log(mn))=O(\log n)$ iterations. The total running time is $O(n^5(m+n)\cdot\log n)$.
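The reduction can be illustrated end to end with a brute-force oracle standing in for submodular function minimization (exhaustive over subsets, so only feasible for tiny graphs; the graph and the choice $f(x)=\sqrt{x}$ are illustrative):

```python
from itertools import combinations
from math import sqrt

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]  # unit-weight triangle plus a pendant
n = 4
f = sqrt

def w(S):
    S = set(S)
    return sum(1 for (u, v) in edges if u in S and v in S)

def min_oracle(beta):
    """min over non-empty S of (beta * f(|S|) - w(S)); brute force stands in
    for a submodular function minimization algorithm."""
    return min(beta * f(k) - w(c)
               for k in range(1, n + 1) for c in combinations(range(n), k))

# max_S w(S)/f(|S|) >= beta  iff  min_S (beta * f(|S|) - w(S)) <= 0,
# so binary search on beta converges to the optimal f-density.
lo, hi = 0.0, float(len(edges))  # w(V) upper-bounds the f-density since f(1) = 1
for _ in range(60):
    mid = (lo + hi) / 2
    if min_oracle(mid) <= 0:
        lo = mid
    else:
        hi = mid

opt = max(w(c) / f(k) for k in range(2, n + 1) for c in combinations(range(n), k))
print(round(lo, 6), round(opt, 6))  # both 2.0: the whole graph is optimal
```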
To reduce the computation time, we replace Orlin's algorithm with a much faster flow-based algorithm that substantially extends a technique of Goldberg's flow-based algorithm for the densest subgraph problem [14]. The key technique is to represent the value of min S⊆V (β · f (|S|) − w(S)) using the cost of minimum cut of a certain directed network constructed from G and β ≥ 0.
For a given unweighted undirected graph $G=(V,E,w)$ (i.e., $w(e)=1$ for every $e\in E$) and a real number $\beta\geq 0$, we construct a directed network $(U,A,w_\beta)$ as follows. Note that for later convenience, we describe the construction for weighted graphs. The vertex set $U$ is defined by $U=V\cup\{p_1,\dots,p_n\}\cup\{s,t\}$, where $s$ is the source and $t$ is the sink. The edge set is $A=A_s\cup A_E\cup A_P\cup A_t$, where $A_s=\{(s,v)\mid v\in V\}$, $A_E=\{(u,v),(v,u)\mid\{u,v\}\in E\}$, $A_P=\{(v,p_k)\mid v\in V,\ k=1,\dots,n\}$, and $A_t=\{(p_k,t)\mid k=1,\dots,n\}$. The edge weight $w_\beta:A\rightarrow\mathbb{R}_{\geq 0}$ is defined by
$$w_\beta(e)=\begin{cases}d(v)/2 & (e=(s,v)\in A_s),\\ w(\{u,v\})/2 & (e=(u,v)\in A_E),\\ \beta\cdot a_k & (e=(v,p_k)\in A_P),\\ \beta\cdot k\cdot a_k & (e=(p_k,t)\in A_t),\end{cases}$$
where $d(v)$ is the (weighted) degree of vertex $v$, and $a_k=-f(k+1)+2f(k)-f(k-1)$ for $k=1,\dots,n-1$ and $a_n=f(n)-f(n-1)$. Note that $a_k\geq 0$ holds since $f$ is a monotonically non-decreasing concave function. For reference, Figure 3 depicts the network $(U,A,w_\beta)$.
The following lemma reveals the relationship between a minimum $s$-$t$ cut in $(U,A,w_\beta)$ and the value of $\min_{S\subseteq V}(\beta\cdot f(|S|)-w(S))$. Note that an $s$-$t$ cut in $(U,A,w_\beta)$ is a partition $(X,Y)$ of $U$ (i.e., $X\cup Y=U$ and $X\cap Y=\emptyset$) such that $s\in X$ and $t\in Y$, and the cost of $(X,Y)$ is defined to be $\sum_{(u,v)\in A:\,u\in X,\,v\in Y}w_\beta(u,v)$.
Lemma 6. Let (X, Y ) be any minimum s-t cut in the network (U, A, w β ), and let S = X ∩ V . Then, the cost of (X, Y ) is equal to w(V ) + β · f (|S|) − w(S).
Proof. We first show that for any positive integer $s\ (\leq n)$, it holds that
$$\sum_{i=1}^{n}\min\{i,s\}\cdot a_i=f(s). \tag{5}$$
By the definition of $a_k$, we get $\sum_{i=j}^{n}a_i=f(j)-f(j-1)$ for every $j=1,\dots,n$ by telescoping, and hence
$$\sum_{i=1}^{n}\min\{i,s\}\cdot a_i=\sum_{j=1}^{s}\sum_{i=j}^{n}a_i=\sum_{j=1}^{s}\bigl(f(j)-f(j-1)\bigr)=f(s),$$
where the first equality uses $\min\{i,s\}=|\{j\leq s\mid j\leq i\}|$.
We are now ready to prove the lemma. In a minimum $s$-$t$ cut, $p_k\in X$ holds if $|S|>k$, and $p_k\in Y$ holds if $|S|<k$; in either case, the edges incident to $p_k$ contribute $\beta\cdot\min\{|S|,k\}\cdot a_k$ to the cost. Therefore, the cost of the minimum cut $(X,Y)$ is
$$\sum_{v\in V\setminus S}\frac{d(v)}{2}+\sum_{e\in E:\,|e\cap S|=1}\frac{w(e)}{2}+\beta\sum_{k=1}^{n}\min\{|S|,k\}\cdot a_k=w(V)-w(S)+\beta\cdot f(|S|),$$
where the equality follows from equality (5).

Algorithm 4: Flow-based algorithm for unweighted graphs
1 Let $\{\beta_1,\dots,\beta_r\}=\{p/f(q)\mid p=0,1,\dots,m,\ q=2,3,\dots,n\}$ such that $\beta_1<\dots<\beta_r$;
2 $i_{\min}\leftarrow 1$ and $i_{\max}\leftarrow r$;
3 while $i_{\min}<i_{\max}$ do
4   $i\leftarrow\lceil(i_{\min}+i_{\max})/2\rceil$;
5   Compute a minimum $s$-$t$ cut $(X,Y)$ in $(U,A,w_{\beta_i})$;
6   if the cost of $(X,Y)$ is larger than $w(V)$ then $i_{\max}\leftarrow i-1$;
7   else if the cost of $(X,Y)$ is less than $w(V)$ then $i_{\min}\leftarrow i+1$;
8   else $i_{\min}\leftarrow i$ and $i_{\max}\leftarrow i$;
9 Compute a minimum $s$-$t$ cut $(X,Y)$ in $(U,A,w_{\beta_{i_{\min}}})$ and return $S=X\cap V$;
From this lemma, we see that the cost of a minimum $s$-$t$ cut is $w(V)+\min_{S\subseteq V}(\beta\cdot f(|S|)-w(S))$. Therefore, for a given value $\beta\geq 0$, we can determine whether there exists $S\subseteq V$ that satisfies $w(S)/f(|S|)\geq\beta$ by checking whether the cost of a minimum $s$-$t$ cut is at most $w(V)$. Our algorithm applies binary search for $\beta$ within the possible objective values of $f$-DS (i.e., $\{p/f(q)\mid p=0,1,\dots,m,\ q=2,3,\dots,n\}$). For reference, we describe the procedure in Algorithm 4. The minimum $s$-$t$ cut problem can be solved in $O(N^3/\log N)$ time for a network with $N$ vertices [8]. Thus, the running time of our algorithm is $O(\frac{n^3}{\log n}\cdot\log(mn))=O(n^3)$ since $|U|=2n+2$. We summarize the result in the following theorem.

Theorem 5. Algorithm 4 obtains an optimal solution to $f$-DS with concave function $f$ on unweighted graphs in $O(n^3)$ time.

For $f$-DS with concave function $f$ on weighted graphs, the binary search used in Algorithm 4 is not applicable because there may be exponentially many possible objective values in the weighted setting. Alternatively, we present an algorithm that employs another binary search strategy (Algorithm 5): it maintains an interval containing the optimal value, computes a minimum $s$-$t$ cut $(X^{(i)},Y^{(i)})$ in $(U,A,w_{\beta^{(i)}})$ at each iteration $i$, and shrinks the interval according to whether the cost of $(X^{(i)},Y^{(i)})$ is larger than $w(V)$ or not, until the two endpoints are within a factor of $1+\epsilon$ of each other. The total running time of the algorithm is
$$O\!\left(\frac{n^3}{\log n}\cdot\log\frac{\log n}{\log(1+\epsilon)}\right)=O\!\left(\frac{n^3}{\log n}\cdot\log\frac{\log n}{\epsilon}\right),$$
where the equality follows from the fact that $\lim_{\epsilon\rightarrow+0}\frac{\log(1+\epsilon)}{\epsilon}=1$ holds. We thus have the following theorem.

Theorem 6. For $f$-DS with concave function $f$ on weighted graphs, Algorithm 5 obtains a $(1+\epsilon)$-approximate solution in $O\!\left(\frac{n^3}{\log n}\cdot\log\frac{\log n}{\epsilon}\right)$ time.

Greedy Peeling
Finally, we provide an approximation algorithm with much higher scalability. Specifically, we prove that the greedy peeling (Algorithm 2) has an approximation ratio of 3 for $f$-DS with concave function $f$. As mentioned above, the algorithm runs in $O(m+n\log n)$ time for weighted graphs and $O(m+n)$ time for unweighted graphs. We now prove the approximation ratio. Recall that $S_n,\dots,S_1$ are the subsets of vertices produced by the greedy peeling. The analysis uses a fact shown in the context of DalkS [1], which implies that for every $k$, a 3-approximate solution for DalkS with parameter $k$ is contained in $S_n,\dots,S_k$.

Theorem 7. The greedy peeling (Algorithm 2) has an approximation ratio of 3 for $f$-DS with concave function $f$.
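As a sanity check (an illustrative experiment on assumed random instances, not a proof), the greedy peeling can be compared against brute force under a concave $f$, and the observed ratio should never exceed the proven bound of 3:

```python
import random
from itertools import combinations
from math import sqrt

def subset_weight(edges, S):
    S = set(S)
    return sum(wt for (u, v, wt) in edges if u in S and v in S)

def greedy_peeling_value(n, edges, f):
    """Best f-density among the peeling sequence S_n, ..., S_1 (Algorithm 2)."""
    S = set(range(n))
    deg = {v: 0.0 for v in S}
    for u, v, wt in edges:
        deg[u] += wt
        deg[v] += wt
    best = subset_weight(edges, S) / f(len(S))
    while len(S) > 1:
        vmin = min(S, key=deg.get)  # remove a minimum weighted-degree vertex
        S.remove(vmin)
        for u, v, wt in edges:
            if u == vmin and v in S:
                deg[v] -= wt
            elif v == vmin and u in S:
                deg[u] -= wt
        best = max(best, subset_weight(edges, S) / f(len(S)))
    return best

random.seed(0)
worst_ratio = 0.0
for _ in range(20):
    n = 7
    edges = [(u, v, random.uniform(0.5, 2.0))
             for u in range(n) for v in range(u + 1, n) if random.random() < 0.5]
    if not edges:
        continue
    opt = max(subset_weight(edges, c) / sqrt(k)
              for k in range(2, n + 1) for c in combinations(range(n), k))
    worst_ratio = max(worst_ratio, opt / greedy_peeling_value(n, edges, sqrt))
print(worst_ratio)  # stays within [1, 3], consistent with Theorem 7
```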