Model-Agnostic Approximation of Constrained Forest Problems
Abstract
Constrained Forest Problems (CFPs) as introduced by Goemans and Williamson in capture a wide range of network design problems with edge subsets as solutions, such as Minimum Spanning Tree, Steiner Forest, and Point-to-Point Connection. While individual CFPs have been studied extensively in individual computational models, a unified approach to solving general CFPs in multiple computational models has been lacking. Against this background, we present the shell-decomposition algorithm, a model-agnostic meta-algorithm that efficiently computes a -approximation to CFPs for a broad class of forest functions. The shell-decomposition algorithm isolates the problem-specific hardness of individual CFPs in a single computational subroutine, breaking the remainder of the computation into fundamental tasks that are studied extensively in a wide range of computational models. In contrast to prior work, our framework is compatible with the use of approximate distances.
To demonstrate the power and flexibility of this result, we instantiate our algorithm for three fundamental, NP-hard CFPs (Steiner Forest, Point-to-Point Connection, and Facility Placement and Connection) in three different computational models (Congest, PRAM, and Multi-Pass Streaming). For constant , we obtain the following -approximations in the Congest model:
-
(1)
For Steiner Forest specified via input components (SF–IC), where each node knows the identifier of one of disjoint subsets of (the input components), we achieve a deterministic -approximation in rounds, where is the hop diameter of the graph, significantly improving over the state of the art.
-
(2)
For Steiner Forest specified via symmetric connection requests (SF–SCR), where connection requests are issued to pairs of nodes , we leverage randomized equality testing to reduce the running time to , succeeding with high probability.
-
(3)
For Point-to-Point Connection, we provide a -approximation in rounds.
-
(4)
For Facility Placement and Connection, a relative of non-metric Uncapacitated Facility Location, we obtain a -approximation in rounds.
We further show how to replace the term by the complexity of solving Partwise Aggregation, achieving (near-)universal optimality in any setting in which a solution to Partwise Aggregation in near-shortcut-quality time is known. Notably, all of our concrete results can be derived with relative ease once our model-agnostic meta-algorithm has been specified. This demonstrates the power of our modularization approach to algorithm design.
Keywords and phrases:
Distributed Graph Algorithms, Model-Agnostic Algorithms, Steiner ForestFunding:
Corinna Coupette: Supported by Digital Futures at KTH Royal Institute of Technology.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Distributed algorithms ; Mathematics of computing Graph algorithms ; Theory of computation Models of computationEditor:
Dariusz R. KowalskiSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
The classic approach to determining the computational complexity of a task consists in deriving upper and lower bounds that are as tight as possible in a particular model of computation. On the upper-bound side, this can result in solutions that are specifically engineered toward the chosen computational model, such that the task at hand needs to be reexamined for every relevant model. Worse still, one might end up with fragile solutions that overcome only the challenges specifically represented by the model under study. At first glance, lower bounds might appear more robust in this regard, since they highlight an obstacle that any algorithm needs to overcome. However, many such lower bounds are shown for very specific network topologies that are rarely observed in practice. This might lead to a false sense of success in having classified the complexity of a given task – although the lower bound merely indicates that the parameters used to capture the computational complexity of the task are insufficient. In brief, traditional upper and lower bounds bear two limitations: model specificity and existential optimality.
A textbook illustration of these limitations is provided by the Steiner Forest problem (SF), which generalizes the well-known Steiner Tree problem (ST). In the classic SF formulation using input components (SF–IC), we are given a weighted graph, along with disjoint subsets of nodes , and the goal is to determine a minimum-weight subgraph spanning each , . This general and fundamental connectivity problem has been studied in depth in the classic centralized model of computation [1, 16, 28, 8, 19]. It is of significant practical importance, both due to direct application, see e.g., [36, Sec. 18.5.5], and being the key building block in variants of Steiner-type problems with broad real-world utility, e.g., Steiner Tree reoptimization [5] and Multicommodity Rent-or-Buy [20]. The state of the art in the Congest model is due to Lenzen and Patt-Shamir [30], who presented modified and adapted variants of a well-known centralized algorithm by Agrawal et al. [1]. One might suspect that algorithmic techniques underlying these tailored solutions could be transferred to other computational models, but this cannot be readily determined from their specialized solution (model specificity). Furthermore, for any fixed topology, there are inputs such that even their fastest algorithm runs for rounds.111We use and to suppress factors of . This is known to be necessary in the worst case (existential optimality): the Minimum Spanning Tree (MST) problem is a special case of the Steiner Forest Problem with and , to which the lower bound by Das Sarma et al. [11] applies. However, the topology of the lower-bound graph is highly contrived, and the technique of using low-congestion shortcuts has been demonstrated to overcome the barrier for MST in many more realistic settings [13, 24, 14]. These findings motivate us to revisit the Steiner Forest problem, along with a much broader class of connectivity problems, in an attempt to overcome the abovementioned limitations.
Beyond Model Specificity and Existential Optimality
To go beyond of model specificity and existential optimality, two paradigms have emerged.
First, numerous works have pushed toward what we term model agnosticism, developing algorithms that can be readily instantiated in many computational models [35, 32, 7, 6, 21]. The individual steps of these algorithms are either core tasks like distance computation, identifying connected components, and sorting, which are well-studied across a wide range of models, or they are easily implemented at low cost in any notable model. Hence, we call these algorithms model-agnostic (meta-)algorithms. Naturally, model-agnostic algorithms are more robust against model variations, and by allowing us to plug in optimized model-specific subroutines for the core computational problems, they sacrifice little performance over model-specific solutions. Thus, model-agnostic algorithms can be viewed as model-agnostic reductions to more basic computational tasks. Moreover, since these basic tasks can be solved by model-specific subroutines, they improve whenever progress is made on the current performance bottleneck in a given computational model.
Second, universal optimality, which was coined in the context of distributed computing, pushes for the design of topology-adaptive algorithms that are asymptotically worst-case optimal on every network topology, i.e., when varying input parameters other than the underlying communication graph [27, 15, 37].222Note that instance optimality, which requires that an algorithm is -competitive with any other always-correct algorithm on each instance, including the best algorithm for the specific instance, is often unachievable [27]. In particular, one can test whether the input matches an abritrary fixed instance in rounds. If so, a matching solution can be output without further computation; otherwise, a fallback algorithm is executed. One might argue that this idea is no different than taking into account more parameters, such as the network diameter, the node connectivity, or any other quantity meaningful for the computational task. However, parametrizing complexity by the input graph is extremely general, subsuming a large number of parameters that one might consider, and capturing any topology-specific obstacle for the task at hand in a given computational model.
Our Contributions in Brief
In this work, we study a general class of connectivity problems called Constrained Forest Problems (CFPs), introduced by Goemans and Williamson [16], through the lenses of model agnosticism and universal optimality. Intuitively, a CFP is specified by a Boolean function that indicates, for each node subset , whether it needs to be connected to its complement, i.e., if the output must contain an edge from to . We require to be proper, i.e., (1) , (2) , and (3) for disjoint implies that . For example, the Steiner Forest problem can be specified by setting if and only if, for given disjoint node subsets , there exists some such that both and are non-empty.
Model-Agnostic Algorithm for CFPs
We devise the shell-decomposition algorithm, a generic approximation algorithm for CFPs that is efficient if can be evaluated efficiently.
Theorem 1 (Model-Agnostic Complexity of Constrained Forest Problems).
Given and a graph with polynomially bounded edge weights , a -approximation to a CFP with proper forest function can (up to book-keeping operations) be obtained at complexity , with the terms in the sum denoting the complexities of solving (1) aSSSP: -approximate Set-Source Shortest-Path Forest, (2) MST: Minimum Spanning Tree, (3) RPS: Root-Path Selection, and (4) FFE: Forest-Function Evaluation for .
By book-keeping operations, we denote simple steps that extract the input for the next subroutine from the output of the previous ones and can be easily implemented at low complexity, where the complexity measure(s) depend on the model at hand (e.g., round complexity in Congest). MST is the special case of SF in which all nodes are to be connected by a lightest possible tree. The task of aSSSP requires us, given and , to compute a forest that preserves distances to up to a factor of , which ie equivalent to computing a -approximate shortest-path tree in the graph obtained by identifying all nodes in . Finally, RPS (a special case of the more general transshipment problem) demands that, given a rooted forest and a set of marked nodes, we select all edges on paths from marked nodes to the roots of their respective trees.
MST, aSSSP, and RPS are well-understood in many models of computation. In contrast, can be (ab)used to force the evaluation of an arbitrarily hard function on the entire topology.333For an input graph with uniform edge weights and any function mapping such a graph to a desired range, choosing an arbitrary node and setting if and only if (i) or , and (ii) results in a proper forest function . Thus, Theorem 1 can be viewed as confining the hardness of the task arising from the choice of to iterations of evaluating for all , where is a set of disjoint connected components. Because for any such , one can enforce this evaluation by assigning weight to edges inside each and a large weight, say , to all other edges, this is the best possible up to an factor.
To illustrate the power and flexibility of our result, we apply our machinery in three models of computation – Congest, Parallel Random-Access Machine (PRAM), and Multi-Pass Streaming (MPS) – to three NP-hard CFPs: (1) Steiner Forest (SF); (2) Point-to-Point Connection (PPC), i.e., given of equal cardinality, finding a lightest set of edges such that each induced component the number of nodes from is the same as the number of nodes from ; and (3) Facility Placement and Connection (FPC), i.e., minimizing the cost of opening facilities at some nodes and connecting a set of clients to them. Footnote˜5 summarizes our results in Congest; for all problems, our PRAM algorithms require work and depth, and our MPS algorithms need passes and memory.
| Problem | LB | Previous Work | Our Work | |||
| Ref. | APX | Complexity | Ref. | APX | Complexity | |
| SF(–IC) | [27] | [30] | ||||
| PPC | [27] | — | ||||
| FPC | [27] | — | ||||
| Problem | Input | ex. LB | APX | Complexity |
|---|---|---|---|---|
| SF–IC | Each node is given its component identifier | |||
| SF–CIC | As in SF–IC, but node knows and (if ) | |||
| SF–CR | Each node is given | |||
| SF–SCR |
; each node knows
|
Approaching Universal Optimality
In our upper bounds for Congest, can largely be replaced by , where is the running time of an algorithm performing Partwise Aggregation (PA) [25, 14] – i.e., given a partition of into connected subsets of nodes, computing the result of an aggregation function separately on each subset and outputting the result at each of its constituent nodes. Due to the respective hardness results [27], this implies that the running times of our solutions to PPC and FPC are universally optimal up to a factor of .
For SF–IC, this is true up to the additive term of . Here, PA is insufficient: Evaluating requires us to determine, for each set , if it is contained in a single connected component induced by the set of edges that have been selected into the current (partial) solution, but the may not induce connected components in . Existential lower bounds demonstrate that this obstacle is unavoidable in general [30]. We suspect that studying Partwise Aggregation with disjoint components that may be internally disconnected – i.e., Disjoint Aggregation () – is key to characterizing the universal complexity of SF–IC, but leave this to future work, noting that an upper bound of arises from standard pipelining techniques.
An orthogonal consideration, however, yields surprising results, as summarized in Table 2. For the SF problem, the input representation drives the problem complexity. It is known that if nodes are given the identifiers of other nodes they must connect to in the output to encode connectivity requirements (SF–CR), an existential lower bound of applies [30], where is the number of terminals, i.e., nodes that need to be connected to some other node. In our corresponding upper bound, the additive then becomes an additive , again resulting in existential but not universal optimality. Interestingly, this picture is turned upside down when inputs are symmetric in the following sense: If knows that it must connect to by the input, then also knows that it must connect to (SF–SCR). In this setting, we can exploit symmetry to efficiently evaluate using randomized equality testing. This leads to an algorithm running in rounds, which is close to universal optimality, provided that an efficient partwise-aggregation routine is available. Similarly, we observe that the bound can be circumvented if nodes receive not only the identifier of their input component , but also the size of this input component (SF–CIC). We then obtain a randomized algorithm running in time.
We note that a simple adaptation of the lower-bound construction from Lenzen and Patt-Shamir [30] shows the same hardness (i.e., for SF–SCR and for SF–CIC, respectively) for deterministic algorithms, based on a communication-complexity reduction from -player equality testing. To the best of our knowledge, this is the first natural example of a provably large gap between the randomized and deterministic complexity in the Congest model for a global problem.
Structure
After introducing our main definitions in Section 2, we provide a technical overview of our results in Section 3. To conclude the main text, we discuss open questions in Section 4. A full specification of our model-agnostic algorithm and a detailed proof of its correctness are given in Appendix A. Model descriptions, model-specific results, and further related work are deferred to the additional appendices available in the extended version of this work [10].
2 Preliminaries
| Symbol | Definition | Meaning | |
| Cardinality of a set | |||
| Set of all subsets (i.e., power set) of | |||
| Set of -element subsets of | |||
| Set of nonnegative integers no larger than | |||
| Set of positive integers no larger than | |||
| Graph with node set and edge set | |||
| Number of nodes | |||
| Number of edges | |||
| Cost of edge | |||
| Cost of edge subset | |||
| s.t. : | -hop path between and | ||
| , | (distinct nodes and distinct edges) | ||
| , | |||
| Hop (= unweighted) distance between and | |||
| Hop diameter of | |||
| Weighted distance between and | |||
| with | Shortest path between and | ||
| Shortest-path diameter of | |||
| Set of edges with exactly one endpoint in | |||
| Terminals | |||
| Number of terminals | |||
| Number of input components in SF–IC |
We begin by defining our basic notation as well as the class of problems we are interested in.
Basic Notation
Our basic notation is summarized in Table 3.
For a set , we denote its cardinality by , its power set by , and the set of its -element subsets by . We extend functions to subsets in the natural way by setting , and we write the sets of positive and nonnegative integers no greater than as resp. .
We consider weighted graphs with nodes, edges, and edge weights (edge costs) polynomially bounded in .666Assuming polynomially bounded edge weights allows us to encode polynomial sums of edge weights with bits, which means that we can encode edge weights in a single message (Congest) or memory word (PRAM and MPS). Zero-weight edges arise naturally when simulating contractions in the distributed setting. We can handle them by scaling all non-zero edge weights by (where w.l.o.g., is polynomially bounded as well), and assigning weight to all previously zero-weight edges. Each node is equipped with a unique identifier of bits, which is also used to break ties. An -hop path from to , denoted , is a sequence of distinct edges arising from a sequence of distinct nodes such that for , , and . The (unweighted) hop distance between and is the smallest number of hops needed to go from to , i.e., , and the hop diameter of is . The (weighted) shortest-path distance between and is . A shortest path between and , denoted , is a path from to of length . The shortest-path diameter of is the maximum over all of the minimum number of hops contained in a shortest path from to , i.e., . Given a cut , the set of edges with exactly one endpoint in is denoted as .
An event occurring with high probability (w.h.p.) has probability at least for a freely chosen constant .
Constrained Forest Problems
We are interested in Constrained Forest Problems (CFPs) as introduced by Goemans and Williamson [16].777To simplify the technical exposition, like Goemans and Williamson [16], we disallow zero-weight edges. However, it is straightforward to extend our approach to zero-weight edges by scaling edge weights as discussed in Footnote 6. Given a graph with edge costs and a function , a CFP asks us to solve the integer program stated as Problem 1, whose dual relaxation is provided as Problem 2.
Problem 1 (CFP Primal IP).
| s.t. | |||
Problem 2 (CFP Dual LP).
| s.t. | |||
That is, a CFP is a minimization problem whose optimal solution is a forest888For any cycle and node set , cannot contain exactly one edge from the cycle. Hence, from any solution with a cycle, we can remove an edge. of edges from the input graph that meets the constraints imposed by the forest function . Like Goemans and Williamson [16], we consider CFPs with proper functions , which satisfy (1) zero, i.e., , (2) symmetry, i.e., , and (3) disjointness (also called maximality [17]), i.e., if for two sets and , then implies .999We also assume that is nontrivial, i.e., that there exists at least one such that .
For a CFP with forest function , a node is called a terminal if . Given a forest function , we denote the set of terminals as and the number of terminals as .
Specific Constrained Forest Problems
To demonstrate the flexibility of our approach, we consider Survivable Network Design Problems (SNDPs) that originate from three different real-world challenges.
Definition 2 (Steiner Forest (SF)).
Given a partition of the terminals , find a minimum-cost edge subset connecting the nodes in each input component. The corresponding forest function evaluates to on if and only if there is such that . We distinguish between several input representations:
- SF–CR
-
Terminal is given the identifiers of other terminals as connection requests .
- SF–SCR
-
As SF-CR, but connection requests are symmetric, i.e., .
- SF–IC
-
Terminal is given a unique identifier (of size ) for its input component.
- SF–CIC
-
As SF-IC, but is also given the cardinality of its input component as input.
SF has practical relevance especially in infrastructure development [1], with MST () and ST () as special cases. It is NP-complete [29] and APX-hard [9].
Definition 3 (Point-to-Point Connection (PPC)).
Given a set of sources and a set of targets , find a minimum-cost edge subset such that in each connected component, the number of sources equals the number of targets. That is, for , if and only if .
PPC is motivated by challenges from circuit switching and VLSI design, and the problem is NP-complete [31].
Our last problem is Facility Placement and Connection (FPC), an NP-complete facility-location-type problem arising, e.g., in operations research. Intuitively, FPC can be stated as follows.
Definition 4 (FPC [intuitive]).
Given for each node an opening cost and indication whether it is in the set of clients , identify a subset of facilities to open and an edge set such that each client is connected to a facility by , minimizing .
To turn this task into a CFP matching our framework, we add one additional node and, for each , an edge of weight . The task then becomes to determine a (low-weight) Steiner Tree spanning , i.e., the special case of SF with .
Definition 5 (FPC [rephrased]).
Given, for each node , an opening cost and an indication whether it is in the set of clients , solve ST on , with edge costs of for and for , where the terminals are .
Even in Congest, we can solve ordinary ST instances efficiently, regardless of the specific input representation. However, the virtual node and its incident edges need to be simulated in the chosen model of computation. This is trivial in PRAM (simply modify the input representation in parallel) and Multi-Pass Streaming (since we use bits of memory anyway) but nontrivial in Congest. For Congest, we show that we can simulate the virtual node efficiently using Partwise Aggregation.
Partwise Aggregation and Shortcut Quality
Finally, we introduce a subroutine that we use as a black box to achieve near-universal optimality in cases where efficient solutions are known.
Definition 6 (Partwise Aggregation [13]).
For disjoint node sets , suppose that induces a connected subgraph. In the Partwise Aggregation (PA) problem, each is given a unique identifier for (of size ) and a second -bit value . For a specified associative and commutative operator , for each and each , needs to compute its output .
Definition 7 (Shortcut Quality).
The shortcut quality of , denoted by , is the maximum over all feasible operators and partitions of of the minimum number of rounds in which a Congest algorithm with knowledge of the full topology can solve Partwise Aggregation. Put differently, the algorithm may preprocess the graph and the operator, but must then compute the output within rounds after the nodes have been given their inputs to the PA instance.
Haeupler et al. [27] show that is a lower bound for MST (i.e., SF and ST with and ) and shortest - path (the special case of PPC with ) – regardless of the approximation ratio and also for randomized Las Vegas algorithms, i.e., those that guarantee a feasible output. They further prove that PA can be solved in rounds in the Supported Congest model, which is Congest with the unweighted graph topology given as part of the input.
3 Technical Overview
In this section, we outline our techniques and discuss the technical challenges to be overcome. We also use this opportunity to discuss the most relevant related work in context.
Model-Agnostic Algorithm for Proper Constrained Forest Problems
Our shell-decomposition algorithm, depicted in Figure 1, is based on the primal-dual formulation for general CFPs given by Goemans and Williamson [16] (GW algorithm), restated as Algorithm 1, which yields a -approximation [16] for CFPs with proper forest functions. Our algorithm can also be seen as a model-agnostic generalization of the algorithm by Agrawal et al. [1], ported to the distributed setting by Lenzen and Patt-Shamir [30]. These algorithms have also been called moat-growing algorithms [18, 3, 30].
Intuitively, our algorithm operates as follows. We maintain connected components , initialized to the singletons . A component is active if . The algorithm concurrently grows balls around all active components, with respect to the metric induced by the then-current edge costs . In the growth process, two balls can touch only when at least one of their associated components is active, and only when the balls of two active components , touch, adding edges to merge the components can make the resulting component inactive – otherwise, i.e., assuming , symmetry implies , disjointness implies , and using symmetry again we get , a contradiction.
As illustrated in Figure 2, until the balls around two components touch, they are disjoint, witnessing that the dual problem has a solution with weight larger than the product of the current radius times the number of currently active components. Accordingly, when merging active components, we can afford to connect the terminals by adding a shortest path between them to the primal solution, paying a cost of at most twice the radius at merge. Because each merge we perform reduces the number of active components by at least one, the ball growth always witnesses sufficient additional weight in a dual solution to pay for future merges up to an approximation factor of . Upon termination, i.e., when all components are inactive, the set of edges we selected constitutes a feasible solution.
The intuition sketched above already suggests that the ball-growing process allows for substantial concurrency. To achieve high efficiency across the board in various models, our shell-decomposition algorithm performs several modifications to the centralized GW algorithm, which originally assumes a final filtering step to eliminate unneeded edges from the solution, as well as exact distances and immediate forest-function evaluation.
-
(A)
Incremental Solution-Set Construction. We merge active components that touch regardless of whether this ultimately turns out to be necessary to satisfy connectivity requirements. This does not impact the approximation guarantee, which was implicit already in the contribution by Agrawal et al. [1] and is now made explicit in our reformulation of the GW framework [16]. As a result, we need to determine if imposes further connectivity requirements only as often as we iterate through the loop in Figure 1.
-
(B)
Approximate Distance Computations. At the cost of a factor of in the approximation ratio, we replace exact distance computations with -approximate distance computations. The challenge here is to ensure that we do not violate the dual constraints when charging dual variables (corresponding to cuts) based on the progress of the primal solution. This can be achieved by constructing the dual solution in the true metric space, rather than reusing the approximate distances leveraged by the primal solution. In contrast to the other modifications, this requires a comparatively involved argument, and it is the main technical novelty and contribution in this part of our work.
-
(C)
Deferred Forest-Function Evaluation. Again at the cost of a factor of in the approximation ratio, we let components grow to radii that are integer powers of before reevaluating our forest function to update their activity status. This technique was introduced by Lenzen and Patt-Shamir [30] for the Steiner Forest problem specifically; we show its general correctness in the context of the GW algorithm. Using this technique, we can limit the number of loop iterations in Figure 1 to (assuming polynomially bounded edge weights).
The pseudocode of the resulting model-agnostic shell-decomposition algorithm is given as Algorithm 2. Here, to increase clarity and highlight the primal-dual nature of our algorithm, we also compute and output the lower bound LB on the cost of an optimal solution, which is not required to determine the output forest . An example execution of Algorithm 2 is depicted in Figure 3. In Appendix A, we prove that the algorithm maintains an approximation ratio of .
Leveraging the modifications specified in Items B, C, and A, to derive concrete algorithms in specific models of computation, what remains is to implement the individual steps in Figure 1. Up to simple book-keeping operations, this entails four main tasks (colored boxes in Figure 1):
- (1)
- (2)
-
(3)
Root-Path Selection (RPS). Here, we are given a forest rooted at sources (where each node knows its parent in its tree and its closest source), along with a number of marked nodes. Our goal is to select the edges on the path from each marked node to its closest source. This problem can be straightforwardly addressed by solving a much more general flow problem: (Approximate) Transshipment, in which flow demands are to be satisfied at minimum cost. Again, we can plug in state-of-the-art algorithms for each model of interest [4, 35]. Note that in our case, the solution is restricted to containing the edges of a predetermined forest, such that the solution is unique, edge weights play no role, and the challenge becomes to determine the feasible flow quickly.
-
(4)
Forest-Function Evaluation (FFE). The remaining subtask is to assess if , for each component in a set of disjoint, internally connected components – i.e., to determine which of the components still need to be connected to others. This is the only step that depends on , requiring an implementation of matching the given model of computation.
Note that is an arbitrary proper function, so evaluating it can be arbitrarily hard. Thus, our algorithm confines the hardness of the task that comes from the choice of to evaluations of for disjoint connected components (cf. Item A). In contrast, the other three subtasks can be solved by state-of-the-art algorithms from the literature in a black-box fashion.
To illustrate the power of our result, we apply our machinery in three models of computation – Congest, Parallel Random-Access Machine (PRAM), and Multi-Pass Streaming (MPS) – to three Constrained Forest Problems. Our results follow from Theorem 1 with (1) the referenced results on aSSSP, MST, and RPS, (2) model- and problem-specific subroutines for FFE, and (3) model-specific subroutines for book-keeping operations.
In Footnote˜5, we highlight the improvements over the state of the art achieved for our example problems by instantiating the shell-decomposition algorithm in the Congest model.
-
(I)
Steiner Forest (SF). From time101010Recall that denotes the number of nodes, the number of input components, the number of terminals, the unweighted (hop) diameter, and the shortest-path diameter. Both and can be up to , regardless of or . for a -approximation and time for an -approximation to SF–IC obtained by Lenzen and Patt-Shamir [30] to -approximations in time (1) for SF–IC, (2) for SF–SCR, and (3) for SF–CIC. For SF–CR, we incur the same additive running-time overhead of as Lenzen and Patt-Shamir [30], matching the existential lower bound they showed.
-
(II)
Point-to-Point Connection (PPC). We are not aware of prior work providing Congest algorithms for the PPC problem. Here, we obtain a -approximation in time.
-
(III)
Facility Placement and Connection (FPC). We do not know of any existing Congest algorithms for the FPC problem, and we realize a -approximation in time.
We also derive algorithms for SF, PPC, and FPC in the PRAM and Multi-Pass Streaming models, taking work and depth in the PRAM model, as well as passes and space in the Multi-Pass Streaming model. To the best of our knowledge, these algorithms are either the first ones to perform these tasks in their respective models or they cover more general classes of instances than the state of the art (see extended version). Notably, we obtain our diverse results with relative ease once the analysis of our model-agnostic shell-decomposition algorithm is in place – which contrasts with the challenges of directly designing specific algorithms for specific problems in specific models.
Taking Shortcuts
In the Congest model, the term in the complexity is due to the fact that, in general, it is not possible to solve Partwise Aggregation (PA) in rounds. As formalized in Definition 6, PA denotes the task of performing an aggregation and broadcast operation on each subset in a partition of that induces connected components [14]. We can leverage PA, inter alia, to determine a leader and distribute its ID, or to find and make known a heaviest outgoing edge, for each component (a.k.a. part) in parallel. PA-type operations are key to efficient MST construction, and any -round solution to Partwise Aggregation lets us solve MST in rounds [13, 14].
As MST computation is a CFP, it might not surprise that Partwise Aggregation can serve as a key subroutine for other CFPs in the Congest model as well. We show that the term in the complexity of CFPs can be replaced by . Since this is already known for -approximate Set-Source Shortest-Path Forest [35], Minimum Spanning Tree [13], and -approximate Transshipment [35] (which can be used to solve Root-Path Selection),111111The results for approximate Set-Source Shortest-Path Forest and approximate Transshipment are conditional on the existence of fast -cycle-cover algorithms; otherwise, we incur an overhead (see Footnote 5). again we need to show this for Forest-Function Evaluation only. This is straightforward for PPC and FPC, and simple algorithms evaluate the forest function for SF-IC and SF-CR in and rounds, respectively.
The substantial literature on low-congestion shortcuts provides a large array of results on solving Partwise Aggregation in time comparable to the shortcut quality of the input graph, i.e., the best possible running time for Partwise Aggregation [37, 35, 27, 26, 15, 14, 22, 23, 2]. In particular, if the input graph does not contain a fixed -dense minor, without precomputations or further knowledge of [14]. Examples of graphs without -dense minors are planar graphs and, more generally, graphs of bounded genus. Moreover, in Supported Congest (where is known – or rather, can be preprocessed), , i.e., within a polylogarithmic factor of the optimum. Due to the modular structure of our results, any future results on Partwise Aggregation and low-congestion shortcuts will automatically improve the state of the art for CFPs in the Congest model.
The Quest for Universal Optimality
In the Congest model, it is known that even on a fixed network topology, rounds are required to obtain any non-trivial approximation for a large class of problems. This class includes MST, a special case of ST, which reduces to FPC by making all but the opening cost of a single node very high, and shortest --path [27], a special case of PPC. Thus, this universal lower bound applies to our example tasks as well. In particular, our results on PPC and FPC have almost universally tight running times.
In contrast, the additive terms of and in the running times of our algorithms for SF-IC and SF-CR, respectively, are only existentially optimal [30]: The lower-bound graph – a double star – has shortcut quality , but in a fully connected graph, it is trivial to evaluate for all current components in two rounds. This motivates us to split Forest-Function Evaluation for Steiner Forest into two parts: (1) determining, for each input component or connection request , respectively, whether the specified connectivity requirements are met, and (2) determining, for each current component , whether it contains a terminal whose connectivity requirements are not met. While the second part can be solved via standard Partwise Aggregation, the first part requires aggregating information within (or ) disjoint components that may be internally disconnected. We call this task Disjoint Aggregation on parts, , achieve the trivial bound via pipelining over a BFS tree, and hypothesize that the universal complexity of merits further investigation in future work.
As an orthogonal approach, we explore the effect of the input specification on our SF results. The assumptions that (1) pairs of terminals know that they need to be connected (SF–SCR), or (2) the size of each input component is known to its constituent terminals (SF–CIC) both are plausible, and they change the basis of the applicable existential lower bounds from -party set disjointness [30] to -party equality, as depicted in Figure 4. We provide the details in the extended version.
Assuming SF–SCR, we show how to achieve a running time of using efficient randomized -party equality testing. We sketch a solution assuming shared randomness here; standard techniques achieve the same without shared randomness. For each terminal-request pair , we flip a fair independent coin and denote the result by . Now each component aggregates . Observe that if for request pair , then for SF–SCR, it holds that and . Hence, if contains, for each request pair, either both terminals or none of the terminals, then the sum is guaranteed to be modulo . Otherwise, fix a request pair with and . After evaluating the sum up to coin , it is either or , and hence, by independence of , with probability , the sum is modulo . Therefore, performing the process times in parallel, we can distinguish between and with high probability. This computation can be performed by a single aggregation, where the aggregation operator is given by bit-wise addition modulo .
Our strategy for SF–CIC is similar, but results in a much weaker bound of ; our main point here is to demonstrate that the can be beaten. We distinguish three cases. (1) Components of size at most are spanned by a rooted tree of size . Here, we can aggregate the terminal counts, for all input components, within at ’s root in rounds to determine activity status. (2) For each input component of size at least , we globally aggregate if there are two distinct component IDs with a terminal from that input component. This requires one aggregation for each such input component, all of which can be completed within rounds via a BFS tree, as there can be at most input components of this size. (3) For input components of size , each component of size larger than uses the same strategy as for SF–SCR. However, only input components of the same size can be handled in a single aggregation, as the summation is now modulo . Hence, aggregations by at most components of size larger than are required, which can again be performed in rounds.
4 Discussion
In this work, we presented a general model-agnostic framework for the -approximation of Constrained Forest Problems (CFPs) and demonstrated its utility on three NP-hard CFPs in three models of computation. We conclude with a number of open questions – beyond applying our framework to other graph problems and computational models – in increasing order of generality:
-
(Q1)
Can the running time of SF–CIC in Congest be improved to nearly ?
-
(Q2)
Many of our results are near-universally optimal in Congest, even for Las Vegas algorithms – i.e., randomized with guaranteed success – but our algorithms for SF–SCR and SF–CIC are Monte Carlo. Due to existential lower bounds based on the communication complexity of -party equality, this is required to (always) achieve small running times w.h.p.
Is also a lower bound for Las Vegas Congest algorithms? -
(Q3)
How hard is Disjoint Aggregation, i.e., how can we characterize ?
-
(Q4)
The FPC problem minimizes the sum of opening and forest costs, disregarding the (often distance-based) connection costs considered in other facility-location-type problems.
Does our approach to FPC generalize to problems with connection costs? -
(Q5)
As we reduce most of our tasks to few PA instances, in Congest, rounds are both necessary and sufficient to achieve near-universal optimality. Since PA can be solved in work and depth in PRAM, PA-based algorithms also yield good solutions in PRAM. However, in the streaming model, while PA can be solved in memory and two passes, it is unclear if this yields optimal results (unless we insist that the output needs to be held in memory), as we are mostly limited to existential lower bounds.
Are there -memory streaming algorithms with passes? Better yet: Is there an analog to universal optimality in the MPS model?
Note that seems inadequate as a parameter here, as each part will require some memory for a few-pass implementation, but there may be parts. -
(Q6)
Does our approach generalize to CFPs on hypergraphs?
-
(Q7)
Beyond proper functions, the primal-dual method has proven useful for uncrossing functions [17]. One of the main features of optimization problems with uncrossing functions is that they are guaranteed to feature an optimal dual solution that is laminar.
Can our approach be extended to uncrossing or other non-proper functions?
References
- [1] Ajit Agrawal, Philip N. Klein, and R. Ravi. When trees collide: An approximation algorithm for the generalized steiner problem on networks. SIAM J. Comput., 24(3):440–456, 1995. doi:10.1137/S0097539792236237.
- [2] Ioannis Anagnostides, Christoph Lenzen, Bernhard Haeupler, Goran Zuzic, and Themis Gouleakis. Almost universally optimal distributed laplacian solvers via low-congestion shortcuts. Distributed Comput., 36(4):475–499, 2023. doi:10.1007/s00446-023-00454-0.
- [3] Aaron Archer, MohammadHossein Bateni, MohammadTaghi Hajiaghayi, and Howard J. Karloff. Improved approximation algorithms for prize-collecting steiner tree and TSP. SIAM J. Comput., 40(2):309–332, 2011. doi:10.1137/090771429.
- [4] Ruben Becker, Sebastian Forster, Andreas Karrenbauer, and Christoph Lenzen. Near-optimal approximate shortest paths and transshipment in distributed and streaming models. SIAM J. Comput., 50(3):815–856, May 2021. doi:10.1137/19M1286955.
- [5] Davide Bilò. New algorithms for steiner tree reoptimization. Algorithmica, 86(8):2652–2675, 2024. doi:10.1007/S00453-024-01243-2.
- [6] Joakim Blikstad, Jan van den Brand, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai. Nearly optimal communication and query complexity of bipartite matching. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 - November 3, 2022, pages 1174–1185. IEEE, 2022. doi:10.1109/FOCS54457.2022.00113.
- [7] Yi-Jun Chang, Manuela Fischer, Mohsen Ghaffari, Jara Uitto, and Yufan Zheng. The complexity of (+1) coloring in congested clique, massively parallel computation, and centralized local computation. In Peter Robinson and Faith Ellen, editors, Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Toronto, ON, Canada, July 29 - August 2, 2019, pages 471–480. ACM, 2019. doi:10.1145/3293611.3331607.
- [8] Chandra Chekuri and F. Bruce Shepherd. Approximate integer decompositions for undirected network design problems. SIAM J. Discret. Math., 23(1):163–177, 2008. doi:10.1137/040617339.
- [9] Miroslav Chlebík and Janka Chlebíková. The steiner tree problem on graphs: Inapproximability results. Theor. Comput. Sci., 406(3):207–214, 2008. doi:10.1016/j.tcs.2008.06.046.
- [10] Corinna Coupette, Alipasha Montaseri, and Christoph Lenzen. Model-agnostic approximation of constrained forest problems, 2024. doi:10.48550/arXiv.2407.14536.
- [11] Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, and Roger Wattenhofer. Distributed verification and hardness of distributed approximation. SIAM J. Comput., 41(5):1235–1265, 2012. doi:10.1137/11085178X.
- [12] Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang. On graph problems in a semi-streaming model. Theor. Comput. Sci., 348(2-3):207–216, 2005. doi:10.1016/j.tcs.2005.09.013.
- [13] Mohsen Ghaffari and Bernhard Haeupler. Distributed algorithms for planar networks II: low-congestion shortcuts, mst, and min-cut. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 202–219. SIAM, 2016. doi:10.1137/1.9781611974331.ch16.
- [14] Mohsen Ghaffari and Bernhard Haeupler. Low-congestion shortcuts for graphs excluding dense minors. In Avery Miller, Keren Censor-Hillel, and Janne H. Korhonen, editors, PODC ’21: ACM Symposium on Principles of Distributed Computing, Virtual Event, Italy, July 26-30, 2021, pages 213–221. ACM, 2021. doi:10.1145/3465084.3467935.
- [15] Mohsen Ghaffari and Goran Zuzic. Universally-optimal distributed exact min-cut. In Alessia Milani and Philipp Woelfel, editors, PODC ’22: ACM Symposium on Principles of Distributed Computing, Salerno, Italy, July 25 - 29, 2022, pages 281–291. ACM, 2022. doi:10.1145/3519270.3538429.
- [16] Michel X. Goemans and David P. Williamson. A general approximation technique for constrained forest problems. SIAM J. Comput., 24(2):296–317, 1995. doi:10.1137/S0097539793242618.
- [17] Michel X Goemans and David P Williamson. The primal-dual method for approximation algorithms and its application to network design problems. In Approximation Algorithms for NP-Hard Problems, pages 144–191. PWS Publishing Co., 1996.
- [18] Anupam Gupta and Amit Kumar. A constant-factor approximation for stochastic steiner forest. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 659–668. ACM, 2009. doi:10.1145/1536414.1536504.
- [19] Anupam Gupta and Amit Kumar. Greedy algorithms for steiner forest. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 871–878. ACM, 2015. doi:10.1145/2746539.2746590.
- [20] Anupam Gupta, Amit Kumar, Martin Pál, and Tim Roughgarden. Approximation via cost-sharing: A simple approximation algorithm for the multicommodity rent-or-buy problem. In 44th Symposium on Foundations of Computer Science, FOCS 2003, Cambridge, MA, USA, October 11-14, 2003, Proceedings, pages 606–615. IEEE Computer Society, 2003. doi:10.1109/SFCS.2003.1238233.
- [21] Bernhard Haeupler, D. Ellis Hershkowitz, and Thatchaphol Saranurak. Maximum length-constrained flows and disjoint paths: Distributed, deterministic, and fast. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1371–1383. ACM, 2023. doi:10.1145/3564246.3585202.
- [22] Bernhard Haeupler, Taisuke Izumi, and Goran Zuzic. Low-congestion shortcuts without embedding. In George Giakkoupis, editor, Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing, PODC 2016, Chicago, IL, USA, July 25-28, 2016, pages 451–460. ACM, 2016. doi:10.1145/2933057.2933112.
- [23] Bernhard Haeupler, Taisuke Izumi, and Goran Zuzic. Near-optimal low-congestion shortcuts on bounded parameter graphs. In Cyril Gavoille and David Ilcinkas, editors, Distributed Computing - 30th International Symposium, DISC 2016, Paris, France, September 27-29, 2016. Proceedings, volume 9888 of Lecture Notes in Computer Science, pages 158–172. Springer, 2016. doi:10.1007/978-3-662-53426-7_12.
- [24] Bernhard Haeupler, Taisuke Izumi, and Goran Zuzic. Low-congestion shortcuts without embedding. Distributed Comput., 34(1):79–90, 2021. doi:10.1007/s00446-020-00383-2.
- [25] Bernhard Haeupler and Jason Li. Faster distributed shortest path approximations via shortcuts. In Ulrich Schmid and Josef Widder, editors, 32nd International Symposium on Distributed Computing, DISC 2018, New Orleans, LA, USA, October 15-19, 2018, volume 121 of LIPIcs, pages 33:1–33:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.DISC.2018.33.
- [26] Bernhard Haeupler, Harald Räcke, and Mohsen Ghaffari. Hop-constrained expander decompositions, oblivious routing, and distributed universal optimality. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 1325–1338. ACM, 2022. doi:10.1145/3519935.3520026.
- [27] Bernhard Haeupler, David Wajc, and Goran Zuzic. Universally-optimal distributed algorithms for known topologies. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1166–1179. ACM, 2021. doi:10.1145/3406325.3451081.
- [28] Kamal Jain and Vijay V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and lagrangian relaxation. J. ACM, 48(2):274–296, 2001. doi:10.1145/375827.375845.
- [29] Richard M. Karp. Reducibility among combinatorial problems. In Raymond E. Miller and James W. Thatcher, editors, Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972. doi:10.1007/978-1-4684-2001-2_9.
- [30] Christoph Lenzen and Boaz Patt-Shamir. Improved distributed Steiner forest construction. In Magnús M. Halldórsson and Shlomi Dolev, editors, ACM Symposium on Principles of Distributed Computing, PODC ’14, Paris, France, July 15-18, 2014, pages 262–271. ACM, 2014. doi:10.1145/2611462.2611464.
- [31] Chung-Lun Li, S. Thomas McCormick, and David Simchi-Levi. The point-to-point delivery and connection problems: complexity and algorithms. Discret. Appl. Math., 36(3):267–292, 1992. doi:10.1016/0166-218X(92)90258-C.
- [32] Sagnik Mukhopadhyay and Danupon Nanongkai. Weighted min-cut: Sequential, cut-query, and streaming algorithms. In Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy, editors, Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 496–509. ACM, 2020. doi:10.1145/3357713.3384334.
- [33] Seth Pettie and Vijaya Ramachandran. An optimal minimum spanning tree algorithm. J. ACM, 49(1):16–34, 2002. doi:10.1145/505241.505243.
- [34] Seth Pettie and Vijaya Ramachandran. A randomized time-work optimal parallel algorithm for finding a minimum spanning forest. SIAM J. Comput., 31(6):1879–1895, 2002. doi:10.1137/S0097539700371065.
- [35] Václav Rozhon, Christoph Grunau, Bernhard Haeupler, Goran Zuzic, and Jason Li. Undirected (1+)-shortest paths via minor-aggregates: Near-optimal deterministic parallel and distributed algorithms. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 478–487. ACM, 2022. doi:10.1145/3519935.3520074.
- [36] Stefan Voß. Steiner tree problems in telecommunications. In Mauricio G. C. Resende and Panos M. Pardalos, editors, Handbook of Optimization in Telecommunications, pages 459–492. Springer, 2006. doi:10.1007/978-0-387-30165-5_18.
- [37] Goran Zuzic, Gramoz Goranci, Mingquan Ye, Bernhard Haeupler, and Xiaorui Sun. Universally-optimal distributed shortest paths and transshipment via graph-based -oblivious routing. In Joseph (Seffi) Naor and Niv Buchbinder, editors, Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9 - 12, 2022, pages 2549–2579. SIAM, 2022. doi:10.1137/1.9781611977073.100.
Appendix A Model-Agnostic Algorithm
A.1 Modified Centralized -Approximation
Theorem 8.
For a graph , edge costs , and proper function , the modified GW algorithm with incremental solution-set construction, -approximate distance computations, and -deferred forest-function evaluation (Algorithm 2) yields a -approximation to the optimal solution, i.e., a -approximation for .
Denote by the iterations of the while loop in Algorithm 2, calling each iteration a merge phase, and for each phase by (i) the radius at the beginning of phase (with ), (ii) , , , , , and the values of the respective variables at the end of phase , and (iii) the number of active components at the end of the phase (with ). We prepare our proof in five lemmas.
Lemma 9.
For all s.t. Algorithm 2 does not terminate at the end of phase , we have (i) , (ii) for each , (iii) is spanned by , (iv) , (v) , (vi) , (vii) and (viii) is connected and the weighted diameter w.r.t. reduced costs is decreasing.
Proof.
The first and the second statement hold by construction due to Lines 21 and 12, respectively. For the third statement, observe that connects each node in to some node in , and the set includes all edges that are both contained in (i.e., whose reduced cost has become ) and connect different connectivity components with respect to of . Thus, the choice of ensures that is spanned by . This readily implies the forth statement, since , as well as the fifth, since all other edges of reduced cost are removed to obtain , forcing any approximate SSSP solution to reselect the edges from into . The sixth statement holds because the algorithm only adds edges to , and the seventh statement holds by induction, using that for some set . Finally, for the eighth statement, observe that since is spanned by , whose reduced cost is , and any edges not fully contained in remain in , the shortest path between any pair of nodes w.r.t. only becomes shorter: For any edge contained in , there is now a path of reduced weight between its endpoints, while any edge with reduced cost is still present and retains at most its original cost .
Lemma 10.
For , Algorithm 2 terminates in iterations of its loop.
Proof.
Since each of the operations within the loop terminate (assuming correct subroutines), it suffices to show the claimed bound on the number of loop iterations. To exit the while-loop, we must have for each , where is the set of connected components induced by our current edge set . Recall that edge weights and hence the weighted diameter of the graph are polynomially bounded in . As , there is an for which exceeds the weighted diameter of times . By Lemma 9 (viii), this entails that if we reach this phase, then contains the entire connected graph . By Lemma 9 (iii), (and thus ) is spanned by . The merge operation in Line 16 hence ensures that all terminals in are connected by . Denote by the connectivity component of containing the nodes in . For each terminal , lies in a connected component of satisfying that . This entails that we can partition into two types of sets: (i) components of satisfying that , and (ii) non-terminals , which satisfy by definition. By disjointness of the proper forest function , it follows that , and by symmetry it follows that . Finally, by Lemma 9 (vi), we have . Therefore, each connectivity component of other than decomposes into connectivity components of , satisfying that , and non-terminals, which again by disjointness implies that . Thus, Algorithm 2 terminates by the end of phase .
Lemma 11.
The edge set output by Algorithm 2 is primal feasible.
Proof.
When Algorithm 2 terminates, we must have exited its while-loop, implying that we have for each , where is the set of connected components induced by our output set . Consider any set . If non-trivially intersects a component , i.e., , then there is an edge of in the cut defined by , i.e., . Otherwise, for some . As satisfies disjointness and for each , it follows that in this case, too. Thus, is primal feasible.
Lemma 12.
The value output by Algorithm 2 is the cost of a feasible dual solution.
Proof.
Recall that in phase , and denote by the index of the final phase Abbreviating , we can write the value as
Denote the cost reduction of an edge achieved in phase as , with , and the amount that is increased in phase for some set as . Now observe that to construct a feasible dual solution of value , it suffices to, in each phase , for each component surviving phase , increase the dual variables associated with the subsets in sum by , while ensuring that for each , we maintain .
Since is proper, we know that for each component surviving phase , there exists at least one component active in phase such that . Therefore, to allocate to dual variables associated with , we can trace the construction of from the set of components is a connected component of (where ) as follows. Starting with as it resulted from phase , we increase the radii of all components at rate and the -variables of at rate . Thus, we gradually construct and reduce the costs of all edges in the affected cuts (i.e., increase ) until the first edge achieves (or the phase ends). This happens at the latest when the first dual constraint in phase becomes tight – it can happen earlier as the edge cost reductions associated with our radius increases are based on -approximate distances. When an edge achieves , we update and to ensure that both endpoints of lie in the same component, and we iterate the process described above.
Since survived phase , this process does not add an edge to that lies in the cut until we have increased the radius by . We claim that at no point in this process, becomes empty. Assuming otherwise, we would have that with at the respective point of the process, implying the contradiction that by disjointness of . Accordingly, the total increase of -variables that we attribute to during phase is at least , as desired. Moreover, our direct coupling of the -variable increases with edge-cost reductions further ensures that the -variables relevant for any individual edge increase by at most , i.e., its cost reduction in phase , guaranteeing feasibility.
Summing over the components surviving phase , we increase the dual variables by at least per merge phase, concluding the proof.
Lemma 13.
Denoting by the phase after which Algorithm 2 terminates, it holds that
Proof.
Recall that merge phase is the phase in which , and is the number of active components surviving phase . Moreover, note that all edges added to satisfy that on termination, and that the cost reduction for each edge in phase is at most , as each endpoint of the edge is contained in at most one tree of . Hence, the cost of each edge in can be amortized over the phases in which its cost is reduced by the algorithm, i.e., which it starts with , and in which it intersects . Accordingly, decomposes into a set of nested shells (with ), which the algorithm iteratively and implicitly constructs around active components. This shell-decomposition argument is illustrated in Figure 2.
Since in phase , the remaining edges (where ) to be added to form a forest with active components as nodes, and the average degree of such a forest is at most , we can bound the cost of the forest output by Algorithm 2 from above as
where .
Since is a feasible primal solution by Lemma 11, each terminal must be incident with at least one edge from ,
and the minimum edge weight is at least , we have that .
On the other hand, , yielding that
.
Using the above lemmas, we can now prove Theorem 8.
Proof of Theorem 8.
Assume that Algorithm 2 terminates at the end of phase , i.e., ; by Lemma 10, such a phase exists. Observe that the value output by Algorithm 2 then satisfies By Lemma 11, is a feasible primal solution. As is the cost of a feasible dual solution by Lemma 12, using Lemma 13, we obtain the desired approximation guarantee as
A.2 Specification of our Model-Agnostic Meta-Algorithm
The task of -approximating CFPs in any specific model reduces to simulating Algorithm 2. By Lemma 10, we can divide the computation into phases, where, starting at phase , phase grows components by radius . Each phase tackles six abstract problems, corresponding to the six building blocks of our algorithm (see Figure 1).
A.2.1 Problems Used as Building Blocks
Problem 3 (-approximate Set-Source Shortest-Path Forest).
Given a connected graph with edge costs and a set of sources , compute a forest spanning such that for all nodes , , where , and is the weighted distance between and in .
Note that in phase , we can confine ourselves to computing an -approximate set-source shortest-path forest up to distance (see Algorithm 2, Lines 9–10).
Problem 4 (Candidate-Merge Identification).
Given a graph , edge costs , and a rooted forest with a subset of its trees marked such that each node in a marked tree knows that its tree is marked as well as the identity of its root , identify all edges that are in distinct marked trees and satisfy .
Problem 5 (Minimum Spanning Tree).
Given a graph with edge costs , compute the Minimum Spanning Tree of .
Problem 6 (Root-Path Selection).
Given a graph with edge costs , a rooted forest, and a set of marked nodes , select the forest edges connecting each marked node to its root.121212Root-Path Selection can be reduced to approximate transshipment as follows: (i) count the number of marked nodes in each tree; (ii) set the demand of each marked node to and the demand of the root of each tree to the number of marked nodes in the tree; (iii) set edge costs to for tree edges and to for all other edges; (iv) solve the approximate transshipment problem for these demands and edge weights; and (v) select all edges with non-zero flow in the output. Note that the only non-trivial step of the reduction is the computation of the demand, which boils down to a single Partwise Aggregation.
Problem 7 (Edge-Cost Reduction).
Given a graph with edge costs , a radius , and an output of Problem 3, compute, for each edge ,
Problem 8 (Forest-Function Evaluation).
Given a graph , a partition of the node set such that each is connected, and a proper forest function , evaluate for each component .
A.2.2 Meta-Algorithm Using the Building Blocks
Initialize , , , , and as in Algorithm 2. Throughout our algorithm, we maintain a set of connected components with activity statuses for each . At the beginning of phase , contains exactly the singleton sets corresponding to all nodes, i.e., , and the active components are the terminals. Each phase of our algorithm (i.e., one loop iteration in Figure 1, simulating one while-loop iteration of Algorithm 2) then consists of the following steps, executed for .
-
(1)
Approximate Set-Source Shortest-Path Forest (aSSSP). Assign as temporary edge weight to the reduced cost if or (where in phase ). Compute a -approximate (-restricted) SSSP forest , using the active terminals as sources, i.e., for each active component, contains the terminal with the minimum identifier. After Step 1, for each node , we know its parent in the truncated SSSP forest, its closest source in the respective shortest-path tree (if any), and its distance to that source.
-
(2)
Edge-Cost Reduction (ECR). Using the approximate -restricted SSSP forest and the distances computed in Step 1, update the edge costs in accordance with Problem 7.131313We can keep the edge costs in by making sure that phases end with integral values of . Note that , or distance computations must be exact. Scale all weights by . Now rounding up to the next integer has marginal impact on the approximation guarantee, as overgrowing by factor plus an additive is not worse than overgrowing by factor .
-
(3)
Candidate-Merge Identification (CMI). Using that nodes’ parents and reduced edge costs are known, identify candidate merges for adjacent trees of the aSSSP forest (Step 1).
-
(4)
Minimum Spanning Tree (MST). Compute a Minimum Spanning Tree of with the following edge weights: (i) for edges in , i.e., the tree edges in the output of Step 1, (ii) for edges in , i.e., those determined in Step 3, and (iii) (or a large value) for all other edges. Mark all selected edges of that are also in the set known from Step 3, i.e., the edges constituting , and add them to (thus excluding all edges with temporary weight greater than ). For each connected component of the forest constituted by the selected edges of temporary weight or that contains a terminal , set as the new identifier of the component to be created from , making it known to all .
-
(5)
Root-Path Selection (RPS). Connect the marked edges identified in Step 4 to the roots (i.e., the node with the same identifier as the component) of the components they connect by adding the necessary edges to .
-
(6)
Forest-Function Evaluation (FFE). Using the new component memberships known from Step 4, update the set and evaluate for each updated . If for all such components, terminate and output – else, continue with the next loop iteration.
A.3 Correctness and Complexity of our Model-Agnostic Meta-Algorithm
Theorem 14 (Model-Agnostic Shell-Decomposition Algorithm).
Proof.
We prove the claim by induction on the phase , going step by step through the algorithm given in Section A.2.2 and arguing why the computed objects, in particular , match those of Algorithm 2. We augment the induction hypothesis by the claim that at the beginning of a phase, in Algorithm 2, contains exactly the edges of non-zero reduced cost and . The induction anchor (phase ) is given by the identical initialization of objects. For the step to phase , observe first that by the induction hypothesis (in particular the additional claim), Step 1 computes the same and the same distances as Algorithm 2, and hence Step 2 yields the same . It follows that Step 3 computes the same set of candidate merges, implying that Step 4 correctly determines and adds it to . Note that the latter step also updates component memberships and component identifiers, but does not yet evaluate whether for the new components. This is finally done in Step 6, such that in Step 1 of the next phase, the correct set will be used. It remains to prove the additional claim that contains exactly the edges of non-zero reduced cost and , which now is immediate from Line 17 and Lemma 9 (vii).
We conclude that both algorithms terminate at the end of the same phase , returning the same forest , which by Theorem 8 is a -approximation.
The proof of our main theorem now follows immediately.
Proof of Theorem 1.
By Theorem 14, our modular shell-decomposition algorithm (Algorithm 2) delivers the desired approximation guarantee. Without loss of generality, we may assume that , as this is enough to enforce that times the cost of an optimal solution is smaller than , i.e., a -approximation is guaranteed. Thus, it is sufficient to instantiate the algorithm with , such that by Lemma 10, the algorithm will terminate after while-loop iterations. In each of these iterations (up to bookkeeping operations), aSSSP, MST, RPS, and FFE computations are performed exactly once, yielding the desired model-agnostic complexity.
