Approximation Algorithms for the Generalized Point-To-Point Problem
Abstract
We consider the Generalized Point-to-Point (GP2P) problem in which we have an edge-weighted graph with (possibly negative) node charges . The goal is to find a minimum-cost set of edges such that each component has nonnegative total charge. Viewing the positive charges as specifying supply and negative charges as demand quantities at various nodes, the problem is equivalent to build the cheapest network so that it is possible to satisfy all demands by routing supplies across the network.
This problem is a significant generalization of other network design problems such as the well-studied Steiner Forest problem. Even the special case of only having one single demand point (having charge and all the other nodes having charge ) is capturing the -Minimum Spanning Tree problem. Earlier work by Hajiaghayi et al. (2016) [10] gave an approximation in pseudo-polynomial time with further improved guarantees if the total supply is not much larger than the total demand, and also a -approximation if the total supply equals the total demand.
Our contributions are four-fold: (a) we show how known -Minimum Spanning Tree approximations can be extended to GP2P approximations while losing only a -factor if the number of demand points in the instance is bounded by a constant, (b) we improve the running time to be Fixed-Parameter Tractable (FPT) in the number of demand points in constant-dimensional Euclidean metrics, (c) we give a 2-approximation in instances where edge costs are all and for each node and show such instances are APX-hard, and (d) we show how the logarithmic approximations in earlier work can be modified to run in truly polynomial time.
Keywords and phrases:
Point-to-Point Network design, Approximation, Steiner Forest, -MSTFunding:
Zachary Friggstad: Supported by NSERC.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Graph algorithms analysis ; Theory of computation Approximation algorithms analysisEditors:
Pat Morin and Eunjin OhSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Consider the following setting in a network: some locations have a finite supply of goods and some locations have a given demand for goods. The goal is, naturally, to route supplies to the demand locations in order to satisfy all demands. For example, in the Minimum-Cost Transportation the goal is to do this in a way that minimizes the total travel cost of all goods. However, it is natural to consider variations of this problem. A network-design version would be to install a minimum-cost set of connections (edges between locations) so that it is possible to find a routing satisfying demands such that each unit of supply only travels along installed connections. That is, we only pay one for each link. Such a setting is captured by the following problem.
Definition 1.
In the Generalized Point to Point Connection problem (GP2P), we are given a tuple where is an undirected graph with edge costs , for all and (possibly negative) node charges , for all . The goal is to find a minimum-cost such that for each connected component of .
A convenient view is that corresponds to a demand point, corresponds to a supply point, and is a location that is a Steiner point. For brevity, throughout this paper when the instance is clear from the context we let . This way, represents the total charge in absolute value but also ensures to count the zero-charge nodes.
It seems the first study of GP2P was actually for a directed variant of the problem and was conducted by Di Gaspero et al.[5] in the course of studying a practical problem related to scheduling shifts. The undirected version itself was formally proposed by Even, Kortsarz, and Slany [6] under the name Infinite Capacity Minimum Edge Cost Flow problem and an -approximation was given. The most recent work on GP2P is by Hajiaghayi et al. [10] where they give a -approximation when and an -approximation in general time that is polynomial in . This is achieved by an adaptation of the primal-dual algorithm of [9] for the classic Steiner Forest problem and its generalization (e.g. when we have a proper function). As pointed out in [6], many special cases of GP2P were already well-studied. For example:
-
Steiner Forest: We are given terminal pairs in an edge-weighted graph and the goal is to buy the cheapest set of edges so each pair has both endpoints in the same component. This is modelled by GP2P by letting and (and for all non-terminals). Goemans and Williamson give a 2-approximation for Steiner Forest [9].
-
-Minimum Spanning Tree (k-MST): We are given a particular root node and an integer . The goal is to find the cheapest tree including and at least other nodes. This is modelled by GP2P by letting and for all other . Note, this instance of GP2P has only a single demand point. The history of approximating k-MST is storied, currently the best approximation is a 2-approximation [8, 4].
-
Knapsack Covering: We are given items with sizes and costs plus a demand value . The goal is to find a minimum-cost set of items with total size at least . This modelled by GP2P by using a star with center and leaf nodes . The cost of is and the charges are and . An FPTAS for Knapsack Covering is known, e.g. by slightly modifying the standard dynamic programming based FPTAS for standard for standard Knapsack.
Notably, all these cases have constant-factor approximations or better. An open problem is to determine if GP2P has a constant-factor approximation even if we allow pseudo-polynomial running time, i.e. running time that is polynomial in ; note for the purposes of getting an -approximation we can assume without loss of generality that the edge costs are bounded by a polynomial in (see Lemma 19 in Appendix A). Even special cases of GP2P that generalize some of the classic problems stated above are open. For instance, for generalization of -MST to the situation where we have more than one root node, say , and the goal is to find a minimum cost subgraph where the connected component of each has at least nodes, it is not known if one can get an -approximation or not. This is one of the cases we study in this paper.
1.1 Our Results
For brevity, we say an instance of GP2P is metric (Metric-GP2P) if is a complete graph and edge costs satisfy the triangle inequality for any distinct . As is usual with single-connectivity network design problems, e.g. Steiner Forest, assuming the graph is metric can be done without loss of generality. The standard proof is found in Lemma 18 in Appendix A. We also say an instance of GP2P is graphical (Graphical-GP2P) if all edge costs are but is not necessarily complete.
Our results need approximation algorithms for slight generalizations of k-MST.
Definition 2.
In k-MST with required nodes (k-MST-R), instead of a single root node we are given a subset . The goal is to find the cheapest tree spanning and at least other nodes in . In weighted k-MST-R (W-k-MST-R), each vertex is given an integer weight and the goal is to find the cheapest tree spanning plus a subset of nodes in with total weight at least .
We note all k-MST approximations we consider in this paper can be readily adapted to the generalization k-MST-R. One can also extend them to W-k-MST-R that would run in pseudo-polynomial time (the simple reduction can be found in Appendix A).
Observation 3.
If there is a polynomial-time -approximation for k-MST-R then there is an -approximation for W-k-MST-R whose running time is polynomial in the number of nodes and the total weight of all nodes.
Let be the number of demand points in a given instance, i.e. . Recall from above that k-MST is a special case of GP2P when there is only one demand point, i.e. . We consider the case of GP2P with . This is akin to generalizing k-MST to multiple roots, each with its own size requirement, but there is one key difference. One might imagine a multiple-root k-MST generalization would want to keep the trees disjoint but in GP2P we can have multiple demand points in a single component as long as the total supply in the component is at least the total demand in the component.
Our main result shows how an -approximation for W-k-MST-R can be used to give a -approximation for GP2P.
Theorem 4.
For any , given an -approximation algorithm for W-k-MST-R, there is a -approximation for GP2P. This algorithm makes calls to . For -dimensional Euclidean metrics the running time of the GP2P approximation can be improved to calls to .
We also show how existing algorithms for k-MST (the -approximation for k-MST in [3] for general metrics and the PTAS for k-MST in constant-dimensional Euclidean metrics [2, 11]) can be suitably adapted to given the same approximation guarantees for W-k-MST-R, which combined with the above theorem imply the following:
Corollary 5.
For any , there is a -approximation for GP2P that runs in time is . For -dimensional Euclidean metrics there is a -approximation for GP2P with running time
Note that the -approximation for W-k-MST-R and for GP2P runs in truly polynomial time for any fixed but the -approximation for the Euclidean metrics is polynomial in , hence only pseudo-polynomial, as it is not clear how to get a PTAS for W-k-MST-R in the Euclidean metrics.
We also mention that this would also extend to doubling metrics using the quasi-polynomial time -approximation for k-MST in metrics with constant doubling dimension by Talwar [12]. The running time would be quasipolynomial in but still FPT in .
Our next collection of results is for Graphical-GP2P. First, we observe that particular restrictions of GP2P remain essentially as hard to approximate as the general problem.
Observation 6.
If there is a polynomial-time -approximation for instances of GP2P with for each , then there is a -approximation for general instances of GP2P with running time being polynomial in .
Observation 7.
If there is a polynomial-time -approximation for instances of Graphical-GP2P with for each , then for any there is a -approximation for general instances of GP2P with running time being polynomial in and .
These simple observations are proven in Appendix B. Our main contribution here is that the intersection of these two restrictions of GP2P is still APX-hard but does at least admit a simple approximation.
Theorem 8.
The restriction of Graphical-GP2P with for each is APX-hard and admits a polynomial-time -approximation.
Finally, we improve the running time of the logarithmic approximations in [10] by making them run in truly polynomial time (i.e. removing the dependence on ).
Theorem 9.
There is an -approximation for GP2P running in polynomial time.
1.2 Notation
For a graph and a subset of edges , we let denote the set of all nodes in that appear as the endpoint of at least one edge in and say that spans . We refer to a component of as a subset of vertices corresponding to a connected component of . For a given instance of GP2P, let be the supply points, be the demand points, and be the Steiner points. Correspondingly, let , and denote their sizes with . We also use the convention for and, similarly, for . Note differs from , the former measures the excess of positive charge in the input and the latter measures the absolute value of all charges while ensuring Steiner points are still counted.
1.3 Organization
2 Approximations for Constant
The algorithms proving Theorem 4 are obtained by reducing to W-k-MST-R. Recall that in W-k-MST-R, we are given a set of required nodes along with (integer) node weights and the goal is to find a minimum cost tree spanning such that the total weight of the nodes in the tree is at least a given integer . The special case of and for all is the traditional k-MST problem. Despite its generality, W-k-MST-R can be approximated as well (or nearly as well) as the simpler k-MST problem by adapting the known algorithms.
Theorem 10.
The k-MST algorithm described [3] can be adapted to give a polynomial-time -approximation for W-k-MST-R for any constant . Similarly, the PTAS for k-MST in -dimensional Euclidean metrics described in [11] with running time can be adapted to give a PTAS for k-MST-R with the same asymptotic running time and a -approximation for W-k-MST-R with running time .
These adaptations are discussed in Appendix C. We emphasize both algorithms run in polynomial time (for constant ) even if the number of required nodes is not bounded by a constant. It is possible that the -approximations for k-MST in [8] or [4] could be extended to W-k-MST-R and that the PTAS for Euclidean k-MST in [2] could be similarly extended. We focus on these particular algorithms due to the ease in describing how to extend them.
We present the proof of Theorem 4 in two parts. First for the general metrics and then the improved FPT time for -dimensional Euclidean metrics.
2.1 Theorem 4: General metrics
Throughout, let denote the cost of an optimum solution. Fix to be a sufficiently small constant, suffices. From Lemma 18 (Appendix A), we may assume is a complete graph with edge costs satisfying the triangle inequality. For any and any we let be the ball of radius around .
The intuition for our algorithm is the following. If the optimum solution consisted of only a single tree containing all nodes with negative charge, we could treat it as a W-k-MST-R instance where and while setting for and for . However, this may not be the case. To cope, we show how to decompose the instance into disjoint subproblems that have single-tree optimal solutions whose total costs are close to .
To perform the decomposition, we first show that a near-optimum solution to the original instance exists such that any two trees in this solution notably far apart. Then we guess a small “net” of points in each tree, the union of balls with a particular common radius around each net point covers all trees, but two balls from different trees are disjoint. This allows us to partition the instance into disjoint instances, one per tree in the near optimum solution.
The first step is to show some near-optimum solution has its components being bounded away from each other.
Lemma 11.
There is a solution with such that for any two components of with and we have for any .
Proof.
Let initially be an optimum solution. While there are two components of with and such that for some and , add to . Each such addition increases the cost of by at most and this procedure will be executed fewer than times since there are initially at most components containing a node in .
Of course, we may also assume, by discarding edges, that is a minimal solution meaning is not feasible for any . So consists of vertex-disjoint trees, say that collectively span all of plus some other nodes in . Any other node not in a tree has and forms an isolated component of .
Next, we identify a net of nodes in the trees that will be small enough for our algorithm to guess.
Lemma 12.
For any tree and any there is a set with such that for each .
Similar constructions have been considered many times before, one example is in the -approximation for k-MST [3]. We include a proof for completeness.
Proof.
We prove this by induction on . Root at an arbitrary node . The base case is when for each . If so, we simply let .
Otherwise, for two let be the cost of the unique path in . Pick any that has maximum value . Note . Now let be the furthest node along the path such that . Since , then . Let be the parent of in . By our choice of we also have .
Let denote the subtree of rooted at . For any we have . Let be the tree obtained by removing the subtree rooted at from . This removes all edges of the path, so . By induction, we can find a set with such that for each .
Finally, set and note . Every is now within distance from some node in , as required.
The entire algorithm is summarized in Algorithm 1. The high-level idea is that it guesses a value very close to and then guesses the “nets” from Lemma 12 applied to each tree of the near-optimum solution using . Below, we show for the proper guess of that the sets obtained by the union of balls for are disjoint. This instance of GP2P then naturally decomposes into disjoint instances of W-k-MST-R. Supporting results demonstrating the performance of our algorithm are found below.
We first discuss the running time. The number of iterations of the outer loop is logarithmic in the ratio , which is polynomial in the number of bits used to represent the costs in the instance. There are clearly only possible values for and the number of -tuples satisfying the stated bounds is at most . So when is regarded as a constant, the total number of iterations is polynomial in the input size and, thus, the entire algorithm makes a polynomial number of calls to a k-MST-R approximation on instances with at most nodes and, otherwise, runs in polynomial time.
Towards the performance guarantee, we show for the “correct” guess of values in the loops the algorithm will perform well.
Proof.
For (a), we have
For (b), if, say, for some distinct then there would be some and such that
But this contradicts Lemma 11, which showed . Finally, (c) follows because the balls for collectively cover all of and each node of lies on some .
To finish the analysis, consider the iteration of the algorithm for the particular setting of and described in Lemma 13. With these values, the algorithm proceeds to run the k-MST-R approximations. In the instance corresponding to , we know itself is a feasible solution so the returned tree satisfies . Thus, the candidate GP2P solution found in this iteration has cost .
2.2 Theorem 4: Euclidean metrics
We simply describe how to modify Algorithm 1. Clearly, we can use a -approximation for W-k-MST-R in Euclidean metrics to make Algorithm 1 a -approximation in Euclidean metrics. The pseudo-polynomial running time in the statement of Theorem 4 comes from the fact that we only know how to adapt k-MST PTASes to k-MST-R and then rely on Observation 3 to get a pseudo-polynomial time W-k-MST-R -approximation. One small comment is that even though the distances are not necessarily rational numbers, the number of iterations of the outer loop is still polynomial in the number of bits used to describe the locations of the points in .
To improve the running time to be FPT in , we change how the nets are guessed. Let be the dimension of the metric, recall that is assumed to be a constant. For brevity, let . Note is the value from Algorithm 1. The idea of the improvement is the following. For simplicy let’s consider the case of . If one considers a square of side length , the number of disjoint balls of radius that can be placed inside that square is . This simple packing argument can be used to bound the number of guessed points for the nets to be bounded by .
For each guess in the outer loop, we first let be any -net of . That is, every has yet for any . Such a set can be constructed by greedily adding points while maintaining the property that until no more points can be added. For each , let be its closest point in . So for and, otherwise, we at least know .
Now let be the sets identified by applying Lemma 12 using and let . For and for we still have that . That is, suppose otherwise and let be such that and be such that . Then when we have
which contradicts Lemma 11 and the fact that and lie in different trees in .
So it suffices to guess in the algorithm. But now we leverage packing property of Euclidean metrics to help reduce the number of guesses to a constant depending on and .
Lemma 14.
For each , is bounded by .
Proof.
The Euclidean balls of radius about points in are disjoint by construction of and are completely contained in the Euclidean ball of radius about . The volume a -dimensional ball with radius is within an absolute constant factor of . Therefore, is at most a constant factor times .
The steps for guessing are to first try all ways to partition into nonempty groups, a coarse upper bound on the number of such choices is . For each such partition, let be any particular representatives from the parts. We try all tuples where each such that (as opposed to as in Lemma 13 since the radius is smaller). For each such tuple that passes the other requirements of Lemma 13, we partition the instance into disjoint Euclidean k-MST-R instances and approximate these with a PTAS. A coarse upper bound on the number of such tuples is .
3 Approximation Algorithms and Hardness for Graphical-GP2P with Charges
Observations 6 and 7 show GP2P is not much easier to approximate if we assume either that for each or that the instance is graphical and for each . We begin this section by showing the common intersection of these two special cases does admit a 2-approximation. After this, we complete the proof of Theorem 8 by showing such GP2P instances remain APX-hard.
3.1 Graphical Instances with Unit Charges
Let be an instance of Graphical-GP2P where for each . As usual, let denote the optimum solution cost which, in this case, is just measuring the minimum size feasible solution .
Theorem 15.
Let be any minimal feasible solution (i.e. is not feasible for any ). Then .
A 2-approximation is then straightforward as one could simply start with being any spanning tree and then iteratively try to drop an edge from while retaining feasibility until no such drop is possible.
Proof.
Recall and . So as no charge is zero in this case. We claim and that any minimal solution has size at most . Thus, any minimal solution satisfies , as required.
For the first bound, let be an optimum solution and any connected component in that contains at least one node in . If has, say, nodes in then must contain at least nodes in as well. That is, so contains at least edges in component . Summing over all components that contain at least one node in , we see .
Now let be any minimal feasible solution. Let be any connected component of and let be the edges of in component . If then minimality of means is a single node with (i.e. it has no edges). If , we claim . If so, then so as is a tree (by minimality). Summing over all components would complete the claim that .
To see , again recall is a tree. Further, for each we have that deleting would produce a component with negative charge but it could not be that both components have negative charge since . Consider the orientation of edges that directs each edge toward the side that would have negative charge if was deleted. Since is a tree, the orientation of edges produces a directed acyclic graph. Let be any source node in this DAG.
View the tree as being rooted at and say has children. Let be such that are the nodes in the subtree under the ’th child of . Since and , we know has at least two nodes so .
By the orientation of edges, we have for each . Therefore
which, by feasibility of , means .
It may be possible to get a better-than-2 approximation through a more involved approach. For example, notice in our proof is only tight if all components of the optimum solution with had (i.e. is a matching). So if the optimal solution has at least, say, nodes of in components of size greater than 2 then any minimal solution would be a -approximation. Otherwise, nearly all nodes of are in components that consist of a single edge. In this case, a maximum-size matching between and would then create components of charge that collectively include most nodes of . One can even show there is a way to “alternate” the matching so that a greedy algorithm yields a good approximation (i.e. by alternating so our matching edges largely agree with the optimum matching edges), but efficiently finding such an alternation seems challenging.
One would reasonably wonder if instances of GP2P with unit-cost edges and charges are actually hard. Indeed, we conclude this section by showing APX-hardness.
Theorem 16.
The restriction of Graphical-GP2P to instances with unit edge costs and charges is APX-hard.
Proof.
We reduce from the Minimum Vertex Cover Problem in simple, cubic graphs which is known to be APX-hard [1]. That is, for some constants it is NP-hard to distinguish if a cubic graph on vertices has a vertex cover of size at most or if all vertex covers have size exceeding .
So let be an -vertex cubic graph with edges. We obtain a graph and set the charges of as follows.
Initially let and set for every . Then subdivide each edge with a single vertex with . Then for each we add new vertices and edges . Here, all have positive charge and have negative charge. See Figure 1 for an illustration of this process.
We claim has a vertex cover of size if and only the Graphical-GP2P instance given by and has a solution of size . So if has a vertex cover of size at most then there is a GP2P solution of size at most and if all vertex covers in have more than then all solutions to this GP2P instance have size more than . This shows there is no -approximation for GP2P with unit edge costs and charges unless P = NP.
To see the claim, first let be a vertex cover of with size . To get a corresponding Graphical-GP2P solution, buy the following edges:
-
For each , purchase .
-
For each , purchase and .
-
For each , let be any endpoint of in . Purchase and any edge of the form that was not purchased by another edge this way. Since is cubic, this is always possible.
In total, this purchases edges. This can be seen to be a feasible solution by matching each negative-charged vertex with a positively-charged vertex in its component in a one-to-one fashion. Such a mapping is immediate from the construction: each can be matched with , each can be matched with , and each can be matched with the corresponding purchased in the description above.
For the converse, we first claim there is a well-structured optimal solution.
Claim 17.
There is an optimal solution such that: (a) for each both and are in , (b) has exactly edges of the form where and , (c) each has for at least one endpoint of , and (d) for each edge exactly one of or .
If so, then is a vertex cover of and , as required to complete the proof.
Proof of Claim 17.
Let be an optimal solution. First observe we must have and for every otherwise some negatively-charged node would be isolated. So in each component we must have the number of nodes of the form for various is at most the number of nodes of the form for various . By optimality of and the fact each is a pendant node, we have that these counts are in fact equal in each component.
Let be any minimum-cost pairing of nodes with nodes of the form lying in the same component as . Here, the cost of pairing two nodes is the length of their shortest path using only edges in . For each , let denote the corresponding shortest path from to . Finally, let be the first vertex after along , notice corresponds to an endpoint of in the original cubic graph .
Now consider an alternative pairing of the vertices with the vertices (which do not necessarily need to lie in the same component as , for now). For each , pair it with any vertex of the form for some that has not been paired before. Such a pairing is possible since is cubic. Let be obtained by modifying in the following way. From , first delete all edges of the form then add all edges of the form where is paired under the new pairing.
This does not change the size of . To ensure it is feasible, do the following. For each such that was not initially paired with one of , it must have been that was the second edge along for some and that (otherwise and both use but in opposite directions, meaning we can uncross the paths to get a cheaper pairing than ). So we remove and add (if it was not already there). Doing so for all will ensure it is now feasible since each can now reach the vertex it is paired with while not increasing the size of because an edge of the form is added only after an edge of the form is removed. This also ensures all properties required by the claim now hold.
4 A Polynomial-Time Logarithmic Approximation
We show that a slight variation of algorithm of [10] yields an -approximation in truly polynomial time. The algorithm in [10] begins by observing, using standard metric embeddings [7], that it suffices to given an -approximation if the graph is a tree which, after rooting at some vertex, has exactly two children per node. Then they present an exact algorithm using dynamic programming where one index of the DP table considers values up to . Roughly speaking, for every vertex and every they consider the subtree rooted at and compute the cheapest subset of edges in the tree such that the component with has charge and every other component in the subtree has nonnegative charge.
We simply point out that we can flip the roles of charges and costs in the DP table. First, we use the reduction in Lemma 19 (using, say, ) to instances where each edge has its cost being bounded by a polynomial in . It is easy to verify this reduction produces a tree if the input graph was a tree. Let be the resulting instance of GP2P, i.e. is a tree and each edge cost is a positive integer bounded by a polynomial in .
Root at an arbitrary node and for each node let be the subtree under . Our dynamic programming table is the following: for each node and each let be the maximum such that in the subtree , it is possible to purchase edges with total cost at most such that the component with has charge at least and every other component has nonnegative charge. Given these values, the optimum solution cost is then the minimum such that .
To compute the values:
-
If is a leaf node then .
-
Otherwise, say are the two children of . Intuitively, we try all subsets of to delete and try all ways to split the remaining budget between the subproblems and keep the best solution found overall. That is, we try all ways to purchase a subset such that and all such that and such that if and if (i.e. if we do not purchase the corresponding parent edge, then the budget in the subproblem better be able to buy a feasible solution in the subtree).
Then is the maximum of the following expression over such :
where is the -indicator for the logical expression enclosed by the brackets.
The number of distinct subproblems is polynomial in since the edge costs are integers at most and there are at most different ways to select in a subproblem, the algorithm runs in polynomial time. In particular, if denotes the total edge cost in the tree then the number of distinct subproblems is and processing each entry takes time (including recursive calls) so the total running time is . Finally, if one only permits recursive calls to subproblems where is at most the total edge cost in the subtree and the loops over the split only iterate over values where and are at most the total edge cost of their respective subtrees, the running time is improved to .
This polynomial-time approximation for GP2P in trees can be used in a black-box fashion to improve the running time of the in [10] to run in true polynomial time.
References
- [1] Paola Alimonti and Viggo Kann. Some apx-completeness results for cubic graphs. Theoretical Computer Science, 237(1):123–134, 2000. doi:10.1016/S0304-3975(98)00158-3.
- [2] Sanjeev Arora. Polynomial time approximation schemes for euclidean traveling salesman and other geometric problems. J. ACM, 45(5):753–782, September 1998. doi:10.1145/290179.290180.
- [3] Sanjeev Arora and George Karakostas. A 2 + approximation algorithm for the k-mst problem. Math. Program., 107(3):491–504, July 2006.
- [4] Emmett Breen, Renee Mirka, Zichen Wang, and David P. Williamson. Revisiting garg’s 2-approximation algorithm for the k-mst problem in graphs. In 2023 Symposium on Simplicity in Algorithms, SOSA 2023, pages 56–68. SIAM, 2023. doi:10.1137/1.9781611977585.CH6.
- [5] Luca Di Gaspero, Johannes Gärtner, Guy Kortsarz, Nysret Musliu, Andrea Schaerf, and Wolfgang Slany. The minimum shift design problem. Annals of operations research, 155:79–105, 2007. doi:10.1007/S10479-007-0221-1.
- [6] Guy Even, Guy Kortsarz, and Wolfgang Slany. On network design problems: fixed cost flows and the covering steiner problem. ACM Trans. Algorithms, 1(1):74–101, July 2005. doi:10.1145/1077464.1077470.
- [7] Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. Journal of Computer and System Sciences, 69(3):485–497, 2004. Special Issue on STOC 2003. doi:10.1016/j.jcss.2004.04.011.
- [8] Naveen Garg. Saving an epsilon: a 2-approximation for the k-mst problem in graphs. In Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’05, pages 396–402, 2005. doi:10.1145/1060590.1060650.
- [9] Michel X. Goemans and David P. Williamson. A general approximation technique for constrained forest problems. SIAM Journal on Computing, 24(2):296–317, 1995. doi:10.1137/S0097539793242618.
- [10] Mohammadtaghi Hajiaghayi, Rohit Khandekar, Guy Kortsarz, and Zeev Nutov. On fixed cost k-flow problems. Theor. Comp. Sys., 58(1):4–18, January 2016. doi:10.1007/s00224-014-9572-6.
- [11] Joseph S. B. Mitchell. Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric tsp, k-mst, and related problems. SIAM Journal on Computing, 28(4):1298–1309, 1999. doi:10.1137/S0097539796309764.
- [12] Kunal Talwar. Bypassing the embedding: algorithms for low dimensional metrics. In László Babai, editor, Proceedings of the 36th Annual ACM Symposium on Theory of Computing, pages 281–290. ACM, 2004. doi:10.1145/1007352.1007399.
Appendix A Standard Reductions
Proof of Observation 3.
Each with is replaced with colocated copies and each with is left as well. In the new k-MST-R instance, use . Consider any tree (say, one obtained using a k-MST-R approximation) in the new instance that spans and at least other nodes. By adding 0-cost edges if necessary, we may assume that contains all colocated copies of any node it spans.
Since at most nodes have , then spans at least nodes from groups of colocated copies of original nodes. Since each group has a size that is a multiple of , then spans at least such nodes, i.e. viewing as a tree in the original graph after contracting colocated copies of nodes it spans yields a feasible solution.
Lemma 18.
Let be an instance of GP2P and let be the complete graph over with metric edge costs given by the minimum-cost path. Any feasible solution to the GP2P instance can be mapped to a feasible solution to the Metric-GP2P instance with no greater cost and vice-versa.
Proof.
For each we have since is one possible path between its endpoints. So for any feasible solution to we have , as required. Conversely, for any feasible solution to if we let be the union of all shortest paths in for each then then . Also, two nodes that were in the same connected component in still lie in the same connected component of . That is, each connected component of is the union of one or more connected components in so .
Lemma 19.
For any constant , if there is an -approximation for instances of GP2P where every edge cost is an integer at most , then there is a -approximation for general instances of GP2P.
Proof.
Let be a general instance of GP2P with optimum solution with cost . By contracting -cost edges (which does not change the optimal solution value), we assume for each .
First, compute smallest value such that every connected component in the graph with edges has . We claim : the first bound is because any feasible solution must use at least one edge of cost by our choice of and the other is because a spanning forest of is a feasible solution using fewer than edges each of cost at most .
Let ; we have since . Define new edge costs for each . Notice is a positive integer (since ) and by construction of .
Let denote the optimum solution cost for the GP2P instance . We have . Therefore, running an -approximation on this new instance finds a set of edges with . Since for every ,
Appendix B Restricted Instances of GP2P
Proof of Observation 6.
Let be an instance of GP2P. As noted in the introduction, we may assume this is an instance of Metric-GP2P. Finally, let (i.e. the nodes with non-zero and be the subgraph of induced by . Notice the restriction of to is an instance of Metric-GP2P.
Let be an optimal solution for the original Metric-GP2P instance. For each tree in the forest , let be the tour obtained by doubling the edges of and shortcutting the resulting Eulerian tour past nodes in . In this way, spans all nodes in that are spanned by and . Thus, the optimal solution cost in the restriction to is at most twice the optimal solution cost for .
To complete the reduction, for each vertex of if then replace with collocated copies each having charge 1 and if then replace with collocated copies each having charge . Note this steps takes pseudopolynomial time.
Proof of Observation 7.
Consider any constant . Let be an instance of GP2P. First we consider some preprocessing. If the 0-cost edges form a feasible solution, then it must be optimal so there is nothing more to do. Otherwise, apply Lemma 19 with and let be the resulting graph with positive integer edges costs being bounded by a polynomial in .
Let be a positive integer to be specified later. Form by performing the following operations to .
-
Subdivide each into a path of length of unit cost edges. Each new vertex in the subdivision has .
-
For each with , append a path to using new vertices and edges: each edge has cost 1 and each vertex on , including itself, has . Note, the other endpoint of is a pendant.
-
Similarly each with , append a path to using new vertices and edges: each edge has cost -1 and each vertex on , including itself, has .
-
Each with also has .
-
Finally, each edge of this new graph has .
Observe is an instance of Graphical-GP2P with for each .
Note a solution naturally maps to a solution in the final instance with cost at most by including all pendant paths and all subdivided paths corresponding to edges in . Conversely, consider any solution . Let be the set of edges of such that their entire subdivision is included in . It is easy to verify that is a feasible solution with cost at most .
Let be the optimal solution value for instance and let be the result of using -approximation on . By the preceding discussion, this yields a feasible solution to with
By setting and noting that as edge costs are positive integers in , this is at most .
Finally, by accounting for the application of Lemma 19 at the start of this proof we see that we would have an approximation for the original instance with guarantee .
Appendix C Adapting k-MST Approximations
We only sketch how the algorithms can be adapted. We refer the reader to the respective papers for their details.
Lemma 20 (Slight adaptation of Arora and Karakostas [3]).
There is a polynomial-time -Approximation for W-k-MST-R.
Proof.
The algorithm in [3] explicitly guesses a subset of vertices that appear in the optimum solution and builds that into the LP relaxation they write. So it can already handle the situation where we have a larger set of required nodes . The only thing to mention is how it can be extended to handle node weights for . We can assume each node with weight is implicitly a collection of many nodes connected using 0-cost edges in a star fashion. The algorithm of [3] first guesses an additional set of size of vertices of to be required. The next step of the algorithm is to run a “primal dual” like algorithm to find a tree . This step works without modification in our setting.
The final step in [3] is to modify appropriately. That is, the manner in which was constructed actually provides us with two options: include some subset of nodes or not. One choice would result in fewer than non-required nodes being spanned and the other would result in at least non-required nodes being spanned. In [3], it is mentioned how to pick the correct number of nodes contiguously from this “optional” portion so that grafting them in to the remaining portion of yields a feasible solution with the required number of nodes. The grafting only costs due to the guess of the net at the start of the algorithm. This can also be done in polynomial time if one implicitly maintains a 0-cost tree spanning each group of colocated points.
Lemma 21 (Slight adaptation of Arora [2]).
There is a PTAS for k-MST-R
We first comment that a similar adaptation could be made to Mitchell’s PTAS [11]. We chose this one because it was slightly easier to describe. We also emphasize that this adaptation is only for unweighted k-MST-R. Combining this with Observation 3 yields a pseudo-polynomial time approximation scheme for W-k-MST-R.
Proof.
The PTAS for k-MST in constant-dimensional Euclidean plane uses a dynamic programming routine through a quadtree dissection of the plane (an in higher dimensions a -tree dissection in -dimensional spaces). The DP table entries for each square, roughly speaking, describe the interface of the optimal solution across the boundary of that square through “portals” and also include the guess for how many nodes should be covered within the square. If there are required nodes , we can use the same DP table and simply insist that the subproblem’s solution also span any nodes of in the square. The base cases are trivially extended to this setting and the combination of subproblems (i.e. the recurrence) for a non-base case is identical to before.
