A QPTAS for Facility Location on Unit Disk Graphs
Abstract
We study the classic (Uncapacitated) Facility Location problem on Unit Disk Graphs (UDGs). For a given point set in the plane, the unit disk graph UDG(P) on has vertex set and an edge between two distinct points if and only if their Euclidean distance is at most 1. The weight of the edge is equal to their distance . An instance of Facility Location on UDG(P) consists of a set of clients and a set of facilities, each having an opening cost . The goal is to pick a subset to open while minimizing , where is the distance of to nearest facility in through UDG(P).
In this paper, we present the first Quasi-Polynomial Time Approximation Schemes (QPTAS) for the problem. While approximation schemes are well-established for facility location problems on sparse geometric graphs (such as planar graphs), there is a lack of such results for dense graphs. Specifically, prior to this study, to the best of our knowledge, there was no approximation scheme for any facility location problem on UDGs in the general setting.
Keywords and phrases:
Facility Location, Unit Disk Graphs, Approximation AlgorithmsFunding:
Zachary Friggstad: Supported by NSERC.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Graph algorithms analysis ; Theory of computation Approximation algorithms analysisEditors:
Pat Morin and Eunjin OhSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Unit-disk graphs (UDGs) are a well-studied class of graphs due to their extensive applications in modeling ad-hoc communication and wireless sensor networks; see for example [18, 4, 22, 10, 15, 26, 25]. UDGs are defined as intersection graphs of a collection of unit-diamater disks in the two-dimensional plane. Specifically, each UDG represents a set of unit disks as vertices, with each vertex corresponding to one unit disk. An edge exists between two vertices/points if and only if their Euclidean distance is at most (equivalently, the unit-diameter balls around and intersect) and the weight or length of each such edge is given by their by the Euclidean distance between the corresponding vertices.
Formally, for a given point set in the plane, the unit disk graph representation of these points, denoted as UDG(P), is a graph with the vertex set , where each vertex corresponds to a point in . The edge set consists of edges between points and if and only if their Euclidean distance, denoted as , is at most 1. The weight of the edge is equal to their distance . For a given subset , we define the (weak) diameter of as , where represents the minimum weight of a path between vertices and in (we assume is connected as we may solve facility location for each connected component).
For many optimization problems, approximation schemes are known when the input graph is a UDG (e.g. maximum independent set, minimum dominating set, minimum clique-partition [23, 14, 5, 24]). Some of the techniques for designing PTAS’s for optimization problems on UDGs (e.g. maximum independent set) involves partitioning the input into regions of bounded size (at a small loss, e.g. ignoring points that touch the boundary of the partitions) and then solving the problem on such instances using exhaustive search and/or dynamic programming which leverage Euclidean distance properties. This shifting strategy was introduced by [13].
We study the Facility Location problem on UDGs. An instance of Facility Location consists of an edge-weighted graph , where the edges satisfy the metric property, a set of clients, and a set of facilities, each having an opening cost . The goal is to pick a subset to open to minimize , where is the distance of to nearest facility in . Facility Location has been studied extensively and the best known upper and lower bounds for it are 1.488 [21] and 1.46 [11], respectively. Approximation schemes are known for Facility Location when the metric is Euclidean [2] or when is a planar graph [8]. Additionally, Cohen et al. [7] developed approximation schemes for the “uniform” (that is all facilities cost 1 to open) facility location problem in minor-free graphs. To the best of our knowledge, no approximation scheme was known for any facility location type problem on UDGs. Our main result of this paper is the following:
Theorem 1.
There is an algorithm that, given an instance of Facility Location in UDG and , finds a -approximate solution in time , where the constant in is .
In order to prove Theorem 1 we combine ideas from [8] with a low-diameter decomposition for UDGs that follows from [19, 20]. We also introduce a new dissection procedure obtained by finding a proper balanced separator for UDGs. This allows us (at a small loss) to break the problem into independent instances and use dynamic programming to combine the solutions to obtain the solution for the original instance. There are many details on how to put these pieces together carefully so as to bound the overall error.
2 Preliminaries
For planar graphs and, more generally, graphs that exclude as a minor for some fixed , Klein-Plotkin-Rao [16] showed a decomposition of the input graph into low diameter parts by removing a small fraction of edges. More specifically, given a graph with nodes and edges that excludes as a minor, one can remove edges so that the (weak) diameter of each remaining component is at most . The general idea was based on chopping breadth-first search (BFS) trees (i.e. shortest-path trees in the unweighted version of the graph): suppose one constructs a BFS tree from some root node and then cut the edges at level for where is a random offset. Then repeat this procedure on each of the connected components, for iterations. Then the resulting components have weak diameter. This result was further improved in [9, 1], to show for each graph without as a minor there is a probabilistic decomposition into (weak) diameter components by removing edges. Lee [19, 20] generalized this by introducing region intersection graphs, which includes UDGs as a special case, and showed that one can obtain similar decomposition results for such graphs. Theorem 4.2 in [19] implies that a similar BFS chopping procedure applied to UDGs for a constant number of iterations results in graphs of bounded (weak) diameter. We describe this chopping procedure a bit more formally.
Definition 2 (-chopping operation).
For any connected graph and any number , we define the -chopping operation on as follows. Choose any node from , select a random integer , and then compute a BFS tree from . Partition into annuli , where and annulus for is defined as: , where is the number of edges on the BFS tree path from to .
So there is an offset that cuts only a -fraction of edges. Earlier works on minor free graphs [17] imply that if is -minor free if we perform this chopping procedure on each component of recursively up to depth yields components with (weak) diameter at most . Corollary 4.3 in [19] immediately implies the following, Appendix A contains the brief argument.
Theorem 3 ([19]).
iterations of the -chopping iteration applied to a UDG results in a graph of weak diameter .
To prove Theorem 1 we also rely on the following result (which we prove) for the special case when the instance is in a bounded size region.
Theorem 4.
There is a PTAS for Facility Location in UDGs when the point set is contained within a bounding box of constant size in the plane.
Theorem 4 uses different techniques than discussed above. It follows from a simple reduction to prize-collecting Uncapacitated Facility Location in for which a PTAS is known [6] after guessing an “-net” of centers in the optimum solution which is possible in polynomial time since is bounded. The proof is deferred to Appendix B.
Our algorithm for Theorem 1 starts with some preliminary steps of algorithm of [8] that presented a PTAS for Facility Location on planar metrics. Those preliminary steps in fact are valid for general metrics (they do not use planarity in those initial steps) and reduce the problem to instances with certain structures that serve as our starting point. For this reason, we briefly outline the main steps of their algorithm. These initial steps reduce the problem to instances with certain structural properties and their proof works for general metrics (not just planar ones). Hence, the same initial reductions work in our setting as well.
2.1 Starting point: the PTAS for Facility Location on planar graphs [8]
A Well-Structured Instance
The goal of this section is to reduce Uncapacitated Facility Location in UDGs to more well-structured instances that, intuitively speaking, has the clients partitioned into annuli about facilities from some with bounded aspect ratio and that these facilities are near some facilities opened in an optimal solution. Definition 7 below contains the precise notion of what it means for an instance to be well structured. Corollary 9 is the main result from this section.
Given an instance of Facility Location, which consists of an edge-weighted graph , a set of clients , and a set of facilities with opening costs (for each ), the first step of the algorithm in [8] involves partitioning the instance into separate (independent) sub-instances with specific structural properties. For any solution , we denote by the connection cost of () and by the opening cost of facilities in (), and . We sometimes use to denote we refer to the cost of for instance . To achieve this, they compute an -approximation solution (where ) to a modified instance where each opening cost is scaled down by a factor of . In other words, is an -approximation for . It is not hard to see that is also an -approximation for . For any , let denote the set of clients connected to in this solution and define be the average cost of the cluster by facility . Suppose is the set of facilities in an optimum solution to . So the cost of is at most times cost of . They show:
Lemma 5 (Corollary 5 [8]).
.
Then they build a modified instance where clients of each are not too close or too far away from compared to . Let be the cost of an optimum solution to and be the cost of an optimum solution to . They show that:
Lemma 6 (Corollary 7 [8]).
For any if (for some ) then
Therefore, a near optimum solution to yields a near optimum solution to . In order to be able to obtain a PTAS they further partition the instance (starting from ) into several instances each of which has certain structural properties.
Definition 7 (Structured Instance with Bounded Aspect Ratio).
Consider an instance of Facility Location consisting of an edge-weighted graph , a set of clients , and a set of facilities with opening costs (for each ). Suppose we are provided a set that partitions into nonempty clusters . We say that the instance has bounded aspect ratio of the average costs and being structured if the following properties hold:
-
i)
, for each and each ,
-
ii)
for each , there exists such that ,
-
iii)
the aspect ratio of the average costs (i.e. ) is bounded by .
They show that one can partition into instances such that each instance satisfies the structural properties of Definition 7 and also how to combine solutions for the various to get a good solution for :
Lemma 8 (Lemmas 10 and 11 [8]).
Given for , we can build in polynomial time s.t. Furthermore
Recall that all these results only use the metric property of instance . The preceding lemmas show that to prove Theorem 1 it is sufficient to present a QPTAS for instances satisfying conditions of Definition 7 and this is what we will do. We state this formally.
Corollary 9.
Suppose for any constant there is a PTAS (resp. QPTAS) for Uncapacitated Facility Location instances in UDGs when we are additionally given and clusters satisfying the properties in Definition 7. Then there is a PTAS (resp. QPTAS) for any Uncapacitated Facility Location instance in UDGs.
2.2 Overview: A recursive decomposition of UDGs
Adapting the approach from [8]
In order to get a PTAS for well-structured instances in the planar case, [8] uses a Baker-type type layering technique [3] in conjunction with the properties of the instance, and further decompose the instance into instances of constant radius at a small loss. By utilizing balanced separators for planar graphs, they obtain a hierarchical decomposition of the plane embedding of the graph into separate regions (similar to the decomposition of Euclidean instances by Arora [2]). By placing “portals” along each separator they use dynamic programming over this decomposition, while the portals control the interface of different regions.
In our setting, instead of using Baker layering to obtain low diameter instances, we use Theorem 3 to break the instance that satisfies conditions of Definition 7 (at a small loss) into low diameter instances. We will use a variant of a balanced separator theorem developed by [12] (which improved over a more complex separator by [27]) for UDGs. Roughly speaking, they show that given a UDG, one can find two paths originating from a vertex to two other vertices that are shortest paths, and , such that the removal of these two paths and all the vertices that are within distance 1 of them leaves connected components of size at most (see Theorem 10). We adapt this theorem to get an (almost) balanced separator to obtain a hierarchical decomposition of a low diameter instance into smaller instances. We also place portals at these separators and use Dynamic Programming to combine the solutions. After a logarithmic depth of hierarchical decomposition, we arrive at instances that are easy to solve using other methods (e.g. another PTAS that is described in Theorem 4).
Balanced (partly) Separators for UDGs
To obtain our dynamic program scheme, we would like to be able to break a UDG into (almost) balanced parts by picking a separator. This will act as a “cut” in the disection schema developed by Arora [2] that has been used in designing PTAS’s for various optimization problems on Euclidean plane. For this, we utilize the balanced separator theorem for UDGs as presented by Yan et al. [12]. Let and . A hop-shortest path between two nodes is a path with the minimum number of edges.
Theorem 10 (Theorem 13 in [12]).
For a UDG , and a root , there exist two nodes of and hop-shortest paths and for which the removal of from yields components each having at most vertices from .
Theorem 13 in [12] actually only proves it for the case . See Appendix A for discussion about why it holds for general . This theorem serves as a counterpart to the well-known balanced shortest paths separator theorem in planar graphs by Lipton and Tarjan. However, it poses a challenge due to the fact that the separator is formed by the 1-neighborhoods of the paths rather than just the paths themselves. This means that we must not only remove the shortest paths but also all nodes within a distance of one from the nodes on the shortest paths. To tackle this challenge, we narrow our focus to cases where the average distance between clients and facilities is relatively large. By doing so, we can assume that clients and facilities, which end up on two different sides of the shortest path, always get connected via nodes in . Note that, as stated in Theorem 11 (which is a slight modification of this theorem), the only exceptions to this assumption occur when the path between clients and facilities crosses the border using an edge whose endpoints are very close to nodes in . However, using the fact that we can assume the average distance between clients and facilities is relatively large, we can force those paths to visit nodes in with a relatively tiny error.
In the following, we demonstrate that a slight modification of this theorem yields a balanced, yet partial in some sense, separator for UDGs. More precisely:
Theorem 11.
For a UDG , and a source , there exists two nodes of and hop-shortest paths and such that removing partitions the vertices into two sets each having at most vertices from . Additionally, for any edge with and , there exists such that .
This follows easily from Theorem 10 but we provide a proof in Appendix A for the sake of completeness. Note that we say is a separator between and if there are no edges in that connect a vertex from set to a vertex from set . However, here, we relax this condition and allow for the presence of such edges, provided that their endpoints are in close vicinity to the separator.
3 Proof of Theorem 1
We can now describe our QPTAS for Uncapacitated Facility Location in UDGs. Consider an instance of Facility Location, consisting of an edge-weighted UDG , a set of clients , and a set of facilities with opening costs (for each ). Let denote the set of facilities opened by the optimal solution, with representing the cost of this solution. Additionally, let denote the -approximation solution as described in Section 2.1, and let represent the cost of this solution. Note . We assume that the instance satisfies the properties mentioned in Definition 7. Let and denote the minimum distance between a client and its facility (cluster center) in . It can be verified using the properties of Definition 7 that the inequalities and hold for each and each (these are essentially the conditions in Lemma 9 of [8] except that we can’t do scaling in UDG instances, hence we have the factor ). Moreover, based on property (ii) (Definition 7) of the instance, it follows that holds for each client .
Our next step is to decompose the instance into independent subinstances.
Lemma 12.
At a loss of we can decompose the instance into a number of independent instances where each has diameter at most .
The full proof is deferred to Appendix A. Intuitively, it uses a BFS from a node to define layers and partitions by breaking the graph every layers using a random offset for this partitioning. Facilities of in the first layers of each part are opened, which has low cost on average over the random choice of offset. This chopping procedure is recursively applied times, after which each component has edge-hop diameter by Theorem 3. In the proof, we note BFS distances approximate actual weighted UDG distances within a factor of 2 for pairs of non-adjacent nodes so the actual diameter under weighted UDG distances is also for each component.
It can also be seen that the sum of optimum solutions of all these instances costs at most since for each client in their optimum facility will also be in . The rest of our algorithm approximates solutions in each with the following guarantees.
Lemma 13.
Given instance for Facility Location on a UDG let be an approximate solution as described above, consider the instances as in Lemma 12. There is a quasi-polynomial algorithm that produces solutions for ’s such that the total cost of the solutions is at most .
From this, we immediately get Theorem 1 since the total error is at most since the first term in the expression above is at most and .
We will treat each individual instance separately and will produce a solution for it of cost such that . A key observation that we use to bound the sum of the additive error bound above is the following. Suppose that is sufficiently large, i.e. (we will handle the case of small separately). Note that (as mentioned in the first paragraph of this Section) since , therefore ; so if is large () and each client is moved an extra distance, then it adds at most to the cost of an optimum solution for the modified instance. Hence, this instance still has a solution of cost at most . In our algorithm for each we might consider paying an extra for each client. Using this argument, it can be seen that the total additive error for each will be ; summing over all the instances the additive error is at most . So from now on we assume our instance is one of the and we prove Lemma 13.
3.1 Hierarchical decomposition with portalization
In this section, we assume the instance is a low diameter instance as obtained by applying our chopping operation from the proof of Lemma 12, i.e. one of the . More specifically, at a loss of we assume: and hold for each and each , and for each client , and the diameter is is at most .
We obtain a hierarchical decomposition of the instance and describe a Dynamic Programming (DP) algorithm based on this decomposition. This decomposition has a logarithmic depth, and each region within the hierarchy is obtained by applying two shortest paths separators to the parent region (for readers familiar with the standard decomposition obtained by a dissection of the plane used in designing PTASs for problems such as TSP and Facility Location on Euclidean planes: one can think of our shortest paths separators as the “line” that breaks the problem into two balanced instances; we will place “portal” at appropriate locations along the separator). One difference in our schema is that the region at each level might have a “boundary” that is composed of separators of all the ancestor of it; so it is bounded by separators (whereas, for e.g., a region defined in the recursive decomposition for TSP is defined by 4 dissecting lines). This is the only reason our running time becomes quasi-polynomial.111We conjecture a more careful analysis of our scheme and applying the separator could imply a “boundary” that is defined by only a few separators. This would turn the whole algorithm into a PTAS.
Recall . Observe that if we have a UDG with diameter at most then all the points must be in a bounding box of . Using this, Theorem 4 provides a PTAS when the value of is small (i.e. at most ). Therefore, we can assume that . We provide a solution that has multiplicative approximation factor and additive factor that can be charged to the number of clients in the instance (eventually we will have .)
Indeed, assuming that is sufficiently large is vital for our approach, as it enables us to utilize the balanced “partly” separator described in Theorem 11. This separator allows for a hierarchical decomposition of the graph with a logarithmic depth, but it does permit direct interactions between points (within a small distance from the separator path) in the separated regions. By forcing the interaction through the separator nodes, we incur an additive error. However, when the minimum distance between clients and facilities is sufficiently large, this error becomes negligible.
Sparsifying
Let denote the diameter of the graph. Our goal is to obtain a net of size so that the number of points we deal with is in terms of (instead of ), while we loose a small factor (compared to ).
Lemma 14.
We can obtain a graph where with , where:
-
For any two , .
-
If we move each client and facility in to its nearest point in , the optimum solution increases by .
-
Conversely, given a solution to this modified instance we can get a solution to with additional cost .
We let for denote all clients and facilities of that moved to .
We can think of as the set of nodes in a ball of radius around . The proof is by a fairly standard net construction and is found in Appendix A. Note that the error described above when summed over all ’s, is at most (since ).
Hierarchical decomposition of
We obtain a hierarchical decomposition of which will be associated with a rooted binary tree where the root of is . We will have a Dynamic Programming to solve Facility Location using this decomposition. This will be similar to the hierarchical decompositions obtained for PTAS’s on the Euclidean instances (e.g. [2]) except that we use the separator in Theorem 11 instead of dissecting lines and that the leaf nodes in our decomposition tree in our case do not correspond to trivial instances (typically a leaf node is a box with only one point in it). In our case, each leaf node is an instance with certain properties which enables us to solve it in quasi-polynomial time at a small loss.
Let us describe the decomposition for . Our hierarchical decomposition tree is associated with a labeling : we set the label of the root of to be . Each represents a subgraph with the property that if are children of , then . We use to denote the boundary vertices of and the rest of the vertices, , we call them the core vertices and are denoted by . The “boundary” is obtained by finding a separator using Theorem 11 and is added to the boundary that is inherited from the parent (details to follow). The boundary of the root node is the empty set (and so all the vertices of are core vertices for the root node of ). The overall idea is whenever a current leaf node has we decompose into two smaller subgraphs and we obtain children for . More specifically, starting from when is a single node corresponding to , iteratively, for each leaf of , if , apply Theorem 11 over with to obtain two graphs and , along with the two shortest paths and . We use the same vertex to find the two shortest paths for each node and we make sure this vertex is passed down. Define to be and . This also defines to be the subset of vertices of that fall into and not in the boundary of . Note that our separator separates the vertices of our sub-instance into parts each of which has at most many core vertices.
Every time we find a separator to break the graph (as described) we also designate of the vertices of each of these two paths as portals. More specifically, let and designate some of the vertices of these paths as portal so that they are apart (note that as mentioned before, the hop-distance and UDG distances are within factor 2 of each other). Since these paths have hop-distance length at most , we will have portals per path. Our intention is that if two points and are to be connected they have to go through portals of the boundary. This is illustrated in Figure 1.
Observe that the depth of is (recall that ). By construction, for each node , the region defined by consists of core nodes in and the boundary consists of (at most) separators (which are shortest paths starting from ) obtained using Theorem 11 each of length proportional to the diameter of the graph, namely . For any let be the set of portals on the boundary of region . Note that each region boundary is composed of at most paths and each path has portals, so each has at most portals.
By Theorem 11 and the way we put the portals (since they are apart), it follows that for any two points and (where are children of a node ), there exists a portal in the separator that breaks into for which . Now if are two vertices of (recall that is the ball of that created net node in ), the distance between to go via their ball centers and then via the portal is bounded as well: .
Suppose we require, for each node with children , the connection between points in be via portals in the separator that broke into . As argued above, this detour adds at most to the cost for each connection. Since the depth of the recursion tree is , the accumulated error incurred to each client/facility for communicating via portals and centers of their balls (among all the levels of the decomposition) is at most . Hence, there is a total error of at most for all clients connections. By a suitable choice of depending on , and setting , we get , the total error for re-routing connections of clients to be via portals (among all levels of decomposition) between regions is bounded by:
| (1) |
This will be our additive error . Recall that we have , implying . This, together with (1) and the fact that , imply that the total additive error for all ’s is bounded by if is sufficiently small.
Note.
The observation above (that if we re-route clients connections to be via portals adds only to the total cost) will be crucially used in our DP. In other words, if for every client/facility connection distance, we have a rounding error of in every level of , then the total error across all ’s is bounded by . In particular, imagine for each leaf node we have moved all the points (clients/facilities) in the ball of each node in to the nearest portal in . This adds an extra cost of over all levels of decomposition for all ’s. This simplifies our instance significantly and allows us to find a near optimum solution using Dynamic Programming. We present an overview of the DP procedure before going into details.
Overview of Dynamic Program based on
Consider an arbitrary node and portals (where on the paths that define . For each portal we will have two values : indicates (approximately) the distance to the nearest facility (to ) that is supposed to be open in the instance defined by , and indicates (approximately) the distance to the nearest facility (to ) that will be open outside the region defined by (and clients inside the balls of vertices of can be connected to them via by paying an additional distance cost of ). These distances are “approximate” since we only keep multiples of , since having a precision parameter of (as argued above) results in total error .
DP Table
For any and any two vectors of dimensions (for the portals of ) we will have a DP table entry . This entry is supposed to store the (approximate) cost of an optimum solution to the Facility Location instance defined by the balls (where ) subject to the following conditions:
-
for each portal there is an open facility in the solution with distance to at most as specified by ,
-
for each client either it is served by an open facility inside, or is paying connection cost to go to a portal and if is the number of clients that go to portal then they pay to get serviced by a facility outside of at distance . This will be part of the cost for the solution.
We say vector (and similarly ) is valid if it satisfies the following condition: for any two portals , if their distance (rounded to the nearest multiple of ) is then . This condition clearly must hold as if there is a facility with distance from then the nearest facility to cannot be further than away from it. Our DP table is computed only for valid vectors for each node .
Recurrence Overview
Suppose we have computed (approximate) solutions for each problem defined for each leaf node of and each (valid) vectors for the portals. Our DP will compute the solution for the instance of root of (i.e. ) in a bottom up manner from the leaf nodes to the root. For each internal node with children and vectors (for ) and (for , respectively) it will see if the three solutions are “consistent”. We will define this formally in the next section but at a high level this checks whether the solutions for (given the portal vectors) can be combined so that we get a solution for where it is consistent with the vectors of portals specified by .
If one keeps the distances (stored in portal vectors) as as multiples of , since distances are bounded by , there will be choices for each portal, which leads to a table size.222We can reduce this using a trick that is also used earlier (e.g. see [2]) to show how to reduce the size of the portal vectors by storing only “smoothed” vectors. Informally, the observation that helps is that if we keep distances of the nearest facility (inside or outside) for a portal as the multiple of then if the value for portal is then the value for portal just before or after on the separator path that belongs to is in . Thus, the total number of vectors we need to consider for a node is , where there are choices for the first portal values and then for the subsequent portals there are only choices for each distance. The base case will be solved using an exhaustive search by guessing a subset of portals and opening the cheapest facility on that portal followed by a check if the distance requirements for facilities are satisfied: the best consistent solution will be kept and we will store if there is no such solution.
3.2 Dynamic Program
In this section we describe the details of our dynamic program based on the decomposition . As mentioned earlier, each node corresponds to a set of vertices with boundary nodes compose of at most separators, each of which is two paths initiating from (using Theorem 11) and each containing portals that are apart (for a total of portals). Note that the diameter of the instance was , so if we round a distance to the nearest multiple of , we get an integer of value at most . As mentioned in the overview, for each pair of -dimension vectors , where each entry (for portal ) is an integer at most , we have an entry in our DP table . As mentioned before, we only consider valid vectors .
The goal of this subproblem is to identify a set of facilities to open in and to assign each client in to either an open facility or bring to a portal such that minimizes the total opening cost plus connection cost such that: (i) For each portal , there is an open facility in of distance at most (rounded to the nearest multiple of ); and (ii) For any portal suppose is the number of clients in that are connected to by the solution. The connection cost for these clients is the cost they pay to be connected to plus .
A well-structured near-optimum solution
We first show the existence of a near optimum solution with certain properties such that our DP actually finds such a near optimum solution. Starting from an optimum solution on we make some changes to it to satisfy the properties we want, while increasing the cost a little only. First for each node with we consider keeping the cheapest open facility in and closing all the others and re-routing all the clients that were served by other facilities in to that single open facility via . This adds at most to the distance each client has to travel (via ) as nodes in have distance at most to . This increases the cost by at most over all clients. We can assume the open facilities are on the centers of the balls now. Let’s call this new (near optimum) solution . Next we modify the solution further by the following process. We say a solution is -adapted for a node if (i) for each of its descendant the solution is -adapted, and (ii) for each client , if is connected to a facility outside of then it is connected via a portal , where is the set of portals of .
We start with and at leaf nodes of and going up the tree, we make the solution -adapted for each at small increase in cost. Let us consider a leaf node with (the case when is even easier) and boundary corresponds to paths with a total of portals . We consider as a portal as well and add it to . Suppose is the subset of , where there is a facility in distance at most of in . For each such , we assume we have kept open the cheapest facility within of it. For any client if was connected to a facility in in (note that all of those are within distance of some portal in ) we consider routing to the nearest portal in (and then from there to the single cheap facility we kept open). Note that this increases the connection cost for each client by at most . If was connected to a facility outside of we re-route it first to a nearest portal along its way and then from there to be connected to the facility it was connected to outside of (i.e. we make the connection of to go through a portal). This increases the connection cost for each by at most as argued earlier in the overview and the solution becomes -adapted.
Now suppose is a non-leaf node with children and supposed is -adapted and -adapted. We make it -adapted with small increase in the cost. For any client , if is connected to a facility in we don’t need to make any further changes. Otherwise we make a detour for the connection of to go through one of the portals . This detour increases the connection cost of a client by at most .
One last change we do is in the calculation of the cost of the adapted solution to make it portal vector adapted: for each node , the -adapted solution also induces vectors for portals of in the following way. The clients that are served by facilities outside are first going to a portal (of ). The distance from that portal to the nearest open facility (outside ) rounded up to the nearest multiple of is what induces a value for at node . We use this rounded (up) value instead of the actual distance in the calculation of the cost of the -adapted solution. Similarly, for each portal , the distance to the nearest open facility in rounded to the nearest multiple of induces a value for node . Let correspond to the vectors induced by the -adapted optimum solution. We assume the cost a client pays to go out of to be connected to a facility via a portal (from ) is . Note that our estimate of the nearest to open facility inside or outside , described by has an additive error of at most . Thus, the connection costs for each client at each node can be larger by again. We call this new cost, portal vector adapted.
It is easy to see that if we make the solution -adapted, where is the root of , and consider the portal adapted cost (as described), then the increase in the connection cost for each client over all the nodes of is at most (since the height of is ); summed over all the clients this was shown to be . Thus, the best -adapted and portal adapted solution has cost , and for a suitable choice of the additive error over all ’s will be add to at most .
Recurrence details
Let correspond to the vectors induced by the -adapted near optimum solution. By this argument it is enough to find compute the entries to obtain a -approximate solution.
Base case.
Let us consider a leaf node with (the case when is even easier) and boundary , which corresponds to paths with a total of portals . We consider as a portal as well and add it to . For any subset of , where there is a facility in , we consider opening the cheapest facility in ; let’s call that facility . For any client we consider routing to (i) nearest portal with an open facility, or (ii) to a portal to be connected outside at a total cost if this is less than the distance to the nearest portal with an open facility. This will be considered a feasible solution if: for each portal there is a portal with an open facility such that . The cost for will be the cost of the cheapest feasible solution over all choices of as described above. If there is no such solution (consistent with vectors , we set . It is easy to see that we obtain the best -adapted portal adapted solution.
Filling in the rest of DP table.
Now consider an arbitrary (non-leaf) node and vectors and suppose has children . Suppose all the entries of for all vectors of portals are computed. Let be the set of portals of , respectively. For vectors for portals of , and vectors for portals of we say subproblems , , are consistent if the following hold:
-
For each portal , either or .
-
For each portal , i.e. a portal that is on the separator of that creates : and .
-
for each we must have .
First observe that checking consistency for the three subproblems can be done in time . Then
where the is over all vectors such that , , are consistent. By induction, assuming that are induced portal vectors for a -adapted near optimum solution and that and are computed correctly, one can see that we get that the cost of a solution at that is no more than that of optimum -adapted solution at induced on .
Run time analysis.
Note that diameter of a (connected) UDG is at most (and we assume the graph is connected since we can run the algorithm on each connected component). Thus and hence the size of the DP is . To compute each base case it takes time and for each non-leaf node of which children and vectors , the solution can be computed by comparing all triples of valid solutions for ; this also takes . Overall the runtime is therefore , where the constant in is .
References
- [1] Ittai Abraham, Cyril Gavoille, Anupam Gupta, Ofer Neiman, and Kunal Talwar. Cops, robbers, and threatening skeletons: Padded decomposition for minor-free graphs. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 79–88, 2014. doi:10.1145/2591796.2591849.
- [2] Sanjeev Arora, Prabhakar Raghavan, and Satish Rao. Approximation schemes for euclidean k-medians and related problems. In Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 106–113, 1998. doi:10.1145/276698.276718.
- [3] Brenda S. Baker. Approximation algorithms for np-complete problems on planar graphs. J. ACM, 41(1):153–180, January 1994. doi:10.1145/174644.174650.
- [4] Sergio Cabello and Miha Jejčič. Shortest paths in intersection graphs of unit disks. Computational Geometry, 48(4):360–367, 2015. doi:10.1016/J.COMGEO.2014.12.003.
- [5] Xiuzhen Cheng, Xiao Huang, Deying Li, Weili Wu, and Ding-Zhu Du. A polynomial-time approximation scheme for the minimum-connected dominating set in ad hoc wireless networks. Networks, 42(4):202–208, 2003. doi:10.1002/net.10097.
- [6] Vincent Cohen-Addad, Andreas Emil Feldmann, and David Saulpic. Near-linear time approximation schemes for clustering in doubling metrics. Journal of the ACM (JACM), 68(6):1–34, 2021. doi:10.1145/3477541.
- [7] Vincent Cohen-Addad, Philip N Klein, and Claire Mathieu. Local search yields approximation schemes for k-means and k-median in euclidean and minor-free metrics. SIAM Journal on Computing, 48(2):644–667, 2019. doi:10.1137/17M112717X.
- [8] Vincent Cohen-Addad, Michał Pilipczuk, and Marcin Pilipczuk. A polynomial-time approximation scheme for facility location on planar graphs. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 560–581. IEEE, 2019. doi:10.1109/FOCS.2019.00042.
- [9] Jittat Fakcharoenphol and Kunal Talwar. An improved decomposition theorem for graphs excluding a fixed minor. In RANDOM-APPROX, pages 36–46, 2003. doi:10.1007/978-3-540-45198-3_4.
- [10] Jie Gao and Li Zhang. Well-separated pair decomposition for the unit-disk graph metric and its applications. In Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, pages 483–492, 2003. doi:10.1145/780542.780613.
- [11] Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. Journal of algorithms, 31(1):228–248, 1999. doi:10.1006/JAGM.1998.0993.
- [12] Elfarouk Harb, Zhengcheng Huang, and Da Wei Zheng. Shortest path separators in unit disk graphs. In Timothy M. Chan, Johannes Fischer, John Iacono, and Grzegorz Herman, editors, 32nd Annual European Symposium on Algorithms, ESA 2024, September 2-4, 2024, Royal Holloway, London, United Kingdom, volume 308 of LIPIcs, pages 66:1–66:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.ESA.2024.66.
- [13] Dorit S. Hochbaum and Wolfgang Maass. Approximation schemes for covering and packing problems in image processing and vlsi. J. ACM, 32(1):130–136, January 1985. doi:10.1145/2455.214106.
- [14] Harry B Hunt, Madhav V Marathe, Venkatesh Radhakrishnan, S.S Ravi, Daniel J Rosenkrantz, and Richard E Stearns. Nc-approximation schemes for np- and pspace-hard problems for geometric graphs. Journal of Algorithms, 26(2):238–274, 1998. doi:10.1006/jagm.1997.0903.
- [15] Haim Kaplan, Wolfgang Mulzer, Liam Roditty, and Paul Seiferth. Routing in unit disk graphs. Algorithmica, 80:830–848, 2018. doi:10.1007/S00453-017-0308-2.
- [16] Philip Klein, Serge A Plotkin, and Satish Rao. Excluded minors, network decomposition, and multicommodity flow. In Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 682–690, 1993. doi:10.1145/167088.167261.
- [17] Philip Klein, Serge A. Plotkin, and Satish Rao. Excluded minors, network decomposition, and multicommodity flow. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC ’93, pages 682–690, New York, NY, USA, 1993. Association for Computing Machinery. doi:10.1145/167088.167261.
- [18] Fabian Kuhn, Tim Nieberg, Thomas Moscibroda, and Rogert Wattenhofer. Local approximation schemes for ad hoc and sensor networks. In Proceedings of the 2005 joint workshop on Foundations of mobile computing, pages 97–103, 2005. doi:10.1145/1080810.1080827.
- [19] James R. Lee. Separators in region intersection graphs, 2016. arXiv:1608.01612.
- [20] James R. Lee. Separators in Region Intersection Graphs. In Christos H. Papadimitriou, editor, 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), volume 67 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:8, Dagstuhl, Germany, 2017. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ITCS.2017.1.
- [21] Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. Information and Computation, 222:45–58, 2013. doi:10.1016/J.IC.2012.01.007.
- [22] Xiang-Yang Li, Wen-Zhan Song, and Yu Wang. Efficient topology control for ad-hoc wireless networks with non-uniform transmission ranges. Wireless Networks, 11(3):255–264, 2005. doi:10.1007/S11276-005-6609-4.
- [23] Tomomi Matsui. Approximation algorithms for maximum independent set problems and fractional coloring problems on unit disk graphs. In Jin Akiyama, Mikio Kano, and Masatsugu Urabe, editors, Discrete and Computational Geometry, pages 194–200, Berlin, Heidelberg, 2000. Springer Berlin Heidelberg.
- [24] Imran A. Pirwani and Mohammad R. Salavatipour. A weakly robust PTAS for minimum clique partition in unit disk graphs. Algorithmica, 62(3-4):1050–1072, 2012. doi:10.1007/s00453-011-9503-8.
- [25] Erik Jan van Leeuwen. Approximation algorithms for unit disk graphs. In Graph-Theoretic Concepts in Computer Science: 31st International Workshop, WG 2005, Metz, France, June 23-25, 2005, Revised Selected Papers 31, pages 351–361. Springer, 2005. doi:10.1007/11604686_31.
- [26] Chenyu Yan, Yang Xiang, and Feodor F Dragan. Compact and low delay routing labeling scheme for unit disk graphs. Computational Geometry, 45(7):305–325, 2012. doi:10.1016/J.COMGEO.2012.01.015.
- [27] Chenyu Yan, Yang Xiang, and Feodor F. Dragan. Compact and low delay routing labeling scheme for unit disk graphs. Computational Geometry, 45(7):305–325, 2012. doi:10.1016/j.comgeo.2012.01.015.
Appendix A Missing Proofs
Proof of Theorem 3..
Unit disk graphs are string graphs - intersection graphs of continuous arcs in the plane. Lemma 1.4 in [19] then implies each unit disk graphs is a so-called region intersection graphs over a planar graph (see the introduction in [19] for a definition). Since planar graphs exclude the complete graph as a minor, then Corollary 4.3 in [19] immediately implies the result.
Proof of Theorem 10..
We simply describe how to adapt the proof in [12] to the more general setting with . Intuitively, [12] reduces their separators to those of finding separators in planar graphs. In fact, they use a vertex-weighted separator theorem in planar graphs in their proof (see Theorem 2 in [12]) to distinguish nodes of from auxiliary nodes created in their reduction.
Precisely, in the proof of Lemma 11 of [12] one can make the following modification to the start of the second paragraph: for vertices that correspond to some vertex , if then we give a weight of 1 to and a weight of 0 to and if then give all a weight of 0. The subsequent appeal to finding balanced separating cycles vertex-weighted planar graphs ensures the final UDG separator leaves each component having at most nodes from .
Proof of Theorem 11..
Consider shortest paths and by Theorem 10, partition the components of into two graphs and each containing at most vertices from .
Partition into two sets such that each (for ) has size at most vertices from : such a partition can be obtained greedily by starting with and then iteratively adding nodes from to either or while maintaining . Observe this property is true initially when and it is trivial to maintain by adding each new node to part with the fewest nodes from .
We claim that satisfies the required property. Let be such that , then it cannot be that and by Theorem 10. Without loss of generality, say . Then there exists such that . Then is a path of length 2 from to .
Proof of Lemma 12..
We begin with the following observation: If is a BFS tree from a vertex in a UDG , then for any two vertices at levels of the tree respectively (for any ) we have their (weighted) distance in is strictly larger than (or else they would be adjacent and hence cannot be at two levels of the BFS tree) and no more than 2 (as any two adjacent vertices have distance at most ), i.e. . Thus, the BFS will 2-approximate actual (weighted) shortest paths for any node that is not a neighbour of .
Now suppose we run a BFS from an arbitrary node and group the vertices into layers where layer consists of all the nodes of BFS at levels between . Note that the “thickness” of a layer is levels of BFS, so for any two vertices in layers : . Since the distance of any client to their facilities in the optimum is at most , no path from a client to a facility (in the optimum) would cross an entire layer. Now we group consecutive layers into bundles of layers, with a random off-set chosen from . Suppose we call the first layer of each bundle a “red” layer and all the other layers of a bundle are blue; so between every two “red” layers we have blue layers. We open all the facilities of in the red layers and serve the clients in their cluster. Based on the random shift and the fact that was a -approximate solution, the total cost incurred to open these facilities and serve their clients is at most . So we can assume all these facilities are open (i.e. have zero opening cost) and we delete the clients they have served. For any other client left in the red layer, they can be partitioned into two parts: those in the top levels are called top red clients and those in the bottom levels are called bottom clients. Since for each client , , the top clients cannot cross over the bottom levels to be served by a facility. Similarly, the bottom clients cannot be crossing the top levels to be served by a facility. We show how this breaks the instance into independent instances.
For the remaining (blue) layers in every bundle we consider the clients and facilities in those layers, together with the facilities and remaining clients in the nearest levels of the two red layers above and below them. Note that these instances are now independent since for every client , , so no client in a blue layer would need to pass beyond levels into a red layer to reach its facility in optimum. Similarly, the remaining red clients will only need the facilities in the blue layers they are grouped with. This means we can solve the blue layers of each bundle (together with the facilities and clients in a strip of layers above and below) independently. So we consider the blue layers of each bundle plus the (red) layers above and below as one instance (recall the facilities in the red layers are open). This means we can assume we have deleted any connections (edges) between these independent instances. This is similar to one round of chopping in the proof of Theorem 3. We perform a sequence of chopping rounds as above on the graph and utilizing Theorem 3, we can assume that the weak diameter of each independent instance generated is bounded by and the total cost paid for the facilities in the layers chopped is ; those facilities now have opening cost zero.
So from now on (at a loss of ) we focus on each independent instance where the weak diameter is bounded by . Let’s call these instances . For each such instance we use to denote the clients that belong to . It is easy to see that the ’s are disjoint. Next, we modify ’s so that they have bounded diameter (not just weak diameter): we add all the vertices of that are within distance of some vertex of to . Now for each we have that the diameter (not just weak diameter) of is bounded by . Note that the set of clients and facilities of is the same as before (we do not bring in the clients and facilities that were outside of when adding vertices to bound the diameter).
Proof of Lemma 14..
We construct a new graph by selecting a subset of vertices from , denoted as ( will be a “net”): start by adding an arbitrary vertex of to and then iteratively include nodes from that have a minimum Euclidean distance of at least from nodes in (alternatively we can think of picking a node from to be added to and then deleting all the nodes within distance 1/8 of it from iteratively). Note that after this round is done, is . Furthermore, for each pair of vertices , if there is a vertex within distance of in and a vertex within distance of in where then add both to . Note that in this case, all are edges in . This augmentation results in the graph (over ) with a total number of vertices of . We arbitrarily order the nodes of and for each node , we define as the set of nodes in that lie inside the Euclidean ball of radius centered at but outside the balls of previously processed nodes. We focus on the UDG induced by .
To see why we can focus on the net and how it helps: each client and facility in the instance is moved to its nearest point in (keeping only the cheapest facility moved to a point if multiple move there). This instance has a solution of cost more than with since each client moves at most two extra steps of distance . Conversely, given any solution to this new instance we obtain a solution in by opening the same set of facilities except at their original locations. Again, clients move an additional each when translating from the solution in to the solution in .
So the total error will be at most (whereas the cost of optimum for was at least ). Also, it is easy to see that the size of the graph , is in terms of now ().
Appendix B PTAS for Facility Location on UDG in Bounded Regions
In this section we present a PTAS for Facility Location in UDG in the special case that the point set is contained within a bounding box of size in the plane where can be regarded as a constant, i.e. prove Theorem 4.
Consider an instance of Facility Location consisting of an edge-weighted graph , a set of clients , and a set of facilities with opening costs (for each ). Let be the facilities in an optimum solution and let denote the facility that serves the client in . Suppose we know (this assumption will be removed) and let be the error parameter. We greedily form an -net as follows:
-
Sort the facilities in by their opening costs in non-decreasing order.
-
In this order, while there is some such that , add to .
This procedure ensures that the resulting forms an -net, where no facility in is at a distance greater than from .
Claim 15.
.
Proof.
The balls of radius centered at each point in are interior-disjoint. These balls collectively occupy a total area of . The proof follows from the fact that the balls are entirely contained within a square of side length .
Now we divide the instance into sub-instances using a random grid of size that splits the bounding box into squares of size at most . Note for any two points and lying in the same cell. As a result, the metric between points within a cell can be treated as Euclidean distance. For any client , we say is cut if and lie in different grid cells. Let .
Claim 16.
For any point , .
Proof.
This is obvious if , so let’s assume . The probability that a horizontal line in the grid separates points is at most . Same when considering vertical lines in the random grid. Since , then the centre serving has a direct connection with so as well.
For each grid cell , let be the restriction of the points in the input to cell . Recall that, in the prize-collecting version of the facility location problem (Prize Collecting Facility Location), in addition to the input for facility location, each client is associated with a penalty cost . This penalty cost can be paid instead of the connection cost. The goal is to find an optimal solution that minimizes the total cost, including opening costs and both connection costs and penalties. Define an Euclidean Prize Collecting Facility Location instance for each cell . For , its penalty is . Let be the optimum facilities in cell that are not part of the net.
Claim 17.
The optimum Prize Collecting Facility Location solution for this instance has cost at most
Proof.
Consider the solution that opens . If a point is not cut, we can directly connect it to its centre in paying a cost of (since the cell has dimensions then the direct connection is possible).
Otherwise, we can pay the penalty for . Note this is upper bounded by moving to its optimum centre in (paying ) and then from there to the nearest net point (paying an additional ).
Proof of Theorem 4.
Consider the following algorithm:
-
1.
For all possible choices of with do
-
Partition the instance into sub-instances using a random grid of size . Let be the corresponding cells.
-
For each run a PTAS on the corresponding Euclidean Prize Collecting Facility Location instance [6].
-
Obtain a solution for the facility location instance: open all facilities in set , as well as the facilities opened by PTASs in each cell. For every that paid a penalty in its corresponding Prize Collecting Facility Location instance, assign to its nearest (in the UDG metric) facility in .
-
-
2.
Output a minimum cost solution, among the solutions obtained.
It is sufficient to show that -net satisfies the claim. Using the previous claims, we obtain the following. The cost of opening all facilities in -net plus expected total cost of all Facility Location solutions for all cells is at most:
using the fact that for each client , we always pay and, perhaps, an additional if is cut. But is cut with probability at most .
