Approximation Algorithms for Minimum-Load k-Facility Location

We consider a facility-location problem that abstracts settings where the cost of serving the clients assigned to a facility is incurred by the facility. Formally, we consider the minimum-load k-facility location (MLkFL) problem, which is defined as follows. We have a set F of facilities, a set C of clients, and an integer k≥ 0. Assigning client j to a facility f incurs a connection cost d(f,j). The goal is to open a set F⊆ F of k facilities and assign each client j to a facility f(j)∈ F so as to minimize maxf∈ F∑ j∈ C:f(j)=fd(f,j); we call ∑ j∈ C:f(j)=fd(f,j) the load of facility f. This problem was studied under the name of min-max star cover in References [3, 7], who (among other results) gave bicriteria approximation algorithms for MLkFL for when F=C. MLk FL is rather poorly understood, and only an O(k)-approximation is currently known for MLkFL, even for line metrics. Our main result is the first polytime approximation scheme (PTAS) for MLkFL on line metrics (note that no non-trivial true approximation of any kind was known for this metric). Complementing this, we prove that MLkFL is strongly NP-hard on line metrics. We also devise a quasi-PTAS for MLkFL on tree metrics. MLkFL turns out to be surprisingly challenging even on line metrics and resilient to attack by a variety of techniques that have been successfully applied to facility-location problems. For instance, we show that (a) even a configuration-style LP-relaxation has a bad integrality gap and (b) a multi-swap k-median style local-search heuristic has a bad locality gap. Thus, we need to devise various novel techniques to attack MLkFL. Our PTAS for line metrics consists of two main ingredients. First, we prove that there always exists a near-optimal solution possessing some nice structural properties. A novel aspect of this proof is that we first move to a mixed-integer LP (MILP) encoding of the problem and argue that a MILP-solution minimizing a certain potential function possesses the desired structure and then use a rounding algorithm for the generalized-assignment problem to “transfer” this structure to the rounded integer solution. Complementing this, we show that these structural properties enable one to find such a structured solution via dynamic programming.


INTRODUCTION
Facility-location (FL) problems have been widely studied in the Operations Research and Computer Science communities (see, e.g., References [17] and the survey [20]) and have a wide range of applications. These problems are typically described in terms of an underlying set of clients that require service and a candidate set of facilities that provide service to these clients. The goal is to determine which facilities to open and decide how to assign clients to open facilities to minimize some combination of the facility-opening and client-connection (service) costs. An oft-cited prototypical example is that of a company wanting to decide where to locate its warehouses/distribution centers so as to serve its customers in a cost-effective manner.
We consider settings where the cost of serving the clients assigned to a facility is incurred by the facility; for instance, in the above example, each warehouse may be responsible for supplying its clients and, consequently, bears a cost equal to the total cost of servicing its clients. In such settings, it is natural to consider the problem of minimizing the maximum cost borne by any facility. Formalizing this, we consider the following mathematical model: We have a set F of facilities and a set C of n clients. Assigning client j to a facility f incurs a connection cost or service cost d ( f , j). There are no facility-opening costs. The goal is to open k facilities from F and assign each client j to an open facility f (j) so as to minimize the maximum load of an open facility, where the load of an open facility f is defined to be j ∈C:f (j )=f d ( f , j); that is, the load of f is the total connection cost incurred in serving the clients assigned to it. We call this the minimum-load k-facility location (MLkFL) problem. As is common in the study of facility-location problems, we assume that the clients and facilities lie in a common metric space, so the d is a metric on F ∪ C.
Despite the extensive amount of literature on facility-location problems, there is surprisingly little amount of work on MLkFL and it remains a rather poorly understood problem (see Reference [19]). One can infer that the problem is NP-hard, even when the set of open facilities is fixed, via a reduction from the makespan-minimization problem on parallel machines and that an O (k )approximation can be obtained by running any of the various O (1)-approximation algorithms for k-median [4,5,11,12,16] (where one seeks to minimize the sum of the facility loads). No better approximation algorithms are known for MLkFL even on line metrics, and this was mentioned as an open problem in Reference [19]. The only works on approximation algorithms for this problem that we are aware of are due to Even et al. [7] and Arkin et al. [3], both of which refer to this problem as min-max star cover (where F = C). 1 Both works obtain bicriteria approximation 2 When F = C, it is easy to obtain a bicriteria guarantee for MLk FL using an α -approximation algorithm A for k -median, as follows. Letting L opt denote the optimal cost, by running A we obtain k facilities with total load at most kα L opt . Fix ϵ ≥ 0. For each facility i with load L i ≥ α (1 + ϵ )L opt , we can partition its clients into at most 2L i α (1+ϵ )L opt groups, each creating a load of at most α (1 + ϵ )L opt at i; we can open a facility at the location in a group closest to i to handle each group incurring a load of at most 2α (1 + ϵ )L opt . This creates at most k (1 + 2 1+ϵ ) facilities, each having load at most 2α (1 + ϵ )L opt . 3 In contrast, for line and tree metrics, integer (and, hence, rational) demands can be handled by creating colocated copies of a client. Although, done naively, this creates a pseudpolynomial blow-up in the input size, this reduction can often be simulated without explicitly incurring this blowup. This is true of our PTAS and QPTAS for line and tree metrics; we therefore focus on unit demands in these settings for (mostly notational) simplicity.
to the difficulty in leveraging such LP-based insights. In fact, we do not know of any LP-relaxation for MLkFL with a constant integrality gap even on line metrics. An approach that often comes to the rescue for FL problems when there is no known good LP-relaxation (e.g., capacitated FL) is local search; however, the min-max nature of MLkFL makes it difficult to exploit this. In particular, one can come up with simple examples where a k-median style multi-swap local-search algorithm does not yield any bounded approximation ratio even for line metrics; see Section 7. (This is also true for the (uncapacitated) k-center problem, which is another min-max problem, and we are not aware of any local-search-based algorithms for k-center.) It appears the standard methods that have been successfully applied to solve some classical clustering problems (such as variants of facility location, k-median, etc.) do not work for MLkFL, and one needs to find new venues of attack for min-load k-FL. Our PTAS for line metrics consists of two main ingredients. First, we prove that there always exists a near-optimal solution possessing some nice structural properties (Section 3.1). Second, we show in Section 3.2 that these structural properties enable one to find such a structured solution via dynamic programming (DP).
We now give a high-level overview of these two ingredients. First, we show that, for any ϵ > 0, there is a (1 + O (ϵ ))-approximate solution, where the intervals corresponding to the client-facility assignments with "small" connection costs, which we call small arms, form a laminar family. It appears difficult to argue this directly. However, a key insight and a notable aspect of our proof is that one can derive this structure by moving to a mixed-integer LP (MILP), arguing that an MILP solution satisfying this laminarity property must exist and then utilizing a suitable rounding algorithm to "transfer" the laminarity property from the MILP solution to the rounded integral solution. More precisely, we show using uncrossing arguments that an MILP-solution minimizing a certain potential function (which depends on the optimal solution) must satisfy the above laminarity property. Further, we observe that the fractional assignment of clients to integrally open facilities represented by the MILP solution can be converted to an integral one using the rounding algorithm of Reference [21] for the generalized assignment problem (GAP), and this rounding procedure preserves the laminarity property.
Our next step is to exploit the above structure to efficiently find a solution satisfying these structural properties via DP. To gain some intuition, note that if all arms form a laminar family, then one could use DP to identify tuples (I , c, r ), where I is a maximal interval such that all clients in I are either served by facilities opened from I or from c (which is a center outside I ) and the load imposed by clients from I that are assigned to c is at most r . However, our setting is somewhat more complicated since only the small arms form a laminar family. Our definition of small arms, however, ensures that the number of "big," i.e., not small, arms incident to a facility is at most a constant (depending on ϵ). Exploiting this, we show that, despite this complication, a DP scheme is still possible if we maintain some extra information in the DP table corresponding to the big arms that "cross" each interval (see Section 3.2). This yields the desired PTAS.

Related work.
There is a wealth of literature on facility-location problems (see, e.g., References [17,20]); we limit ourselves to the work that is relevant to MLkFL. As mentioned earlier, Even et al. [7] and Arkin et al. [3] are the only two previous works that study MLkFL (under the name minmax star cover). They view the problem as one where we seek to cover the nodes of a graph by stars (hence the name min-max star cover) and obtain bicriteria guarantees. Viewed from this perspective, MLkFL falls into the class of problems where we seek to cover nodes of an underlying graph using certain combinatorial objects. Even et al. and Arkin et al. consider various other min-max problems-where the number of covering objects is fixed and we seek to minimize the maximum cost of an object-in this genre. Both works devise a 4-approximation algorithm when the covering objects are trees (see also Reference [18]), and Even et al. obtain the same approximation for the rooted problem where the roots of the trees are fixed. Arkin et al. obtain an O (1)-approximation when the covering objects are paths or walks. The approximation guarantees for min-max tree cover were improved by Khani and Salavatipour [14]. All of these works also consider the version of the problem where we fix the maximum cost of a covering object and seek to minimize the number of covering objects used. Frederickson et al. [8] obtain an (e + 1)-approximation when the covering objects are tours rooted at a given node.
For MLkFL on star metrics, when F = C, certain results follow from some known results and the above min-max results. For example, it is not hard to show that MLkFL, even with non-unit demands, can be reduced to the makespan-minimization problem on parallel machines while losing a factor of 2. 4 Since the latter problem admits a PTAS [10], this yields a (2 + ϵ )-approximation algorithm for MLkFL on star metrics when F = C. When F = C and with unit demands, one can also infer that (for star metrics) the objective value of any solution for min-max tree cover (viewed in terms of the node-sets of the trees) is within a constant factor of its objective value for minmax star cover. (This is simply because for any set S of nodes, the cost of the best star spanning S is at most twice the cost of the minimum spanning tree for S.) These correspondences however break down when F C, even for unit demands. Our 12-approximation algorithm for star metrics works for arbitrary F , C sets and non-unit (equivalently, non-uniform) demands.
As with the k-median and k-center problems, MLkFL can also be motivated and viewed as a clustering problem: We seek to cluster points in a metric space around k centers in a way that minimizes the maximum load (or "star cost") of a cluster. Whereas MLkFL and k-center are minmax clustering problems, where the quality is measured by the maximum cost (under some metric) of a cluster, k-median is a min-sum clustering problem, where the clustering quality is measured by summing the cost of each cluster.
Finally, observe that if we fix the set of k open facilities, then the problem of determining the client assignments is a special case of GAP. There is a well-known 2-approximation algorithm for GAP [21]. As noted earlier, this algorithm plays a role in the analysis of our PTAS for line metrics (but not the algorithm itself), when we reason about the existence of well-structured near-optimal solutions.

PROBLEM DEFINITION
In the minimum-load k-facility location (MLkFL) problem, we are given a set of clients C and a set of facilities F in a given metric space d. The distance between any pair of points i, j ∈ C ∪ F is denoted by d (i, j). Additionally, we are given an integer k ≥ 1. The goal is to select k facilities f 1 , . . . , f k to open and assign each client j to an open facility so as to minimize is the facility to which client j is assigned. We use the terms facility and center interchangeably. We frequently use the term star to refer to a pair ( f , S ), where f is an open facility in the solution and S ⊆ C is the collection of clients assigned to f ; we also refer to f as the center of this star. The cost of this star, which is the load of facility f , is j ∈S d ( f , j). Thus, our goal is to find k stars, ( f 1 , S 1 ), ( f 2 , S 2 ), . . . , ( f k , S k ), centered at facilities so that they "cover" all the clients (i.e. C = ∪ k i=1 S i ) and the maximum load of a facility (or cost of the star) is minimized. Throughout, we use OPT to denote an optimum solution and L opt to denote its cost.

A PTAS FOR LINE METRICS
In this section, we focus on MLkFL on line metrics and present a PTAS for it. Here, each client/facility i ∈ C ∪ F is located at some rational point v i ∈ R. It may be that v i = v j for i j, for instance, when we have collocated clients. (As noted earlier, we focus on unit demands for simplicity; but, with a little work, one can handle also integer demands by viewing these implicitly as colocated clients.) To simplify notation we use the term "point" to refer to a client or facility i ∈ C ∪ F as well as to its location v i . The distance d (i, j) between points i, j ∈ C ∪ F is simply |v i − v j |. We assume that |C ∪ F | = n and that 0 ≤ v 1 ≤ v 2 ≤ · · · ≤ v n . For a star ( f , S ) in a MLkFL solution and for any j ∈ S, say that the open interval with endpoints v f and v j is an arm of the star ( f , S ) and we say that f covers j. For S ⊆ S, we sometimes use the phrase "load of f by S " to refer to the sum of the lengths of arms of f to the clients in S . The main result of this section is the following theorem. Theorem 3.1. There is a (1 + ε)-approximation algorithm for MLkFL on line metrics for any constant ε > 0.
Our high-level approach is similar to other min-max problems. Namely, we present an algorithm that, given a guess B on the optimum solution value, either certify that B < L opt or else find a solution with cost not much more than B. Our main technical result, which immediately yields Theorem 3.1, is the following. In both Theorems 3.1 and 3.2 we assume ε < 1 and ϵ < 1 without loss of generality. Proof of Theorem 3.1 : Set ϵ := ε/18. We use binary search to find a value B ≤ L opt such that algorithm A from Theorem 3.2 finds a solution with cost ≤ (1 + 18ϵ ) · B ≤ (1 + 18ϵ ) · L opt . Return this solution.
Since the points v i are rational and since n · v n is clearly an upper bound on the optimum solution, then we may perform the binary search over integers α ∈ [0, nv n Δ] where Δ is such that v i Δ ∈ Z for each point i. For each such value α in the binary search, we try algorithm A with value B = α Δ . In what follows, we describe algorithm A. We will assume that B ≥ L opt and show how to find a solution with cost at most (1 + 18ϵ ) · B. Let S B denote a collection of stars {( f 1 , S 1 ), . . . , ( f k , S k )} with cost at most B. In the remainder of this section, we will describe some preprocessing steps that simplify the structure of the problem. In Section 3.1 we prove that a well-structured nearoptimum solution exists and in Section 3.2 we describe a dynamic programming algorithm that finds such a near-optimum well-structured solution.
Without loss of generality, we assume that 1/ϵ is an integer. We start with some preprocessing steps. Note that d (j, f ) ≤ B for any j ∈ S of a star ( f , S ) in S B . So if the distance of two consecutive points on the line is more than B, then we can decompose the instance into some instances that the distance of any two consecutive points is at most B. For each of the resulting instances Π , we find the smallest k such that running the subsequent algorithm on the instance with k instead of k finds a solution with cost at most (1 + 18ϵ )B. Since we are assuming B ≥ L opt , then the sum of these k values over the subinstances is at most k. Note that in each subinstance Π we can assume 0 ≤ v i ≤ n · B for each point v i .
Next, we perform a standard scaling of distances. Move every point i ∈ C ∪ F left to its nearest integer multiple of ϵ B n and then multiply this new point by n ϵ B . That is, move i from v i to Proof. After sliding each point v i left to its nearest integer multiple of ϵ B n , the distance between any two points changes by at most ϵ B n . Therefore, the load of any star changes by at most ϵB so each star has load at most (1 + ϵ )B. Finally, after multiplying all points by n ϵ B , we have that the maximum load of any star is at most (1 + 1/ϵ ) · n.
Now consider any solution with cost at most (1 + α · ϵ ) · (1 + 1/ϵ ) · n. Scaling the points v back by ϵB/n produces a solution with cost at most Then sliding any two points i, j back to their original positions v i , v j changes their distance by at most ϵB/n, so doing this for all points changes the cost of any star by at most ϵB. The resulting stars then have cost at most (1 + (2 + 2α )ϵ ) · B.
In subsequent sections, we describe a (1 + 8ϵ )-approximation for any one of the subinstances Π of Π, except we use the new points v i . By Lemma 3.3, this gives us a solution to Π with cost at most (1 + 18ϵ )B, proving Theorem 3.2 To simplify notation, we use v i to refer to the new location of point i ∈ C ∪ F (i.e., rename v i to v i ). Similarly, the notation d (i, j) for i, j ∈ C ∪ F refers to these new distances |v i − v j | and B denotes the new budget (1 + 1/ϵ ) · n. From now on, we assume our given instance Π of MLkFL satisfies the following properties: • Each point v i is an integer between 0 and (1 + 1/ϵ ) · n 2 (which is n times the new value of B). • There is a solution S B with cost at most B = (1 + 1 ϵ )n.

Structure of Near-Optimum Solutions
In this section, we show that there is a near-optimum solution to the instance Π with clients and facilities C ∪ F that has some suitable structural properties. In Section 3.2, we will find such a solution using a dynamic programming approach.
We denote the open interval between two points v i and v j on the line by I i, j and call this the arm between i and j (assuming that one of i, j is a client and the other is a facility). An arm I i, j is large if d (i, j) > ϵB and is small otherwise. We say that two arms I i, j and I i , j cross if I i, j is not contained in I i , j or vice versa and I i, j ∩ I i , j ∅. A well-formed solution for an MLkFL instance is a solution in which the small arms between clients and their assigned facilities (centers) do not cross. We show that there exists a low-cost well-formed solution in two steps. First, we demonstrate the existence of a fractional solution where there are k (integral) facilities and the clients are assigned to these centers fractionally. This will be such that the fractional load of each facility is still at most B, all strictly fractional arms in the support have length at most 2ϵB, and that all small arms in the support of the solution do not cross.
Second, we use a rounding algorithm for the Generalized Assignment Problem (GAP) by Shmoys and Tardos [21] to round such a fractional solution to an integral solution with cost at most (1 + 2ϵ )B. We emphasize that this rounding algorithm is not a part of our algorithm; it is only used to demonstrate the existence of a well-structured solution.
For the first step, we will consider a fractional uncrossing argument to eliminate crossings. Instead of proving the fractional uncrossing process eventually terminates, we will instead provide a potential function that strictly decreases in a fractional uncrossing. This potential function is the objective function of a mixed integer-linear program below; thus an optimal solution will not contain any crossings between small arms in its support.
We let C B = { f 1 , . . . , f k } denote the centers (facilities) of the stars in the solution S B (recall that each star in C B has cost/load at most B). The variable x i j indicates that client j is assigned to facility f i ∈ C B . The first constraint ensures every client is assigned to some facility and the second ensures the cost of a star (i.e., load of a facility) does not exceed B, We stress that this is not a relaxation for MLkFL. The objective function is more similar to the objective function for the k-median problem. Rather, we will only be using this to help demonstrate the existence of a well-formed solution. The objective function acts as a potential function. Proof. First observe that there is in fact a feasible solution x because the integer solution S B is feasible for this ILP. By standard theory of mixed-integer programming and the fact that the set of feasible solutions is bounded, there is then an optimal solution x. The rest of this proof shows that an optimal solution to (MIP) cannot contain crossings between small arms in its support.
So suppose x is a feasible solution such that two small arms I i, j and I i , j in the support of x cross. To simplify notation, let and v 2 = v j be the locations of the clients j, j . Also let x 1 denote x i j and x 2 denote x i j . That is, x 1 is the extent to which the client at location v 1 is assigned to the center at location c 1 and similarly for x 2 . Finally, let 1 = |c 1 − v 1 | and 2 = |c 2 − v 2 | denote the lengths of the two crossing small arms. See Figure 1 for an illustration of how this notation is used; the notation will be defined separately for each case considered below.
We check all possible ways that these two arms can cross. When we say that we shift some value α of coverage from one variable x to another x , we mean increase x by α and decrease x by α. Note that we will always shift value between the and d ( f i , j) ≤ 2ϵB so such an uncrossing will maintain the constraint that only arms of length at most 2ϵB may be fractional.
(1) v 1 and v 2 lie between c 1 and c 2 (Figure 1(a)). Let > 0 be the length of intersecting parts of these arms. Without loss of generality, assume that x 1 ≤ x 2 . Shift x 1 coverage from x i j to x i j and from x i j to x i j and note that this preserves feasibility, since each client is still covered (fractionally) to the extent of 1. The coverage x after this shifting is depicted in Figure 1(b). The total cost of the two stars (centered at c 1 and c 2 ) decreases by 2 x 1 > 0, so the objective function strictly decreases and we are left with an even cheaper solution to (MIP). (2) v 1 and v 2 are on different sides of the segment c 1 c 2 ( Figure 1(c)). Let > 0 be the distance between c 1 and c 2 . Without loss of generality, assume that We verify that the fractional load at each center c 1 and c 2 does not increase. That is, the load at c 1 changes by < 0, and the load at c 2 changes by Since the load at both facilities strictly decreases then this is also yields a cheaper solution to (MIP).
(3) v 1 and v 2 are on the same side of the segment c 1 c 2 (Figure 1(e)). Let > 0 be the distance between c 1 and c 2 . Without loss of generality, assume that v 1 and v 2 are on the right side of segment c 1 and c 2 and the left center is c 1 . This means v 1 is between c 2 and v 2 and, hence, There are two sub-cases: This means x 1 1 Figure 1(f)). The fractional load at s 1 changes by x 1 1 + 2 ( + 2 ) − x 1 1 = 0 and the fractional load at s 2 changes by Since the total load strictly decreases, then this is also yields a cheaper solution to (MIP).  (Figure 1(g)). The fractional load at s 1 changes by x 2 ( + 2 ) − x 2 + 2 1 1 = 0 and the fractional load at s 2 changes by Since the total load strictly decreases, then this is also yields a cheaper solution to (MIP).
In all these cases, the new solution is feasible and has a smaller objective values as required.
We will use Lemma 3.4 to prove the existence of a near-optimum solution to instance Π where the small arms used by clients do not cross. To complete this proof, we rely on a structural result concerning the polytope of a relaxation for the following scheduling problem.
Definition 3.5. In the scheduling problem on unrelated machines, we are given machines m 1 , . . . ,m k , jobs j 1 , . . . , j n and processing times p(m i , j a ) ≥ 0 for any job j a and any machine m i . The goal is to assign each job j a to a machine ϕ (j a ) ∈ {m 1 , . . . ,m k } to minimize the maximum total running time a:ϕ (j a )=m i p(m i , j a ) of any machine.
Shmoys and Tardos [21] prove a result concerning the polytope of an LP relaxation for this problem, as a part of a more general result concerning the related Generalized Assignment Problem (GAP). The following summarizes the results they obtain that are relevant for our work. Theorem 3.6 (Shmoys and Tardos, [21]). Suppose we have a bound B and fractional values x (m i , j a ) ≥ 0 for each job j a and each machine m i that satisfy the following: Then there is an assignment ϕ of jobs to machines such that x (ϕ (j a ), j a ) > 0 for each job j a and the maximum load of any machine under ϕ is at most B + max a,i:0<x We use the above theorem together with Lemma 3.4 to prove the following. Theorem 3.7. There is a feasible (integer) solution to the MLkFL instance Π with maximum load (1 + 2ϵ )B on each star such that no two small arms cross.
Proof. Let x * be the fractional solution provided by Lemma 3.4. We view x * as a solution to the following scheduling problem on unrelated machines. We have k machines m 1 , . . . ,m k , each corresponding to a facility f i ∈ C B . For each client a ∈ C, there is a single job j a . The processing time p(m i , j a ) of job j a on machine m i is |v i − v a |, the distance between the corresponding locations.
Now, x * fractionally assigns each job j a to the machines to a total extent of 1 and the maximum (fractional) load at machine m i is B. Furthermore, the only strictly fractional assignments (i.e. those with 0 < x i j < 1) have |v i − v j | ≤ 2ϵB. In the scheduling terminology, the only strictly fractional assignments are between a job j a and a machine m i such that p(m i , j a ) ≤ 2ϵB. Theorem 3.6 shows we can transform this fractional assignment x * into an integer assignment such that (a) if client j is assigned to facility/center i, then x * i j > 0 and (b) the maximum load of a facility is B In this solution, small arms used by clients do not cross because they come from the support of x * .

3.2.1
Step Min-max Cost. Theorem 3.7 shows that there is a solution of cost at most (1 + 2ϵ )B such that no two small arms (i.e., length ≤ ϵB) used to assign clients to centers cross. Call this solution S B . We now show that we can find such a well-structured solution of cost at most (1 + 8ϵ )B.
The main idea behind our approach is the following. If it were true that a near-optimum solution did not have any crossing arms (large or small), then we can find such a solution using a dynamic programming approach. At a very high-level, we could exploit the laminar structure of the solution by decomposing the solution into a family of nested intervals I such that for every I ∈ I there is one center c with v c I such that clients in I are served either by centers in I or by c. From this, we can consider triples (I , c, r ), where I ∈ I, c is a location outside of I and r is some integer between 0 and poly(n, 1/ϵ ) describing the load assigned to c from clients in I . We can look for partial solutions parameterized by these triples and relate them through an appropriate recurrence.
Unfortunately, we are only guaranteed that the small arms do not cross in our near-optimum solution so the collection of all arms in the solution is not necessarily laminar. To handle this general case, we must carry extra information about large arms through our dynamic programming approach and keep track of how many large arms of each possible length cross an interval. Since each large arm has a length between ϵB and B, if we store the exact length, then this amounts to storing a vector with roughly B coordinates. There are exponentially many vectors with B coordinates that the values in different coordinates sum up to at most n, so we cannot keep track of this information. However, by coarsening the length, we ensure that large arms have only a constant possible number of different perceived length, so we can keep track of this information when we move to perceived length.
First, recall that all large arms have length more than ϵB. Thus, each facility is serving at most ≤ 3 ϵ clients that are at distance more than ϵB from it; in other words, each facility is assigned at most 3 ϵ large arms in the solution provided by Theorem 3.7. Since large arms have length at least ϵB, if we store their length in multiple of ϵ 2 B, then we will not lose much information about them. Let MULT(i, j) denote the number of multiples of Then we measure the the length of a large arm I i j between client j and facility i as We call this the perceived cost of this arm. In this method of measurement, the length of the arm changes by at most ϵ 2 B, and so the total load for each center changes by at most 3ϵB. Now for this method of measurement, since there are 1 ϵ 2 coordinates, the number of possible vectors for keeping track of large arms is at most n 1 ϵ 2 which is polynomial. In the dynamic programming algorithm described below, we will use this coarse method to measure the distance of large arms. Since in dynamic programming approach the problem is often broken into subproblem, a large arm might be partitioned with respect to subproblems. We would like to note that for points q=0 MULT(p q , p q+1 ) (Note that MULT(a, a) = 0 for any a.). We use the term the perceived cost of a facility i to denote the the total cost of the small arms plus the perceived cost of its large arms. The following is proved using arguments similar to the proof of Lemma 3.3, recalling that every facility in S B has at most 3/ϵ large arms.
Lemma 3.8. The perceived cost of every star in S B is at most (1 + 5ϵ )B. Furthermore, any star with perceived cost at most (1 + 5ϵ )B and at most 3/ϵ long arms has (actual) cost at most (1 + 8ϵ )B.
Our dynamic programming algorithm will find a solution with perceived cost at most (1 + 5ϵ )B and at most 3/ϵ large arms per star, so the actual cost will be at most (1 + 8ϵ )B.

Dynamic
Programming. Before we formally define the subproblems of dynamic programming, we discuss the structure of a well-formed solution, say, S. We call a client covered by a small (large) arm a small client (large client), respectively. It will be convenient to associate a direction with each arm, which goes from the center/facility to the client. For a star S with center f , let S small denote the clients covered by small arms in S. Let the s-span of S be the open interval, possibly empty, spanning (l S , r S ), where are the leftmost and the rightmost small clients (or facility) in this star, respectively. Since the small arms do not intersect in S, for any two s-spans I 1 and I 2 of two stars, either I 1 ∩ I 2 = ∅ or I 1 ⊆ I 2 or I 2 ⊆ I 1 . Therefore, the ⊆ relation between s-span of stars in S defines a laminar family. A laminar family can naturally be viewed as a forest where we put a node for each member of the family and each node I has an edge to the minimal member, say, I , of the family that contains it, i.e., I ⊆ I . If a facility f does not serve any client by a small arm, then its s-span is (v f , v f ) by definition and although this is an empty interval, in the forest corresponding to laminar family, then it has an edge to the minimal s-span (interval) that contains v f . We frequently refer to this forest-view when referring to a laminar family.
Let us try to understand the subproblems that come up in constructing a solution. The dynamic programming in fact stitches together the solutions of these subproblems to find the solution to the original problem. For indices 1 ≤ l ≤ r ≤ n, let V l,r denote the set of points {v l , v l +1 , . . . ,v r } with l ≤ r , so V = V 1,n . We want to use the dynamic programming to answer the question if it is possible to open k centers in V 1,n and assign clients to these centers such that the perceived cost of each center is at most (1 + 5ϵ )B and the s-span of the centers form a laminar family. Now consider solution S B which has a maximum perceived load of (1 + 5ϵ )B. Consider the forest corresponding to the s-span of centers in this solution and let c be the center that its s-span is the leftmost root in the forest. We can guess c as there are at most O (n) possible choices for it. Let k r and k l denote the number of centers in S B which are on the right and left side of c, respectively, so k r + k l = k − 1. There are O (k ) possible choice for values k r and k l , so we guess k r and k l as well (note that k is dominated by n). Let us now focus on the interval V 1,c−1 . Since c corresponds to the interval at the root of the forest, no center in V c+1,n can serve clients in V 1,c−1 by small arms (otherwise the s-span of c is a subset of s-span of some other center). We can guess the load corresponding to small clients served by c in V 1,c−1 , as this load is an integer in poly(n, 1/ϵ ). We may have some clients in V 1,c−1 served by large arms originating in V c,n . We cannot guess the large arms serving these clients as the number of possible such arms is not polynomial (there are O (n k n 3/ϵ ) possible choices), but instead we can bundle large arms entering V 1,c−1 based on their perceived length past v c−1 , that is, their perceived length in V 1,c−1 interval. More precisely, we guess how many large arms have perceived length q × ϵ 2 B past v c−1 for each 0 ≤ q ≤ 1 ϵ 2 as there are O (n 1 ϵ 2 ) possible choices for our guess. Similarly, we may have some large arms originating in the interval V 1,c−1 which serve clients in V c+1,n . Again we can bundle these arms based on their perceived length past v c+1 (their perceived length in V c+1,n ) and we guess the number of large arms with perceived length q × ϵ 2 B past v c+1 for each 0 ≤ q ≤ 1 e 2 . This give us an idea of what parameters are needed for describing the subproblems. Now consider some interval V l,r between two arbitrary points v l , v r and consider how S B looks in this interval. There may be some large arms that enter and/or leave this interval from v l or v r . The arms that enter the interval can cover the deficiency of coverage for some clients in the interval and the arms that leave the interval provide coverage for some clients outside of interval and can be viewed as surplus to the demand of coverage of the clients in the interval. We keep track of all large arms crossing the sides of interval V l,r in terms of deficiency and surplus vectors as follows: • Deficiencies: Vector D l is the deficiency vector in [n] 1 ϵ 2 for v l , where D l [q] for 0 ≤ q ≤ ϵ −2 is the number of large arms from a center location i < l to a client j ≥ l such that MULT(l, j) = q. So D l [q] keeps track of the number of large arms originating in i < l and crossing exactly q multiples of ϵ 2 B in interval (v l , v j ]. Note that client j ≥ l can be located outside of interval V l,r , i.e., j > r as well. The vector D r is defined similarly for v r , that is, D r [q] is the number of large arms from a center location i > r to a client j ≤ r such that MULT(j, r) = q. Let q = MULT(l, r), then all arms represented by D l [q] for q < q must end at clients located in V l,r and all arms represented by D l [q] for q > q must end at clients located to the right of V l,r . If q = q , then some arms may end at clients in V l,r and some may end at clients located to the right of V l,r .
• Surpluses: Vector S l is the surplus vector in [n] 1 ϵ 2 for v l , where S l [q] for 0 ≤ q ≤ ϵ −2 is the number of large arms from a center location i ≥ l to a client j < l such that MULT(j, l) = q. So S l [q] keeps track of large arms originating at i ≥ l and crossing exactly q multiples of Note that center i ≥ l can be larger than r and does not need to be located in the interval V l,r . The vector S r is defined similarly, that is, S r [q] is the number of large arms from a center location i ≤ r to a client j > r such that MULT(r, j) = q. Recall that q = MULT(l, r). Note that for q > q , any arm contributing to D l [q] also contributes to S r [q − q ] and, similarly, any arm contributing to D r [q] also contributes to S l [q − q ].

The Table
The table we build in our dynamic programming algorithm captures "snapshots" of solutions bound between two given points plus some information on how arms cross these points. We consider Boolean value A[k , l, r , c, β, D l , D r , S l , S r ] corresponding to subproblems. The meanings of the parameters are as follows: • 0 ≤ k ≤ k is the number of centers in the interval V l,r . • 1 ≤ l ≤ r ≤ n corresponds to the interval V l,r .
• c ∈ F denotes a single point with either c < l or c > r (i.e. outside of V l,r ), that is the center of some star, or else c = ⊥. If c ⊥, then it is supposed to be the only center outside of V l,r with small arms going into V l,r and the total cost of small arms that c pays to cover clients i with l ≤ i ≤ r is β where 0 ≤ β ≤ (1 + 5ϵ )B is an integer. • D l , D r , S l , S r are deficiency and surplus vectors for the endpoints of interval V l,r .
Note that in the above, if c = ⊥, then the value of β can be assumed to be zero.
The and also assign the start of some of the large arms exiting the interval to these open centers in V l,r such that the following hold: • The perceived load of each of the k centers is at most (1 + 5ϵ )B, • The load of c from small arms originating from clients j ∈ C with l ≤ j ≤ r is β, • The large arms entering and/or exiting V l,r are consistent with D l , D r , S l , S r .
By consistent, we mean the following. For each interval [a, b] where a and b are consecutive multiples of ϵ 2 B, we check the number of promised clients to be served by D l , D r , S l , S r in this interval (depending on position of a and b with respect to l and r , some of these vectors may not give any information). More precisely, when b ≥ r , each of D l [q 1 ] and S r [q 2 ], where q 1 = MULT(l, a) and q 2 = MULT(r, a) gives the number of clients in [a, b) that are supposed to be served by centers i ≤ l and i ≤ r , respectively. Now since l ≤ r , any large arm counted in D l [q 1 ] must be counted in S r [q 2 ], i.e., S r [q 2 ] ≥ D l [q 1 ], and we must have at least S r [q 2 ] clients in [a, b). Note that S r [q 2 ] − D l [q 1 ] correspond to long arms that start in interval V l,r . Similar arguments must be made for when a ≤ l and D r and S l . For intervals containing l or r , we have to subdivide the interval, e. g., [a, r ) and [r , b), and then check the consistency.
The number of table entries is polynomial, because k , l, r , c are in O (n) and β is a polynomial in n and 1 ϵ and the deficiency and surplus vectors, in total, can take one of O (n 1+1/ϵ 2 ) values. We shortly explain how one can compute the values A in polynomial time through dynamic programming. After that, to find out if there is a feasible solution having perceived cost (1 + 5ϵ )B, one simply needs to look at the value of A[k, 1, n, ⊥, 0, 0, 0, 0, 0], where 0 is a vector having 1 + 1/ϵ 2 zero components.

The Recurrence
In the remaining of the section, we explain how the value of a table entry is calculated. We call a subproblem feasible if A[k , l, r , c, β, D l , D r , S l , S r ] is True.
Base case. The base case is when k = 0 and l = r . Since k = 0, we are not allowed to open a facility at v l (l might not be a facility anyway). There are two possible main cases based on whether c can serve a possible client at r or not.
If r is a client, then it has to be served by a large arm coming from outside of V l,r . If r is a facility, then it cannot be opened as k = 0. So we only check consistency of D l , D r , S l , S r , i.e., D l = S r and D r = S l . If these conditions hold, then we set the value of the table entry to True otherwise we set it to False.
If r is a client, then it has to be served by a small arm originating at c, so β must be equal to d (v j , v c ) and it must be smaller than ϵB. The vectors D l , D r , S l , S r must be consistent, i.e., D l = S r and D r = S l . If r is a facility, then the subproblem is infeasible by the definition.
Recursive step. Next, we show how to determine if A[k , l, r , c, β, D l , D r , S l , S r ] is True when the parameters do not represent a base case by relating its value to values of smaller problems. In what follows, by guessing a parameter, we mean that we try all polynomially many possible values of that parameter and if one of them results in a feasible solution, we set the value of the current subproblem to True. We consider two cases: Case (1): c ⊥ and β > 0. Without loss of generality, suppose c < l. There must be a small client j with l ≤ j ≤ r covered by c. We guess j to be the leftmost such small client along with how many of k facilities in V l,r are in V l, j−1 , call this k (the rest of facilities, k − k , will be in V j+1,r ). For subproblem constructed for V l, j−1 , no small arm can enter V l, j−1 , and for subproblem constructed for V j+1,r , the center outside V j+1,r is c with allowed load of β = β − d (v j , v c ). We can also guess the large arms leaving and/or entering V l, j−1 as well as V j+1,r and in polynomial time, we check if these vectors are consistent with each other as well as D l , D r , S l , S r . So A[k , l, r , c, β, D l , D r , S l , S r ] is set to True if one of the following expressions is True (see Figure 2), for some l ≤ j ≤ r such that there is a small arm from c to j, denoted by I c j and |v c − v j | ≤ β, some 0 ≤ k ≤ k , and D j−1 , S j−1 , D j+1 , S j+1 consistent with D l , D r , S l , S r , (using the assumption that j is served with a small arm).
Note that when j = l, we just check one other subproblem, namely In this case, l r (otherwise, we are in a base case). All clients in V l,r must be covered by large arms from centers (facilities) outside the interval. If l is a facility, then it must be closed so we recursively check A[0, l + 1, r , c, 0, D l , D r , S l , S r ], where the new deficiency and surplus vectors are obtained from D l , D r , S l , S r after taking into account the number α of multiples of ϵ 2 B between v l and v l +1 . If this is not possible, e.g., if D r (q) > 0 for some q < α or similarly, then the subproblem is not feasible.
So, suppose that l is a client. First assume l is covered by a large arm from the left. Then D l (0) > 0, and we use one to cover client l. In this case, define D l , D r , S l , S r to reflect the fact that this one large arm covers l and the perceived length of the remaining ones accounted for by D l , S l that entered or exited by v r have a different perceived length depending on the number of multiples of ϵ 2 B between v l and v l +1 (again, if this is not possible, then the subproblem is not feasible).
Then we declare the subproblem to be feasible if and only if A[0, l + 1, r , c, 0, D l , D r , S l , S r ] = True. A similar argument works if l is covered by a large arm from the right.
Note that since c ⊥ and β = 0, or c = ⊥, no small arm can enter V l,r . Consider the set of centers in V l,r . The s-span (interval of small arms) of these centers forms a laminar family. Consider the roots of the forest of this laminar family and let c be the center  corresponding to the leftmost root; we guess c (see Figure 3) along with the contribution of small arms originating at c going to the left (call β ) and the right (call β ) and the number of centers located between l and i, say, k . Observe that the s-span of i is not contained in the s-span of any other center in V l,r . Center c has at most 3/ϵ large arms. We guess the large arms of c along with large arms entering/leaving V l,c −1 and V c +1,r . For all the guesses that the perceived cost of c is at most (1 + 5ϵ )B and the large arms are consistent with each other as well as D l , D r , S l , S r . So A[k , l, r , c, β, D l , D r , S l , S r ] is set to True if one the following expressions is True (see Figure 3):

TREE METRICS
The extension of the PTAS presented for line metrics to tree metrics is not clear, because we heavily exploited the linear structure of line metrics in the PTAS from Section 3. However, we can obtain a QPTAS for this case using a different approach. Our approach is inspired by the QPTAS for line metrics from Reference [13] in that it uses a recursive decomposition of the tree into geometrically smaller subtrees, but the parameters used to define a subproblem are a fair bit different.
Let T be a tree with edge costs d e , e ∈ T . For two nodes u, v ∈ T , let d (u, v) be the cost of all edges on the unique u − v path inT . Also for a nodev and edge e = uw, let d (v, e) = min{d (v, u), d (v, w )}. Let n denote the total number of nodes in T . Not all nodes are required to be a client or a facility. As in the case of line metrics, we focus on unit demands for (mostly notational) simplicity.
The algorithm, as before, works with a guessed value B as an upper bound for L opt . Delete every edge with cost greater than B and focus on one of the resulting subtrees. We try all guesses for k ≤ k to determine the smallest one for which there is a solution with cost (1 + O (ϵ )) · B in this subtree (if any). If any of these subtrees does not have such a solution for any k ≤ k or if the sum of these values k over all subtrees exceeds k, then we determine there is no solution with cost ≤ B. From now on, we will simply let k denote this guessed value for k in the subtree and assume all edge costs are at most B.
Next, we make T have degree at most 3 by "expanding" any node with at least four neighbors. That is, if v is a node with at least four neighbors, then we add a new vertex v that is neither a client nor a facility, connect v to v with a 0-cost edge, and have two neighbors, say, u, w, of v instead be neighbors of v with the same edge cost (i.e., d uv = d uv and d wv = d wv ). The optimum solution cost does not change. The degree of v strictly decreases, so iterating this process until all nodes have degree at most 3 produces a tree with at most 2n nodes. Let T denote this tree and say it has n ≤ 2n nodes.
Using a scaling argument as for the case of line metrics, we can assume that the aspect ratio of heaviest to cheapest nonzero edge cost is polynomially bounded: Edges with cost less than ϵB/n 2 are changed to have a cost of 0. Let T denote this modification of T . Since each of the n clients hops across at most n edges to their assigned facility in any solution, the difference in cost when measuring a solution in T or T is at most ϵB. Finally, scale the costs by n 2 /(ϵB), so each edge has cost either 0 or in the range [1, n 2 /ϵ]. Let T denote the tree produced from T after the edge costs are scaled this way. From now on, we will assume that T has maximum degree 3 and each nonzero edge cost lies in the range [1, n 2 /ϵ] where n denotes the number of nodes in T . Summarizing the above discussion, we are assuming:

The Decomposition Tree
We construct a rooted decomposition tree T for T , which is itself a tree whose nodes correspond to connected subtrees of T . The tree T itself will be the root of T .
To construct T , initially set it to be the tree with just T as its only node. Then, while there is some leaf node T in T (taking T = T if T only includes the one node T ) with n ≥ 2 nodes, find an edge e = (u, v) such that the two connected components of T − e have size in [n /3, 2n /3]. It is well known that such an edge exists in any tree with degree at most 3. Add two subtrees as children of T in T . Denote this edge e by e T .
The height of T is O (log n) because the sizes of the subtrees on any root-to-leaf path in T decrease geometrically. Each edge e ∈ T is of the form e T for precisely one non-leaf T ∈ T . For a subtree T in T , we let  As in the case of paths, we refer to the directed path from a center c to a client j being served by c as an arm. Our recurrence coarsens how we measure different parts of an arm. Let C = {0} ∪ {(1 + ϵ ) σ : 0 ≤ σ ≤ log 1+ϵ B } be the set of possible coarse arm lengths. Note |C| = O (ϵ −1 log n).
We classify arms in the following way. For some e ∈ T and some α, β ∈ C, we say that the arm c − j has type (e, α, β ) if • the c − j path traverses e and e = e T where T is the least common ancestor of the leaf nodes corresponding to the singletons {c} and {j} in T .
That is, e is the "highest" portal edge traversed by the c − j arm and α, β approximate the c − e and e − j distances within a factor of 1 + ϵ. We will say the perceived distance of this arm is α + β + d e .

The Recurrence
We still maintain deficiency and surplus vectors for various subtrees, but they are indexed somewhat differently than in our recurrence for line metrics. A subproblem consists of a subtree T ∈ T , an integer 0 ≤ k ≤ k, and surplus and deficiency integer vectors S, D, both indexed by tuples (e, α, β ), where e ∈ p(T ) and α, β ∈ C.
Let A[T , k , S, D] be a Boolean value that is True if it is possible to • Open exactly k centers in T .
• Assign each client c ∈ T to one of these centers or is served by a center outside T using an arm of type (e, α, β ), where e ∈ p(T ). This should be done such that exactly D(e, α, β ) clients are assigned to outside T using an arm of type (e, α, β ), and a client c can only be assigned to such an arm if β ≤ d (e, j) ≤ (1 + ϵ ) · β. • Assign precisely S(e, α, β ) arms of type (e, α, β ) between the k open centers for each e ∈ p(T ). Such an arm can only be assigned to a center c if α ≤ d (c, e) ≤ (1 + ϵ ) · α. • The perceived distance of all arms and clients assigned to any one of these k centers is at Note that one major difference between this table and the table used in the PTAS for line metrics is that no arm will be counted by both S and D in a subproblem. That is, while some arms may "enter and exit" T , they will not be accounted for in this subproblem. An arm is only counted in this subproblem if it has one endpoint in T and the other endpoint not in T .
Simply by definition of A, if there is a solution with maximum load B, then A[T , k, 0, 0] will be true, and if A[T , k, 0, 0] is true, then by replacing the perceived distance of each arm with its actual distance we see there is in fact a solution with maximum load (1 + ϵ ) · B.
The recurrence for A is somewhat simpler than the recurrence we used for line metrics. The base cases are precisely when T corresponds to a subtree with exactly one node (i.e., when T is a leaf in T ).

Base Cases
That is, the total perceived distance of all arms accounted for by the surplus vector should not exceed (1 + ϵ ) · B.

Inductive
Step Suppose T 1 ,T 2 are the two children of a non-leaf node. At a high level, to determine if A[T , k , S, D] = True, we simply have to guess which arms represented by the surplus and deficiency vectors start/end from T 1 and which ones start/end from T 2 and how many arms of various types reach between T 1 and T 2 (in which case e T would be the highest portal edge they cross).
More specifically, A[T , k , S, D] = True if and only if there are values 0 ≤ k 1 , k 2 ≤ k and integer vectors S 1 , D 1 , S 2 , D 2 corresponding to the subproblems T 1 ,T 2 , respectively, such that α, β ). In words, the subproblems agree on the arms that have e T as their highest edge. • For each α, β ∈ C and each e ∈ p(T ), S 1 (e, α, β ) + S 2 (e, α, β ) = S(e, α, β ) and D 1 (e, α, β ) + D 2 (e, α, β ) = D(e, α, β ). In words, the arms that enter/exit each subproblem but do not cross between T 1 and T 2 form a partition of the arms exiting T .
Verifying correctness of this recurrence is straightforward. Note that the number of subproblems is n O (ϵ −1 ·log n) , because there are O (n) subtrees in T , at most k + 1 ≤ n + 1 possible values for k , and each of the O (ϵ · log n) components of the surplus and deficiency vectors is an integer between 0 and n. Similar counting shows the number of subproblems that have to be checked when computing A[T , k , S, D] is n O (ϵ ·log n) . Theorem 4.3. For any constant 0 < ε ≤ 1, there is a (1 + ε)-approximation algorithm for MLkFL on tree metrics that runs in quasi-polynomial time.

A CONSTANT-FACTOR APPROXIMATION ALGORITHM FOR MLkFL IN STAR METRICS
A star is a rooted tree of height 1 and a star metric is the shortest-path metric completion of a weighted star. We now consider MLkFL in star metrics and in the more-general setting, where each client j has a demand D j that may be split across various open facilities. The load of a facility i is now defined as j x i j d (i, j), where x i j ∈ R ≥0 is the amount of j's demand that is served by i. We also consider the integer-splittable version where the demands are integers and may be split integrally across the open facilities. (As noted earlier, with a star metric, we cannot reduce (even) the integer-demand setting to the unit-demand setting by creating colocated clients as this will destroy the star topology.) We devise a 12-approximation algorithm for this problem, which also yields a 14-approximation for the integer-splittable version. At a high level, our approach is similar to the one used to obtain the PTAS for line metrics. We again "guess" the optimal value B. We argue via a slightly different uncrossing technique that if B ≥ L opt , then there exists a well-structured fractional solution with maximum load at most 6B, and use DP to obtain a fractional solution with maximum load at most 12B. This can then be converted to an integer-splittable assignment with maximum load at most 14B using the GAP-rounding algorithm, since, for the integer-splittable version, it is easy to ensure via some preprocessing that d (i, j) ≤ 2B for every facility i and client j. We can now combine this with a binary-search procedure to find the "right" B; this yields the 12-approximation algorithm (and a 14-approximation for the integer-splittable version).
Let r be the root of the star graph defining the star metric, V denote the set of all leaf nodes, and d i = d (i, r ) for leaf i. We may assume that r F ∪ C, since we can add an extra leaf with distance zero to r . Number the nodes of V from 1 to n so that d 1 ≤ d 2 ≤ · · · ≤ d n . Let D j be the demand of client j ∈ C. We often refer to a pair (i, j), where i ∈ F , j ∈ C, as an arm.
Theorem 5.1. There is a 12-approximation algorithm for MLkFL on star metrics with non-uniform demands and a 14-approximation algorithm for the integer-splittable version.
Proof. Let B be our current guess of the optimal value. In Section 5.1, we show that if B ≥ L opt , then there exist k facilities and a well-structured fractional assignment of clients to these facilities of cost (i.e., maximum load) at most 6B. In Section 5.2, we devise a dynamic programming approach that finds k facilities and a well-structured fractional assignment of clients to these facilities of cost at most 12B provided there is such a solution of cost at most 6B. In the integer-splittable version, we may assume that d i ≤ B for all i = 1, . . . , n. Otherwise, if d i > B, then no client may assign any demand to i (if i ∈ F ); also, if D i > 0, then we must have i ∈ F for the instance to be feasible, and all of D i must be served by i. Thus, we can remove i from V and, in the latter case, decrease k by 1 and proceed with the smaller instance.
Combining these ingredients, we can either certify that B < L opt or find k facilities and a fractional assignment of clients to these facilities that has maximum load at most 12B. In the integersplittable version, the above preprocessing ensures that d (i, j) ≤ 2B for every facility i and client j. Thus, using Theorem 3.6, we can round the fractional assignment (with maximum load at most 12B) to an integer solution while increasing the maximum load by at most max i ∈F , j ∈C d (i, j) ≤ 2B.
Finally, we discuss the binary-search procedure used to find a suitable B. We can detect if L opt = 0, so assume otherwise. It is easy to find an upper bound UB on L opt such that log UB is polynomially bounded in the input size. In the integer-splittable version, since L opt is an integer, it is straightforward to do binary search in the interval [0, UB] to find the smallest integer B (which must be at most L opt ) for which the above procedure returns a solution with maximum load at most 14B. With non-integer demands (and fractional assignments), the binary search is somewhat more involved. We can obtain an integer K such that such that L opt is a rational number with denominator at most K and log K is polynomially bounded in the input size. This is because given the set of open facilities in an optimal solution, L opt is the solution to a polynomial-size LP, and the optimal value to a rational LP has bit size polynomial bounded in the size of the LP. Now we perform binary search in the interval [0, UB] to find integers A and U = A + 1 such that L opt > A 2K 2 and, for B = U 2K 2 , the above procedure returns a solution with maximum load at most 12B. Now, using continued fractions (see Reference [15]), we can find the rational number, p, closest to U having denominator at most K, in time polynomial in the bit sizes of U and K. We claim that if L opt < U 2K 2 , then p = L opt since |p − L opt | ≤ 2|L opt − U 2K 2 | < 1 K 2 , and any two distinct rational numbers having denominator at most K have a spacing of at least 1 K 2 . So if p ∈ ( L 2K 2 , U 2K 2 ), then we run our procedure with B = p; if the procedure returns a solution for this value of B, then we return this solution; otherwise, we return the solution found for B = U 2K 2 .

A Well-Structured Near-Optimal Solution
We show that if B ≥ L opt , then there exists a fractional assignment satisfying various nice structural properties, which will then enable us to find such a solution via DP (Section 5.2). Let C B be the set of open facilities in some integer-splittable solution having maximum load at most B. Consider the following LP. For notational convenience, we set D j = 0 if j C and think of every node as a client (but with potentially 0 demand), Given a solution x to (S-P), we say that arms (i, j ) and . Note this is different from the notion of crossing in our PTAS for line metrics, but it can be seen to generalize it; that is, one can check that if two arms (i, j ), and (i , j) on the line cross according to the definition in Section 3.1 then We know that (S-P) is feasible. We prove that the optimal solution to (S-P) does not have any crossing arms in its support.
Lemma 5.2. The optimal solution to (S-P) does not have any crossing arms in its support.
Proof. Let x be an optimal solution to (S-P) Suppose (i, j ) and (i , j) cross in x. If d (i, j) = 0, then simply update x by moving all of j's demand to i. Similarly, if d (i , j ) = 0, then move all of the demand of j to i . In both cases, the objective value of x decreases, which is a contradiction.
For some ϵ, ϵ > 0 to be specified shortly, we create a new assignment x that agrees with x in all center-client pairs except that . We chose ϵ, ϵ such that the load at i does not change, so ϵ · d (i, j) = ϵ · d (i, j ) and so that either x i j = 0 or x i j = 0 while the other remains nonnegative. The change in the load of i as well as the change in objective value is given by j). This yields a contradiction.
Observe that the above uncrossing property is stronger then the uncrossing that we achieved for line metrics, where we only ensured that small arms do not cross. Figure 4 illustrates all the (nonsymmetric) cases that count as crossing. The figures on the right show the result after modifying x as described in the above proof. In each picture, at most one of the dotted arrows has a non-zero x i j value. We use c 1 , c 2 for the location of centers i, i and v 1 , v 2 for the location of clients j , j to be consistent with the notation used in the PTAS for line metrics. • For each V a and each j ∈ V a ∩ C, we have x i j = 0 for i V a . That is, each client is completely served within its partition. • For each V a , at least one of the followings hold: (1) |C B ∩ V a | = 1 (2) x i j = 0 for all j ∈ V a ∩ C and i < j (clients are only satisfied by centers to the right) (3) x i j = 0 for all j ∈ V a ∩ C and i > j (clients are only satisfied by centers to the left) Lemma 5.4. There is a well-structured fractional solution x to (S-P) with maximum load at most 6B.
Proof. Let x 0 be an optimal solution to (S-P). So x 0 has maximum load at most B, and by Lemma 5.2, it does not have any crossings. We start with x 0 and in each step modify given solution such that the maximum load is not increased by more than a constant factor. We use x r to denote the output of Step r , for 1 ≤ r ≤ 3.  We initialize x 1 to x 0 . For any client j ∈ C B , all the demand at j can be satisfied by the collocated center, so we can assume x 1 i j = 0 for i j. For any other client j C B , either Suppose the former is true (the latter is similar). Then we simply set x 1 i j = 0 for i > j and scale the x 1 i j with i < j uniformly until they sum to D j again. After doing this for all clients, we have that x 1 i j at most doubles for each center-client pair, so the maximum load is at most 2B. Note that any non-zero coordinate in x 1 is a non-zero coordinate in x 0 as well, so x 1 does not have any crossings either.
. If there is no such center, then this step is done. We will modify the assignment to i and, perhaps, some nearby centers and then form a consecutive subsequence of V whose only center is i.
Consider the rightmost center i L ∈ C B such that i L < i, see Figure 5. If there is no center to the left of i, then the operations in this paragraph are skipped. Otherwise we update the assignment x 2 in the following way: For any client Note that this modification does not introduce any crossings in x 2 as each client j is now assigned to a closer location. Moreover, now any client j < i with x 2 i j > 0, has to be on the right of i L , i.e., i L < j. Similarly, if there is a leftmost center i R ∈ C B with i R > i, then move all assignment x 2 i j with j > i R to i R . Again after these modifications, no new crossing is introduced, and any client j > i with x 2 i j > 0 has to be on the left of i R , i.e., j < i R . Let j L be the leftmost client with x 2 i j L > 0; if no such client exists, then define j L = n + 1. Similarly, let j R be the rightmost client with x 2 i j R > 0; if no such client exists, then let j R = 0. Define interval V a to be j 1 , j 2 , . . . , j m , where j 1 = min(j L , i) and j m = max(i, j R ) (see Figure 5). Note that partition V a satisfies |C B ∩ V a | = 1. Also, no clients outside of V a are assigned to i in x 2 . We claim that all clients in V a are completely assigned to i in x 2 . Let j be an arbitrary client in V a . Without loss of generality assume j < i (the other case is similar). Note that in this case j 1 i, and so j 1 = j L < n + 1. Note that i in x 1 serves some q > i (initial condition for picking i). We have two possible cases as follows: -j = j L . By definition of j L , x 2 i j L > 0. By step 1, we know that j L is served in one direction that has to be right as i is on the right of j L . Suppose facility i > i, serves j L . This gives us a contradiction as x 1 i j L · x 1 iq > 0 and arms (i, q) and (i , j L ) cross (Case (1) or Case (4) depending on the position of q with respect to i , see Figure 4). -j > j L . First, we show j cannot be served by a facility i < i. Let us assume the contrary. By definition of i L , we have i ≤ i L , and hence arms (i , j) and (i, j L ) cross, which gives us a contradiction as Figure 4(c)). Next, j cannot be served by facility i > i as this implies that arms (i, q) and (i , j) cross for solution x 1 while x 1 i j · x 1 iq > 0 (Case (1) or Case (4) depending on the position of q with respect to i , see Figure 4). Note that partition V a satisfies |C B ∩ V a | = 1 and that all clients in V a are completely assigned to the sole center in V a . Removing V a from V effectively divides the instance into two subinstances. We only note that x 2 i j = 0 for any j < i < i or i < i < j . Otherwise, if (say) j < i < i has x 1 i j > 0, then this would cross arm (i, q) that was assigned to i in x 1 which gives us a contradiction.
We claim the maximum load after performing the second step has increased by at most 4B. The only times the load increases are when some centers of the form i L or i R have some x 2 i j reassigned to them. Each center i can be some center of the form i L only once and of the form i R only once. Moreover, the load of i cannot be increased in both cases. Suppose the contrary, i.e., i is i L for some center i 0 and i R for some center i 1 where there exist clients j 0 < i and j 1 > i with x 1 i 0 j 0 , x 1 i 1 j 1 > 0. Since the arms (i 0 , j 0 ) and (i 1 , j 1 ) cross, we get a contradiction. So at most one center can increase the load of i , and therefore it remains to show that the increase in the load if i is i L or i R of some center is at most 4B.
First assume that i is of the form i L for some facility i. Since d (i , j) < d (i, j) for each client j i, the load of i after the process is increased by the portion of load of i that corresponds to serving clients j < i L which is at most 2B. Now assume i is of the form i R for some facility i. Note that any client j moved to i has d (i, j) < d (i , j) but d (i , j) < 2d (i, j) as j > i . So the load increase on i is at most twice the load on i due to clients j > i ; therefore, the load increase is at most 4B.

• Step (3): Dividing the remaining instance.
First initialize x 3 to x 2 . Now each j ∈ C \ C B assigns all demand either completely to the left or completely to the right. Similarly, any i ∈ C B collects demand either completely from the left or completely from the right. Say that j ∈ C B ∪ C "goes left" if j C B and x 3 i j > 0 only for i < j or j ∈ C B and x 3 j j > 0 only for j ≥ j. If j does not "go left," then we say it "goes right." Now V naturally breaks up into maximal consecutive intervals of nodes, each of which only includes clients that "go left" or "go right." These form the remaining partitions V a .
The only thing left to note is that a client is completely served within its partition. Suppose j ∈ V a and that j "goes left." Either V a is the first partition, or the preceding partition V a "goes right." Since V a is not a singleton (that was taken care of at the end of step (2)), then there is some j ∈ V a with x 3 i j > 0 for some i ∈ C B ∩ V a , i j . Thus, j cannot assign any demand to a client to the left of V a , since this assignment would cross with the assignment of j .
Now let x = x 3 , which is a desired solution satisfying the condition of the lemma.

A Dynamic-Programming Algorithm for Finding a Well-Structured Solution
We describe the dynamic programming approach in two steps. First, for any subsequence V = {j L , . . . , j R } of V , and any 1 ≤ p ≤ k, we describe a Boolean value I (V , p) that is true if and only if one of the following is true.
(1) p = 1 and there is some facility i ∈ V ∩ F such that assigning each client from V to i places load at most 6B on i. (2) There is a set C ⊆ V ∩ F of p facilities, and a fractional assignment (x i j ) i ∈C , j ∈V such that for every j ∈ V we have x i j = 0 for i < j and i ∈C x i j = D j , and for every i ∈ C we have d i · j ∈V x i j ≤ 6B. (3) There are some p locations C and a fractional assignment (x i j ) i ∈C , j ∈V such that for every j ∈ V we have x i j = 0 for i > j and i ∈C x i j = D j , and for every i ∈ C we have j ∈V d j · x i j ≤ 6B.
If we can compute I (V , p) for all (V , p) tuples (as well as the solution that generates it), then we claim that we are done. Observe that if I (V , p) = true when p > 1, then the fractional assignment x corresponding to I (V , p) induces a maximum load of 12B on the centers opened from V . If I (V , p) is true due to the second condition, then this is because if x i j > 0, then d (i, j) ≤ 2d i , and we have d i · j ∈V x i j ≤ 6B. Similarly, if I (V , p) is true due to the third condition, then x i j > 0 implies that d (i, j) ≤ 2d j , and we have j ∈V d j · x i j ≤ 6B. Now it is a simple matter to determine, using another dynamic program, how to partition V into consecutive intervals V 1 , . . . ,V m with positive integers p 1 , . . . ,p m summing to k such that To wrap up the proof, we describe how to compute I (V , p). We can associate a  (V , p). To compute the entries of T 2 , we use an among all ways of choosing p centers C in {j L , . . . , i} and fractionally assigning up to D j units of demand of each client j ≤ i to centers in C such that (i) no center i ∈ C is assigned more than 6B/d i units of demand from clients j < i and (ii) The base cases with i = j L are easy: The first term in the min says that if we open i, then we assign as much leftover demand that we can. The second term says that if we do not open i then all of the demand at i must go to the right of i. Once we compute this, we set T 2 (V , p) to true if and only if f (j R , p) = 0.
For the last case, we consider a similar dynamic programming algorithm in a "right-to-left" manner, except we are concerned with the minimum value of j ≥i d j · (D j − i ∈C x i j ). We associate the table T 3 (V , p) for this case, and we use the auxiliary table д(i, p ) for j L ≤ i ≤ j R , 0 ≤ p ≤ p. д(i, p ) is the minimum possible excess load j ≥i d j (D j − i ∈C x i j ) among all ways of choosing p centers C in {i, . . . , j R } and fractionally assigning up to D j units of demand of each client j ≥ i to centers in C such that (i) no center i ∈ C is assigned a load more than 6B and (ii) The base cases with i = j R are easy: д(i, 0) = d i · D i and д(i, 1) = 0 if i ∈ F ; we set д(i, p ) = ∞ if p > 1, or (p > 0 and i F ). Also, for i < j R but p = 0 we have д(i, p ) = д(i − 1, p ) + d i · D i . Otherwise, if i < j R and p > 0, then we have The first term in the min says that if we open i, then we assign as much leftover load that we can. The second term says that if we do not open i then all of the demand at i must go to the left of i. Once we compute this, we set T 3 (V , p) to true if and only if д(j L , p) = 0.

HARDNESS RESULTS AND INTEGRALITY-GAP LOWER BOUNDS
We now present various hardness and integrality-gap results. We prove that MLkFL is strongly NP-hard on line metrics (Theorems 6.1). We also demonstrate that a natural configuration-style LP has an unbounded integrality gap (Theorem 6.2). Theorem 6.1. Minimum-load k-facility location is strongly NP-hard even in line metrics.
Proof. We reduce from 3-partition, where we are given n = 3k integers b 1 , . . . ,b n and a bound B such that n i=1 b i = kB. The goal is to partition the integers into k groups such that the sum of the integers in any group is at most B. It is NP-complete to determine if there is a feasible solution, even when b i ≤ 2 16 n 4 and B 4 < b i < B 2 for each i (e.g., Reference [9]). In particular, any feasible solution will have precisely three integers in each group of the partition.
We create an instance of MLkFL on the line by creating two groups of clients. First, for each point p ∈ {− k−1 3k , − k−2 3k , . . . , − 1 3k , 0} we place 3k 2 (B + 1) clients at p. Next, for each integer b i , 1 ≤ i ≤ n, we add a single client at position b i . Let N be the number of clients in the resulting instance and notice that all values have values bounded by a polynomial in N . The claim is that there is a solution with cost B + k−1 k if and only if the 3-partition problem is a yes instance.
First, suppose there is a partition of the integers b 1 , . . . ,b n into k groups G 1 , . . . ,G k such that the sum of the integers in any group G i is B. For each 1 ≤ i ≤ k, we create a star with center at − i−1 3k , assign all clients located at this center to this star, and also assign the clients in group G i to this star. The only clients that move some positive distance to the center of the star are those from the group G i , and they move a total distance of B + i−1 k < B + k−1 k . Conversely, suppose there is a solution with maximum load at most B + k−1 k . First, we claim that every point of the form − i 3k , 0 ≤ i < k must be the center of a star. Otherwise, the 3k 2 (B + 1) clients at this location must be assigned to other stars. The minimum distance each of these client travels is 1 3k and one of the open centers receives at least 3k (B + 1) of these clients, so its load is at least B + 1 > B + k−1 k . Therefore, the centers are at locations − i 3k , 0 ≤ i < k. Since B 4 < b i < B 2 , then every star must contain exactly three clients corresponding to integers b 1 , . . . ,b n in the 3-partition instance. Without loss of generality, say b 1 , b 2 , b 3 are the three integers in some star. The total distance they travel lies between b 1 + b 2 + b 3 and b 1 + b 2 + b 3 + k−1 k so b 1 + b 2 + b 3 ≤ B. Therefore, if we let G i be the clients corresponding to integers b 1 , . . . ,b n that are in the star with center − i−1 3k for each 1 ≤ i ≤ k, then G 1 , . . . ,G k is a feasible solution to the 3-partition problem.
Integrality-gap lower bound. Let (F , C, d, k ) be an MLkFL instance. Given a candidate "guess" B of the optimal value, we can consider the following LP-relaxation of the problem of determining if there is a solution with maximum load at most B. We propose the following linear programming for the MLkFL . For each facility i ∈ F , define S(B; i) := {C ⊆ C : j ∈C d (i, j) ≤ B} to be the set of all stars centered at i that induce load at most B at i. We will often refer to a star in S(B; i) as a configuration. (Note that S(B; i) contains ∅.) Our LP Will be a configuration-style LP, where for every facility i and star C ∈ S(B; i), we have a variable denoting if star C is chosen for facility i. This yields the following natural feasibility LP: i ∈F C ∈S(B;i ) x ≥ 0. Constraint (1) ensures that each client belongs to some configuration, and constraints (2) and (3) ensure that each facility belongs to at most one configuration and that there are at most k configurations. We show that there is an MLkFL instance on the line metric, where the smallest value B LP for which (P) is feasible is smaller than the optimal value by an Ω(k/ log k ) factor; thus, the "integrality gap" of (P) is Ω(k/ log k ). Moreover, in this instance, the graph containing the (i, j) edges such that d (i, j) ≤ B LP is connected. Theorem 6.2. The integrality gap of (P) is Ω(k/ log k ) even for line metrics.
Proof. Assume for simplicity that k is odd. Consider the following simple MLkFL instance. We have F = {a 1 , b 1 , a 2 , b 2 , . . . , a m , b m }, where 2m = k + 1. These facilities are located on a line as shown in Figure 6, with the distance between any two consecutive nodes being T /2. There are n = 2k clients colocated with each facility. Let A i (respectively ,B i ) denote the set of clients located at a i (respectively, b i ) for 1 ≤ i ≤ m.
There is a feasible solution to (P) with B = T . For all i = 1, . . . ,m, we set x (a i , A i ∪ {j, j }) = k (k+1) ·(n 2) for all j, j ∈ B i ; note that all these configurations lie in S(T ; a i ). Similarly, we set x (b i , B i ∪ {j, j }) = k k+1·(n 2) for all j, j ∈ A i . It is easy to verify that x is a feasible solution. It is clear that constraints (2) and (3) hold since every facility belongs to exactly (n 2) configurations. Consider a client j ∈ A i . j is covered to an extent of k k+1 by the (n 2) configurations {A i ∪ {k, }} k, ∈B i in S(a i ;T ) and to an extent of 1 k+1 by the n − 1 configurations {B i ∪ {j, k }} k ∈A i :k j . A symmetric argument applies to clients in some B i set.
Finally, we show that any feasible solution must have maximum load at least T · k 2H k , where H r := 1 + 1 2 + · · · + 1 r is the r th harmonic number, which proves the theorem. In any feasible solution, there is some location v that does not have an open facility. For i = 1, . . . , k, let n i be the number of clients colocated at v that are assigned to a facility at a location that is i hops away from v; set n i = 0 if there is no such location. Then, k i=1 n i = n, and the maximum load L at a facility is at least max i=1, ...,k n i iT 4 since there are at most two facilities that are i hops away from v, and one of them must have at least n i 2 clients assigned to it. Thus, we have n i ≤ 4L iT for all i = 1, . . . k, and so n ≤ 4L T · H k , or L ≥ nT 4H k .

AN UNBOUNDED LOCALITY GAP FOR THE MULTI-SWAP LOCAL-SEARCH ALGORITHM FOR MLkFL
A natural local-search heuristic for MLkFL is one where given a current set S of k facilities, we may swap out a facility in S and swap in a facility not in S. More generally, we may consider a p-swap heuristic where we swap out and swap in at most p facilities. Note that given a set of k facilities, one can find the assignment of clients to facilities by solving an instance of the generalized assignment problem [21]. We keep performing such local moves as long as it improves the maximum load of the solution. One can come up with simple examples showing that the locality gap of the p-swap heuristic, which is the worst-case ratio between the maximum load at a local optimum and the (global) optimal value, can be arbitrarily large, even on line metrics. Theorem 7.1. The locality gap of the p-swap heuristic is unbounded, even on line metrics.