Polynomial-Time Constant-Approximation for Fair Sum-Of-Radii Clustering

Bagheri Nezhad, Sina; Bandyapadhyay, Sayan; Chen, Tianzhi

doi:10.4230/LIPIcs.ESA.2025.62

Sina Bagheri Nezhad, Portland State University, OR, USA

Sayan Bandyapadhyay, Portland State University, OR, USA

Tianzhi Chen, Portland State University, OR, USA

Abstract

In a seminal work, Chierichetti et al. [20] introduced the $(t,k)$ -fair clustering problem: Given a set of red points and a set of blue points in a metric space, a clustering is called fair if the number of red points in each cluster is at most $t$ times and at least $1/t$ times the number of blue points in that cluster. The goal is to compute a fair clustering with at most $k$ clusters that optimizes a certain objective function. Considering this problem, they designed a polynomial-time $O(1)$ - and $O(t)$ -approximation for the $k$ -center and the $k$ -median objective, respectively. Recently, Carta et al. [15] studied this problem with the sum-of-radii objective and obtained a $(6+\epsilon)$ -approximation with running time $O((k\log_{1+\epsilon}(k/\epsilon))^{k}n^{O(1)})$ , i.e., fixed-parameter tractable in $k$ . Here $n$ is the input size. In this work, we design the first polynomial-time $O(1)$ -approximation for $(t,k)$ -fair clustering with the sum-of-radii objective, improving the result of Carta et al. Our result places sum-of-radii in the same group of objectives as $k$ -center, which admit polynomial-time $O(1)$ -approximations. This result also implies a polynomial-time $O(1)$ -approximation for the Euclidean version of the problem, for which an $f(k)\cdot n^{O(1)}$ -time $(1+\epsilon)$ -approximation was known due to Drexler et al. [24]. Here $f$ is an exponential function of $k$ . We are also able to extend our result to any arbitrary $\ell\geq 2$ number of colors when $t=1$ . This matches known results for the $k$ -center and $k$ -median objectives in this case. The significant disparity of sum-of-radii compared to $k$ -center and $k$ -median presents several complex challenges, all of which we successfully overcome in our work. Our main contribution is a novel cluster-merging-based analysis technique for sum-of-radii that helps us achieve the constant-approximation bounds.

Keywords and phrases:

fair clustering, sum-of-radii clustering, approximation algorithms

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Approximation algorithms analysis

Related Version:

Full Version: https://arxiv.org/abs/2504.14683 [44]

Funding:

This work was supported by the National Science Foundation under Grant No. AF 2311397.

Event:

33rd Annual European Symposium on Algorithms (ESA 2025)

Editors:

Anne Benoit, Haim Kaplan, Sebastian Wild, and Grzegorz Herman

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Given a set of points $P$ in a metric space $(\Omega,d)$ and an integer $k>0$ , the task of clustering is to find a partition $X_{1},\ldots,X_{k}$ of $P$ into $k$ groups or clusters such that each group has similar points. The similarity of the clusters is typically modeled using an objective function which is to be minimized. In this work, we focus on the sum-of-radii objective, which is defined as the sum of the radii of $k$ balls that contain the points of the respective $k$ clusters. The sum-of-radii objective, while also center-based, has a different flavor from objectives such as $k$ -center, $k$ -median, and $k$ -means, as it directly sums the radii of the clusters rather than measuring distances from each point to its assigned center. In these objectives, $k$ representative points (or cluster centers) are chosen, and the corresponding clusters are formed by assigning the points of $P$ to their nearest centers. Such a partition is popularly known as the Voronoi partition. It is not hard to see that an optimal sum-of-radii clustering is not necessarily a Voronoi partition. The study of sum-of-radii was motivated by the idea that it could reduce the so-called dissection effect that is observed in $k$ -center type objectives (see attached full version for details).

Sum-of-radii clustering is known to be NP-hard even in planar metrics and metrics of constant doubling dimension [30]. Consequently, it has received substantial attention from the approximation algorithms community. Charikar and Panigrahy [16] designed a Primal-Dual and Lagrangian-relaxation-based 3.504-approximation algorithm that runs in polynomial time (poly-time). Recently, using similar techniques, Friggstad and Jamshidian [26] improved the approximation factor to 3.389. The best-known approximation factor for sum-of-radii in polynomial time is $3+\epsilon$ for any $\epsilon>0$ , due to Buchem et al. [14]. In stark contrast to other well-studied center-based objectives such as $k$ -center and $k$ -median, the sum-of-radii objective admits a QPTAS [30], which is based on a randomized metric partitioning scheme. Additionally, the problem can be solved exactly in polynomial time in the Euclidean metric of constant dimension [31]. The problem also admits polynomial time exact algorithms in other restricted settings, such as when singleton clusters are not allowed [9] and the metric is unweighted [33].

In recent years, sum-of-radii clustering has also been studied with additional constraints. One such popular constraint is the capacity constraint, which puts restriction on the number of points that each cluster can contain. In a series of articles [35, 8, 36, 25], $O(1)$ -approximation algorithms have been designed for capacitated sum-of-radii with running time fixed-parameter tractable (FPT) in $k$ (i.e., $f(k)\cdot n^{O(1)}$ for a function $f$ of $k$ ), culminating in an approximation factor of 3. Inamdar and Varadarajan [35] studied sum-of-radii with a matroid constraint where the set of centers of the balls must be an independent set of a matroid. They obtain an FPT 9-approximation for this problem. The approximation factor has recently been improved to 3 by Chen et al. [18]. Obtaining a poly-time $O(1)$ -approximation for any of these constrained versions is an interesting open question. However, poly-time $O(1)$ -approximations are known for sum-of-radii with lower bounds and with outliers [3, 14].

Sum-of-radii has also been studied with fairness constraints, which is the main focus of our work. Clustering with fairness constraints or fair clustering stems from the idea that protected groups (defined based on a sensitive feature, e.g., gender) must be well-represented in each cluster. In recent years, fair clustering has received significant attention from researchers across several areas of computer science. In a seminal work, Chierichetti et al. [20] introduced the $(t,k)$ -fair clustering problem. In this problem, we are given a set $P_{1}$ of red points, a set $P_{2}$ of blue points, that together contain $n$ points, and an integer balance parameter $t\geq 1$ . A clustering is called $(t,k)$ -fair if, for any cluster $X$ , the number of red points in $X$ is at least $1/t$ times and at most $t$ times the number of blue points in $X$ . We say that each cluster in a $(t,k)$ -fair clustering is $t$ -balanced.

Chierichetti et al. studied $(t,k)$ -fair clustering with $k$ -center and $k$ -median objectives, and obtained poly-time 4- and $O(t)$ -approximations, respectively. Since then, obtaining a poly-time $O(1)$ -approximation for $(t,k)$ -fair median or means has remained an intriguing open question. The main challenge in this case is that the optimal clusterings are no longer Voronoi partitions, as they also need to be $(t,k)$ -fair. Subsequently, $(t,k)$ -fair median/means has been studied in a plethora of works. The only setting in which a poly-time $O(1)$ -approximation is known is $t=1$ [12], that is, $(1,k)$ -fair median/means. These problems have also been considered in the Euclidean case [46, 5].

The $(t,k)$ -fair median/means problem has also been studied with an arbitrary $\ell$ number of groups. The algorithm of Böhm et al. [12] for $t=1$ also yields a poly-time $O(1)$ -approximation in this case. Note that for $t=1$ , a cluster contains the same number of points from all groups. Bandyapadhyay et al. [6] obtained a poly-time approximation for $(t,k)$ -fair median with a factor that depends on $t$ , $\ell$ , and $k$ . Bercea et al. [11] and Bera et al. [10] independently defined a generalization of $(t,k)$ -fair clustering. There we are given balance parameters $\alpha_{i},\beta_{i}\in[0,1]$ for each group $1\leq i\leq\ell$ . A clustering is called fair representational if the fraction of points from group $i$ in every cluster is at least $\alpha_{i}$ and at most $\beta_{i}$ for all $1\leq i\leq\ell$ . They show that it is possible to obtain poly-time bi-criteria type $O(1)$ -approximations where we are allowed to violate the fairness constraints by a small additive constant. Subsequently, Dai et al. [23] designed a DP-based poly-time $O(\log k)$ -approximation for this problem. For $\ell$ groups, their running time is $n^{O(\ell)}$ .

Carta et al. [15] studied fair versions of sum-of-radii. In particular, they study a more general class of mergeable constraints. A clustering constraint is called mergeable if the union of two clusters satisfying the constraint also satisfies the constraint. They show that the fairness constraints defined in $(t,k)$ -fair clustering and fair representational clustering are mergeable. In their work, they obtained a $(6+\epsilon)$ -approximation for sum-of-radii with mergeable constraints. In particular, for the above two fairness constraints, their run time is $O((k\log_{1+\epsilon}(k/\epsilon))^{k}n^{O(1)})$ , so FPT in $k$ . The algorithm iteratively guesses the next cluster based on a $k$ -center completion problem leading to the FPT run time. Their approximation factor improves to $3+\epsilon$ when $t=1$ . Drexler et al. [24] obtained an FPT $(1+\epsilon)$ -approximation for Euclidean sum-of-radii with mergeable constraints. Chen et al. [18] studied a fair version, which is a special case of matroid sum-of-radii, and hence obtained an FPT 3-approximation. A summary of the results on fair clustering under various objectives is provided in Table 1.

As mentioned before, for fair representational models, only bi-criteria type $O(1)$ -approximations are known for $k$ -center/median/means, even with two groups. As we focus on our theoretical quest of designing poly-time $O(1)$ -approximations fully satisfying the fairness constraints, we study $(t,k)$ -fair sum-of-radii. In light of the above discussion, we state the following two questions.

Question 1: Does $(t,k)$ -fair sum-of-radii (with two groups) admit a poly-time constant-approximation algorithm?

Question 2: Does $(1,k)$ -fair sum-of-radii with an arbitrary $\ell\geq 2$ number of groups admit a poly-time constant-approximation algorithm?

After the work of Chierichetti et al. [20], several other notions of fairness have been considered in the context of clustering problems. The following is a sample of these works grouped by the fairness notions: individual fairness [38, 43, 47, 13, 1], proportional fairness [19, 42], fair center representation [17, 40, 39, 21, 34], colorful [7, 37, 4], and min-max fairness [2, 28, 41, 22, 29, 32].

Table 1: Summary of approximation results for fair clustering under various objectives. “Poly” denotes polynomial time; “FPT” denotes fixed-parameter tractable in $k$ ; $\ell$ is the number of groups.

Objective	Fairness Type	Approximation	Time	Reference
$k$ -Center	$(t,k)$ (2 groups)	4	Poly	[20]
$k$ -Median	$(t,k)$ (2 groups)	$O(t)$	Poly	[20]
$k$ -Median	$(1,k)$ ( $\ell$ groups)	$O(1)$	Poly	[12]
$k$ -Median	$(t,k)$ ( $\ell$ groups)	$f(t,\ell,k)$	Poly	[6]
$k$ -Median / Center	Representational (bi-criteria)	$O(1)$	Poly	[10, 11]
Sum-of-Radii	Unconstrained	$3+\varepsilon$	Poly	[14]
Sum-of-Radii	Capacitated	3	FPT	[25]
Sum-of-Radii	Matroid constraint	3	FPT	[18]
Sum-of-Radii	$(t,k)$ (2 groups)	$6+\varepsilon$	FPT	[15]
Sum-of-Radii	$(1,k)$ ( $\ell$ groups)	$3+\varepsilon$	FPT	[15]
Sum-of-Radii	$(t,k)$ (2 groups)	$\mathbf{144+\varepsilon}$	Poly	This work
Sum-of-Radii	$(1,k)$ ( $\ell$ groups)	$\mathbf{180+\varepsilon}$	Poly	This work

1.1 Our Contributions and Techniques

In our work, we prove two theorems resolving Questions 1 and 2 in the affirmative. First, we prove the following theorem.

Theorem 1.

There is a polynomial-time $(144+\epsilon)$ -approximation algorithm for $(t,k)$ -fair sum-of-radii (with two groups).

Our result complements the FPT approximation result of Carta et al. [15] by achieving the first $O(1)$ -approximation for the problem in polynomial time. The result also implies a poly-time $O(1)$ -approximation for Euclidean $(t,k)$ -fair sum-of-radii, for which only an FPT $(1+\epsilon)$ -approximation was known [24]. We note that our result should also be compared with that of $(t,k)$ -fair $k$ -median for which only $O(t)$ -approximation is known in polynomial time. In particular, our result places sum-of-radii in the same group of objectives as $k$ -center that admits polynomial-time $O(1)$ -approximations. Moreover, our result shows that $(t,k)$ -fair sum-of-radii is in contrast to most of the constrained versions of sum-of-radii, including capacitated clustering, for which only FPT $O(1)$ -approximations are known.

Next, we give an overview of our approach. Our approximation algorithm is motivated by the algorithms for $(t,k)$ -fair center and $(t,k)$ -fair median [20]. These algorithms have two major steps. In the first step, a fairlet decomposition of the points in $X=P_{1}\cup P_{2}$ is computed, i.e., a partition $\mathcal{Y}=\{Y_{1},\ldots,Y_{m}\}$ such that for each fairlet $Y_{i}$ , it either has 1 red point and at most $t$ blue points or 1 blue point and at most $t$ red points. Let $\beta:P_{1}\cup P_{2}\rightarrow[m]$ be the function that maps each point $x$ to the index of the fairlet that contains $x$ . From each $Y_{i}$ , an arbitrary point $y_{i}$ is designated as its representative. In the second step, a clustering of these $m$ representatives is computed with the respective cost function. Also, for each $Y_{i}$ , all of its points are assigned to the cluster that contains $y_{i}$ . The new clustering is obviously $(t,k)$ -fair, as each cluster is a merger of fairlets. For the analysis of the cost of the computed clustering, they define a fairlet decomposition cost, which is used to bound the assignment cost of the points in the second step. For $k$ -center, this cost is $\max_{x\in X}d(x,y_{\beta(x)})$ , and for $k$ -median, it is $\sum_{x\in X}d(x,y_{\beta(x)})$ . Indeed, both of these costs when optimal are comparable to the optimal $(t,k)$ -fair clustering cost. For $k$ -center, it is within a constant factor, and for $k$ -median it is within an $O(t)$ factor. Then, it is sufficient to compute a fairlet decomposition in the first step whose cost is within a small constant-factor of the optimal fairlet decomposition cost.
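The two fairlet decomposition costs mentioned above can be illustrated with a minimal sketch. The function name `fairlet_costs` and the list-based input layout are assumptions made for illustration only; they do not come from [20].

```python
def fairlet_costs(fairlets, representatives, d):
    """Return (k-center cost, k-median cost) of a fairlet decomposition.

    fairlets[i] is the list of points in Y_i and representatives[i] is y_i.
    The k-center cost is the maximum distance from a point to the
    representative of its fairlet; the k-median cost is the sum of them.
    """
    per_point = [
        d(x, rep)
        for fairlet, rep in zip(fairlets, representatives)
        for x in fairlet
    ]
    return max(per_point), sum(per_point)


# Toy instance on the real line (hypothetical data).
line_metric = lambda a, b: abs(a - b)
center_cost, median_cost = fairlet_costs([[0, 1, 2], [10, 11]], [0, 10], line_metric)
```

In this toy instance the per-point distances are 0, 1, 2, 0, 1, so the $k$ -center style cost is 2 and the $k$ -median style cost is 4. The sum grows with the number of points, which is one way to see why it can exceed a sum-of-radii objective by a large margin.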

Coming back to $(t,k)$ -fair sum-of-radii, it is not clear how to define a suitable fairlet decomposition cost that can be compared to the optimal $(t,k)$ -fair sum-of-radii cost. In particular, such a cost needs to be defined independent of the number of clusters $k$ . However, for sum-of-radii, the objective is the sum of radii of $k$ clusters. For example, a natural candidate, the cost for $k$ -median, i.e., $\sum_{x\in X}d(x,y_{\beta(x)})$ , is likely to be much larger than the optimal sum-of-radii cost. In the absence of such a suitable fairlet decomposition cost, it is difficult to argue the increase in the assignment cost, when actual points of $Y_{i}$ are assigned instead of just the representative $y_{i}$ .

Our approach.

Our algorithm is surprisingly simple to state. We first compute a complete bipartite graph $G$ with $P_{1}$ and $P_{2}$ being the two parts. The weight of each edge is set to be the distance between the two corresponding endpoints. Subsequently, a degree-constrained, spanning subgraph of this graph is computed where each vertex has a degree in the range $[1,t]$ , and the sum of the weights of the edges is minimized. Such an optimal subgraph can be computed in polynomial time using the algorithm of Gabow [27]. Moreover, one can show that such a subgraph is a collection of stars each having at most $t$ edges. Thus, our algorithm up to this point is in a similar spirit to that of $k$ -median. As we argued before, the total weight of such a subgraph can be very large compared to the optimal sum-of-radii cost. Our main contribution is to prove that there is a sum-of-radii clustering of the stars (or representatives of them) computed in this way whose cost is at most a constant times the optimal $(t,k)$ -fair sum-of-radii cost. Then, one can compute an approximate sum-of-radii clustering of these stars and return the corresponding clustering of the points in $P_{1}\cup P_{2}$ . The obtained clustering is $(t,k)$ -fair, as the clusters are disjoint unions of the vertices of stars, each having at most $t$ edges. The proof of the existence of a clustering of the computed stars whose cost is nicely bounded is based on a novel analysis technique that merges a set of optimal clusters to obtain superclusters. We give an overview in the following.

Let $H$ be the degree-constrained subgraph computed with the minimum weight possible. Also, let $\mathcal{C}^{*}=\{C_{1}^{*},C_{2}^{*},\ldots,C_{k}^{*}\}$ be a fixed optimal $(t,k)$ -fair sum-of-radii clustering. We repeatedly merge pairs of these clusters if there are edges in $H$ across them. Let $\hat{\mathcal{C}}=\{\hat{C}_{1},\hat{C}_{2},\ldots,\hat{C}_{\kappa}\}$ be the resulting clustering. By our construction, each star of $H$ is fully contained in one of these merged clusters or superclusters. Thus, it is sufficient to show that the radius of each supercluster $\hat{C}_{i}$ is at most $O(1)$ times the sum of the radii of the associated optimal clusters whose merger is $\hat{C}_{i}$ . To bound such a radius, we introduce a notion of minimum-switch paths between pairs of clusters. These paths play a central role in our analysis. We prove that it is possible to bound the (weighted) length of any such path by $O(1)$ times the sum of the radii of the associated optimal clusters. Then the diameter (or radius) of the supercluster can also be bounded likewise, as any two cluster vertices are connected by a minimum-switch path. The important distinction is that the length of any arbitrary path might not be bounded in such a nice way.

Next, we prove the following theorem concerning Question 2.

Theorem 2.

There is a polynomial-time $(180+\epsilon)$ -approximation algorithm for $(1,k)$ -fair sum-of-radii with $\ell\geq 2$ groups of points.

Again our result directly improves the FPT approximation result of Carta et al. [15] and extends to more than 2 groups. The result matches the known constant-approximation bound for $k$ -center/median/means in this case. The proof of the above theorem is similar to the proof of Theorem 1, and so employs the same supercluster-based analysis framework. However, here we need to handle $\ell$ colors. The main challenge boils down to bounding the diameter of a certain multi-partite graph $G_{1}^{*}$ with $\cup_{i=1}^{\ell}P_{i}$ being the set of vertices. Intuitively, by the analysis for two groups, the diameter of the graphs induced by only $P_{1}\cup P_{i}$ is nicely bounded. However, we still need to bound the diameter of $G_{1}^{*}$ . Consequently, we introduce an additional notion of minimum-color-switch paths. We prove that the lengths of these paths can also be bounded nicely, exploiting their special properties.

Organization.

We introduce notation in Section 2 and present our algorithms for $(t,k)$ -fair and $(1,k)$ -fair sum-of-radii in Sections 3 and 4, respectively. Proofs of statements marked by $(*)$ are available in the full version (https://arxiv.org/pdf/2504.14683).

2 Preliminaries

In sum-of-radii clustering, we are given a set $P$ of $n$ points in a metric space with distance $d$ and an integer $k>0$ . We would like to find: (i) a subset $C$ of $P$ containing $k$ points and a non-negative integer $r_{q}$ (called the radius) for each $q\in C$ , and (ii) a function $\phi$ assigning each point $p\in P$ to a center $q\in C$ such that $d(p,q)\leq r_{q}$ . The subset $X_{q}=\phi^{-1}(q)$ for each $q\in C$ is called the cluster corresponding to $q$ having radius $r_{q}$ . The goal is to find a clustering $\{X_{q}\mid q\in C\}$ that minimizes the sum of the radii $\sum_{q\in C}r_{q}$ .

In $(t,k)$ -fair sum-of-radii clustering, we are given two disjoint groups $P_{1}$ (red) and $P_{2}$ (blue) having $n$ points in total in a metric space $(\Omega=P_{1}\cup P_{2},d)$ and an integer balance parameter $t\geq 1$ . A clustering is called $(t,k)$ -fair if, for each cluster $X$ , the number of points from $P_{1}$ in $X$ is at least $1/t$ times the number of points from $P_{2}$ in $X$ and at most $t$ times the number of points from $P_{2}$ in $X$ . The goal is to compute a $(t,k)$ -fair clustering minimizing the sum of the radii of the clusters. Each cluster in a $(t,k)$ -fair clustering is called $t$ -balanced.
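The fairness condition can be checked directly from the definition. The following is a minimal sketch; the function names, the `color` dictionary with labels `"red"` and `"blue"`, and the choice to accept at most $k$ clusters are illustrative assumptions.

```python
from collections import Counter


def is_t_balanced(cluster, color, t):
    """Check that red and blue counts in `cluster` are within a factor t.

    color[p] is assumed to be "red" (group P_1) or "blue" (group P_2).
    """
    counts = Counter(color[p] for p in cluster)
    red, blue = counts["red"], counts["blue"]
    # red >= blue / t and red <= t * blue, written without division.
    return red * t >= blue and red <= t * blue


def is_fair_clustering(clusters, color, t, k):
    """Check that there are at most k clusters and each one is t-balanced."""
    return len(clusters) <= k and all(is_t_balanced(c, color, t) for c in clusters)


# Hypothetical instance: one red point and three blue points.
color = {"a": "red", "b": "blue", "c": "blue", "d": "blue"}
ok = is_t_balanced(["a", "b", "c"], color, 2)            # 1 red, 2 blue
too_blue = is_t_balanced(["a", "b", "c", "d"], color, 2)  # 1 red, 3 blue
```

With $t=2$ , the cluster with one red and two blue points is balanced, while the cluster with one red and three blue points is not.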

In Balanced sum-of-radii clustering, we are given $\ell\geq 2$ disjoint groups $P_{1},P_{2},\ldots,P_{\ell}$ having $n$ points in total in a metric space $(\Omega=\cup_{i=1}^{\ell}P_{i},d)$ such that $|P_{1}|=|P_{2}|=\ldots=|P_{\ell}|$ . A clustering is called balanced if, for each cluster $X$ , it holds that $|X\cap P_{1}|=|X\cap P_{2}|=\ldots=|X\cap P_{\ell}|$ . The goal is to compute a balanced clustering that minimizes the sum of the radii of the clusters. We say that each cluster in a balanced clustering is $1$ -balanced.

Consider any metric space $(\Omega_{1},d_{1})$ and a subset $S_{1}\subseteq\Omega_{1}$ . For any cluster $Q$ and a point $p$ , $d_{1}(p,Q)=\max_{q\in Q}d_{1}(p,q)$ . The center of $Q$ in $S_{1}$ is the point, $\arg\min_{p\in S_{1}}d_{1}(p,Q)$ . The radius of $Q$ w.r.t. $S_{1}$ and $d_{1}$ , denoted by $r_{(S_{1},d_{1})}(Q)$ , is the distance between $Q$ and its center in $S_{1}$ , i.e., $r_{(S_{1},d_{1})}(Q)=\min_{p\in S_{1}}d_{1}(p,Q)$ . We refer to the sum of the radii, w.r.t. $S_{1}$ and $d_{1}$ , of the clusters in any clustering $\mathcal{C}$ as the cost of $\mathcal{C}$ w.r.t. $S_{1}$ and $d_{1}$ and denote it by cost ${}_{(S_{1},d_{1})}(\mathcal{C})$ .
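As a minimal sketch of these definitions, the functions below compute $d_{1}(p,Q)$ , the radius of a cluster with respect to a candidate set, and the cost of a clustering. The function names and the toy instance on the real line are assumptions for illustration.

```python
def dist_to_cluster(p, cluster, d):
    """d_1(p, Q): the distance from p to the farthest point of the cluster."""
    return max(d(p, q) for q in cluster)


def radius(cluster, candidates, d):
    """r_{(S_1, d_1)}(Q): best achievable radius over candidate centers in S_1."""
    return min(dist_to_cluster(p, cluster, d) for p in candidates)


def cost(clustering, candidates, d):
    """Sum of the radii of all clusters with respect to the candidate set."""
    return sum(radius(c, candidates, d) for c in clustering)


# Hypothetical points on the real line with the absolute-difference metric.
line_metric = lambda a, b: abs(a - b)
points = [0, 4, 10, 20]
r = radius([0, 4, 10], points, line_metric)
total = cost([[0, 4, 10], [20]], points, line_metric)
```

Here the best center for the cluster $\{0,4,10\}$ is the point 4, giving radius 6, and the singleton cluster has radius 0, so the cost is 6.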

We note that the term “ $(t,k)$ -fairness” refers specifically to the two-color case, where each cluster must maintain a red-to-blue ratio within $[1/t,t]$ . This notion does not naturally extend to more than two colors. In contrast, in the multi-color setting with $\ell\geq 2$ groups, we adopt the term “balanced clustering” (or “ $(1,k)$ -fair clustering with $\ell$ groups”) to describe the setting where each cluster must contain an equal number of points from each group. While we use similar notation for consistency, these two notions are structurally different and should be interpreted accordingly.

3 The Algorithm for $(t,k)$ -Fair Sum-of-Radii Clustering

In this section, we prove Theorem 1. To set up the stage, we define the following problem.

Min-cost Degree Constrained Subgraph (Min-cost DCS).

A Degree Constrained Subgraph (DCS) $H=(V,E^{\prime})$ of a graph $G=(V,E)$ is a subgraph such that the degree of each vertex $v$ in $H$ is in the range $[l(v),u(v)]$ for given integers $l(v)$ and $u(v)$ . Suppose we are also given a weight function $w:E\rightarrow\mathbb{R}^{+}\cup\{0\}$ . A min-cost DCS $H=(V,E^{\prime})$ of $G$ is a DCS that minimizes the sum of the weights of the edges in $E^{\prime}$ over all DCS.

Proposition 3 ([27]).

Min-cost DCS can be solved in $O(|V|^{4})$ time.

The proposition follows from the work of Gabow (Theorem 5.2) [27]. There the stated time complexity is $O((\sum_{i\in V}u_{i})\min\{|E|\log|V|,|V|^{2}\})$ , which is $O(|V|^{4})$ , as each upper bound $u_{i}$ can be assumed to be at most the degree of the $i$ -th vertex. One technicality is that Gabow studies the maximization version (with real weights), but the minimization version can be solved by the standard method of negating the edge weights of the min-cost DCS instance. See also [45], which has a similar discussion and an $O(|V|^{6})$ -time algorithm for min-cost DCS, there called minimum-cost many-to-many matching with demands and capacities.

Observation 4 ( $*$ ).

A min-cost DCS with $l(v)=1$ for all $v\in V$ does not contain a path of length three, and thus it is a disjoint union of star graphs.

Our algorithm is as follows.

The Algorithm

1. Construct a graph $G=(V,E)$ where $V=P_{1}\cup P_{2}$ and $E=\{\{p,q\}\mid p\in P_{1},q\in P_{2}\}$ . Define the weight function $w$ such that for each edge $e=\{p,q\}$ , $w(e)=d(p,q)$ . Compute a min-cost DCS $H=(V,E^{\prime})$ of $G$ with $l(v)=1$ and $u(v)=t$ for all $v\in V$ .

2. Construct an edge-weighted graph $G^{\prime}$ in the following way: For each $p\in\Omega$ , add a vertex to $G^{\prime}$ ; For each star $S$ in $H$ , add a vertex corresponding to $S$ to $G^{\prime}$ , which we also call by $S$ ; For each $p,q\in\Omega$ , add the edge $\{p,q\}$ to $G^{\prime}$ with weight $d(p,q)$ ; For all $p\in\Omega$ and $S$ in $H$ , add the edge $\{p,S\}$ to $G^{\prime}$ with weight $\max_{q\in S}d(p,q)$ . Let $d^{\prime}$ be the shortest path metric in $G^{\prime}$ . Construct the metric space $(\Omega^{\prime},d^{\prime})$ where $\Omega^{\prime}$ is the subset of vertices in $G^{\prime}$ corresponding to the stars in $H$ .

3. Compute a sum-of-radii clustering $X=\{X_{1},\ldots,X_{k}\}$ of the points in $\Omega^{\prime}$ using the algorithm of Buchem et al. [14] (with $\Omega^{\prime}$ also being the candidate set of centers).

4. Compute a clustering $X^{\prime}$ of the points in $P_{1}\cup P_{2}$ using $X$ in the following way. For each cluster $X_{i}\in X$ , add the cluster $\cup_{p\in S\mid S\in X_{i}}\{p\}$ to $X^{\prime}$ . Return $X^{\prime}$ .
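Step 1 can be illustrated on a tiny instance with the sketch below. It is not Gabow's algorithm [27]: it swaps in exhaustive search over edge subsets, which takes exponential time and is meant only to make the definition of a min-cost DCS concrete. The function names are hypothetical, and points are assumed to be distinct hashable values. Steps 2 to 4 are not covered.

```python
from itertools import combinations


def min_cost_dcs_bruteforce(red, blue, d, t):
    """Exhaustive min-cost DCS on the complete bipartite graph (tiny inputs only).

    Returns (edges, weight) such that every vertex has degree in [1, t],
    or (None, inf) if no such subgraph exists.
    """
    edges = [(p, q) for p in red for q in blue]
    n = len(red) + len(blue)
    best, best_w = None, float("inf")
    # At least max(|red|, |blue|) edges are needed to cover every vertex.
    for size in range(max(len(red), len(blue)), len(edges) + 1):
        for subset in combinations(edges, size):
            deg = {}
            for p, q in subset:
                deg[p] = deg.get(p, 0) + 1
                deg[q] = deg.get(q, 0) + 1
            if len(deg) != n or max(deg.values()) > t:
                continue  # some vertex has degree 0 or degree above t
            w = sum(d(p, q) for p, q in subset)
            if w < best_w:
                best, best_w = subset, w
    return best, best_w


def stars_of(edges):
    """Group the endpoints of the edges into connected components (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q in edges:
        parent[find(p)] = find(q)
    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


# Hypothetical instance on the real line: red = {0, 10}, blue = {1, 2, 11}, t = 2.
line_metric = lambda a, b: abs(a - b)
H_edges, H_weight = min_cost_dcs_bruteforce([0, 10], [1, 2, 11], line_metric, 2)
stars = stars_of(H_edges)
```

On this instance the cheapest feasible subgraph uses the edges $\{0,1\}$ , $\{0,2\}$ , and $\{10,11\}$ with total weight 4, and its components are the two stars $\{0,1,2\}$ and $\{10,11\}$ , each of which is 2-balanced.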

Next, we analyze the algorithm. First, we have the following observation.

Observation 5 ( $*$ ).

$X^{\prime}$ is a $(t,k)$ -fair clustering of $P_{1}\cup P_{2}$ .

Next, we analyze the approximation factor. Let $\mathcal{C}^{*}=\{C_{1}^{*},C_{2}^{*},\ldots,C_{k}^{*}\}$ be a fixed optimal $(t,k)$ -fair clustering. We will prove the following lemma. Our result follows as a corollary.

Lemma 6.

Consider the clustering $X$ of $\Omega^{\prime}$ constructed in Step 3 of the algorithm. Then cost ${}_{(\Omega^{\prime},d^{\prime})}(X)\leq\mathbf{(48+\epsilon)}\cdot\sum_{i=1}^{k}r_{(\Omega,d)}(C_{i}^{*})$ .

Corollary 7 ( $*$ ).

Consider the clustering $X^{\prime}$ of $P_{1}\cup P_{2}$ constructed in Step 4 of the algorithm. Then cost ${}_{(\Omega,d)}(X^{\prime})\leq\mathbf{(144+\epsilon)}\cdot\sum_{i=1}^{k}r_{(\Omega,d)}(C_{i}^{*})$ . Thus, our algorithm is a $\mathbf{(144+\epsilon)}$ -approximation algorithm.

3.1 Proof of Lemma 6

In the following, we are going to prove Lemma 6. Consider the min-cost DCS $H=(V,E^{\prime})$ computed in Step 1. Also, consider the optimal clusters in $\mathcal{C}^{*}$ . We construct a new clustering $\hat{\mathcal{C}}=\{\hat{C}_{1},\hat{C}_{2},\ldots,\hat{C}_{\kappa}\}$ by merging clusters in $\mathcal{C}^{*}$ in the following way, where $1\leq\kappa\leq k$ . Initially, we set $\hat{\mathcal{C}}$ to $\mathcal{C}^{*}$ . For each edge $\{p,q\}$ of $E^{\prime}$ such that $p\in\hat{C}_{i},q\in\hat{C}_{j}$ and $i\neq j$ , replace $\hat{C}_{i},\hat{C}_{j}$ in $\hat{\mathcal{C}}$ by their union and denote it by $\hat{C}_{i}$ as well.

When the above merging procedure ends, by renaming the indexes, let $\hat{\mathcal{C}}\!=\!\{\hat{C}_{1},\hat{C}_{2},\ldots,\hat{C}_{\kappa}\}$ be the new clustering. Then, we have the following observation.
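The merging procedure amounts to taking connected components of the optimal clusters under the edges of $H$ , so the order in which edges are processed does not affect the final superclusters. A minimal union-find sketch follows; the function name and the toy data are hypothetical, and the optimal clustering is of course not available to the algorithm itself, only to the analysis.

```python
def merge_clusters(optimal_clusters, dcs_edges):
    """Merge clusters joined by an edge of H and return the superclusters."""
    cluster_of = {p: i for i, c in enumerate(optimal_clusters) for p in c}
    parent = list(range(len(optimal_clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for p, q in dcs_edges:
        a, b = find(cluster_of[p]), find(cluster_of[q])
        if a != b:
            parent[a] = b  # edge {p, q} crosses two current clusters: merge them
    merged = {}
    for i, c in enumerate(optimal_clusters):
        merged.setdefault(find(i), set()).update(c)
    return list(merged.values())


# Hypothetical instance: four optimal clusters and a star forest H with t = 2.
clusters = [{"r1", "b1"}, {"r2", "b2"}, {"r3", "b3"}, {"r4", "b4"}]
H_edges = [("r1", "b1"), ("r1", "b2"), ("r2", "b3"), ("r3", "b3"), ("r4", "b4")]
superclusters = merge_clusters(clusters, H_edges)
```

In this example the edge $\{r_1,b_2\}$ joins the first two clusters and $\{r_2,b_3\}$ joins the result with the third, so $\kappa=2$ : one supercluster with six points and one with two.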

Observation 8.

Consider any star $S$ in $H$ . Then, for some $1\leq i\leq\kappa$ , all the points of $S$ are contained in $\hat{C}_{i}$ .

Consider the clustering $\mathcal{C}^{\prime}=\{C_{1}^{\prime},\ldots,C^{\prime}_{\kappa}\}$ of $\Omega^{\prime}$ defined in the following way. For each star $S$ in $H$ , identify the cluster $\hat{C}_{i}$ in $\hat{\mathcal{C}}$ that contains all the points in $S$ . By Observation 8, such an index $i$ exists. Assign the point $p$ in $\Omega^{\prime}$ corresponding to $S$ to $C_{i}^{\prime}$ .

Lemma 9 ( $*$ ).

cost ${}_{(\Omega^{\prime},d^{\prime})}(\mathcal{C}^{\prime})\leq 2\cdot$ cost ${}_{(\Omega,d)}(\hat{\mathcal{C}})$ .

We will prove the following lemma.

Lemma 10.

cost ${}_{(\Omega,d)}(\hat{\mathcal{C}})\leq 8\cdot\sum_{i=1}^{k}r_{(\Omega,d)}(C_{i}^{*})$ .

Lemma 6 follows from Lemmas 9 and 10, noting that the algorithm of Buchem et al. [14] yields a $(3+\epsilon)$ -factor approximation to the optimal clustering (along with an appropriate scaling of $\epsilon$ ). In the rest of this section, we prove Lemma 10.
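The arithmetic behind the factor $48+\epsilon$ can be spelled out as follows. This is a sketch under two assumptions that are not stated explicitly above: the algorithm of [14] is run with accuracy parameter $\epsilon^{\prime}=\epsilon/16$ , and a clustering with $\kappa\leq k$ clusters, such as $\mathcal{C}^{\prime}$ , is a feasible solution for the instance on $(\Omega^{\prime},d^{\prime})$ .

```latex
\begin{align*}
\mathrm{cost}_{(\Omega',d')}(X)
  &\le (3+\epsilon')\cdot \mathrm{cost}_{(\Omega',d')}(\mathcal{C}')
     && \text{(guarantee of [14], since $\mathcal{C}'$ is feasible)}\\
  &\le (3+\epsilon')\cdot 2\cdot \mathrm{cost}_{(\Omega,d)}(\hat{\mathcal{C}})
     && \text{(Lemma 9)}\\
  &\le (3+\epsilon')\cdot 2\cdot 8\cdot \sum_{i=1}^{k} r_{(\Omega,d)}(C_i^{*})
     && \text{(Lemma 10)}\\
  &= (48+16\epsilon')\cdot \sum_{i=1}^{k} r_{(\Omega,d)}(C_i^{*})
   = (48+\epsilon)\cdot \sum_{i=1}^{k} r_{(\Omega,d)}(C_i^{*}).
\end{align*}
```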

3.2 Proof of Lemma 10

For simplicity of notation, we drop $(\Omega,d)$ from $r_{(\Omega,d)}(.)$ , as henceforth centers are always assumed to be in $\Omega$ and the metric to be $d$ . Let us consider any fixed $\hat{C}_{i}$ , and suppose it is constructed by merging the clusters $C_{i_{1}}^{*},C_{i_{2}}^{*},\ldots,C_{i_{\tau}}^{*}$ . It is sufficient to prove that $r(\hat{C}_{i})\leq 8\cdot\sum_{j=1}^{\tau}r(C_{i_{j}}^{*})$ . For simplicity of notation, we rename $\hat{C}_{i}$ by $\hat{C}$ , and $C_{i_{1}}^{*},C_{i_{2}}^{*},\ldots,C_{i_{\tau}}^{*}$ by $C_{1}^{*},C_{2}^{*},\ldots,C_{\tau}^{*}$ .

Let $H_{1}=(V_{1},E_{1})$ be the induced subgraph of $H$ such that the vertices of $V_{1}$ are in $\hat{C}$ . We refer to a point of $P_{1}$ (resp. $P_{2}$ ) as a red (resp. blue) point. Note that the edges of $H$ are across red and blue points. Next, we construct an edge-weighted, directed multi-graph $G^{*}=(V^{*},E^{*})$ as follows. $G^{*}$ has a vertex $v_{j}$ corresponding to each cluster $C_{j}^{*}$ , where $1\leq j\leq\tau$ . There is an edge $e=(v_{i},v_{j})$ from $v_{i}$ to $v_{j}$ for each $p\in P_{1}\cap C_{i}^{*}$ and $q\in P_{2}\cap C_{j}^{*}$ such that $\{p,q\}$ is in $E_{1}$ . We refer to such an edge as a $0$ -edge, i.e., its parity is 0. The weight $\omega_{e}$ of the edge $e$ is $d(p,q)$ . Similarly, there is a $1$ -edge (or parity 1 edge) $e=(v_{i},v_{j})$ from $v_{i}$ to $v_{j}$ for each $p\in P_{2}\cap C_{i}^{*}$ and $q\in P_{1}\cap C_{j}^{*}$ such that $\{p,q\}$ is in $E_{1}$ . The weight $\omega_{e}$ of the edge $e$ is $d(p,q)$ . For each edge $e_{i}\in E^{*}$ , we denote the corresponding edge in $E_{1}$ by $\{r_{i},b_{i}\}$ , where $r_{i}$ is the red point and $b_{i}$ is the blue point. For simplicity of exposition, we are going to make heavy use of this correspondence.

Observation 11 ( $*$ ).

Suppose there is a $0$ -edge (resp. $1$ -edge) $(v_{i},v_{j})$ in $E^{*}$ . Then there is also a $1$ -edge (resp. $0$ -edge) $(v_{j},v_{i})$ in $E^{*}$ .

A directed path (or simply a path) $\pi=\{u_{1},\ldots,u_{l}\}$ from $u_{1}$ to $u_{l}$ in $G^{*}$ is a sequence of distinct vertices such that $(u_{i},u_{i+1})$ is in $G^{*}$ for all $1\leq i\leq l-1$ . We say that $\pi$ contains the edges $(u_{i},u_{i+1})$ . If $\pi$ contains all 0-edges (resp. 1-edges), it is called a 0-path (resp. 1-path). Two consecutive edges $e_{1}=(u_{i},u_{i+1}),e_{2}=(u_{i+1},u_{i+2})$ on $\pi$ are said to form a switch if they have different parity. We say that the switch happens at $u_{i+1}$ and it is the corresponding switching vertex. The switch is called a $b$ -switch if the parity of $e_{1}$ is $b$ for $b\in\{0,1\}$ . A directed cycle is formed from $\pi$ by adding the edge $(u_{l},u_{1})$ (if any) with it. The reverse path of $\pi$ is the path $\{u_{l},\ldots,u_{1}\}$ that contains the edges $(u_{i+1},u_{i})$ for all $1\leq i\leq l-1$ . Such edges exist according to Observation 11. A 0-path (resp. 1-path) in a subgraph of $G^{*}$ starting at $v_{i}$ and ending at $v_{j}$ is called maximal if $v_{j}$ does not have any outgoing 0-edges (resp. 1-edges) in the subgraph.
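The notion of a switch depends only on the sequence of edge parities along a path. The following minimal sketch counts switches and reports their types; representing a path by its list of parities and the function names are assumptions made for illustration.

```python
def count_switches(parities):
    """Number of consecutive edge pairs on a path whose parities differ."""
    return sum(1 for a, b in zip(parities, parities[1:]) if a != b)


def switch_types(parities):
    """Type b of each switch, i.e., the parity of the first edge of the pair."""
    return [a for a, b in zip(parities, parities[1:]) if a != b]


# Hypothetical path with five edges: parities 0, 0, 1, 1, 0.
example = [0, 0, 1, 1, 0]
num = count_switches(example)   # a 0-switch followed by a 1-switch
types = switch_types(example)
```

A path whose parity list is constant has no switches, so it is a 0-path or a 1-path. A minimum-switch path between two vertices minimizes `count_switches` over all directed paths between them.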

Observation 12 ( $*$ ).

For any two vertices $v_{i},v_{j}\in V^{*}$ , there is a directed path from $v_{i}$ to $v_{j}$ in $G^{*}$ .

Consider any two vertices $v_{\alpha}$ and $v_{\beta}$ of $G^{*}$ . Let $\pi^{*}=\{v_{\alpha}=u_{1},\ldots,u_{l}=v_{\beta}\}$ be a directed path from $v_{\alpha}$ to $v_{\beta}$ having the minimum number of switches, i.e., a minimum-switch path from $v_{\alpha}$ to $v_{\beta}$ .
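The analysis only needs a minimum-switch path to exist, but the notion can be made concrete with a small sketch. The code below is an illustration, not part of the paper's algorithm: it runs a 0-1 breadth-first search over states (vertex, parity of the last edge) and returns the minimum number of switches. The function name and the edge-list format are hypothetical, and the search ranges over walks, so it does not itself enforce that the vertices are distinct.

```python
from collections import deque


def min_switches(edges, source, target):
    """Minimum number of parity switches over directed walks source -> target.

    edges: iterable of (tail, head, parity) with parity in {0, 1}.
    Returns None if target is unreachable from source.
    """
    if source == target:
        return 0
    adj = {}
    for u, v, parity in edges:
        adj.setdefault(u, []).append((v, parity))

    dist = {}
    queue = deque()
    # The first edge of a walk never counts as a switch.
    for v, parity in adj.get(source, []):
        if (v, parity) not in dist:
            dist[(v, parity)] = 0
            queue.append((v, parity))

    while queue:
        u, last = queue.popleft()
        cost = dist[(u, last)]
        for v, parity in adj.get(u, []):
            step = 0 if parity == last else 1  # a switch costs 1
            if cost + step < dist.get((v, parity), float("inf")):
                dist[(v, parity)] = cost + step
                if step == 0:
                    queue.appendleft((v, parity))
                else:
                    queue.append((v, parity))

    reached = [dist[(target, b)] for b in (0, 1) if (target, b) in dist]
    return min(reached) if reached else None
```

Tracking the parity of the last edge in the state is what lets a switch be charged exactly when two consecutive edges differ in parity.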

We prove the following lemma.

Lemma 13.

$\sum_{e\in\pi^{*}}\omega_{e}\leq 6\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$ . Moreover, if $\pi^{*}$ does not have a switch, $\sum_{e\in\pi^{*}}\omega_{e}\leq 4\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$ .

Before proving this lemma, we show how it implies Lemma 10. Consider any point $p$ in $\hat{C}_{j}$, and let $p\in C_{g}^{*}$. Let $q$ be a point in $\hat{C}_{j}$ that is farthest from $p$, and let $q\in C_{h}^{*}$. By Lemma 13, there is a path, say $\pi^{\prime}$, from $v_{g}$ to $v_{h}$ whose sum of edge weights is at most $6\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$. Then, $r(\hat{C}_{j})\leq d(p,q)\leq\sum_{e\in\pi^{\prime}}\omega_{e}+\sum_{\text{vertex }v_{i}\in\pi^{\prime}}2\cdot r(C_{i}^{*})\leq 8\cdot\sum_{i=1}^{\tau}r(C_{i}^{*}).$ The last inequality holds because the vertices of $\pi^{\prime}$ are distinct, so the second sum is at most $2\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$.

Summing over all clusters $\hat{C}_{j}$ in $\hat{\mathcal{C}}$ , we obtain Lemma 10.

3.3 Proof of Lemma 13

The overall idea is to show the existence of a subset of edges $E_{2}^{\prime}\subset E$ , such that the set of edges $(E^{\prime}\setminus\pi^{*})\cup E_{2}^{\prime}$ form a valid degree-constrained subgraph of $G$ on the set of vertices $P_{1}\cup P_{2}$ . Additionally, we need that the total weight of the edges of $E_{2}^{\prime}$ is small. Then we can show that the weight of $\pi^{*}$ is also small, as $H=(V,E^{\prime})$ is a min-cost DCS of $G$ . However, it might not be possible to remove only the edges of $\pi^{*}$ from $E^{\prime}$ to show the existence of such a set $E_{2}^{\prime}$ . We show that there is a subset $E_{1}^{\prime}\subseteq E^{\prime}$ that contains the edges of $\pi^{*}$ and can be removed to obtain such a valid degree-constrained subgraph. In the following, we prove that obtaining two such sets $E_{1}^{\prime}$ and $E_{2}^{\prime}$ is sufficient to prove Lemma 13. For a set of edges $S\subseteq E$ , let $w(S)=\sum_{e\in S}w(e)$ .

Lemma 14.

Suppose there are $E_{1}^{\prime}\subseteq E^{\prime},E_{2}^{\prime}\subset E$ , such that the set of edges $(E^{\prime}\setminus E_{1}^{\prime})\cup E_{2}^{\prime}$ forms a valid degree-constrained subgraph of $G$ and $w(E_{2}^{\prime})\leq 6\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$ . Then, $w(E_{1}^{\prime})\leq 6\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$ .

Proof.

Note that $H$ is a min-cost DCS of $G$ . Consider the graph $H^{\prime}$ induced by the set of edges $(E^{\prime}\setminus E_{1}^{\prime})\cup E_{2}^{\prime}$ . By our assumption, $H^{\prime}$ is a valid DCS of $G$ . It follows that,

$\displaystyle w(E^{\prime})\leq w((E^{\prime}\setminus E_{1}^{\prime})\cup E_{2}^{\prime}),\text{ or, }w(E_{1}^{\prime})\leq w(E_{2}^{\prime})\leq 6\cdot\sum_{i=1}^{\tau}r(C_{i}^{*}).$

The last inequality follows from our assumption. $\hfill\blacktriangleleft$
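The passage from the first inequality to the second in the proof of Lemma 14 can be spelled out as follows. This is a sketch that assumes the edge weights are non-negative (they are metric distances in this setting) and uses $E_{1}^{\prime}\subseteq E^{\prime}$.

```latex
\begin{align*}
w(E') &\le w\bigl((E'\setminus E_1')\cup E_2'\bigr) \\
      &\le w(E'\setminus E_1') + w(E_2') \\
      &= w(E') - w(E_1') + w(E_2').
\end{align*}
```

Cancelling $w(E^{\prime})$ on both sides gives $w(E_{1}^{\prime})\leq w(E_{2}^{\prime})$.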

Assuming that the conditions of the above lemma are true, we finish the proof of Lemma 13.

	$\displaystyle\sum_{e\in\pi^{*}}\omega_{e}$	$\displaystyle=\sum_{(v_{i},v_{j})\in\pi^{*}}\omega_{e}$
		$\displaystyle=\sum_{\{p,q\}\text{ corresponding to }(v_{i},v_{j})\in\pi^{*}\mid p\in C_{i}^{*},q\in C_{j}^{*}}w(\{p,q\})\leq w(E_{1}^{\prime})\leq 6\cdot\sum_{i=1}^{\tau}r(C_{i}^{*}).$

If $\pi^{*}$ does not have a switch, then we will show that $w(E_{2}^{\prime})\leq 4\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$ . Hence, the moreover part in Lemma 13 also follows. It is left to show the existence of such $E_{1}^{\prime}$ and $E_{2}^{\prime}$ .

3.4 Construction of $E_{1}^{\prime}$ and $E_{2}^{\prime}$

Two subgraphs $G_{1}$ and $G_{2}$ of $G^{*}$ are called 0-1-edge-disjoint if for any edge $e_{\eta^{1}}$ of $G_{1}$ and $e_{\eta^{2}}$ of $G_{2}$, the corresponding edges in $E^{\prime}$ are distinct. Thus, if $G_{1}$ contains a 0-edge $(v_{i},v_{j})$, and $G_{1}$ and $G_{2}$ are 0-1-edge-disjoint, then $G_{2}$ can contain neither the 0-edge $(v_{i},v_{j})$ nor the 1-edge $(v_{j},v_{i})$. Similarly, if $G_{1}$ contains a 1-edge $(v_{i},v_{j})$, and $G_{1}$ and $G_{2}$ are 0-1-edge-disjoint, then $G_{2}$ can contain neither the 1-edge $(v_{i},v_{j})$ nor the 0-edge $(v_{j},v_{i})$. Let $j^{1}<j^{2}<\ldots<j^{\lambda}$ be the indices of the vertices on $\pi^{*}=\{u_{1},\ldots,u_{l}\}$ where the switches occur. Note that $j^{1}>1,j^{\lambda}<l$. Denote the switch that occurs at $u_{j^{h}}$ by $b^{h}$ for all $1\leq h\leq\lambda$ (i.e., $b^{h}$ is the parity of $(u_{j^{h}-1},u_{j^{h}})$). Let $b^{0}$ be the parity of $(u_{1},u_{2})$ and $b^{\lambda+1}$ be the parity of $(u_{l-1},u_{l})$.

First, we consider the simple case when the parity of $(u_{l-1},u_{l})$ is 0 (resp. 1) and there is a 0-path (resp. 1-path) from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . Let us denote the latter path by $\pi(l)$ . Note that $\pi^{*}$ is a path having the minimum number of switches and the existence of $\pi(l)$ ensures that $\pi^{*}$ does not have a switch. Let $U_{0}\subseteq E^{*}$ be the subset of edges that lie on the paths in $\{\pi^{*}\}\cup\{\pi(l)\}$ . Next, we define a subset $E_{1}^{\prime}\subseteq E_{1}$ that has a one-to-one mapping with $U_{0}$ . In particular, consider any edge $(v_{i},v_{j})$ in $U_{0}$ . Note that if it is a $0$ -edge, it was added due to an edge $\{p,q\}$ in $E_{1}$ such that $p\in P_{1}\cap C_{i}^{*}$ and $q\in P_{2}\cap C_{j}^{*}$ . We add the edge $\{p,q\}$ to $E_{1}^{\prime}$ . Otherwise, if $(v_{i},v_{j})$ is a $1$ -edge, it was added due to an edge $\{p,q\}$ in $E_{1}$ such that $p\in P_{2}\cap C_{i}^{*}$ and $q\in P_{1}\cap C_{j}^{*}$ . In this case, we add the edge $\{p,q\}$ to $E_{1}^{\prime}$ .

Next, we show the construction of $E_{2}^{\prime}$ . Wlog, let us assume that $\pi^{*}$ is a 0-path. The other case is symmetric. Note that then $\pi(l)$ is also a 0-path as per our assumption. First, we describe the process of adding the replacement edges for the path $\pi^{*}$ . Consider any intermediate vertex (if any) $v_{j^{\prime}}$ on this path. Then, there are exactly two points in $C_{j^{\prime}}^{*}$ corresponding to the edges on $\pi^{*}$ , which are of opposite colors. We add an edge between these two points in $E_{2}^{\prime}$ (see Figure 1). Removal of the edges of $E_{1}^{\prime}$ corresponding to $\pi^{*}$ and the addition of this edge do not change the degree of the two points in $C_{j^{\prime}}^{*}$ . Similarly, we add edges to $E_{2}^{\prime}$ corresponding to the intermediate vertices of $\pi(l)$ . Next, consider the vertex $u_{l}=v_{i}$ . There is an incoming 0-edge on $\pi^{*}$ and an outgoing 0-edge on $\pi(l)$ that are incident on $v_{i}$ . Thus, there are exactly two points in $C_{i}^{*}$ of opposite colors corresponding to these two edges. We add an edge between these two points in $E_{2}^{\prime}$ . Removal of the edges of $E_{1}^{\prime}$ corresponding to those two edges, and the addition of this edge does not change the degree of the two points in $C_{i}^{*}$ . Similarly, consider the vertex $u_{1}=v_{i}^{\prime}$ . There is an outgoing 0-edge on $\pi^{*}$ and an incoming 0-edge on $\pi(l)$ that are incident on $v_{i}^{\prime}$ . Thus, there are exactly two points in $C_{i^{\prime}}^{*}$ of opposite colors corresponding to these two edges. We add an edge between these two points in $E_{2}^{\prime}$ . Again, the removal of the edges of $E_{1}^{\prime}$ corresponding to those two edges, and the addition of this edge does not change the degree of the two points in $C_{i^{\prime}}^{*}$ . See Figure 1 for an illustration.

Figure 1: Figure illustrating the construction of $E_{2}^{\prime}$ for $\{\pi^{*}\}\cup\{\pi(l)\}$. The bold (orange) edges are in $E_{1}^{\prime}$ and the dashed (purple) edges are in $E_{2}^{\prime}$.

By our construction, the set of edges $(E^{\prime}\setminus E_{1}^{\prime})\cup E_{2}^{\prime}$ forms a valid degree-constrained subgraph of $G$ on the set of vertices $P_{1}\cup P_{2}$. By the way we add the edges to $E_{2}^{\prime}$, both endpoints of each edge lie in a cluster $C_{j}^{*}$ such that the vertex $v_{j}$ corresponding to the cluster lies on a path in $\{\pi^{*}\}\cup\{\pi(l)\}$. Now, $v_{j}$ can lie either on one such path or on both. Thus, we add at most two edges to $E_{2}^{\prime}$ corresponding to $v_{j}$. The sum of the weights of these two edges is at most 2 times the diameter of $C_{j}^{*}$, i.e., at most $4\cdot r(C_{j}^{*})$. Hence $w(E_{2}^{\prime})\leq 4\cdot\sum_{i=1}^{\tau}r(C_{i}^{*})$, and by the argument of Lemma 14 (with 4 in place of 6), we obtain $\sum_{e\in\pi^{*}}\omega_{e}\leq 4\cdot\sum_{i=1}^{\tau}r(C_{i}^{*}).$
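The degree-preserving exchange in this simple case can be checked on a toy instance. The following Python sketch assumes the closed walk formed by $\pi^{*}$ followed by $\pi(l)$ consists of 0-edges and is given as a list of (red, blue) pairs, where the red point lies in the tail cluster and the blue point in the head cluster. The function names and this input format are hypothetical; the sketch does not check that the replacement edges are new to $E^{\prime}$.

```python
from collections import Counter


def replacement_edges(cycle_edges):
    """Edges of E_2' for a closed walk of 0-edges.

    cycle_edges[k] = (r_k, b_k): red point r_k in the tail cluster and blue
    point b_k in the head cluster of the k-th edge. At the head cluster of
    edge k, the incoming blue point b_k is joined to the red point of the
    next (outgoing) edge, so both endpoints lie in the same cluster.
    """
    m = len(cycle_edges)
    return [(cycle_edges[(k + 1) % m][0], cycle_edges[k][1]) for k in range(m)]


def degree_count(edges):
    """Number of edge incidences per point."""
    deg = Counter()
    for r, b in edges:
        deg[r] += 1
        deg[b] += 1
    return deg
```

Comparing `degree_count(cycle_edges)` with `degree_count(replacement_edges(cycle_edges))` confirms that every point loses and gains the same number of incident edges, which is why the exchanged edge set remains a valid degree-constrained subgraph.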

Next, we consider the remaining case when the parity of $(u_{l-1},u_{l})$ is 0 (resp. 1) and there is no 0-path (resp. 1-path) from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . Thus, there is no $b^{\lambda+1}$ -path from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ .

Consider a path $\pi$ and let $v_{i}$ denote its start vertex. Also, consider a cycle $O$ such that $\pi$ and $O$ have exactly one vertex $v_{j}$ in common. Note that $\pi$ might not have an edge, in which case $v_{j}=v_{i}$ . Let $D$ be the graph formed by the union of $\pi$ and $O$ , i.e., by gluing them together at $v_{j}$ . We refer to such a graph $D$ as a hanging cycle for $v_{i}$ with $v_{j}$ being the join vertex. $D$ is called a $b$ -hanging cycle if all the edges of $\pi$ and $O$ are $b$ -edges. Let $p$ be the point in the cluster $C_{j}^{*}$ corresponding to the edge of the cycle $O$ incoming to $v_{j}$ . Additionally, $D$ is called special if the degree of $p$ in $H_{1}$ is at least 2, and $p$ is called the special point of $D$ (see Figure 2).

Figure 2: A special hanging cycle with the special point $p$.

In the current case, we need the following lemmas.

Lemma 15 ( $*$ ).

Suppose the parity of $(u_{l-1},u_{l})$ is 0 (resp. 1), and there is no 0-path (resp. 1-path) from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . Moreover, suppose there is no special 0-hanging cycle (resp. 1-hanging cycle) in $G^{*}$ for $u_{l}$ that is 0-1-edge-disjoint from $\pi^{*}$ . Then, there exists a $0$ -path (resp. 1-path) $\pi_{1}$ in $G^{*}$ from $u_{l}$ to a vertex $v_{j}$ , such that $\pi_{1}$ is 0-1-edge-disjoint from $\pi^{*}$ and one of the following is true: (i) the degree of $b_{\eta}$ (resp. $r_{\eta}$ ) in $H_{1}$ is at least 2, where $e_{\eta}$ is the last edge on $\pi_{1}$ if it has an edge or $(u_{l-1},u_{l})$ otherwise; or (ii) $C_{j}^{*}$ has a red (resp. blue) point whose degree in $H_{1}$ is at most $t-1$ .

Lemma 16 ( $*$ ).

Suppose the parity of $(u_{1},u_{2})$ is 1 (resp. 0), and there is no 1-path (resp. 0-path) from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$. Moreover, suppose there is no special 0-hanging cycle (resp. 1-hanging cycle) in $G^{*}$ for $u_{1}$ that is 0-1-edge-disjoint from $\pi^{*}$. Then, there exists a $0$-path (resp. 1-path) $\pi_{1}$ from $u_{1}$ to a vertex $v_{j}$, such that $\pi_{1}$ is 0-1-edge-disjoint from $\pi^{*}$ and one of the following is true: (i) the degree of $b_{\eta}$ (resp. $r_{\eta}$) in $H_{1}$ is at least 2, where $e_{\eta}$ is the last edge on $\pi_{1}$ if it has an edge or $(u_{2},u_{1})$ otherwise; or (ii) $C_{j}^{*}$ has a red (resp. blue) point whose degree in $H_{1}$ is at most $t-1$.

Next, we apply the above lemmas to show the construction of $E_{1}^{\prime}$ and $E_{2}^{\prime}$ . Recall that in this case there is no $b^{\lambda+1}$ -path from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . If $\pi^{*}$ has a switch, then there is no 0-path or 1-path from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . Otherwise, it must be that $b^{0}=b^{\lambda+1}$ , and hence by our assumption, there is no $b^{0}$ -path from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . We conclude that in this case ( $b^{0}=b^{\lambda+1}$ ), there is no $b^{0}$ -path from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . Then, by Lemma 16 it follows that, either there is a $(1-b^{0})$ -hanging cycle for $u_{1}$ in $G^{*}$ 0-1-edge-disjoint from $\pi^{*}$ , or a $(1-b^{0})$ -path starting from $u_{1}$ in $G^{*}$ with special properties. This is true, as the parity of $(u_{1},u_{2})$ is $b^{0}$ . We denote this structure by $\pi(0)$ . Additionally, if $\pi(0)$ is a path, we call $b_{\eta}$ (resp. $r_{\eta}$ ) an anchor point if its degree in $H_{1}$ is at least 2. Similarly, by Lemma 15 it follows that, either there is a $b^{\lambda+1}$ -hanging cycle for $u_{l}$ in $G^{*}$ 0-1-edge-disjoint from $\pi^{*}$ , or a $b^{\lambda+1}$ -path starting from $u_{l}$ in $G^{*}$ with special properties. This is true, as $(u_{l-1},u_{l})$ is a $b^{\lambda+1}$ -edge. We denote this structure by $\pi(\lambda+1)$ . Additionally, if $\pi(\lambda+1)$ is a path, we call $b_{\eta}$ (resp. $r_{\eta}$ ) an anchor point if its degree in $H_{1}$ is at least 2.

Now, if $\mathbf{1-b^{0}\neq b^{\lambda+1}}$ , then $\pi(0)$ and $\pi(\lambda+1)$ must be vertex-disjoint. If they are not vertex-disjoint, there exists a $b^{\lambda+1}$ -path from $u_{l}$ to $u_{1}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ : take the edges on $\pi(\lambda+1)$ from $u_{l}$ to a common vertex and the reverse of $\pi(0)$ , from the common vertex to $u_{1}$ . These reverse edges have parity opposite of $1-b^{0}$ , i.e., the same as $b^{\lambda+1}$ . But, by our assumption, such a $b^{\lambda+1}$ -path does not exist. Hence, $\pi(0)$ and $\pi(\lambda+1)$ are vertex-disjoint.

In the other case, $\mathbf{1-b^{0}=b^{\lambda+1}}$. We note that if $\pi^{*}$ has no switch, $b^{0}=b^{\lambda+1}$. Thus, if $1-b^{0}=b^{\lambda+1}$, then $\pi^{*}$ has at least one switch. In this case, suppose there exist a $(1-b^{0})$-hanging cycle for $u_{1}$ and a $b^{\lambda+1}$-hanging cycle for $u_{l}$ in $G^{*}$, such that the two are 0-1-edge-disjoint from each other, each of the hanging cycles is 0-1-edge-disjoint from $\pi^{*}$, and either the special points of the two are distinct or the special points are the same and the degree of that point in $H_{1}$ is at least 3. Then, we take the hanging cycle for $u_{1}$ as $\pi(0)$ and the one for $u_{l}$ as $\pi(\lambda+1)$. Otherwise, if there is a $(1-b^{0})$-hanging cycle for $u_{1}$ in $G^{*}$ 0-1-edge-disjoint from $\pi^{*}$ or a $b^{\lambda+1}$-hanging cycle for $u_{l}$ in $G^{*}$ 0-1-edge-disjoint from $\pi^{*}$, we consider one of those. Assume that the former holds. The other case is symmetric. We take such a hanging cycle for $u_{1}$ as $\pi(0)$. Then, one can prove that there is a $b^{\lambda+1}$-path from $u_{l}$ in $G^{*}$ with special properties (Lemma 8 in the full version). We take this $b^{\lambda+1}$-path as $\pi(\lambda+1)$. Otherwise, there is neither a $(1-b^{0})$-hanging cycle for $u_{1}$ in $G^{*}$ 0-1-edge-disjoint from $\pi^{*}$ nor a $b^{\lambda+1}$-hanging cycle for $u_{l}$ in $G^{*}$ 0-1-edge-disjoint from $\pi^{*}$. Then, one can prove that there are two 0-1-edge-disjoint paths with parity $1-b^{0}=b^{\lambda+1}$, from $u_{1}$ and $u_{l}$, respectively, such that both are also 0-1-edge-disjoint from $\pi^{*}$ (Lemma 9 in the full version). In this case, we take the path from $u_{1}$ as $\pi(0)$ and the path from $u_{l}$ as $\pi(\lambda+1)$.

For all $1\leq h\leq\lambda$ , if there are two 0-1-edge-disjoint $b^{h}$ -hanging cycles for $u_{j^{h}}$ in $G^{*}$ that are 0-1-edge-disjoint from $\pi^{*}$ and have distinct special points or the same special point of degree at least 3 in $H_{1}$ , denote them by $\pi_{1}(h)$ and $\pi_{2}(h)$ . Otherwise, if there is one $b^{h}$ -hanging cycle for $u_{j^{h}}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ , denote it by $\pi_{1}(h)$ . Now, $(u_{j^{h}-1},u_{j^{h}})$ is a $b^{h}$ -edge and $(u_{j^{h}},u_{j^{h}+1})$ is a $(1-b^{h})$ -edge. Then, one can prove that there is a $b^{h}$ -path starting from $u_{j^{h}}$ in $G^{*}$ with special properties (Lemma 8 in the full version). Denote this path by $\pi_{2}(h)$ . Otherwise, there is no $b^{h}$ -hanging cycle for $u_{j^{h}}$ in $G^{*}$ that is 0-1-edge-disjoint from $\pi^{*}$ . In this case, one can prove that there are two $b^{h}$ -paths starting from $u_{j^{h}}$ in $G^{*}$ with special properties. Denote them by $\pi_{1}(h)$ and $\pi_{2}(h)$ . Note that in all the cases, $\pi()$ , $\pi_{1}()$ or $\pi_{2}()$ can either be a path or a hanging cycle.

Construction of $E_{1}^{\prime}$ .

Let $U\subseteq E^{*}$ be the subset of edges that lie on the structures in $\mathcal{S}=\cup_{i=1}^{\lambda}(\{\pi_{1}(i)\}\cup\{\pi_{2}(i)\})\cup\{\pi(0),\pi(\lambda+1),\pi^{*}\}$. Next, we define the subset $E_{1}^{\prime}$ of $E^{\prime}$ that has a one-to-one mapping with $U$. In particular, consider any edge $(v_{i},v_{j})$ in $U$. Note that if it is a $0$-edge, it was added due to an edge $\{p,q\}$ in $E_{1}$ such that $p\in P_{1}\cap C_{i}^{*}$ and $q\in P_{2}\cap C_{j}^{*}$. We add the edge $\{p,q\}$ to $E_{1}^{\prime}$. Otherwise, if $(v_{i},v_{j})$ is a $1$-edge, it was added due to an edge $\{p,q\}$ in $E_{1}$ such that $p\in P_{2}\cap C_{i}^{*}$ and $q\in P_{1}\cap C_{j}^{*}$. We again add the edge $\{p,q\}$ to $E_{1}^{\prime}$.

Our proof is completed by the following two lemmas.

Lemma 17 ( $*$ ).

There is a subset of edges $E_{2}^{\prime}\subset E$ , such that the set of edges $(E^{\prime}\setminus E_{1}^{\prime})\cup E_{2}^{\prime}$ form a valid degree-constrained subgraph of $G$ on the set of vertices $P_{1}\cup P_{2}$ .

4 The Algorithm for Balanced Sum-of-Radii Clustering

In this section, we prove Theorem 2. Recall that we are given $\ell$ disjoint groups $P_{1},\ldots,P_{\ell}$ having $n$ points in total in a metric space $(\Omega=\cup_{i=1}^{\ell}P_{i},d)$ , such that $|P_{1}|=|P_{2}|=\ldots=|P_{\ell}|$ .

Our algorithm is as follows.

The Algorithm

1. For each $2\leq i\leq\ell$, construct a graph $G_{i}=(V_{i},E_{i})$ where $V_{i}=P_{1}\cup P_{i}$ and $E_{i}=\{\{p,q\}\mid p\in P_{1},q\in P_{i}\}$. Define the weight function $w_{i}$ such that for each edge $e=\{p,q\}$, $w_{i}(e)=d(p,q)$. Compute a minimum-weight (w.r.t. $w_{i}$) perfect matching $M_{i}$ of $G_{i}$. For each $p\in P_{1}$, let $S_{p}$ be the union of $\{p\}$ and the points from $P_{2},\ldots,P_{\ell}$ that are matched to $p$ in $M=\cup_{i=2}^{\ell}M_{i}$.
2. Construct an edge-weighted graph $G^{\prime}$ in the following way: for each $p\in\Omega$, add a vertex to $G^{\prime}$; for each $p\in P_{1}$, add a vertex corresponding to $S_{p}$ to $G^{\prime}$, which we also denote by $S_{p}$; for each $p,q\in\Omega$, add the edge $\{p,q\}$ to $G^{\prime}$ with weight $d(p,q)$; for all $p^{\prime}\in\Omega$ and $p\in P_{1}$, add the edge $\{p^{\prime},S_{p}\}$ to $G^{\prime}$ with weight $\max_{q\in S_{p}}d(p^{\prime},q)$. Let $d^{\prime}$ be the shortest path metric in $G^{\prime}$. Construct the metric space $(\Omega^{\prime},d^{\prime})$ where $\Omega^{\prime}$ is the subset of vertices $\{S_{p}\mid p\in P_{1}\}$ in $G^{\prime}$.
3. Compute a sum-of-radii clustering $X=\{X_{1},\ldots,X_{k}\}$ of the points in $\Omega^{\prime}$ using the algorithm of Buchem et al. [14] (with $\Omega^{\prime}$ also being the candidate set of centers).
4. Compute a clustering $X^{\prime}$ of the points in $\cup_{i=1}^{\ell}P_{i}$ using $X$ in the following way. For each cluster $X_{i}$, add the cluster $\cup_{q\in S_{p}\mid S_{p}\in X_{i}}\{q\}$ to $X^{\prime}$. Return $X^{\prime}$.
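Steps 1, 2 and 4 can be sketched in Python on tiny inputs. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the minimum-weight perfect matching is found by brute force over permutations (exponential time, used here only as a stand-in for a polynomial-time matching algorithm), the metric $d^{\prime}$ is computed with Floyd-Warshall, and Step 3 (the algorithm of Buchem et al. [14]) is not implemented, so the clustering `X` is taken as an input.

```python
from itertools import permutations


def min_weight_perfect_matching(P1, Pi, d):
    """Brute-force stand-in for Step 1; only suitable for tiny inputs."""
    best = min(permutations(Pi),
               key=lambda perm: sum(d(p, q) for p, q in zip(P1, perm)))
    return dict(zip(P1, best))


def build_groups(groups, d):
    """Step 1: S_p is p together with the points matched to p in each M_i."""
    P1 = groups[0]
    S = {p: [p] for p in P1}
    for Pi in groups[1:]:
        for p, q in min_weight_perfect_matching(P1, Pi, d).items():
            S[p].append(q)
    return S


def group_metric(points, S, d):
    """Step 2: shortest-path metric d' of G', restricted to the vertices S_p."""
    verts = [("pt", x) for x in points] + [("grp", p) for p in S]
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in verts} for u in verts}
    for x in points:
        for y in points:
            if x != y:
                dist[("pt", x)][("pt", y)] = d(x, y)
        for p, members in S.items():
            w = max(d(x, q) for q in members)  # weight of edge {x, S_p}
            dist[("pt", x)][("grp", p)] = w
            dist[("grp", p)][("pt", x)] = w
    for k in verts:  # Floyd-Warshall
        for u in verts:
            for v in verts:
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return {(p, q): dist[("grp", p)][("grp", q)] for p in S for q in S}


def expand(X, S):
    """Step 4: replace each S_p in a cluster of X by its member points."""
    return [[q for p in cluster for q in S[p]] for cluster in X]
```

For example, with two groups of points on a line, `groups = [[0, 10], [1, 11]]` and `d = lambda a, b: abs(a - b)`, `build_groups` pairs 0 with 1 and 10 with 11, and every cluster returned by `expand` contains the same number of points from each group by construction.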

Let $\mathcal{C}^{*}=\{C_{1}^{*},C_{2}^{*},\ldots,C_{k}^{*}\}$ be a fixed optimal balanced clustering.

We have the following lemma. Our main result follows as a corollary.

Lemma 18 ( $*$ ).

Consider the clustering $X$ of $\Omega^{\prime}$ constructed in Step 3 of the algorithm. Then $\mathrm{cost}_{(\Omega^{\prime},d^{\prime})}(X)\leq\mathbf{(60+\epsilon)}\cdot\sum_{i=1}^{k}r_{(\Omega,d)}(C_{i}^{*})$.

Corollary 19 ( $*$ ).

Consider the clustering $X^{\prime}$ of $\cup_{i=1}^{\ell}P_{i}$ constructed in Step 4 of the algorithm. Then $\mathrm{cost}_{(\Omega,d)}(X^{\prime})\leq\mathbf{(180+\epsilon)}\cdot\sum_{i=1}^{k}r_{(\Omega,d)}(C_{i}^{*})$. Thus, our algorithm is a $\mathbf{(180+\epsilon)}$-approximation algorithm.

References

[1] Anders Aamand, Justin Y. Chen, Allen Liu, Sandeep Silwal, Pattara Sukprasert, Ali Vakilian, and Fred Zhang. Constant approximation for individual preference stable clustering. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
[2] Mohsen Abbasi, Aditya Bhaskara, and Suresh Venkatasubramanian. Fair clustering via equitable group representations. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAccT), pages 504–514, 2021. doi:10.1145/3442188.3445913.
[3] Sara Ahmadian and Chaitanya Swamy. Approximation Algorithms for Clustering Problems with Lower Bounds and Outliers. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55 of Leibniz International Proceedings in Informatics (LIPIcs), pages 69:1–69:15, Dagstuhl, Germany, 2016. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2016.69.
[4] Georg Anegg, Haris Angelidakis, Adam Kurpisz, and Rico Zenklusen. A technique for obtaining true approximations for $k$ -center with covering constraints. In International conference on integer programming and combinatorial optimization, pages 52–65. Springer, 2020. doi:10.1007/978-3-030-45771-6_5.
[5] Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, and Tal Wagner. Scalable fair clustering. In International Conference on Machine Learning, pages 405–413, 2019. URL: http://proceedings.mlr.press/v97/backurs19a.html.
[6] Sayan Bandyapadhyay, Eden Chlamtáč, Yury Makarychev, and Ali Vakilian. A polynomial-time approximation for pairwise fair $k$ -median clustering. arXiv preprint arXiv:2405.10378, 2024. doi:10.48550/arXiv.2405.10378.
[7] Sayan Bandyapadhyay, Tanmay Inamdar, Shreyas Pai, and Kasturi Varadarajan. A constant approximation for colorful $k$ -center. In 27th Annual European Symposium on Algorithms (ESA 2019). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019.
[8] Sayan Bandyapadhyay, William Lochet, and Saket Saurabh. FPT constant-approximations for capacitated clustering to minimize the sum of cluster radii. In Erin W. Chambers and Joachim Gudmundsson, editors, 39th International Symposium on Computational Geometry, SoCG 2023, June 12-15, 2023, Dallas, Texas, USA, volume 258 of LIPIcs, pages 12:1–12:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.SOCG.2023.12.
[9] Babak Behsaz and Mohammad R. Salavatipour. On minimum sum of radii and diameters clustering. Algorithmica, 73(1):143–165, 2015. doi:10.1007/s00453-014-9907-3.
[10] Suman Bera, Deeparnab Chakrabarty, Nicolas Flores, and Maryam Negahbani. Fair algorithms for clustering. In Advances in Neural Information Processing Systems, pages 4954–4965, 2019.
[11] Ioana O Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R Schmidt, and Melanie Schmidt. On the cost of essentially fair clusterings. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.APPROX-RANDOM.2019.18.
[12] Matteo Böhm, Adriano Fazzone, Stefano Leonardi, and Chris Schwiegelshohn. Fair clustering with multiple colors. arXiv preprint arXiv:2002.07892, 2020. arXiv:2002.07892.
[13] B Brubach, D Chakrabarti, J Dickerson, A Srinivasan, and L Tsepenekas. Fairness, semi-supervised learning, and more: A general framework for clustering with stochastic pairwise constraints. In Proc. Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI), 2021.
[14] Moritz Buchem, Katja Ettmayr, Hugo KK Rosado, and Andreas Wiese. A ( $3+\epsilon$ )-approximation algorithm for the minimum sum of radii problem with outliers and extensions for generalized lower bounds. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1738–1765. SIAM, 2024.
[15] Lena Carta, Lukas Drexler, Annika Hennes, Clemens Rösner, and Melanie Schmidt. FPT Approximations for Fair $k$ -Min-Sum-Radii. In Julián Mestre and Anthony Wirth, editors, 35th International Symposium on Algorithms and Computation (ISAAC 2024), volume 322 of Leibniz International Proceedings in Informatics (LIPIcs), pages 16:1–16:18, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ISAAC.2024.16.
[16] Moses Charikar and Rina Panigrahy. Clustering to minimize the sum of cluster diameters. J. Comput. Syst. Sci., 68(2):417–441, 2004. doi:10.1016/j.jcss.2003.07.014.
[17] Danny Z Chen, Jian Li, Hongyu Liang, and Haitao Wang. Matroid and knapsack center problems. Algorithmica, 75(1):27–52, 2016. doi:10.1007/S00453-015-0010-1.
[18] Xianrun Chen, Dachuan Xu, Yicheng Xu, and Yong Zhang. Parameterized approximation algorithms for sum of radii clustering and variants. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 20666–20673, 2024. doi:10.1609/AAAI.V38I18.30053.
[19] Xingyu Chen, Brandon Fain, Liang Lyu, and Kamesh Munagala. Proportionally fair clustering. In International Conference on Machine Learning, pages 1032–1041, 2019. URL: http://proceedings.mlr.press/v97/chen19d.html.
[20] Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. Fair clustering through fairlets. In Advances in Neural Information Processing Systems, pages 5029–5037, 2017. URL: https://proceedings.neurips.cc/paper/2017/hash/978fce5bcc4eccc88ad48ce3914124a2-Abstract.html.
[21] Ashish Chiplunkar, Sagar Kale, and Sivaramakrishnan Natarajan Ramamoorthy. How to solve fair $k$ -center in massive data models. In Proceedings of the International Conference on Machine Learning (ICML), pages 1877–1886, 2020.
[22] Eden Chlamtáč, Yury Makarychev, and Ali Vakilian. Approximating fair clustering with cascaded norm objectives. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2664–2683, 2022. doi:10.1137/1.9781611977073.104.
[23] Zhen Dai, Yury Makarychev, and Ali Vakilian. Fair representation clustering with several protected classes. In FAccT ’22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21 - 24, 2022, pages 814–823. ACM, 2022. doi:10.1145/3531146.3533146.
[24] Lukas Drexler, Annika Hennes, Abhiruk Lahiri, Melanie Schmidt, and Julian Wargalla. Approximating fair $k$-min-sum-radii in Euclidean space. In International Workshop on Approximation and Online Algorithms, pages 119–133. Springer, 2023. doi:10.1007/978-3-031-49815-2_9.
[25] Arnold Filtser and Ameet Gadekar. FPT approximations for capacitated sum of radii and diameters. arXiv preprint arXiv:2409.04984, 2024. doi:10.48550/arXiv.2409.04984.
[26] Zachary Friggstad and Mahya Jamshidian. Improved Polynomial-Time Approximations for Clustering with Minimum Sum of Radii or Diameters. In Shiri Chechik, Gonzalo Navarro, Eva Rotenberg, and Grzegorz Herman, editors, 30th Annual European Symposium on Algorithms (ESA 2022), volume 244 of Leibniz International Proceedings in Informatics (LIPIcs), pages 56:1–56:14, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ESA.2022.56.
[27] Harold N Gabow. An efficient reduction technique for degree-constrained subgraph and bidirected network flow problems. In Proceedings of the fifteenth annual ACM symposium on Theory of computing, pages 448–456, 1983. doi:10.1145/800061.808776.
[28] Mehrdad Ghadiri, Samira Samadi, and Santosh S. Vempala. Socially fair $k$ -means clustering. In Madeleine Clare Elish, William Isaac, and Richard S. Zemel, editors, FAccT ’21: 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event / Toronto, Canada, March 3-10, 2021, pages 438–448. ACM, 2021. doi:10.1145/3442188.3445906.
[29] Mehrdad Ghadiri, Mohit Singh, and Santosh S Vempala. Constant-factor approximation algorithms for socially fair $k$ -clustering. arXiv preprint arXiv:2206.11210, 2022. doi:10.48550/arXiv.2206.11210.
[30] Matt Gibson, Gaurav Kanade, Erik Krohn, Imran A. Pirwani, and Kasturi R. Varadarajan. On metric clustering to minimize the sum of radii. Algorithmica, 57(3):484–498, 2010. doi:10.1007/s00453-009-9282-7.
[31] Matt Gibson, Gaurav Kanade, Erik Krohn, Imran A. Pirwani, and Kasturi R. Varadarajan. On clustering to minimize the sum of radii. SIAM J. Comput., 41(1):47–60, 2012. doi:10.1137/100798144.
[32] Swati Gupta, Jai Moondra, and Mohit Singh. Which $l_{p}$ norm is the fairest? approximations for fair facility location across all “ $p$ ”, 2022. arXiv:2211.14873.
[33] Pinar Heggernes and Daniel Lokshtanov. Optimal broadcast domination in polynomial time. Discret. Math., 306(24):3267–3280, 2006. doi:10.1016/j.disc.2006.06.013.
[34] Sedjro Salomon Hotegni, Sepideh Mahabadi, and Ali Vakilian. Approximation algorithms for fair range clustering. In International Conference on Machine Learning, pages 13270–13284. PMLR, 2023.
[35] Tanmay Inamdar and Kasturi R. Varadarajan. Capacitated sum-of-radii clustering: An FPT approximation. In Fabrizio Grandoni, Grzegorz Herman, and Peter Sanders, editors, 28th Annual European Symposium on Algorithms, ESA 2020, September 7-9, 2020, Pisa, Italy (Virtual Conference), volume 173 of LIPIcs, pages 62:1–62:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.ESA.2020.62.
[36] Ragesh Jaiswal, Amit Kumar, and Jatin Yadav. FPT approximation for capacitated sum of radii. In Venkatesan Guruswami, editor, 15th Innovations in Theoretical Computer Science Conference, ITCS 2024, January 30 to February 2, 2024, Berkeley, CA, USA, volume 287 of LIPIcs, pages 65:1–65:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.ITCS.2024.65.
[37] Xinrui Jia, Kshiteej Sheth, and Ola Svensson. Fair colorful $k$ -center clustering. In International Conference on Integer Programming and Combinatorial Optimization, pages 209–222. Springer, 2020. doi:10.1007/978-3-030-45771-6_17.
[38] Christopher Jung, Sampath Kannan, and Neil Lutz. A center in your neighborhood: Fairness in facility location. In Proceedings of the Symposium on Foundations of Responsible Computing (FORC), pages 5:1–5:15, 2020.
[39] Matthäus Kleindessner, Pranjal Awasthi, and Jamie Morgenstern. Fair $k$ -center clustering for data summarization. In 36th International Conference on Machine Learning, ICML 2019, pages 5984–6003. International Machine Learning Society (IMLS), 2019.
[40] Ravishankar Krishnaswamy, Shi Li, and Sai Sandeep. Constant approximation for k-median and k-means with outliers via iterative rounding. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 646–659, 2018. doi:10.1145/3188745.3188882.
[41] Yury Makarychev and Ali Vakilian. Approximation algorithms for socially fair clustering. In Conference on Learning Theory (COLT), pages 3246–3264. PMLR, 2021. URL: http://proceedings.mlr.press/v134/makarychev21a.html.
[42] Evi Micha and Nisarg Shah. Proportionally fair clustering revisited. In International Colloquium on Automata, Languages, and Programming (ICALP), 2020. doi:10.4230/LIPIcs.ICALP.2020.85.
[43] Maryam Negahbani and Deeparnab Chakrabarty. Better algorithms for individually fair $k$ -clustering. Advances in Neural Information Processing Systems (NeurIPS), 34:13340–13351, 2021. URL: https://proceedings.neurips.cc/paper/2021/hash/6f221fcb5c504fe96789df252123770b-Abstract.html.
[44] Sina Bagheri Nezhad, Sayan Bandyapadhyay, and Tianzhi Chen. Polynomial-time constant-approximation for fair sum-of-radii clustering, 2025. doi:10.48550/arXiv.2504.14683.
[45] Fatemeh Rajabi-Alni and Alireza Bagheri. Computing a many-to-many matching with demands and capacities between two sets using the Hungarian algorithm. Journal of Mathematics, 2023(1):7761902, 2023.
[46] Melanie Schmidt, Chris Schwiegelshohn, and Christian Sohler. Fair coresets and streaming algorithms for fair $k$-means. In International Workshop on Approximation and Online Algorithms, pages 232–251. Springer, 2019. doi:10.1007/978-3-030-39479-0_16.
[47] Ali Vakilian and Mustafa Yalçıner. Improved approximation algorithms for individually fair clustering. In International Conference on Artificial Intelligence and Statistics (AISTATS), pages 8758–8779. PMLR, 2022.

