Approximation Algorithms for Continuous Clustering and Facility Location Problems

Authors Deeparnab Chakrabarty, Maryam Negahbani, Ankita Sarkar



PDF
Thumbnail PDF

File

LIPIcs.ESA.2022.33.pdf
  • Filesize: 0.81 MB
  • 15 pages

Document Identifiers

Author Details

Deeparnab Chakrabarty
  • Department of Computer Science, Dartmouth College, Hanover, NH, USA
Maryam Negahbani
  • Department of Computer Science, Dartmouth College, Hanover, NH, USA
Ankita Sarkar
  • Department of Computer Science, Dartmouth College, Hanover, NH, USA

Acknowledgements

We thank the reviewers for their comments, which helped us improve our exposition and exhibit a stronger connection to existing work.

Cite AsGet BibTex

Deeparnab Chakrabarty, Maryam Negahbani, and Ankita Sarkar. Approximation Algorithms for Continuous Clustering and Facility Location Problems. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 33:1-33:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ESA.2022.33

Abstract

In this paper, we consider center-based clustering problems where C, the set of points to be clustered, lies in a metric space (X,d), and the set X of candidate centers is potentially infinite-sized. We call such problems continuous clustering problems to differentiate them from the discrete clustering problems where the set of candidate centers is explicitly given. It is known that for many objectives, when one restricts the set of centers to C itself and applies an α_dis-approximation algorithm for the discrete version, one obtains a β ⋅ α_{dis}-approximation algorithm for the continuous version via the triangle inequality property of the distance function. Here β depends on the objective, and for many objectives such as k-median, β = 2, while for some others such as k-means, β = 4. The motivating question in this paper is whether this gap of factor β between continuous and discrete problems is inherent, or can one design better algorithms for continuous clustering than simply reducing to the discrete case as mentioned above? In a recent SODA 2021 paper, Cohen-Addad, Karthik, and Lee prove a factor-2 and a factor-4 hardness, respectively, for the continuous versions of the k-median and k-means problems, even when the number of cluster centers is a constant. The discrete problem for a constant number of centers is easily solvable exactly using enumeration, and therefore, in certain regimes, the "β-factor loss" seems unavoidable. In this paper, we describe a technique based on the round-or-cut framework to approach continuous clustering problems. We show that, for the continuous versions of some clustering problems, we can design approximation algorithms attaining a better factor than the β-factor blow-up mentioned above. In particular, we do so for: the uncapacitated facility location problem with uniform facility opening costs (λ-UFL); the k-means problem; the individually fair k-median problem; and the k-center with outliers problem. Notably, for λ-UFL, where β = 2 and the discrete version is NP-hard to approximate within a factor of 1.27, we describe a 2.32-approximation for the continuous version, and indeed 2.32 < 2 × 1.27. Also, for k-means, where β = 4 and the best known approximation factor for the discrete version is 9, we obtain a 32-approximation for the continuous version, which is better than 4 × 9 = 36. The main challenge one faces is that most algorithms for the discrete clustering problems, including the state of the art solutions, depend on Linear Program (LP) relaxations that become infinite-sized in the continuous version. To overcome this, we design new linear program relaxations for the continuous clustering problems which, although having exponentially many constraints, are amenable to the round-or-cut framework.

Subject Classification

ACM Subject Classification
  • Theory of computation → Facility location and clustering
Keywords
  • Approximation Algorithms
  • Clustering
  • Facility Location
  • Fairness
  • Outliers

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and Justin Ward. Better guarantees for k-means and euclidean k-median by primal-dual algorithms. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 61-72, 2017. Google Scholar
  2. Soroush Alamdari and David B. Shmoys. A bicriteria approximation algorithm for the k-center and k-median problems. In Proc., Workshop on Approximation and Online Algorithms (WAOA), pages 66-75. Springer, 2017. Google Scholar
  3. Hyung-Chan An, Mohit Singh, and Ola Svensson. LP-based algorithms for capacitated facility location. SIAM Journal on Computing (SICOMP), 46(1):272-306, 2017. Preliminary version in FOCS 2014. Google Scholar
  4. Georg Anegg, Haris Angelidakis, Adam Kurpisz, and Rico Zenklusen. A technique for obtaining true approximations for k-center with covering constraints. Math. Program., 192(1):3-27, 2022. URL: https://doi.org/10.1007/s10107-021-01645-y.
  5. V. Arya, N. Garg, R. Khandekar, A. Meyerson, K. Munagala, and V. Pandit. Local Search Heuristics for k-Median and Facility Location Problems. SIAM Journal on Computing (SICOMP), 33(3):544-562, 2004. Google Scholar
  6. Pranjal Awasthi, Moses Charikar, Ravishankar Krishnaswamy, and Ali Kemal Sinop. The hardness of approximation of euclidean k-means. CoRR, 2015. URL: http://arxiv.org/abs/1502.03316.
  7. Tanvi Bajpai, Deeparnab Chakrabarty, Chandra Chekuri, and Maryam Negahbani. Revisiting Priority k-Center: Fairness and Outliers. In Proc., International Colloquium on Automata, Languages and Programming (ICALP), pages 21:1-21:20, 2021. Google Scholar
  8. Jaroslaw Byrka, Fabrizio Grandoni, Thomas Rothvoss, and Laura Sanità. Steiner tree approximation via iterative randomized rounding. Journal of the ACM, 60(1):1-33, 2013. URL: https://doi.org/10.1145/2432622.2432628.
  9. Jarosław Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median and positive correlation in budgeted optimization. ACM Trans. Algorithms, 2017. Google Scholar
  10. Deeparnab Chakrabarty, Chandra Chekuri, Sanjeev Khanna, and Nitish Korula. Approximability of capacitated network design. Algorithmica, 72(2):493-514, 2015. Google Scholar
  11. Deeparnab Chakrabarty, Prachi Goyal, and Ravishankar Krishnaswamy. The non-uniform k-center problem. ACM Trans. Algorithms, 2020. Preliminary version in ICALP 2016. Google Scholar
  12. Deeparnab Chakrabarty and Maryam Negahbani. Generalized center problems with outliers. ACM Trans. Algorithms, 2019. Google Scholar
  13. Deeparnab Chakrabarty and Maryam Negahbani. Better algorithms for individually fair k-clustering. Adv. in Neu. Inf. Proc. Sys. (NeurIPS), 2021. Google Scholar
  14. Deeparnab Chakrabarty, Maryam Negahbani, and Ankita Sarkar. Approximation algorithms for continuous clustering and facility location problems. CoRR, 2022. URL: http://arxiv.org/abs/2206.15105.
  15. Moses. Charikar and Sudipto. Guha. Improved Combinatorial Algorithms for the Facility Location and k-Median Problems. In Proc., IEEE Symposium on Foundations of Computer Science (FOCS), 1999. Google Scholar
  16. Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem. Journal of Computer and System Sciences, 65(1):129-149, 2002. Google Scholar
  17. Moses Charikar, Samir Khuller, David M. Mount, and Giri Narasimhan. Algorithms for facility location problems with outliers. In Proc., ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 642-651, 2001. Google Scholar
  18. Ke Chen. On k-median clustering in high dimensions. In Proc., ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1177-1185, 2006. Google Scholar
  19. Vincent Cohen-Addad and Karthik C.S. Inapproximability of clustering in lp metrics. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 519-539, 2019. URL: https://doi.org/10.1109/FOCS.2019.00040.
  20. Vincent Cohen-Addad, Hossein Esfandiari, Vahab Mirrokni, and Shyam Narayanan. Improved approximations for Euclidean k-means and k-median, via nested quasi-independent sets. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022, pages 1621-1628, New York, NY, USA, 2022. Association for Computing Machinery. URL: https://doi.org/10.1145/3519935.3520011.
  21. Vincent Cohen-Addad, C. S. Karthik, and Euiwoong Lee. On approximability of clustering problems without candidate centers. In Proc., ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2635-2648, 2021. Google Scholar
  22. Vincent Cohen-Addad, Philip N Klein, and Claire Mathieu. Local search yields approximation schemes for k-means and k-median in euclidean and minor-free metrics. SIAM Journal on Computing (SICOMP), 48(2):644-667, 2019. Google Scholar
  23. Vincent Cohen-Addad, Karthik C. S., and Euiwoong Lee. Johnson coverage hypothesis: Inapproximability of k-means and k-median in 𝓁_p-metrics. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1493-1530, 2022. URL: https://doi.org/10.1137/1.9781611977073.63.
  24. Vincent Cohen-Addad, David Saulpic, and Chris Schwiegelshohn. A new coreset framework for clustering. In Proc., ACM Symposium on Theory of Computing (STOC), pages 169-182, 2021. Google Scholar
  25. W Fernandez De La Vega, Marek Karpinski, Claire Kenyon, and Yuval Rabani. Approximation schemes for clustering problems. In Proc., ACM Symposium on Theory of Computing (STOC), pages 50-58, 2003. Google Scholar
  26. Hu Ding, Haikuo Yu, and Zixiu Wang. Greedy strategy works for k-center clustering with outliers and coreset construction. In Proc., European Symposium on Algorithms, pages 40:1-40:16, 2019. Google Scholar
  27. Dan Feldman, Morteza Monemizadeh, and Christian Sohler. A PTAS for k-means clustering based on weak coresets. In Proceedings of the 5th Annual Symposium on Computational Geometry (SoCG), pages 11-18, 2007. Google Scholar
  28. Zachary Friggstad, Mohsen Rezapour, and Mohammad R Salavatipour. Local search yields a PTAS for k-means in doubling metrics. SIAM Journal on Computing (SICOMP), 48(2):452-480, 2019. Google Scholar
  29. Martin Grötschel, László Lovász, and Alexander Schriver. Geometric Algorithms and Combinatorial Optimization. Springer, Berlin, Heidelberg, 1993. Google Scholar
  30. Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228-248, 1999. Google Scholar
  31. Anupam Gupta and Kanat Tangwongsan. Simpler analyses of local search algorithms for facility location. CoRR, 2008. URL: http://arxiv.org/abs/0809.2554.
  32. S. Har-Peled and S. Mazumdar. Coresets for k-means and k-median clustering and their applications. In Proc., ACM Symposium on Theory of Computing (STOC), pages 291-300, 2004. Google Scholar
  33. Dorit S. Hochbaum and David B. Shmoys. A best possible heuristic for the k-center problem. Math. Oper. Res., 10(2):180-184, 1985. URL: https://doi.org/10.1287/moor.10.2.180.
  34. Kamal Jain, Mohammad Mahdian, Evangelos Markakis, Amin Saberi, and Vijay V Vazirani. Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP. Journal of the ACM, 50(6):795-824, 2003. Google Scholar
  35. Kamal Jain and Vijay V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangean relaxation. Journal of the ACM, 48(2):274-296, 2001. Google Scholar
  36. Christopher Jung, Sampath Kannan, and Neil Lutz. Service in your neighborhood: Fairness in center location. In Proceedings, Foundations of Responsible Computing, FORC 2020, volume 156, pages 5:1-5:15, 2020. Google Scholar
  37. Tapas Kanungo, David M Mount, Nathan S Netanyahu, Christine D Piatko, Ruth Silverman, and Angela Y Wu. A local search approximation algorithm for k-means clustering. Computational Geometry, 28(2-3):89-112, 2004. Google Scholar
  38. A. Kumar, Y. Sabharwal, and S. Sen. A simple linear time (1 + ε)-approximation algorithm for k-means clustering in any dimensions. In Proc., IEEE Symposium on Foundations of Computer Science (FOCS), pages 454-462, 2004. Google Scholar
  39. Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. Information and Computation, 222:45-58, 2013. Google Scholar
  40. Shi Li and Ola Svensson. Approximating k-median via pseudo-approximation. SIAM Journal on Computing (SICOMP), 45(2):530-547, 2016. Google Scholar
  41. Sepideh Mahabadi and Ali Vakilian. (individual) fairness for k-clustering. In Proc., International Conference on Machine Learning (ICML), pages 7925-7935, 2020. Google Scholar
  42. Jirı Matoušek. On approximate geometric k-clustering. Discrete & Computational Geometry, 24(1):61-84, 2000. Google Scholar
  43. Ján Plesník. A heuristic for the p-center problems in graphs. Discrete Applied Mathematics, 17(3):263-268, 1987. Google Scholar
  44. Robert D. Carr, Toshihiro Fujito, Goran Konjevod, and Ojas Parekh. A 21/10-approximation Algorithm for a Generalization of the Weighted Edge-Dominating Set Problem. J. Comb. Optim., 2001. Preliminary version appeared in Proc. 8th ESA, pages 132-142, 2000. Google Scholar
  45. David B. Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms for facility location problems (extended abstract). In Proc., ACM Symposium on Theory of Computing (STOC), 1997. Google Scholar
  46. Luca Trevisan. When hamming meets euclid: The approximability of geometric tsp and steiner tree. SIAM Journal on Computing, 30(2):475-485, 2000. URL: https://doi.org/10.1137/S0097539799352735.
  47. Ali Vakilian and Mustafa Yalciner. Improved approximation algorithms for individually fair clustering. In Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera, editors, Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pages 8758-8779. PMLR, 28-30 March 2022. URL: https://proceedings.mlr.press/v151/vakilian22a.html.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail