Parameterized Approximation For Robust Clustering in Discrete Geometric Spaces

Authors Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, Joachim Spoerhase

File

LIPIcs.ICALP.2024.6.pdf
  • Filesize: 1.01 MB
  • 19 pages

Author Details

Fateme Abbasi
  • University of Wrocław, Poland
Sandip Banerjee
  • IDSIA, USI-SUPSI, Lugano, Switzerland
Jarosław Byrka
  • University of Wrocław, Poland
Parinya Chalermsook
  • Aalto University, Finland
Ameet Gadekar
  • Bar-Ilan University, Ramat-Gan, Israel
Kamyar Khodamoradi
  • University of Regina, Canada
Dániel Marx
  • CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Roohani Sharma
  • University of Bergen, Norway
Joachim Spoerhase
  • University of Sheffield, UK

Cite As

Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, and Joachim Spoerhase. Parameterized Approximation For Robust Clustering in Discrete Geometric Spaces. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 6:1-6:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ICALP.2024.6

Abstract

We consider the well-studied Robust (k,z)-Clustering problem, which generalizes the classic k-Median, k-Means, and k-Center problems and arises in robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness [Abbasi, Bhaskara, Venkatasubramanian, 2021 & Ghadiri, Samadi, Vempala, 2022]. Given a constant z ≥ 1, the input to Robust (k,z)-Clustering is a set P of n points in a metric space (M,δ), a weight function w: P → ℝ_{≥ 0}, and a positive integer k. Further, each point belongs to one or more of m different groups S_1,S_2,…,S_m ⊆ P. Our goal is to find a set X of k centers such that max_{i ∈ [m]} ∑_{p ∈ S_i} w(p) δ(p,X)^z is minimized.

Complementing recent work on this problem, we give a comprehensive understanding of the parameterized approximability of the problem in geometric spaces where the parameter is the number k of centers. We prove the following results:

  1. For a universal constant η₀ > 0.0006, we devise a 3^z(1-η₀)-factor FPT approximation algorithm for Robust (k,z)-Clustering in discrete high-dimensional Euclidean spaces where the set of potential centers is finite. This shows that the lower bound of 3^z for general metrics [Goyal, Jaiswal, Inf. Proc. Letters, 2023] no longer holds when the metric has geometric structure.
  2. We show that Robust (k,z)-Clustering in discrete Euclidean spaces is (√(3/2) - o(1))-hard to approximate for FPT algorithms, even if we consider the special case k-Center in logarithmic dimensions. This rules out a (1+ε)-approximation algorithm running in time f(k,ε)poly(m,n) (also called efficient parameterized approximation scheme or EPAS), giving a striking contrast with the recent EPAS for the continuous setting where centers can be placed anywhere in the space [Abbasi et al., FOCS'23].
  3. However, we obtain an EPAS for Robust (k,z)-Clustering in discrete Euclidean spaces when the dimension is sublogarithmic (for the discrete problem, earlier work [Abbasi et al., FOCS'23] provides an EPAS only in dimension o(log log n)). Our EPAS also works for metrics of sublogarithmic doubling dimension.

Subject Classification

ACM Subject Classification
  • Theory of computation → Approximation algorithms analysis
  • Theory of computation → Facility location and clustering
Keywords
  • Clustering
  • approximation algorithms
  • parameterized complexity

References

  1. Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, and Joachim Spoerhase. Parameterized approximation for robust clustering in discrete geometric spaces, 2023. URL: https://arxiv.org/abs/2305.07316.
  2. Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, and Joachim Spoerhase. Parameterized approximation schemes for clustering with general norm objectives. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 1377-1399, 2023. URL: https://doi.org/10.1109/FOCS57990.2023.00085.
  3. Mohsen Abbasi, Aditya Bhaskara, and Suresh Venkatasubramanian. Fair clustering via equitable group representations. In Proc. ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), pages 504-514, 2021.
  4. Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and Justin Ward. Better guarantees for k-means and euclidean k-median by primal-dual algorithms. In Proc. 58th IEEE Annual Symposium on Foundations of Computer Science (FOCS'17), pages 61-72, 2017.
  5. Barbara M. Anthony, Vineet Goyal, Anupam Gupta, and Viswanath Nagarajan. A plant location guide for the unsure: Approximation algorithms for min-max location problems. Math. Oper. Res., 35(1):79-101, 2010. URL: https://doi.org/10.1287/moor.1090.0428.
  6. Mihai Bădoiu, Sariel Har-Peled, and Piotr Indyk. Approximate clustering via core-sets. In Proc. 34th Annual ACM Symposium on Theory of Computing (STOC'02), pages 250-257, 2002.
  7. Daniel Baker, Vladimir Braverman, Lingxiao Huang, Shaofeng H.-C. Jiang, Robert Krauthgamer, and Xuan Wu. Coresets for clustering in graphs of bounded treewidth. In Proc. 37th International Conference on Machine Learning (ICML'20), volume 119, pages 569-579, 2020.
  8. Sayan Bhattacharya, Parinya Chalermsook, Kurt Mehlhorn, and Adrian Neumann. New approximability results for the robust k-median problem. In Proc. Scandinavian Workshop on Algorithm Theory (SWAT'14), pages 50-61, 2014.
  9. Vladimir Braverman, Shaofeng H-C Jiang, Robert Krauthgamer, and Xuan Wu. Coresets for ordered weighted clustering. In Proc. International Conference on Machine Learning (ICML'19), pages 744-753, 2019.
  10. J. Byrka, T. W. Pensyl, B. Rybicki, A. Srinivasan, and K. Trinh. An improved approximation algorithm for k-median and positive correlation in budgeted optimization. ACM Trans. Algorithms, 13(2)(23):1-31, 2013.
  11. Vincent Cohen-Addad, Hossein Esfandiari, Vahab S. Mirrokni, and Shyam Narayanan. Improved approximations for Euclidean k-means and k-median, via nested quasi-independent sets. In Proc. 54th Annual ACM SIGACT Symposium on Theory of Computing (STOC'22), pages 1621-1628, 2022.
  12. Vincent Cohen-Addad, Anupam Gupta, Amit Kumar, Euiwoong Lee, and Jason Li. Tight FPT Approximations for k-Median and k-Means. In Proc. 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132, pages 42:1-42:14, 2019.
  13. Vincent Cohen-Addad and CS Karthik. Inapproximability of clustering in lp metrics. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 519-539. IEEE, 2019.
  14. Vincent Cohen-Addad, CS Karthik, and Euiwoong Lee. On approximability of clustering problems without candidate centers. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2635-2648. SIAM, 2021.
  15. Vincent Cohen-Addad and Euiwoong Lee. Johnson coverage hypothesis: Inapproximability of k-means and k-median in lp-metrics. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1493-1530. SIAM, 2022.
  16. Vincent Cohen-Addad, David Saulpic, and Chris Schwiegelshohn. A new coreset framework for clustering. In Samir Khuller and Virginia Vassilevska Williams, editors, Proc. 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC'21), pages 169-182, 2021.
  17. Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-21275-3.
  18. Petros Drineas, Alan M. Frieze, Ravi Kannan, Santosh S. Vempala, and V. Vinay. Clustering large graphs via the singular value decomposition. Mach. Learn., 56(1-3):9-33, 2004.
  19. Tomás Feder and Daniel H. Greene. Optimal algorithms for approximate clustering. In Proc. 20th Annual ACM Symposium on Theory of Computing (STOC'88), pages 434-444, 1988.
  20. Mehrdad Ghadiri, Samira Samadi, and Santosh Vempala. Socially fair k-means clustering. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 438-448, 2021.
  21. Mehrdad Ghadiri, Mohit Singh, and Santosh S Vempala. Constant-factor approximation algorithms for socially fair k-clustering. arXiv preprint arXiv:2206.11210, 2022.
  22. Dishant Goyal and Ragesh Jaiswal. Tight fpt approximation for socially fair clustering. Information Processing Letters, 182:106383, 2023. URL: https://doi.org/10.1016/j.ipl.2023.106383.
  23. Fabrizio Grandoni, Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, and Rakesh Venkat. A refined approximation for euclidean k-means. Inf. Process. Lett., 176:106251, 2022. URL: https://doi.org/10.1016/j.ipl.2022.106251.
  24. Sariel Har-Peled and Soham Mazumdar. On coresets for k-means and k-median clustering. In Proc. 36th Annual ACM Symposium on Theory of Computing (STOC'04), pages 291-300, 2004.
  25. D.S. Hochbaum and D. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180-184, 1985.
  26. K. Jain and V. V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. J. ACM, 48(2):274-296, 2001.
  27. Tapas Kanungo, David M Mount, Nathan S Netanyahu, Christine D Piatko, Ruth Silverman, and Angela Y Wu. A local search approximation algorithm for k-means clustering. Computational Geometry, 28(2-3):89-112, 2004.
  28. Amit Kumar, Yogish Sabharwal, and Sandeep Sen. Linear-time approximation schemes for clustering problems in any dimensions. Journal of the ACM (JACM), 57(2):1-32, 2010.
  29. Yury Makarychev and Ali Vakilian. Approximation algorithms for socially fair clustering. In Mikhail Belkin and Samory Kpotufe, editors, Proceedings of Thirty Fourth Conference on Learning Theory, volume 134 of Proceedings of Machine Learning Research, pages 3246-3264. PMLR, 15-19 August 2021. URL: https://proceedings.mlr.press/v134/makarychev21a.html.
  30. Viswanath Nagarajan, Baruch Schieber, and Hadas Shachnai. The Euclidean k-supplier problem. Math. Oper. Res., 45(1):1-14, 2020.
  31. Christian Sohler and David P. Woodruff. Strong coresets for k-median and subspace approximation: Goodbye dimension. In Proc. 59th IEEE Annual Symposium on Foundations of Computer Science (FOCS'18), pages 802-813, 2018.
  32. Amnon Ta-Shma. Explicit, almost optimal, ε-balanced codes. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proc. 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC'17), pages 238-251. ACM, 2017. URL: https://doi.org/10.1145/3055399.3055408.