Approximation Schemes for Min-Sum k-Clustering

Authors Ismail Naderi, Mohsen Rezapour, Mohammad R. Salavatipour



PDF
Thumbnail PDF

File

LIPIcs.ESA.2023.84.pdf
  • Filesize: 1.29 MB
  • 16 pages

Document Identifiers

Author Details

Ismail Naderi
  • Department of Computer Science, University of Alberta, Edmonton, Canada
Mohsen Rezapour
  • Department of Computer Science, University of Alberta, Edmonton, Canada
  • Department of Computer Science and Statistics, K. N. Toosi University of Technology, Tehran, Iran
Mohammad R. Salavatipour
  • Department of Computer Science, University of Alberta, Edmonton, Canada

Cite AsGet BibTex

Ismail Naderi, Mohsen Rezapour, and Mohammad R. Salavatipour. Approximation Schemes for Min-Sum k-Clustering. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 84:1-84:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ESA.2023.84

Abstract

We consider the Min-Sum k-Clustering (k-MSC) problem. Given a set of points in a metric which is represented by an edge-weighted graph G = (V, E) and a parameter k, the goal is to partition the points V into k clusters such that the sum of distances between all pairs of the points within the same cluster is minimized. The k-MSC problem is known to be APX-hard on general metrics. The best known approximation algorithms for the problem obtained by Behsaz, Friggstad, Salavatipour and Sivakumar [Algorithmica 2019] achieve an approximation ratio of O(log |V|) in polynomial time for general metrics and an approximation ratio 2+ε in quasi-polynomial time for metrics with bounded doubling dimension. No approximation schemes for k-MSC (when k is part of the input) is known for any non-trivial metrics prior to our work. In fact, most of the previous works rely on the simple fact that there is a 2-approximate reduction from k-MSC to the balanced k-median problem and design approximation algorithms for the latter to obtain an approximation for k-MSC. In this paper, we obtain the first Quasi-Polynomial Time Approximation Schemes (QPTAS) for the problem on metrics induced by graphs of bounded treewidth, graphs of bounded highway dimension, graphs of bounded doubling dimensions (including fixed dimensional Euclidean metrics), and planar and minor-free graphs. We bypass the barrier of 2 for k-MSC by introducing a new clustering problem, which we call min-hub clustering, which is a generalization of balanced k-median and is a trade off between center-based clustering problems (such as balanced k-median) and pair-wise clustering (such as Min-Sum k-clustering). We then show how one can find approximation schemes for Min-hub clustering on certain classes of metrics.

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
Keywords
  • Approximation Algorithms
  • Clustering
  • Dynamic Programming

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Martin Aigner and Günter M. Ziegler. Cayley’s formula for the number of trees. Proofs from THE BOOK, pages 201-206, 2010. URL: https://doi.org/10.1007/978-3-642-00856-6_30.
  2. Sandip Banerjee, Rafail Ostrovsky, and Yuval Rabani. Min-Sum Clustering (With Outliers). In Mary Wootters and Laura Sanità, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021), volume 207 of Leibniz International Proceedings in Informatics (LIPIcs), pages 16:1-16:16, Dagstuhl, Germany, 2021. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. URL: https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.16.
  3. Yair Bartal, Moses Charikar, and Danny Raz. Approximating min-sum k-clustering in metric spaces. In Proceedings of the thirty-third annual ACM symposium on Theory of computing, pages 11-20, 2001. Google Scholar
  4. Babak Behsaz, Zachary Friggstad, Mohammad R Salavatipour, and Rohit Sivakumar. Approximation algorithms for min-sum k-clustering and balanced k-median. Algorithmica, 81:1006-1030, 2019. Google Scholar
  5. Hans L. Bodlaender and Torben Hagerup. Parallel algorithms with optimal speedup for bounded treewidth. SIAM Journal on Computing, 27(6):1725-1746, 1998. URL: https://doi.org/10.1137/S0097539795289859.
  6. Vincent Cohen-Addad, CS Karthik, and Euiwoong Lee. On approximability of clustering problems without candidate centers. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2635-2648. SIAM, 2021. Google Scholar
  7. Vincent Cohen-Addad, Hung Le, Marcin Pilipczuk, and Michał Pilipczuk. Planar and minor-free metrics embed into metrics of polylogarithmic treewidth with expected multiplicative distortion arbitrarily close to 1. arXiv preprint arXiv:2304.07268, 2023. URL: https://arxiv.org/abs/2304.07268.
  8. Artur Czumaj and Christian Sohler. Small space representations for metric min-sum k-clustering and their applications. In Proceedings of the 24th Annual Conference on Theoretical Aspects of Computer Science, STACS'07, pages 536-548, Berlin, Heidelberg, 2007. Springer-Verlag. Google Scholar
  9. Wenceslas Fernandez de la Vega, Marek Karpinski, Claire Mathieu, and Yuval Rabani. Approximation schemes for clustering problems. In STOC '03, 2003. Google Scholar
  10. Andreas Emil Feldmann, Wai Shing Fung, Jochen Könemann, and Ian Post. A (1+ε)-embedding of low highway dimension graphs into bounded treewidth graphs. SIAM Journal on Computing, 47(4):1667-1704, 2018. URL: https://doi.org/10.1137/16M1067196.
  11. Nili Guttmann-Beck and Refael Hassin. Approximation algorithms for min-sum p-clustering. Discrete Applied Mathematics, 89(1-3):125-142, 1998. Google Scholar
  12. Viggo Kann, Sanjeev Khanna, Jens Lagergren, and Alessandro Panconesi. On the hardness of approximating max k-cut and its dual. Chicago J Theoret Comput Sci, May 1997. Google Scholar
  13. Sartaj Sahni and Teofilo F. Gonzalez. P-complete approximation problems. Journal of the ACM (JACM), 23:555-565, 1976. Google Scholar
  14. Kunal Talwar. Bypassing the embedding: Algorithms for low dimensional metrics. In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, STOC '04, pages 281-290, New York, NY, USA, 2004. Association for Computing Machinery. URL: https://doi.org/10.1145/1007352.1007399.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail