Connected k-Center and k-Diameter Clustering

Authors Lukas Drexler, Jan Eube, Kelin Luo, Heiko Röglin, Melanie Schmidt, Julian Wargalla



Author Details

Lukas Drexler
  • Heinrich-Heine Universität Düsseldorf, Germany
Jan Eube
  • Universität Bonn, Germany
Kelin Luo
  • Universität Bonn, Germany
Heiko Röglin
  • Universität Bonn, Germany
Melanie Schmidt
  • Heinrich-Heine Universität Düsseldorf, Germany
Julian Wargalla
  • Heinrich-Heine Universität Düsseldorf, Germany

Acknowledgements

The authors thank anonymous reviewers of a previous draft for helpful comments and pointing out relevant related work. We thank Jürgen Kusche and Christian Sohler for raising the problem and for fruitful discussion on the modeling. We also thank Xiangyu Guo for the discussion on the algorithm design and analysis.

Cite As

Lukas Drexler, Jan Eube, Kelin Luo, Heiko Röglin, Melanie Schmidt, and Julian Wargalla. Connected k-Center and k-Diameter Clustering. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 50:1-50:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ICALP.2023.50

Abstract

Motivated by an application from geodesy, we study the connected k-center problem and the connected k-diameter problem. These problems arise from the classical k-center and k-diameter problems by adding a side constraint: we are given an undirected connectivity graph G on the input points, and a clustering is feasible only if every cluster induces a connected subgraph of G. Clustering problems usually assume that the clusters are pairwise disjoint. We study this case and also the case in which clusters may overlap, which can help to satisfy the connectivity constraints. Our main result is an O(1)-approximation algorithm for the disjoint connected k-center and k-diameter problems for Euclidean spaces of low dimension (constant d) and for metrics with constant doubling dimension. For general metrics, we get an O(log² k)-approximation. Our algorithms first compute a non-disjoint connected clustering and then transform it into a disjoint connected clustering. We complement these upper bounds with several upper and lower bounds for variations and special cases of the model.
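The feasibility condition and the two objectives described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only checks whether a given clustering is feasible and evaluates its cost. The function names, the adjacency-dictionary representation of G, the Euclidean metric, and the toy instance are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations
from math import dist


def induces_connected_subgraph(cluster, adjacency):
    """Return True if `cluster` induces a connected subgraph of G."""
    cluster = set(cluster)
    if not cluster:
        return False
    start = next(iter(cluster))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            # Follow only edges whose endpoints both lie inside the cluster.
            if v in cluster and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == cluster


def is_feasible(clusters, adjacency, n, disjoint=True):
    """Check that the clusters cover points 0..n-1, are connected in G,
    and (if `disjoint` is set) are pairwise disjoint."""
    if set().union(*clusters) != set(range(n)):
        return False
    # With full coverage, total size n is equivalent to pairwise disjointness.
    if disjoint and sum(len(set(c)) for c in clusters) != n:
        return False
    return all(induces_connected_subgraph(c, adjacency) for c in clusters)


def diameter_cost(clusters, points):
    """k-diameter objective: largest distance between two points in one cluster."""
    return max(
        (dist(points[a], points[b]) for c in clusters for a, b in combinations(c, 2)),
        default=0.0,
    )


def center_cost(clusters, centers, points):
    """k-center objective: largest distance from a point to its cluster's center."""
    return max(
        dist(points[p], points[centers[i]])
        for i, c in enumerate(clusters)
        for p in c
    )


# Hypothetical toy instance: four points on a line, G is the path 0-1-2-3.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

if __name__ == "__main__":
    print(is_feasible([[0, 1], [2, 3]], adjacency, 4))  # connected and disjoint
    print(is_feasible([[0, 2], [1, 3]], adjacency, 4))  # clusters not connected in G
    print(is_feasible([[0, 1, 2], [2, 3]], adjacency, 4, disjoint=False))
    print(diameter_cost([[0, 1], [2, 3]], points))
```

In this toy instance, the clustering {0, 2}, {1, 3} covers all points but is infeasible because neither cluster is connected in G, while the overlapping clustering {0, 1, 2}, {2, 3} is feasible only in the non-disjoint variant.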

Subject Classification

ACM Subject Classification
  • Theory of computation → Facility location and clustering
Keywords
  • Approximation algorithms
  • Clustering
  • Connectivity constraints

