Clustering with Neighborhoods

Authors Hongyao Huang, Georgiy Klimenko, Benjamin Raichel



PDF
Thumbnail PDF

File

LIPIcs.ISAAC.2021.6.pdf
  • Filesize: 0.76 MB
  • 17 pages

Document Identifiers

Author Details

Hongyao Huang
  • Department of Computer Science, University of Texas at Dallas, Richardson, TX, USA
Georgiy Klimenko
  • Department of Computer Science, University of Texas at Dallas, Richardson, TX, USA
Benjamin Raichel
  • Department of Computer Science, University of Texas at Dallas, Richardson, TX, USA

Cite AsGet BibTex

Hongyao Huang, Georgiy Klimenko, and Benjamin Raichel. Clustering with Neighborhoods. In 32nd International Symposium on Algorithms and Computation (ISAAC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 212, pp. 6:1-6:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ISAAC.2021.6

Abstract

In the standard planar k-center clustering problem, one is given a set P of n points in the plane, and the goal is to select k center points, so as to minimize the maximum distance over points in P to their nearest center. Here we initiate the systematic study of the clustering with neighborhoods problem, which generalizes the k-center problem to allow the covered objects to be a set of general disjoint convex objects C rather than just a point set P. For this problem we first show that there is a PTAS for approximating the number of centers. Specifically, if r_opt is the optimal radius for k centers, then in n^O(1/ε²) time we can produce a set of (1+ε)k centers with radius ≤ r_opt. If instead one considers the standard goal of approximating the optimal clustering radius, while keeping k as a hard constraint, we show that the radius cannot be approximated within any factor in polynomial time unless P = NP, even when C is a set of line segments. When C is a set of unit disks we show the problem is hard to approximate within a factor of (√{13}-√3)(2-√3) ≈ 6.99. This hardness result complements our main result, where we show that when the objects are disks, of possibly differing radii, there is a (5+2√3)≈ 8.46 approximation algorithm. Additionally, for unit disks we give an O(n log k)+(k/ε)^O(k) time (1+ε)-approximation to the optimal radius, that is, an FPTAS for constant k whose running time depends only linearly on n. Finally, we show that the one dimensional version of the problem, even when intersections are allowed, can be solved exactly in O(n log n) time.

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
Keywords
  • Clustering
  • Approximation
  • Hardness

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. P. K. Agarwal, J. Matousek, and M. Sharir. On range searching with semialgebraic sets II. In 53rd Annual IEEE Symp. on Foundations of Computer Science (FOCS), pages 420-429, 2012. URL: https://doi.org/10.1109/FOCS.2012.32.
  2. P. K. Agarwal, J. Pach, and M. Sharir. State of the union (of geometric objects). In J.E. Goodman, J. Pach, and R. Pollack, editors, Surveys on Discrete and Computational Geometry: Twenty Years Later, volume 453 of Contemp. Math., pages 9-48. AMS, 2008. Google Scholar
  3. P. K. Agarwal and C. M. Procopiuc. Exact and approximation algorithms for clustering. Algorithmica, 33(2):201-226, 2002. URL: https://doi.org/10.1007/s00453-001-0110-y.
  4. H.-K. Ahn, S.-S. Kim, C. Knauer, L. Schlipf, C.-S. Shin, and A. Vigneron. Covering and piercing disks with two centers. Comput. Geom., 46(3):253-262, 2013. Google Scholar
  5. G. Anegg, H. Angelidakis, A. Kurpisz, and R. Zenklusen. A technique for obtaining true approximations for k-center with covering constraints. In 21st Integer Programming and Combinatorial Optimization (IPCO), volume 12125 of LNCS, pages 52-65. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-45771-6_5.
  6. M. Badoiu, S. Har-Peled, and P. Indyk. Approximate clustering via core-sets. In 34th Annual ACM Symposium on Theory of Computing (STOC), pages 250-257. ACM, 2002. URL: https://doi.org/10.1145/509907.509947.
  7. S. Bandyapadhyay, T. Inamdar, S. Pai, and K. R. Varadarajan. A constant approximation for colorful k-center. In 27th Annual European Symposium on Algorithms (ESA), volume 144 of LIPIcs, pages 12:1-12:14, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.12.
  8. M. Charikar, S. Khuller, D. M. Mount, and G. Narasimhan. Algorithms for facility location problems with outliers. In 12th Annual Symposium on Discrete Algorithms (SODA), pages 642-651. ACM/SIAM, 2001. URL: http://dl.acm.org/citation.cfm?id=365411.365555.
  9. O. Cheong, H. Everett, M. Glisse, J. Gudmundsson, S. Hornus, S. Lazard, M. Lee, and H.-S. Na. Farthest-polygon Voronoi diagrams. Comput. Geom., 44(4):234-247, 2011. Google Scholar
  10. A. Dumitrescu and J. S. B. Mitchell. Approximation algorithms for TSP with neighborhoods in the plane. J. Algorithms, 48(1):135-159, 2003. URL: https://doi.org/10.1016/S0196-6774(03)00047-6.
  11. D. Eisenstat, P. N. Klein, and C. Mathieu. Approximating k-center in planar graphs. In 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 617-627. SIAM, 2014. URL: https://doi.org/10.1137/1.9781611973402.47.
  12. T. Feder and D. H. Greene. Optimal algorithms for approximate clustering. In 20th Annual ACM Symposium on Theory of Computing (STOC), pages 434-444. ACM, 1988. URL: https://doi.org/10.1145/62212.62255.
  13. S. P. Fekete, K. Huang, J. S. B. Mitchell, O. Parekh, and C. A. Phillips. Geometric hitting set for segments of few orientations. Theory Comput. Syst., 62(2):268-303, 2018. Google Scholar
  14. H. De Fraysseix, J. Pach, and R. Pollack. How to draw a planar graph on a grid. Comb., 10(1):41-51, 1990. URL: https://doi.org/10.1007/BF02122694.
  15. G. N. Frederickson. Parametric search and locating supply centers in trees. In 2nd Workshop on Algorithms and Data Structures (WADS), pages 299-319, 1991. URL: https://doi.org/10.1007/BFb0028271.
  16. J. Gao, M. Langberg, and L. J. Schulman. Analysis of incomplete data and an intrinsic-dimension helly theorem. Discrete and Computational Geometry, 40(4):537-560, 2008. URL: https://doi.org/10.1007/s00454-008-9107-5.
  17. J. Gao, M. Langberg, and L. J. Schulman. Clustering lines in high-dimensional space: Classification of incomplete data. ACM Trans. Algorithms, 7(1):8:1-8:26, 2010. URL: https://doi.org/10.1145/1868237.1868246.
  18. M. R. Garey and D. S. Johnson. The rectilinear steiner tree problem is NP-complete. SIAM Journal on Applied Mathematics, 32(4):826-834, 1977. URL: https://doi.org/10.1137/0132071.
  19. T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293-306, 1985. URL: https://doi.org/10.1016/0304-3975(85)90224-5.
  20. D. S. Hochbaum and D. B. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180-184, 1985. URL: https://doi.org/10.1287/moor.10.2.180.
  21. W. Hsu and G. L. Nemhauser. Easy and hard bottleneck location problems. Discrete Applied Mathematics, 1(3):209-215, 1979. URL: https://doi.org/10.1016/0166-218X(79)90044-1.
  22. H. Huang, G. Klimenko, and B. Raichel. Clustering with neighborhoods, September 2021. URL: http://arxiv.org/abs/2109.13302.
  23. X. Jia, K. Sheth, and O. Svensson. Fair colorful k-center clustering. In 21st Integer Programming and Combinatorial Optimization (IPCO), volume 12125 of LNCS, pages 209-222. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-45771-6_17.
  24. M. I. Karavelas and M. Yvinec. The Voronoi diagram of planar convex objects. In 11th Annual European Symposium on Algorithms (ESA), volume 2832 of LNCS, pages 337-348. Springer, 2003. URL: https://doi.org/10.1007/978-3-540-39658-1_32.
  25. R. Klein. Concrete and Abstract Voronoi Diagrams, volume 400 of LNCS. Springer, 1989. URL: https://doi.org/10.1007/3-540-52055-4.
  26. E. Lee and L. J. Schulman. Clustering affine subspaces: Hardness and algorithms. In 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 810-827. SIAM, 2013. URL: https://doi.org/10.1137/1.9781611973105.58.
  27. Y. Marom and D. Feldman. k-means clustering of lines for big data. In 32nd Annual Advances in Neural Information Processing Systems (NeurIPS), pages 12797-12806, 2019. URL: http://papers.nips.cc/paper/9442-k-means-clustering-of-lines-for-big-data.
  28. S. Micali and V. V. Vazirani. An O(√|V||E|) algoithm for finding maximum matching in general graphs. In 21st Annual Symposium on Foundations of Computer Science (FOCS), pages 17-27, 1980. Google Scholar
  29. N. H. Mustafa and S. Ray. Improved results on geometric hitting set problems. Discrete and Computational Geometry, 44(4):883-895, 2010. URL: https://doi.org/10.1007/s00454-010-9285-9.
  30. R. Raz and S. Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In 29th Annual ACM Symposium on the Theory of Computing (STOC), pages 475-484. ACM, 1997. URL: https://doi.org/10.1145/258533.258641.
  31. G. Xu and J. Xu. Efficient approximation algorithms for clustering point-sets. Computational Geometry, 43(1):59-66, 2010. URL: https://doi.org/10.1016/j.comgeo.2007.12.002.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail