Complexity of Local Search for Euclidean Clustering Problems

Authors: Bodo Manthey, Nils Morawietz, Jesse van Rhijn, Frank Sommer






Author Details

Bodo Manthey
  • Faculty of Electrical Engineering, Mathematics, and Computer Science, University of Twente, The Netherlands
Nils Morawietz
  • Institute of Computer Science, Friedrich Schiller University Jena, Germany
Jesse van Rhijn
  • Faculty of Electrical Engineering, Mathematics, and Computer Science, University of Twente, The Netherlands
Frank Sommer
  • Institute of Logic and Computation, TU Wien, Austria

Cite As

Bodo Manthey, Nils Morawietz, Jesse van Rhijn, and Frank Sommer. Complexity of Local Search for Euclidean Clustering Problems. In 35th International Symposium on Algorithms and Computation (ISAAC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 322, pp. 48:1-48:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/LIPIcs.ISAAC.2024.48

Abstract

We show that the simplest local search heuristics for two natural Euclidean clustering problems are PLS-hard. First, we show that the Hartigan-Wong method for k-Means clustering, which is essentially the Flip heuristic, is PLS-hard, even when k = 2. Second, we show the same result for the Flip heuristic for Max Cut, even when the edge weights are given by the (squared) Euclidean distances between the points in some set 𝒳 ⊆ ℝ^d, a problem that is equivalent to Min Sum 2-Clustering.
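
To make the Flip heuristic concrete, the following is a minimal sketch for Max Cut with squared Euclidean edge weights. It is an illustration only, not the construction used in the paper; the function names (`sq_dist`, `cut_value`, `flip_local_search`) and the first-improvement pivot rule are assumptions chosen for this example. A single step moves one point to the other side of the partition whenever doing so strictly increases the cut value, and the search stops at a partition where no such move exists.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cut_value(points, side):
    """Total squared distance over all pairs placed on different sides."""
    return sum(
        sq_dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
        if side[i] != side[j]
    )


def flip_local_search(points, side):
    """Apply improving single-point flips until none remains.

    `side` is a list of 0/1 labels. The pivot rule (first improving
    point in index order) is an assumption of this sketch.
    """
    side = list(side)
    improved = True
    while improved:
        improved = False
        for v in range(len(points)):
            # Flipping v turns same-side pairs into cut pairs (+weight)
            # and cut pairs into same-side pairs (-weight).
            gain = sum(
                sq_dist(points[v], points[u]) * (1 if side[u] == side[v] else -1)
                for u in range(len(points))
                if u != v
            )
            if gain > 0:
                side[v] = 1 - side[v]
                improved = True
    return side


# Two well-separated pairs of points on a line, all starting on side 0.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
local_opt = flip_local_search(points, [0, 0, 0, 0])
print(local_opt, cut_value(points, local_opt))
```

Integer coordinates keep the gain computation exact here. Each accepted flip strictly increases the cut value, so the loop always terminates, but the number of flips before a local optimum is reached is not bounded by this sketch; the hardness result in the abstract concerns exactly the problem of finding such a local optimum.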

Subject Classification

ACM Subject Classification
  • Theory of computation → Problems, reductions and completeness
  • Theory of computation → Graph algorithms analysis
  • Theory of computation → Discrete optimization
  • Theory of computation → Facility location and clustering
Keywords
  • Local search
  • PLS-complete
  • max cut
  • k-means
  • partitioning problem
  • flip-neighborhood

