From Data Completion to Problems on Hypercubes: A Parameterized Analysis of the Independent Set Problem

Authors: Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider





Author Details

Eduard Eiben
  • Department of Computer Science, Royal Holloway, University of London, Egham, UK
Robert Ganian
  • Algorithms and Complexity Group, TU Wien, Austria
Iyad Kanj
  • School of Computing, DePaul University, Chicago, IL, USA
Sebastian Ordyniak
  • School of Computing, University of Leeds, UK
Stefan Szeider
  • Algorithms and Complexity Group, TU Wien, Austria

Cite As

Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, and Stefan Szeider. From Data Completion to Problems on Hypercubes: A Parameterized Analysis of the Independent Set Problem. In 18th International Symposium on Parameterized and Exact Computation (IPEC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 285, pp. 16:1-16:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.IPEC.2023.16

Abstract

Several works have recently investigated the parameterized complexity of data completion problems, motivated by their applications in machine learning, and clustering in particular. Interestingly, these problems can be equivalently formulated as classical graph problems on induced subgraphs of powers of partially-defined hypercubes. In this paper, we follow up on this recent direction by investigating the Independent Set problem on this graph class, which has been studied in the data science setting under the name Diversity. We obtain a comprehensive picture of the problem’s parameterized complexity and establish its fixed-parameter tractability w.r.t. the solution size plus the power of the hypercube. Given that several such FO-definable problems have been shown to be fixed-parameter tractable on the considered graph class, one may ask whether fixed-parameter tractability could be extended to capture all FO-definable problems. We answer this question in the negative by showing that FO model checking on induced subgraphs of hypercubes is as difficult as FO model checking on general graphs.
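The graph formulation mentioned in the abstract can be illustrated with a small sketch. It assumes fully specified Boolean vectors (no missing entries) and uses the standard notion that two vertices of the r-th power of a hypercube are adjacent when their Hamming distance is at most r, so an independent set is a set of vectors that pairwise differ in more than r coordinates. The function names `hamming` and `find_diverse_subset` are hypothetical, and the brute-force search is for illustration only; it is not the fixed-parameter algorithm from the paper.

```python
from itertools import combinations


def hamming(u, v):
    """Number of coordinates in which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def find_diverse_subset(vectors, k, r):
    """Return k vectors with pairwise Hamming distance > r, or None.

    This is Independent Set on the subgraph of the r-th power of the
    hypercube induced by `vectors` (assumption: all entries are known).
    Brute force over all k-subsets, exponential in k.
    """
    for subset in combinations(vectors, k):
        if all(hamming(u, v) > r for u, v in combinations(subset, 2)):
            return subset
    return None


points = ["0000", "0001", "0110", "1111"]
print(find_diverse_subset(points, 3, 1))
print(find_diverse_subset(points, 3, 2))
```

In this toy instance, a solution of size 3 exists for r = 1 but not for r = 2, since every triple contains a pair of vectors at distance at most 2.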

Subject Classification

ACM Subject Classification
  • Theory of computation → Parameterized complexity and exact algorithms
Keywords
  • Independent Set
  • Powers of Hypercubes
  • Diversity
  • Parameterized Complexity
  • Incomplete Data

