Is It Easier to Count Communities Than Find Them?
Random graph models with community structure have been studied extensively in the literature. For both the problems of detecting and recovering community structure, an interesting landscape of statistical and computational phase transitions has emerged. A natural unanswered question is: might it be possible to infer properties of the community structure (for instance, the number and sizes of communities) even in situations where actually finding those communities is believed to be computationally hard? We show the answer is no. In particular, we consider certain hypothesis testing problems between models with different community structures, and we show (in the low-degree polynomial framework) that testing between two options is as hard as finding the communities.
In addition, our methods give the first computational lower bounds for testing between two different "planted" distributions, whereas previous results have considered testing between a planted distribution and an i.i.d. "null" distribution.
Community detection
Hypothesis testing
Low-degree polynomials
Theory of computation~Random network models
Theory of computation~Computational complexity and cryptography
94:1-94:23
Regular Paper
This work began when the authors were visiting the Simons Institute for the Theory of Computing during the program on Computational Complexity of Statistical Inference in Fall 2021. We are grateful to Guy Bresler for helpful discussions.
Cynthia
Rush
Cynthia Rush
Department of Statistics, Columbia University, New York, NY, USA
https://orcid.org/0000-0001-6857-2855
Part of this work was supported by NSF CCF-1849883 and part of the work was done while visiting the Simons Institute for the Theory of Computing, supported by a Google Research Fellowship.
Fiona
Skerman
Fiona Skerman
Department of Mathematics, Uppsala University, Sweden
https://orcid.org/0000-0003-4141-7059
Partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) and the project AI4Research at Uppsala University. Part of this work was done while visiting the Simons Institute for the Theory of Computing, supported by a Simons-Berkeley Research Fellowship.
Alexander S.
Wein
Alexander S. Wein
Department of Mathematics, University of California, Davis, CA, USA
https://orcid.org/0000-0002-3406-1747
Part of this work was done at Georgia Tech, supported by NSF grants CCF-2007443 and CCF-2106444. Part of this work was done while visiting the Simons Institute for the Theory of Computing, supported by a Simons-Berkeley Research Fellowship.
Dana
Yang
Dana Yang
Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
https://orcid.org/0000-0002-2158-0759
Part of this work was done while visiting the Simons Institute for the Theory of Computing, supported by a Simons-Berkeley Research Fellowship.
10.4230/LIPIcs.ITCS.2023.94
Emmanuel Abbe. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research, 18(1):6446-6531, 2017.
Ery Arias-Castro and Nicolas Verzelen. Community detection in random networks. arXiv preprint, 2013. URL: http://arxiv.org/abs/1302.7099.
Afonso S Bandeira, Ahmed El Alaoui, Samuel B Hopkins, Tselil Schramm, Alexander S Wein, and Ilias Zadik. The Franz-Parisi criterion and computational trade-offs in high dimensional statistics. arXiv preprint, 2022. URL: http://arxiv.org/abs/2205.09727.
Jess Banks, Sidhanth Mohanty, and Prasad Raghavendra. Local statistics, semidefinite programming, and community detection. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1298-1316. SIAM, 2021.
Boaz Barak, Samuel Hopkins, Jonathan Kelner, Pravesh K Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM Journal on Computing, 48(2):687-735, 2019.
Quentin Berthet and Philippe Rigollet. Complexity theoretic lower bounds for sparse principal component detection. In Conference on Learning Theory, pages 1046-1066. PMLR, 2013.
Matthew Brennan, Guy Bresler, and Wasim Huleihel. Reducibility and computational lower bounds for problems with planted sparse structure. In Conference on Learning Theory, pages 48-166. PMLR, 2018.
Yudong Chen and Jiaming Xu. Statistical-computational tradeoffs in planted problems and submatrix localization with a growing number of clusters and submatrices. The Journal of Machine Learning Research, 17(1):882-938, 2016.
Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborová. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E, 84(6):066106, 2011.
Ilias Diakonikolas, Daniel M Kane, and Alistair Stewart. Statistical query lower bounds for robust estimation of high-dimensional Gaussians and Gaussian mixtures. In 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 73-84. IEEE, 2017.
Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh S Vempala, and Ying Xiao. Statistical algorithms and a lower bound for detecting planted cliques. Journal of the ACM (JACM), 64(2):1-37, 2017.
Bruce Hajek, Yihong Wu, and Jiaming Xu. Computational lower bounds for community detection on random graphs. In Conference on Learning Theory, pages 899-928. PMLR, 2015.
Samuel Hopkins. Statistical Inference and the Sum of Squares Method. PhD thesis, Cornell University, 2018.
Samuel B Hopkins, Pravesh K Kothari, Aaron Potechin, Prasad Raghavendra, Tselil Schramm, and David Steurer. The power of sum-of-squares for detecting hidden structures. In 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 720-731. IEEE, 2017.
Samuel B Hopkins and David Steurer. Efficient Bayesian estimation from few samples: community detection and related problems. In 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 379-390. IEEE, 2017.
Pravesh K Kothari, Ryuhei Mori, Ryan O'Donnell, and David Witmer. Sum of squares lower bounds for refuting any CSP. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 132-145, 2017.
Dmitriy Kunisky, Alexander S Wein, and Afonso S Bandeira. Notes on computational hardness of hypothesis testing: Predictions using the low-degree likelihood ratio. In ISAAC Congress (International Society for Analysis, its Applications and Computation), pages 1-50. Springer, 2022.
Cristopher Moore. The computer science and physics of community detection: Landscapes, phase transitions, and hardness. Bulletin of EATCS, 1(121), 2017.
Tselil Schramm and Alexander S Wein. Computational barriers to estimation from low-degree polynomials. The Annals of Statistics, 50(3):1833-1858, 2022.
Cynthia Rush, Fiona Skerman, Alexander S. Wein, and Dana Yang
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode