Step-By-Step Community Detection in Volume-Regular Graphs

Authors Luca Becchetti , Emilio Cruciani , Francesco Pasquale , Sara Rizzo



PDF
Thumbnail PDF

File

LIPIcs.ISAAC.2019.20.pdf
  • Filesize: 0.64 MB
  • 23 pages

Document Identifiers

Author Details

Luca Becchetti
  • Sapienza Università di Roma, Italy
Emilio Cruciani
  • Gran Sasso Science Institute, L'Aquila, Italy
Francesco Pasquale
  • Università di Roma "Tor Vergata", Italy
Sara Rizzo
  • Gran Sasso Science Institute, L'Aquila, Italy

Cite AsGet BibTex

Luca Becchetti, Emilio Cruciani, Francesco Pasquale, and Sara Rizzo. Step-By-Step Community Detection in Volume-Regular Graphs. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 20:1-20:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ISAAC.2019.20

Abstract

Spectral techniques have proved amongst the most effective approaches to graph clustering. However, in general they require explicit computation of the main eigenvectors of a suitable matrix (usually the Laplacian matrix of the graph). Recent work (e.g., Becchetti et al., SODA 2017) suggests that observing the temporal evolution of the power method applied to an initial random vector may, at least in some cases, provide enough information on the space spanned by the first two eigenvectors, so as to allow recovery of a hidden partition without explicit eigenvector computations. While the results of Becchetti et al. apply to perfectly balanced partitions and/or graphs that exhibit very strong forms of regularity, we extend their approach to graphs containing a hidden k partition and characterized by a milder form of volume-regularity. We show that the class of k-volume regular graphs is the largest class of undirected (possibly weighted) graphs whose transition matrix admits k "stepwise" eigenvectors (i.e., vectors that are constant over each set of the hidden partition). To obtain this result, we highlight a connection between volume regularity and lumpability of Markov chains. Moreover, we prove that if the stepwise eigenvectors are those associated to the first k eigenvalues and the gap between the k-th and the (k+1)-th eigenvalues is sufficiently large, the Averaging dynamics of Becchetti et al. recovers the underlying community structure of the graph in logarithmic time, with high probability.

Subject Classification

ACM Subject Classification
  • Theory of computation → Distributed algorithms
Keywords
  • Community detection
  • Distributed algorithms
  • Dynamics
  • Markov chains
  • Spectral analysis

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Emmanuel Abbe, Afonso S. Bandeira, and Georgina Hall. Exact Recovery in the Stochastic Block Model. IEEE Trans. Information Theory, 62(1):471-487, 2016. URL: https://doi.org/10.1109/TIT.2015.2490670.
  2. Emmanuel Abbe and Colin Sandon. Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap. CoRR, abs/1512.09080, 2015. URL: http://arxiv.org/abs/1512.09080.
  3. Dana Angluin, James Aspnes, and David Eisenstat. A simple population protocol for fast robust approximate majority. Distributed Computing, 21(2):87-102, 2008. (Preliminary version appeared in DISC 2007). URL: https://doi.org/10.1007/s00446-008-0059-z.
  4. Michael J. Barber and John W. Clark. Detecting network communities by propagating labels under constraints. Phys. Rev. E, 80:026129, August 2009. URL: https://doi.org/10.1103/PhysRevE.80.026129.
  5. Luca Becchetti, Andrea E. F. Clementi, Pasin Manurangsi, Emanuele Natale, Francesco Pasquale, Prasad Raghavendra, and Luca Trevisan. Average Whenever You Meet: Opportunistic Protocols for Community Detection. In 26th Annual European Symposium on Algorithms, ESA 2018, August 20-22, 2018, Helsinki, Finland, pages 7:1-7:13, 2018. URL: https://doi.org/10.4230/LIPIcs.ESA.2018.7.
  6. Luca Becchetti, Andrea E. F. Clementi, Emanuele Natale, Francesco Pasquale, and Luca Trevisan. Find Your Place: Simple Distributed Algorithms for Community Detection. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 940-959, 2017. URL: https://doi.org/10.1137/1.9781611974782.59.
  7. Ravi B. Boppana. Eigenvalues and Graph Bisection: An Average-Case Analysis (Extended Abstract). In 28th Annual Symposium on Foundations of Computer Science, Los Angeles, California, USA, 27-29 October 1987, pages 280-285, 1987. URL: https://doi.org/10.1109/SFCS.1987.22.
  8. Charles Bordenave, Marc Lelarge, and Laurent Massoulié. Nonbacktracking spectrum of random graphs: Community detection and nonregular Ramanujan graphs. Ann. Probab., 46(1):1-71, January 2018. URL: https://doi.org/10.1214/16-AOP1142.
  9. Fan RK Chung. Laplacians of graphs and Cheeger’s inequalities. Combinatorics, Paul Erdos is Eighty, 2(157-172):13-2, 1996. Google Scholar
  10. Amin Coja-Oghlan. Spectral techniques, semidefinite programs, and random graphs. PhD thesis, Habilitationsschrift, Humboldt Universität zu Berlin, Institut für Informatik, 2005. Google Scholar
  11. Amin Coja-Oghlan. Graph Partitioning via Adaptive Spectral Techniques. Comb. Probab. Comput., 19(2):227-284, March 2010. URL: https://doi.org/10.1017/S0963548309990514.
  12. Emilio Cruciani, Emanuele Natale, André Nusser, and Giacomo Scornavacca. Phase Transition of the 2-Choices Dynamics on Core-Periphery Networks. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS 2018, Stockholm, Sweden, July 10-15, 2018, pages 777-785, 2018. URL: http://dl.acm.org/citation.cfm?id=3237499.
  13. Emilio Cruciani, Emanuele Natale, and Giacomo Scornavacca. Distributed Community Detection via Metastability of the 2-Choices Dynamics. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019., pages 6046-6053, 2019. URL: https://aaai.org/ojs/index.php/AAAI/article/view/4560, URL: https://doi.org/10.1609/aaai.v33i01.33016046.
  14. Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborová. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E, 84:066106, December 2011. URL: https://doi.org/10.1103/PhysRevE.84.066106.
  15. M.E Dyer and A.M Frieze. The solution of some random NP-hard problems in polynomial expected time. Journal of Algorithms, 10(4):451-489, 1989. URL: https://doi.org/10.1016/0196-6774(89)90001-1.
  16. P. Erdös. On a lemma of Littlewood and Offord. Bull. Amer. Math. Soc., 51(12):898-902, December 1945. URL: https://projecteuclid.org:443/euclid.bams/1183507531.
  17. Miroslav Fiedler. Laplacian of graphs and algebraic connectivity. Banach Center Publications, 25(1):57-70, 1989. URL: http://eudml.org/doc/267812.
  18. Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75-174, 2010. URL: https://doi.org/10.1016/j.physrep.2009.11.002.
  19. Yehuda Hassin and David Peleg. Distributed Probabilistic Polling and Applications to Proportionate Agreement. Inf. Comput., 171(2):248-268, 2001. URL: https://doi.org/10.1006/inco.2001.3088.
  20. Paul W. Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109-137, 1983. URL: https://doi.org/10.1016/0378-8733(83)90021-7.
  21. Mark Jerrum and Gregory B. Sorkin. The Metropolis Algorithm for Graph Bisection. Discrete Applied Mathematics, 82(1-3):155-175, 1998. URL: https://doi.org/10.1016/S0166-218X(97)00133-9.
  22. John G. Kemeny and J. Laurie Snell. Finite Markov chains. D. van Nostrand Company, inc., Princeton, N.J., 1960. Google Scholar
  23. David Kempe and Frank McSherry. A decentralized algorithm for spectral analysis. J. Comput. Syst. Sci., 74(1):70-83, 2008. (Preliminary version appeared in STOC 2004). URL: https://doi.org/10.1016/j.jcss.2007.04.014.
  24. Kishore Kothapalli, Sriram V. Pemmaraju, and Vivek Sardeshmukh. On the Analysis of a Label Propagation Algorithm for Community Detection. In Distributed Computing and Networking, 14th International Conference, ICDCN 2013, Mumbai, India, January 3-6, 2013. Proceedings, pages 255-269, 2013. URL: https://doi.org/10.1007/978-3-642-35668-1_18.
  25. Florent Krzakala, Cristopher Moore, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborová, and Pan Zhang. Spectral redemption in clustering sparse networks. Proceedings of the National Academy of Sciences, 110(52):20935-20940, 2013. URL: https://doi.org/10.1073/pnas.1312486110.
  26. Cornelius Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Res. Natl. Bur. Stand. B, 45:255-282, 1950. URL: https://doi.org/10.6028/jres.045.026.
  27. James R. Lee, Shayan Oveis Gharan, and Luca Trevisan. Multiway Spectral Partitioning and Higher-Order Cheeger Inequalities. J. ACM, 61(6):37:1-37:30, 2014. URL: https://doi.org/10.1145/2665063.
  28. Jure Leskovec, Anand Rajaraman, and Jeffrey D. Ullman. Mining of Massive Datasets, 2nd Ed. Cambridge University Press, 2014. URL: http://www.mmds.org/.
  29. X. Liu and T. Murata. Advanced modularity-specialized label propagation algorithm for detecting communities in networks. Physica A: Statistical Mechanics and its Applications, 389(7):1493-1500, 2010. URL: https://doi.org/10.1016/j.physa.2009.12.019.
  30. David J. C. MacKay. Information theory, inference, and learning algorithms. Cambridge University Press, 2003. Google Scholar
  31. Frederik Mallmann-Trenn, Cameron Musco, and Christopher Musco. Eigenvector Computation and Community Detection in Asynchronous Gossip Models. In 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, pages 159:1-159:14, 2018. URL: https://doi.org/10.4230/LIPIcs.ICALP.2018.159.
  32. Frank McSherry. Spectral Partitioning of Random Graphs. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 529-537, 2001. URL: https://doi.org/10.1109/SFCS.2001.959929.
  33. S. J. Montgomery-Smith. The Distribution of Rademacher Sums. Proceedings of the American Mathematical Society, 109(2):517-522, 1990. URL: http://www.jstor.org/stable/2048015.
  34. Joris M. Mooij and Hilbert J. Kappen. Sufficient Conditions for Convergence of the Sum–Product Algorithm. IEEE Transactions on Information Theory, 53(12):4422-4437, December 2007. URL: https://doi.org/10.1109/TIT.2007.909166.
  35. Elchanan Mossel, Joe Neeman, and Allan Sly. Belief propagation, robust reconstruction and optimal recovery of block models. The Annals of Applied Probability, 26(4):2211-2256, 2016. (Preliminary version appeared in COLT 2014). Google Scholar
  36. Elchanan Mossel, Joe Neeman, and Allan Sly. A Proof of the Block Model Threshold Conjecture. Combinatorica, 38(3):665-708, 2018. URL: https://doi.org/10.1007/s00493-016-3238-8.
  37. Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On Spectral Clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3-8, 2001, Vancouver, British Columbia, Canada], pages 849-856, 2001. URL: http://papers.nips.cc/paper/2092-on-spectral-clustering-analysis-and-an-algorithm.
  38. Richard Peng, He Sun, and Luca Zanetti. Partitioning Well-Clustered Graphs: Spectral Clustering Works! SIAM J. Comput., 46(2):710-743, 2017. (Preliminary version appeared in COLT 2015). URL: https://doi.org/10.1137/15M1047209.
  39. Usha Nandini Raghavan, Réka Albert, and Soundar Kumara. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E, 76:036106, September 2007. URL: https://doi.org/10.1103/PhysRevE.76.036106.
  40. Irina Shevtsova. On the absolute constants in the Berry-Esseen-type inequalities. Doklady Mathematics, 89(3):378-381, May 2014. URL: https://doi.org/10.1134/S1064562414030338.
  41. Jianbo Shi and Jitendra Malik. Normalized Cuts and Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888-905, 2000. (Preliminary version appeared in CVPR 1997). URL: https://doi.org/10.1109/34.868688.
  42. J Michael Steele. The Cauchy-Schwarz master class: an introduction to the art of mathematical inequalities. Cambridge University Press, 2004. Google Scholar
  43. He Sun and Luca Zanetti. Distributed Graph Clustering and Sparsification. CoRR, abs/1711.01262, 2017. URL: http://arxiv.org/abs/1711.01262.
  44. Jianjun Paul Tian and D. Kannan. Lumpability and Commutativity of Markov Processes. Stochastic Analysis and Applications, 24(3):685-702, 2006. URL: https://doi.org/10.1080/07362990600632045.
  45. Ulrike von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395-416, 2007. URL: https://doi.org/10.1007/s11222-007-9033-z.
  46. Yair Weiss. Correctness of Local Probability Propagation in Graphical Models with Loops. Neural Computation, 12(1):1-41, 2000. URL: https://doi.org/10.1162/089976600300015880.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail