Eigenvector Computation and Community Detection in Asynchronous Gossip Models

Authors Frederik Mallmann-Trenn, Cameron Musco, Christopher Musco



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2018.159.pdf
  • Filesize: 0.51 MB
  • 14 pages

Document Identifiers

Author Details

Frederik Mallmann-Trenn
  • CSAIL, MIT, US
Cameron Musco
  • CSAIL, MIT, US
Christopher Musco
  • CSAIL, MIT, US

Cite As Get BibTex

Frederik Mallmann-Trenn, Cameron Musco, and Christopher Musco. Eigenvector Computation and Community Detection in Asynchronous Gossip Models. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 159:1-159:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ICALP.2018.159

Abstract

We give a simple distributed algorithm for computing adjacency matrix eigenvectors for the communication graph in an asynchronous gossip model. We show how to use this algorithm to give state-of-the-art asynchronous community detection algorithms when the communication graph is drawn from the well-studied stochastic block model. Our methods also apply to a natural alternative model of randomized communication, where nodes within a community communicate more frequently than nodes in different communities.
Our analysis simplifies and generalizes prior work by forging a connection between asynchronous eigenvector computation and Oja's algorithm for streaming principal component analysis. We hope that our work serves as a starting point for building further connections between the analysis of stochastic iterative methods, like Oja's algorithm, and work on asynchronous and gossip-type algorithms for distributed computation.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Distributed algorithms
Keywords
  • block model
  • community detection
  • distributed clustering
  • eigenvector computation
  • gossip algorithms
  • population protocols

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Zeyuan Allen-Zhu and Yuanzhi Li. First efficient convergence for streaming k-pca: a global, gap-free, and near-optimal rate. In Proceedings of the \nth\intcalcSub20171959 Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 487-492, 2017. Google Scholar
  2. James Aspnes and Eric Ruppert. An introduction to population protocols. In Middleware for Network Eccentric and Mobile Applications, pages 97-120. Springer, 2009. Google Scholar
  3. Luca Becchetti, Andrea Clementi, , Pasin Manurangsi, Emanuele Natale, Francesco Pasquale, Prasad Raghavendra, and Luca Trevisan. Average whenever you meet: Opportunistic protocols for community detection. http://arxiv.org/abs/1703.05045, 2017. Google Scholar
  4. Luca Becchetti, Andrea Clementi, Emanuele Natale, Francesco Pasquale, and Luca Trevisan. Find your place: Simple distributed algorithms for community detection. In Proceedings of the \nth\intcalcSub20171989 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 940-959, 2017. Google Scholar
  5. Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Advances in neural information processing systems, pages 585-591, 2002. Google Scholar
  6. Phillip Bonacich. Some unique properties of eigenvector centrality. Social networks, 29(4):555-564, 2007. Google Scholar
  7. Stephen Boyd, Arpita Ghosh, Balaji Prabhakar, and Devavrat Shah. Randomized gossip algorithms. IEEE/ACM Transactions on Networking (TON), 14(SI):2508-2530, 2006. Google Scholar
  8. Shakila Bu-Pasha. Cross-border issues under eu data protection law with regards to personal data protection. Information &Communications Technology Law, 26(3):1-16, 05 2017. URL: http://dx.doi.org/10.1080/13600834.2017.1330740.
  9. Sam Cole, Shmuel Friedland, and Lev Reyzin. A simple spectral algorithm for recovering planted partitions. Special Matrices, 5:139-157, 2017. Google Scholar
  10. Christopher M De Sa, Ce Zhang, Kunle Olukotun, and Christopher Ré. Taming the wild: A unified analysis of hogwild-style algorithms. In Advances in neural information processing systems, pages 2674-2682, 2015. Google Scholar
  11. Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborová. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E, 84(6):066106, 2011. Google Scholar
  12. Alexandros G Dimakis, Soummya Kar, José MF Moura, Michael G Rabbat, and Anna Scaglione. Gossip algorithms for distributed signal processing. Proceedings of the IEEE, 98(11):1847-1864, 2010. Google Scholar
  13. Chris HQ Ding, Xiaofeng He, Hongyuan Zha, Ming Gu, and Horst D Simon. A min-max cut algorithm for graph partitioning and data clustering. In Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on, pages 107-114. IEEE, 2001. Google Scholar
  14. Nisrine Ghadban, Paul Honeine, Farah Mourad-Chehade, Joumana Farah, and Clovis Francis. Gossip algorithms for principal component analysis in networks. In Signal Processing Conference (EUSIPCO), 2015 23rd European, pages 2366-2370. IEEE, 2015. Google Scholar
  15. Paul W Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic blockmodels: First steps. Social networks, 5(2):109-137, 1983. Google Scholar
  16. Prateek Jain, Chi Jin, Sham M. Kakade, Praneeth Netrapalli, and Aaron Sidford. Streaming pca: Matching matrix bernstein and near-optimal finite sample guarantees for oja’s algorithm. In Proceedings of the \nth\intcalcSub20161987 Annual Conference on Computational Learning Theory (COLT), 2016. Google Scholar
  17. David Kempe and Frank McSherry. A decentralized algorithm for spectral analysis. Journal of Computer and System Sciences, 74(1):70-83, 2008. Google Scholar
  18. Satish Babu Korada, Andrea Montanari, and Sewoong Oh. Gossip pca. In Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, pages 209-220. ACM, 2011. Google Scholar
  19. Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, James C Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages 135-146. ACM, 2010. Google Scholar
  20. Frederik Mallmann-Trenn, Cameron Musco, and Christopher Musco. Eigenvector computation and community detection in asynchronous gossip models. http://arxiv.org/abs/1804.08548, 2018. Google Scholar
  21. F. McSherry. Spectral partitioning of random graphs. In Proceedings of the \nth\intcalcSub20011959 Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 529-537, 2001. Google Scholar
  22. Frank McSherry. Spectral partitioning of random graphs. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 529-537, 2001. Google Scholar
  23. Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005. Google Scholar
  24. Gemma Morral, Pascal Bianchi, and Jérémie Jakubowicz. Asynchronous distributed principal component analysis using stochastic approximation. In Decision and Control (CDC), 2012 IEEE 51st Annual Conference on, pages 1398-1403. IEEE, 2012. Google Scholar
  25. Elchanan Mossel, Joe Neeman, and Allan Sly. Belief propagation, robust reconstruction and optimal recovery of block models. In Conference on Learning Theory, pages 356-370, 2014. Google Scholar
  26. Andrew Y Ng, Michael I Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. In Advances in neural information processing systems, pages 849-856, 2002. Google Scholar
  27. Erkki Oja. Simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15(3):267-273, 1982. Google Scholar
  28. Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in neural information processing systems, pages 693-701, 2011. Google Scholar
  29. Semih Salihoglu, Jaeho Shin, Vikesh Khanna, Ba Quan Truong, and Jennifer Widom. Graft: A debugging tool for apache giraph. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1403-1408. ACM, 2015. Google Scholar
  30. Daniel Spielman. Spectral graph theory: Spectral partitiong in a stochastic block model. University Lecture, 2015. Google Scholar
  31. John Tsitsiklis, Dimitri Bertsekas, and Michael Athans. Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE transactions on automatic control, 31(9):803-812, 1986. Google Scholar
  32. Van Vu. A simple SVD algorithm for finding hidden partitions. Combinatorics, Probability and Computing, 27(1):124-140, 2018. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail