Sample Complexity Bounds for Influence Maximization

Authors Gal Sadeh, Edith Cohen , Haim Kaplan

Thumbnail PDF


  • Filesize: 0.76 MB
  • 36 pages

Document Identifiers

Author Details

Gal Sadeh
  • Tel Aviv University, Israel
Edith Cohen
  • Google Research, Mountain View, CA, USA
  • Tel Aviv University, Israel
Haim Kaplan
  • Google Research, Tel Aviv, Israel
  • Tel Aviv University, Israel

Cite AsGet BibTex

Gal Sadeh, Edith Cohen, and Haim Kaplan. Sample Complexity Bounds for Influence Maximization. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 29:1-29:36, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Influence maximization (IM) is the problem of finding for a given s ≥ 1 a set S of |S|=s nodes in a network with maximum influence. With stochastic diffusion models, the influence of a set S of seed nodes is defined as the expectation of its reachability over simulations, where each simulation specifies a deterministic reachability function. Two well-studied special cases are the Independent Cascade (IC) and the Linear Threshold (LT) models of Kempe, Kleinberg, and Tardos [Kempe et al., 2003]. The influence function in stochastic diffusion is unbiasedly estimated by averaging reachability values over i.i.d. simulations. We study the IM sample complexity: the number of simulations needed to determine a (1-ε)-approximate maximizer with confidence 1-δ. Our main result is a surprising upper bound of O(s τ ε^{-2} ln (n/δ)) for a broad class of models that includes IC and LT models and their mixtures, where n is the number of nodes and τ is the number of diffusion steps. Generally τ ≪ n, so this significantly improves over the generic upper bound of O(s n ε^{-2} ln (n/δ)). Our sample complexity bounds are derived from novel upper bounds on the variance of the reachability that allow for small relative error for influential sets and additive error when influence is small. Moreover, we provide a data-adaptive method that can detect and utilize fewer simulations on models where it suffices. Finally, we provide an efficient greedy design that computes an (1-1/e-ε)-approximate maximizer from simulations and applies to any submodular stochastic diffusion model that satisfies the variance bounds.

Subject Classification

ACM Subject Classification
  • Theory of computation → Machine learning theory
  • Sample complexity
  • Influence maximization
  • Submodular maximization


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. J. Comput. System Sci., 58:137-147, 1999. Google Scholar
  2. E. Balkanski, A. Rubinstein, and Y. Singer. The limitations of optimization from samples. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, 2017. URL:
  3. C. Borg, M. Brautbar, J. Chayes, and B. Lucier. Maximizing Social Influence in Nearly Optimal Time. In SODA, 2014. Google Scholar
  4. W. Chen, L. V. S. Lakshmanan, and C. Castillo. Information and Influence Propagation in Social Networks. Morgan & Claypool, 2013. Google Scholar
  5. W. Chen, W. Lu, and Y. Zhang. Time-critical influence maximization in social networks with time-delayed diffusion process. In AAAI, 2012. Google Scholar
  6. W. Chen, C. Wang, and Y. Wang. Scalable Influence Maximization for Prevalent Viral marketing in Large-Scale Social Networks. In KDD. ACM, 2010. Google Scholar
  7. W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. In KDD. ACM, 2009. Google Scholar
  8. W. Chen, Y. Yuan, and L. Zhang. Scalable Influence Maximization in Social Networks under the Linear Threshold Model. In ICDM, 2010. Google Scholar
  9. Wei Chen, Tian Lin, Zihan Tan, Mingfei Zhao, and Xuren Zhou. Robust Influence Maximization. In Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16. ACM, 2016. URL:
  10. E. Cohen. Size-Estimation Framework with Applications to Transitive Closure and Reachability. J. Comput. System Sci., 55:441-453, 1997. Google Scholar
  11. E. Cohen. All-Distances Sketches, Revisited: HIP estimators for massive graphs analysis. TKDE, 2015. URL:
  12. E. Cohen. Multi-Objective Weighted Sampling. In HotWeb. IEEE, 2015. full version: URL:
  13. E. Cohen, S. Chechik, and H. Kaplan. Clustering Small Samples with Quality Guarantees: Adaptivity with One2all pps. In AAAI, 2018. URL:
  14. E. Cohen, D. Delling, T. Pajor, and R. F. Werneck. Sketch-based Influence Maximization and Computation: Scaling up with Guarantees. In CIKM. ACM, 2014. URL:
  15. E. Cohen, D. Delling, T. Pajor, and R. F. Werneck. Distance-Based Influence in Networks: Computation and Maximization. Technical Report cs.SI/1410.6976, arXiv, 2015. URL:
  16. E. Cohen, N. Grossuag, and H. Kaplan. Processing Top-k Queries from Samples. In Proceedings of the 2006 ACM conference on Emerging network experiment and technology (CoNext). ACM, 2006. Google Scholar
  17. P. Domingos and M. Richardson. Mining the network value of customers. In KDD. ACM, 2001. Google Scholar
  18. N. Du, Y. Liang, M-F. Balcan, and L. Song. Influence Function Learning in Information Diffusion Networks. In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML'14, 2014. Google Scholar
  19. N. Du, L. Song, M. Gomez-Rodriguez, and H. Zha. Scalable Influence Estimation in Continuous-Time Diffusion Networks. In NIPS. NIPS, 2013. Google Scholar
  20. D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, New York, NY, USA, 2010. Google Scholar
  21. U. Feige. A threshold of ln n for approximating set cover. J. Assoc. Comput. Mach., 45:634-652, 1998. Google Scholar
  22. M. Gomez-Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the Temporal Dynamics of Diffusion Networks. In ICML, 2011. Google Scholar
  23. M. Gomez-Rodriguez, J. Leskovec, and A. Krause. Inferring Networks of Diffusion and Influence. In KDD, 2010. Google Scholar
  24. A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. Learning influence probabilities in social networks. In WSDM, 2010. Google Scholar
  25. A. Goyal, W. Lu, and L.V.S. Lakshmanan. CELF++: Optimizing the Greedy Algorithm for Influence Maximization in Social Networks. In WWW. ACM, 2011. Google Scholar
  26. M. Granovetter. Threshold Models of Collective Behavior. American Journal of Sociology, 83(6), 1978. Google Scholar
  27. Xinran He and David Kempe. Robust Influence Maximization. In Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16. ACM, 2016. URL:
  28. K. Huang, S. Wang, G. S. Bevilacqua, X. Xiao, and L. K. S. Lakshmanan. Revisiting the Stop-and-Stare Algorithms for Influence Maximization. PVLDB, 10, 2017. Google Scholar
  29. M. O. Jackson. Social and economic networks. Princeton University Press, 2010. Google Scholar
  30. K. Jung, W. Heo, and W. Chen. IRIE: Scalable and robust influence maximization in social networks. In ICDM. ACM, 2012. Google Scholar
  31. D. Kempe, J. M. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD. ACM, 2003. Google Scholar
  32. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and Glance N. Cost-effective outbreak detection in networks. In KDD. ACM, 2007. Google Scholar
  33. B. Liu, G. Cong, X. Dong, and Y. Zeng. Time Constrained Influence Maximization in Social Networks. In ICDM, 2012. Google Scholar
  34. B. Mirzasoleiman, A. Karbasi, R. Sarkar, and A. Krause. Distributed Submodular Maximization: Identifying Representative Elements in Massive Data. In NIPS, 2013. Google Scholar
  35. Elchanan Mossel and Sébastien Roch. On the submodularity of influence in social networks. In STOC, 2007. Google Scholar
  36. Elchanan Mossel and Sebastien Roch. Submodularity of Infuence in Social Networks: From Local to Global., July 2009. (Accessed on 08 2019).
  37. G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations of maximizing submodular set functions. Mathematical Programming, 14, 1978. Google Scholar
  38. H. T. Nguyen, M. T. Thai, and T. N. Dinh. Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-scale Networks. In SIGMOD. ACM, 2016. Google Scholar
  39. H. T. Nguyen, M. T. Thai, and T. N. Dinh. A Billion-Scale Approximation Algorithm for Maximizing Benefit in Viral Marketing. IEEE/ACM Trans. Netw., 25(4), 2017. Google Scholar
  40. M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In KDD. ACM, 2002. Google Scholar
  41. K. Saito, R. Nakano, and M. Kimura. Prediction of Information Diffusion Probabilities for Independent Cascade Model. In Knowledge-Based Intelligent Information and Engineering Systems (KES), volume 5179 of Lecture Notes in Computer Science. Springer, 2008. Google Scholar
  42. Y. Tang, Y. Shi, and X. Xiao. Influence Maximization in Near-Linear Time: A Martingale Approach. In SIGMOD, 2015. Google Scholar
  43. Y. Tang, X. Xiao, and Y. Shi. Influence Maximization: Near-Optimal Time Complexity Meets Practical Efficiency. In SIGMOD, 2014. Google Scholar
  44. J. Travers and S. Milgram. An Experimental Study of the Small World Problem. Sociometry, 32:425-443, December 1969. URL:
Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail