Sample Complexity Bounds for Influence Maximization

Sadeh, Gal; Cohen, Edith; Kaplan, Haim

doi:10.4230/LIPIcs.ITCS.2020.29

File

Author Details

Gal Sadeh

Tel Aviv University, Israel

Edith Cohen

Google Research, Mountain View, CA, USA
Tel Aviv University, Israel

Haim Kaplan

Google Research, Tel Aviv, Israel
Tel Aviv University, Israel

Cite As Get BibTex

Gal Sadeh, Edith Cohen, and Haim Kaplan. Sample Complexity Bounds for Influence Maximization. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 29:1-29:36, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.ITCS.2020.29

Abstract

Influence maximization (IM) is the problem of finding for a given s ≥ 1 a set S of |S|=s nodes in a network with maximum influence. With stochastic diffusion models, the influence of a set S of seed nodes is defined as the expectation of its reachability over simulations, where each simulation specifies a deterministic reachability function. Two well-studied special cases are the Independent Cascade (IC) and the Linear Threshold (LT) models of Kempe, Kleinberg, and Tardos [Kempe et al., 2003]. The influence function in stochastic diffusion is unbiasedly estimated by averaging reachability values over i.i.d. simulations. We study the IM sample complexity: the number of simulations needed to determine a (1-ε)-approximate maximizer with confidence 1-δ. Our main result is a surprising upper bound of O(s τ ε^{-2} ln (n/δ)) for a broad class of models that includes IC and LT models and their mixtures, where n is the number of nodes and τ is the number of diffusion steps. Generally τ ≪ n, so this significantly improves over the generic upper bound of O(s n ε^{-2} ln (n/δ)). Our sample complexity bounds are derived from novel upper bounds on the variance of the reachability that allow for small relative error for influential sets and additive error when influence is small. Moreover, we provide a data-adaptive method that can detect and utilize fewer simulations on models where it suffices. Finally, we provide an efficient greedy design that computes an (1-1/e-ε)-approximate maximizer from simulations and applies to any submodular stochastic diffusion model that satisfies the variance bounds.

Subject Classification

ACM Subject Classification

Theory of computation → Machine learning theory

Keywords

Sample complexity
Influence maximization
Submodular maximization

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. J. Comput. System Sci., 58:137-147, 1999.
E. Balkanski, A. Rubinstein, and Y. Singer. The limitations of optimization from samples. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, 2017. URL: https://doi.org/10.1145/3055399.3055406.
C. Borg, M. Brautbar, J. Chayes, and B. Lucier. Maximizing Social Influence in Nearly Optimal Time. In SODA, 2014.
W. Chen, L. V. S. Lakshmanan, and C. Castillo. Information and Influence Propagation in Social Networks. Morgan & Claypool, 2013.
W. Chen, W. Lu, and Y. Zhang. Time-critical influence maximization in social networks with time-delayed diffusion process. In AAAI, 2012.
W. Chen, C. Wang, and Y. Wang. Scalable Influence Maximization for Prevalent Viral marketing in Large-Scale Social Networks. In KDD. ACM, 2010.
W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. In KDD. ACM, 2009.
W. Chen, Y. Yuan, and L. Zhang. Scalable Influence Maximization in Social Networks under the Linear Threshold Model. In ICDM, 2010.
Wei Chen, Tian Lin, Zihan Tan, Mingfei Zhao, and Xuren Zhou. Robust Influence Maximization. In Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16. ACM, 2016. URL: https://doi.org/10.1145/2939672.2939745.
E. Cohen. Size-Estimation Framework with Applications to Transitive Closure and Reachability. J. Comput. System Sci., 55:441-453, 1997.
E. Cohen. All-Distances Sketches, Revisited: HIP estimators for massive graphs analysis. TKDE, 2015. URL: http://arxiv.org/abs/1306.3284.
E. Cohen. Multi-Objective Weighted Sampling. In HotWeb. IEEE, 2015. full version: URL: https://arxiv.org/abs/1509.07445.
E. Cohen, S. Chechik, and H. Kaplan. Clustering Small Samples with Quality Guarantees: Adaptivity with One2all pps. In AAAI, 2018. URL: http://arxiv.org/abs/1706.03607.
E. Cohen, D. Delling, T. Pajor, and R. F. Werneck. Sketch-based Influence Maximization and Computation: Scaling up with Guarantees. In CIKM. ACM, 2014. URL: http://arxiv.org/abs/1408.6282.
E. Cohen, D. Delling, T. Pajor, and R. F. Werneck. Distance-Based Influence in Networks: Computation and Maximization. Technical Report cs.SI/1410.6976, arXiv, 2015. URL: http://arxiv.org/abs/1410.06976.
E. Cohen, N. Grossuag, and H. Kaplan. Processing Top-k Queries from Samples. In Proceedings of the 2006 ACM conference on Emerging network experiment and technology (CoNext). ACM, 2006.
P. Domingos and M. Richardson. Mining the network value of customers. In KDD. ACM, 2001.
N. Du, Y. Liang, M-F. Balcan, and L. Song. Influence Function Learning in Information Diffusion Networks. In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML'14, 2014.
N. Du, L. Song, M. Gomez-Rodriguez, and H. Zha. Scalable Influence Estimation in Continuous-Time Diffusion Networks. In NIPS. NIPS, 2013.
D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, New York, NY, USA, 2010.
U. Feige. A threshold of ln n for approximating set cover. J. Assoc. Comput. Mach., 45:634-652, 1998.
M. Gomez-Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the Temporal Dynamics of Diffusion Networks. In ICML, 2011.
M. Gomez-Rodriguez, J. Leskovec, and A. Krause. Inferring Networks of Diffusion and Influence. In KDD, 2010.
A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. Learning influence probabilities in social networks. In WSDM, 2010.
A. Goyal, W. Lu, and L.V.S. Lakshmanan. CELF++: Optimizing the Greedy Algorithm for Influence Maximization in Social Networks. In WWW. ACM, 2011.
M. Granovetter. Threshold Models of Collective Behavior. American Journal of Sociology, 83(6), 1978.
Xinran He and David Kempe. Robust Influence Maximization. In Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16. ACM, 2016. URL: https://doi.org/10.1145/2939672.2939760.
K. Huang, S. Wang, G. S. Bevilacqua, X. Xiao, and L. K. S. Lakshmanan. Revisiting the Stop-and-Stare Algorithms for Influence Maximization. PVLDB, 10, 2017.
M. O. Jackson. Social and economic networks. Princeton University Press, 2010.
K. Jung, W. Heo, and W. Chen. IRIE: Scalable and robust influence maximization in social networks. In ICDM. ACM, 2012.
D. Kempe, J. M. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD. ACM, 2003.
J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and Glance N. Cost-effective outbreak detection in networks. In KDD. ACM, 2007.
B. Liu, G. Cong, X. Dong, and Y. Zeng. Time Constrained Influence Maximization in Social Networks. In ICDM, 2012.
B. Mirzasoleiman, A. Karbasi, R. Sarkar, and A. Krause. Distributed Submodular Maximization: Identifying Representative Elements in Massive Data. In NIPS, 2013.
Elchanan Mossel and Sébastien Roch. On the submodularity of influence in social networks. In STOC, 2007.
Elchanan Mossel and Sebastien Roch. Submodularity of Infuence in Social Networks: From Local to Global. https://arxiv.org/pdf/math/0612046.pdf, July 2009. (Accessed on 08 2019).
G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations of maximizing submodular set functions. Mathematical Programming, 14, 1978.
H. T. Nguyen, M. T. Thai, and T. N. Dinh. Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-scale Networks. In SIGMOD. ACM, 2016.
H. T. Nguyen, M. T. Thai, and T. N. Dinh. A Billion-Scale Approximation Algorithm for Maximizing Benefit in Viral Marketing. IEEE/ACM Trans. Netw., 25(4), 2017.
M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In KDD. ACM, 2002.
K. Saito, R. Nakano, and M. Kimura. Prediction of Information Diffusion Probabilities for Independent Cascade Model. In Knowledge-Based Intelligent Information and Engineering Systems (KES), volume 5179 of Lecture Notes in Computer Science. Springer, 2008.
Y. Tang, Y. Shi, and X. Xiao. Influence Maximization in Near-Linear Time: A Martingale Approach. In SIGMOD, 2015.
Y. Tang, X. Xiao, and Y. Shi. Influence Maximization: Near-Optimal Time Complexity Meets Practical Efficiency. In SIGMOD, 2014.
J. Travers and S. Milgram. An Experimental Study of the Small World Problem. Sociometry, 32:425-443, December 1969. URL: https://doi.org/10.2307/2786545.

Sample Complexity Bounds for Influence Maximization

Authors Gal Sadeh, Edith Cohen , Haim Kaplan

File

Document Identifiers

Author Details

Cite As Get BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Sample Complexity Bounds for Influence Maximization

Authors Gal Sadeh, Edith Cohen , Haim Kaplan

File

Document Identifiers

Author Details

Funding

Cite As Get BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message