Simulating Random Walks on Graphs in the Streaming Model

Author Ce Jin



Author Details

Ce Jin
  • Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China

Cite As

Ce Jin. Simulating Random Walks on Graphs in the Streaming Model. In 10th Innovations in Theoretical Computer Science Conference (ITCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 124, pp. 46:1-46:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ITCS.2019.46

Abstract

We study the problem of approximately simulating a $t$-step random walk on a graph where the input edges come from a single-pass stream. The straightforward algorithm using reservoir sampling needs $O(nt)$ words of memory. We show that this space complexity is near-optimal for directed graphs. For undirected graphs, we prove an $\Omega(n\sqrt{t})$-bit space lower bound, and give a near-optimal algorithm using $O(n\sqrt{t})$ words of space with $2^{-\Omega(\sqrt{t})}$ simulation error (defined as the $\ell_1$-distance between the output distribution of the simulation algorithm and the distribution of perfect random walks). We also discuss extending the algorithms to the turnstile model, where both insertion and deletion of edges can appear in the input stream.
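
The abstract does not spell out the straightforward reservoir-sampling algorithm. One natural reading, sketched here under that assumption, is to keep $t$ independent single-item reservoirs of out-neighbors for each vertex during the pass (about $nt$ words in total) and then to consume a fresh sample each time the walk revisits a vertex. The function names `sample_out_neighbors` and `simulate_walk` and the edge-stream format (an iterable of `(u, v)` pairs) are illustrative choices, not taken from the paper.

```python
import random
from collections import defaultdict


def sample_out_neighbors(edge_stream, t, rng):
    """Single pass over directed edges (u, v).

    For every vertex u, keep t independent size-1 reservoirs, so that each
    slot ends up holding a uniformly random out-neighbor of u.
    Assumed layout: about n * t stored words overall.
    """
    out_degree = defaultdict(int)
    reservoirs = {}
    for u, v in edge_stream:
        out_degree[u] += 1
        slots = reservoirs.setdefault(u, [None] * t)
        for i in range(t):
            # Standard reservoir rule: the k-th edge seen at u replaces
            # the stored sample with probability 1/k.
            if rng.randrange(out_degree[u]) == 0:
                slots[i] = v
    return reservoirs


def simulate_walk(reservoirs, start, t):
    """Replay a walk of up to t steps from the stored samples.

    The k-th visit to a vertex uses that vertex's k-th slot, so no sample
    is reused and the steps stay independent.
    """
    walk = [start]
    visits = defaultdict(int)
    current = start
    for _ in range(t):
        slots = reservoirs.get(current)
        if slots is None:
            break  # current vertex has no outgoing edge in the stream
        nxt = slots[visits[current]]
        visits[current] += 1
        walk.append(nxt)
        current = nxt
    return walk


if __name__ == "__main__":
    # On a directed 3-cycle every vertex has one out-neighbor,
    # so the walk is deterministic.
    cycle = [(0, 1), (1, 2), (2, 0)]
    samples = sample_out_neighbors(cycle, t=5, rng=random.Random(0))
    print(simulate_walk(samples, start=0, t=5))  # [0, 1, 2, 0, 1, 2]
```

For an undirected graph, this sketch would be fed each edge in both directions. It illustrates only the $O(nt)$-word baseline, not the $O(n\sqrt{t})$-word algorithm described above.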

Subject Classification

ACM Subject Classification
  • Theory of computation → Streaming models
Keywords
  • streaming models
  • random walks
  • sampling

