Pseudo-Deterministic Streaming

Authors Shafi Goldwasser, Ofer Grossman, Sidhanth Mohanty, David P. Woodruff



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2020.79.pdf
  • Filesize: 0.62 MB
  • 25 pages

Document Identifiers

Author Details

Shafi Goldwasser
  • Simons Institute for the Theory of Computing, Berkeley, CA, USA
  • MIT, Cambridge, MA, USA
Ofer Grossman
  • MIT, Cambridge, MA, USA
Sidhanth Mohanty
  • University of California Berkeley, CA, USA
David P. Woodruff
  • Carnegie-Mellon University, Pittsburgh, PA, USA

Cite AsGet BibTex

Shafi Goldwasser, Ofer Grossman, Sidhanth Mohanty, and David P. Woodruff. Pseudo-Deterministic Streaming. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 79:1-79:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ITCS.2020.79

Abstract

A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy hitters, approximate counting, ?_2 approximation, finding a nonzero entry in a vector (for turnstile algorithms) are not pseudo-deterministic. For example, in the instance of finding a nonzero entry in a vector, for any known low-space algorithm A, there exists a stream x so that running A twice on x (using different randomness) would with high probability result in two different entries as the output. In this work, we study whether it is inherent that these algorithms output different values on different executions. That is, we ask whether these problems have low-memory pseudo-deterministic algorithms. For instance, we show that there is no low-memory pseudo-deterministic algorithm for finding a nonzero entry in a vector (given in a turnstile fashion), and also that there is no low-dimensional pseudo-deterministic sketching algorithm for ?_2 norm estimation. We also exhibit problems which do have low memory pseudo-deterministic algorithms but no low memory deterministic algorithm, such as outputting a nonzero row of a matrix, or outputting a basis for the row-span of a matrix. We also investigate multi-pseudo-deterministic algorithms: algorithms which with high probability output one of a few options. We show the first lower bounds for such algorithms. This implies that there are streaming problems such that every low space algorithm for the problem must have inputs where there are many valid outputs, all with a significant probability of being outputted.

Subject Classification

ACM Subject Classification
  • Theory of computation → Streaming, sublinear and near linear time algorithms
Keywords
  • streaming
  • pseudo-deterministic

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Yuqing Ai, Wei Hu, Yi Li, and David P. Woodruff. New Characterizations in Turnstile Streams with Applications. In 31st Conference on Computational Complexity, CCC 2016, May 29 to June 1, 2016, Tokyo, Japan, pages 20:1-20:22, 2016. URL: https://doi.org/10.4230/LIPIcs.CCC.2016.20.
  2. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and system sciences, 58(1):137-147, 1999. Google Scholar
  3. Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak. Streaming algorithms from precision sampling. arXiv preprint, 2010. URL: http://arxiv.org/abs/1011.1263.
  4. Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. Theoretical Computer Science, 312(1):3-15, 2004. Google Scholar
  5. Kenneth L Clarkson and David P Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 205-214. ACM, 2009. Google Scholar
  6. Peter Dixon, A Pavan, and NV Vinodchandran. On Pseudodeterministic Approximation Algorithms. In 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. Google Scholar
  7. Philippe Flajolet. Approximate counting: a detailed analysis. BIT Numerical Mathematics, 25(1):113-134, 1985. Google Scholar
  8. Gereon Frahling, Piotr Indyk, and Christian Sohler. Sampling in dynamic data streams and applications. International Journal of Computational Geometry & Applications, 18(01n02):3-28, 2008. Google Scholar
  9. Eran Gat and Shafi Goldwasser. Probabilistic Search Algorithms with Unique Answers and Their Cryptographic Applications. In Electronic Colloquium on Computational Complexity (ECCC), volume 18, page 136, 2011. Google Scholar
  10. Oded Goldreich. Multi-pseudodeterministic algorithms. In Electronic Colloquium on Computational Complexity (ECCC), 2019. Google Scholar
  11. Oded Goldreich, Shafi Goldwasser, and Dana Ron. On the possibilities and limitations of pseudodeterministic algorithms. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science, pages 127-138. ACM, 2013. Google Scholar
  12. Shafi Goldwasser and Ofer Grossman. Perfect Bipartite Matching in Pseudo-Deterministic RNC. In Electronic Colloquium on Computational Complexity (ECCC), volume 22, page 208, 2015. Google Scholar
  13. Shafi Goldwasser, Ofer Grossman, and Dhiraj Holden. Pseudo-deterministic Proofs. arXiv preprint, 2017. URL: http://arxiv.org/abs/1706.04641.
  14. Parikshit Gopalan and Jaikumar Radhakrishnan. Finding duplicates in a data stream. In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms, pages 402-411. Society for Industrial and Applied Mathematics, 2009. Google Scholar
  15. Ofer Grossman. Finding Primitive Roots Pseudo-Deterministically. In Electronic Colloquium on Computational Complexity (ECCC), volume 22, page 207, 2015. Google Scholar
  16. Ofer Grossman and Yang P Liu. Reproducibility and Pseudo-Determinism in Log-Space. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 606-620. SIAM, 2019. Google Scholar
  17. Moritz Hardt and David P Woodruff. How robust are linear sketches to adaptive inputs? In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 121-130. ACM, 2013. Google Scholar
  18. Dhiraj Holden. A Note on Unconditional Subexponential-time Pseudo-deterministic Algorithms for BPP Search Problems. arXiv preprint, 2017. URL: http://arxiv.org/abs/1707.05808.
  19. Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. Journal of the ACM (JACM), 53(3):307-323, 2006. Google Scholar
  20. Piotr Indyk and David Woodruff. Optimal approximations of the frequency moments of data streams. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 202-208. ACM, 2005. Google Scholar
  21. Rajesh Jayaram and David P Woodruff. Perfect lp sampling in a data stream. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 544-555. IEEE, 2018. Google Scholar
  22. Hossein Jowhari, Mert Sağlam, and Gábor Tardos. Tight bounds for lp samplers, finding duplicates in streams, and related problems. In Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 49-58. ACM, 2011. Google Scholar
  23. Michael Kapralov, Jelani Nelson, Jakub Pachocki, Zhengyu Wang, David P Woodruff, and Mobin Yahyazadeh. Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams. In Foundations of Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pages 475-486. Ieee, 2017. Google Scholar
  24. Yi Li, Huy L Nguyen, and David P Woodruff. Turnstile streaming algorithms might as well be linear sketches. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 174-183. ACM, 2014. Google Scholar
  25. Jayadev Misra and David Gries. Finding repeated elements. Science of computer programming, 2(2):143-152, 1982. Google Scholar
  26. Morteza Monemizadeh and David P Woodruff. 1-pass relative-error L p-sampling with applications. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 1143-1160. Society for Industrial and Applied Mathematics, 2010. Google Scholar
  27. Robert Morris. Counting large numbers of events in small registers. Communications of the ACM, 21(10):840-842, 1978. Google Scholar
  28. Jelani Nelson, Huy L Nguyen, and David P Woodruff. On deterministic sketching and streaming for sparse recovery and norm estimation. Linear Algebra and its Applications, 441:152-167, 2014. Google Scholar
  29. Noam Nisan. Pseudorandom generators for space-bounded computation. Combinatorica, 12(4):449-461, 1992. Google Scholar
  30. Igor C Oliveira and Rahul Santhanam. Pseudodeterministic Constructions in Subexponential Time. arXiv preprint, 2016. URL: http://arxiv.org/abs/1612.01817.
  31. Igor C Oliveira and Rahul Santhanam. Pseudo-Derandomizing Learning and Approximation. In LIPIcs-Leibniz International Proceedings in Informatics, volume 116. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. Google Scholar
  32. Salil P Vadhan et al. Pseudorandomness. Foundations and Trendsregistered in Theoretical Computer Science, 7(1-3):1-336, 2012. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail