Zero-One Laws for Sliding Windows and Universal Sketches

Authors Vladimir Braverman, Rafail Ostrovsky, Alan Roytman



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2015.573.pdf
  • Filesize: 0.55 MB
  • 18 pages

Document Identifiers

Author Details

Vladimir Braverman
Rafail Ostrovsky
Alan Roytman

Cite As Get BibTex

Vladimir Braverman, Rafail Ostrovsky, and Alan Roytman. Zero-One Laws for Sliding Windows and Universal Sketches. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 40, pp. 573-590, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015) https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2015.573

Abstract

Given a stream of data, a typical approach in streaming algorithms is to design a sophisticated algorithm with small memory that computes a specific statistic over the streaming data. Usually, if one wants to compute a different statistic after the stream is gone, it is impossible. But what if we want to compute a different statistic after the fact?  In this paper, we consider the following fascinating possibility: can we collect some small amount of specific data during the stream that is "universal," i.e., where we do not know anything about the statistics we will want to later compute, other than the guarantee that had we known the statistic ahead of time, it would have been possible to do so with small memory? This is indeed what we introduce (and show) in this paper with matching upper and lower bounds: we show that it is possible to collect universal statistics of polylogarithmic size, and prove that these universal statistics allow us after the fact to compute all other statistics that are computable with similar amounts of memory. We show that this is indeed possible, both for the standard unbounded streaming model and the sliding window streaming model.

Subject Classification

Keywords
  • Streaming Algorithms
  • Universality
  • Sliding Windows

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. List of open problems in sublinear algorithms: Problem 20. URL: http://sublinear.info/20.
  2. List of open problems in sublinear algorithms: Problem 30. URL: http://sublinear.info/30.
  3. N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. In STOC, 1996. Google Scholar
  4. Z. Bar-Yossef, T. S. Jayram, R. Kumar, and D. Sivakumar. An information statistics approach to data stream and communication complexity. In FOCS, 2002. Google Scholar
  5. Z. Bar-Yossef, T. S. Jayram, R. Kumar, D. Sivakumar, and L. Trevisan. Counting distinct elements in a data stream. In RANDOM. Springer, 2002. Google Scholar
  6. L. Bhuvanagiri, S. Ganguly, D. Kesh, and C. Saha. Simpler algorithm for estimating frequency moments of data streams. In SODA, 2006. Google Scholar
  7. V. Braverman, R. Gelles, and R. Ostrovsky. How to catch l₂-heavy-hitters on sliding windows. In COCOON. Springer, 2013. Google Scholar
  8. V. Braverman and R. Ostrovsky. Smooth histograms for sliding windows. In FOCS, 2007. Google Scholar
  9. V. Braverman and R. Ostrovsky. Zero-one frequency laws. In STOC, 2010. Google Scholar
  10. V. Braverman and R. Ostrovsky. Generalizing the layering method of Indyk and Woodruff: Recursive sketches for frequency-based vectors on streams. In APPROX-RANDOM. Springer, 2013. Google Scholar
  11. A. Chakrabarti, S. Khot, and X. Sun. Near-optimal lower bounds on the multi-party communication complexity of set disjointness. In CCC, 2003. Google Scholar
  12. D. Coppersmith and R. Kumar. An improved data stream algorithm for frequency moments. In SODA, 2004. Google Scholar
  13. G. Cormode, M. Datar, P. Indyk, and S. Muthukrishnan. Comparing data streams using hamming norms (how to zero in). IEEE TKDE, 15(3):529-540, 2003. Google Scholar
  14. G. Cormode and S. Muthukrishnan. What’s hot and what’s not: tracking most frequent items dynamically. ACM TODS, 30(1):249-278, 2005. Google Scholar
  15. M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. In SODA, 2002. Google Scholar
  16. J. Feigenbaum, S. Kannan, M. Strauss, and M. Viswanathan. An approximate l₁-difference algorithm for massive data streams. In FOCS, 1999. Google Scholar
  17. P. Flajolet and G. Nigel Martin. Probabilistic counting algorithms for data base applications. JCSS, 31(2):182-209, 1985. Google Scholar
  18. S. Ganguly and G. Cormode. On estimating frequency moments of data streams. In APPROX-RANDOM. Springer, 2007. Google Scholar
  19. P. B. Gibbons and S. Tirthapura. Distributed streams algorithms for sliding windows. In SPAA, 2002. Google Scholar
  20. L. Golab, D. DeHaan, E. D. Demaine, A. Lopez-Ortiz, and J. I. Munro. Identifying frequent items in sliding windows over on-line packet streams. In IMC, 2003. Google Scholar
  21. R. Hung, L. K. Lee, and H. F. Ting. Finding frequent items over sliding windows with constant update time. IPL, 110(7):257-260, 2010. Google Scholar
  22. P. Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. JACM, 53(3):307-323, 2006. Google Scholar
  23. P. Indyk and D. Woodruff. Tight lower bounds for the distinct elements problem. In FOCS, 2003. Google Scholar
  24. P. Indyk and D. Woodruff. Optimal approximations of the frequency moments of data streams. In STOC, 2005. Google Scholar
  25. C. Jin, W. Qian, C. Sha, J. X. Yu, and A. Zhou. Dynamically maintaining frequent items over a data stream. In CIKM, 2003. Google Scholar
  26. D. M. Kane, J. Nelson, and D. Woodruff. On the exact space complexity of sketching and streaming small norms. In SODA, 2010. Google Scholar
  27. D. M. Kane, J. Nelson, and D. Woodruff. An optimal algorithm for the distinct elements problem. In PODS, 2010. Google Scholar
  28. Y. Li, H. Nguy\sbox0ê\sbox2~\ooalign\hidewidthåise\dimexpr\ht0-\ht2+.3ex\box2 \hidewidthŗêŗn, and D. Woodruff. Turnstile streaming algorithms might as well be linear sketches. In STOC, 2014. Google Scholar
  29. R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995. Google Scholar
  30. G. Nie and Z. Lu. Approximate frequency counts in sliding window over data stream. In CCECE, 2005. Google Scholar
  31. D. Woodruff. Optimal space lower bounds for all frequency moments. In SODA, 2004. Google Scholar
  32. L. Zhang and Y. Guan. Frequency estimation over sliding windows. In ICDE, 2008. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail