Efficient Summing over Sliding Windows

Authors Ran Ben Basat, Gil Einziger, Roy Friedman, Yaron Kassner



PDF
Thumbnail PDF

File

LIPIcs.SWAT.2016.11.pdf
  • Filesize: 0.55 MB
  • 14 pages

Document Identifiers

Author Details

Ran Ben Basat
Gil Einziger
Roy Friedman
Yaron Kassner

Cite AsGet BibTex

Ran Ben Basat, Gil Einziger, Roy Friedman, and Yaron Kassner. Efficient Summing over Sliding Windows. In 15th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 53, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.SWAT.2016.11

Abstract

This paper considers the problem of maintaining statistic aggregates over the last W elements of a data stream. First, the problem of counting the number of 1's in the last W bits of a binary stream is considered. A lower bound of Omega(1/epsilon + log(W)) memory bits for Wepsilon-additive approximations is derived. This is followed by an algorithm whose memory consumption is O(1/epsilon + log(W)) bits, indicating that the algorithm is optimal and that the bound is tight. Next, the more general problem of maintaining a sum of the last W integers, each in the range of {0, 1, ..., R}, is addressed. The paper shows that approximating the sum within an additive error of RW epsilon can also be done using Theta(1/epsilon + log(W)) bits for epsilon = Omega(1/W). For epsilon = o(1/W), we present a succinct algorithm which uses B(1 + o(1)) bits, where B = Theta(W*log(1/(W*epsilon))) is the derived lower bound. We show that all lower bounds generalize to randomized algorithms as well. All algorithms process new elements and answer queries in O(1) worst-case time.
Keywords
  • Streaming
  • Statistics
  • Lower Bounds

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Michael H Albert, Alexander Golynski, Angèle M Hamel, Alejandro López-Ortiz, S Srinivasa Rao, and Mohammad Ali Safari. Longest increasing subsequences in sliding windows. Theoretical Computer Science, 321(2):405-414, 2004. Google Scholar
  2. Arvind Arasu and Gurmeet Singh Manku. Approximate counts and quantiles over sliding windows. In Proc. of the 23rd ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, PODS 2004. Association for Computing Machinery, Inc., June 2004. Google Scholar
  3. Brian Babcock, Mayur Datar, Rajeev Motwani, and Liadan O'Callaghan. Maintaining variance and k-medians over data stream windows. In Frank Neven, Catriel Beeri, and Tova Milo, editors, Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 9-12, 2003, San Diego, CA, USA, pages 234-243. ACM, 2003. Google Scholar
  4. Ran Ben Basat, Gil Einziger, Roy Friedman, and Yaron Kassner. Efficient summing over sliding windows. CoRR, abs/1604.02450, 2016. URL: http://arxiv.org/abs/1604.02450.
  5. Ran Ben Basat, Gil Einziger, Roy Friedman, and Yaron Kassner. Heavy hitters in streams and sliding windows. In INFOCOM, 2016 Proceedings IEEE, pages 307-315, April 2016. Google Scholar
  6. Vladimir Braverman, Ran Gelles, and Rafail Ostrovsky. How to catch l2-heavy-hitters on sliding windows. Theoretical Computer Science, 554:82-94, 2014. Google Scholar
  7. Vladimir Braverman and Rafail Ostrovsky. Smooth histograms for sliding windows. In Foundations of Computer Science, 2007. FOCS'07. 48th Annual IEEE Symposium on, pages 283-293. IEEE, 2007. Google Scholar
  8. Edith Cohen and Martin J. Strauss. Maintaining time-decaying stream aggregates. J. Algorithms, 59(1):19-36, 2006. Google Scholar
  9. Graham Cormode and Ke Yi. Tracking distributed aggregates over time-based sliding windows. In Scientific and Statistical Database Management, pages 416-430. Springer, 2012. Google Scholar
  10. Michael Crouch and Daniel S. Stubbs. Improved streaming algorithms for weighted matching, via unweighted matching. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2014, September 4-6, 2014, Barcelona, Spain, pages 96-104, 2014. URL: http://dx.doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.96.
  11. Michael S Crouch, Andrew McGregor, and Daniel Stubbs. Dynamic graphs in the sliding-window model. In Algorithms-ESA 2013, pages 337-348. Springer, 2013. Google Scholar
  12. Mayur Datar, Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Maintaining stream statistics over sliding windows. SIAM J. Comput., 31(6):1794-1813, 2002. Google Scholar
  13. Phillip B. Gibbons and Srikanta Tirthapura. Distributed streams algorithms for sliding windows. In SPAA, pages 63-72, 2002. Google Scholar
  14. Regant Y.S. Hung and H. F. Ting. Finding heavy hitters over the sliding window of a weighted data stream. In E. Laber, C. Bornstein, L. Nogueira, and L. Faria, editors, LATIN 2008: Theoretical Informatics, volume 4957 of LNCS, pages 699-710. Springer, 2008. URL: http://dx.doi.org/10.1007/978-3-540-78773-0_60.
  15. Lap-Kei Lee and H. F. Ting. Maintaining significant stream statistics over sliding windows. In Proceedings of the Seventeenth Annual Symposium on Discrete Algorithms, SODA, pages 724-732. ACM Press, 2006. Google Scholar
  16. Lap-Kei Lee and HF Ting. A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In Proc. of the SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 290-297. ACM, 2006. Google Scholar
  17. Yang Liu, Wenji Chen, and Yong Guan. Near-optimal approximate membership query over time-decaying windows. In INFOCOM, Proceedings IEEE, pages 1447-1455, April 2013. Google Scholar
  18. Kyriakos Mouratidis, Spiridon Bakiras, and Dimitris Papadias. Continuous monitoring of top-k queries over sliding windows. In Proc. of the International Conference on Management of Data, SIGMOD, pages 635-646, New York, NY, USA, 2006. ACM. Google Scholar
  19. Moni Naor and Eylon Yogev. Sliding bloom filters. In Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam, editors, Algorithms and Computation, volume 8283 of Lecture Notes in Computer Science, pages 513-523. Springer Berlin Heidelberg, 2013. URL: http://dx.doi.org/10.1007/978-3-642-45030-3_48.
  20. Krešimir Pripužić, Ivana Podnar Žarko, and Karl Aberer. Time- and space-efficient sliding window top-k query processing. ACM Trans. Database Syst., 40(1):1:1-1:44, March 2015. Google Scholar
  21. Hong Shen and Yu Zhang. Improved approximate detection of duplicates for data streams over sliding windows. Journal of Computer Science and Technology, 23(6):973-987, 2008. Google Scholar
  22. Zhitao Shen, M.A. Cheema, Xuemin Lin, Wenjie Zhang, and Haixun Wang. Efficiently monitoring top-k pairs over sliding windows. In Data Engineering (ICDE), 2012 IEEE 28th International Conference on, pages 798-809, April 2012. Google Scholar
  23. Andrew Chi-Chin Yao. Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symp. on Foundations of Computer Science, pages 222-227. IEEE, 1977. Google Scholar
  24. Wenjie Zhang, Ying Zhang, Muhammad Aamir Cheema, and Xuemin Lin. Counting distinct objects over sliding windows. In Proceedings of the Twenty-First Australasian Conference on Database Technologies - Volume 104, ADC'10, pages 75-84, Darlinghurst, Australia, Australia, 2010. Australian Computer Society, Inc. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail