Efficient Summing over Sliding Windows
This paper considers the problem of maintaining statistical aggregates over the last W elements of a data stream. First, the problem of counting the number of 1's in the last W bits of a binary stream is considered. A lower bound of Omega(1/epsilon + log(W)) memory bits for (W*epsilon)-additive approximations is derived. This is followed by an algorithm whose memory consumption is O(1/epsilon + log(W)) bits, indicating that the algorithm is optimal and that the bound is tight. Next, the more general problem of maintaining a sum of the last W integers, each in the range {0, 1, ..., R}, is addressed. The paper shows that approximating the sum within an additive error of R*W*epsilon can also be done using Theta(1/epsilon + log(W)) bits for epsilon = Omega(1/W). For epsilon = o(1/W), we present a succinct algorithm which uses B(1 + o(1)) bits, where B = Theta(W*log(1/(W*epsilon))) is the derived lower bound. We show that all lower bounds generalize to randomized algorithms as well. All algorithms process new elements and answer queries in O(1) worst-case time.
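To make the problem statement concrete, the following is a minimal sketch of a simple block-based baseline for additive-error window sums. It is not the algorithm of the paper: it stores one counter per block of about W*epsilon elements, so it does not reach the O(1/epsilon + log(W)) bit bound, and its query takes O(1/epsilon) time rather than O(1). The class name BlockWindowSum and its parameters are hypothetical, chosen for illustration only.

```python
from collections import deque


class BlockWindowSum:
    """Illustrative baseline (an assumption, not the paper's algorithm).

    Approximates the sum of the last `window` non-negative values by
    grouping the stream into fixed-size blocks and summing whole blocks.
    """

    def __init__(self, window, epsilon):
        if window <= 0 or not (0 < epsilon <= 1):
            raise ValueError("need window > 0 and 0 < epsilon <= 1")
        self.window = window
        # Block size of roughly W*epsilon elements (at least 1).
        self.block = max(1, int(window * epsilon))
        # Enough completed blocks to cover the whole window.
        max_blocks = -(-window // self.block)  # ceiling division
        self.blocks = deque(maxlen=max_blocks)
        self.current = 0  # sum of the block being filled
        self.fill = 0     # number of elements in the block being filled

    def add(self, value):
        """Process one new stream element."""
        self.current += value
        self.fill += 1
        if self.fill == self.block:
            self.blocks.append(self.current)  # oldest block drops off
            self.current = 0
            self.fill = 0

    def query(self):
        """Return an overestimate of the window sum.

        The estimate covers between W and W + block - 1 of the most
        recent elements, so for values in {0, ..., R} it exceeds the
        true sum by at most R * (block - 1).
        """
        remaining = self.window - self.fill
        needed = -(-remaining // self.block)  # blocks overlapping the window
        return self.current + sum(list(self.blocks)[-needed:])


if __name__ == "__main__":
    counter = BlockWindowSum(window=100, epsilon=0.1)
    for i in range(1000):
        counter.add(i % 2)  # alternating bit stream
    print(counter.query())
```

For a binary stream the overestimate is at most block - 1, which is below W*epsilon whenever W*epsilon is at least 1; when W*epsilon is below 1 the block size is 1 and the sketch is exact.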
Streaming
Statistics
Lower Bounds
11:1-11:14
Regular Paper
Ran Ben Basat
Gil Einziger
Roy Friedman
Yaron Kassner
10.4230/LIPIcs.SWAT.2016.11
Michael H. Albert, Alexander Golynski, Angèle M. Hamel, Alejandro López-Ortiz, S. Srinivasa Rao, and Mohammad Ali Safari. Longest increasing subsequences in sliding windows. Theoretical Computer Science, 321(2):405-414, 2004.
Arvind Arasu and Gurmeet Singh Manku. Approximate counts and quantiles over sliding windows. In Proc. of the 23rd ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, PODS 2004. Association for Computing Machinery, Inc., June 2004.
Brian Babcock, Mayur Datar, Rajeev Motwani, and Liadan O'Callaghan. Maintaining variance and k-medians over data stream windows. In Frank Neven, Catriel Beeri, and Tova Milo, editors, Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 9-12, 2003, San Diego, CA, USA, pages 234-243. ACM, 2003.
Ran Ben Basat, Gil Einziger, Roy Friedman, and Yaron Kassner. Efficient summing over sliding windows. CoRR, abs/1604.02450, 2016. URL: http://arxiv.org/abs/1604.02450.
Ran Ben Basat, Gil Einziger, Roy Friedman, and Yaron Kassner. Heavy hitters in streams and sliding windows. In INFOCOM, 2016 Proceedings IEEE, pages 307-315, April 2016.
Vladimir Braverman, Ran Gelles, and Rafail Ostrovsky. How to catch l2-heavy-hitters on sliding windows. Theoretical Computer Science, 554:82-94, 2014.
Vladimir Braverman and Rafail Ostrovsky. Smooth histograms for sliding windows. In Foundations of Computer Science, 2007. FOCS'07. 48th Annual IEEE Symposium on, pages 283-293. IEEE, 2007.
Edith Cohen and Martin J. Strauss. Maintaining time-decaying stream aggregates. J. Algorithms, 59(1):19-36, 2006.
Graham Cormode and Ke Yi. Tracking distributed aggregates over time-based sliding windows. In Scientific and Statistical Database Management, pages 416-430. Springer, 2012.
Michael Crouch and Daniel S. Stubbs. Improved streaming algorithms for weighted matching, via unweighted matching. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2014, September 4-6, 2014, Barcelona, Spain, pages 96-104, 2014. URL: http://dx.doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.96.
Michael S Crouch, Andrew McGregor, and Daniel Stubbs. Dynamic graphs in the sliding-window model. In Algorithms-ESA 2013, pages 337-348. Springer, 2013.
Mayur Datar, Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Maintaining stream statistics over sliding windows. SIAM J. Comput., 31(6):1794-1813, 2002.
Phillip B. Gibbons and Srikanta Tirthapura. Distributed streams algorithms for sliding windows. In SPAA, pages 63-72, 2002.
Regant Y.S. Hung and H. F. Ting. Finding heavy hitters over the sliding window of a weighted data stream. In E. Laber, C. Bornstein, L. Nogueira, and L. Faria, editors, LATIN 2008: Theoretical Informatics, volume 4957 of LNCS, pages 699-710. Springer, 2008. URL: http://dx.doi.org/10.1007/978-3-540-78773-0_60.
Lap-Kei Lee and H. F. Ting. Maintaining significant stream statistics over sliding windows. In Proceedings of the Seventeenth Annual Symposium on Discrete Algorithms, SODA, pages 724-732. ACM Press, 2006.
Lap-Kei Lee and H. F. Ting. A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In Proc. of the SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 290-297. ACM, 2006.
Yang Liu, Wenji Chen, and Yong Guan. Near-optimal approximate membership query over time-decaying windows. In INFOCOM, Proceedings IEEE, pages 1447-1455, April 2013.
Kyriakos Mouratidis, Spiridon Bakiras, and Dimitris Papadias. Continuous monitoring of top-k queries over sliding windows. In Proc. of the International Conference on Management of Data, SIGMOD, pages 635-646, New York, NY, USA, 2006. ACM.
Moni Naor and Eylon Yogev. Sliding bloom filters. In Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam, editors, Algorithms and Computation, volume 8283 of Lecture Notes in Computer Science, pages 513-523. Springer Berlin Heidelberg, 2013. URL: http://dx.doi.org/10.1007/978-3-642-45030-3_48.
Krešimir Pripužić, Ivana Podnar Žarko, and Karl Aberer. Time- and space-efficient sliding window top-k query processing. ACM Trans. Database Syst., 40(1):1:1-1:44, March 2015.
Hong Shen and Yu Zhang. Improved approximate detection of duplicates for data streams over sliding windows. Journal of Computer Science and Technology, 23(6):973-987, 2008.
Zhitao Shen, M.A. Cheema, Xuemin Lin, Wenjie Zhang, and Haixun Wang. Efficiently monitoring top-k pairs over sliding windows. In Data Engineering (ICDE), 2012 IEEE 28th International Conference on, pages 798-809, April 2012.
Andrew Chi-Chin Yao. Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symp. on Foundations of Computer Science, pages 222-227. IEEE, 1977.
Wenjie Zhang, Ying Zhang, Muhammad Aamir Cheema, and Xuemin Lin. Counting distinct objects over sliding windows. In Proceedings of the Twenty-First Australasian Conference on Database Technologies - Volume 104, ADC'10, pages 75-84, Darlinghurst, Australia, 2010. Australian Computer Society, Inc.
Creative Commons Attribution 3.0 Unported license
https://creativecommons.org/licenses/by/3.0/legalcode