Tracking the l_2 Norm with Constant Update Time

Authors Chi-Ning Chou, Zhixian Lei, Preetum Nakkiran



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2019.2.pdf
  • Filesize: 0.59 MB
  • 15 pages

Document Identifiers

Author Details

Chi-Ning Chou
  • School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA
Zhixian Lei
  • School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA
Preetum Nakkiran
  • School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA

Acknowledgements

The authors wish to thank Jelani Nelson for invaluable advice throughout the course of this research. We also thank Mitali Bafna and Jarosław Błasiok for useful discussion and thank Boaz Barak for many helpful comments on an earlier draft of this article. We are also grateful to reviewers' comments.

Cite AsGet BibTex

Chi-Ning Chou, Zhixian Lei, and Preetum Nakkiran. Tracking the l_2 Norm with Constant Update Time. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 2:1-2:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.2

Abstract

The l_2 tracking problem is the task of obtaining a streaming algorithm that, given access to a stream of items a_1,a_2,a_3,... from a universe [n], outputs at each time t an estimate to the l_2 norm of the frequency vector f^{(t)}in R^n (where f^{(t)}_i is the number of occurrences of item i in the stream up to time t). The previous work [Braverman-Chestnut-Ivkin-Nelson-Wang-Woodruff, PODS 2017] gave a streaming algorithm with (the optimal) space using O(epsilon^{-2}log(1/delta)) words and O(epsilon^{-2}log(1/delta)) update time to obtain an epsilon-accurate estimate with probability at least 1-delta. We give the first algorithm that achieves update time of O(log 1/delta) which is independent of the accuracy parameter epsilon, together with the nearly optimal space using O(epsilon^{-2}log(1/delta)) words. Our algorithm is obtained using the Count Sketch of [Charilkar-Chen-Farach-Colton, ICALP 2002].

Subject Classification

ACM Subject Classification
  • Theory of computation → Sketching and sampling
Keywords
  • Streaming algorithms
  • Sketching algorithms
  • Tracking
  • CountSketch

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages 20-29. ACM, 1996. Google Scholar
  2. Andrew C Berry. The accuracy of the Gaussian approximation to the sum of independent variates. Transactions of the american mathematical society, 49(1):122-136, 1941. Google Scholar
  3. Jaroslaw Blasiok, Jian Ding, and Jelani Nelson. Continuous Monitoring of l_p Norms in Data Streams. In LIPIcs-Leibniz International Proceedings in Informatics, volume 81. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017. Google Scholar
  4. Vladimir Braverman, Stephen R Chestnut, Nikita Ivkin, Jelani Nelson, Zhengyu Wang, and David P Woodruff. BPTree: An 𝓁₂ heavy hitters algorithm using constant memory. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 361-376. ACM, 2017. Google Scholar
  5. Vladimir Braverman, Stephen R Chestnut, Nikita Ivkin, and David P Woodruff. Beating CountSketch for heavy hitters in insertion streams. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, pages 740-753. ACM, 2016. Google Scholar
  6. Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In International Colloquium on Automata, Languages, and Programming, pages 693-703. Springer, 2002. Google Scholar
  7. Graham Cormode and Shan Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58-75, 2005. Google Scholar
  8. Anirban Dasgupta, Ravi Kumar, and Tamás Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 341-350. ACM, 2010. Google Scholar
  9. Carl-Gustaf Esseen. On the Liapounoff limit of error in the theory of probability. Almqvist & Wiksell Stockholm, 1942. Google Scholar
  10. David Lee Hanson and Farroll Tim Wright. A bound on tail probabilities for quadratic forms in independent random variables. The Annals of Mathematical Statistics, 42(3):1079-1083, 1971. Google Scholar
  11. Zengfeng Huang, Wai Ming Tai, and Ke Yi. Tracking the Frequency Moments at All Times. arXiv preprint, 2014. URL: http://arxiv.org/abs/1412.1763.
  12. Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. Journal of the ACM (JACM), 53(3):307-323, 2006. Google Scholar
  13. Tadeusz Inglot and Teresa Ledwina. Asymptotic optimality of new adaptive test in regression model. Annales de l'Institut Henri Poincare (B) Probability and Statistics, 42(5):579-590, 2006. Google Scholar
  14. T. S. Jayram and David P. Woodruff. Optimal Bounds for Johnson-Lindenstrauss Transforms and Streaming Problems with Subconstant Error. ACM Trans. Algorithms, 9(3):26:1-26:17, June 2013. URL: https://doi.org/10.1145/2483699.2483706.
  15. Daniel Kane, Raghu Meka, and Jelani Nelson. Almost optimal explicit Johnson-Lindenstrauss families. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 628-639. Springer, 2011. Google Scholar
  16. Daniel M Kane and Jelani Nelson. Sparser Johnson-Lindenstrauss transforms. Journal of the ACM (JACM), 61(1):4, 2014. Google Scholar
  17. Daniel M Kane, Jelani Nelson, Ely Porat, and David P Woodruff. Fast moment estimation in data streams in optimal space. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 745-754. ACM, 2011. Google Scholar
  18. Daniel M Kane, Jelani Nelson, Ely Porat, and David P Woodruff. Fast moment estimation in data streams in optimal space. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 745-754. ACM, 2011. Google Scholar
  19. Daniel M Kane, Jelani Nelson, and David P Woodruff. An optimal algorithm for the distinct elements problem. In Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 41-52. ACM, 2010. Google Scholar
  20. Daniel M Kane, Jelani Nelson, and David P Woodruff. On the exact space complexity of sketching and streaming small norms. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 1161-1178. Society for Industrial and Applied Mathematics, 2010. Google Scholar
  21. Balachander Krishnamurthy, Subhabrata Sen, Yin Zhang, and Yan Chen. Sketch-based change detection: methods, evaluation, and applications. In Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement, pages 234-247. ACM, 2003. Google Scholar
  22. Kasper Green Larsen, Jelani Nelson, and Huy L Nguyên. Time lower bounds for nonadaptive turnstile streaming algorithms. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 803-812. ACM, 2015. Google Scholar
  23. Mikkel Thorup and Yin Zhang. Tabulation based 4-universal hashing with applications to second moment estimation. In SODA, volume 4, pages 615-624, 2004. Google Scholar
  24. Mikkel Thorup and Yin Zhang. Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation. SIAM Journal on Computing, 41(2):293-331, 2012. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail