Universal Sketches for the Frequency Negative Moments and Other Decreasing Streaming Sums

Authors Vladimir Braverman, Stephen R. Chestnut




File

LIPIcs.APPROX-RANDOM.2015.591.pdf
  • Filesize: 0.52 MB
  • 15 pages


Cite As

Vladimir Braverman and Stephen R. Chestnut. Universal Sketches for the Frequency Negative Moments and Other Decreasing Streaming Sums. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 40, pp. 591-605, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2015.591

Abstract

Given a stream with frequency vector f in n dimensions, we characterize the space necessary for approximating the frequency negative moments F_p, where p < 0, in terms of n, the accuracy ε, and the L_1 length of the vector f. To accomplish this, we prove a much more general result. Given any nonnegative and nonincreasing function g, we characterize the space necessary for any streaming algorithm that outputs a (1 ± ε)-approximation to the sum of the coordinates of the vector f transformed by g. The storage required is expressed in the form of the solution to a relatively simple nonlinear optimization problem, and the algorithm is universal for (1 ± ε)-approximations to any such sum where the applied function is nonnegative, nonincreasing, and has space complexity no larger than that of g. This partially answers an open question of Nelson (IITK Workshop Kanpur, 2009).
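
To make the quantity being approximated concrete, the following minimal sketch computes a decreasing streaming sum exactly. It is only a linear-space baseline, not the sketching algorithm of the paper. It assumes an insertion-only stream and assumes that coordinates with zero frequency contribute nothing to the sum; the function names (`frequency_vector`, `g_sum`, `negative_moment`, `is_eps_approximation`) are hypothetical and chosen for illustration.

```python
from collections import Counter


def frequency_vector(stream):
    """Build the frequency vector f of an insertion-only stream (assumption)."""
    return Counter(stream)


def g_sum(freq, g):
    """Exact sum of g(|f_i|) over coordinates with nonzero frequency.

    Assumption: zero coordinates are skipped, so g need not be defined at 0.
    """
    return sum(g(abs(count)) for count in freq.values() if count != 0)


def negative_moment(freq, p):
    """Exact frequency negative moment F_p for p < 0, as a special case of g_sum."""
    if p >= 0:
        raise ValueError("p must be negative")
    return g_sum(freq, lambda x: x ** p)


def is_eps_approximation(estimate, exact, eps):
    """Check whether estimate lies within a (1 +/- eps) factor of exact."""
    return (1 - eps) * exact <= estimate <= (1 + eps) * exact


stream = ["a", "b", "a", "c", "a", "b"]   # frequencies: a=3, b=2, c=1
freq = frequency_vector(stream)
f_minus_1 = negative_moment(freq, -1)     # 1/3 + 1/2 + 1
```

A streaming algorithm in the sense of the abstract would output, in small space, a value that passes `is_eps_approximation` against this exact sum; the baseline above stores the whole frequency vector and serves only as a reference point.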
Keywords
  • data streams
  • frequency moments
  • negative moments

References

  1. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. In Symposium on the Theory of Computing, pages 20-29, 1996.
  2. Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak. Streaming algorithms via precision sampling. In IEEE Foundations of Computer Science, pages 363-372, 2011.
  3. Alexandr Andoni, Huy L. Nguyễn, Yury Polyanskiy, and Yihong Wu. Tight lower bound for linear sketches of moments. In Automata, languages, and programming, volume 7965 of Lec. Notes in Comput. Sci., pages 25-32. Springer, Heidelberg, 2013.
  4. Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar. An information statistics approach to data stream and communication complexity. J. Comput. System Sci., 68(4):702-732, 2004.
  5. Lakshminath Bhuvanagiri, Sumit Ganguly, Deepanjan Kesh, and Chandan Saha. Simpler algorithm for estimating frequency moments of data streams. In ACM-SIAM Symposium on Discrete Algorithms, pages 708-713, 2006.
  6. Vladimir Braverman, Jonathan Katzman, Charles Seidell, and Gregory Vorsanger. Approximating large frequency moments with O(n^{1-2/k}) bits. arXiv preprint arXiv:1401.1763, 2014.
  7. Vladimir Braverman and Rafail Ostrovsky. Recursive sketching for frequency moments. arXiv preprint arXiv:1011.2571, 2010.
  8. Vladimir Braverman and Rafail Ostrovsky. Zero-one frequency laws. In ACM Symposium on the Theory of Computing, pages 281-290, 2010.
  9. Vladimir Braverman, Rafail Ostrovsky, and Alan Roytman. Universal streaming. arXiv preprint arXiv:1408.2604, 2014.
  10. Peter S. Bullen. Handbook of means and their inequalities. Springer Science & Business Media, 2003.
  11. Amit Chakrabarti, Graham Cormode, and Andrew McGregor. A near-optimal algorithm for computing the entropy of a stream. In ACM-SIAM Symposium on Discrete Algorithms, pages 328-335. Society for Industrial and Applied Mathematics, 2007.
  12. Amit Chakrabarti, Khanh Do Ba, and S. Muthukrishnan. Estimating entropy and entropy norm on data streams. Internet Math., 3(1):63-78, 2006.
  13. Amit Chakrabarti, Subhash Khot, and Xiaodong Sun. Near-optimal lower bounds on the multi-party communication complexity of set disjointness. In IEEE Conference on Computational Complexity, pages 107-117, 2003.
  14. Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. In Automata, languages and programming, volume 2380 of Lec. Notes in Comput. Sci., pages 693-703. Springer, Berlin, 2002.
  15. Don Coppersmith and Ravi Kumar. An improved data stream algorithm for frequency moments. In ACM-SIAM Symposium on Discrete Algorithms, pages 151-156, 2004.
  16. Sumit Ganguly. Estimating frequency moments of data streams using random linear combinations. In Approximation, Randomization, and Combinatorial Optimization, pages 369-380. Springer, 2004.
  17. Sumit Ganguly. Polynomial estimators for high frequency moments. arXiv preprint arXiv:1104.4552, 2011.
  18. Edwin L. Grab and I. Richard Savage. Tables of the expected value of 1/X for positive Bernoulli and Poisson variables. J. Am. Stat. Assoc., 49(265):169-177, 1954.
  19. André Gronemeier. Asymptotically optimal lower bounds on the NIH-multi-party information complexity of the AND-function and disjointness. In Symposium on Theoretical Aspects of Computer Science, 2009.
  20. Sudipto Guha, Piotr Indyk, and Andrew McGregor. Sketching information divergences. In Learning theory, volume 4539 of Lec. Notes in Comput. Sci., pages 424-438. Springer, Berlin, 2007.
  21. Nicholas J. A. Harvey, Jelani Nelson, and Krzysztof Onak. Sketching and streaming entropy via approximation theory. In IEEE Symposium on Foundations of Computer Science, pages 489-498, 2008.
  22. Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. of the ACM, 53(3):307-323, 2006.
  23. Piotr Indyk and David Woodruff. Optimal approximations of the frequency moments of data streams. In ACM Symposium on the Theory of Computing, pages 202-208, 2005.
  24. C. Matthew Jones and Anatoly A. Zhigljavsky. Approximating the negative moments of the Poisson distribution. Statistics & Probability Letters, 66(2):171-181, 2004.
  25. Hossein Jowhari, Mert Sağlam, and Gábor Tardos. Tight bounds for l_p samplers, finding duplicates in streams, and related problems. In ACM Symposium on Principles of Database Systems, pages 49-58, 2011.
  26. Daniel M. Kane, Jelani Nelson, and David P. Woodruff. On the exact space complexity of sketching and streaming small norms. In ACM-SIAM Symposium on Discrete Algorithms, pages 1161-1178, 2010.
  27. Daniel M. Kane, Jelani Nelson, and David P. Woodruff. An optimal algorithm for the distinct elements problem. In ACM Symposium on Principles of Database Systems, pages 41-52, 2010.
  28. Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, Cambridge, 1997.
  29. Ping Li. Estimators and tail bounds for dimension reduction in l_α (0 < α ≤ 2) using stable random projections. In ACM-SIAM Symposium on Discrete Algorithms, pages 10-19, 2008.
  30. Yi Li and David P. Woodruff. A tight lower bound for high frequency moment estimation with small error. In Approximation, Randomization, and Combinatorial Optimization, pages 623-638. Springer, 2013.
  31. W. Mendenhall and E. H. Lehman. An approximation to the negative moments of the positive binomial useful in life testing. Technometrics, 2(2):227-242, 1960.
  32. Morteza Monemizadeh and David P. Woodruff. 1-pass relative-error l_p-sampling with applications. In ACM-SIAM Symposium on Discrete Algorithms, pages 1143-1160, 2010.
  33. Jelani Nelson. List of open problems in sublinear algorithms: Problem 30. URL: http://sublinear.info/30.
  34. Robert F. Reilly and Robert P. Schweihs. The handbook of business valuation and intellectual property analysis. McGraw Hill, 2004.
  35. Frederick F. Stephan. The expected value and variance of the reciprocal and other negative powers of a positive Bernoullian variate. Ann. Math. Stat., 16(1):50-61, 1945.
  36. David P. Woodruff. Data streams and applications in computer science. Bulletin of EATCS, 3(114), 2014.
  37. Marko Žnidarič. Asymptotic expansion for inverse moments of binomial and Poisson distributions. arXiv preprint math/0511226, 2005.
  38. Marko Žnidarič and Martin Horvat. Exponential complexity of an adiabatic algorithm for an NP-complete problem. Phys. Rev. A, 73(2), 2006.