Approximating Subadditive Hadamard Functions on Implicit Matrices

Authors Vladimir Braverman, Alan Roytman, Gregory Vorsanger



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2016.25.pdf
  • Filesize: 0.56 MB
  • 19 pages

Document Identifiers

Author Details

Vladimir Braverman
Alan Roytman
Gregory Vorsanger

Cite AsGet BibTex

Vladimir Braverman, Alan Roytman, and Gregory Vorsanger. Approximating Subadditive Hadamard Functions on Implicit Matrices. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 25:1-25:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2016.25

Abstract

An important challenge in the streaming model is to maintain small-space approximations of entrywise functions performed on a matrix that is generated by the outer product of two vectors given as a stream. In other works, streams typically define matrices in a standard way via a sequence of updates, as in the work of Woodruff [22] and others. We describe the matrix formed by the outer product, and other matrices that do not fall into this category, as implicit matrices. As such, we consider the general problem of computing over such implicit matrices with Hadamard functions, which are functions applied entrywise on a matrix. In this paper, we apply this generalization to provide new techniques for identifying independence between two data streams. The previous state of the art algorithm of Braverman and Ostrovsky [9] gave a (1 +- epsilon)-approximation for the L_1 distance between the joint and product of the marginal distributions, using space O(log^{1024}(nm) epsilon^{-1024}), where m is the length of the stream and n denotes the size of the universe from which stream elements are drawn. Our general techniques include the L_1 distance as a special case, and we give an improved space bound of O(log^{12}(n) log^{2}({nm}/epsilon) epsilon^{-7}).
Keywords
  • Streaming Algorithms
  • Measuring Independence
  • Hadamard Functions
  • Implicit Matrices

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin Matulef, Ronitt Rubinfeld, and Ning Xie. Testing k-wise and almost k-wise independence. In Proceedings of the 39th annual ACM Symposium on Theory of Computing, 2007. Google Scholar
  2. Noga Alon, Oded Goldreich, and Yishay Mansour. Almost k-wise independence versus k-wise independence. Information Processing Letters, 88(3):107-110, 2003. Google Scholar
  3. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. In Proceedings of the 28th annual ACM Symposium on Theory of Computing, 1996. Google Scholar
  4. Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. Testing random variables for independence and identity. In Proceedings of the 42nd annual IEEE Symposium on Foundations of Computer Science, 2001. Google Scholar
  5. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In Proceedings of the 41st annual IEEE Symposium on Foundations of Computer Science, 2000. Google Scholar
  6. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing closeness of discrete distributions. Journal of the ACM, 60(1):4, 2013. Google Scholar
  7. Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of the 36th annual ACM Symposium on Theory of Computing, 2004. Google Scholar
  8. Vladimir Braverman, Kai-Min Chung, Zhenming Liu, Michael Mitzenmacher, and Rafail Ostrovsky. AMS Without 4-Wise Independence on Product Domains. In Proceedings of the 27th International Symposium on Theoretical Aspects of Computer Science, 2010. Google Scholar
  9. Vladimir Braverman and Rafail Ostrovsky. Measuring independence of datasets. In Proceedings of the 42nd annual ACM Symposium on Theory of Computing, 2010. Google Scholar
  10. Vladimir Braverman and Rafail Ostrovsky. Zero-one frequency laws. In Proceedings of the 42nd annual ACM Symposium on Theory of Computing, 2010. Google Scholar
  11. Vladimir Braverman and Rafail Ostrovsky. Generalizing the layering method of Indyk and Woodruff: Recursive sketches for frequency-based vectors on streams. In International Workshop on Approximation Algorithms for Combinatorial Optimization Problems. Springer, 2013. Google Scholar
  12. Dominique Guillot, Apoorva Khare, and Bala Rajaratnam. Complete characterization of hadamard powers preserving loewner positivity, monotonicity, and convexity. Journal of Mathematical Analysis and Applications, 425(1):489-507, 2015. Google Scholar
  13. Roger A. Horn and Charles R. Johnson. Topics in matrix analysis. Cambridge University Press, Cambridge, 1991. Google Scholar
  14. Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. Journal of the ACM, 53(3):307-323, 2006. Google Scholar
  15. Piotr Indyk and Andrew McGregor. Declaring independence via the sketching of sketches. In Proceedings of the 19th annual ACM-SIAM Symposium on Discrete Algorithms, 2008. Google Scholar
  16. Ralph Kimball and Joe Caserta. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. John Wiley &Sons, 2004. Google Scholar
  17. Erich L. Lehmann and Joseph P. Romano. Testing statistical hypotheses. Springer Science &Business Media, 2006. Google Scholar
  18. Andrew McGregor and Hoa T. Vu. Evaluating bayesian networks via data streams. In Proceedings of the 21st International Computing and Combinatorics Conference, 2015. Google Scholar
  19. Viswanath Poosala and Yannis E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In Proceedings of the 23rd International Conference on Very Large Data Bases, 1997. Google Scholar
  20. Ronitt Rubinfeld and Rocco A. Servedio. Testing monotone high-dimensional distributions. In Proceedings of the 37th annual ACM Symposium on Theory of Computing, 2005. Google Scholar
  21. Amit Sahai and Salil Vadhan. A complete problem for statistical zero knowledge. Journal of the ACM, 50(2):196-249, 2003. Google Scholar
  22. David P. Woodruff. Sketching as a tool for numerical linear algebra. CoRR, abs/1411.4357, 2014. URL: http://arxiv.org/abs/1411.4357.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail