Linear Sketching over F_2

Authors Sampath Kannan, Elchanan Mossel, Swagato Sanyal, Grigory Yaroslavtsev



PDF
Thumbnail PDF

File

LIPIcs.CCC.2018.8.pdf
  • Filesize: 0.77 MB
  • 37 pages

Document Identifiers

Author Details

Sampath Kannan
  • University of Pennsylvania
Elchanan Mossel
  • Massachusetts Institute of Technology
Swagato Sanyal
  • Division of Mathematical Sciences, Nanyang Technological University, Singapore and Centre for Quantum Technologies, National University of Singapore, Singapore
Grigory Yaroslavtsev
  • Indiana University, Bloomington

Cite As Get BibTex

Sampath Kannan, Elchanan Mossel, Swagato Sanyal, and Grigory Yaroslavtsev. Linear Sketching over F_2. In 33rd Computational Complexity Conference (CCC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 102, pp. 8:1-8:37, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.CCC.2018.8

Abstract

We initiate a systematic study of linear sketching over F_2. For a given Boolean function treated as f : F_2^n -> F_2 a randomized F_2-sketch is a distribution M over d x n matrices with elements over F_2 such that Mx suffices for computing f(x) with high probability. Such sketches for d << n can be used to design small-space distributed and streaming algorithms.
Motivated by these applications we study a connection between F_2-sketching and a two-player one-way communication game for the corresponding XOR-function. We conjecture that F_2-sketching is optimal for this communication game. Our results confirm this conjecture for multiple important classes of functions: 1) low-degree F_2-polynomials, 2) functions with sparse Fourier spectrum, 3) most symmetric functions, 4) recursive majority function. These results rely on a new structural theorem that shows that F_2-sketching is optimal (up to constant factors) for uniformly distributed inputs.
Furthermore, we show that (non-uniform) streaming algorithms that have to process random updates over F_2 can be constructed as F_2-sketches for the uniform distribution. In contrast with the previous work of Li, Nguyen and Woodruff (STOC'14) who show an analogous result for linear sketches over integers in the adversarial setting our result does not require the stream length to be triply exponential in n and holds for streams of length O(n) constructed through uniformly random updates.

Subject Classification

ACM Subject Classification
  • Theory of computation → Communication complexity
Keywords
  • Linear sketch
  • Streaming algorithms
  • XOR-functions
  • Communication complexity

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Yuqing Ai, Wei Hu, Yi Li, and David P. Woodruff. New characterizations in turnstile streams with applications. In Ran Raz, editor, 31st Conference on Computational Complexity, CCC 2016, May 29 to June 1, 2016, Tokyo, Japan, volume 50 of LIPIcs, pages 20:1-20:22. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.CCC.2016.20.
  2. Noga Alon, Tali Kaufman, Michael Krivelevich, Simon Litsyn, and Dana Ron. Testing reed-muller codes. IEEE Trans. Information Theory, 51(11):4032-4039, 2005. Google Scholar
  3. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci., 58(1):137-147, 1999. URL: http://dx.doi.org/10.1006/jcss.1997.1545.
  4. Sepehr Assadi, Sanjeev Khanna, and Yang Li. On estimating maximum matching size in graph streams. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 1723-1742, 2017. Google Scholar
  5. Sepehr Assadi, Sanjeev Khanna, Yang Li, and Grigory Yaroslavtsev. Maximum matchings in dynamic graph streams and the simultaneous communication model. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1345-1364, 2016. Google Scholar
  6. Arnab Bhattacharyya, Swastik Kopparty, Grant Schoenebeck, Madhu Sudan, and David Zuckerman. Optimal testing of reed-muller codes. In 51th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, October 23-26, 2010, Las Vegas, Nevada, USA, pages 488-497, 2010. Google Scholar
  7. Eric Blais, Li-Yang Tan, and Andrew Wan. An inequality for the fourier spectrum of parity decision trees. CoRR, abs/1506.01055, 2015. URL: http://arxiv.org/abs/1506.01055.
  8. Andrej Bogdanov and Emanuele Viola. Pseudorandom bits for polynomials. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2007), October 20-23, 2007, Providence, RI, USA, Proceedings, pages 41-51, 2007. Google Scholar
  9. Anirban Dasgupta, Ravi Kumar, and D. Sivakumar. Sparse and lopsided set disjointness via information theory. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 15th International Workshop, APPROX 2012, and 16th International Workshop, RANDOM 2012, Cambridge, MA, USA, August 15-17, 2012. Proceedings, pages 517-528, 2012. Google Scholar
  10. Sumit Ganguly. Lower bounds on frequency estimation of data streams (extended abstract). In Computer Science - Theory and Applications, Third International Computer Science Symposium in Russia, CSR 2008, Moscow, Russia, June 7-12, 2008, Proceedings, pages 204-215, 2008. Google Scholar
  11. Dmitry Gavinsky, Julia Kempe, and Ronald de Wolf. Quantum communication cannot simulate a public coin. CoRR, quant-ph/0411051, 2004. URL: http://arxiv.org/abs/quant-ph/0411051.
  12. Mohsen Ghaffari and Merav Parter. MST in log-star rounds of congested clique. In Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing, PODC 2016, Chicago, IL, USA, July 25-28, 2016, pages 19-28, 2016. Google Scholar
  13. Mika Göös and T. S. Jayram. A composition theorem for conical juntas. In 31st Conference on Computational Complexity, CCC 2016, May 29 to June 1, 2016, Tokyo, Japan, pages 5:1-5:16, 2016. Google Scholar
  14. Mika Göös, Shachar Lovett, Raghu Meka, Thomas Watson, and David Zuckerman. Rectangles are nonnegative juntas. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 257-266, 2015. Google Scholar
  15. Mika Göös, Toniann Pitassi, and Thomas Watson. Deterministic communication vs. partition number. In IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 1077-1088, 2015. Google Scholar
  16. Parikshit Gopalan, Ryan O'Donnell, Rocco A. Servedio, Amir Shpilka, and Karl Wimmer. Testing fourier dimensionality and sparsity. SIAM J. Comput., 40(4):1075-1100, 2011. URL: http://dx.doi.org/10.1137/100785429.
  17. Vince Grolmusz. On the power of circuits with gates of low l_1 norms. Theor. Comput. Sci., 188(1-2):117-128, 1997. URL: http://dx.doi.org/10.1016/S0304-3975(96)00290-3.
  18. Hamed Hatami, Kaave Hosseini, and Shachar Lovett. Structure of protocols for XOR functions. In IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, New Brunswick, New Jersey, USA, pages 282-288, 2016. Google Scholar
  19. James W. Hegeman, Gopal Pandurangan, Sriram V. Pemmaraju, Vivek B. Sardeshmukh, and Michele Scquizzato. Toward optimal bounds in the congested clique: Graph connectivity and MST. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, Donostia-San Sebastián, Spain, July 21 - 23, 2015, pages 91-100, 2015. Google Scholar
  20. Wei Huang, Yaoyun Shi, Shengyu Zhang, and Yufan Zhu. The communication complexity of the hamming distance problem. Inf. Process. Lett., 99(4):149-153, 2006. URL: http://dx.doi.org/10.1016/j.ipl.2006.01.014.
  21. T. S. Jayram. Information complexity: a tutorial. In Proceedings of the Twenty-Ninth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2010, June 6-11, 2010, Indianapolis, Indiana, USA, pages 159-168, 2010. Google Scholar
  22. T. S. Jayram, Ravi Kumar, and D. Sivakumar. Two applications of information complexity. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, June 9-11, 2003, San Diego, CA, USA, pages 673-682, 2003. Google Scholar
  23. Tomasz Jurdzinski and Krzysztof Nowicki. MST in O(1) rounds of congested clique. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 2620-2632, 2018. Google Scholar
  24. Michael Kapralov, Yin Tat Lee, Cameron Musco, Christopher Musco, and Aaron Sidford. Single pass spectral sparsification in dynamic streams. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 561-570, 2014. Google Scholar
  25. Michael Kapralov, Jelani Nelson, Jakub Pachocki, Zhengyu Wang, David P. Woodruff, and Mobin Yahyazadeh. Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 475-486, 2017. Google Scholar
  26. Eyal Kushilevitz and Yishay Mansour. Learning decision trees using the fourier spectrum. SIAM J. Comput., 22(6):1331-1348, 1993. URL: http://dx.doi.org/10.1137/0222080.
  27. Eyal Kushilevitz and Noam Nisan. Communication complexity. Cambridge University Press, 1997. Google Scholar
  28. Troy Lee and Shengyu Zhang. Composition theorems in communication complexity. In Automata, Languages and Programming, 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceedings, Part I, pages 475-489, 2010. Google Scholar
  29. Nikos Leonardos. An improved lower bound for the randomized decision tree complexity of recursive majority,. In Automata, Languages, and Programming - 40th International Colloquium, ICALP 2013, Riga, Latvia, July 8-12, 2013, Proceedings, Part I, pages 696-708, 2013. Google Scholar
  30. Ming Lam Leung, Yang Li, and Shengyu Zhang. Tight bounds on the randomized communication complexity of symmetric XOR functions in one-way and SMP models. CoRR, abs/1101.4555, 2011. URL: http://arxiv.org/abs/1101.4555.
  31. Yi Li, Huy L. Nguyen, and David P. Woodruff. Turnstile streaming algorithms might as well be linear sketches. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 174-183, 2014. Google Scholar
  32. Yang Liu and Shengyu Zhang. Quantum and randomized communication complexity of XOR functions in the SMP model. Electronic Colloquium on Computational Complexity (ECCC), 20:10, 2013. URL: http://eccc.hpi-web.de/report/2013/010.
  33. Shachar Lovett. Unconditional pseudorandom generators for low degree polynomials. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, May 17-20, 2008, pages 557-562, 2008. Google Scholar
  34. Shachar Lovett. Recent advances on the log-rank conjecture in communication complexity. Bulletin of the EATCS, 112, 2014. URL: http://eatcs.org/beatcs/index.php/beatcs/article/view/260.
  35. Frédéric Magniez, Ashwin Nayak, Miklos Santha, Jonah Sherman, Gábor Tardos, and David Xiao. Improved bounds for the randomized decision tree complexity of recursive majority. CoRR, abs/1309.7565, 2013. URL: http://arxiv.org/abs/1309.7565.
  36. Frédéric Magniez, Ashwin Nayak, Miklos Santha, and David Xiao. Improved bounds for the randomized decision tree complexity of recursive majority. In Automata, Languages and Programming - 38th International Colloquium, ICALP 2011, Zurich, Switzerland, July 4-8, 2011, Proceedings, Part I, pages 317-329, 2011. Google Scholar
  37. Andrew McGregor. Graph stream algorithms: a survey. SIGMOD Record, 43(1):9-20, 2014. URL: http://dx.doi.org/10.1145/2627692.2627694.
  38. Ashley Montanaro and Tobias Osborne. On the communication complexity of XOR functions. CoRR, abs/0909.3392, 2009. URL: http://arxiv.org/abs/0909.3392.
  39. Elchanan Mossel, Sampath Kannan, and Grigory Yaroslavtsev. Linear sketching over 𝕗_2. Electronic Colloquium on Computational Complexity (ECCC), 23:174, 2016. URL: http://eccc.hpi-web.de/report/2016/174.
  40. Elchanan Mossel, Ryan O'Donnell, and Rocco A. Servedio. Learning juntas. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, June 9-11, 2003, San Diego, CA, USA, pages 206-212, 2003. Google Scholar
  41. Ryan O'Donnell, John Wright, Yu Zhao, Xiaorui Sun, and Li-Yang Tan. A composition theorem for parity kill number. In IEEE 29th Conference on Computational Complexity, CCC 2014, Vancouver, BC, Canada, June 11-13, 2014, pages 144-154, 2014. Google Scholar
  42. Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM J. Comput., 25(2):252-271, 1996. Google Scholar
  43. Michael E. Saks and Avi Wigderson. Probabilistic boolean decision trees and the complexity of evaluating game trees. In 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 27-29 October 1986, pages 29-38, 1986. Google Scholar
  44. Swagato Sanyal. Near-optimal upper bound on fourier dimension of boolean functions in terms of fourier sparsity. In Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part I, pages 1035-1045, 2015. Google Scholar
  45. Yaoyun Shi and Zhiqiang Zhang. Communication complexities of symmetric xor functions. Quantum Inf. Comput, pages 0808-1762, 2008. Google Scholar
  46. Amir Shpilka, Avishay Tal, and Ben lee Volk. On the structure of boolean functions with small spectral norm. In Innovations in Theoretical Computer Science, ITCS'14, Princeton, NJ, USA, January 12-14, 2014, pages 37-48, 2014. Google Scholar
  47. Xiaoming Sun and Chengu Wang. Randomized communication complexity for linear algebra problems over finite fields. In 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, February 29th - March 3rd, 2012, Paris, France, pages 477-488, 2012. Google Scholar
  48. Justin Thaler. Semi-streaming algorithms for annotated graph streams. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, July 11-15, 2016, Rome, Italy, pages 59:1-59:14, 2016. Google Scholar
  49. Hing Yin Tsang, Chung Hoi Wong, Ning Xie, and Shengyu Zhang. Fourier sparsity, spectral norm, and the log-rank conjecture. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 658-667, 2013. Google Scholar
  50. Emanuele Viola. The sum of d small-bias generators fools polynomials of degree d. In Proceedings of the 23rd Annual IEEE Conference on Computational Complexity, CCC 2008, 23-26 June 2008, College Park, Maryland, USA, pages 124-127, 2008. Google Scholar
  51. Omri Weinstein and David P. Woodruff. The simultaneous communication of disjointness with applications to data streams. In Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part I, pages 1082-1093, 2015. Google Scholar
  52. David P. Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends in Theoretical Computer Science, 10(1-2):1-157, 2014. URL: http://dx.doi.org/10.1561/0400000060.
  53. Andrew Chi-Chih Yao. Lower bounds by probabilistic arguments (extended abstract). In 24th Annual Symposium on Foundations of Computer Science, Tucson, Arizona, USA, 7-9 November 1983, pages 420-428, 1983. Google Scholar
  54. Grigory Yaroslavtsev. Approximate linear sketching over 𝕗_2, 2017. Google Scholar
  55. Zhiqiang Zhang and Yaoyun Shi. On the parity complexity measures of boolean functions. Theor. Comput. Sci., 411(26-28):2612-2618, 2010. URL: http://dx.doi.org/10.1016/j.tcs.2010.03.027.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail