Memory-Sample Lower Bounds for Learning Parity with Noise

Authors: Sumegha Garg, Pravesh K. Kothari, Pengda Liu, Ran Raz



File

LIPIcs.APPROX-RANDOM.2021.60.pdf
  • Filesize: 0.75 MB
  • 19 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.APPROX/RANDOM.2021.60

Author Details

Sumegha Garg
  • Department of Computer Science, Harvard University, Cambridge, MA, USA
Pravesh K. Kothari
  • Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Pengda Liu
  • Department of Computer Science, Stanford University, CA, USA
Ran Raz
  • Department of Computer Science, Princeton University, NJ, USA

Acknowledgements

We would like to thank Avishay Tal and Greg Valiant for helpful discussions.

Cite As

Sumegha Garg, Pravesh K. Kothari, Pengda Liu, and Ran Raz. Memory-Sample Lower Bounds for Learning Parity with Noise. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 60:1-60:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021) https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.60

Abstract

In this work, we show, for the well-studied problem of learning parity under noise, where a learner tries to learn x = (x₁,…,x_n) ∈ {0,1}ⁿ from a stream of random linear equations over 𝔽₂ that are correct with probability 1/2+ε and flipped with probability 1/2-ε (0 < ε < 1/2), that any learning algorithm requires either a memory of size Ω(n²/ε) or an exponential number of samples. 
In fact, we study memory-sample lower bounds for a large class of learning problems, as characterized by [Garg et al., 2018], when the samples are noisy. A matrix M: A × X → {-1,1} corresponds to the following learning problem with error parameter ε: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples, (a₁, b₁), (a₂, b₂), …, where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i,x) with probability 1/2+ε and b_i = -M(a_i,x) with probability 1/2-ε (0 < ε < 1/2). Assume that k, 𝓁, r are such that any submatrix of M with at least 2^{-k} ⋅ |A| rows and at least 2^{-𝓁} ⋅ |X| columns has a bias of at most 2^{-r}. We show that any learning algorithm for the learning problem corresponding to M, with error parameter ε, requires either a memory of size at least Ω((k⋅𝓁)/ε), or at least 2^{Ω(r)} samples. The result holds even if the learner has an exponentially small success probability (of 2^{-Ω(r)}). In particular, this shows that for a large class of learning problems, the same as in [Garg et al., 2018], any learning algorithm requires either a memory of size at least Ω(((log|X|)⋅(log|A|))/ε) or an exponential number of noisy samples.
Our proof is based on adapting the arguments in [Ran Raz, 2017; Garg et al., 2018] to the noisy case.
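To make the sampling model above concrete, the following Python sketch generates the noisy sample stream for the parity matrix M(a, x) = (-1)^{⟨a,x⟩ mod 2}, i.e., the learning-parity-with-noise special case, and empirically estimates the bias of a small submatrix (the quantity that the k, 𝓁, r condition bounds). This is a minimal illustration of the model, not code from the paper; the function names and the choices of n, ε, and submatrix size are our own, chosen only for exposition.

    import numpy as np

    rng = np.random.default_rng(0)

    def parity_M(a, x):
        # The matrix M : A x X -> {-1, 1} for parity learning:
        # A = X = {0,1}^n and M(a, x) = (-1)^{<a, x> mod 2}.
        return 1 - 2 * (int(np.dot(a, x)) % 2)

    def noisy_sample_stream(x, eps, num_samples, rng):
        # Yields samples (a_i, b_i): a_i is uniform over {0,1}^n, and
        # b_i = M(a_i, x) with probability 1/2 + eps,
        # b_i = -M(a_i, x) with probability 1/2 - eps.
        n = len(x)
        for _ in range(num_samples):
            a = rng.integers(0, 2, size=n)
            b = parity_M(a, x)
            if rng.random() >= 0.5 + eps:   # flip with probability 1/2 - eps
                b = -b
            yield a, b

    def empirical_bias(rows, cols):
        # Bias of the submatrix of M indexed by 'rows' (a subset of A) and
        # 'cols' (a subset of X): absolute value of the average entry.
        vals = [parity_M(a, x) for a in rows for x in cols]
        return abs(float(np.mean(vals)))

    # Example: a short noisy stream for a hidden x in {0,1}^n, with n = 16, eps = 0.1.
    n, eps = 16, 0.1
    x = rng.integers(0, 2, size=n)
    for a, b in noisy_sample_stream(x, eps, 3, rng):
        print(a.tolist(), b)

    # Empirical bias of a small randomly chosen submatrix; for the parity matrix
    # this is small, in line with the extractor-style condition in the abstract.
    rows = [rng.integers(0, 2, size=n) for _ in range(64)]
    cols = [rng.integers(0, 2, size=n) for _ in range(64)]
    print(empirical_bias(rows, cols))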

Subject Classification

ACM Subject Classification
  • Theory of computation → Machine learning theory
Keywords
  • memory-sample tradeoffs
  • learning parity under noise
  • space lower bound
  • branching program


References

  1. Michael Alekhnovich. More on average case vs approximation complexity. In 44th Symposium on Foundations of Computer Science (FOCS 2003), 11-14 October 2003, Cambridge, MA, USA, Proceedings, pages 298-307. IEEE Computer Society, 2003. URL: https://doi.org/10.1109/SFCS.2003.1238204.
  2. Paul Beame, Shayan Oveis Gharan, and Xin Yang. Time-space tradeoffs for learning finite functions from random evaluations, with applications to polynomials. In Conference on Learning Theory, pages 843-856, 2018.
  3. Avrim Blum, Adam Kalai, and Hal Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. J. ACM, 50(4):506-519, 2003. URL: https://doi.org/10.1145/792538.792543.
  4. Yuval Dagan, Gil Kur, and Ohad Shamir. Space lower bounds for linear prediction in the streaming model. In Conference on Learning Theory, pages 929-954. PMLR, 2019.
  5. Yuval Dagan and Ohad Shamir. Detecting correlations with little memory and communication. In Conference on Learning Theory, pages 1145-1198, 2018.
  6. Wei Dai, Stefano Tessaro, and Xihu Zhang. Super-linear time-memory trade-offs for symmetric encryption. Cryptology ePrint Archive, Report 2020/663, 2020. URL: https://eprint.iacr.org/2020/663.
  7. Vitaly Feldman, Parikshit Gopalan, Subhash Khot, and Ashok Kumar Ponnuswami. On agnostic learning of parities, monomials, and halfspaces. SIAM J. Comput., 39(2):606-645, 2009. URL: https://doi.org/10.1137/070684914.
  8. Sumegha Garg, Ran Raz, and Avishay Tal. Extractor-based time-space lower bounds for learning. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 990-1002. ACM, 2018.
  9. Sumegha Garg, Ran Raz, and Avishay Tal. Time-space lower bounds for two-pass learning. In 34th Computational Complexity Conference (CCC 2019). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019.
  10. Uma Girish, Ran Raz, and Wei Zhan. Quantum logspace algorithm for powering matrices with bounded norm. arXiv preprint arXiv:2006.04880, 2020.
  11. Jiaxin Guan and Mark Zhandry. Simple schemes in the bounded storage model. In Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 500-524. Springer, 2019.
  12. Jiaxin Guan and Mark Zhandry. Disappearing cryptography in the bounded storage model. IACR Cryptol. ePrint Arch., 2021:406, 2021.
  13. Joseph Jaeger and Stefano Tessaro. Tight time-memory trade-offs for symmetric encryption. In Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 467-497. Springer, 2019.
  14. Adam Tauman Kalai, Adam R. Klivans, Yishay Mansour, and Rocco A. Servedio. Agnostically learning halfspaces. SIAM J. Comput., 37(6):1777-1805, 2008. URL: https://doi.org/10.1137/060649057.
  15. Michael J. Kearns. Efficient noise-tolerant learning from statistical queries. J. ACM, 45(6):983-1006, 1998. URL: https://doi.org/10.1145/293347.293351.
  16. Gillat Kol, Ran Raz, and Avishay Tal. Time-space hardness of learning sparse parities. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 1067-1080. ACM, 2017.
  17. Dana Moshkovitz and Michal Moshkovitz. Mixing implies lower bounds for space bounded learning. In Conference on Learning Theory, pages 1516-1566. PMLR, 2017.
  18. Dana Moshkovitz and Michal Moshkovitz. Entropy samplers and strong generic lower bounds for space bounded learning. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018.
  19. Michal Moshkovitz and Naftali Tishby. Mixing complexity and its applications to neural networks. arXiv preprint arXiv:1703.00729, 2017.
  20. Ran Raz. Fast learning requires good memory: A time-space lower bound for parity learning. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, pages 266-275. IEEE, 2016.
  21. Ran Raz. A time-space lower bound for a large class of learning problems. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 732-742, 2017.
  22. Ohad Shamir. Fundamental limits of online and distributed algorithms for statistical learning and estimation. Advances in Neural Information Processing Systems, 27:163-171, 2014.
  23. Vatsal Sharan, Aaron Sidford, and Gregory Valiant. Memory-sample tradeoffs for linear regression with small error. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 890-901, 2019.
  24. Jacob Steinhardt, Gregory Valiant, and Stefan Wager. Memory, communication, and statistical queries. In Conference on Learning Theory, pages 1490-1516. PMLR, 2016.
  25. Stefano Tessaro and Aishwarya Thiruvengadam. Provable time-memory trade-offs: symmetric cryptography against memory-bounded adversaries. In Theory of Cryptography Conference, pages 3-32. Springer, 2018.
  26. Gregory Valiant and Paul Valiant. Information theoretically secure databases. arXiv preprint arXiv:1605.02646, 2016.