Embedding Hard Learning Problems Into Gaussian Space

Authors: Adam Klivans and Pravesh Kothari



File: LIPIcs.APPROX-RANDOM.2014.793.pdf (482 kB, 17 pages)

Cite As

Adam Klivans and Pravesh Kothari. Embedding Hard Learning Problems Into Gaussian Space. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 28, pp. 793-809, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014) https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.793

Abstract

We give the first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution. We reduce from the problem of learning sparse parities with noise with respect to the uniform distribution on the hypercube (sparse LPN), a notoriously hard problem in theoretical computer science, and show that any algorithm for agnostically learning halfspaces requires n^Omega(log(1/epsilon)) time under the assumption that k-sparse LPN requires n^Omega(k) time, ruling out a polynomial-time algorithm for the problem. As far as we are aware, this is the first representation-independent hardness result for supervised learning when the underlying distribution is restricted to be a Gaussian.
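
To make the assumed-hard source problem concrete, here is a minimal Python/NumPy sketch of a k-sparse LPN sample oracle; the function name, the {-1, +1} label encoding, and the parameters n, k, eta, m are our illustrative conventions, not anything fixed by the paper.

    import numpy as np

    def sparse_lpn_samples(n, k, eta, m, rng):
        """Draw m labeled examples (x, y): x is uniform on {-1, +1}^n and
        y is the parity chi_S(x) over a hidden k-sparse set S, with the
        label flipped independently with probability eta."""
        S = rng.choice(n, size=k, replace=False)        # hidden k-sparse support
        X = rng.choice(np.array([-1, 1]), size=(m, n))  # uniform hypercube points
        clean = X[:, S].prod(axis=1)                    # chi_S(x) = prod_{i in S} x_i
        flips = np.where(rng.random(m) < eta, -1, 1)    # Bernoulli(eta) label noise
        return X, clean * flips

    X, y = sparse_lpn_samples(n=100, k=3, eta=0.1, m=1000,
                              rng=np.random.default_rng(0))

The hardness assumption is that recovering S from such samples requires n^Omega(k) time; the reduction transfers this to agnostic halfspace learning over the Gaussian.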
We also show that agnostically learning sparse polynomials with respect to the Gaussian distribution in polynomial time is as hard as PAC learning DNFs with respect to the uniform distribution in polynomial time. This complements the surprising result of Andoni et al. [1], who show that sparse polynomials are learnable under random Gaussian noise in polynomial time.
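
The gap between the two noise models can be seen in a toy example (our illustration, not the authors' construction): under random Gaussian noise the labels are the sparse polynomial plus independent noise, whereas in the agnostic setting labels may be corrupted arbitrarily on a small fraction of inputs.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 10_000, 5
    Z = rng.standard_normal((m, n))              # examples drawn from N(0, I_n)
    f = Z[:, 0] * Z[:, 1]                        # a 2-sparse polynomial (one monomial)

    y_random = f + 0.1 * rng.standard_normal(m)  # benign: independent Gaussian label
                                                 # noise, the tractable setting of [1]
    mask = rng.random(m) < 0.05                  # agnostic: an adversary may corrupt
    y_agnostic = np.where(mask, -f, f)           # labels on an arbitrary small fraction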
Taken together, these results show the inherent difficulty of designing supervised learning algorithms in Euclidean space, even in the presence of strong distributional assumptions. Our results use a novel embedding of random labeled examples from the uniform distribution on the Boolean hypercube into random labeled examples from the Gaussian distribution, which allows us to relate the hardness of learning problems on two different domains and distributions.
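
One natural embedding of this kind replaces each Boolean coordinate with its sign times an independent half-normal magnitude; the sketch below illustrates that idea under our own conventions (the paper's actual construction may differ in details). Uniform hypercube examples then map to exactly N(0, I_n) examples, while sign(z_i) = x_i, so parity structure in the signs survives the embedding.

    import numpy as np

    def embed_hypercube_to_gaussian(X, rng):
        """X: (m, n) array with entries in {-1, +1}. Replace each x_i by
        x_i * |g_i| with g_i ~ N(0, 1) drawn independently; for uniform
        Boolean X the output is distributed exactly as N(0, I_n)."""
        return X * np.abs(rng.standard_normal(X.shape))

    # sanity check: empirical marginals are approximately standard Gaussian
    rng = np.random.default_rng(0)
    X = rng.choice(np.array([-1, 1]), size=(100_000, 3))
    Z = embed_hypercube_to_gaussian(X, rng)
    print(Z.mean(axis=0).round(2), Z.var(axis=0).round(2))  # ~[0 0 0], ~[1 1 1]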

Keywords
  • distribution-specific hardness of learning
  • Gaussian space
  • halfspace-learning
  • agnostic learning


References

  1. Alexandr Andoni, Rina Panigrahy, Gregory Valiant, and Li Zhang. Learning sparse polynomial functions. In SODA, 2014.
  2. Shai Ben-David and Hans-Ulrich Simon. Efficient learning of linear perceptrons. In NIPS, pages 189-195, 2000.
  3. Quentin Berthet and Philippe Rigollet. Complexity theoretic lower bounds for sparse principal component detection. In COLT, pages 1046-1066, 2013.
  4. Aharon Birnbaum and Shai Shalev-Shwartz. Learning halfspaces with the zero-one loss: Time-accuracy tradeoffs. In NIPS, pages 935-943, 2012.
  5. A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM, 50(4):506-519, 2003.
  6. Avrim Blum, Alan M. Frieze, Ravi Kannan, and Santosh Vempala. A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica, 22(1/2):35-52, 1998.
  7. Nader H. Bshouty and Vitaly Feldman. On using extended statistical queries to avoid membership queries. Journal of Machine Learning Research, 2:359-395, 2002.
  8. Amit Daniely, Nati Linial, and Shai Shalev-Shwartz. From average case complexity to improper learning complexity. CoRR, abs/1311.2272, 2013.
  9. Ilias Diakonikolas, Daniel M. Kane, and Jelani Nelson. Bounded independence fools degree-2 threshold functions. CoRR, abs/0911.3389, 2009.
  10. Ilias Diakonikolas, Ryan O'Donnell, Rocco A. Servedio, and Yi Wu. Hardness results for agnostically learning low-degree polynomial threshold functions. In SODA, pages 1590-1606, 2011.
  11. V. Feldman. A complete characterization of statistical query learning with applications to evolvability. Journal of Computer and System Sciences, 78(5):1444-1459, 2012.
  12. V. Feldman, P. Gopalan, S. Khot, and A. Ponuswami. On agnostic learning of parities, monomials and halfspaces. SIAM Journal on Computing, 39(2):606-645, 2009.
  13. Vitaly Feldman and Varun Kanade. Computational bounds on statistical query learning. In COLT, pages 16.1-16.22, 2012.
  14. Vitaly Feldman, Pravesh Kothari, and Jan Vondrák. Representation, approximation and learning of submodular functions using low-rank decision trees. In COLT, pages 711-740, 2013.
  15. Yoav Freund. Boosting a weak learning algorithm by majority. In COLT, pages 202-216, 1990.
  16. Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In COLT, pages 391-398, 1992.
  17. Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., 55(1):119-139, 1997.
  18. Venkatesan Guruswami and Prasad Raghavendra. Hardness of learning halfspaces with noise. SIAM J. Comput., 39(2):742-765, 2009.
  19. D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100(1):78-150, 1992.
  20. Jeffrey C. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. J. Comput. Syst. Sci., 55(3):414-440, 1997.
  21. Adam Tauman Kalai, Adam R. Klivans, Yishay Mansour, and Rocco A. Servedio. Agnostically learning halfspaces. SIAM J. Comput., 37(6):1777-1805, 2008.
  22. Daniel M. Kane, Adam Klivans, and Raghu Meka. Learning halfspaces under log-concave densities: Polynomial approximations and moment matching. In COLT, pages 522-545, 2013.
  23. M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2-3):115-141, 1994.
  24. Eike Kiltz, Krzysztof Pietrzak, David Cash, Abhishek Jain, and Daniele Venturi. Efficient authentication from hard learning problems. In EUROCRYPT, pages 7-26, 2011.
  25. Adam Klivans, Pravesh Kothari, and Igor Oliveira. Constructing hard functions from learning algorithms. Conference on Computational Complexity, CCC, 20:129, 2013.
  26. Adam R. Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning intersections and thresholds of halfspaces. J. Comput. Syst. Sci., 68(4):808-840, 2004.
  27. Adam R. Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning geometric concepts via Gaussian surface area. In FOCS, pages 541-550, 2008.
  28. Adam R. Klivans and Rocco A. Servedio. Boosting and hard-core set construction. Machine Learning, 51(3):217-238, 2003.
  29. Adam R. Klivans and Alexander A. Sherstov. Cryptographic hardness for learning intersections of halfspaces. In FOCS, pages 553-562, 2006.
  30. Adam R. Klivans and Alexander A. Sherstov. Lower bounds for agnostic learning via approximate rank. Computational Complexity, 19(4):581-604, 2010.
  31. Eyal Kushilevitz and Yishay Mansour. Learning decision trees using the Fourier spectrum. SIAM J. Comput., 22(6):1331-1348, 1993.
  32. Ryan O'Donnell. Fourier coefficients of majority. http://www.contrib.andrew.cmu.edu/~ryanod/?p=877, 2012.
  33. Frank Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386-408, 1958.
  34. Shai Shalev-Shwartz, Ohad Shamir, and Karthik Sridharan. Learning kernel-based halfspaces with the zero-one loss. In COLT, pages 441-450, 2010.
  35. Gregory Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. In FOCS, 2012.
  36. L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134-1142, 1984.
  37. V. Vapnik. Statistical Learning Theory. Wiley-Interscience, New York, 1998.