Learning Stochastic Decision Trees

Authors: Guy Blanc, Jane Lange, Li-Yang Tan

Author Details

Guy Blanc
  • Stanford University, CA, USA
Jane Lange
  • MIT, Cambridge, MA, USA
Li-Yang Tan
  • Stanford University, CA, USA

Cite As

Guy Blanc, Jane Lange, and Li-Yang Tan. Learning Stochastic Decision Trees. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Abstract


We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an η-corrupted set of uniform random samples labeled by a size-s stochastic decision tree, our algorithm runs in time n^{O(log(s/ε)/ε²)} and returns a hypothesis with error within an additive 2η + ε of the Bayes optimal. An additive 2η is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of O(η) + ε was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.
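
To make the quantities in the abstract concrete, here is a minimal sketch that is not taken from the paper. It assumes a simplified view of a stochastic decision tree in which each leaf stores the probability p of outputting label 1, so the tree induces a function from inputs to [0, 1]. Under that assumption the Bayes optimal error over the uniform distribution is the average of min(p, 1 - p) across all inputs, and the stated guarantee is that the hypothesis has error at most this value plus 2η + ε. The tuple encoding and the names `TREE`, `prob_one`, `bayes_optimal_error`, and `error_bound` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical encoding (an assumption, not the paper's definition):
#   ("split", i, left, right) queries bit x[i]; 0 goes left, 1 goes right.
#   ("leaf", p) outputs label 1 with probability p.
TREE = ("split", 0,
        ("leaf", 0.1),
        ("split", 1, ("leaf", 0.5), ("leaf", 0.9)))


def prob_one(tree, x):
    """Return Pr[label = 1 | x] by following x down to a leaf."""
    node = tree
    while node[0] == "split":
        _, i, left, right = node
        node = right if x[i] else left
    return node[1]


def bayes_optimal_error(tree, n):
    """Average of min(p, 1 - p) over all 2^n inputs (uniform distribution)."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        p = prob_one(tree, x)
        total += min(p, 1 - p)
    return total / 2 ** n


def error_bound(opt, eta, eps):
    """Error level promised in the abstract: opt + 2*eta + eps."""
    return opt + 2 * eta + eps


if __name__ == "__main__":
    opt = bayes_optimal_error(TREE, n=2)
    print("Bayes optimal error:", opt)
    print("Guaranteed error level:", error_bound(opt, eta=0.05, eps=0.01))
```

The brute-force enumeration over all 2^n inputs is only practical for tiny n. It is meant to illustrate the definitions, not the learning algorithm itself.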

Subject Classification

ACM Subject Classification
  • Theory of computation → Boolean function learning

Keywords
  • Learning theory
  • decision trees
  • proper learning algorithms
  • adversarial noise

