Reconstructing Decision Trees

Blanc, Guy; Lange, Jane; Tan, Li-Yang

doi:10.4230/LIPIcs.ICALP.2022.24

Abstract

We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where: - T has size S := s^O((log s)²/ε³); - dist(f,T) ≤ O(opt)+ε; - Every query to T is answered with poly((log s)/ε)⋅ log n queries to f and in poly((log s)/ε)⋅ n log n time. This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.

Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal. Degree vs. approximate degree and quantum implications of huang’s sensitivity theorem. arXiv preprint, abs/2010.12629, 2020. URL: http://arxiv.org/abs/2010.12629.
Nir Ailon, Bernard Chazelle, Seshadhri Comandur, and Ding Liu. Property-preserving data reconstruction. Algorithmica, 51(2):160-182, 2008.
Tim Austin and Terence Tao. Testability and repair of hereditary hypergraph properties. Random Structures & Algorithms, 36(4):373-463, 2010.
Arnab Bhattacharyya, Elena Grigorescu, Madhav Jha, Kyomin Jung, Sofya Raskhodnikova, and David Woodruff. Lower bounds for local monotonicity reconstruction from transitive-closure spanners. SIAM Journal on Discrete Mathematics, 26(2):618-646, 2012.
Eric Blais, Joshua Brody, and Kevin Matulef. Property testing lower bounds via communication complexity. computational complexity, 21(2):311-358, 2012.
Guy Blanc, Neha Gupta, Jane Lange, and Li-Yang Tan. Estimating decision tree learnability with polylogarithmic sample complexity. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS), 2020.
Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan. Properly learning decision trees in almost polynomial time. In Proceedings of the 62nd Annual Symposium on Foundations of Computer Science (FOCS), 2021.
Avirm Blum, Merrick Furst, Jeffrey Jackson, Michael Kearns, Yishay Mansour, and Steven Rudich. Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing (STOC), pages 253-262, 1994.
Avrim Blum. Rank-r decision trees are a subclass of r-decision lists. Inform. Process. Lett., 42(4):183-185, 1992. URL: https://doi.org/10.1016/0020-0190(92)90237-P.
Avrim Blum and Lunjia Hu. Active tolerant testing. In Proceedings of the 31st Conference On Learning Theory (COLT), volume 75, pages 474-497, 2018.
Zvika Brakerski. Local property restoring, 2008. Manuscript.
Joshua Brody and Pooya Hatami. Distance-sensitive property testing lower bounds. CoRR, abs/1304.6685, 2013. URL: http://arxiv.org/abs/1304.6685.
Alon Brutzkus, Amit Daniely, and Eran Malach. ID3 learns juntas for smoothed product distributions. In Proceedings of the 33rd Annual Conference on Learning Theory (COLT), pages 902-915, 2020.
Nader Bshouty. Exact learning via the monotone theory. In Proceedings of 34th Annual Symposium on Foundations of Computer Science (FOCS), pages 302-311, 1993.
Nader Bshouty. Almost optimal testers for concise representations. In Proceedings of the 24th International Conference on Randomization and Computation (RANDOM), pages 5:1-5:20, 2020.
Nader H. Bshouty and Catherine A. Haddad-Zaknoon. On learning and testing decision tree. ArXiv preprint, abs/2108.04587, 2021. URL: http://arxiv.org/abs/2108.04587.
Andrea Campagna, Alan Guo, and Ronitt Rubinfeld. Local reconstructors and tolerant testers for connectivity and diameter. In Proceedings of the 17th International Workshop on Randomization and Computation (RANDOM), pages 411-424, 2013.
Sourav Chakraborty, Eldar Fischer, and Arie Matsliah. Query complexity lower bounds for reconstruction of codes. Theory of Computing, 10(19):515-533, 2014. URL: https://doi.org/10.4086/toc.2014.v010a019.
Sourav Chakraborty, David García-Soriano, and Arie Matsliah. Efficient sample extractors for juntas with applications. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 545-556, 2011.
Sourav Chakraborty, David García-Soriano, and Arie Matsliah. Nearly tight bounds for testing function isomorphism. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1683-1702, 2011.
Bernard Chazelle and C Seshadhri. Online geometric reconstruction. In Proceedings of the 22nd Annual Symposium on Computational Geometry (SoCG), pages 386-394, 2006.
Sitan Chen and Ankur Moitra. Beyond the low-degree algorithm: mixtures of subcubes and their applications. In Proceedings of the 51st Annual ACM Symposium on Theory of Computing (STOC), pages 869-880, 2019.
Ilias Diakonikolas, Homin Lee, Kevin Matulef, Krzysztof Onak, Ronitt Rubinfeld, Rocco Servedio, and Andrew Wan. Testing for concise representations. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 549-558, 2007.
Andrzej Ehrenfeucht and David Haussler. Learning decision trees from random examples. Information and Computation, 82(3):231-246, 1989.
Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. Journal of the ACM, 45:653-750, 1998.
Thomas Hancock. Learning kμ decision trees on the uniform distribution. In Proceedings of the 6th Annual Conference on Computational Learning Theory (COT), pages 352-360, 1993.
Thomas Hancock, Tao Jiang, Ming Li, and John Tromp. Lower bounds on learning decision lists and trees. Information and Computation, 126(2):114-122, 1996.
Elad Hazan, Adam Klivans, and Yang Yuan. Hyperparameter optimization: A spectral approach. Proceedings of the 6th International Conference on Learning Representations (ICLR), 2018.
Jeffrey C. Jackson and Rocco A. Servedio. On learning random dnf formulas under the uniform distribution. Theory of Computing, 2(8):147-172, 2006. URL: https://doi.org/10.4086/toc.2006.v002a008.
Madhav Jha and Sofya Raskhodnikova. Testing and reconstruction of lipschitz functions with applications to data privacy. SIAM Journal on Computing, 42(2):700-731, 2013.
Adam Kalai, Alex Samorodnitsky, and Shang-Hua Teng. Learning and smoothed analysis. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 395-404, 2009.
Satyen Kale, Yuval Peres, and Comandur Seshadhri. Noise tolerance of expanders and sublinear expansion reconstruction. SIAM Journal on Computing, 42(1):305-323, 2013.
Michael Kearns and Dana Ron. Testing problems with sublearning sample complexity. Journal of Computer and System Sciences, 61(3):428-456, 2000.
Adam Klivans and Rocco Servedio. Toward attribute efficient learning of decision lists and parities. Journal of Machine Learning Research, 7(Apr):587-602, 2006.
Weihao Kong and Gregory Valiant. Estimating learnability in the sublinear data regime. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NeurIPS), pages 5460-5469, 2018.
Dinesh Mehta and Vijay Raghavan. Decision tree approximations of boolean functions. Theoretical Computer Science, 270(1-2):609-623, 2002.
Gatis Midrijānis. Exact quantum query complexity for total boolean functions. arXiv preprint, quant-ph/0403168, 2004. URL: http://arxiv.org/abs/quant-ph/0403168.
Ryan O'Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014.
Ryan O'Donnell and Rocco Servedio. Learning monotone decision trees in polynomial time. SIAM Journal on Computing, 37(3):827-844, 2007.
Michal Parnas, Dana Ron, and Ronitt Rubinfeld. Tolerant property testing and distance approximation. Journal of Computer and System Sciences, 72(6):1012-1042, 2006.
Ronald Rivest. Learning decision lists. Machine learning, 2(3):229-246, 1987.
Ronitt Rubinfeld, Gil Tamir, Shai Vardi, and Ning Xie. Fast local computation algorithms. In Proceedings of the 2nd Symposium on Innovations in Computer Science (ICS), pages 223-238, 2011.
Michael Saks and Comandur Seshadhri. Local monotonicity reconstruction. SIAM Journal on Computing, 39(7):2897-2926, 2010.
Avishay Tal. Properties and applications of boolean function composition. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science (ITCS), pages 441-454, 2013.

Reconstructing Decision Trees

Authors Guy Blanc, Jane Lange, Li-Yang Tan

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Reconstructing Decision Trees

Authors Guy Blanc, Jane Lange, Li-Yang Tan

File

Document Identifiers

Author Details

Funding

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message