From Formal Boosted Tree Explanations to Interpretable Rule Sets

Authors: Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey


File

LIPIcs.CP.2023.38.pdf
  • Filesize: 3.56 MB
  • 21 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.CP.2023.38

Author Details

Jinqiang Yu
  • Department of Data Science and AI, Monash University, Clayton, Australia
  • Australian Research Council OPTIMA ITTC, Clayton, Australia
Alexey Ignatiev
  • Department of Data Science and AI, Monash University, Clayton, Australia
Peter J. Stuckey
  • Department of Data Science and AI, Monash University, Clayton, Australia
  • Australian Research Council OPTIMA ITTC, Clayton, Australia

Cite As

Jinqiang Yu, Alexey Ignatiev, and Peter J. Stuckey. From Formal Boosted Tree Explanations to Interpretable Rule Sets. In 29th International Conference on Principles and Practice of Constraint Programming (CP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 280, pp. 38:1-38:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.CP.2023.38

Abstract

The rapid rise of Artificial Intelligence (AI) and Machine Learning (ML) has created a pressing need for explainable AI (XAI). One of the most prominent approaches to XAI is to train rule-based ML models, e.g. decision trees, lists, and sets, which are deemed interpretable due to their transparent nature. Recent years have witnessed a large body of work on constraint- and reasoning-based approaches to the inference of interpretable models, in particular decision sets (DSes). Although these approaches have been shown to outperform heuristic ones in terms of accuracy, most of them suffer from scalability issues and often fail to handle large training data, in which case no solution is offered. Motivated by this limitation and by the success of gradient boosted trees, we propose a novel anytime approach to producing DSes that are both accurate and interpretable. The approach makes use of the concept of a generalized formal explanation and builds on recent advances in the formal explainability of gradient boosted trees. Experimental results obtained on a wide range of datasets demonstrate that our approach produces DSes that are more accurate than those of state-of-the-art algorithms and comparable with them in terms of explanation size.
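To make the high-level pipeline in the abstract concrete, below is a minimal, hypothetical sketch in Python. It is not the authors' implementation: `formal_explanation` is a placeholder standing in for the paper's formal (MaxSAT-based) reasoning over the boosted tree, and the loop shows only how per-instance explanations could be accumulated into the rules of a decision set in an anytime fashion.

```python
# Hypothetical sketch of the abstract's pipeline, NOT the paper's method:
# train a gradient boosted tree, then turn formal per-instance explanations
# into IF-THEN rules of a decision set.
from xgboost import XGBClassifier

def formal_explanation(model, instance):
    """Placeholder (assumption): return a minimal set of
    (feature_index, value) literals that by themselves fix the model's
    prediction on `instance`. The paper computes such explanations via
    formal reasoning (MaxSAT) over the boosted tree."""
    raise NotImplementedError

def anytime_decision_set(model, X):
    """Greedily explain uncovered training instances until all are
    covered. Interrupting the loop at any point still yields a valid
    partial rule set, which is what makes the procedure anytime."""
    rules = []
    uncovered = list(range(len(X)))
    while uncovered:
        i = uncovered[0]
        literals = formal_explanation(model, X[i])
        label = int(model.predict(X[i:i + 1])[0])
        rules.append((literals, label))  # IF all literals hold THEN label
        # The new rule covers every instance satisfying all its literals.
        uncovered = [j for j in uncovered
                     if not all(X[j][f] == v for f, v in literals)]
    return rules
```

The equality tests on literals assume binarized or categorical features; for numeric splits the literals would be threshold conditions instead. The actual approach also generalizes explanations beyond single instances, per the abstract's notion of a generalized formal explanation.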

Subject Classification

ACM Subject Classification
  • Computing methodologies → Machine learning
Keywords
  • Decision set
  • interpretable model
  • gradient boosted tree
  • BT compilation

