Anytime Approximate Formal Feature Attribution

Authors: Jinqiang Yu, Graham Farr, Alexey Ignatiev, Peter J. Stuckey




File

LIPIcs.SAT.2024.30.pdf
  • Filesize: 5.79 MB
  • 23 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.SAT.2024.30

Author Details

Jinqiang Yu
  • Department of Data Science and AI, Monash University, Melbourne, Australia
  • Australian Research Council OPTIMA ITTC, Melbourne, Australia
Graham Farr
  • Department of Data Science and AI, Monash University, Melbourne, Australia
Alexey Ignatiev
  • Department of Data Science and AI, Monash University, Melbourne, Australia
Peter J. Stuckey
  • Department of Data Science and AI, Monash University, Melbourne, Australia
  • Australian Research Council OPTIMA ITTC, Melbourne, Australia

Cite As

Jinqiang Yu, Graham Farr, Alexey Ignatiev, and Peter J. Stuckey. Anytime Approximate Formal Feature Attribution. In 27th International Conference on Theory and Applications of Satisfiability Testing (SAT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 305, pp. 30:1-30:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.SAT.2024.30

Abstract

The widespread use of artificial intelligence (AI) algorithms and machine learning (ML) models, combined with a number of crucial issues pertaining to them, warrants the need for explainable artificial intelligence (XAI). A key explainability question is: given that this decision was made, which input features contributed to it? Although a range of XAI approaches exist to tackle this problem, most of them have significant limitations. Heuristic XAI approaches suffer from the lack of quality guarantees, and often try to approximate Shapley values, which is not the same as explaining which features contribute to a decision. A recent alternative is so-called formal feature attribution (FFA), which defines feature importance as the fraction of formal abductive explanations (AXp’s) containing the given feature. This measures feature importance from the viewpoint of formally reasoning about the model’s behavior: given a system of constraints logically representing the ML model of interest, computing an AXp requires finding a minimal unsatisfiable subset (MUS) of the system. Computing FFA directly from its definition is challenging because it involves counting over all AXp’s (equivalently, over all MUSes), although it can be approximated. Based on these observations, this paper makes several contributions. First, it gives compelling evidence that computing FFA is intractable by proving that the problem is #P-hard, even if the set of contrastive formal explanations (CXp’s), which correspond to the minimal correction subsets (MCSes) of the logical system, is provided. Second, exploiting the duality between MUSes and MCSes, it proposes an efficient heuristic for switching from MCS enumeration to MUS enumeration on the fly, resulting in an adaptive explanation enumeration algorithm that effectively approximates FFA in an anytime fashion. Finally, experimental results obtained on a range of widely used datasets demonstrate the effectiveness of the proposed FFA approximation approach in terms of the approximation error as well as the number and diversity of the explanations computed within a fixed time limit.
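To make the FFA definition above concrete, the following Python sketch (an illustration under our own assumptions, not the authors' implementation) computes the attribution of each feature as the fraction of the AXp’s enumerated so far that contain it; the function name ffa_estimate and the toy AXp’s are hypothetical. In the anytime setting described in the abstract, such an estimate would be refreshed each time the underlying MUS/MCS enumeration reports a new explanation.

```python
# Illustrative sketch (not the paper's implementation): approximating formal
# feature attribution (FFA) from the abductive explanations (AXp's) collected
# so far. Each AXp is modeled as a frozenset of feature indices; the FFA of a
# feature is the fraction of AXp's that contain it.
from collections import Counter
from typing import Dict, FrozenSet, Iterable


def ffa_estimate(axps: Iterable[FrozenSet[int]], num_features: int) -> Dict[int, float]:
    """Approximate FFA from the AXp's enumerated so far.

    With the complete set of AXp's this is the exact FFA; with a partial set
    it is an anytime approximation that improves as more AXp's arrive.
    """
    axps = list(axps)
    if not axps:
        return {f: 0.0 for f in range(num_features)}
    counts = Counter(f for axp in axps for f in axp)
    return {f: counts[f] / len(axps) for f in range(num_features)}


# Hypothetical usage: three AXp's over a model with 4 features.
axps_so_far = [frozenset({0, 2}), frozenset({2, 3}), frozenset({1, 2})]
print(ffa_estimate(axps_so_far, num_features=4))
# -> {0: 0.333..., 1: 0.333..., 2: 1.0, 3: 0.333...}

# Hypothetical anytime loop: refresh the estimate each time an enumerator
# (e.g., a MUS/MCS enumerator wrapped to yield AXp's) reports a new AXp.
collected = []
for axp in axps_so_far:          # stand-in for a real enumeration procedure
    collected.append(axp)
    print(ffa_estimate(collected, num_features=4))
```

The point of the sketch is only that the approximation is cheap to maintain once explanations are available; the hard part addressed by the paper is deciding, on the fly, whether to spend the time budget enumerating MCSes or MUSes so that the resulting partial set of AXp’s yields a low-error estimate.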

Subject Classification

ACM Subject Classification
  • Theory of computation → Constraint and logic programming
  • Computing methodologies → Machine learning
Keywords
  • Explainable AI
  • Formal Feature Attribution
  • Minimal Unsatisfiable Subsets
  • MUS Enumeration

