OASIcs.SAIA.2024.12.pdf
- Filesize: 0.97 MB
- 23 pages
Modern deep learning methods have long been considered as black-boxes due to their opaque decision-making processes. Explainable Artificial Intelligence (XAI), however, has turned the tables: it provides insight into how these models work, promoting transparency that is crucial for accountability. Yet, recent developments in adversarial machine learning have highlighted vulnerabilities in XAI methods, raising concerns about security, reliability and trustworthiness, particularly in sensitive areas like healthcare and autonomous systems. Awareness of the potential risks associated with XAI is needed as its adoption increases, driven in part by the need to enhance compliance to regulations. This survey provides a holistic perspective on the security and safety landscape surrounding XAI, categorizing research on adversarial attacks against XAI and the misuse of explainability to enhance attacks on AI systems, such as evasion and privacy breaches. Our contribution includes identifying current insecurities in XAI and outlining future research directions in adversarial XAI. This work serves as an accessible foundation and outlook to recognize potential research gaps and define future directions. It identifies data modalities, such as time-series or graph data, and XAI methods that have not been extensively investigated for vulnerabilities in current research.
Feedback for Dagstuhl Publishing