Course objective
Course Summary:
This course explores Explainable Artificial Intelligence (XAI), a crucial subfield of machine learning dedicated to enhancing the transparency of complex models. While modern AI systems, particularly Deep Neural Networks (DNNs) and Foundation Models, achieve state-of-the-art performance, their black-box nature makes it challenging to understand the reasoning behind their predictions. This lack of interpretability raises concerns about trust, accountability, and the ability to extract meaningful insights from these models.
The course examines two key perspectives in XAI:
1. The argument for using inherently interpretable models in high-stakes domains such as healthcare and finance.
2. The development of post hoc explanation techniques that provide insight into complex models after training.
Students will engage with a variety of state-of-the-art XAI methods across multiple modalities, including computer vision, audio processing, and natural language processing (NLP). Topics covered include attribution techniques, sensitivity analysis, Concept Bottleneck Models, Concept Activation Vectors (CAVs), and Counterfactual Explanations. Through hands-on exercises, students will gain practical experience applying XAI techniques, equipping them to enhance transparency and interpretability across diverse AI applications.
Context and Motivation:
With the growing use of AI and machine learning in critical fields such as healthcare, finance, autonomous vehicles, and security, understanding how these systems arrive at decisions is crucial. Explainability in AI improves model transparency and enhances fairness and accountability. While explainability has traditionally focused on tabular data, the increasing complexity of multimodal systems, which combine vision, audio, and text, necessitates the development of advanced XAI techniques that cater to these diverse inputs.
In addition, the European Union's AI Act seeks to regulate AI systems according to their level of risk, requiring high-risk applications (such as healthcare, finance, and law enforcement) to comply with strict transparency and explainability standards. Explainability is central to ensuring AI systems are accountable, fair, and aligned with human values, particularly in sensitive domains where opaque decision-making can lead to unintended biases or ethical concerns.
This course directly addresses these challenges by equipping students with the theoretical foundations and practical skills needed to interpret AI models. By exploring interpretable models and post hoc explainability methods, students will be prepared to contribute to the development of compliant, trustworthy, and socially responsible AI systems.
Course schedule
Week 1: Introduction to Explainable AI (XAI)
● Overview of XAI:
○ What is Explainable AI?
○ Importance and challenges of interpretability
○ Real-world applications of XAI (e.g., healthcare, finance, autonomous systems)
○ Introduction to different types of explanations in AI
● LIME & SHAP (post-hoc explanation):
○ Understanding Local Interpretable Model-agnostic Explanations (LIME) [2]
○ Introduction to SHAP (SHapley Additive exPlanations) [3] and Kernel SHAP
● Hands-on Tutorial: Implementing LIME and SHAP on vision tasks (TP on vision; see the sketch below)
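As a preview of this hands-on session, here is a minimal sketch of the local-surrogate idea behind LIME on an image: superpixels are randomly switched off, the model is queried on the perturbed images, and a weighted linear model is fitted to the responses. The toy_classifier, the superpixel count, and the kernel width are illustrative placeholders; in the tutorial a real vision model (and, typically, the lime and shap libraries) would be used instead.

    # Minimal LIME-style local surrogate (illustrative sketch, not the lime package itself).
    # `classifier_fn` maps a batch of images (N, H, W, 3) to class probabilities (N, num_classes).
    import numpy as np
    from skimage.segmentation import slic
    from sklearn.linear_model import Ridge

    def lime_style_explanation(image, classifier_fn, target_class, num_samples=500, seed=0):
        rng = np.random.default_rng(seed)
        segments = slic(image, n_segments=50, compactness=10)  # superpixel segmentation
        n_seg = segments.max() + 1
        masks = rng.integers(0, 2, size=(num_samples, n_seg))  # random on/off pattern per sample
        baseline = image.mean(axis=(0, 1))                     # color used to hide a superpixel
        perturbed = []
        for m in masks:
            img = image.copy()
            img[~m[segments].astype(bool)] = baseline          # switch off the selected superpixels
            perturbed.append(img)
        probs = classifier_fn(np.stack(perturbed))[:, target_class]
        # Weight each sample by its proximity to the original image (fraction of superpixels kept).
        distances = 1.0 - masks.mean(axis=1)
        weights = np.exp(-(distances ** 2) / 0.25)
        surrogate = Ridge(alpha=1.0)                           # interpretable local surrogate
        surrogate.fit(masks, probs, sample_weight=weights)
        return segments, surrogate.coef_                       # per-superpixel importance scores

    def toy_classifier(batch):
        # Placeholder model: the "class 1" score grows with the brightness of the top half.
        score = batch[:, : batch.shape[1] // 2].mean(axis=(1, 2, 3))
        return np.stack([1.0 - score, score], axis=1)

    image = np.random.default_rng(1).uniform(size=(64, 64, 3))
    segments, importances = lime_style_explanation(image, toy_classifier, target_class=1)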
Week 2: Attribution Methods
● Attribution Methods (CAM, Grad-CAM, and gradient-based post-hoc explanations):
○ Class Activation Mapping (CAM)
○ Gradient-weighted Class Activation Mapping (Grad-CAM) [1]
○ Application to convolutional neural networks (CNNs)
○ Integrated Gradients
● Hands-on Tutorial: Using CAM and Grad-CAM to interpret vision models (TP on vision; see the sketch below)
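Below is a minimal Grad-CAM sketch as a preview of this session: activations and gradients of the last convolutional block are captured with hooks, the gradients are global-average-pooled into channel weights, and the weighted activation map is upsampled to the input size. The untrained resnet18 backbone and the random input are placeholders (the reference implementation linked in [1] covers the full method), and a recent torchvision (for the weights argument) is assumed.

    # Minimal Grad-CAM sketch on torchvision's resnet18 (untrained here, as a placeholder).
    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    model = resnet18(weights=None).eval()        # in practice: load trained weights
    activations, gradients = [], []

    def forward_hook(module, inputs, output):
        activations.append(output)                                  # feature maps of layer4
        output.register_hook(lambda grad: gradients.append(grad))   # and their gradients

    model.layer4.register_forward_hook(forward_hook)

    x = torch.randn(1, 3, 224, 224)              # placeholder input image
    logits = model(x)
    class_idx = logits.argmax(dim=1).item()
    logits[0, class_idx].backward()              # backpropagate the chosen class score

    acts, grads = activations[0], gradients[0]                # both of shape (1, C, 7, 7)
    weights = grads.mean(dim=(2, 3), keepdim=True)            # global-average-pooled gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))   # weighted sum of feature maps
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heatmap normalized to [0, 1]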
Week 3: Variance-based Sensitivity Analysis
● Variance-based Sensitivity Analysis (post-hoc explanation):
○ Overview of sensitivity analysis methods
○ Mathematical foundations and applications
○ Review of key articles on variance-based methods for XAI [4][5]
● Hands-on Tutorial: Applying variance-based sensitivity analysis on audio models (TP on audio; see the sketch below)
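The sketch below illustrates the variance-based view with a Monte Carlo estimator of first-order Sobol' indices (pick-freeze scheme, as in [4, 5]). The toy function f is a stand-in for the quantity of interest; in the audio TP it would typically be the model's score under perturbations of input regions. Sample sizes and the toy function are illustrative choices.

    # Minimal Monte Carlo estimator of first-order Sobol' indices (pick-freeze scheme).
    import numpy as np

    def f(X):
        # Toy model: the first input dominates, the third is irrelevant.
        return 2.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.0 * X[:, 2]

    def first_order_sobol(f, dim, n=20_000, seed=0):
        rng = np.random.default_rng(seed)
        A = rng.uniform(0, 1, size=(n, dim))     # two independent sample matrices
        B = rng.uniform(0, 1, size=(n, dim))
        fA, fB = f(A), f(B)
        var = np.var(np.concatenate([fA, fB]))   # total output variance
        indices = []
        for i in range(dim):
            ABi = A.copy()
            ABi[:, i] = B[:, i]                  # "freeze" every input except the i-th
            indices.append(np.mean(fB * (f(ABi) - fA)) / var)
        return np.array(indices)

    print(first_order_sobol(f, dim=3))           # roughly [large, small, ~0]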
Week 4: Counterfactual Explanations
● Understanding Counterfactuals:
○ Definition and theory of counterfactual explanations [6]
○ Application in AI models for decision-making transparency
● Hands-on Tutorial: Counterfactual explanations for audio-based AI systems (TP on audio; see the sketch below)
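As a preview, here is a simplified gradient-based counterfactual search for a differentiable classifier: the input is optimized so that the prediction flips to a target class while an L1 penalty keeps it close to the original. This shows the basic idea only; the DiCE method of [6] additionally enforces diversity and plausibility. The toy linear model, step count, and penalty weight are illustrative placeholders.

    # Simplified gradient-based counterfactual search (sketch).
    import torch

    def counterfactual(model, x, target_class, steps=200, lr=0.05, dist_weight=0.1):
        x_cf = x.clone().detach().requires_grad_(True)
        optimizer = torch.optim.Adam([x_cf], lr=lr)
        for _ in range(steps):
            optimizer.zero_grad()
            logits = model(x_cf)
            # Push the prediction toward the target class while staying close to x.
            loss = torch.nn.functional.cross_entropy(
                logits, torch.tensor([target_class])
            ) + dist_weight * torch.norm(x_cf - x, p=1)
            loss.backward()
            optimizer.step()
        return x_cf.detach()

    # Usage with a toy linear classifier (placeholder for the real audio model).
    model = torch.nn.Linear(10, 2)
    x = torch.randn(1, 10)
    x_cf = counterfactual(model, x, target_class=1)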
Week 5: Concept Bottleneck Models
● Concept Bottleneck Models:
○ Overview of Concept Bottleneck Models and their role in improving interpretability [8,9]
○ Using intermediate concepts to explain final predictions
● Hands-on Tutorial: Implementing and evaluating Concept Bottleneck Models for vision-based tasks (TP on vision; see the sketch below)
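A minimal Concept Bottleneck Model in the spirit of [8] is sketched below: the input is first mapped to human-interpretable concept scores, and the label is predicted from those concepts only, so intermediate concepts can be inspected or corrected. Layer sizes, the precomputed-feature input, and the joint loss weighting are illustrative choices.

    # Minimal Concept Bottleneck Model (sketch).
    import torch
    import torch.nn as nn

    class ConceptBottleneckModel(nn.Module):
        def __init__(self, in_dim, n_concepts, n_classes):
            super().__init__()
            self.concept_net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                             nn.Linear(128, n_concepts))
            self.label_net = nn.Linear(n_concepts, n_classes)  # label depends only on concepts

        def forward(self, x):
            concept_logits = self.concept_net(x)
            label_logits = self.label_net(torch.sigmoid(concept_logits))
            return concept_logits, label_logits

    model = ConceptBottleneckModel(in_dim=512, n_concepts=10, n_classes=5)
    x = torch.randn(4, 512)                        # e.g. precomputed image features
    c_true = torch.randint(0, 2, (4, 10)).float()  # binary concept annotations
    y_true = torch.randint(0, 5, (4,))
    c_logits, y_logits = model(x)
    # Joint training: supervise the concepts and the label at the same time.
    loss = nn.functional.binary_cross_entropy_with_logits(c_logits, c_true) \
           + nn.functional.cross_entropy(y_logits, y_true)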
Week 6: Concept Activation Vectors (CAVs)
● Concept Activation Vectors:
○ Introduction to CAVs for interpreting deep learning models [10]
○ How CAVs connect human-understandable concepts to neural network activations
● Hands-on Tutorial: Using CAVs to interpret vision-language models (TP on vision-NLP; see the sketch below)
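The sketch below shows the core idea behind Concept Activation Vectors: a linear classifier is fitted to separate layer activations of concept examples from random examples, its normalized normal vector is the CAV, and a TCAV-style score is the fraction of inputs whose class-logit gradient points along that direction. The activation and gradient arrays here are random placeholders; in the tutorial they would be extracted from a chosen layer of the vision-language model.

    # Minimal CAV computation on placeholder activations (sketch).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    acts_concept = rng.normal(1.0, 1.0, size=(100, 512))   # activations of concept images (placeholder)
    acts_random = rng.normal(0.0, 1.0, size=(100, 512))    # activations of random images (placeholder)

    X = np.vstack([acts_concept, acts_random])
    y = np.array([1] * 100 + [0] * 100)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])       # the Concept Activation Vector

    # TCAV-style score: fraction of inputs whose class-logit gradient (w.r.t. the same
    # layer's activations) has a positive component along the concept direction.
    grads = rng.normal(0.2, 1.0, size=(50, 512))            # placeholder gradients
    tcav_score = float((grads @ cav > 0).mean())
    print(f"TCAV score: {tcav_score:.2f}")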
Week 7: Prototype Networks
● Prototype Networks:
○ Understanding Prototype Networks and how they improve explainability by leveraging representative prototypes for predictions [11]
● Hands-on Tutorial: Building and testing Prototype Networks for audio-related tasks (TP on audio; see the sketch below)
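Below is a minimal prototype head in the spirit of ProtoPNet [11]: learned prototype vectors are compared to every spatial position of a feature map, the best (max) similarity per prototype is kept as evidence, and the class scores are a linear function of that evidence. The backbone is omitted, and the feature shapes, prototype count, and similarity function are simplified illustrative choices.

    # Minimal prototype head over CNN features (sketch).
    import torch
    import torch.nn as nn

    class PrototypeHead(nn.Module):
        def __init__(self, feat_dim, n_prototypes, n_classes):
            super().__init__()
            self.prototypes = nn.Parameter(torch.randn(n_prototypes, feat_dim))
            self.classifier = nn.Linear(n_prototypes, n_classes, bias=False)

        def forward(self, feats):                    # feats: (B, C, H, W)
            B, C, H, W = feats.shape
            patches = feats.permute(0, 2, 3, 1).reshape(B, H * W, C)
            # Squared L2 distance between every spatial patch and every prototype.
            dists = ((patches.unsqueeze(2) - self.prototypes.view(1, 1, -1, C)) ** 2).sum(dim=-1)
            sims = torch.log((dists + 1) / (dists + 1e-4))   # ProtoPNet-style similarity
            max_sims = sims.max(dim=1).values                # best matching patch per prototype
            return self.classifier(max_sims), max_sims       # class logits + per-prototype evidence

    head = PrototypeHead(feat_dim=64, n_prototypes=20, n_classes=4)
    feats = torch.randn(2, 64, 8, 8)                 # e.g. spectrogram CNN features (placeholder)
    logits, evidence = head(feats)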
Week 8: Chain of Thought Reasoning
● Chain of Thought Reasoning:
○ How models can simulate human-like reasoning processes through chains of thought [12]
○ Importance of step-by-step explanations in complex decision-making systems
● Hands-on Tutorial: Applying Chain of Thought reasoning in NLP models (TP on NLP; see the sketch below)
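The sketch below only illustrates the prompt structure behind chain-of-thought prompting: the same question is asked directly and with a "Let's think step by step" cue so that intermediate reasoning becomes part of the output. The gpt2 checkpoint is a small placeholder that will not produce faithful reasoning chains; in practice an instruction-tuned LLM would be used.

    # Minimal chain-of-thought prompting sketch (model choice is a placeholder).
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    question = ("A bat and a ball cost 1.10 euros in total. The bat costs 1 euro more "
                "than the ball. How much does the ball cost?")

    direct_prompt = f"Question: {question}\nAnswer:"
    cot_prompt = f"Question: {question}\nLet's think step by step.\n"

    for name, prompt in [("direct", direct_prompt), ("chain-of-thought", cot_prompt)]:
        out = generator(prompt, max_new_tokens=80, do_sample=False)[0]["generated_text"]
        print(f"--- {name} ---\n{out}\n")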
Week 9: Knowledge Graphs in XAI
● Knowledge Graphs:
○ Introduction to knowledge graphs and their use in enhancing model interpretability [13]
○ Role of knowledge graphs in NLP and AI systems for structured explanations
● Hands-on Tutorial: Integrating knowledge graphs into NLP models (TP on NLP; see the sketch below)
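As a preview, the sketch below shows how a knowledge graph can back a prediction with a structured explanation, in the spirit of [13]: the justification is a path of typed relations between entities. The tiny toy graph, the entity names, and the explain helper are illustrative placeholders.

    # Minimal knowledge-graph explanation sketch.
    import networkx as nx

    kg = nx.DiGraph()
    kg.add_edge("aspirin", "inflammation", relation="treats")
    kg.add_edge("inflammation", "arthritis", relation="symptom_of")
    kg.add_edge("aspirin", "COX-1", relation="inhibits")

    def explain(kg, subject, prediction):
        # Justify a model output (e.g. "aspirin is relevant for arthritis") with a relation path.
        path = nx.shortest_path(kg, source=subject, target=prediction)
        steps = [f"{u} --{kg.edges[u, v]['relation']}--> {v}" for u, v in zip(path, path[1:])]
        return " ; ".join(steps)

    print(explain(kg, "aspirin", "arthritis"))
    # aspirin --treats--> inflammation ; inflammation --symptom_of--> arthritis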
Week 10: Final Project and Exam
● Project and Exam:
○ Final exam (1h)
○ Students work on a final project to apply multiple XAI techniques across a multimodal dataset (2h)
Assessment
Examination and Grading:
● Final Project (50%): Students will work on a final project that demonstrates their ability to apply XAI methods to multimodal data (vision, audio, NLP).
● Final Exam (50%): A written exam covering the theoretical foundations of the XAI techniques learned in class.
References
Required Reading and Resources:
1. https://github.com/jacobgil/pytorch-grad-cam
2. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why should I trust you?": Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
3. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems (NeurIPS).
4. Sobol, I. M. (2001). Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation, 55(1-3), 271-280.
5. Fel, T., Cadène, R., Chalvidal, M., Cord, M., Vigouroux, D., & Serre, T. (2021). Look at the variance! Efficient black-box explanations with Sobol-based sensitivity analysis. Advances in Neural Information Processing Systems, 34, 26005-26014.
6. Mothilal, R. K., Sharma, A., & Tan, C. (2020). Explaining machine learning classifiers through diverse counterfactual explanations. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency.
7. Agerri, R., & García-Serrano, A. (2021). Multimodal deep learning: A review of the state of the art. arXiv:2111.04138.
8. Koh, P. W., et al. (2020). Concept bottleneck models. International Conference on Machine Learning (ICML), PMLR.
9. Kazmierczak, R., Berthier, E., Frehse, G., & Franchi, G. (2023). CLIP-QDA: An explainable concept bottleneck model. arXiv:2312.00110.
10. Parekh, J., Khayatan, P., Shukor, M., Newson, A., & Cord, M. (2024). A concept-based explainability framework for large multimodal models. Advances in Neural Information Processing Systems, 37, 135783-135818.
11. Donnelly, J., Barnett, A. J., & Chen, C. (2022). Deformable ProtoPNet: An interpretable image classifier using deformable prototypes. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
12. Saparov, A., & He, H. (2022). Language models are greedy reasoners: A systematic formal analysis of chain-of-thought. arXiv:2210.01240.
13. Tiddi, I., & Schlobach, S. (2022). Knowledge graphs as tools for explainable machine learning: A survey. Artificial Intelligence, 302, 103627.
14. https://christophm.github.io/interpretable-ml-book/
Topics covered
Connection to Other MVA Courses:
The MVA (Mathématiques, Vision, Apprentissage) program offers a wide range of deep learning courses applied to computer vision, natural language processing (NLP), and audio. It also includes courses on the theoretical foundations of deep learning, as well as Responsible Machine Learning and Bayesian Machine Learning, which provide essential background on trustworthy AI. While our course aligns with these topics, it introduces a new perspective on explainability, exploring certain concepts that may be briefly touched upon in other courses but are examined in greater depth here.
Gianni FRANCHI
(ENSTA)
Mathieu FONTAINE
(Télécom)
Matthieu LABEAU
(Télécom)
Matthieu CORD
(Valeo & Sorbonne Université)