Research Interests

Research Questions

I am a Machine Learning enthusiast looking to merge complex mathematical theory with real-world applications. My current research interests lie in the field of eXplainable Artificial Intelligence (XAI), whose goal is to make complex black-box models more interpretable. Indeed, although complex models such as Random Forests or Neural Networks attain high performance, they are not easily understood by humans. Depending on the task, not being able to explain predictions can be a considerable roadblock to model acceptance and deployment. Think, for instance, of safety-critical domains such as Aerospace, or of applications where human beings are impacted by decisions: Medicine, Banking, and Insurance.

Various post-hoc techniques have been proposed to gain insight into model behavior, notably Partial Dependence Plots (PDP), Permutation Feature Importance (PFI), SHAP, and Expected Gradients. Although these techniques hold the promise of "explaining" the behavior of any black-box model, fundamental questions remain to be investigated.

  1. What are the theoretical relationships between existing methods? What do they characterize about models? In what scenarios can we expect them to agree (or contradict one another)?
  2. When can practitioners trust (or distrust) the explanations provided? Without ground truths in explainability, it is hard to define accurate trustworthiness metrics.
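
As background for these questions, here is a minimal sketch of how such explanations are typically computed in practice, using scikit-learn's inspection utilities and the shap library (the dataset, model, and feature name are illustrative choices, not the ones from my work):

```python
# Illustrative sketch: three common post-hoc explanations of a fitted model.
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence, permutation_importance
import shap  # assumes the `shap` package is installed

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# PDP: average model output as one feature is varied over a grid.
pdp = partial_dependence(model, X, features=["MedInc"])

# PFI: drop in performance when a feature's values are shuffled.
pfi = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# SHAP: per-instance additive attributions (TreeExplainer suits forests).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])
```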

Contributions

My Doctoral degree is part of the DEEL research initiative and aims at answering these two research questions. The first question is tackled in our FDTrees paper, where various post-hoc explanations are unified through the lens of Functional Decompositions (sketched after the list below) and disagreements are shown to be caused by so-called Feature Interactions. This discovery clarifies the relationships between the various explainers. For the second question, we propose using Disagreements as a proxy for the trustworthiness of post-hoc explanations: the stronger the disagreements, the lower the trust. We define three critical types of disagreements that we recommend computing in any XAI pipeline:

  • Oversimplification Disagreement: the amount by which the PDP/SHAP/PFI explanations disagree because of feature interactions. This disagreement is reduced by using FDTrees.
  • Sub-sampling Disagreement: the stochasticity induced by providing subsamples of the data to the explainers rather than the whole dataset (see the sketch at the end of this section). The importance of accounting for this uncertainty is demonstrated by our FoolSHAP attack, which can make an unfair model look acceptable.
  • Under-specification Disagreement: the disagreements between models with equivalent performance, i.e., all models in the Rashomon Set. This methodology is introduced in our JMLR paper.
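
As a brief aside, the functional-decomposition view can be sketched in standard ANOVA-style notation (a simplification of the papers' actual setup): a black-box $f$ over $d$ features is written as a sum of components indexed by feature subsets,

$$ f(x) = \sum_{u \subseteq \{1,\dots,d\}} f_u(x_u), $$

where the components $f_u$ with $|u| \ge 2$ are the feature interactions. PDP, PFI, and SHAP aggregate these components differently, which is consistent with the finding that their disagreements stem from interactions.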

Each of these disagreements is elaborated on in a published article from my PhD thesis.
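
To make the sub-sampling disagreement concrete, here is a hedged sketch of one way to estimate it (the function, its parameters, and the use of shap.TreeExplainer are illustrative assumptions, not the exact procedure from the papers):

```python
# Hedged sketch: how unstable are SHAP attributions under sub-sampling?
# Re-run the explainer with different background subsamples and measure
# how much the attributions move across runs.
import numpy as np
import shap

def subsampling_disagreement(model, X, n_background=100, n_runs=5, seed=0):
    """X is a pandas DataFrame; model is a fitted tree-based estimator."""
    rng = np.random.default_rng(seed)
    attributions = []
    for _ in range(n_runs):
        # Draw a fresh background subsample for the explainer.
        idx = rng.choice(len(X), size=n_background, replace=False)
        explainer = shap.TreeExplainer(model, data=X.iloc[idx])
        attributions.append(explainer.shap_values(X.iloc[:200]))
    attributions = np.stack(attributions)  # (n_runs, n_instances, n_features)
    # Per-feature standard deviation across runs, averaged over instances:
    # large values flag attributions that cannot be taken at face value.
    return attributions.std(axis=0).mean(axis=0)
```

Large values would signal that the reported attributions depend heavily on which subsample the explainer saw, which is precisely the sensitivity that the FoolSHAP attack exploits.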