In the field of explainable artificial intelligence (XAI), a significant challenge is determining how well various methods and metrics measure the "faithfulness" of post-hoc explanations of AI model predictions. These explanations assign importance scores to input features to clarify how a model arrives at its conclusions. However, the many metrics available for assessing faithfulness often conflict with one another, leaving practitioners unsure which to employ to confirm that an explanation accurately represents a model's reasoning.
To address this issue, a paper was published focusing on the discrepancies among current faithfulness metrics. The researchers applied various local explanation methods to linear and non-linear models across multiple datasets. Local explanation methods generate attributions for individual predictions, allowing users to understand the rationale behind specific decisions made by the model. The paper covered popular attribution techniques such as Deep SHAP, KernelSHAP, and Integrated Gradients, each paired with different baselines to account for the variability those baselines introduce. The metrics assessed also included novel approaches based on ablation and topological data analysis (TDA).
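To make the role of the baseline concrete, the sketch below computes Integrated Gradients attributions for a single instance under two different baselines. It is a minimal illustration only: the toy PyTorch model, the constant 0.5 "mean" stand-in, and the use of the Captum library are assumptions for this example, not the paper's actual setup.

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Toy non-linear model on 4 tabular features (illustrative, not the paper's model).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
model.eval()

x = torch.rand(1, 4)                      # a single instance to explain
ig = IntegratedGradients(model)

# The same attribution method can yield different importance scores
# depending on the baseline it integrates from.
zero_baseline = torch.zeros_like(x)        # all-zeros reference input
mean_baseline = torch.full_like(x, 0.5)    # stand-in for a dataset-mean reference

attr_zero = ig.attribute(x, baselines=zero_baseline)
attr_mean = ig.attribute(x, baselines=mean_baseline)
print(attr_zero)
print(attr_mean)
```

Varying the baseline in this way is one of the sources of variability the paper's comparisons account for.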
The paper provided an extensive comparison of faithfulness metrics by ranking the candidate explanations according to each metric's assessment. Ideally, if all metrics were aligned, the rankings would converge, but the paper's findings showed otherwise: there was little consensus on which set of explanations could be considered the most faithful. Perturbation-based metrics, in particular, demonstrated variability due to their sensitivity to parameter choices, such as the perturbation method and which features to perturb (a simple example of such a metric is sketched below). These findings suggest that the selection of faithfulness metrics may be highly contextual and that more research is required to determine the most appropriate metrics for specific use cases.
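To illustrate why perturbation-based metrics are sensitive to their parameters, here is a hypothetical deletion-style check in the same spirit; the function name, the zero-fill perturbation, and the top-k cutoff are illustrative choices, not the paper's metric.

```python
import numpy as np

def deletion_check(predict, x, attributions, k, fill_value=0.0):
    """Illustrative perturbation-based faithfulness check (not the paper's exact metric):
    remove the k features ranked most important by the attributions and measure the
    resulting change in the model's prediction."""
    top_k = np.argsort(-np.abs(attributions))[:k]  # indices of the k largest attributions
    x_perturbed = x.copy()
    x_perturbed[top_k] = fill_value                # parameter choice: zero, mean, noise, ...
    return predict(x) - predict(x_perturbed)       # larger drop => attributions look more faithful

# Hypothetical usage with a toy linear model: different choices of k or fill_value
# can change which explanation scores best, mirroring the sensitivity the paper reports.
predict = lambda v: float(np.dot(v, [0.8, -0.2, 0.5, 0.1]))
x = np.array([1.0, 2.0, 0.5, 3.0])
attributions = np.array([0.8, -0.4, 0.25, 0.3])    # e.g., from a SHAP-style explainer
print(deletion_check(predict, x, attributions, k=2))
```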
In conclusion, the paper highlighted a gap between theoretical XAI metrics and their practical utility. The absence of agreement among measures of faithfulness can leave users without clear guidance on how to choose the most appropriate explanations for their AI models. This underscores the need for a more refined understanding of these metrics and their implications in practice. The paper urges the XAI community to take note of these divergences and to further investigate the development of more harmonized benchmarks that can provide consistent and reliable guidance for evaluating the faithfulness of AI explanations.