- The paper challenges the conventional use of conditional expectations in Shapley value calculations, advocating unconditional expectations as the valid choice for causal attribution.
- It presents theoretical and numerical analyses showing that unconditional (marginal) expectations avoid spurious attributions and better capture causal influence.
- The study highlights practical implications for AI transparency and accountability through the integration of causal perspectives into feature relevance quantification.
Causal Considerations in Explainable AI: A Critical Analysis of Feature Relevance Quantification via Shapley Values
The paper undertakes a rigorous exploration of feature relevance quantification within the domain of explainable AI (XAI), focusing on the application of Shapley values. The authors examine which probability distribution should govern features that are dropped from a coalition, arguing that unconditional (marginal) expectations are the valid choice for causal attribution, in contrast to the observational conditional expectations underlying the theoretical formulation of the SHAP package.
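For concreteness, the distinction can be written out as follows. This is a minimal sketch in standard notation not fixed by the summary above: f is the model, N the full feature set, x the instance being explained, S a retained coalition and S̄ its complement.

```latex
% Shapley attribution of feature i at instance x, for a given value function v_x
\phi_i(x) = \sum_{S \subseteq N \setminus \{i\}}
            \frac{|S|!\,(|N|-|S|-1)!}{|N|!}
            \bigl( v_x(S \cup \{i\}) - v_x(S) \bigr)

% Conditional (observational) value function, as in SHAP's original formulation
v_x^{\mathrm{cond}}(S) = \mathbb{E}\!\left[ f(X) \mid X_S = x_S \right]

% Unconditional (marginal) value function advocated by the paper:
% dropped features keep their marginal distribution
v_x^{\mathrm{marg}}(S) = \mathbb{E}_{X_{\bar S}}\!\left[ f(x_S, X_{\bar S}) \right]
```

The two value functions coincide when the features are independent and differ otherwise, which is precisely where the debate arises.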
Core Contributions
The primary contribution of this work lies in its critique of existing methodologies for calculating feature relevance through Shapley values, specifically the claim that SHAP's theory is stated in terms of observational conditional expectations when it should be stated in terms of unconditional expectations. Drawing on causal inference, notably Pearl's framework of interventions, the authors argue that attribution based on marginal rather than conditional expectations corresponds more accurately to the causal effect of the model's inputs on its output.
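Read causally, and carrying over the notation from the block above, the marginal value function coincides with an intervention on the model's inputs: the prediction is a deterministic function of those inputs, and setting a subset of them by intervention leaves the remaining features with their marginal distribution. A hedged restatement:

```latex
% Intervening on the inputs X_S (Pearl's do-operator) removes their dependence
% on the remaining features, so the dropped features retain their marginals:
\mathbb{E}\!\left[ f(X) \mid do(X_S = x_S) \right]
  = \mathbb{E}_{X_{\bar S}}\!\left[ f(x_S, X_{\bar S}) \right]
  = v_x^{\mathrm{marg}}(S)

% Observational conditioning, by contrast, mixes in dependence among features:
\mathbb{E}\!\left[ f(X) \mid X_S = x_S \right] \neq v_x^{\mathrm{marg}}(S)
\quad \text{in general, when } X_S \text{ and } X_{\bar S} \text{ are dependent}
```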
Theoretical and Numerical Analysis
A significant portion of the discussion elaborates on the theoretical underpinnings of feature relevance attribution methods, including integrated gradients and Shapley values. The paper posits that, while both conditional and unconditional expectations can be used to define simplified versions of the model on a subset of inputs, the latter provides the more coherent logical foundation for a causal interpretation. In particular, it points to a sensitivity issue with conditional expectations: features the model never uses can receive non-zero attribution merely because they are correlated with features it does use.
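A worked toy case illustrates the sensitivity issue (the setting is illustrative, not taken from the paper): let (X_1, X_2) be standard bivariate Gaussian with correlation ρ and let the model be f(x_1, x_2) = w_1 x_1, so X_2 is irrelevant to the model.

```latex
% Value functions for the coalition containing only the irrelevant feature X_2:
v_x^{\mathrm{marg}}(\{2\}) = w_1\,\mathbb{E}[X_1] = 0,
\qquad
v_x^{\mathrm{cond}}(\{2\}) = w_1\,\mathbb{E}[X_1 \mid X_2 = x_2] = w_1 \rho\, x_2

% Shapley attribution of X_2 in the two-feature case
% (the second bracket vanishes because f does not depend on x_2):
\phi_2^{\mathrm{cond}}(x)
  = \tfrac{1}{2}\Bigl[\bigl(v_x^{\mathrm{cond}}(\{2\}) - v_x^{\mathrm{cond}}(\emptyset)\bigr)
                    + \bigl(v_x^{\mathrm{cond}}(\{1,2\}) - v_x^{\mathrm{cond}}(\{1\})\bigr)\Bigr]
  = \tfrac{1}{2} w_1 \rho\, x_2

% whereas the marginal value function yields  \phi_2^{\mathrm{marg}}(x) = 0.
```

Under conditional expectations, the irrelevant feature X_2 thus receives non-zero credit whenever ρ and x_2 are non-zero; under marginal expectations it receives none.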
Numerically, the authors provide supporting evidence from experiments in which unconditional expectations recover known ground-truth attributions more accurately than conditional ones. The simulations cover scenarios with linear functions, where ground-truth attributions are available in closed form, as well as real-world data, reinforcing the theoretical claims.
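The toy case above can also be checked numerically. The sketch below is not a reproduction of the paper's experiments; it assumes two correlated Gaussian features, a linear model that ignores the second one, and a crude rejection-style approximation of the conditional value function, as noted in the comments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setting: correlated Gaussian features, model ignores the second one.
rho = 0.9
w = np.array([2.0, 0.0])                      # ground-truth coefficients; x2 is irrelevant
cov = np.array([[1.0, rho], [rho, 1.0]])
background = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)

def f(X):
    return X @ w

def v_marginal(S, x, background):
    """Value function with dropped features drawn from their marginal distribution."""
    Z = background.copy()
    if len(S):
        Z[:, S] = x[S]                        # fix retained features at the explained instance
    return f(Z).mean()

def v_conditional(S, x, background, tol=0.05):
    """Crude conditional value function: average f over background points whose
    retained features lie close to the explained instance."""
    if len(S) == 0:
        return f(background).mean()
    mask = np.all(np.abs(background[:, S] - x[S]) < tol, axis=1)
    Z = background[mask].copy()
    Z[:, S] = x[S]
    return f(Z).mean()

def shapley_two_features(v, x, background):
    """Exact two-feature Shapley values for a given value function."""
    v0, v1, v2, v12 = (v(S, x, background) for S in ([], [0], [1], [0, 1]))
    phi1 = 0.5 * ((v1 - v0) + (v12 - v2))
    phi2 = 0.5 * ((v2 - v0) + (v12 - v1))
    return phi1, phi2

x = np.array([1.0, 1.0])
print("marginal   :", shapley_two_features(v_marginal, x, background))
print("conditional:", shapley_two_features(v_conditional, x, background))
# Expected (analytically): marginal ~ (2.0, 0.0); conditional ~ (1.1, 0.9),
# i.e. the irrelevant feature x2 picks up credit purely through correlation.
```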
Critique of Existing Approaches
The paper critiques SHAP's original reliance on conditional expectations for potentially misrepresenting the causal relevance of features, arguing that the intended causal interventions are better captured by unconditional expectations. The authors note that this marginal approach is already approximately mirrored in SHAP's practical implementation, which typically imputes dropped features from a background sample rather than from a true conditional distribution, even though the accompanying theory is stated in conditional terms. Furthermore, they engage with Sundararajan et al.'s critique of the symmetry property of attribution methods, defending symmetry within their framework under a causal rationale.
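To make the point about practical implementations concrete, the following hedged sketch uses shap.KernelExplainer, which imputes dropped features with values drawn from a user-supplied background sample, roughly the marginal approach described above. The model and data are illustrative assumptions, not material from the paper, and the exact return type of shap_values may vary across shap versions.

```python
import numpy as np
import shap
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Illustrative data (assumed): two correlated features, the second never used by the target.
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=500)
y = 2.0 * X[:, 0]

model = LinearRegression().fit(X, y)

# KernelExplainer replaces "dropped" features with values from the background
# sample, which corresponds to the marginal (unconditional) expectation.
background = X[:100]
explainer = shap.KernelExplainer(model.predict, background)
phi = explainer.shap_values(np.array([[1.0, 1.0]]))
print(phi)  # attribution for the unused second feature should be close to 0
```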
Implications for Explainable AI
This paper’s findings have practical implications for the development and deployment of AI systems that rely on transparency for interpretability. Correctly quantifying feature relevance is essential not only for robustness against misinterpretation but also for ethical and legal accountability in algorithmic decision-making. By aligning feature attribution with causal inference principles, the work clarifies how input features causally affect model outputs, which in turn improves trustworthiness.
Future Developments
The discussion suggests that adopting unconditional expectations grounded in a causal perspective could improve tools like SHAP by sharpening the understanding of which features genuinely influence the model output. Such refinements could lead to more reliable XAI systems that users can interpret accurately, strengthening the case for a shift towards causal methods in feature attribution analysis.
In conclusion, this paper critically advances the discourse in explainable AI by aligning feature attribution mechanisms with causal inference principles. It paves the way for future research to explore deeper integration of causality within the XAI framework, potentially catalyzing improvements in the reliability and interpretability of AI systems across diverse domains.