- The paper introduces L-Shapley and C-Shapley, two novel algorithms for efficiently approximating Shapley values to interpret models on structured data, particularly data whose features carry a graph structure.
- L-Shapley uses local neighborhoods within the data graph, while C-Shapley exploits connected components to significantly reduce the computational complexity of feature importance calculation.
- Empirical evaluation shows these methods are more efficient than existing techniques while maintaining interpretability on tasks like text and image classification.
Efficient Model Interpretation for Structured Data: L-Shapley and C-Shapley
The paper under consideration presents two novel algorithms, L-Shapley and C-Shapley, designed to improve the efficiency of model interpretation when dealing with structured data, particularly in graph-based settings. It addresses the challenge posed by the exponential computational complexity typically associated with Shapley values in feature importance scoring. By focusing on graph-structured data, it significantly reduces this computational burden, thereby making model interpretation more feasible for large datasets and complex models.
Background and Motivation
Traditional machine learning models, while accurate, often operate as "black boxes" that are difficult to interpret. The need for interpretability is crucial in applications requiring transparency, such as medicine or financial markets. Instancewise feature importance scoring methods offer insights into the contributions of different features to a model's prediction on a per-instance basis. Shapley values, derived from cooperative game theory, provide an axiomatic approach to fairly attributing importance among features. However, computing exact Shapley values requires evaluating the model on every subset of the features, so the cost grows exponentially with the number of features and quickly becomes infeasible.
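For reference, the exact Shapley value of feature $i$ under a value function $v$ (here, the model's score when only a subset of features is revealed) averages the feature's marginal contribution over all subsets of the remaining features:

$$
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(d-|S|-1)!}{d!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr],
$$

where $N = \{1, \dots, d\}$ is the full feature set. The sum runs over $2^{d-1}$ subsets, which is the source of the exponential cost that L-Shapley and C-Shapley are designed to avoid.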
Methodological Innovation
The research focuses on graph-structured data, where features are interrelated and their interactions can be represented as nodes and edges in a graph. In such settings, features that are close to each other in the graph typically interact more strongly than distant ones. The paper proposes two efficient algorithms:
- L-Shapley (Local Shapley): This method approximates the Shapley value of a feature by averaging its marginal contributions only over subsets drawn from the feature's local neighborhood in the graph, i.e., the features within a graph distance k of the target. This restriction reduces the number of model evaluations from exponential in the total number of features to exponential only in the (typically small) neighborhood size, so the overall cost scales linearly with the number of features for a fixed neighborhood radius (see the sketch following this list).
- C-Shapley (Connected Shapley): Building on the neighborhood idea, C-Shapley further restricts the sum to connected subsets of features within the neighborhood, cutting the number of model evaluations even more. The connection to the Myerson value from cooperative game theory is pivotal here: it justifies decomposing feature interactions along connected components of the graph, reinforcing the computational efficiency gains.
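To make the neighborhood restriction concrete, here is a minimal Python sketch of an L-Shapley-style estimate. It assumes a caller-supplied model_value(S) returning the model's score when only the features in S are kept (others masked or marginalized out) and a neighbors mapping from each feature to its k-hop neighborhood; both names are illustrative rather than the authors' code.

```python
from itertools import combinations
from math import comb

def l_shapley(model_value, neighbors, i):
    """Approximate the Shapley value of feature i using only its graph neighborhood.

    model_value(S): model score when only the features in set S are revealed.
    neighbors[i]:   the features within graph distance k of feature i.
    """
    nk = set(neighbors[i]) | {i}       # the k-neighborhood of i, including i itself
    others = sorted(nk - {i})
    m = len(nk)
    score = 0.0
    # Average i's marginal contribution over subsets of its neighborhood,
    # with Shapley weights computed as if the neighborhood were the full feature set.
    for r in range(len(others) + 1):
        for subset in combinations(others, r):
            s = set(subset)
            weight = 1.0 / (m * comb(m - 1, r))
            score += weight * (model_value(s | {i}) - model_value(s))
    return score
```

For a sentence modeled as a chain graph with k = 2, each word's neighborhood contains at most five words, so the loop touches at most 16 subsets per word instead of all 2^(d-1) subsets of the sentence; C-Shapley would shrink this further by keeping only the connected subsets.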
Theoretical Contributions
The paper provides a thorough theoretical analysis, establishing that under probabilistic assumptions on feature dependence (quantified through a mutual-information-based measure), the L-Shapley and C-Shapley estimates are provably close to the true Shapley values. It thereby introduces a useful framework for assessing model interpretability in graph-constrained settings, connecting model-based explanations directly to structural characteristics of the data.
Empirical Evaluation
The authors validate their algorithms on both text and image classification tasks. Interpretation quality is measured by how well each method identifies and scores the features that drive a prediction, for example by masking the top-ranked features and recording the change in the model's output (a sketch of one such metric follows below). The results indicate that L-Shapley and C-Shapley outperform existing methods such as SampleShapley, KernelSHAP, and LIME, both in efficiency and in the quality of the resulting explanations, even when the number of model evaluations is tightly constrained.
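As a concrete illustration, the sketch below implements a generic masking-based evaluation of the kind used in this literature: mask the features an attribution method ranks highest and record the drop in log-odds of the originally predicted class. The function and argument names (predict_proba, mask_value) are illustrative assumptions, not the paper's code.

```python
import numpy as np

def log_odds_drop(predict_proba, x, ranked_features, mask_value, top_m=5):
    """Drop in log-odds of the predicted class after masking the top_m
    features ranked most important by an attribution method."""
    probs = predict_proba(x)
    cls = int(np.argmax(probs))
    x_masked = np.array(x, copy=True)
    for j in ranked_features[:top_m]:
        x_masked[j] = mask_value        # e.g., a pad-token id or a mean pixel value
    p_before = float(np.clip(probs[cls], 1e-12, 1 - 1e-12))
    p_after = float(np.clip(predict_proba(x_masked)[cls], 1e-12, 1 - 1e-12))
    # A larger drop means the masked features mattered more to the prediction.
    return np.log(p_before / (1 - p_before)) - np.log(p_after / (1 - p_after))
```

Under a fixed budget of model evaluations, a better attribution method produces rankings whose top features cause a larger drop.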
Implications and Future Directions
This research offers practical utility in fields with structured data, including natural language processing and computer vision, where interpretability aligns with tangible graph structures (e.g., word order and syntactic dependencies in text, or the pixel grid in images). The proposed methods could ease the deployment of complex models in critical industries demanding clarity and justification of AI decision processes.
Future work could explore extending these concepts to other types of structured data, or integrating these efficient Shapley approximations with other interpretability frameworks. Moreover, examining the potential of these approaches in ensemble learning scenarios or more intricate neural architectures presents a valuable direction.
In summary, L-Shapley and C-Shapley present significant advancements in the domain of model interpretation. By exploiting the natural structure in data, these methods offer a robust approach to understanding intricate models without prohibitive computational costs, ensuring that the growing complexity of machine learning systems remains accessible and interpretable.