shapiq: Shapley Interactions for Machine Learning (2410.01649v1)

Published 2 Oct 2024 in cs.LG and cs.AI

Abstract: Originally rooted in game theory, the Shapley Value (SV) has recently become an important tool in machine learning research. Perhaps most notably, it is used for feature attribution and data valuation in explainable artificial intelligence. Shapley Interactions (SIs) naturally extend the SV and address its limitations by assigning joint contributions to groups of entities, which enhance understanding of black box machine learning models. Due to the exponential complexity of computing SVs and SIs, various methods have been proposed that exploit structural assumptions or yield probabilistic estimates given limited resources. In this work, we introduce shapiq, an open-source Python package that unifies state-of-the-art algorithms to efficiently compute SVs and any-order SIs in an application-agnostic framework. Moreover, it includes a benchmarking suite containing 11 machine learning applications of SIs with pre-computed games and ground-truth values to systematically assess computational performance across domains. For practitioners, shapiq is able to explain and visualize any-order feature interactions in predictions of models, including vision transformers, LLMs, as well as XGBoost and LightGBM with TreeSHAP-IQ. With shapiq, we extend shap beyond feature attributions and consolidate the application of SVs and SIs in machine learning that facilitates future research. The source code and documentation are available at https://github.com/mmschlk/shapiq.

Summary

The paper introduces shapiq, a Python package that integrates state-of-the-art approximation algorithms to efficiently compute Shapley Values and Shapley Interactions in ML.
It details both exact and approximate computation methods, benchmarked using metrics like MSE and Precision@5 to validate performance and trade-offs.
The toolkit enhances model explainability by quantifying joint feature contributions, supporting advanced applications of cooperative game theory in AI.

Introduction

The paper "shapiq: Shapley Interactions for Machine Learning" presents shapiq, an open-source Python package for computing Shapley Values (SVs) and Shapley Interactions (SIs) efficiently and in an application-agnostic way. SVs are widely used for feature attribution and data valuation in Explainable AI (XAI), but the exponential complexity of computing them exactly limits their practical use. shapiq addresses this by unifying state-of-the-art approximation algorithms with tools for applying them in ML. The paper emphasizes that SIs improve understanding of complex ML models by assigning joint contributions to groups of features, going beyond per-feature attributions.

Figure 1: The shapiq Python package facilitates research on game theory for machine learning, including state-of-the-art approximation algorithms and pre-computed benchmarks. Moreover, it provides a simple interface for explaining predictions of machine learning models beyond feature attributions.

Theoretical Background

The SV is a central solution concept in cooperative game theory: it is the unique allocation scheme that satisfies the linearity, symmetry, dummy, and efficiency axioms. In ML, it is used to distribute a model's output fairly among input features. SIs extend this concept by quantifying the joint contributions of feature groups, which is necessary for capturing synergies and redundancies among features in complex models.
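
For reference, the standard definitions can be written out as follows. The notation is an assumption of this sketch and does not come from the summary: N is the set of n players (features) and ν is the value function of the game. The second formula is the pairwise Shapley Interaction Index, which is only one of several SI indices.

```latex
\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}
         \bigl[\nu(S \cup \{i\}) - \nu(S)\bigr]

I_{ij} = \sum_{S \subseteq N \setminus \{i,j\}} \frac{|S|!\,(n-|S|-2)!}{(n-1)!}
         \bigl[\nu(S \cup \{i,j\}) - \nu(S \cup \{i\}) - \nu(S \cup \{j\}) + \nu(S)\bigr]
```

The bracketed term in the first formula is the marginal contribution of player i to coalition S. The bracketed term in the second is the discrete derivative of the pair {i, j}, which is zero when the two players contribute independently of each other.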

Exact computation is challenging because the number of coalitions to evaluate grows exponentially: a game with n players has 2^n coalitions. shapiq exploits structural model properties (for example, tree structure via TreeSHAP-IQ) and stochastic approximation to estimate these values efficiently.
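
To make the exponential cost concrete, here is a minimal brute-force sketch in plain Python. It does not use shapiq and is not the package's implementation; the function names and the three-player `toy_game` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def subsets(players):
    """Yield every subset of `players` as a tuple (2**len(players) in total)."""
    for size in range(len(players) + 1):
        yield from combinations(players, size)


def exact_shapley_values(n, value):
    """Exact SVs by full enumeration; `value` maps a frozenset of players to a float."""
    phi = [0.0] * n
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for subset in subsets(others):
            s = frozenset(subset)
            weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
            phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def exact_pairwise_sii(n, value, i, j):
    """Exact pairwise Shapley Interaction Index for players i and j."""
    others = [p for p in range(n) if p not in (i, j)]
    total = 0.0
    for subset in subsets(others):
        s = frozenset(subset)
        weight = factorial(len(s)) * factorial(n - len(s) - 2) / factorial(n - 1)
        # Discrete derivative of the pair {i, j} at coalition s.
        delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
        total += weight * delta
    return total


def toy_game(coalition):
    """Hypothetical game: player 0 adds 1, player 1 adds 2, both together add 4 more.

    Player 2 never changes the worth, so it is a dummy player.
    """
    worth = 0.0
    if 0 in coalition:
        worth += 1.0
    if 1 in coalition:
        worth += 2.0
    if 0 in coalition and 1 in coalition:
        worth += 4.0
    return worth


if __name__ == "__main__":
    print(exact_shapley_values(3, toy_game))      # approximately [3.0, 4.0, 0.0]
    print(exact_pairwise_sii(3, toy_game, 0, 1))  # approximately 4.0
```

In this toy game the SVs split the synergy of 4 evenly between players 0 and 1, while the pairwise interaction recovers it as a joint effect. The enumeration in `subsets` is the part that becomes infeasible as n grows.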

Implementation of shapiq

shapiq provides an interface for computing SVs and SIs with exact and approximate methods, covering algorithms suited to different ML applications:

Approximators: These estimate SVs and SIs from a limited budget of game evaluations instead of enumerating every coalition. shapiq includes several kernel-based and sampling-based approximators, such as KernelSHAP-IQ and SVARM-IQ, which rely on strategies like linear regression and stratified sampling.
Exact Computation: Although infeasible for large numbers of players, exact computation is needed for benchmarking and validating approximations. shapiq supports it for moderate problem sizes, which enables detailed studies of SV and SI properties.
Applications: shapiq's flexible API integrates with various ML models and datasets and supports tasks ranging from feature attribution to data valuation and model ensemble evaluation.
Figure 2: Left: Exemplary code for locally explaining a single model's prediction with shapiq. Right: Local feature interactions visualized on a network plot.
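
The idea behind sampling-based approximation can be sketched with plain permutation sampling, a generic Monte Carlo estimator of SVs. This is a deliberately simple stand-in, not KernelSHAP-IQ or SVARM-IQ and not shapiq's code; the function name, its parameters, and the `toy_game` are hypothetical.

```python
import random


def permutation_sampling_sv(n, value, n_permutations, seed=0):
    """Estimate SVs by averaging marginal contributions over random player orderings.

    `value` maps a frozenset of players to a float. Each permutation costs
    n + 1 game evaluations, so the budget is set by `n_permutations`.
    """
    rng = random.Random(seed)
    players = list(range(n))
    totals = [0.0] * n
    for _ in range(n_permutations):
        rng.shuffle(players)
        coalition = frozenset()
        previous = value(coalition)
        for player in players:
            # Add the player and credit it with the change in worth.
            coalition = coalition | {player}
            current = value(coalition)
            totals[player] += current - previous
            previous = current
    return [total / n_permutations for total in totals]


def toy_game(coalition):
    """Hypothetical game: player 0 adds 1, player 1 adds 2, both together add 4 more."""
    worth = 0.0
    if 0 in coalition:
        worth += 1.0
    if 1 in coalition:
        worth += 2.0
    if 0 in coalition and 1 in coalition:
        worth += 4.0
    return worth


if __name__ == "__main__":
    print(permutation_sampling_sv(3, toy_game, n_permutations=2000, seed=1))
```

Within each permutation the marginal contributions telescope to the worth of the full coalition minus the worth of the empty one, so the estimates always sum to that difference even when the individual values are still noisy.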

Benchmarking and Evaluation

shapiq includes a benchmarking suite covering 11 ML applications of SIs, with pre-computed games and ground-truth values, to assess the performance of different approximators across real-world scenarios. This includes:

Performance Metrics: Approximator accuracy is measured against the pre-computed ground-truth values with metrics such as Mean Squared Error (MSE) and Precision@5, which expose the trade-off between computational budget and approximation quality.
Application Domains: Benchmarks span diverse use cases, such as uncertainty estimation, unsupervised feature importance, and model explanations, reflecting the versatility of SVs and SIs in capturing complex feature interactions.
Figure 3: Left: Exemplary code for globally explaining multiple model predictions with shapiq. Right: Global feature interaction importance visualized as a bar plot.
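
The two metrics named above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the benchmark's implementation: interactions are stored in dictionaries keyed by tuples of feature indices, Precision@k ranks interactions by absolute value, and the numbers in the example are invented.

```python
def mean_squared_error(ground_truth, estimates):
    """MSE over all interactions in `ground_truth`; missing estimates count as 0."""
    errors = [(ground_truth[key] - estimates.get(key, 0.0)) ** 2 for key in ground_truth]
    return sum(errors) / len(errors)


def precision_at_k(ground_truth, estimates, k=5):
    """Share of the k largest ground-truth interactions (by absolute value)
    that also appear among the k largest estimated interactions."""

    def top_k(scores):
        ranked = sorted(scores, key=lambda key: abs(scores[key]), reverse=True)
        return set(ranked[:k])

    return len(top_k(ground_truth) & top_k(estimates)) / k


# Invented example values, keyed by tuples of feature indices.
ground_truth = {
    (0,): 0.9, (1,): -0.7, (2,): 0.5,
    (0, 1): 0.4, (0, 2): -0.3, (1, 2): 0.1, (0, 1, 2): 0.05,
}
estimates = {
    (0,): 0.8, (1,): -0.6, (2,): 0.5,
    (0, 1): 0.2, (0, 2): -0.1, (1, 2): 0.3, (0, 1, 2): 0.0,
}

if __name__ == "__main__":
    print(mean_squared_error(ground_truth, estimates))
    print(precision_at_k(ground_truth, estimates, k=5))
```

MSE rewards getting every value close, whereas Precision@k only checks whether the most important interactions are identified, so an approximator can do well on one and poorly on the other.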

Implications and Future Directions

shapiq makes SVs and SIs more practical in ML by reducing the computational cost of estimating them. Because the framework is application-agnostic, researchers can apply game-theoretic approaches across domains to better understand model behavior and feature interactions.

Future developments could enhance computational efficiency further, possibly integrating low-level optimizations or leveraging parallel computing architectures. Additionally, extending visualizations and interpretability features within shapiq could aid in human-centric model evaluations, aligning with the growing emphasis on transparency and accountability in AI systems.

Conclusion

By providing a comprehensive toolkit for SV and SI computation, shapiq serves as a central resource in the XAI landscape. It lets researchers explore complex feature attributions and interactions, and its benchmarking suite offers a common basis for evaluating approximation methods across a variety of ML tasks. The package is expected to facilitate future research on applications of cooperative game theory in ML.