- The paper proposes a novel conditional subgroup permutation (cs-permutation) method to accurately assess model-agnostic feature importance and effects, specifically addressing the challenge of dependent features.
- The cs-permutation approach uses decision trees to partition data into subgroups where features are less dependent, enabling more reliable feature importance estimation by minimizing extrapolation in low-density regions.
- Experiments show that cs-permutation feature importance (cs-PFI) effectively recovers ground-truth importance and compares favorably with alternatives such as knockoff sampling (for importance) and ALE plots (for effects) under diverse dependency scenarios, enhancing model interpretability and validation.
Model-agnostic Feature Importance and Effects with Dependent Features: A Conditional Subgroup Approach
The paper "Model-agnostic Feature Importance and Effects with Dependent Features--A Conditional Subgroup Approach" addresses a significant challenge in interpretable machine learning: accurately assessing feature importance when features are dependent. Permutation feature importance (PFI) is a standard technique for evaluating the relevance of features in predictive models. However, when features are correlated, marginal permutation creates unrealistic data points, so the model is forced to extrapolate across low-density regions of the feature space and the resulting importance estimates can be misleading.
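As context, standard (marginal) PFI can be sketched in a few lines. The sketch below is illustrative rather than the paper's implementation; it assumes a scikit-learn-style model with a `predict` method, and the function name and signature are our own.

```python
import numpy as np

def permutation_importance(model, X, y, loss, n_repeats=5, seed=0):
    """Marginal PFI: mean increase in loss after randomly permuting
    each column of X. The permutation ignores dependencies between
    features, which is exactly the failure mode the paper addresses."""
    rng = np.random.default_rng(seed)
    base_loss = loss(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle column j, breaking its association with y AND
            # with all other features (the source of extrapolation).
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(loss(y, model.predict(X_perm)) - base_loss)
        importances[j] = np.mean(increases)
    return importances
```

If two features are highly correlated, this shuffle produces combinations the model never saw during training, which is why the paper replaces it with permutations inside conditional subgroups.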
The authors propose an innovative method termed conditional subgroup permutation (cs-permutation), which aims to offer a more reliable estimation of feature importance by permuting only within subgroups in which the feature of interest is approximately independent of the remaining features. This subgroup approach minimizes extrapolation because permutations respect the conditional distribution of the feature. The subgroups are formed by leveraging decision trees (specifically transformation trees) that partition the data so that, within each subgroup, the distribution of the feature of interest depends less on the other features.
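A toy sketch of the cs-permutation idea follows. The paper fits transformation trees to find the subgroups; here, as a deliberately simplified stand-in, the data is split into just two subgroups at the median of one other feature, and the feature of interest is permuted only within each subgroup. All names are ours, not the paper's.

```python
import numpy as np

def cs_permutation_importance(model, X, y, loss, feature, group_feature, seed=0):
    """Toy cs-PFI: group observations by whether `group_feature` lies
    below or above its median (a stand-in for the decision-tree
    partition in the paper), then permute `feature` only within each
    subgroup and report the resulting increase in loss."""
    rng = np.random.default_rng(seed)
    split = np.median(X[:, group_feature])
    in_lower = X[:, group_feature] <= split
    X_perm = X.copy()
    for mask in (in_lower, ~in_lower):
        idx = np.where(mask)[0]
        # Permuting only among similar observations keeps the permuted
        # values consistent with the conditional distribution, so the
        # model is not evaluated on implausible feature combinations.
        X_perm[idx, feature] = rng.permutation(X[idx, feature])
    return loss(y, model.predict(X_perm)) - loss(y, model.predict(X))
```

In the paper the partition is learned from the data and can involve many features and splits; this two-group version only illustrates the within-subgroup permutation step.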
Central to this paper is the introduction of model-agnostic conditional subgroup partial dependence plots (cs-PDPs) and cs-PFIs. These variants report feature effects and importances conditional on the other features, yielding a more faithful depiction of how features influence predictions when dependencies are present.
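The per-subgroup effect plots can be sketched as follows: a standard partial dependence curve is computed separately on each subgroup's observations, rather than on the full data. This is a minimal illustration assuming precomputed subgroup labels; the function name is ours.

```python
import numpy as np

def cs_pdp(model, X, feature, groups, grid):
    """One partial dependence curve per subgroup: for each grid value v,
    set `feature` to v for the subgroup's observations only and average
    the model's predictions over that subgroup."""
    curves = {}
    for g in np.unique(groups):
        Xg = X[groups == g]
        vals = []
        for v in grid:
            Xv = Xg.copy()
            Xv[:, feature] = v
            vals.append(model.predict(Xv).mean())
        curves[g] = np.array(vals)
    return curves
```

Averaging within subgroups instead of over all observations avoids evaluating the model at grid values that are implausible given the other features of distant observations.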
In their experiments, the authors demonstrate that cs-PFIs effectively recover ground-truth feature importance under diverse dependency scenarios and typically outperform established alternatives such as knockoff sampling, accumulated local effect plots, and imputation-based methods in terms of data fidelity and model fidelity. Notably, cs-PFIs remain robust across scenarios involving multiple dependencies or nonlinear relationships.
The practical applications of this approach extend beyond the computation of feature importances: it deepens the understanding of dependency structures within the data, providing interpretability that is critical for model validation and refinement. Moreover, because the subgroups are defined by interpretable decision rules, researchers can discern how dependency structures influence feature effects.
The theoretical and practical implications of this research highlight its utility in modern machine learning tasks, where increasingly complex models pose interpretability challenges, especially in applications involving intricate data structures. The conditional subgroup approach outlined in this paper offers a novel tool for researchers aiming to improve model transparency and reliability.
Looking ahead, this research could stimulate further exploration of decision tree-based methods for elucidating feature relationships, possibly complementing neural network interpretation or aiding the development of automated tools for model inspection. Furthermore, extending the conditional subgroup methodology to features with varying types of dependencies, and scaling these methods to high-dimensional data, are fertile grounds for future investigation.