Interpretability of Machine Learning Models Using SHAP in Clinical Settings
This paper evaluates the interpretability of machine learning models applied to clinical data, focusing on Gradient Boosting Decision Trees (GBDT) explained with SHapley Additive exPlanations (SHAP). The authors propose two techniques to enhance model interpretability: (1) a new feature importance metric based on SHAP values and (2) a mechanism termed “feature packing.” Both are evaluated on real-world hospital data to identify prognostic factors for patients admitted after cerebral infarction.
SHAP and Model Interpretability
The motivation for using SHAP stems from the inherent complexity of predictive models such as GBDT, which achieve high accuracy at the cost of interpretability. For models used in decision-making, especially in high-stakes fields like medicine, understanding the rationale behind a prediction is critical. SHAP assigns each feature a contribution score that quantifies its influence on an individual prediction; the attribution is grounded in the Shapley value from cooperative game theory, which distributes the prediction fairly among the features.
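To make this concrete, below is a minimal sketch (not the paper's code) of how SHAP contribution scores are obtained for a GBDT classifier with the shap library's TreeExplainer; the public stand-in dataset and model settings are illustrative assumptions. The final lines check the local additivity property: each raw prediction equals the explainer's base value plus the sum of that sample's SHAP values.

```python
# Minimal sketch (not the paper's code) of obtaining SHAP contribution scores
# for a GBDT classifier. The public dataset and model settings are illustrative
# stand-ins for the clinical data used in the paper.
import numpy as np
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values exactly for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)      # shape: (n_samples, n_features)

# Local additivity: each raw prediction (log-odds here) equals the base value
# plus the sum of that sample's per-feature contributions.
reconstruction = explainer.expected_value + shap_values.sum(axis=1)
print(np.abs(model.decision_function(X) - reconstruction).max())
```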
Methodological Contributions
- Novel Feature Importance Metric: The paper introduces a metric based on the L2-norm (variance) of SHAP values rather than the conventional L1-norm approach of summing absolute SHAP values. This yields a notion of variable importance that aligns more closely with the coefficient-based interpretation familiar from linear models; a sketch contrasting the two summaries follows this list.
- Feature Packing Technique: This technique aggregates correlated features into a single, comprehensible group without retraining or reconstructing the model, thus preserving predictive performance while simplifying interpretation. Because SHAP values are additive, feature packing leaves predictions unchanged and is particularly insightful for correlated inputs such as the various activities of daily living (ADL) scores; a second sketch after this list illustrates the idea.
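As a rough illustration of the first contribution, the sketch below computes both an L1-style summary (mean absolute SHAP value, the conventional choice) and an L2-style summary (root mean square of SHAP values, close to their standard deviation) from a SHAP matrix. It reuses `shap_values` and `X` from the earlier snippet; the exact normalization in the paper may differ.

```python
# Sketch of the two global importance summaries, computed from the SHAP matrix
# `shap_values` and features `X` of the previous snippet. The exact
# normalization used in the paper may differ; this is one plain reading of an
# L1- versus L2-style metric.
import numpy as np
import pandas as pd

def shap_importance(shap_values: np.ndarray, feature_names) -> pd.DataFrame:
    l1 = np.abs(shap_values).mean(axis=0)           # conventional mean |SHAP|
    l2 = np.sqrt((shap_values ** 2).mean(axis=0))   # RMS, close to the SHAP standard deviation
    return (pd.DataFrame({"feature": feature_names, "L1": l1, "L2": l2})
              .sort_values("L2", ascending=False))

print(shap_importance(shap_values, X.columns).head(10))
```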
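The second contribution can be sketched just as briefly: because SHAP values are additive, correlated features can be packed after the fact by summing their per-sample SHAP values into a single group column, with the model and its predictions untouched. The grouping below is hypothetical, reusing the stand-in dataset, and stands in for clinically meaningful groups such as the ADL scores mentioned above.

```python
# Sketch of feature packing: correlated features are grouped after the fact by
# summing their per-sample SHAP values, leaving the model and its predictions
# untouched. The grouping below is hypothetical, reusing `shap_values` and `X`
# from the first snippet; the paper packs clinically related variables such as
# the ADL scores instead.
import numpy as np
import pandas as pd

def pack_features(shap_values: np.ndarray, feature_names, groups: dict) -> pd.DataFrame:
    """Collapse each group of columns into a single packed SHAP column."""
    sv = pd.DataFrame(shap_values, columns=list(feature_names))
    packed, grouped = {}, set()
    for group_name, cols in groups.items():
        packed[group_name] = sv[cols].sum(axis=1)   # additivity of SHAP values
        grouped.update(cols)
    for col in sv.columns:                          # keep ungrouped features as-is
        if col not in grouped:
            packed[col] = sv[col]
    return pd.DataFrame(packed)

# Hypothetical pack of three strongly correlated size measurements.
packed_sv = pack_features(shap_values, X.columns,
                          {"size (packed)": ["mean radius", "mean perimeter", "mean area"]})
print(np.allclose(packed_sv.sum(axis=1), shap_values.sum(axis=1)))  # per-sample totals unchanged
```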
Empirical Findings
The paper uses a dataset of 1,534 patients admitted for cerebral infarction to construct a prognostic model with GBDT. The model achieves an AUC of 0.788. Through SHAP analysis, important prognostic factors such as the NIH Stroke Scale, D-dimer levels, and the A/G ratio are identified. Notably, the A/G ratio, a less anticipated predictor, emerges as important, an insight corroborated by SHAP summary and dependence plots.
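The overall modelling recipe, hedged here as a sketch on a public stand-in dataset because the hospital cohort is not available, would look roughly as follows; the split, hyperparameters, and resulting AUC are illustrative and will not match the paper's 0.788.

```python
# Rough sketch of the modelling recipe described above, run on a public
# stand-in dataset because the paper's hospital cohort is not available.
# Split, hyperparameters, and the resulting AUC are illustrative only.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

model = GradientBoostingClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("AUC on stand-in data:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# Summary plot: features ranked by overall impact and colored by feature value,
# the kind of view in which predictors such as the NIH Stroke Scale, D-dimer,
# and the A/G ratio surface in the paper's analysis.
shap_values = shap.TreeExplainer(model).shap_values(X_te)
shap.summary_plot(shap_values, X_te)
```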
Comparing SHAP and Existing Methods
While traditional methods such as gain-based feature importance convey a feature's relative significance, they do not show how its values affect predictions. SHAP not only matches existing methods in surfacing relevant features but also provides richer context by visually depicting the impact of feature values, including missing values. For example, SHAP dependence plots reveal nuanced interactions and non-linearities that simpler partial dependence plots average away.
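To illustrate the comparison, the sketch below (continuing from the previous snippet's `model`, `shap_values`, and `X_te`) places a conventional partial dependence plot next to a SHAP dependence plot for the same feature; the feature name comes from the stand-in dataset, not the paper's clinical variables.

```python
# Sketch contrasting the two views, continuing from the previous snippet
# (`model`, `shap_values`, `X_te`). The feature name belongs to the stand-in
# dataset, not to the paper's clinical variables.
import shap
from sklearn.inspection import PartialDependenceDisplay

# Partial dependence: the average effect of one feature, with interactions
# averaged out and no per-sample detail.
PartialDependenceDisplay.from_estimator(model, X_te, features=["worst radius"])

# SHAP dependence: each point is one sample's SHAP value for the feature,
# colored by an automatically chosen interacting feature, so non-linearities
# and interaction effects stay visible instead of being averaged away.
shap.dependence_plot("worst radius", shap_values, X_te)
```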
Implications and Future Directions
By adopting SHAP, the analysis bridges the gap between model performance and interpretability in clinical settings. The methodology empowers clinicians by elucidating the factors driving machine learning predictions. The proposed techniques, feature packing in particular, could be further refined and evaluated on other datasets and in other contexts to broaden their clinical applicability.
Future research might explore the integration of SHAP with other interpretability frameworks and its computational efficiency in real-time health informatics systems. The extension of these methods to other complex models such as neural networks also warrants consideration, especially in creating unified interpretability frameworks across varying model architectures.
The paper contributes to the trustworthiness of AI applications in healthcare by helping ensure that complex model decisions can be reconciled with clinical expertise and standards.