Evaluating Saliency Map Explanation on Multi-Modal Medical Images
The paper "One Map Does Not Fit All: Evaluating Saliency Map Explanation on Multi-Modal Medical Images" provides a comprehensive evaluation of 16 saliency map methods applied to multi-modal medical imaging. The paper's focal point is the variability in interpretability across different imaging modalities when using common saliency map techniques, questioning the viability of a one-size-fits-all explanation method in this context.
Methods Under Review
The authors examine a diverse set of saliency map methods, grouped into activation-, gradient-, and perturbation-based approaches (a combined code sketch follows the list):
- Activation-Based Methods: These include CAM (Class Activation Mapping) and its variant Grad-CAM. Grad-CAM is viewed favorably because, unlike CAM, it imposes no special architectural requirements. A significant limitation, however, is that these methods cannot provide modality-specific insights: the single saliency map they produce is derived from shared convolutional feature maps and is therefore identical across all input modalities.
- Gradient-Based Methods: This category covers techniques that interpret model predictions through gradient computations, such as Input × Gradient, SmoothGrad, and Integrated Gradients. The paper highlights SmoothGrad for mitigating noise in raw gradient signals by averaging gradients over noisy copies of the input, though the discussion could engage more with the gradient-saturation problem that methods like Integrated Gradients are designed to address by accumulating gradients along a path from a baseline to the input.
- Perturbation-Based Methods: This family includes Occlusion, LIME, and Shapley Value Sampling, which alter parts of the input and observe the resulting change in model output. Because the perturbations are localized, these methods can produce modality-specific saliency maps, which the paper emphasizes as a strength; LIME's applicability to any classifier is noted as a further advantage.
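To make the three families concrete, the sketch below applies one representative method from each to a toy multi-modal classifier using PyTorch and Captum. The ToyMultiModalCNN model, the 4-channel 2D input (standing in for four stacked MRI sequences), and all hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch: one representative method from each saliency family applied to
# a toy multi-modal classifier. Model, shapes, and hyperparameters are
# illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn as nn
from captum.attr import (IntegratedGradients, LayerAttribution, LayerGradCam,
                         NoiseTunnel, Occlusion, Saliency)


class ToyMultiModalCNN(nn.Module):
    """Small CNN that takes 4 MRI modalities stacked as input channels."""

    def __init__(self, n_modalities=4, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_modalities, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


model = ToyMultiModalCNN().eval()
x = torch.randn(1, 4, 128, 128)   # one slice, 4 modalities stacked as channels
target = 1                        # class index to explain

# Activation-based: Grad-CAM on the last convolutional layer. The map lives in
# feature-map space, so after upsampling it is a single map shared by all
# input modalities.
gradcam = LayerGradCam(model, model.features[3])
cam = LayerAttribution.interpolate(gradcam.attribute(x, target=target), (128, 128))
print("Grad-CAM:", cam.shape)        # torch.Size([1, 1, 128, 128])

# Gradient-based: SmoothGrad (gradients averaged over noisy copies of the input)
# and Integrated Gradients (gradients accumulated along a baseline-to-input path).
smoothgrad = NoiseTunnel(Saliency(model))
sg_attr = smoothgrad.attribute(x, nt_type="smoothgrad", nt_samples=20,
                               stdevs=0.1, target=target)
ig_attr = IntegratedGradients(model).attribute(
    x, baselines=torch.zeros_like(x), target=target, n_steps=32)
print("SmoothGrad:", sg_attr.shape)  # torch.Size([1, 4, 128, 128])

# Perturbation-based: Occlusion with a sliding window that covers one modality
# at a time, so the attribution reflects each modality's local contribution.
occ_attr = Occlusion(model).attribute(x, sliding_window_shapes=(1, 16, 16),
                                      strides=(1, 8, 8), target=target)
print("Occlusion:", occ_attr.shape)  # torch.Size([1, 4, 128, 128])
```

The printed shapes make the paper's core observation visible: the Grad-CAM output has a single channel regardless of how many modalities are stacked at the input, whereas the gradient- and occlusion-based attributions preserve the per-modality channel dimension.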
Empirical Evaluations
The research employed two multi-modal datasets: the real BraTS brain tumor MRI dataset and a synthetically generated multi-modal dataset. Alongside standard checks of classification performance, the explanation assessment combines quantitative metrics, including the proposed MSFI (Modality-Specific Feature Importance) score, with clinicians' ratings of the saliency maps. The reported moderate but statistically significant correlation between doctor ratings and MSFI scores (Spearman's rho = 0.53, p = 0.001) supports MSFI as a computational proxy for clinical judgments of explanation quality.
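The snippet below illustrates the kind of rank-correlation analysis reported; the rating and score arrays are made-up placeholders, not data from the study.

```python
# Hypothetical illustration of a Spearman rank correlation between clinicians'
# ratings of saliency maps and the corresponding MSFI scores. The arrays are
# made-up placeholders, not data from the paper.
import numpy as np
from scipy.stats import spearmanr

doctor_ratings = np.array([2, 5, 4, 1, 3, 5, 2, 4])                       # ordinal quality ratings
msfi_scores = np.array([0.30, 0.90, 0.70, 0.20, 0.55, 0.85, 0.35, 0.60])  # MSFI in [0, 1]

rho, p_value = spearmanr(doctor_ratings, msfi_scores)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3g}")
```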
Implications and Future Directions
The evaluation reveals that no single method consistently provides reliable, modality-specific insights across imaging modalities, a finding that should caution practitioners against relying on any one saliency map method in clinical settings. While the results point to the potential of these methods for multi-modal interpretation, they also underline the need for tailored approaches that better capture modality-specific information, which is crucial for medical diagnostics.
Future work should concentrate on refining these interpretability methods, improving their sensitivity and fidelity so that they align more closely with domain-specific requirements. As AI continues to expand within medical imaging, advances in interpretability will be vital, not only for model assessment but also for building practitioner trust in AI-assisted diagnostic tools. A promising direction is integrating multi-modal information into the explanation methods themselves, yielding more nuanced interpretability algorithms that can leverage the strengths of each modality.
Overall, this work contributes significantly to the discourse on explainable AI, particularly within the field of medical imaging, by critically evaluating the limits of current saliency map techniques and indicating pathways for future research and method development.