MAProtoNet: A Multi-scale Attentive Interpretable Prototypical Part Network for 3D Magnetic Resonance Imaging Brain Tumor Classification (2404.08917v1)
Abstract: Automated diagnosis with artificial intelligence has emerged as a promising area in medical imaging, but the interpretability of the deep neural networks involved remains an urgent concern. Although contemporary works such as XProtoNet and MProtoNet have sought to design interpretable prediction models, the localization precision of their attribution maps can be further improved. To this end, we propose a Multi-scale Attentive Prototypical part Network, termed MAProtoNet, to provide more precise attribution maps. Specifically, we introduce a concise multi-scale module that merges attentive features from quadruplet attention layers and produces attribution maps. The proposed quadruplet attention layers enhance the existing online class activation mapping loss by capturing interactions between the spatial and channel dimensions, while the multi-scale module fuses fine-grained and coarse-grained information for precise map generation. We also apply a novel multi-scale mapping loss to supervise the proposed multi-scale module. Compared to existing interpretable prototypical part networks in medical imaging, MAProtoNet achieves state-of-the-art localization performance on the brain tumor segmentation (BraTS) datasets, with an overall improvement of approximately 4% in activation precision score (best score: 85.8%), without using additional annotated segmentation labels. Our code will be released at https://github.com/TUAT-Novice/maprotonet.
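The abstract reports localization quality as an activation precision score but does not define it. The sketch below assumes the definition commonly used for this kind of evaluation: the fraction of voxels activated in a thresholded attribution map that fall inside the annotated tumor mask. The function name, the flattened-sequence inputs, and the 0.5 threshold are illustrative assumptions, not details taken from the paper or its repository.

```python
def activation_precision(attribution, tumor_mask, threshold=0.5):
    """Fraction of activated voxels that lie inside the tumor mask.

    Assumptions (hypothetical, for illustration only):
    - `attribution` is a flattened attribution map normalized to [0, 1];
    - `tumor_mask` is a flattened binary mask (1 = tumor, 0 = background);
    - a voxel counts as activated when its value is >= `threshold`.
    """
    attribution = list(attribution)
    tumor_mask = list(tumor_mask)
    if len(attribution) != len(tumor_mask):
        raise ValueError("attribution map and mask must have the same number of voxels")

    # Binarize the attribution map.
    activated = [value >= threshold for value in attribution]
    n_activated = sum(activated)
    if n_activated == 0:
        # Nothing is activated, so no activated voxel can be inside the mask.
        return 0.0

    # Count activated voxels that coincide with annotated tumor voxels.
    n_inside = sum(1 for act, m in zip(activated, tumor_mask) if act and m)
    return n_inside / n_activated


# Toy example: four voxels are activated, three of them inside the mask.
toy_attribution = [0.9, 0.8, 0.6, 0.2, 0.1, 0.7]
toy_mask = [1, 1, 0, 0, 1, 1]
print(activation_precision(toy_attribution, toy_mask))  # 0.75
```

Under this assumed definition, the segmentation mask is used only to score the attribution map after training, which is consistent with the abstract's statement that no additional annotated segmentation labels are used.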