Benchmark data to study the influence of pre-training on explanation performance in MR image classification

Published 21 Jun 2023 in cs.CV, cs.AI, and cs.LG (arXiv:2306.12150v1)

Abstract: Convolutional Neural Networks (CNNs) are frequently and successfully used in medical prediction tasks. They are often combined with transfer learning, which improves performance when training data for the task are scarce. The resulting models are highly complex and typically provide no insight into their predictive mechanisms, motivating the field of 'explainable' artificial intelligence (XAI). However, previous studies have rarely quantitatively evaluated the 'explanation performance' of XAI methods against ground-truth data, and the influence of transfer learning on objective measures of explanation performance has not been investigated. Here, we propose a benchmark dataset that allows for quantifying explanation performance in a realistic magnetic resonance imaging (MRI) classification task. We employ this benchmark to understand the influence of transfer learning on the quality of explanations. Experimental results show that popular XAI methods applied to the same underlying model differ vastly in performance, even when considering only correctly classified examples. We further observe that explanation performance strongly depends on the task used for pre-training and the number of CNN layers pre-trained. These results hold after correcting for a substantial correlation between explanation and classification performance.
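The core idea of such a benchmark is to score a model's explanation (e.g., a saliency map) against a known ground-truth region of interest. As a minimal sketch of what "explanation performance" can mean, the snippet below scores a flattened saliency map by the precision of its top-k attributed pixels against a ground-truth lesion mask. The metric, the function name, and the toy data are illustrative assumptions, not the paper's exact evaluation protocol.

```python
def topk_precision(saliency, gt_mask, k):
    """Fraction of the k highest-attribution pixels that fall inside the
    ground-truth mask. `saliency` and `gt_mask` are flat sequences of
    equal length; `gt_mask` holds 0/1 labels."""
    if k <= 0 or k > len(saliency):
        raise ValueError("k must be in 1..len(saliency)")
    # Indices of the k largest saliency values.
    topk = sorted(range(len(saliency)), key=lambda i: saliency[i],
                  reverse=True)[:k]
    return sum(gt_mask[i] for i in topk) / k

# Toy example: a 3x3 "image" flattened to 9 pixels, with the true
# lesion occupying pixels 0 and 1 (hypothetical data).
saliency = [0.9, 0.8, 0.1, 0.2, 0.05, 0.0, 0.3, 0.1, 0.0]
gt_mask = [1, 1, 0, 0, 0, 0, 0, 0, 0]
print(topk_precision(saliency, gt_mask, k=2))  # -> 1.0
```

Averaging such a score over many examples (restricted, as the abstract notes, to correctly classified ones) gives one objective measure by which different XAI methods, or differently pre-trained models, can be compared.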
