When a Relation Tells More Than a Concept: Exploring and Evaluating Classifier Decisions with CoReX (2405.01661v3)
Abstract: Explanations for Convolutional Neural Networks (CNNs) based on the relevance of input pixels may be too unspecific to evaluate which input features impact model decisions and how. Especially in complex real-world domains such as biology, the presence of specific concepts, and of relations between concepts, may be what discriminates between classes. Pixel relevance is not expressive enough to convey this type of information. As a consequence, model evaluation is limited, and relevant aspects that are present in the data and influence the model's decisions may be overlooked. This work presents a novel method to explain and evaluate CNN models using a concept- and relation-based explainer (CoReX). It explains the predictive behavior of a model on a set of images by masking (ir)relevant concepts out of the decision-making process and by constraining relations in a learned interpretable surrogate model. We test our approach on several image data sets and CNN architectures. The results show that CoReX explanations are faithful to the CNN model in terms of predictive outcomes. A human evaluation further demonstrates that CoReX is a suitable tool for generating combined explanations that help assess the classification quality of CNNs. We also show that CoReX supports the identification and re-classification of incorrect or ambiguous classifications.
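The core masking idea from the abstract — removing a concept's pixels and measuring how the prediction changes — can be illustrated with a minimal sketch. This is not the CoReX method itself (which additionally learns an interpretable relational surrogate model); all names here (`mask_concept`, `concept_relevance`, the `toy_predict` stand-in classifier) are hypothetical and chosen only for illustration:

```python
import numpy as np

def mask_concept(image, mask):
    """Remove a concept by zeroing the pixels covered by a boolean mask."""
    out = image.copy()
    out[mask] = 0.0
    return out

def concept_relevance(predict, image, mask, target_class):
    """Drop in the target-class score when the concept is masked out.

    A large positive drop suggests the masked concept is relevant
    to the model's decision for that class.
    """
    full_score = predict(image)[target_class]
    masked_score = predict(mask_concept(image, mask))[target_class]
    return full_score - masked_score

def toy_predict(image):
    """Toy stand-in for a CNN: class scores from mean intensity of
    the top and bottom halves of the image, softmax-normalized."""
    h = image.shape[0] // 2
    scores = np.array([image[:h].mean(), image[h:].mean()])
    e = np.exp(scores - scores.max())
    return e / e.sum()

# A 4x4 all-ones "image" whose hypothetical concept occupies the top half.
img = np.ones((4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :] = True

relevance = concept_relevance(toy_predict, img, mask, target_class=0)
```

Masking the top half lowers class 0's score for this toy classifier, so `relevance` comes out positive; in CoReX the analogous masking is applied per concept over a set of images to test which concepts the CNN's predictions actually depend on.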