Learning Causal Alignment for Reliable Disease Diagnosis (2310.01766v2)
Abstract: Aligning the decision-making process of machine learning algorithms with that of experienced radiologists is crucial for reliable diagnosis. While existing methods have attempted to align their diagnosis behaviors with those of radiologists reflected in the training data, this alignment is primarily associational rather than causal, resulting in spurious correlations that may not transfer well. In this paper, we propose a causality-based framework that aligns the model's decision process with that of experts. Specifically, we first employ counterfactual generation to identify the causal chain of model decisions. To align this causal chain with that of experts, we propose a causal alignment loss that forces the model to focus on the causal factors underlying each decision step along the whole chain. To optimize this loss, in which the counterfactual generator is an implicit function of the model's parameters, we apply the implicit function theorem together with the conjugate gradient method for efficient gradient estimation. We demonstrate the effectiveness of our method on two medical diagnosis applications, showcasing faithful alignment to radiologists.
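The implicit-function-theorem step can be illustrated on a toy bilevel problem. The sketch below is an assumption-laden simplification, not the paper's implementation: the inner problem (standing in for the counterfactual generator) is a quadratic with a closed-form optimum, so the hypergradient given by the implicit function theorem can be checked directly. The matrices `H`, `B` and the losses are hypothetical; conjugate gradient inverts the inner Hessian using only Hessian-vector products, as the abstract describes.

```python
import numpy as np

def conjugate_gradient(hvp, b, iters=50, tol=1e-10):
    """Solve H v = b using only Hessian-vector products hvp(v)."""
    v = np.zeros_like(b)
    r = b - hvp(v)          # residual
    p = r.copy()            # search direction
    rs = r @ r
    for _ in range(iters):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        v += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return v

# Hypothetical toy bilevel problem:
#   inner: g*(theta) = argmin_g 0.5 g^T H g - g^T (B theta)
#   outer: L_out(g)  = 0.5 ||g||^2
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H = A @ A.T + 5 * np.eye(5)       # SPD inner Hessian
B = rng.standard_normal((5, 3))
theta = rng.standard_normal(3)

g_star = np.linalg.solve(H, B @ theta)   # inner optimum (closed form here)
outer_grad_g = g_star                    # dL_out/dg evaluated at g*

# Implicit function theorem: dg*/dtheta = H^{-1} B, so the hypergradient is
#   dL_out/dtheta = B^T H^{-1} (dL_out/dg); invert H via conjugate gradient.
v = conjugate_gradient(lambda x: H @ x, outer_grad_g)
hypergrad = B.T @ v
```

In a realistic setting the Hessian-vector product would come from automatic differentiation rather than an explicit matrix, which is what makes the conjugate-gradient route memory-efficient.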