Medical Image Debiasing by Learning Adaptive Agreement from a Biased Council (2401.11713v1)
Abstract: Deep learning is prone to learning shortcuts that arise from dataset bias, which yields inaccurate, unreliable, and unfair models and impedes adoption in real-world clinical applications. Despite the significance of this problem, little research in medical image classification addresses dataset bias. Furthermore, bias labels are often unknown, as identifying biases can be laborious and depends on post-hoc interpretation. This paper proposes learning Adaptive Agreement from a Biased Council (Ada-ABC), a debiasing framework that does not rely on explicit bias labels to tackle dataset bias in medical images. Ada-ABC builds a biased council of multiple classifiers optimized with the generalized cross entropy loss to learn the dataset bias. A debiasing model is trained simultaneously under the guidance of the biased council. Specifically, the debiasing model is required to learn adaptive agreement with the biased council: it agrees with the council on samples the council predicts correctly and disagrees on samples the council predicts wrongly. In this way, the debiasing model can learn the target attribute from samples without spurious correlations while still using the rich information in samples with spurious correlations. We demonstrate theoretically that the debiasing model can learn the target features when the biased model successfully captures dataset bias. Moreover, to the best of our knowledge, we construct the first medical debiasing benchmark, drawn from four datasets and containing seven different bias scenarios. Extensive experiments show that Ada-ABC outperforms competitive approaches, verifying its effectiveness in mitigating dataset bias for medical image classification. The code and organized benchmark datasets will be made publicly available.
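The two ingredients named in the abstract can be illustrated with a small sketch. The generalized cross entropy (GCE) loss follows the standard definition (1 - p_y^q) / q from Zhang and Sabuncu (2018). The agreement term is only an assumption made for illustration: the abstract does not give the exact objective, so the council aggregation (averaging member probabilities), the "same-class probability" formulation, and every function name here are hypothetical. The value q = 0.7 is likewise just an illustrative default.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gce_loss(probs, label, q=0.7):
    """Generalized cross entropy: (1 - p_y^q) / q.

    Approaches standard cross entropy as q -> 0 and equals 1 - p_y at q = 1.
    """
    return (1.0 - probs[label] ** q) / q


def council_probs(member_logits):
    """ASSUMPTION: the council prediction is the mean of member softmax outputs."""
    member_probs = [softmax(logits) for logits in member_logits]
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def adaptive_agreement_loss(council_p, debias_p, label, eps=1e-8):
    """ASSUMPTION: illustrative agreement/disagreement term, not the paper's objective.

    `same` is the probability that both models pick the same class.
    If the council is correct on this sample, reward agreement;
    if the council is wrong, reward disagreement.
    """
    same = sum(b * d for b, d in zip(council_p, debias_p))
    council_pred = max(range(len(council_p)), key=council_p.__getitem__)
    if council_pred == label:
        return -math.log(same + eps)
    return -math.log(1.0 - same + eps)


# Hypothetical two-class example: two council members and one debiasing model.
council = council_probs([[2.0, 0.0], [1.5, 0.2]])
debias = softmax([0.3, 1.1])
print(gce_loss(council, label=0))
print(adaptive_agreement_loss(council, debias, label=1))
```

In this sketch the council members would be trained with `gce_loss`, which down-weights hard samples and so tends to amplify reliance on easy (often spurious) features, while the debiasing model would be trained with the agreement term alongside its usual classification loss.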
Authors: Luyang Luo, Xin Huang, Minghao Wang, Zhuoyue Wan, Hao Chen