Looking Beyond What You See: An Empirical Analysis on Subgroup Intersectional Fairness for Multi-label Chest X-ray Classification Using Social Determinants of Racial Health Inequities (2403.18196v1)
Abstract: There has been significant progress in implementing deep learning models in disease diagnosis using chest X-rays. Despite these advancements, inherent biases in these models can lead to disparities in prediction accuracy across protected groups. In this study, we propose a framework to achieve accurate diagnostic outcomes and ensure fairness across intersectional groups in high-dimensional chest X-ray multi-label classification. Transcending traditional protected attributes, we consider complex interactions within social determinants, enabling a more granular benchmark and evaluation of fairness. We present a simple and robust method that involves retraining the last classification layer of pre-trained models using a balanced dataset across groups. Additionally, we account for fairness constraints and integrate class-balanced fine-tuning for multi-label settings. The evaluation of our method on the MIMIC-CXR dataset demonstrates that our framework achieves an optimal tradeoff between accuracy and fairness compared to baseline methods.
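The abstract describes retraining the last classification layer on a dataset that is balanced across groups. The data-balancing step alone might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `balanced_indices`, the toy group keys, and the choice to subsample every group down to the size of the smallest one are all assumptions made for the example.

```python
import random
from collections import defaultdict


def balanced_indices(groups, seed=0):
    """Return sample indices with an equal count from every group.

    `groups` holds one hashable key per sample, for example a tuple of
    attribute values that identifies an intersectional subgroup.
    Every group is subsampled to the size of the smallest group
    (an assumed balancing rule, chosen for simplicity).
    """
    rng = random.Random(seed)

    # Collect the indices that belong to each group.
    by_group = defaultdict(list)
    for idx, key in enumerate(groups):
        by_group[key].append(idx)

    # Size of the smallest group sets the per-group quota.
    n_min = min(len(members) for members in by_group.values())

    # Draw n_min indices without replacement from each group.
    chosen = []
    for key in sorted(by_group):
        chosen.extend(rng.sample(by_group[key], n_min))
    return sorted(chosen)


# Hypothetical toy data: three intersectional groups of unequal size.
groups = [("A", "x")] * 5 + [("A", "y")] * 2 + [("B", "x")] * 3
idx = balanced_indices(groups)
```

The returned indices would then select the subset on which only the final classification layer is refit, with the feature extractor kept frozen. That retraining step is not shown here.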
Authors: Dana Moukheiber, Saurabh Mahindre, Lama Moukheiber, Mira Moukheiber, Mingchen Gao