Harvard Glaucoma Fairness: A Retinal Nerve Disease Dataset for Fairness Learning and Fair Identity Normalization (2306.09264v3)
Abstract: Fairness (used interchangeably with equity) in machine learning is important for societal well-being, but progress is hindered by the scarcity of public datasets. Currently, no dedicated public medical dataset with imaging data is available for fairness learning, even though minority groups suffer from more health issues. To address this gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal nerve disease dataset with both 2D and 3D imaging data and balanced racial groups for glaucoma detection. Glaucoma is the leading cause of irreversible blindness globally, and its prevalence among Black individuals is double that of other races. We also propose a fair identity normalization (FIN) approach to equalize feature importance across identity groups. We compare FIN with various state-of-the-art fairness learning methods, and it achieves superior performance on racial, gender, and ethnicity fairness tasks with 2D and 3D imaging data, which demonstrates the utility of Harvard-GF for fairness learning. To facilitate fairness comparisons between models, we propose an equity-scaled performance measure that can be flexibly applied to any performance metric in the context of fairness. The dataset and code are publicly accessible at https://ophai.hms.harvard.edu/datasets/harvard-glaucoma-fairness-3300-samples/.
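The abstract does not give a formula for the equity-scaled performance measure. As a rough illustration only, the sketch below assumes one plausible form: the overall metric divided by one plus the summed absolute deviations of the per-group metrics from the overall value. The function name `equity_scaled`, the group labels, and the numbers are hypothetical and are not taken from the dataset or the released code.

```python
from typing import Mapping


def equity_scaled(overall: float, per_group: Mapping[str, float]) -> float:
    """Scale an overall metric down by how far per-group metrics deviate from it.

    Assumed form (not stated in the abstract):
        overall / (1 + sum_g |overall - metric_g|)
    """
    disparity = sum(abs(overall - value) for value in per_group.values())
    return overall / (1.0 + disparity)


# Hypothetical AUC values for two models with the same overall AUC.
fair = equity_scaled(0.80, {"group_a": 0.80, "group_b": 0.80})
unfair = equity_scaled(0.80, {"group_a": 0.90, "group_b": 0.70})

print(f"equal groups:   {fair:.4f}")
print(f"unequal groups: {unfair:.4f}")
```

Under this assumed form, a model whose groups all match the overall metric keeps its score unchanged, while a model with the same overall metric but uneven group performance is penalized. Because the scaling takes only a scalar metric and its per-group values, the same function would apply to accuracy, AUC, or any other metric.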
Authors: Yan Luo, Yu Tian, Min Shi, Louis R. Pasquale, Lucy Q. Shen, Nazlee Zebardast, Tobias Elze, Mengyu Wang