
FairCLIP: Harnessing Fairness in Vision-Language Learning (2403.19949v2)

Published 29 Mar 2024 in cs.CV

Abstract: Fairness is a critical concern in deep learning, especially in healthcare, where these models influence diagnoses and treatment decisions. Although fairness has been investigated in the vision-only domain, the fairness of medical vision-language (VL) models remains unexplored due to the scarcity of medical VL datasets for studying fairness. To bridge this research gap, we introduce the first fair vision-language medical dataset Harvard-FairVLMed that provides detailed demographic attributes, ground-truth labels, and clinical notes to facilitate an in-depth examination of fairness within VL foundation models. Using Harvard-FairVLMed, we conduct a comprehensive fairness analysis of two widely-used VL models (CLIP and BLIP2), pre-trained on both natural and medical domains, across four different protected attributes. Our results highlight significant biases in all VL models, with Asian, Male, Non-Hispanic, and Spanish being the preferred subgroups across the protected attributes of race, gender, ethnicity, and language, respectively. In order to alleviate these biases, we propose FairCLIP, an optimal-transport-based approach that achieves a favorable trade-off between performance and fairness by reducing the Sinkhorn distance between the overall sample distribution and the distributions corresponding to each demographic group. As the first VL dataset of its kind, Harvard-FairVLMed holds the potential to catalyze advancements in the development of machine learning models that are both ethically aware and clinically effective. Our dataset and code are available at https://ophai.hms.harvard.edu/datasets/harvard-fairvlmed10k.
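To make the FairCLIP idea concrete, here is a minimal sketch of a Sinkhorn-based fairness penalty. It is not the authors' implementation. The abstract says only that the method reduces the Sinkhorn distance between the overall sample distribution and each demographic group's distribution. The sketch assumes those distributions are 1-D empirical samples of per-sample scores (for example, image-text similarities) with uniform weights and a squared-difference cost. The names `sinkhorn_distance` and `fairness_penalty` and the parameters `eps` and `n_iters` are hypothetical.

```python
import numpy as np

def sinkhorn_distance(x, y, eps=0.1, n_iters=200):
    """Entropy-regularised OT cost between two 1-D samples with uniform weights."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(1, -1)
    cost = (x - y) ** 2                       # pairwise squared-difference cost (assumption)
    a = np.full(x.shape[0], 1.0 / x.shape[0])  # uniform weights on x
    b = np.full(y.shape[1], 1.0 / y.shape[1])  # uniform weights on y
    K = np.exp(-cost / eps)                   # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):                  # Sinkhorn fixed-point iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]        # approximate transport plan
    return float(np.sum(plan * cost))

def fairness_penalty(scores, groups, eps=0.1):
    """Sum of Sinkhorn distances between all scores and each group's scores."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    return sum(
        sinkhorn_distance(scores, scores[groups == g], eps)
        for g in np.unique(groups)
    )

# Toy data (illustrative only): two groups of 20 scores each.
base = np.linspace(0.0, 0.4, 20)
groups = np.array([0] * 20 + [1] * 20)
fair_scores = np.concatenate([base, base])          # both groups score alike
biased_scores = np.concatenate([base, base + 0.6])  # group 1 scores higher
print(fairness_penalty(fair_scores, groups))
print(fairness_penalty(biased_scores, groups))
```

In a training setting, a penalty of this kind would be added to the task loss with a weighting coefficient, so that group-wise score distributions are pulled toward the overall one. The kernel is computed in plain (non-log) space, which is adequate for scores in [0, 1] but can underflow for large costs or very small `eps`.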

Authors (12)
  1. Yan Luo (77 papers)
  2. Min Shi (39 papers)
  3. Muhammad Osama Khan (6 papers)
  4. Muhammad Muneeb Afzal (3 papers)
  5. Hao Huang (155 papers)
  6. Shuaihang Yuan (17 papers)
  7. Yu Tian (249 papers)
  8. Luo Song (1 paper)
  9. Ava Kouhana (2 papers)
  10. Tobias Elze (8 papers)
  11. Yi Fang (151 papers)
  12. Mengyu Wang (28 papers)
Citations (14)
