
Reading Race: AI Recognises Patient's Racial Identity In Medical Images (2107.10356v1)

Published 21 Jul 2021 in cs.CV, cs.CY, and eess.IV

Abstract: Background: In medical imaging, prior studies have demonstrated disparate AI performance by race, yet there is no known correlation for race on medical imaging that would be obvious to the human expert interpreting the images. Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of these models to generalize to external environments and across multiple imaging modalities, B) assessment of possible confounding anatomic and phenotype population features, such as disease distribution and body habitus as predictors of race, and C) investigation into the underlying mechanism by which AI models can recognize race. Findings: Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities. Our findings hold under external validation conditions, as well as when models are optimized to perform clinically motivated tasks. We demonstrate this detection is not due to trivial proxies or imaging-related surrogate covariates for race, such as underlying disease distribution. Finally, we show that performance persists over all anatomical regions and frequency spectrum of the images suggesting that mitigation efforts will be challenging and demand further study. Interpretation: We emphasize that model ability to predict self-reported race is itself not the issue of importance. However, our findings that AI can trivially predict self-reported race -- even from corrupted, cropped, and noised medical images -- in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to.

AI Recognition of Racial Identity in Medical Imaging

The paper "Reading Race: AI Recognizes Patient's Racial Identity In Medical Images" presents a comprehensive investigation into the capability of AI systems to detect self-reported racial identity from medical images. Utilizing both private and publicly available datasets, the paper meticulously evaluates the performance, generalizability, and underlying mechanisms of deep learning models in inferring racial identities from medical imaging data. The findings unveil new dimensions of bias and present consequential challenges for AI deployment in medical contexts.

The research demonstrates robust performance of deep learning models in predicting racial identity across diverse imaging modalities, including chest X-rays, CT scans, and mammograms. The models achieve high ROC-AUC values in both internal and external validation, for example 0.97 for distinguishing Black and White patients on chest X-ray (CXR) datasets, even though human experts cannot discern race from the same images. Notably, the race-detection ability persists despite interventions that degrade image quality or resolution, suggesting that racial information is distributed across the image's frequency spectrum rather than concentrated in any single band.
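To make the frequency-spectrum claim concrete, the following is a minimal sketch of a degradation experiment in the spirit of the paper's analysis, not its exact pipeline. It assumes a hypothetical trained classifier `model` exposing a `predict` method that returns per-image race scores, plus grayscale images and binary labels as NumPy arrays; persistently high AUC under aggressive low-pass filtering would indicate that the racial signal is not confined to high-frequency detail.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def low_pass_filter(image: np.ndarray, cutoff_frac: float) -> np.ndarray:
    """Zero out spatial-frequency components beyond a radial cutoff,
    expressed as a fraction of the maximum retainable radius."""
    f = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = radius <= cutoff_frac * min(h, w) / 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

def auc_under_degradation(model, images, labels, cutoffs=(1.0, 0.5, 0.25, 0.1)):
    """Re-score a race classifier as image detail is progressively removed.
    `model.predict` is a hypothetical scoring interface, not the paper's code."""
    results = {}
    for c in cutoffs:
        filtered = np.stack([low_pass_filter(img, c) for img in images])
        results[c] = roc_auc_score(labels, model.predict(filtered))
    return results
```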

One significant finding is that traditional confounding factors such as body habitus and disease distribution do not account for the models' predictive capability. For instance, breast density analyses and bone density removal from images did not significantly reduce race-detection performance, indicating that the racial information encoded in these images stems from complex feature interactions rather than obvious anatomical proxies.
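As an illustration of one such confounder check, here is a minimal sketch, assuming grayscale radiographs as NumPy arrays: the brightest pixels, where bone dominates the signal, are clipped to a ceiling before re-scoring the classifier. This is a stand-in for the paper's intensity-based tests, and the percentile cutoff is an illustrative assumption.

```python
import numpy as np

def suppress_bright_structures(image: np.ndarray, percentile: float = 90.0) -> np.ndarray:
    """Crudely suppress bone signal by clipping the brightest pixels of a
    radiograph to a ceiling value. The 90th-percentile default is an
    illustrative assumption, not the paper's preprocessing."""
    ceiling = np.percentile(image, percentile)
    return np.minimum(image, ceiling)

# Hypothetical usage: re-score the same classifier on suppressed images and
# compare the resulting AUC against the unmodified baseline.
# suppressed = np.stack([suppress_bright_structures(img) for img in images])
# print(roc_auc_score(labels, model.predict(suppressed)))
```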

A critical implication of these findings is the risk that AI systems will carry forward or exacerbate existing racial disparities in healthcare. These systems can extract and act on racial information which, if unaddressed, could lead to differential treatment or outcomes for patients of different racial identities. Such risks are compounded by the fact that this information is effectively hidden from human oversight and difficult to isolate within the model, undermining 'color-blind' strategies aimed at mitigating bias.

From a regulatory perspective, the paper underscores an urgent need for stringent auditing of AI models in medical imaging to identify and mitigate unintended racial bias. The ability of AI systems to inadvertently encode and act on racial data warrants a re-evaluation of current validation and approval processes employed by regulatory bodies such as the FDA. The authors recommend model audits focused on demographic performance to safeguard against such racial discrimination; a minimal sketch of one such audit follows.
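This sketch assumes binary diagnostic labels, model scores, and a per-patient demographic group array (all hypothetical inputs); it reports per-group AUC and error rates so that race-correlated gaps surface before deployment.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def audit_by_group(y_true, y_score, groups, threshold=0.5):
    """Compute per-group AUC, TPR, and FPR for a diagnostic model.
    Inputs are 1-D NumPy arrays; `groups` holds demographic labels."""
    report = {}
    for g in np.unique(groups):
        idx = groups == g
        y_t, y_s = y_true[idx], y_score[idx]
        y_hat = (y_s >= threshold).astype(int)
        tn, fp, fn, tp = confusion_matrix(y_t, y_hat, labels=[0, 1]).ravel()
        report[g] = {
            "n": int(idx.sum()),
            "auc": roc_auc_score(y_t, y_s) if len(np.unique(y_t)) > 1 else float("nan"),
            "tpr": tp / (tp + fn) if (tp + fn) else float("nan"),
            "fpr": fp / (fp + tn) if (fp + tn) else float("nan"),
        }
    return report
```

Large gaps in per-group TPR or FPR, even alongside similar aggregate AUC, are exactly the kind of disparity such an audit is meant to expose.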

Moving forward, the paper calls for deeper exploration of how racial identity is latently encoded in medical images, including beyond ionizing-radiation modalities such as CT and X-ray. It also advocates methodical audits and explicit regulatory processes to monitor and address racial bias in AI systems under development. Whether to include self-reported race in datasets also emerges as a pivotal consideration for developing fair AI solutions that account for model capabilities invisible to human reviewers.

The paper thus provides essential insights into the complexities of AI-mediated racial bias in medical imaging, laying the groundwork for future research aimed at ensuring equity in AI-driven healthcare technologies.

Authors (23)
  1. Imon Banerjee
  2. Ananth Reddy Bhimireddy
  3. John L. Burns
  4. Leo Anthony Celi
  5. Li-Ching Chen
  6. Ramon Correa
  7. Natalie Dullerud
  8. Marzyeh Ghassemi
  9. Shih-Cheng Huang
  10. Po-Chih Kuo
  11. Lyle Palmer
  12. Brandon J Price
  13. Saptarshi Purkayastha
  14. Ayis Pyrros
  15. Luke Oakden-Rayner
  16. Chima Okechukwu
  17. Laleh Seyyed-Kalantari
  18. Hari Trivedi
  19. Ryan Wang
  20. Zachary Zaiman
Citations (291)