
Large Language Models are Geographically Biased (2402.02680v2)

Published 5 Feb 2024 in cs.CL, cs.AI, cs.CY, and cs.LG

Abstract: LLMs inherently carry the biases contained in their training corpora, which can lead to the perpetuation of societal harm. As the impact of these foundation models grows, understanding and evaluating their biases becomes crucial to achieving fairness and accuracy. We propose to study what LLMs know about the world we live in through the lens of geography. This approach is particularly powerful as there is ground truth for the numerous aspects of human life that are meaningfully projected onto geographic space such as culture, race, language, politics, and religion. We show various problematic geographic biases, which we define as systemic errors in geospatial predictions. Initially, we demonstrate that LLMs are capable of making accurate zero-shot geospatial predictions in the form of ratings that show strong monotonic correlation with ground truth (Spearman's $\rho$ of up to 0.89). We then show that LLMs exhibit common biases across a range of objective and subjective topics. In particular, LLMs are clearly biased against locations with lower socioeconomic conditions (e.g. most of Africa) on a variety of sensitive subjective topics such as attractiveness, morality, and intelligence (Spearman's $\rho$ of up to 0.70). Finally, we introduce a bias score to quantify this and find that there is significant variation in the magnitude of bias across existing LLMs. Code is available on the project website: https://rohinmanvi.github.io/GeoLLM

Geographic Biases in LLMs

The paper "Large Language Models are Geographically Biased" critically examines the biases inherent in LLMs through a geographic lens. Its primary focus is the systemic errors these models make in geospatial predictions, a novel dimension of bias assessment. Geography is a powerful vantage point for this because culture, race, language, economics, politics, and religion all project meaningfully onto geographic space, giving a multifaceted and globally inclusive perspective on the biases in LLMs.

Methodology and Findings

The researchers opted to study LLMs' knowledge through geographic indicators because many aspects of human life are inherently tied to geography. The approach involves zero-shot geospatial prediction: LLMs are prompted to rate locations on a given topic without being fine-tuned for the task, and the predictions are then assessed against ground truth using Spearman's rank correlation (ρ). Probing the models in this unmitigated, zero-shot state offers a clearer view of their predispositions.
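A minimal sketch of this evaluation protocol is shown below. It is not the authors' code: the query function, the prompt wording, and the 0.0–9.9 rating scale are assumptions made for illustration, and only the overall structure (prompt for a rating, compare to ground truth with Spearman's ρ) reflects the paper's description.

```python
# Hedged sketch of zero-shot geospatial rating evaluation.
# `query_llm`, the prompt text, and the rating scale are illustrative assumptions.
from scipy.stats import spearmanr

def rate_location(query_llm, coords, topic):
    """Ask the model for a numeric rating of `topic` at the given (lat, lon)."""
    prompt = (
        f"Coordinates: {coords[0]:.4f}, {coords[1]:.4f}\n"
        f"Rate the {topic} of this location on a scale from 0.0 to 9.9:"
    )
    return float(query_llm(prompt))  # parse the model's numeric answer

def evaluate(query_llm, locations, ground_truth, topic):
    """Spearman's rho between model ratings and ground-truth values."""
    ratings = [rate_location(query_llm, loc, topic) for loc in locations]
    rho, _ = spearmanr(ratings, ground_truth)
    return rho
```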

The authors introduce a "bias score" to quantify geographic bias. It combines the mean absolute deviation (MAD) of a model's output ratings with their Spearman's rank correlation against an anchoring distribution, such as infant survival rate, which serves as a proxy for socioeconomic conditions. This quantitative assessment shows significant variation in bias across models, revealing the nuanced nature of geographic biases.
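The exact formula is given in the paper; as a hedged sketch of the idea, a score that grows when a model's ratings both vary widely and track the socioeconomic anchor might be computed as follows (the multiplicative combination here is an assumption for illustration, not the paper's definition):

```python
# Hedged sketch of a geographic bias score: combine the spread of the model's
# ratings (MAD) with how strongly they track a sensitive anchor such as
# infant survival rate (Spearman's rho). The combination shown is illustrative.
import numpy as np
from scipy.stats import spearmanr

def bias_score(ratings, anchor_values):
    ratings = np.asarray(ratings, dtype=float)
    mad = np.mean(np.abs(ratings - ratings.mean()))   # spread of the ratings
    rho, _ = spearmanr(ratings, anchor_values)        # alignment with the anchor
    return mad * rho  # wide spread aligned with the anchor => larger bias
```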

In terms of specific results, LLMs demonstrated strong monotonic correlation with ground-truth geospatial data on objective topics like population density and infant mortality rate, with Spearman's ρ reaching up to 0.89. However, the research also uncovered troubling biases. In particular, the LLMs were biased against locations with lower socioeconomic conditions across subjective topics such as attractiveness and morality, with correlations of up to 0.70 when measured against infant survival rate.

Implications

The implications of these findings are both practical and theoretical. On a practical level, the biases documented suggest that deployers of LLMs should exercise caution, particularly in applications where geographic context is critical. These biases could perpetuate harmful stereotypes or unjustly disadvantage less represented communities in applications spanning from automated decision-making to content creation.

Theoretically, this work opens new pathways for bias mitigation strategies. By identifying geographic bias footprints, researchers can hone strategies for corpus balancing and model training adjustments. Additionally, the introduction of a specific bias score for geographic context provides a valuable tool for benchmarking LLMs against various bias indicators.

Future Research Directions

Future research could extend beyond the evaluations presented, such as exploring interventions that directly reduce the geographical bias footprint. This could include more nuanced training data adjustments that better represent underrepresented regions of the world. Furthermore, examining the interactions between geographic biases and other forms of bias (e.g., gender, race) could yield more comprehensive bias amelioration techniques.

Overall, while LLMs possess impressive capabilities in zero-shot geospatial prediction, the work clearly indicates the need for an improved understanding of the latent biases embedded within these systems. Researchers and practitioners must consider these findings in the ongoing development and deployment of AI systems that are fair and equitable across geographic divides.

Authors (5)
  1. Rohin Manvi (3 papers)
  2. Samar Khanna (12 papers)
  3. Marshall Burke (26 papers)
  4. David Lobell (25 papers)
  5. Stefano Ermon (279 papers)
Citations (31)