DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis (2505.14971v2)

Published 20 May 2025 in cs.CL and cs.CY

Abstract: Recent advancements in LLMs have revolutionized NLP and expanded their applications across diverse domains. However, despite their impressive capabilities, LLMs have been shown to reflect and perpetuate harmful societal biases, including those based on ethnicity, gender, and religion. A critical and underexplored issue is the reinforcement of caste-based biases, particularly towards India's marginalized caste groups such as Dalits and Shudras. In this paper, we address this gap by proposing DECASTE, a novel, multi-dimensional framework designed to detect and assess both implicit and explicit caste biases in LLMs. Our approach evaluates caste fairness across four dimensions: socio-cultural, economic, educational, and political, using a range of customized prompting strategies. By benchmarking several state-of-the-art LLMs, we reveal that these models systematically reinforce caste biases, with significant disparities observed in the treatment of oppressed versus dominant caste groups. For example, bias scores are notably elevated when comparing Dalits and Shudras with dominant caste groups, reflecting societal prejudices that persist in model outputs. These results expose the subtle yet pervasive caste biases in LLMs and emphasize the need for more comprehensive and inclusive bias evaluation methodologies that assess the potential risks of deploying such models in real-world contexts.

Summary

The paper introduces the DeCaste framework, a novel methodology employing multi-dimensional tasks to evaluate implicit and explicit caste stereotypes in large language models.
Experiments on nine LLMs reveal models, including state-of-the-art, propagate significant caste stereotypes and measurable biases in implicit and explicit outputs.
The research highlights the urgent need to address caste biases in AI systems to prevent perpetuating historical prejudice, advocating for inclusive frameworks and participatory research.

Analyzing Caste Bias in LLMs: Insights from DeCaste Framework

In the paper "DeCaste: Unveiling Caste Stereotypes in LLMs through Multi-Dimensional Bias Analysis," the authors address the critical and underexplored domain of caste-based bias within the field of LLMs. The paper introduces a novel methodology for evaluating implicit and explicit biases associated with the caste system prevalent in Indian society. While biases relating to gender and race have garnered substantial attention, caste bias remains less scrutinized, particularly in the context of artificial intelligence.

Key Contributions and Methodology

The foundation of this research is the DeCaste framework, which is designed to assess and quantify caste biases in LLMs. This framework operates through two primary tasks:

Stereotypical Word Association Task (SWAT): Employing an Implicit Bias Probing (IBP) strategy, SWAT evaluates how LLMs implicitly associate caste groups with stereotypical terms drawn from dimensions such as socio-cultural, economic, educational, and political contexts. By using structured prompts without direct mention of caste, the task seeks to reveal hidden biases embedded in model outputs.
Persona-based Scenario Answering Task (PSAT): This task explores explicit biases using Explicit Bias Probing (EBP) methods, asking models to generate personas and answer questions in scenarios where caste may influence outcomes. It evaluates how distinctly caste-based roles are assigned in hypothetical real-life situations, highlighting biases in models' decision-making processes.

These tasks are ingeniously designed to dissect caste biases across multiple axes, offering a comprehensive examination of both implicit and explicit biases using various LLMs.

Experimental Insights

The analysis, encompassing nine distinct models, exposes that LLMs, regardless of their sophistication or size, tend to propagate stereotypes and biases inherited from their training data. Notably, state-of-the-art models like GPT-4o and LlaMa variants demonstrate measurable biases when tested against caste-related benchmarks. Results indicate that biases are pronounced not only in model-generated personas and scenario tasks but also in their implicit word associations. Statistical tests further substantiate the significance of these patterns, underscoring the systemic nature of caste-based discrimination mirrored in digital models.

Practical and Theoretical Implications

This research highlights the urgent need to address caste biases in AI systems, given their increasing integration in digital communications and decision-support systems. The propagation of such biases risks perpetuating historical prejudices, thereby affecting marginalized communities in real-world applications. From a theoretical perspective, the paper extends the discussions on social biases in AI, calling for more nuanced frameworks that accommodate diverse social hierarchies beyond Western-centric models.

Furthermore, it prompts the adoption of participatory research approaches to ensure models reflect and respect the complexities of social structures in non-Western contexts. By collaborating with social scientists and communities most affected by algorithmic biases, AI practitioners and researchers can enhance the representational fairness in LLMs.

Future Directions

Looking ahead, the paper advocates for further exploration and refinement of bias detection methodologies. The authors suggest expanding the analysis to include more languages and intersectional dimensions such as religious and gender biases, which frequently intersect with caste-related prejudice. Additionally, they highlight the need for robust bias mitigation strategies that can be integrated during various stages of model development, including pre-processing, training, and post-processing.

In conclusion, the DeCaste framework not only identifies but provides insightful methodologies for analyzing and addressing caste-related biases in LLMs. As AI continues to influence societal discourse, ensuring fairness across diverse cultural contexts becomes imperative, marking an essential step toward inclusive and ethical AI solutions.

YouTube

Show All Videos