Personality Shapes Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi: An Empirical Investigation

Published 26 Apr 2026 in cs.CL | (2604.23600v1)

Abstract: LLMs are increasingly deployed in persona-driven applications such as education, customer service, and social platforms, where models are prompted to adopt specific personas when interacting with users. While persona conditioning can improve user experience and engagement, it also raises concerns about how personality cues may interact with gender biases and stereotypes. In this work, we present a controlled study of persona-conditioned story generation in English and Hindi, where each story portrays a working professional in India producing context-specific artifacts (e.g., lesson plans, reports, letters) under systematically varied persona gender, occupational role, and personality traits from the HEXACO and Dark Triad frameworks. Across 23,400 generated stories from six state-of-the-art LLMs, we find that personality traits are significantly associated with both the magnitude and direction of gender bias. In particular, Dark Triad personality traits are consistently associated with higher gender-stereotypical representations compared to socially desirable HEXACO traits, though these associations vary across models and languages. Our findings demonstrate that gender bias in LLMs is not static but context-dependent. This suggests that persona-conditioned systems used in real-world applications may introduce uneven representational harms, reinforcing gender stereotypes in generated educational, professional, or social content.

Abstract PDF Upgrade to Chat

Authors (5)

Summary

The paper confirms that personality cues significantly alter gender bias in narrative generation, with Dark Triad traits amplifying male stereotypes.
The study employs a factorial design across 50 occupations and two languages using a centroid-based bias metric validated through substantial human annotation.
Cross-linguistic results reveal a stronger baseline male bias in Hindi, while prosocial traits like Openness and Emotionality attenuate biases differently across models.

Personality Modulates Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi

Introduction and Motivation

This work presents a large-scale, controlled investigation into how personality cues interact with gender bias in persona-conditioned LLM story generation, systematically comparing effects across English and Hindi (2604.23600). Unlike prior studies that focus on demographic attributes or occupation, this study rigorously manipulates persona gender, occupation, and psychological traits (HEXACO and Dark Triad) to quantify how personality traits function as bias amplifiers or attenuators in contextualized occupational narratives. The multilingual setting directly tests whether bias mechanisms and prompt-driven effects in English persist under Indian socio-cultural and morphosyntactic conditions (grammatical gender, social hierarchy). The analytic pipeline combines a curated Indian-relevant occupation-scenario matrix, prompt-based conditioning, and a validated, sentence-level semantic bias metric.

Figure 1: Illustration of personality modulation of gender-stereotypical language in LLM-generated occupational artifacts for male and female personas, contrasting baseline and personality-conditioned responses.

Experimental Methods

The experimental protocol instantiates a factorial corpus: 50 occupations paired with contextually plausible artifacts and scenarios, assigned persona gender (male, female, neutral), and personality specification (9 traits × 2 levels, HEXACO + Dark Triad), yielding 23,400 artifact generations from 6 diverse LLMs (GPT-5 nano, Llama-3.3, Deepseek-R1, Mixtral, Gemma, Falcon-Mamba). Both English and Hindi outputs are generated via parallel prompt templates, ensuring cross-linguistic comparability.

A stepwise pipeline organizes data collection, centroid construction, and bias quantification:

Figure 2: Experimental pipeline illustrating persona-attribute prompt generation, model artifact sampling, semantic embedding, and centroid-based bias computation.

Centroid-based bias scoring is implemented using IndicSBERT multilingual sentence embeddings. Manually curated, native-validated gender-stereotypical lexicons for both languages define reference centroids, against which each artifact’s sentence alignment is cosine-measured. The bias metric captures maximum absolute alignment per story, with human annotation (substantial Fleiss’ κ in both languages) confirming the metric’s validity.

Main Empirical Results

Baseline and Amplification Effects

All models exhibit a baseline male-stereotypical bias (cosine difference > 0), accentuated in Hindi due to grammatical gender and sociocultural salience. Personality conditioning systematically increases the generation of male-stereotypical sentences compared to the no-personality baseline, across all architectures and languages.

Figure 3: Percentage of generated sentences leaning toward male stereotypes for baselines (no personality) and persona-conditioned prompts, averaged across model and language.

Personality as a Systematic Bias Modulator

Regression analyses isolate the direction and magnitude of personality effects. Dark Triad conditions (Machiavellianism, Psychopathy; Narcissism variably) robustly amplify male-stereotypical language alignment, in some cases surpassing gender label main effects. Among HEXACO traits, Honesty-Humility, Agreeableness, and Conscientiousness weakly amplify, while Openness and, to a lesser degree, Emotionality attenuate bias—more commonly driving female-stereotypical alignment.

Figure 4: Personality trait coefficients on bias scores; Dark Triad traits consistently amplify male-stereotypical alignment, while Openness and Emotionality attenuate toward female.

Gender and Personality Interaction Effects

Explicit gender conditioning modulates bias direction: male personas shift the narrative toward male stereotypes, female toward female stereotypes, with the male effect larger in absolute magnitude. Crucially, specific personality traits (notably Dark Triad) exert stronger effect sizes than gender labels alone.

Figure 5: Gender main effects relative to neutral; male and female persona shifts are asymmetric, with personality coefficients often exceeding gender in magnitude.

Stratified analyses reveal that Dark Triad bias amplification is strongest when aligned with male personas, less so for female personas, indicating personality-by-gender interaction.

Figure 6: Gender stratified personality effects, with bias amplification by the Dark Triad more pronounced for male personas.

Cross-Linguistic Variation

Hindi generations display stronger overall male-stereotypical bias, but personality-driven modulation is more pronounced in English, especially for Dark Triad traits. Attenuating effects of prosocial HEXACO traits are compressed in Hindi, plausibly due to distributed gender encoding via verb/adjective agreement.

Figure 7: Language stratified personality effects; English exhibits stronger personality-dependent amplification and attenuation compared to Hindi.

Model Family and Scaling Effects

Across model families, GPT-5 nano anchors the strongest male-leaning bias. Smaller SSM models (Gemma, Falcon) show reversals toward female-stereotypical alignments at baseline, but the relative impact of personality conditioning (amplification/attenuation) patterns holds across scale and architecture.

Qualitative Analysis of Narrative Realization

Qualitative audit reveals non-trivial interactions between personality, gender, occupation, and language. For Dark Triad–conditioned female narratives, antisociality is often rendered covertly (e.g., deception, relational aggression), whereas male antisociality is realized overtly (self-indulgence, authority challenges). Morphosyntactic gender marking in Hindi constrains explicit lexical stereotype deployment, with personality effects surfacing as narrative-level strategy rather than only through lexical markers.

Implications

The study establishes that personality cues in prompt conditioning are not stylistic surface effects but primary determinants of context-dependent gender bias. This effect is robust to model family, persists across occupational stereotypes, and varies nontrivially by language, thus raising substantial concerns for persona-driven deployment in human-facing LLM applications. These findings indicate that fairness cannot be assured simply by restricting demographic prompt attributes; personality dimension interaction can reintroduce or obscure stereotypical harms.

From a theoretical perspective, the results suggest that LLMs internalize human-like stereotype context dependencies, reproducing documented psychological interaction effects (dark traits amplify, prosocial traits attenuate). Crosslinguistic divergence further demonstrates that mitigation strategies must be language-adapted and that centroid-based, semantically grounded metrics are essential for bias detection in morphologically rich languages.

Practical implications include the necessity for explicit fairness audits whenever LLMs are persona- or personality-conditioned, especially for education, HR, and customer support use-cases. Adjusting or restricting personality prompt spaces may act as a low-cost control for bias reduction—as established for toxicity [wang-etal-2025-exploring-impact-toxicity-bias].

Speculation on Future AI Developments

Emergent bias from trait conditioning suggests future LLM deployments must model persona as a structured, multi-dimensional latent variable, where prompt combinations can be explicitly diagnosed for bias interactions. Extending current work to multi-turn dialog, spontaneous code-switching, and compound stereotypes (e.g., caste, religion × gender × personality) will be required for robust fairness evaluation in real-world, multilingual systems. Tooling for prompt auditing should integrate context-dependent, language-adaptive embedding metrics. Further, as LLMs increasingly model human psychological regularities, systematic testing for stereotype amplification and attenuation in response to prompt space navigation is a critical path for safety and social acceptance.

Conclusion

This paper provides a comprehensive empirical demonstration that personality cues systematically modulate the magnitude, direction, and realization of gender bias in persona-conditioned LLM narrative generation, with pronounced effects for Dark Triad traits, cross-linguistically robust baseline skew, and critical architecture- and persona-interactional nuance. These findings recommend a shift from static to contextually dynamic conceptualizations of LLM social bias—prompt parameterization functions as a fairness-critical axis, requiring ongoing, high-resolution analytic attention in responsible LLM deployment.

Markdown Report Issue