- The paper demonstrates that LLMs are prone to adopting majority answers at the expense of accuracy.
- Using an Asch-inspired experimental design, the study quantifies conformity effects across various models and question types.
- Mitigation strategies like Devil’s Advocate and Question Distillation significantly reduce conformity biases in LLM responses.
The paper "Conformity in LLMs" (2410.12428) proposes that LLMs, despite their numerous advances in both accuracy and human-like behaviour emulation, are predisposed to conform to majority responses even when those responses are incorrect. It provides an in-depth analysis of conformity effects among LLMs and suggests mitigation strategies for such behaviour.
Conformity encapsulates a psychosocial bias in which an entity – in this context, an LLM – aligns its responses with a majority, often relinquishing accuracy in the process. The paper adapts psychological paradigms to assess how different LLMs react to conformity pressures across diverse knowledge domains, exposing a universal tendency among these models to conform when uncertain. This phenomenon can critically hamper the performance of technologies built on LLMs, particularly in multi-agent scenarios or when human interaction is involved.
Figure 1: An example of LLMs conforming to an incorrect majority answer.
Methodology and Experiments
The paper employs a methodology inspired by Asch's conformity experiment, recasting the evaluation task as a dialogue-based Q&A in which the LLM is the 'critical subject' and sees the other participants' answers before giving its own. By varying conditions such as the number of participants and the tone of their answers, the paper examines conformity across both objective and subjective questions.
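The paper's exact prompt templates are not reproduced here; as a rough illustration of the setup, the sketch below assembles an Asch-style dialogue in which several simulated participants give the same (deliberately incorrect) answer before the model under test is asked to respond. The function name, tone templates, and wording are all assumptions for illustration only.

```python
from typing import List

def build_conformity_prompt(question: str,
                            choices: List[str],
                            majority_answer: str,
                            n_participants: int = 4,
                            tone: str = "plain") -> str:
    """Assemble an Asch-style dialogue: several simulated participants all give
    the same (possibly incorrect) answer before the model under test responds."""
    # Hypothetical tone templates; the paper's exact wording differs.
    templates = {
        "plain": "Participant {i}: The answer is {ans}.",
        "confident": "Participant {i}: I'm absolutely certain the answer is {ans}.",
    }
    lines = [f"Question: {question}",
             "Options: " + ", ".join(choices)]
    for i in range(1, n_participants + 1):
        lines.append(templates[tone].format(i=i, ans=majority_answer))
    # The model under test answers last, after seeing the unanimous majority.
    lines.append("You are the final participant. What is your answer?")
    return "\n".join(lines)

# Example: an intentionally wrong unanimous majority on an objective question.
prompt = build_conformity_prompt(
    question="What is the capital of Australia?",
    choices=["Sydney", "Canberra", "Melbourne", "Perth"],
    majority_answer="Sydney",
    tone="confident",
)
print(prompt)
```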
Key Observations:
The paper systematically identifies factors contributing to conformity variance among LLMs (a sketch of how conformity could be quantified follows the figures below):
- Tone of Responses: A more conversational and confident tone amplifies conformity (Figure 3).
Figure 3: Conformity levels across different models and participant numbers with different tones on MMLU.
- Instruction Tuning: Models that underwent instruction tuning generally exhibited reduced conformity, suggesting that tuning forms a crucial defense against conformity pressures.
Figure 4: Conformity level across pre-trained and instruction-tuned models with Unanimous-Plain on MMLU.
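One straightforward way to quantify observations like these is a conformity rate: the fraction of questions the model answers correctly on its own but switches to the incorrect majority answer once pressure is applied. This is a minimal sketch, not the paper's metric definition, and the record fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Trial:
    # Hypothetical record format; field names are illustrative.
    correct_answer: str
    majority_answer: str       # incorrect answer given by the simulated majority
    answer_no_pressure: str    # model's answer to the bare question
    answer_with_pressure: str  # model's answer after seeing the majority

def conformity_rate(trials: List[Trial]) -> float:
    """Fraction of trials where the model was correct on its own but switched
    to the incorrect majority answer under conformity pressure."""
    eligible = [t for t in trials if t.answer_no_pressure == t.correct_answer]
    if not eligible:
        return 0.0
    conformed = sum(t.answer_with_pressure == t.majority_answer for t in eligible)
    return conformed / len(eligible)

trials = [
    Trial("Canberra", "Sydney", "Canberra", "Sydney"),    # conformed
    Trial("Canberra", "Sydney", "Canberra", "Canberra"),  # resisted
]
print(conformity_rate(trials))  # 0.5
```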
To mitigate the conformity effect, the paper proposes two interventions (a code sketch of both follows below):
- Devil’s Advocate (DA): Incorporating a dissenting response reduces conformity by breaking the unanimity of the majority and introducing a diversity of opinions.
- Question Distillation (QD): Distilling the prompt down to the question itself diminishes misplaced attention on the majority's responses.
Figure 5: Resistance levels across different models and participant numbers, showing the impact of Question Distillation (QD) and Devil's Advocate (DA) in reducing conformity, compared to original MMLU performance.
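As a rough illustration of how these two interventions could operate on a pressured prompt like the one sketched earlier (the paper's actual implementations may differ; the prompt format and function names below are assumptions):

```python
def add_devils_advocate(prompt: str, dissenting_answer: str) -> str:
    """Devil's Advocate: insert one dissenting participant so the majority is
    no longer unanimous before the model answers."""
    lines = prompt.splitlines()
    # Place the dissenter just before the final instruction to the model.
    lines.insert(-1, f"Devil's advocate: I disagree, the answer is {dissenting_answer}.")
    return "\n".join(lines)

def distill_question(prompt: str) -> str:
    """Question Distillation: keep only the question and options, dropping the
    other participants' answers so the model attends to the question itself."""
    kept = [ln for ln in prompt.splitlines()
            if not ln.startswith(("Participant", "Devil's advocate"))]
    return "\n".join(kept)

pressured_prompt = (
    "Question: What is the capital of Australia?\n"
    "Options: Sydney, Canberra, Melbourne, Perth\n"
    "Participant 1: The answer is Sydney.\n"
    "Participant 2: The answer is Sydney.\n"
    "You are the final participant. What is your answer?"
)
print(add_devils_advocate(pressured_prompt, "Canberra"))
print(distill_question(pressured_prompt))
```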
Implications and Future Directions
The paper emphasizes that understanding and addressing LLM conformity is crucial for the deployment of more robust AI systems. Conformity behaviours could undermine the reliability of AI in critical decision-making scenarios, particularly in domains involving collective intelligence and incremental learning.
The analysis connects LLM conformity with human psychological theories, suggesting that these behaviours reflect biases absorbed during training. This opens avenues for future research that integrates psychological insights into NLP, addressing not just conformity but other socio-cognitive biases ingrained during model learning.
Conclusion
The investigation underscores the ubiquity of conformity among LLMs and provides interventions that significantly reduce such biases. By drawing parallels with human conformity behaviours, this work paves the way for improvements in the design of LLMs, enhancing their reliability and fairness in computational contexts. These efforts mark a step towards AI that not only simulates human-like responses but also minimizes erroneous social biases.