Language Understanding as a Constraint on Consensus Size in LLM Societies (2409.02822v2)

Published 4 Sep 2024 in physics.soc-ph

Abstract: The applications of LLMs are moving towards collaborative tasks in which several agents interact with each other, as in an LLM society. In such a setting, large groups of LLMs could reach consensus about arbitrary norms for which there is no information supporting one option over another, regulating their own behavior in a self-organized way. In human societies, the ability to reach consensus without institutions is limited by the cognitive capacities of humans. To understand whether a similar phenomenon also characterizes LLMs, we apply methods from complexity science and principles from behavioral sciences in a new approach of AI anthropology. We find that LLMs are able to reach consensus in groups and that their opinion dynamics can be understood with a function parametrized by a majority force coefficient that determines whether consensus is possible. This majority force is stronger for models with higher language understanding capabilities and decreases for larger groups, leading to a critical group size beyond which, for a given LLM, consensus is unfeasible. This critical group size grows exponentially with the language understanding capabilities of models; for the most advanced models, it can reach an order of magnitude beyond the typical size of informal human groups.

Analyzing Consensus Formation in LLM Societies

This paper investigates the emergence of collective behavior, specifically consensus formation, in groups of LLMs interacting within a simulated societal framework. By drawing parallels between human cognitive constraints and the capabilities of LLMs, the research introduces a novel perspective on how these models can achieve consensus on norms in settings devoid of predefined preferences.

Overview of Findings

The paper employs concepts from complexity science and behavioral sciences, alongside a methodological approach termed "AI anthropology," to simulate opinion dynamics within LLM societies. The results demonstrate that LLMs can achieve consensus in groups, with the capacity to do so governed by a parameter called the "majority force," which captures a model's tendency to conform to the prevailing majority opinion in the group.
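
To make the mechanism concrete, the sketch below simulates a group of binary-opinion agents whose probability of siding with the current majority is governed by a single majority-force parameter. Everything here is an illustrative assumption rather than the paper's actual protocol: simple stochastic agents stand in for LLMs, and the linear form of adoption_probability and all parameter values are hypothetical.

```python
import random


def adoption_probability(majority_fraction: float, beta: float) -> float:
    """Probability that an updating agent sides with the current majority.

    This linear form is an illustrative assumption, not the paper's fitted
    curve: beta plays the role of the 'majority force' -- at beta = 0 the
    agent chooses at random, and at beta = 1 it always follows the majority.
    """
    return 0.5 + beta * (majority_fraction - 0.5)


def steps_to_consensus(n_agents: int = 50, beta: float = 0.9,
                       max_steps: int = 100_000, seed: int = 0) -> int:
    """Asynchronously update randomly chosen agents until full consensus.

    Returns the number of updates performed; max_steps means no consensus.
    """
    rng = random.Random(seed)
    opinions = [rng.choice((0, 1)) for _ in range(n_agents)]
    for step in range(max_steps):
        ones = sum(opinions)
        if ones in (0, n_agents):                  # everyone agrees
            return step
        majority = 1 if ones >= n_agents / 2 else 0
        majority_fraction = max(ones, n_agents - ones) / n_agents
        p = adoption_probability(majority_fraction, beta)
        i = rng.randrange(n_agents)                # agent chosen to update
        opinions[i] = majority if rng.random() < p else 1 - majority
    return max_steps


if __name__ == "__main__":
    for beta in (0.2, 0.6, 0.95):
        print(f"beta={beta}: consensus after {steps_to_consensus(beta=beta)} updates")
```

Replacing the stochastic update rule with a call to an actual LLM, prompted with the group's current distribution of opinions, would recover a setting closer to the one studied in the paper.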

Key findings include:

  1. Majority Force and Model Capability: The paper identifies that the majority force, a critical factor for consensus, correlates positively with the language understanding capabilities of the models, as measured by benchmarks such as the Massive Multitask Language Understanding (MMLU). More capable models, like those in the GPT-4 family, exhibit stronger consensus tendencies.
  2. Critical Consensus Size: The research identifies a "critical consensus size" for each model, beyond which consensus becomes unfeasible. This size grows exponentially with language understanding capability and, for the most advanced models, exceeds the typical size of informal human groups (see the sketch after this list).
  3. Group Size Influence: As group size increases, the majority force decreases, aligning with the observation that even sophisticated models have a threshold group size beyond which consensus formation is hindered.
  4. Analogy with Human Societies: There is a compelling analogy with human societies, wherein the cognitive constraints related to the neocortex size in primates predict group sizes capable of maintaining informal consensus.
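
As a back-of-the-envelope illustration of how points 1-3 combine, assume the majority force grows with a language-understanding score and decays logarithmically with group size; the group size at which it drops below a consensus threshold then grows exponentially with that score. The functional form and the constant A below are hypothetical, not values fitted in the paper.

```python
import math

A = 5.0  # hypothetical scaling constant, not a value fitted in the paper


def majority_force(group_size: float, understanding: float) -> float:
    """Hypothetical majority force: increases with a language-understanding
    score in [0, 1] (e.g. an MMLU-like benchmark score) and decays with group
    size. The log-decay form and the constant A are illustrative assumptions."""
    return A * understanding / math.log(group_size)


def critical_consensus_size(understanding: float, threshold: float = 1.0) -> float:
    """Group size at which the assumed majority force falls to the consensus
    threshold: solving A*u/log(N) = threshold gives N = exp(A*u/threshold),
    so the critical size grows exponentially with language understanding."""
    return math.exp(A * understanding / threshold)


if __name__ == "__main__":
    for u in (0.5, 0.7, 0.9):  # hypothetical benchmark scores
        n_c = critical_consensus_size(u)
        print(f"understanding={u}: critical size ~ {n_c:,.0f}, "
              f"majority force at that size = {majority_force(n_c, u):.2f}")
```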

Theoretical and Practical Implications

Theoretically, this paper extends the understanding of emergent behavior in LLMs, suggesting that LLM societies adhere to principles similar to those governing human social dynamics, such as the cognitive limit on informal group size captured by Dunbar's number. The exponential scaling between cognitive capability and critical group size in LLMs echoes findings in evolutionary anthropology regarding primate social groups. This alignment of complex behavior in artificial and biological systems underscores the utility of cross-domain methodologies like AI anthropology.
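
Schematically, the scaling reported for LLMs can be written as

$$ N_c \propto e^{\alpha U} \quad\Longleftrightarrow\quad \log N_c \propto U, $$

where $N_c$ is the critical consensus size, $U$ is a language-understanding benchmark score (e.g. MMLU accuracy), and $\alpha$ is a positive constant whose fitted value is reported in the paper rather than assumed here. Qualitatively, this mirrors the way sustainable group size in primates increases with relative neocortex size.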

Practically, these insights have implications for the deployment of LLMs in collective tasks, particularly where alignment of behavior is crucial. Potential applications could range from collaborative AI systems in various industries to autonomous negotiation networks. However, there are associated risks, such as the propagation of undesired norms when LLM societies form in non-transparent or inadequately regulated environments.

Future Research Directions

Future developments in this area should focus on understanding the nuances of LLM interactions, especially when different model types or versions coexist within the same societal framework. Investigating scenarios with mixed capabilities and access to varying levels of information could unveil further complexities in consensus dynamics. Moreover, assessing the impact of individual model design (e.g., context window length, specific tuning and training regimens) on societal behavior would deepen comprehension of how LLMs emulate or diverge from human-like social structures.

The paper establishes a foundational framework for evaluating collective behavior in LLMs, paving the way for further exploration into the potential and challenges of deploying AI in interconnected, non-hierarchical systems. These insights not only contribute to the field of artificial intelligence but also open new dialogues within the broader context of social science and behavioral dynamics in machine societies.

Authors (3)
  1. Giordano De Marzo (23 papers)
  2. Claudio Castellano (74 papers)
  3. David Garcia (52 papers)
Citations (1)