Insightful Overview of "LMAgent: A Large-scale Multimodal Agents Society for Multi-user Simulation"
The paper presents a substantial advance in the simulation of complex multi-user behavior through LMAgent, a large-scale multimodal agent society. The work leverages multimodal LLMs to create an extensive environment in which agents interact dynamically in scenarios reflecting real-world complexity, with a particular focus on e-commerce.
Key Contributions and Methodological Innovations
The authors introduce LMAgent to address the challenge of simulating believable multi-user behavior in dynamic, intricate settings. The work concentrates on endowing agents with human-like decision-making abilities by utilizing multimodal LLMs, which have previously shown potential in understanding and generating human-like interactions. The multimodal approach allows agents to process both text and visual input, facilitating a more realistic imitation of user behavior.
Notable innovations introduced in this paper include:
- Self-Consistency Prompting: This mechanism improves the agents' decision-making accuracy by combining chain-of-thought reasoning over multimodal prompts with a self-consistency check on the resulting answers. It yields substantial gains over existing multi-agent systems that rely solely on the text modality; a minimal sketch of the general sampling-and-voting pattern appears first after this list.
- Fast Memory Mechanism: This mechanism markedly increases simulation efficiency by reserving costly multimodal LLM queries for complex behaviors, optimizing resource usage. Combined with an initial social network built on the small-world model, it supports simulations of more than 10,000 agents, underscoring the system's scalability; an illustrative routing sketch follows this list.
- Small-World Network Initialization: Structuring the agent society as a small-world network lets the simulation mimic real-world social networks more closely. This structure promotes efficient information dissemination among agents, in line with the six-degrees-of-separation phenomenon, and strengthens the credibility of the simulated social dynamics; a short initialization sketch follows this list.
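To make the self-consistency idea concrete, the following is a minimal Python sketch of the generic sampling-and-voting pattern that self-consistency prompting is based on. It is not the paper's implementation: `llm_call` is a hypothetical stand-in for whatever multimodal LLM interface the agents use, and the prompt wording and answer parsing are purely illustrative.

```python
from collections import Counter

def self_consistent_decision(llm_call, prompt_text, image, n_samples=5):
    """Sample several chain-of-thought completions and majority-vote the final action.

    `llm_call(text, image, temperature)` is assumed to return the model's reply
    as a string whose last line is the chosen action.
    """
    answers = []
    for _ in range(n_samples):
        reply = llm_call(
            prompt_text + "\nThink step by step, then state your final action on the last line.",
            image,
            temperature=0.7,  # nonzero temperature so the reasoning paths differ
        )
        answers.append(reply.strip().splitlines()[-1])  # keep only the final-action line
    # The action reached by the most reasoning paths wins.
    return Counter(answers).most_common(1)[0][0]
```

The key design choice is that disagreement among sampled reasoning paths is resolved by frequency rather than by trusting any single completion, which is what makes the final decision more robust than a single greedy response.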
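The fast memory mechanism can be pictured as a routing layer placed in front of the expensive model. The sketch below is an illustrative simplification under the assumption that each behavior can be tagged as simple or complex; `FastMemoryAgent`, `SIMPLE_BEHAVIORS`, and the reuse policy are hypothetical and stand in for whatever lightweight path the paper actually uses.

```python
SIMPLE_BEHAVIORS = {"browse", "chat_smalltalk", "idle"}  # hypothetical routine behaviors

class FastMemoryAgent:
    def __init__(self, llm_call, max_fast_entries=50):
        self.llm_call = llm_call          # expensive multimodal LLM query
        self.fast_memory = []             # recent (behavior, response) pairs
        self.max_fast_entries = max_fast_entries

    def act(self, behavior, observation, image=None):
        if behavior in SIMPLE_BEHAVIORS:
            # Serve routine behaviors from recent memory instead of calling the LLM.
            for past_behavior, past_response in reversed(self.fast_memory):
                if past_behavior == behavior:
                    return past_response
        # Complex (or not-yet-seen) behaviors fall through to the costly LLM call.
        response = self.llm_call(text=observation, image=image)
        self.fast_memory.append((behavior, response))
        if len(self.fast_memory) > self.max_fast_entries:
            self.fast_memory.pop(0)       # keep the fast memory small and recent
        return response
```

The point of the sketch is the asymmetry: most agent actions in a large simulation are routine, so keeping them off the multimodal LLM path is what allows the society to scale to thousands of agents.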
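For the network initialization, the standard small-world construction is the Watts-Strogatz model, which is available in networkx. The sketch below shows how such an agent friendship graph could be generated; the degree and rewiring probability are illustrative assumptions, not values reported in the paper.

```python
import networkx as nx

NUM_AGENTS = 10_000   # the paper reports simulations of over 10,000 agents
NEIGHBORS = 10        # assumed: each agent starts linked to its 10 nearest neighbors
REWIRE_P = 0.1        # assumed: probability of rewiring an edge to a random agent

# Watts-Strogatz graph: high clustering plus short paths between arbitrary agents.
social_graph = nx.watts_strogatz_graph(NUM_AGENTS, NEIGHBORS, REWIRE_P, seed=42)

# Each edge is a friendship along which agents can exchange messages and recommendations.
friends_of_agent_0 = list(social_graph.neighbors(0))
print(len(friends_of_agent_0), nx.average_clustering(social_graph))
```

The combination of local clustering and occasional long-range links is what produces the short average path lengths associated with six degrees of separation, and hence fast information spread across the society.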
Experimental Validation
The authors substantiate their claims through extensive experimentation. One standout result in purchase behavior simulation shows that LMAgent achieves an average accuracy of 73.04% across multiple settings, significantly outperforming traditional recommendation systems and previous LLM-based agents. Furthermore, the simulation's ability to generate "herd behavior" and replicate real-world user co-purchase patterns demonstrates its effectiveness in reflecting authentic consumer behavior on a large scale.
Implications and Future Directions
Practically, this research has significant implications for the development of systems that require high-fidelity simulations of human social behavior, particularly in digital commerce and social media environments. By simulating the nuanced interactions of more than 10,000 agents, researchers and practitioners can better understand and predict complex human social behaviors at scale.
Theoretically, LMAgent contributes to ongoing discussions in AI about the integration of multimodal capabilities into agent-based systems. It lays a foundation for integrating more sophisticated cognitive and affective components into agent architectures, potentially advancing human-computer interaction paradigms.
Future directions suggested by this work involve extending these capabilities into other domains and further refining the decision-making and memory systems of agents. Additionally, as LLMs continue to evolve, integrating more advanced models may further enhance the accuracy and realism of these simulations, opening pathways for their application in broader societal contexts.
In summary, this paper represents a significant step in AI and multi-agent systems, underscoring the potential of LMAgent to serve as a framework for exploring large-scale human behavior simulations across various fields.