
SOTOPIA-$\pi$: Interactive Learning of Socially Intelligent Language Agents (2403.08715v3)

Published 13 Mar 2024 in cs.CL

Abstract: Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied by existing research on building language agents. Motivated by this gap, we propose an interactive learning method, SOTOPIA-$\pi$, improving the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to LLM ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of the language agents trained specifically for social interaction.

Authors (8)
  1. Ruiyi Wang
  2. Haofei Yu
  3. Wenxin Zhang
  4. Zhengyang Qi
  5. Maarten Sap
  6. Graham Neubig
  7. Yonatan Bisk
  8. Hao Zhu
Citations (15)

Summary

  • The paper introduces SOTOPIA-π, an innovative framework that enhances language agents' social intelligence through interactive social task generation, behavior cloning, and self-reinforcement.
  • The framework employs GPT-4 to generate diverse social scenarios, creating rich training data from expert demonstrations and self-reinforced interactions.
  • Empirical evaluations reveal marked improvements in goal completion and safety, with the trained 7B agent approaching the goal-completion ability of the GPT-4-based expert agent.

Enhancing Social Intelligence in LLMs through Interactive Learning

Introduction to Social Learning in AI

Recent research efforts focus on enhancing the social intelligence of language agents through interactive learning, a process analogous to human social skill development. Social intelligence, fundamental for nuanced human-machine interaction, encompasses abilities ranging from understanding social cues to engaging in complex conversational exchanges. Despite the impressive strides in LLM capabilities, a significant gap remains between current language agents and human-like social intelligence. This gap underscores the challenge of equipping language agents with the ability to intuit social norms, make socially aware decisions, and pursue goal-driven social interactions engagingly and safely.

SOTOPIA-π: A New Framework for Social Learning

In response to these challenges, the recent work on the SOTOPIA-π framework presents an innovative approach to interactive learning, aiming to bolster the social intelligence of language agents. The framework consists of three primary components: automatic generation of diverse social tasks, collection of interaction data through behavior cloning and self-reinforcement, and iterative policy updates to improve agent performance.

  1. Social Task Generation: Task diversity is crucial for developing transferable social strategies. SOTOPIA-π leverages GPT-4 to synthesize novel social scenarios, drawing from a wide array of potential social interactions. This process not only ensures a breadth of learning opportunities but also simulates the unpredictability and richness of human social experiences.
  2. Training Data Collection: The framework uses a dual strategy for data collection. For behavior cloning, interactions between expert agents (based on GPT-4) serve as exemplary models of social behavior. Self-reinforcement, by contrast, relies on the agent's own interactions, keeping only the episodes that receive high goal-completion ratings from GPT-4 (a minimal sketch of this collect-rate-filter loop follows this list).
  3. Agent Policy Update: Training incorporates both learning from experts and reinforcing positive self-generated interactions, refined through GPT-4-based performance evaluations (a fine-tuning sketch also follows the list). This multi-faceted approach allows the agent's social acumen to improve gradually, balancing the learning of effective strategies against the avoidance of unsafe or undesired behaviors.
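
To make the pipeline concrete, here is a minimal Python sketch of the collect-rate-filter loop behind steps 1 and 2. It is an illustration under stated assumptions, not the paper's implementation: `call_llm` is a placeholder for whatever chat-completion client is used, the prompts are paraphrased rather than the actual SOTOPIA-π templates, and the 0-10 scale mirrors the SOTOPIA goal-completion dimension while the filtering threshold of 7 is an assumption.

```python
import json

def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a chat-completion call: GPT-4 for task generation and rating,
    the GPT-4 expert or the 7B agent itself for dialogue rollouts."""
    raise NotImplementedError

def generate_social_task() -> dict:
    # Step 1: ask GPT-4 to synthesize a fresh social scenario with per-agent goals.
    prompt = ("Invent a two-person social scenario. Return JSON with keys "
              "'scenario', 'goal_agent_1', 'goal_agent_2'.")
    return json.loads(call_llm("gpt-4", prompt))

def rollout_episode(task: dict, policy_model: str) -> str:
    # Step 2a: roll out a dialogue in the scenario. Expert rollouts (behavior cloning)
    # use GPT-4; self-reinforcement rollouts use the agent being trained.
    prompt = (f"Scenario: {task['scenario']}\n"
              f"Agent 1 goal: {task['goal_agent_1']}\n"
              f"Agent 2 goal: {task['goal_agent_2']}\n"
              "Write the full dialogue, alternating turns between Agent 1 and Agent 2.")
    return call_llm(policy_model, prompt)

def rate_goal_completion(task: dict, transcript: str) -> float:
    # Step 2b: GPT-4 rates how well Agent 1 achieved its goal (0-10).
    prompt = (f"Scenario: {task['scenario']}\nGoal of Agent 1: {task['goal_agent_1']}\n"
              f"Dialogue:\n{transcript}\n"
              "On a scale of 0-10, how well did Agent 1 achieve its goal? Reply with a number.")
    return float(call_llm("gpt-4", prompt).strip())

def collect_filtered_examples(policy_model: str, n_tasks: int = 100,
                              min_score: float = 7.0) -> list[dict]:
    # Keep only highly rated episodes as positive training examples.
    examples = []
    for _ in range(n_tasks):
        task = generate_social_task()
        transcript = rollout_episode(task, policy_model)
        if rate_goal_completion(task, transcript) >= min_score:
            examples.append({"prompt": json.dumps(task), "completion": transcript})
    return examples
```

The policy-update step (step 3) is supervised fine-tuning on these filtered examples. The paper trains a 7B model and cites LoRA/QLoRA for parameter-efficient tuning; the sketch below uses Hugging Face transformers, datasets, and peft with illustrative hyperparameters, and a toy stand-in for the filtered data. Mistral-7B appears here only as a plausible base model, not as a claim about the paper's exact configuration.

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE_MODEL = "mistralai/Mistral-7B-Instruct-v0.1"  # assumption: a 7B base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Train only low-rank adapter weights rather than all 7B parameters.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

def to_features(example):
    # Concatenate the social context/goal with the filtered agent dialogue.
    text = example["prompt"] + "\n" + example["completion"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=2048)

# Toy stand-in for the output of the collect-rate-filter loop sketched above.
filtered_examples = [{"prompt": "Scenario: ... Goal: ...", "completion": "Agent 1: ..."}]

train_ds = Dataset.from_list(filtered_examples).map(
    to_features, remove_columns=["prompt", "completion"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sotopia-pi-sft", num_train_epochs=2,
                           per_device_train_batch_size=1, learning_rate=1e-4),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```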

Empirical Evaluation and Findings

The evaluation of SOTOPIA-π yields several key insights:

  • Improvement in Social Intelligence: The framework demonstrates significant advancements in the language agents' goal completion abilities, bringing them closer to the expert model's performance. This indicates the effectiveness of combining behavior cloning and self-reinforcement for social learning.
  • Challenges in Evaluation: The improved performance, however, exposes limitations in current evaluation protocols, notably a gap between LLM-based evaluators and human judgment: GPT-4-based raters overestimate the abilities of agents trained specifically for social interaction (a small sketch of this comparison follows the list). This mismatch signals the need for more nuanced, human-aligned evaluation metrics for social intelligence.
  • Balance with Other AI Capabilities: SOTOPIA-π enhances social intelligence without compromising the model's general knowledge and reasoning ability, as measured on the MMLU benchmark. It also improves safety, reducing the propensity for generating toxic responses.
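
The evaluation-gap finding reduces to comparing paired model-based and human ratings of the same episodes. The sketch below shows one way such a comparison could be run; the data layout, the use of mean difference and Pearson correlation, and the example numbers are illustrative assumptions rather than the paper's actual analysis.

```python
from statistics import mean
from scipy.stats import pearsonr

def evaluation_gap(paired_scores: list[tuple[float, float]]) -> dict:
    """paired_scores: (gpt4_score, human_score) pairs on the same 0-10 goal-completion scale."""
    gpt4, human = zip(*paired_scores)
    return {
        "mean_gpt4": mean(gpt4),
        "mean_human": mean(human),
        "mean_overestimate": mean(g - h for g, h in paired_scores),
        "pearson_r": pearsonr(gpt4, human)[0],
    }

# Toy example: the LLM judge rating every episode roughly two points above human annotators.
print(evaluation_gap([(8.0, 6.0), (7.5, 5.0), (9.0, 7.5), (6.0, 4.5)]))
```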

Theoretical and Practical Implications

The SOTOPIA-π framework underscores the potential of interactive learning for enhancing social intelligence in LLMs. Theoretically, it aligns with the understanding that social learning is not merely about imitation but involves complex cognitive processes including hypothesis testing and reinforcement. Practically, the findings advocate for a holistic approach to AI development, where improving social abilities goes hand in hand with ensuring safe and aligned interactions.

Future Directions

This research opens several avenues for future exploration. First, refining the evaluation metrics and methodologies to better capture the nuance of human social judgment stands as a priority. Moreover, integrating online reinforcement learning could offer real-time feedback mechanisms for continuous improvement. Lastly, extending the framework to incorporate human interaction data could provide richer learning experiences, further bridging the gap between AI and human social intelligence.

In conclusion, the SOTOPIA-π framework represents a significant step forward in the quest to endow LLMs with advanced social intelligence. By leveraging interactive learning, it not only enhances the ability of AI agents to navigate complex social scenarios but also lays the groundwork for safer and more meaningful human-AI interactions.
