
Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent (2402.13717v3)

Published 21 Feb 2024 in cs.CL

Abstract: LLMs have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multiple characters imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse characters. Our framework breaks down the role-playing process into agent pre-training, multiple characters playing, and character incremental learning, effectively handling both seen and unseen roles. This dynamic approach, coupled with distinct LoRA blocks for each character, enhances Neeko's adaptability to unique attributes, personalities, and speaking patterns. As a result, Neeko demonstrates superior performance in MCRP over most existing methods, offering more engaging and versatile user interaction experiences. Code and data are available at https://github.com/weiyifan1023/Neeko.


Summary

  • The paper introduces Neeko as a multi-character role-playing agent that employs dynamic LoRA for efficient role-specific dialogue management.
  • It details a novel gating network that activates dedicated LoRA blocks to reduce computational overhead while preserving distinct character nuances.
  • Empirical results show Neeko's superior dialogue coherence and its flexibility in adding new roles without complete retraining.

Exploring Multi-Character Role-Playing with Neeko: A Dynamic LoRA-Based Approach

Introduction to Neeko and the Challenge of MCRP

The emergence of LLMs has significantly advanced open-domain dialogue agents. However, these agents struggle with the complexities of multi-character role-playing (MCRP). Neeko, an incremental role-playing agent, addresses these challenges with a dynamic Low-Rank Adapter (LoRA) strategy, allowing it to handle multiple roles within extended dialogues, covering both familiar and novel characters. The need for such a framework arises from the limitations of existing methods, which predominantly model a single character and adapt poorly to new, previously unseen characters.

Methodological Insights into Neeko

Role-Playing with Dynamic LoRA

Neeko's architecture comprises three principal phases: agent pre-training, multi-character role-playing, and character incremental learning. During pre-training, each character is assigned its own non-overlapping LoRA block, trained on that character's dialogues. This separation keeps character portrayals distinct, mitigates catastrophic forgetting, and makes shifts between roles easier to handle.
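
The per-character structure can be pictured as a frozen base projection with one low-rank update per role. The sketch below is a minimal PyTorch illustration, assuming a single linear layer, a fixed rank, and integer character IDs; the class and parameter names are hypothetical and not the paper's implementation.

```python
import torch
import torch.nn as nn


class MultiCharacterLoRALinear(nn.Module):
    """A frozen base linear layer augmented with one LoRA block per character (sketch)."""

    def __init__(self, in_features: int, out_features: int, num_characters: int, rank: int = 8):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # The base LLM weights stay frozen; only the LoRA blocks are trainable.
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)

        # One low-rank pair (A_i, B_i) per character: delta_W_i = B_i @ A_i.
        self.lora_A = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, in_features) * 0.01) for _ in range(num_characters)]
        )
        self.lora_B = nn.ParameterList(
            [nn.Parameter(torch.zeros(out_features, rank)) for _ in range(num_characters)]
        )

    def forward(self, x: torch.Tensor, character_id: int) -> torch.Tensor:
        # y = W0 x + B_i A_i x, applying only the block of the active character.
        delta = x @ self.lora_A[character_id].T @ self.lora_B[character_id].T
        return self.base(x) + delta
```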

Role Selection with Gating Network

Activating the relevant LoRA block for a given role prompt is handled by a gating network inspired by the Mixture-of-Experts paradigm. Neeko uses this network to infer the role identity and dynamically align the model's parameters with the character being enacted, which substantially reduces computational overhead while preserving role-specific nuances.
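
As a rough illustration of such routing, the sketch below maps an assumed role embedding to mixing weights over the character blocks, with sparse top-k selection in the spirit of Mixture-of-Experts gating. The `RoleGate` name, the linear scorer, and the embedding input are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RoleGate(nn.Module):
    """MoE-style gate that scores the character LoRA blocks from a role embedding (sketch)."""

    def __init__(self, role_dim: int, num_characters: int):
        super().__init__()
        self.scorer = nn.Linear(role_dim, num_characters)

    def forward(self, role_embedding: torch.Tensor, top_k: int = 1) -> torch.Tensor:
        scores = self.scorer(role_embedding)          # (..., num_characters)
        weights = F.softmax(scores, dim=-1)
        # Sparse routing: keep only the top-k blocks active to limit compute.
        topk_vals, topk_idx = weights.topk(top_k, dim=-1)
        gate = torch.zeros_like(weights).scatter(-1, topk_idx, topk_vals)
        return gate                                   # per-block mixing weights
```

In use, the gate's output could weight the per-character deltas of the layer sketched earlier, so that with top-1 routing only a single block contributes to each forward pass.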

Lifelong Role-Playing with LoRA Expansion

To incorporate new characters into the agent's repertoire, Neeko introduces two strategies: fusion and expansion. Both extend the model's character coverage without complete retraining, streamlining the inclusion of unseen roles and keeping computational demands low.
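
A plausible reading of the expansion strategy is that a new LoRA block is appended for each unseen character while the existing blocks stay frozen. The helper below sketches this against the `MultiCharacterLoRALinear` class from the earlier example; the function name and freezing policy are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


def expand_for_new_character(layer, rank: int = 8) -> int:
    """Expansion (sketch): attach a fresh LoRA block for an unseen character.

    Existing blocks are frozen so previously learned characters are untouched;
    only the new (A, B) pair would be trained on the new character's dialogues.
    """
    # Freeze every existing character-specific block.
    for a, b in zip(layer.lora_A, layer.lora_B):
        a.requires_grad_(False)
        b.requires_grad_(False)

    in_features = layer.base.in_features
    out_features = layer.base.out_features
    layer.lora_A.append(nn.Parameter(torch.randn(rank, in_features) * 0.01))
    layer.lora_B.append(nn.Parameter(torch.zeros(out_features, rank)))
    return len(layer.lora_A) - 1  # index of the new character's block
```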

Empirical Validation and Implications

The comparative analysis demonstrates Neeko's superiority over contemporary MCRP methods, particularly in character consistency, knowledge fidelity, and dialogue coherence. Notably, Neeko transitions smoothly between roles while preserving the distinct attributes and knowledge base associated with each character.

Theoretical and Practical Contributions

The formulation of the MCRP task, together with the introduction of Neeko and its use of dynamic LoRA, marks a significant step forward in research on role-playing agents. This work points toward future investigations of more complex interaction scenarios involving multiple characters, suggesting that dialogue systems could deliver more engaging and personalized user experiences.

Future Directions in AI Research

The findings and methodologies presented in this paper open up several avenues for further exploration, including the refinement of role embeddings, optimization of gating mechanisms, and expansion of Neeko's application to broader domains beyond role-playing scenarios. Additionally, the fundamental principles underpinning Neeko's design might inspire the development of more sophisticated models capable of navigating the nuanced demands of multi-role interactions in dynamic and unpredictable environments.

In conclusion, Neeko represents a pivotal advancement in addressing the nuanced requirements of multi-character role-playing, significantly broadening the horizons of what is achievable with LLMs and dialogue agents. The insights garnered from this research not only contribute to the academic discourse but also hold considerable promise for enhancing the practical implementations of AI-driven interactive systems.
