CharacterGLM: Customizing Chinese Conversational AI Characters with LLMs
The paper introduces and evaluates CharacterGLM, a family of models built on ChatGLM. Ranging from 6B to 66B parameters, the models target character-based dialogue (CharacterDial), which aims to meet social and emotional needs through extensive character customization.
Model Architecture and Objectives
CharacterGLM is designed specifically for creating diverse AI characters or social agents. A character is customized along two axes: attributes, such as identities and experiences, and behaviors, including linguistic and emotional features. The objective is to make these characters consistent, human-like, and engaging, dimensions on which the authors report CharacterGLM surpassing many closed-source LLMs, including the GPT series.
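The split between attributes and behaviors can be pictured as a small data structure. The following Python sketch is illustrative only: the class `CharacterProfile`, its field names, and the helper `render_profile` are hypothetical and are not taken from the paper or from any CharacterGLM release. It simply shows one way a profile might be flattened into text placed ahead of a dialogue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CharacterProfile:
    """Hypothetical container for one character; field names are illustrative."""
    name: str
    # Attributes: who the character is.
    identity: str = ""
    experiences: List[str] = field(default_factory=list)
    # Behaviors: how the character speaks and reacts.
    linguistic_features: List[str] = field(default_factory=list)
    emotional_features: List[str] = field(default_factory=list)


def render_profile(profile: CharacterProfile) -> str:
    """Flatten a profile into plain text, skipping any empty fields."""
    parts = [f"Name: {profile.name}"]
    if profile.identity:
        parts.append(f"Identity: {profile.identity}")
    if profile.experiences:
        parts.append("Experiences: " + "; ".join(profile.experiences))
    if profile.linguistic_features:
        parts.append("Linguistic features: " + "; ".join(profile.linguistic_features))
    if profile.emotional_features:
        parts.append("Emotional features: " + "; ".join(profile.emotional_features))
    return "\n".join(parts)
```

Keeping attributes and behaviors in separate fields makes it easy to vary one while holding the other fixed, for example giving the same identity a different speaking style.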
Data Collection and Model Training
The authors crowdsourced a comprehensive Chinese CharacterDial corpus from various domains, including celebrities, daily life, and virtual settings, to fine-tune the CharacterGLM models. While leveraging ChatGLM, they enhanced model training with self-refinement techniques inspired by LaMDA, aiming for continuous improvement in dialogue quality and adherence to character profiles.
Evaluation Framework
The authors evaluated CharacterGLM with both pairwise and pointwise assessments, comparing it against competitive models, including GPT-3.5, GPT-4, and others. Evaluations focused on dimensions critical to social dialogue: consistency, human-likeness, and engagement, with an emphasis on long-term interaction potential.
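To make the two assessment styles concrete, here is a minimal Python sketch of how such annotations might be aggregated. It is an assumption-laden illustration, not the authors' evaluation code: the function names, the `win`/`tie`/`lose` labels, and the numeric rating format are all hypothetical. Pointwise evaluation averages per-dimension scores for one model, while pairwise evaluation reports how often one model is preferred over another.

```python
from collections import Counter
from statistics import mean

# Dimensions named in the text; the key spellings are illustrative.
DIMENSIONS = ("consistency", "human_likeness", "engagement")


def pointwise_summary(ratings):
    """Average each dimension over a list of per-dialogue rating dicts."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return {d: mean(r[d] for r in ratings) for d in DIMENSIONS}


def pairwise_summary(judgments):
    """Convert 'win' / 'tie' / 'lose' labels (model A vs. model B) into rates."""
    if not judgments:
        raise ValueError("judgments must not be empty")
    counts = Counter(judgments)
    total = len(judgments)
    return {label: counts[label] / total for label in ("win", "tie", "lose")}
```

Pointwise scores allow comparison across many models at once, whereas pairwise rates are often easier for annotators to judge reliably, which is one reason to report both.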
Key Findings and Implications
CharacterGLM-66B demonstrated a notable advantage in modeling character attributes and behaviors consistently across extended interactions, often surpassing GPT-3.5 and performing on par with GPT-4. This indicates a promising step toward filling gaps in character-based dialogue research, particularly in generating emotionally resonant content suitable for long-term interactions.
The release of the 6B model and a subset of high-quality dialogue data underscores the authors' intention to advance research in character-based dialogue generation. It lets other researchers explore how to give AI characters nuanced personalities, so that they can act as social companions and extend the practical applications of AI in conversational settings.
Future Research Directions
The paper outlines several avenues for future advancement, including:
- Long-term Memory and Growth: Developing models that can remember past interactions and exhibit traits akin to learning and growth over multiple sessions.
- Self-awareness: Ensuring AI characters maintain a distinct, aware sense of self, enhancing trust and interaction quality.
- Inter-character Social Dynamics: Exploring interactions among AI characters within a virtual society to enrich conversational capabilities.
- Intrinsic Cognitive Processes: Incorporating deeper cognitive processes to mirror human social interactions more closely, addressing both text generation and understanding.
Conclusion
CharacterGLM stands as a significant contribution towards character-based conversational AI, illustrating the potential of LLMs in customizing interactive experiences. The work serves as a foundation for expanding AI's social functionalities, aligning technological development with intricate human social and emotional dynamics.