CharacterGLM: Customizable Dialogue LLMs
- CharacterGLM is a family of large language models designed for generating character-based dialogue with detailed control over static profiles and dynamic behaviors.
- The models range from 6B to 66B parameters, balancing inference speed with enriched conversational depth and persona fidelity.
- The models combine prompt-based conditioning with human-in-the-loop refinement to keep generated dialogue consistent, human-like, and faithful to customizable character attributes.
CharacterGLM is a family of LLMs constructed atop the ChatGLM architecture, designed explicitly for generating character-based dialogues (CharacterDial) in Chinese. With model sizes spanning from 6 billion to 66 billion parameters, CharacterGLM facilitates deep customization of conversational AI agents by enabling detailed control over both static attributes and dynamic behaviors. The system’s architecture, training methodology, and evaluation regimes are optimized for maintaining consistency with customized character profiles, with empirical results indicating superior performance in human-likeness, consistency, and engagement relative to mainstream closed-source LLMs—including the GPT series. The authors have released the 6B version and a subset of training data to foster research in character-based dialogue generation.
1. Architecture and Model Scaling
CharacterGLM builds on the ChatGLM backbone and is offered in several parameter scales from 6B to 66B. The underlying architecture leverages advancements in transformer-based LLM design, with scaling conferring notable improvements:
- 6B Variant: Prioritizes inference speed and reduced resource demands.
- 66B Variant: Achieves heightened language understanding, resilience in multi-turn conversations, and finer mirroring of intricate character attributes.
Scaling up the model improves fluency, safety, and correctness, and enables the capture of more nuanced, persona-specific dialogue features. Practitioners can therefore select the variant that best balances resource constraints against the desired depth of character representation.
| Model Variant | Parameter Count | Primary Attributes |
|---|---|---|
| CharacterGLM-6B | 6B | Faster inference, lower resource demands |
| CharacterGLM-66B | 66B | Richer language understanding, more consistent persona rendering |
2. Character Customization Protocol
A central goal is enabling bespoke AI characters through manipulation of detailed character profiles. These profiles comprise:
- Identities: Name, gender, age, occupation, residence, family composition.
- Interests: Likes, dislikes, hobbies, passions.
- Viewpoints: Worldviews, philosophies, values.
- Experiences and Achievements: Past experiences, notable accomplishments, awards.
- Social Relationships: Friendships, familial ties, colleagues.
Profiles are translated into natural language prompts, conditioning outputs on these specifications. The training data, essential for this high-fidelity customization, is curated via:
- Human role-playing exercises.
- Synthetic generation with LLMs (notably GPT-4).
- Extraction from literary and biographical sources.
This data diversity ensures robust modeling of nuanced, multi-dimensional personas suited to varied conversational contexts.
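As a concrete illustration of how a structured profile might be translated into a natural-language conditioning prompt, the following sketch flattens profile fields into prompt text. The field names, section headers, and phrasing are illustrative assumptions, not CharacterGLM's actual template.

```python
def render_profile(profile: dict) -> str:
    """Flatten a structured character profile into conditioning prompt text."""
    # Section layout mirrors the profile dimensions listed above; the exact
    # keys and wording are hypothetical.
    sections = {
        "Identity": ["name", "gender", "age", "occupation", "residence"],
        "Interests": ["likes", "dislikes", "hobbies"],
        "Viewpoints": ["values"],
        "Experiences": ["experiences", "achievements"],
        "Relationships": ["relationships"],
    }
    lines = []
    for header, keys in sections.items():
        facts = [f"{k}: {profile[k]}" for k in keys if k in profile]
        if facts:  # omit sections the profile leaves unspecified
            lines.append(f"{header} - " + "; ".join(facts))
    return "\n".join(lines)

profile = {
    "name": "Li Wei",
    "age": 29,
    "occupation": "museum curator",
    "likes": "classical poetry",
    "values": "honesty above comfort",
}
print(render_profile(profile))
```

A renderer of this kind lets the same structured profile drive many prompt variants, which is consistent with the paraphrasing-based augmentation described later in the training pipeline.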
3. Dynamic Behavioral Configuration
Beyond static profiling, CharacterGLM features fine-grained control over the behavioral dimensions of AI agents. Behavioral configuration encompasses:
- Linguistic Features: Preferred lexical items, catchphrases, dialects, stylistic biases.
- Emotional Expressions: Calibrated affective responses reflecting specified mood or temperament.
- Interaction Patterns: Schematized conversational flow, turn-taking strategies, and multi-turn consistency mechanisms.
Integration of static and behavioral components results in dialogue outputs that sustain contextually faithful and human-like engagement over extended interactions. This approach addresses challenges in multi-turn stability and persona embodiment.
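The integration of static and behavioral components can be sketched as follows: merge the profile prompt with behavioral settings, then append a bounded window of recent turns to form the model input. The field names (`catchphrases`, `dialect`, `temperament`) and the text format are assumptions for illustration, not CharacterGLM's actual schema.

```python
def build_system_prompt(profile_text, behavior):
    """Merge a static profile prompt with behavioral settings (hypothetical keys)."""
    parts = [profile_text]
    if behavior.get("catchphrases"):
        parts.append("Catchphrases: " + ", ".join(behavior["catchphrases"]))
    if behavior.get("dialect"):
        parts.append("Speaking style: " + behavior["dialect"])
    if behavior.get("temperament"):
        parts.append("Temperament: " + behavior["temperament"])
    return "\n".join(parts)

def assemble_input(system_prompt, history, max_turns=8):
    """Keep only the most recent turns to bound context length across long sessions."""
    lines = [system_prompt]
    for speaker, text in history[-max_turns:]:
        lines.append(f"{speaker}: {text}")
    return "\n".join(lines)

sys_prompt = build_system_prompt(
    "Name: A-Ling. Occupation: tea-house storyteller.",
    {"catchphrases": ["as the old saying goes"], "temperament": "wry, unhurried"},
)
print(assemble_input(sys_prompt, [("User", "Tell me about your town."),
                                  ("A-Ling", "Ah, as the old saying goes...")]))
```

Bounding the history window is one simple mechanism for the multi-turn consistency the text mentions; production systems would likely combine it with summarization or retrieval.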
4. Training, Fine-tuning, and Objective Function
CharacterGLM’s training pipeline involves:
- Supervised Fine-tuning: Conducted on curated, multi-turn dialogue datasets embedding both profile and behavioral cues via natural language prompts.
- Self-Refinement: Human-prototype interaction sessions are employed to iteratively correct divergences and inconsistencies in generated dialogue.
The training objective is the standard cross-entropy loss over generated tokens, conditioned on character prompts and dialogue history. This is formalized as:

$$\mathcal{L} = -\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_{<t}, c\right)$$

where $y_t$ denotes the token generated at step $t$, $y_{<t}$ the preceding tokens (including the dialogue history), and $c$ encodes the character’s profile and behaviors. Data augmentation strategies such as paraphrasing and stylization are incorporated during prompt design, broadening the variety of character-specific dialogue encountered during training. This pipeline ensures faithful rendering of persona-specific responses and adaptation to real-world conversation demands.
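The conditional objective can be illustrated with a toy computation: the loss is averaged only over response tokens, while prompt and history tokens condition the model but contribute no loss (mask = 0). The per-token probabilities below are stand-ins for model outputs, not real values.

```python
import math

def masked_cross_entropy(token_probs, loss_mask):
    """Average negative log-likelihood over unmasked (response) tokens.

    token_probs: per-token model probability assigned to the reference token.
    loss_mask:   1 for response tokens, 0 for prompt/history tokens.
    """
    losses = [-math.log(p) for p, m in zip(token_probs, loss_mask) if m]
    return sum(losses) / len(losses)

# 5 tokens: the first two belong to the character prompt and are masked out.
probs = [0.9, 0.8, 0.5, 0.25, 0.5]
mask  = [0,   0,   1,   1,    1]
print(round(masked_cross_entropy(probs, mask), 4))  # 0.9242
```

Masking the conditioning tokens in this way is the standard recipe for prompt-conditioned supervised fine-tuning; frameworks typically implement it via an ignore index rather than an explicit mask list.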
5. Performance Assessment and Comparative Results
Extensive manual evaluations benchmark CharacterGLM against leading mainstream models, including GPT-3.5 and GPT-4. Human evaluators rate model outputs across six criteria: profile consistency, human-likeness, engagement, overall quality, safety, and factual correctness.
Findings demonstrate that, particularly in its 66B variant, CharacterGLM exhibits:
- Superior maintenance of consistent character attributes and behavioral style.
- Enhanced generation of natural, human-like responses.
- Improved engagement and robustness across long dialogue sessions.
In pairwise assessments, CharacterGLM consistently outperforms GPT-3.5 and, in several instances, parallels or surpasses GPT-4 in dialogue quality. These results underscore advancements in context retention, persona fidelity, and overall conversational quality.
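The pairwise assessments described above reduce to tallying human judgments into win/tie/loss rates. The following scoring helper is an illustrative assumption about how such tallies might be aggregated, not the paper's actual evaluation code.

```python
from collections import Counter

def win_rate(judgments):
    """Aggregate pairwise judgments ("win"/"tie"/"loss" for model A vs. model B)."""
    counts = Counter(judgments)  # missing outcomes default to 0
    n = len(judgments)
    return {
        "win":  counts["win"] / n,
        "tie":  counts["tie"] / n,
        "loss": counts["loss"] / n,
    }

print(win_rate(["win", "win", "tie", "loss", "win"]))
```

In practice such tallies are usually accompanied by inter-annotator agreement statistics, since all six criteria here rely on human judgment.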
6. Release Strategy and Research Directions
CharacterGLM’s 6B variant, alongside a subset of the Chinese CharacterDial corpus, is available to the community. This release is designed to catalyze research in character-based conversational AI, offering resources for model extension, task adaptation, and benchmarking.
Anticipated research threads enabled by these resources include:
- Development of persistent, long-term character memory.
- Enhancement of self-reflection and self-awareness capabilities in AI agents.
- Engineering multi-agent systems or “character societies” enabling interaction among distinct AI personas for complex social simulations.
A plausible implication is that broader access to high-quality, persona-driven dialogue data and modular LLMs will accelerate innovations in areas such as social-role play, narrative generation, and emotionally intelligent virtual agents.
7. Technical and Implementation Considerations
CharacterGLM relies on prompt-based conditioning of dialogue, necessitating careful design of profile and behavioral prompts for achieving target persona effects. Implementation entails selection of model size balancing fidelity and resource use, comprehensive prompt engineering, and possible post-deployment fine-tuning to maintain alignment with user requirements.
Resource requirements and deployment constraints scale with model size. For production use, the 6B model offers practical inference speed and deployment feasibility. The 66B model, albeit more computationally demanding, is optimal where maximal character nuance and conversational depth are paramount.
The system’s self-refinement protocol, anchored in human-in-the-loop correction, is crucial for minimizing drift and preserving character integrity over sustained interactions. Data augmentation strategies during training further enhance resilience and adaptability to diverse conversational scenarios.
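The human-in-the-loop self-refinement cycle can be sketched schematically: generate a reply, let an annotator flag persona inconsistencies, and queue corrected pairs for the next fine-tuning round. All functions and data shapes below are stubs standing in for the model, the annotator, and the trainer; they are assumptions, not the system's actual interfaces.

```python
def refinement_round(model_reply, annotator_feedback, finetune_buffer):
    """Collect a corrected (original, revised) pair whenever the annotator objects.

    annotator_feedback: {"consistent": bool, "correction": str (if inconsistent)}
    finetune_buffer:    list accumulating training pairs for the next SFT round.
    """
    if annotator_feedback["consistent"]:
        return model_reply  # no drift detected; keep the reply as-is
    corrected = annotator_feedback["correction"]
    finetune_buffer.append((model_reply, corrected))  # queue for retraining
    return corrected

buffer = []
reply = refinement_round(
    "I grew up in Beijing.",
    {"consistent": False, "correction": "I grew up in Chengdu."},
    buffer,
)
print(reply, len(buffer))
```

Iterating this loop over many sessions yields the progressively cleaner, persona-faithful fine-tuning data that the self-refinement protocol relies on.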
CharacterGLM thus constitutes a substantive advance in customizable, coherent, and lifelike AI-driven dialogue, furnishing both the research community and application developers with granular control over the construction and deployment of sophisticated conversational agents.