CharacterGLM: Customizable Chinese Dialogue AI
- CharacterGLM is a family of Chinese conversational AI models that enable fine-grained control over dialogue personas via natural language prompts.
- It builds on the transformer-based ChatGLM architecture, using character prompts to enhance consistency, human-likeness, and engagement.
- The model family spans 6B to 66B parameters, with the largest variant demonstrating competitive performance against leading closed-source models.
CharacterGLM is a family of Chinese conversational AI models designed for character-based dialogue (CharacterDial) with a focus on customizable agent personalities. Built upon the ChatGLM transformer architecture, CharacterGLM enables fine-grained control over AI character attributes and behaviors entirely through prompt conditioning, achieving strong results in consistency, human-likeness, and engagement relative to leading closed-source LLMs.
1. Architecture and Model Family
CharacterGLM utilizes the ChatGLM backbone, a standard transformer-based, autoregressive LLM supporting both Chinese and English. The architecture features Transformer-XL-style layer patterns with gated rotary positional embeddings and dense self-attention, and employs a sliding-window mechanism for long-context handling. CharacterGLM modifies neither the core attention nor the feed-forward layers and introduces no additional parameters or adapter modules; all persona customization happens at the input level.
A key technical distinction is the use of a "character prompt": a natural-language description encapsulating both static characteristics and dynamic behavior of a desired persona. This prompt is prepended to every dialogue session such that, during supervised fine-tuning, the model learns to condition its outputs directly on the provided persona context. This approach enables scalable character customization while maintaining architectural simplicity (Zhou et al., 2023).
CharacterGLM has been developed at several model scales:
- 6B parameters (publicly released)
- 12B parameters (API access)
- 66B parameters (API access)
Manual evaluations show clear scaling benefits. The 6B model demonstrates reasonable dialogue fluency but is limited in long-term consistency and engagement. The 12B version achieves notably improved maintenance of character traits across >10 dialogue turns. The 66B variant matches or outperforms GPT-4 overall, with ratings of 4.33 (human-likeness), 4.23 (engagement), and 4.18 (consistency) on a 5-point scale.
2. Customization Mechanism
The core customization method relies on prompt conditioning only. To define a character, a structured profile containing attributes such as name, age, occupation, interests, dislikes, viewpoints, experiences, achievements, social ties, linguistic style, emotional tone, and interaction patterns is transformed into a coherent paragraph by crowdworkers. This natural-language prompt is then used as a prefix to all user interactions with the model.
A representative pseudocode template for constructing such prompts is:
```python
def make_character_prompt(profile):
    """Render a structured character profile (a plain dict) into a
    natural-language character prompt."""
    return (
        f"You are {profile['name']}, a {profile['age']}-year-old "
        f"{profile['occupation']} living in {profile['home']}.\n"
        f"You enjoy {', '.join(profile['interests'])}, "
        f"but dislike {', '.join(profile['dislikes'])}.\n"
        f"You speak in a {profile['style']} tone, "
        f"often saying \"{profile['catchphrase']}\".\n"
        f"You believe {profile['viewpoint']}.\n"
        f"In conversation, you {profile['behavior_pattern']}."
    )
```
At inference, user turns are concatenated to the prompt and dialogue history:
```python
prompt = character_prompt + "\n<dialogue history>\nUser: " + user_utterance + "\nAssistant:"
```
No explicit attribute-conditioning loss or architectural modifications are employed; the model adapts to persona conditioning solely via data-driven learning in supervised fine-tuning (Zhou et al., 2023).
3. Data Sources and Training Pipeline
CharacterGLM's CharacterDial dataset, encompassing approximately 1 million dialogue turns (with 1,000 sessions publicly released), aggregates data from several complementary sources:
- Human role-playing by annotators constructing detailed character profiles and engaging in multi-turn exchanges
- Synthetic dialogue sessions generated by GPT-4, followed by human colloquial paraphrasing to ensure naturalness
- Manual extraction and adaptation from scripts and novels
- Logs of human–prototype interactions, collected to support post-deployment self-refinement
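Concretely, a CharacterDial training sample pairs a character prompt with a multi-turn exchange. The sketch below illustrates one plausible record shape and how it flattens into an SFT input string; the field names and the `flatten` helper are assumptions for illustration, not the released data format:

```python
# Hypothetical shape of one CharacterDial session; the released field
# names may differ from this illustration.
session = {
    "character_prompt": (
        "You are Li Wei, a 28-year-old teacher living in Chengdu. "
        "You speak in a warm, patient tone."
    ),
    "source": "human_roleplay",  # or "gpt4_synthetic", "script_adaptation", ...
    "turns": [
        {"role": "user", "text": "I failed my exam today."},
        {"role": "assistant",
         "text": "That must sting, but one exam never defines you."},
    ],
}

def flatten(session):
    """Flatten a session into the prompt-prefixed text used for SFT."""
    body = "".join(
        f"{'User' if t['role'] == 'user' else 'Assistant'}: {t['text']}\n"
        for t in session["turns"]
    )
    return session["character_prompt"] + "\n" + body

print(flatten(session))
```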
The training pipeline proceeds as follows:
- Pre-training: Inherits ChatGLM's large-scale bilingual pretraining.
- Supervised fine-tuning (SFT): Inputs consist of the character prompt concatenated with multi-turn dialogue. For the 6B model, typical hyperparameters include a linear-warmup/cosine-decay learning-rate schedule, a batch size of 64, a maximum sequence length of 1,024, and 3–5 epochs.
- Self-refinement: Ongoing post-deployment supervised fine-tuning is performed on real user corrections, with manually “fixed” responses used as additional SFT data.
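A common way to implement this kind of SFT objective, though not spelled out in the paper, is to compute the language-modeling loss only on the character's own tokens, masking the character prompt and user turns with the ignore index. A minimal sketch under that assumption, using a stand-in tokenizer (`toy_tokenize` is a placeholder, not a real tokenizer):

```python
IGNORE_INDEX = -100  # conventional ignore value for masked loss positions

def toy_tokenize(text):
    """Stand-in tokenizer: one integer id per whitespace-separated token."""
    return [hash(w) % 50000 for w in text.split()]

def build_sft_example(character_prompt, turns):
    """Concatenate prompt + dialogue into (input_ids, labels).

    Labels equal input_ids on assistant spans and IGNORE_INDEX elsewhere,
    so the loss is taken only on the character's own utterances.
    """
    segments = [(character_prompt + "\n", False)]  # prompt: never supervised
    for role, text in turns:
        segments.append((f"{role}: {text}\n", role == "Assistant"))

    input_ids, labels = [], []
    for text, supervised in segments:
        ids = toy_tokenize(text)
        input_ids.extend(ids)
        labels.extend(ids if supervised else [IGNORE_INDEX] * len(ids))
    return input_ids, labels

ids, labels = build_sft_example(
    "You are Sun Wukong, the Monkey King.",
    [("User", "Who are you?"), ("Assistant", "I am the Great Sage!")],
)
```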
4. Evaluation Protocol and Empirical Results
Manual evaluation is conducted using a pointwise rating system. Ten crowd-workers interact in sessions of at least 20 turns each with two different characters per model. The evaluation criteria, scored on a 1–5 scale, include:
- Consistency (profile adherence)
- Human-likeness (naturalness of style)
- Engagement (interestingness)
- Quality (fluency and coherence)
- Safety
- Correctness
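Under this pointwise protocol, a model's score on each dimension is simply the mean over rater-session ratings, with the overall score averaging across dimensions. A minimal aggregation sketch (the ratings below are invented for illustration only):

```python
from statistics import mean

# Invented 1-5 ratings: {dimension: [one score per rater-session]}.
ratings = {
    "consistency": [4, 5, 4, 4],
    "human_likeness": [5, 4, 4, 5],
    "engagement": [4, 4, 5, 4],
}

# Mean per dimension, then mean of dimensions for the overall score.
per_dimension = {dim: mean(scores) for dim, scores in ratings.items()}
overall = mean(per_dimension.values())
print(per_dimension, round(overall, 2))
```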
Principal empirical results for CharacterGLM-66B are as follows:
- Overall score: 4.21 (GPT-4: 4.15; GPT-3.5: 3.49)
- Consistency: 4.18 (tie with best closed-source LLMs)
- Human-likeness: 4.33 (highest among compared models)
- Engagement: 4.23 (highest among compared models)
In pairwise comparisons versus GPT-3.5 over 1,000+ turns, CharacterGLM-66B achieves a Win/Tie/Lose breakdown on Engagement of 48/12/40, an 8-percentage-point win-rate advantage. The advantage is especially pronounced in love-scene dialogues (+15%) and in sessions exceeding 10 turns (+7% aggregate) (Zhou et al., 2023).
5. Model Deployment and Usage
The 6B-parameter CharacterGLM model and a subset of 1,000 CharacterDial sessions are released on HuggingFace. The recommended usage workflow is:
- Load the model and tokenizer using the Transformers library.
- Define a character profile and generate the corresponding character prompt.
- Manage dialogue history and concatenate it with the character prompt and each user utterance for inference.
Example Python usage:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the released 6B checkpoint; 8-bit loading keeps memory modest.
tokenizer = AutoTokenizer.from_pretrained(
    "LingxinAI/CharacterGLM-6b", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "LingxinAI/CharacterGLM-6b",
    trust_remote_code=True,
    load_in_8bit=True,
    device_map="auto",
)

# Structured profile supplying every field the prompt template references.
profile = {
    "name": "Sun Wukong",
    "age": 500,
    "occupation": "Monkey King of the Flower Fruit Mountain",
    "home": "Flower Fruit Mountain",
    "interests": ["martial arts", "peach banquets"],
    "dislikes": ["injustice", "being underestimated"],
    "style": "playful and heroic",
    "catchphrase": "Here comes the Great Sage!",
    "viewpoint": "I will protect the innocent",
    "behavior_pattern": "boast cheerfully, then back it up with action",
}
character_prompt = make_character_prompt(profile)

history = ""
for user_input in ["Hello, who are you?", "Can you fight demons?"]:
    # Prepend the character prompt and accumulated history to each turn.
    prompt = character_prompt + "\n" + history + f"User: {user_input}\nAssistant:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=128, do_sample=True, top_p=0.9, temperature=0.8
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    response = tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
    print("Assistant:", response)
    history += f"User: {user_input}\nAssistant: {response}\n"
```
6. Comparative Analysis and Applications
CharacterGLM demonstrates competitive or superior performance to leading closed-source LLMs, particularly excelling in modeling specific character-centric dialogue features fundamental to social and emotionally engaging agents. The framework’s modular prompt-based customization design permits rapid instantiation of diverse AI personas tailored to specific domains or social contexts without architectural retraining or parameter growth.
The model’s prompt-only conditioning approach offers advantages in efficiency, transparency, and extensibility for customization use cases. The availability of a public 6B version and training data subset supports further research in character-based dialogue generation, including adaptations for specific domains such as education, entertainment, and virtual companionship (Zhou et al., 2023).