
ChatHaruhi: Reviving Anime Character in Reality via Large Language Model (2308.09597v1)

Published 18 Aug 2023 in cs.CL and cs.HC

Abstract: Role-playing chatbots built on LLMs have drawn interest, but better techniques are needed to enable mimicking specific fictional characters. We propose an algorithm that controls LLMs via an improved prompt and memories of the character extracted from scripts. We construct ChatHaruhi, a dataset covering 32 Chinese / English TV / anime characters with over 54k simulated dialogues. Both automatic and human evaluations show our approach improves role-playing ability over baselines. Code and data are available at https://github.com/LC1332/Chat-Haruhi-Suzumiya .

Citations (25)

Summary

  • The paper introduces an LLM-driven algorithm for role-playing fictional characters, supported by a curated dataset of 32 Chinese and English TV and anime characters and over 54,000 simulated dialogues.
  • It outlines a novel memory-based dialogue control method that harnesses extracted script data to maintain character authenticity.
  • Both automatic and human evaluations indicate better character alignment and response quality than the baselines.

Overview

In this paper, the researchers propose an approach to bringing anime and TV characters into conversation through LLM-based chatbots. Building on existing LLMs such as ChatGPT and Claude, the authors introduce an algorithm for role-playing as specific fictional characters. Rather than holding generic conversations, the resulting chatbots emulate the personas of characters from Chinese and English TV shows and anime, with potential applications in entertainment and creative work.

Methodology

The core of the paper is the construction of "ChatHaruhi," a dataset that spans 32 characters and over 54,000 simulated dialogues, drawn from both original scripts and generated data. The project's aim is to create virtual characters that reflect the knowledge, personality, and linguistic style of their fictional counterparts. The technique organizes character memories (extracted from scripts) and uses them to control the LLM during a conversation. By maintaining a memory database and refining the system prompt, the chatbot can mimic each character's distinctive language habits more accurately.
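As a rough illustration of how a system prompt and script memories might be combined into a single prompt, here is a minimal sketch. The class name, field names, and the `###` separator are assumptions made for this example; they are not taken from the paper or its code repository.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterPrompt:
    """Hypothetical container; names and layout are illustrative only."""
    name: str
    system_prompt: str
    memories: list[str] = field(default_factory=list)  # script excerpts
    history: list[tuple[str, str]] = field(default_factory=list)  # (speaker, line)

    def build(self, user_name: str, user_message: str) -> str:
        """Concatenate system prompt, memories, history, and the new turn."""
        parts = [self.system_prompt.strip()]
        if self.memories:
            parts.append("Classic scenes for reference:")
            # The "###" separator is an assumption, not the paper's format.
            parts.extend(f"###\n{m.strip()}" for m in self.memories)
        for speaker, line in self.history:
            parts.append(f"{speaker}: {line}")
        parts.append(f"{user_name}: {user_message}")
        # End with the character's name so the model continues in role.
        parts.append(f"{self.name}:")
        return "\n".join(parts)
```

Calling `CharacterPrompt(...).build("Kyon", "Hello")` would return one string that can be passed to whichever LLM is in use; the model call itself is deliberately left out.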

Dataset and Algorithm

The researchers detail how they built the dataset by extracting dialogue from source materials such as TV series, novels, and wiki entries. They also developed tools for this extraction, accounting for the differences between media types. The algorithm lets the chatbot draw on a memory bank of classic dialogues to generate responses that are in line with the character's persona. The system proved effective even for characters with little existing dialogue data, because the method was also designed to generate new, character-fitting dialogues to supplement the dataset.
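The memory-bank lookup can be sketched as a nearest-neighbor search over stored dialogues. The example below uses bag-of-words cosine similarity purely as a stand-in so that it runs without external dependencies; this summary does not specify the similarity measure the authors use, and the function names here are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, memory_bank: list[str], k: int = 2) -> list[str]:
    """Return the k stored dialogues most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        memory_bank,
        key=lambda m: cosine(q, bag_of_words(m)),
        reverse=True,
    )
    return ranked[:k]
```

In a real system the bag-of-words vectors would likely be replaced by sentence embeddings, but the ranking logic would stay the same.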

Evaluation and Contribution

To measure the effectiveness of their models, the authors used both automatic and human evaluations. In the automatic evaluation, they checked whether the model responds to classic plot points with answers similar to the original script. For human evaluation, they proposed two metrics: alignment (whether the chatbot's answers match the character's setting) and response quality (the linguistic quality of the chatbot's answers). The results showed that this tailored approach led to more effective role-playing than the baseline models.
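The automatic check can be pictured as scoring each generated reply against the corresponding line from the original script and averaging the scores. The sketch below uses the standard library's `difflib.SequenceMatcher` as a placeholder similarity; the metric actually used by the authors is not given in this summary, and `similarity` and `score_model` are hypothetical names.

```python
from difflib import SequenceMatcher
from statistics import mean


def similarity(response: str, reference: str) -> float:
    """Character-level similarity in [0, 1]; a placeholder metric."""
    return SequenceMatcher(None, response.lower(), reference.lower()).ratio()


def score_model(pairs: list[tuple[str, str]]) -> float:
    """Average similarity over (model response, original script line) pairs."""
    return mean(similarity(resp, ref) for resp, ref in pairs)
```

A higher average would suggest that the model's replies at classic plot points stay closer to what the character says in the script, subject to the limits of whatever similarity measure is plugged in.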

Conclusions

The paper concludes that AI chat systems can be built to impersonate characters from different works accurately. With the data and code publicly available, the authors envision further refinements to the interface and the evaluations, indicating ongoing work and potential applications of this technology. Part of the system's appeal is nostalgia: it offers fans a way to interact with beloved characters in a conversational setting.
