TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models (2405.18027v1)
Abstract: While LLMs can serve as agents to simulate human behaviors (i.e., role-playing agents), we emphasize the importance of point-in-time role-playing. This situates characters at specific moments in the narrative progression for three main reasons: (i) enhancing users' narrative immersion, (ii) avoiding spoilers, and (iii) fostering engagement in fandom role-playing. To accurately represent characters at specific time points, agents must avoid character hallucination, where they display knowledge that contradicts their characters' identities and historical timelines. We introduce TimeChara, a new benchmark designed to evaluate point-in-time character hallucination in role-playing LLMs. Comprising 10,895 instances generated through an automated pipeline, this benchmark reveals significant hallucination issues in current state-of-the-art LLMs (e.g., GPT-4o). To counter this challenge, we propose Narrative-Experts, a method that decomposes the reasoning steps and utilizes narrative experts to reduce point-in-time character hallucinations effectively. Still, our findings with TimeChara highlight the ongoing challenges of point-in-time character hallucination, calling for further study.
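The core evaluation idea — a character fixed at a specific narrative moment must not exhibit knowledge of later events — can be illustrated with a minimal sketch. All names, the event list, and the keyword-based check below are hypothetical simplifications for illustration, not the paper's actual automated pipeline:

```python
# Hypothetical sketch: flag responses that mention events occurring after
# the character's fixed point in the narrative timeline (a simple proxy
# for point-in-time character hallucination).

from dataclasses import dataclass

@dataclass
class Event:
    name: str        # e.g., "the Battle of Hogwarts"
    book_index: int  # position in the series timeline

def find_future_event_mentions(response: str,
                               events: list[Event],
                               current_book: int) -> list[str]:
    """Return names of events mentioned in `response` that happen
    after `current_book` -- i.e., knowledge the character should not have."""
    return [e.name for e in events
            if e.book_index > current_book and e.name.lower() in response.lower()]

events = [
    Event("the Triwizard Tournament", 4),
    Event("the Battle of Hogwarts", 7),
]

# A character situated at book 2 should know about neither event.
flags = find_future_event_mentions(
    "I can't wait for the Battle of Hogwarts!", events, current_book=2)
print(flags)  # ['the Battle of Hogwarts']
```

A real benchmark would of course need far more than keyword matching (paraphrase, implication, and free-form judgments), which is why the paper relies on an automated generation pipeline and LLM-based evaluation rather than string lookup.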
Authors: Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, Gunhee Kim