Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters (2211.06869v4)
Abstract: In recent years, dialogue-style LLMs such as ChatGPT and GPT-4 have demonstrated immense potential for constructing open-domain dialogue agents. However, aligning these agents with specific characters or individuals remains a considerable challenge due to the complexities of character representation and the lack of comprehensive annotations. In this paper, we introduce the Harry Potter Dialogue (HPD) dataset, designed to advance the study of dialogue agents and character alignment. The dataset encompasses all dialogue sessions (in both English and Chinese) from the Harry Potter series and is annotated with vital background information, including dialogue scenes, speakers, character relationships, and attributes. These extensive annotations may empower LLMs to unlock character-driven dialogue capabilities. Furthermore, the dataset can serve as a universal benchmark for evaluating how well an LLM aligns with a specific character. We benchmark LLMs on HPD in both fine-tuning and in-context learning settings. Evaluation results reveal that although there is substantial room for improvement in generating high-quality, character-aligned responses, the proposed dataset is valuable in guiding models toward responses that better align with the character of Harry Potter.