MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs (2403.05814v1)

Published 9 Mar 2024 in cs.CL and cs.AI

Abstract: Despite advancements in on-topic dialogue systems, effectively managing topic shifts within dialogues remains a persistent challenge, largely attributed to the limited availability of training datasets. To address this issue, we propose Multi-Passage to Dialogue (MP2D), a data generation framework that automatically creates conversational question-answering datasets with natural topic transitions. By leveraging the relationships between entities in a knowledge graph, MP2D maps the flow of topics within a dialogue, effectively mirroring the dynamics of human conversation. It retrieves relevant passages corresponding to the topics and transforms them into dialogues through the passage-to-dialogue method. Through quantitative and qualitative experiments, we demonstrate MP2D's efficacy in generating dialogue with natural topic shifts. Furthermore, this study introduces a novel benchmark for topic shift dialogues, TS-WikiDialog. Utilizing the dataset, we demonstrate that even LLMs struggle to handle topic shifts in dialogue effectively, and we showcase the performance improvements of models trained on datasets generated by MP2D across diverse topic shift dialogue tasks.
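
The pipeline the abstract describes (map a topic flow over a knowledge graph, retrieve a passage per topic, turn the passages into question-answer turns) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy graph `TOY_KG`, the toy passages `TOY_PASSAGES`, and the helper names `sample_topic_path` and `passages_to_dialogue` are hypothetical, the path sampler is a plain random walk, and the question side of each turn is a placeholder where a real passage-to-dialogue model would generate a question.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical toy knowledge graph: entity -> list of (relation, neighbour entity).
TOY_KG: Dict[str, List[Tuple[str, str]]] = {
    "Marie Curie": [("spouse", "Pierre Curie"),
                    ("award received", "Nobel Prize in Physics")],
    "Pierre Curie": [("spouse", "Marie Curie"),
                     ("award received", "Nobel Prize in Physics")],
    "Nobel Prize in Physics": [],
}

# Hypothetical passage store: entity -> sentences of its retrieved passage.
TOY_PASSAGES: Dict[str, List[str]] = {
    "Marie Curie": ["Marie Curie conducted pioneering research on radioactivity."],
    "Pierre Curie": ["Pierre Curie was a French physicist."],
    "Nobel Prize in Physics": [
        "The Nobel Prize in Physics is awarded by the Royal Swedish Academy of Sciences."
    ],
}


def sample_topic_path(kg: Dict[str, List[Tuple[str, str]]], start: str,
                      length: int, rng: random.Random) -> List[str]:
    """Random walk over the graph that never revisits an entity.

    The resulting entity sequence stands in for the topic flow of one dialogue.
    """
    path = [start]
    while len(path) < length:
        candidates = [nbr for _, nbr in kg.get(path[-1], []) if nbr not in path]
        if not candidates:  # dead end: stop early with a shorter path
            break
        path.append(rng.choice(candidates))
    return path


def passages_to_dialogue(path: List[str],
                         passages: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Use each passage sentence as an answer and pair it with a placeholder question."""
    dialogue = []
    for topic in path:
        for sentence in passages.get(topic, []):
            # Placeholder only: a real system would generate the question with a model.
            dialogue.append((f"[question about {topic}]", sentence))
    return dialogue


if __name__ == "__main__":
    topic_path = sample_topic_path(TOY_KG, "Marie Curie", 3, random.Random(0))
    for question, answer in passages_to_dialogue(topic_path, TOY_PASSAGES):
        print("Q:", question)
        print("A:", answer)
```

In this sketch a topic shift happens wherever consecutive turns draw on passages for different entities, so the graph edge between the two entities is what links the old topic to the new one.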

Authors (6)
  1. Yerin Hwang
  2. Yongil Kim
  3. Yunah Jang
  4. Jeesoo Bang
  5. Hyunkyung Bae
  6. Kyomin Jung