Zero-Shot Dialogue Disentanglement by Self-Supervised Entangled Response Selection (2110.12646v2)

Published 25 Oct 2021 in cs.CL

Abstract: Dialogue disentanglement aims to group utterances in a long, multi-participant dialogue into threads. This is useful for discourse analysis and downstream applications such as dialogue response selection, where it can be the first step to construct a clean context/response set. Unfortunately, labeling all reply-to links takes quadratic effort w.r.t. the number of utterances: an annotator must check all preceding utterances to identify the one to which the current utterance is a reply. In this paper, we are the first to propose a zero-shot dialogue disentanglement solution. First, we train a model on an unannotated multi-participant response selection dataset harvested from the web; we then apply the trained model to perform zero-shot dialogue disentanglement. Without any labeled data, our model achieves a cluster F1 score of 25. We also fine-tune the model using various amounts of labeled data. Experiments show that with only 10% of the labeled data, we achieve nearly the same performance as using the full dataset. Code is released at https://github.com/chijames/zero_shot_dialogue_disentanglement.
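
The sketch below illustrates the zero-shot idea described in the abstract; it is not the authors' released code. It assumes a hypothetical pairwise scorer `score_reply(parent, child)` standing in for the response-selection model trained on unlabeled entangled dialogues, and a `threshold` for starting a new thread: each utterance is linked to its best-scoring predecessor, and threads are read off as connected components of the reply-to links.

```python
from typing import Callable, Dict, List


def disentangle(utterances: List[str],
                score_reply: Callable[[str, str], float],
                threshold: float = 0.5) -> List[List[int]]:
    """Hedged sketch: link each utterance to its best-scoring predecessor;
    if no predecessor scores above `threshold`, the utterance starts a new thread."""
    parent = list(range(len(utterances)))  # self-link marks the start of a thread
    for i in range(1, len(utterances)):
        best = max(range(i), key=lambda j: score_reply(utterances[j], utterances[i]))
        if score_reply(utterances[best], utterances[i]) >= threshold:
            parent[i] = best

    def root(i: int) -> int:
        # Follow reply-to links back to the thread head (a self-linked utterance).
        while parent[i] != i:
            i = parent[i]
        return i

    threads: Dict[int, List[int]] = {}
    for i in range(len(utterances)):
        threads.setdefault(root(i), []).append(i)
    return list(threads.values())  # each inner list is one thread of utterance indices
```

In this framing, no reply-to annotations are needed at inference time; the quality of the clusters (e.g., the cluster F1 reported in the abstract) depends entirely on how well the self-supervised scorer ranks true parents above distractors.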

Authors (2)
  1. Ta-Chung Chi (19 papers)
  2. Alexander I. Rudnicky (9 papers)
Citations (3)
