
CluCDD: Contrastive Dialogue Disentanglement via Clustering (2302.08146v1)

Published 16 Feb 2023 in cs.CL

Abstract: A huge number of multi-participant dialogues happen online every day, which leads to difficulty in understanding the nature of dialogue dynamics for both humans and machines. Dialogue disentanglement aims at separating an entangled dialogue into detached sessions, thus increasing the readability of long disordered dialogue. Previous studies mainly focus on message-pair classification and clustering in two-step methods, which cannot guarantee the whole clustering performance in a dialogue. To address this challenge, we propose a simple yet effective model named CluCDD, which aggregates utterances by contrastive learning. More specifically, our model pulls utterances in the same session together and pushes away utterances in different ones. Then a clustering method is adopted to generate predicted clustering labels. Comprehensive experiments conducted on the Movie Dialogue dataset and IRC dataset demonstrate that our model achieves a new state-of-the-art result.
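
The pull-together, push-apart idea in the abstract can be illustrated with a generic supervised contrastive objective followed by a clustering step. The sketch below is not the authors' implementation: the loss is a standard SupCon-style formulation, the toy 2-D embeddings stand in for encoder outputs, and `greedy_cluster`, `temperature`, and `threshold` are hypothetical names and values chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def supervised_contrastive_loss(embeddings, session_ids, temperature=0.1):
    """SupCon-style loss: small when same-session utterances are close
    and different-session utterances are far apart (assumed formulation)."""
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and session_ids[j] == session_ids[i]]
        if not positives:
            continue  # an utterance alone in its session has no positive pair
        # Scaled similarities from the anchor to every other utterance.
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(x) for x in logits.values()))
        # Average negative log-probability of the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def greedy_cluster(embeddings, threshold=0.5):
    """Stand-in clustering step: join the first cluster whose seed
    utterance is similar enough, otherwise start a new cluster."""
    seeds, labels = [], []
    for e in embeddings:
        for k, s in enumerate(seeds):
            if cosine(e, s) >= threshold:
                labels.append(k)
                break
        else:
            seeds.append(e)
            labels.append(len(seeds) - 1)
    return labels

# Toy embeddings for four utterances from two interleaved sessions.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
sessions = [0, 0, 1, 1]

loss_matching = supervised_contrastive_loss(embeddings, sessions)
loss_mismatched = supervised_contrastive_loss(embeddings, [0, 1, 0, 1])
predicted = greedy_cluster(embeddings)
```

With these toy vectors, the loss is lower when the session labels agree with the geometry of the embeddings than when they do not, and the clustering step groups the first two and last two utterances. In practice any off-the-shelf clustering method could replace the greedy step.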

Authors (5)
  1. Jingsheng Gao (16 papers)
  2. Zeyu Li (62 papers)
  3. Suncheng Xiang (27 papers)
  4. Ting Liu (329 papers)
  5. Yuzhuo Fu (24 papers)