
User Response and Sentiment Prediction for Automatic Dialogue Evaluation (2111.08808v2)

Published 16 Nov 2021 in cs.CL

Abstract: Automatic evaluation is beneficial for open-domain dialogue system development. However, standard word-overlap metrics (BLEU, ROUGE) do not correlate well with human judgements of open-domain dialogue systems. In this work, we propose to use the sentiment of the next user utterance for turn-level or dialogue-level evaluation. Specifically, we propose three methods: one that predicts the next sentiment directly, and two others that predict the next user utterance using an utterance or a feedback generator model and then classify its sentiment. Experiments show our model outperforming existing automatic evaluation metrics on both written and spoken open-domain dialogue datasets.
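
The generate-then-classify idea in the abstract can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in, not the authors' implementation: `toy_generator` replaces a trained utterance or feedback generator, and the word lists in `classify_sentiment` replace a trained sentiment classifier. The sketch only shows how a predicted next user utterance could be turned into a turn-level score.

```python
from typing import Callable, List

# Hypothetical stand-in lexicons; the paper uses trained models, not word lists.
POSITIVE = {"great", "thanks", "love", "cool", "yes"}
NEGATIVE = {"no", "wrong", "stop", "boring", "bad"}


def classify_sentiment(utterance: str) -> float:
    """Toy sentiment score in [-1, 1] based on lexicon word counts."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def score_turn(
    context: List[str],
    response: str,
    predict_next_user_utterance: Callable[[List[str], str], str],
) -> float:
    """Score a system response by the sentiment of the predicted user reply."""
    predicted = predict_next_user_utterance(context, response)
    return classify_sentiment(predicted)


def toy_generator(context: List[str], response: str) -> str:
    # Hypothetical placeholder for a trained utterance or feedback generator.
    if "recommend" in response.lower():
        return "Thanks, that is great!"
    return "No, that is wrong."


good = score_turn(["Any good movies?"], "I recommend Inception.", toy_generator)
bad = score_turn(["Any good movies?"], "I like turtles.", toy_generator)
```

Under these assumptions, a response expected to draw a positive user reply gets a higher score than one expected to draw a negative reply. The direct variant described in the abstract would skip the generator and predict the sentiment from the context and response alone.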

Authors (5)
  1. Sarik Ghazarian (13 papers)
  2. Behnam Hedayatnia (27 papers)
  3. Alexandros Papangelis (23 papers)
  4. Yang Liu (2253 papers)
  5. Dilek Hakkani-Tur (94 papers)
Citations (3)