
POSSCORE: A Simple Yet Effective Evaluation of Conversational Search with Part of Speech Labelling (2109.03039v1)

Published 7 Sep 2021 in cs.IR, cs.AI, and cs.CL

Abstract: Conversational search systems, such as Google Assistant and Microsoft Cortana, provide a new search paradigm where users are allowed, via natural language dialogues, to communicate with search systems. Evaluating such systems is very challenging since search results are presented in the format of natural language sentences. Given the unlimited number of possible responses, collecting relevance assessments for all the possible responses is infeasible. In this paper, we propose POSSCORE, a simple yet effective automatic evaluation method for conversational search. The proposed embedding-based metric takes the influence of part of speech (POS) of the terms in the response into account. To the best of our knowledge, our work is the first to systematically demonstrate the importance of incorporating syntactic information, such as POS labels, for conversational search evaluation. Experimental results demonstrate that our metrics can correlate with human preference, achieving significant improvements over state-of-the-art baseline metrics.

Authors (4)
  1. Zeyang Liu (13 papers)
  2. Ke Zhou (48 papers)
  3. Jiaxin Mao (47 papers)
  4. Max L. Wilson (4 papers)
Citations (2)