Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents (2002.07927v2)

Published 18 Feb 2020 in cs.CL and cs.HC

Abstract: Humans frequently interact with conversational agents. The rapid advancement in generative language modeling through neural networks has helped advance the creation of intelligent conversational agents. Researchers typically evaluate the output of their models through crowdsourced judgments, but there are no established best practices for conducting such studies. Moreover, it is unclear whether cognitive biases in decision-making affect crowdsourced workers' judgments when they undertake these tasks. To investigate, we conducted a between-subjects study with 77 crowdsourced workers to understand the role of cognitive biases, specifically anchoring bias, when humans are asked to evaluate the output of conversational agents. Our results provide insight into how best to evaluate conversational agents. We find that increased consistency in ratings across two experimental conditions may be a result of anchoring bias. We also determine that external factors, such as time and prior experience in similar tasks, affect inter-rater consistency.
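The study's central measurement is inter-rater consistency among crowdsourced judges. The abstract does not specify which agreement statistic was used, so as a hedged illustration, here is a minimal sketch of Fleiss' kappa, one standard measure of agreement among multiple raters; the statistic choice and the toy data are assumptions for illustration only.

```python
# Illustrative sketch (not from the paper): Fleiss' kappa as one way to
# quantify inter-rater consistency for crowdsourced ratings.
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Fleiss' kappa for an (items x categories) matrix of rating counts.

    counts[i, j] = number of raters who assigned item i to category j.
    Assumes every item was rated by the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    n_items, _ = counts.shape
    n_raters = counts.sum(axis=1)[0]  # raters per item (constant by assumption)

    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_i = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()

    # Chance agreement from the marginal category proportions.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    p_e = np.sum(p_j**2)

    return (p_bar - p_e) / (1 - p_e)

# Hypothetical data: 5 dialogue responses, 4 raters each, 3 rating categories.
ratings = np.array([
    [4, 0, 0],
    [0, 4, 0],
    [3, 1, 0],
    [0, 2, 2],
    [1, 1, 2],
])
print(f"Fleiss' kappa: {fleiss_kappa(ratings):.3f}")
```

Comparing such an agreement score between the anchored and unanchored conditions is one way the paper's finding (higher consistency under anchoring) could be operationalized.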

Authors (3)
  1. Sashank Santhanam (15 papers)
  2. Alireza Karduni (12 papers)
  3. Samira Shaikh (18 papers)
Citations (25)
