
Limits of Large Language Models in Debating Humans (2402.06049v1)

Published 6 Feb 2024 in cs.AI, cs.CL, cs.HC, and stat.AP

Abstract: LLMs have shown remarkable promise in their ability to interact proficiently with humans. Subsequently, their potential use as artificial confederates and surrogates in sociological experiments involving conversation is an exciting prospect. But how viable is this idea? This paper endeavors to test the limits of current-day LLMs with a pre-registered study integrating real people with LLM agents acting as people. The study focuses on debate-based opinion consensus formation in three environments: humans only, agents and humans, and agents only. Our goal is to understand how LLM agents influence humans, and how capable they are in debating like humans. We find that LLMs can blend in and facilitate human productivity but are less convincing in debate, with their behavior ultimately deviating from humans'. We elucidate these primary failings and anticipate that LLMs must evolve further before being viable debaters.

Authors (5)
  1. James Flamino (13 papers)
  2. Mohammed Shahid Modi (3 papers)
  3. Boleslaw K. Szymanski (100 papers)
  4. Brendan Cross (6 papers)
  5. Colton Mikolajczyk (1 paper)
Citations (3)