
Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation (2204.01171v3)

Published 3 Apr 2022 in cs.CL, cs.AI, and cs.LG

Abstract: Current language generation models suffer from issues such as repetition, incoherence, and hallucinations. An often-repeated hypothesis is that this brittleness of generation models is caused by the training and the generation procedure mismatch, also referred to as exposure bias. In this paper, we verify this hypothesis by analyzing exposure bias from an imitation learning perspective. We show that exposure bias leads to an accumulation of errors, analyze why perplexity fails to capture this accumulation, and empirically show that this accumulation results in poor generation quality. Source code to reproduce these experiments is available at https://github.com/kushalarora/quantifying_exposure_bias
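The abstract contrasts the teacher-forced training regime with free-running generation. As a rough illustration (a hypothetical sketch, not taken from the linked repository), the snippet below shows the two decoding loops for a toy autoregressive model: under teacher forcing the model always conditions on the gold prefix, whereas in free-running generation it conditions on its own samples, so an early mistake can push later steps off-distribution and accumulate.

```python
# Minimal sketch (assumed toy model, not the paper's code) contrasting
# teacher forcing with free-running generation for an autoregressive LM.
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
embed = nn.Embedding(vocab_size, hidden)
rnn = nn.GRU(hidden, hidden, batch_first=True)
head = nn.Linear(hidden, vocab_size)

def step(tok, state):
    # One decoding step: embed the current token, update the recurrent
    # state, and return next-token logits.
    out, state = rnn(embed(tok).unsqueeze(1), state)
    return head(out.squeeze(1)), state

reference = torch.randint(0, vocab_size, (1, 20))  # ground-truth token ids

# Teacher forcing (training / perplexity evaluation): the model is always
# conditioned on the gold prefix, so its own mistakes never feed back.
state = None
for t in range(reference.size(1) - 1):
    logits, state = step(reference[:, t], state)
    # training loss would be cross_entropy(logits, reference[:, t + 1])

# Free-running generation (inference): the model conditions on its own
# previous samples, so an early error can compound over the sequence --
# the exposure-bias effect the paper quantifies.
state, tok = None, reference[:, 0]
generated = [tok]
for _ in range(reference.size(1) - 1):
    logits, state = step(tok, state)
    tok = torch.distributions.Categorical(logits=logits).sample()
    generated.append(tok)
```

Because perplexity is computed under the teacher-forced loop, it never exposes the model to its own generated prefixes, which is why it can fail to reflect the error accumulation seen in the free-running loop.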

Authors (4)
  1. Kushal Arora (13 papers)
  2. Layla El Asri (13 papers)
  3. Hareesh Bahuleyan (9 papers)
  4. Jackie Chi Kit Cheung (57 papers)
Citations (61)