
Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation (1910.08684v3)

Published 19 Oct 2019 in cs.CL

Abstract: We address the issue of hallucination in data-to-text generation, i.e., reducing the generation of text that is unsupported by the source. We conjecture that hallucination can be caused by an encoder-decoder model generating content phrases without attending to the source; so we propose a confidence score to ensure that the model attends to the source whenever necessary, as well as a variational Bayes training framework that can learn the score from data. Experiments on the WikiBio (Lebret et al., 2016) dataset show that our approach is more faithful to the source than existing state-of-the-art approaches, according to both PARENT score (Dhingra et al., 2019) and human evaluation. We also report strong results on the WebNLG (Gardent et al., 2017) dataset.
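The paper's actual confidence score is learned from data via variational Bayes. As a loose illustration of the underlying intuition only (not the authors' method), one could flag decoding steps whose attention mass on the source tokens is low, since those steps are the candidates for unsupported content; the function name and threshold below are hypothetical:

```python
def low_source_attention_steps(attn, threshold=0.5):
    """Flag decoder steps whose total attention mass on source tokens
    falls below `threshold`.

    This is a simple heuristic sketch, not the learned confidence
    score from the paper.

    attn: list of rows, attn[t][s] = attention weight from decoder
          step t to source token s; a row may sum to less than 1 if
          part of the mass goes to a language-model "no-attend" slot.
    """
    return [t for t, row in enumerate(attn) if sum(row) < threshold]

# Toy example: step 1 barely attends to the source.
attn = [[0.7, 0.2],
        [0.1, 0.1],
        [0.4, 0.5]]
print(low_source_attention_steps(attn))  # -> [1]
```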

Authors (4)
  1. Ran Tian (30 papers)
  2. Shashi Narayan (35 papers)
  3. Thibault Sellam (19 papers)
  4. Ankur P. Parikh (28 papers)
Citations (90)
