Multilingual Fine-Grained News Headline Hallucination Detection (2407.15975v1)

Published 22 Jul 2024 in cs.CL

Abstract: The popularity of automated news headline generation has surged with advancements in pre-trained LLMs. However, these models often suffer from the "hallucination" problem, where the generated headline is not fully supported by its source article. Efforts to address this issue have predominantly focused on English, using overly simplistic classification schemes that overlook nuanced hallucination types. In this study, we introduce the first multilingual, fine-grained news headline hallucination detection dataset that contains over 11 thousand pairs in 5 languages, each annotated with detailed hallucination types by experts. We conduct extensive experiments on this dataset under two settings. First, we implement several supervised fine-tuning approaches as preparatory solutions and demonstrate this dataset's challenges and utilities. Second, we test various LLMs' in-context learning abilities and propose two novel techniques, language-dependent demonstration selection and coarse-to-fine prompting, to boost the few-shot hallucination detection performance in terms of the example-F1 metric. We release this dataset to foster further research in multilingual, fine-grained headline hallucination detection.
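
The abstract reports results in terms of example-F1 but does not define it here. A minimal sketch follows, assuming the common multi-label reading: compute F1 between the predicted and gold label sets of each example, then average over examples. The function name `example_f1`, the label strings, and the convention of scoring two empty sets as 1.0 are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set


def example_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Average per-example F1 between gold and predicted label sets.

    Assumption: this is the usual example-based F1 for multi-label
    classification; the paper's exact definition may differ.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of examples")
    if not gold:
        return 0.0

    total = 0.0
    for g, p in zip(gold, pred):
        if not g and not p:
            # Assumed convention: both sets empty counts as a perfect match.
            total += 1.0
            continue
        overlap = len(g & p)
        # F1 = 2 * |g ∩ p| / (|g| + |p|), equivalent to harmonic mean of P and R.
        total += 2.0 * overlap / (len(g) + len(p))
    return total / len(gold)


# Hypothetical label names, used only to exercise the function.
gold_labels = [{"entity_error", "event_error"}, {"supported"}]
pred_labels = [{"entity_error"}, {"event_error"}]
score = example_f1(gold_labels, pred_labels)
```

Under this reading, a prediction that recovers one of two gold labels earns partial credit on that example, which a strict exact-match accuracy would not give.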

Authors (7)
  1. Jiaming Shen (56 papers)
  2. Tianqi Liu (49 papers)
  3. Jialu Liu (21 papers)
  4. Zhen Qin (105 papers)
  5. Jay Pavagadhi (3 papers)
  6. Simon Baumgartner (10 papers)
  7. Michael Bendersky (63 papers)