Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts (2209.14557v1)

Published 29 Sep 2022 in cs.CL

Abstract: Media coverage has a substantial effect on the public perception of events. Nevertheless, media outlets are often biased. One way to bias news articles is by altering the word choice. The automatic identification of bias by word choice is challenging, primarily due to the lack of a gold standard data set and high context dependencies. This paper presents BABE, a robust and diverse data set created by trained experts, for media bias research. We also analyze why expert labeling is essential within this domain. Our data set offers better annotation quality and higher inter-annotator agreement than existing work. It consists of 3,700 sentences balanced among topics and outlets, containing media bias labels on the word and sentence level. Based on our data, we also introduce a way to detect bias-inducing sentences in news articles automatically. Our best performing BERT-based model is pre-trained on a larger corpus consisting of distant labels. Fine-tuning and evaluating the model on our proposed supervised data set, we achieve a macro F1-score of 0.804, outperforming existing methods.

Authors (6)
  1. Timo Spinde (20 papers)
  2. Manuel Plank (2 papers)
  3. Jan-David Krieger (3 papers)
  4. Terry Ruas (46 papers)
  5. Bela Gipp (98 papers)
  6. Akiko Aizawa (74 papers)
Citations (62)
