Predicting Sentence-Level Factuality of News and Bias of Media Outlets (2301.11850v4)

Published 27 Jan 2023 in cs.CL

Abstract: Automated news credibility assessment and fact-checking at scale require accurate prediction of news factuality and media bias. This paper introduces a large sentence-level dataset, titled "FactNews", composed of 6,191 sentences expertly annotated according to the factuality and media bias definitions proposed by AllSides. We use FactNews to assess the overall reliability of news sources by formulating two text classification problems: predicting the sentence-level factuality of news reporting and predicting the bias of media outlets. Our experiments demonstrate that biased sentences contain more words than factual sentences and show a predominance of emotions. Hence, a fine-grained analysis of the subjectivity and impartiality of news articles provides promising results for predicting the reliability of media outlets. Finally, given the severity of fake news and political polarization in Brazil and the lack of research for Portuguese, both the dataset and the baseline are proposed for Brazilian Portuguese.

Authors (4)
  1. Francielle Vargas
  2. Kokil Jaidka
  3. Thiago A. S. Pardo
  4. Fabrício Benevenuto