
Accuracy and Political Bias of News Source Credibility Ratings by Large Language Models (2304.00228v2)

Published 1 Apr 2023 in cs.CL, cs.CY, and cs.IR

Abstract: Search engines increasingly leverage LLMs to generate direct answers, and AI chatbots now access the Internet for fresh data. As information curators for billions of users, LLMs must assess the accuracy and reliability of different sources. This paper audits eight widely used LLMs from three major providers -- OpenAI, Google, and Meta -- to evaluate their ability to discern credible and high-quality information sources from low-credibility ones. We find that while LLMs can rate most tested news outlets, larger models more frequently refuse to provide ratings due to insufficient information, whereas smaller models are more prone to hallucination in their ratings. For sources where ratings are provided, LLMs exhibit a high level of agreement among themselves (average Spearman's $\rho = 0.81$), but their ratings align only moderately with human expert evaluations (average $\rho = 0.59$). Analyzing news sources with different political leanings in the US, we observe a liberal bias in credibility ratings yielded by all LLMs in default configurations. Additionally, assigning partisan identities to LLMs consistently results in strong politically congruent bias in the ratings. These findings have important implications for the use of LLMs in curating news and political information.
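
The agreement figures in the abstract are Spearman rank correlations. As a rough illustration of what that statistic measures, the sketch below computes Spearman's $\rho$ between two lists of credibility scores. The function names (`average_ranks`, `spearman_rho`) and the example scores are hypothetical and are not taken from the paper; the only thing the abstract confirms is that Spearman's $\rho$ was the statistic reported.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with order[i].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical credibility scores for five news sources (illustrative only).
llm_ratings = [0.9, 0.8, 0.3, 0.6, 0.1]
expert_ratings = [0.95, 0.6, 0.4, 0.7, 0.2]

rho = spearman_rho(llm_ratings, expert_ratings)
print(f"Spearman's rho = {rho:.2f}")
```

Because the statistic depends only on rank order, two raters can agree strongly on which sources are more credible than others while still assigning different absolute scores.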

Authors (2)
  1. Kai-Cheng Yang (29 papers)
  2. Filippo Menczer (102 papers)
Citations (23)