
Fact-checking information from large language models can decrease headline discernment (2308.10800v4)

Published 21 Aug 2023 in cs.HC, cs.AI, and cs.CY

Abstract: Fact checking can be an effective strategy against misinformation, but its implementation at scale is impeded by the overwhelming volume of information online. Recent AI LLMs have shown impressive ability in fact-checking tasks, but how humans interact with fact-checking information provided by these models is unclear. Here, we investigate the impact of fact-checking information generated by a popular LLM on belief in, and sharing intent of, political news headlines in a preregistered randomized control experiment. Although the LLM accurately identifies most false headlines (90%), we find that this information does not significantly improve participants' ability to discern headline accuracy or share accurate news. In contrast, viewing human-generated fact checks enhances discernment in both cases. Subsequent analysis reveals that the AI fact-checker is harmful in specific cases: it decreases beliefs in true headlines that it mislabels as false and increases beliefs in false headlines that it is unsure about. On the positive side, AI fact-checking information increases the sharing intent for correctly labeled true headlines. When participants are given the option to view LLM fact checks and choose to do so, they are significantly more likely to share both true and false news but only more likely to believe false headlines. Our findings highlight an important source of potential harm stemming from AI applications and underscore the critical need for policies to prevent or mitigate such unintended consequences.

Analysis of LLM-Generated Fact-Checking Information and Its Effect on News Discernment

The research investigates the use of LLMs to fact-check political news and assesses their impact on public belief and sharing intentions through a randomized controlled experiment. The paper is significant both for what it reveals about LLMs' fact-checking capability and for its interdisciplinary relevance at the intersection of AI, digital misinformation, and political communication.

At the core of the research is the use of ChatGPT to evaluate the credibility of political headlines, and the question of how providing that fact-checking information influences the audience's perceptions. While LLMs such as ChatGPT have shown commendable performance on natural language processing tasks, their integration into the fact-checking domain is still nascent. This paper contributes empirical evidence on the practical applications and challenges of applying these models in a sensitive context such as misinformation detection and correction.
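
To make this setup concrete, the following minimal sketch shows how one might prompt an LLM to rate a headline as true, false, or unsure. It uses the OpenAI Python client, but the prompt wording, model choice, and response handling are illustrative assumptions, not the authors' exact protocol.

```python
# Illustrative sketch only: the prompt, model name, and parsing below are
# assumptions for demonstration, not the paper's exact protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def fact_check_headline(headline: str) -> str:
    """Ask the model for a one-word verdict on a headline: true / false / unsure."""
    prompt = (
        "I will give you a news headline. Reply with exactly one word: "
        "'true' if you believe it is accurate, 'false' if you believe it is "
        "inaccurate, or 'unsure' if you cannot tell.\n\n"
        f"Headline: {headline}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the ChatGPT model used in the study
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()
```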

Methodological Approach

The authors conducted an experiment with a representative sample of 1,548 U.S. participants, presenting them with 40 real political news headlines, half true and half false. Participants were divided into groups that assessed either belief in the headlines or intent to share them. Three experimental conditions were established: control, forced exposure to LLM-generated fact checks, and optional access to those fact checks. ChatGPT provided the fact-checking information for the treatment conditions.
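
As a rough illustration of this design (not the authors' materials or code), the snippet below simulates the random assignment of participants to the two response types and three conditions described above:

```python
# Hypothetical simulation of the 2 (response type) x 3 (condition) assignment;
# the sample size and labels mirror the description above, nothing more.
import random

N_PARTICIPANTS = 1548
RESPONSE_TYPES = ["belief", "sharing_intent"]
CONDITIONS = ["control", "forced_fact_check", "optional_fact_check"]

random.seed(42)
assignments = [
    {
        "participant_id": i,
        "response_type": random.choice(RESPONSE_TYPES),
        "condition": random.choice(CONDITIONS),
    }
    for i in range(N_PARTICIPANTS)
]
# Every participant then rates the same 40 headlines (20 true, 20 false).
```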

Principal Findings

  1. LLM Performance on Fact-Checking: ChatGPT performed well on false headlines, correctly identifying 90% of them. Its performance on true headlines was far less reliable, however: it correctly labeled only 15%, rating the majority as "unsure".
  2. Impact on Audience Discernment: Surprisingly, the availability of LLM-generated fact checks did not enhance participants' ability to discern true from false headlines in either the forced or the optional condition. Discernment, the difference in belief (or sharing intent) between true and false news, did not differ significantly between treatment and control groups (see the sketch after this list).
  3. Scenario-Specific Effects: Further analysis revealed scenario-specific, and sometimes adverse, effects of the fact-checking information. When the model mislabeled true headlines as false, participants' belief in those headlines decreased, demonstrating the harm that mislabeling can cause. Moreover, when the model was unsure about false headlines, belief in their veracity sometimes increased.
  4. User Behavior in Optional Condition: Participants who opted to view the fact-checking information were more likely to share both true and false headlines but only more likely to believe the false ones, suggesting that making fact checks available can deepen engagement with headlines without improving discernment.
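
The discernment measure referenced in point 2 can be made concrete with a small computation: mean belief in true headlines minus mean belief in false headlines, per condition. The sketch below uses a toy pandas table with hypothetical column names and values; it is not the study's data or analysis code.

```python
# Minimal sketch of the discernment measure; the data frame layout,
# column names, and numbers are hypothetical toy values.
import pandas as pd

responses = pd.DataFrame({
    "condition": ["control", "control", "forced", "forced"],
    "headline_veracity": ["true", "false", "true", "false"],
    "belief": [0.70, 0.40, 0.65, 0.35],  # average belief ratings, scaled to [0, 1]
})

mean_belief = (
    responses
    .groupby(["condition", "headline_veracity"])["belief"]
    .mean()
    .unstack("headline_veracity")
)
discernment = mean_belief["true"] - mean_belief["false"]
print(discernment)  # one discernment score per condition
```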

Implications and Future Directions

The findings open a critical discussion on the deployment of LLMs to combat misinformation, highlighting drawbacks such as misclassification risks and the unpredictable dynamics of human-AI interaction. The data suggest that deployment of AI fact-checking should be weighed carefully against the accuracy the models can actually deliver, particularly given the complex effects introduced when models express uncertainty or misclassify content.

Policy development appears necessary to mitigate unintended consequences stemming from AI fact-checking interventions. Moreover, the selective engagement by individuals with fact-checking information points to a need for understanding user intention and the motivational aspects of their interaction with AI-generated corrections.

As LLMs continue to advance, future research could explore LLM configurations tailored specifically for fact-checking, use larger and more diverse datasets, and investigate different representations of misinformation. Refining how users interact with these systems so that discernment improves without introducing bias or misplaced trust in AI is equally important.

In conclusion, while this research underscores promising aspects of AI applications in news credibility assessment, it simultaneously cautions against unintended impairments to news discernment, advocating for informed integration strategies for LLM technology within digital ecosystems.

Authors (4)
  1. Matthew R. DeVerna
  2. Harry Yaojun Yan
  3. Kai-Cheng Yang
  4. Filippo Menczer