
Fact-checking information from large language models can decrease headline discernment (2308.10800v4)

Published 21 Aug 2023 in cs.HC, cs.AI, and cs.CY

Abstract: Fact checking can be an effective strategy against misinformation, but its implementation at scale is impeded by the overwhelming volume of information online. Recent AI LLMs have shown impressive ability in fact-checking tasks, but how humans interact with fact-checking information provided by these models is unclear. Here, we investigate the impact of fact-checking information generated by a popular LLM on belief in, and sharing intent of, political news headlines in a preregistered randomized control experiment. Although the LLM accurately identifies most false headlines (90%), we find that this information does not significantly improve participants' ability to discern headline accuracy or share accurate news. In contrast, viewing human-generated fact checks enhances discernment in both cases. Subsequent analysis reveals that the AI fact-checker is harmful in specific cases: it decreases beliefs in true headlines that it mislabels as false and increases beliefs in false headlines that it is unsure about. On the positive side, AI fact-checking information increases the sharing intent for correctly labeled true headlines. When participants are given the option to view LLM fact checks and choose to do so, they are significantly more likely to share both true and false news but only more likely to believe false headlines. Our findings highlight an important source of potential harm stemming from AI applications and underscore the critical need for policies to prevent or mitigate such unintended consequences.

Analysis of LLM-Generated Fact-Checking Information and Its Effect on News Discernment

The research investigates the use of LLMs to fact-check political news and assesses their impact on public belief and sharing intentions through a randomized controlled experiment. The paper is significant both for what it reveals about LLMs' fact-checking capability and for its interdisciplinary relevance at the intersection of AI, digital misinformation, and political communication.

At the core of the research is the use of ChatGPT to evaluate the credibility of political headlines, and the question of how providing that fact-checking information influences the audience's perceptions. While LLMs such as ChatGPT have shown commendable performance on natural language processing tasks, their integration into the fact-checking domain is still nascent. This paper contributes empirical evidence on the practical applications and challenges of applying these models in a sensitive context such as misinformation detection and correction.
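
To make this setup concrete, the following minimal sketch shows how one might prompt an LLM to rate a headline as true, false, or unsure. It uses the OpenAI Python client, but the prompt wording, model choice, and response handling are illustrative assumptions, not the authors' exact protocol.

```python
# Illustrative sketch only: the prompt, model name, and parsing below are
# assumptions for demonstration, not the paper's exact protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def fact_check_headline(headline: str) -> str:
    """Ask the model for a one-word verdict on a headline: true / false / unsure."""
    prompt = (
        "I will give you a news headline. Reply with exactly one word: "
        "'true' if you believe it is accurate, 'false' if you believe it is "
        "inaccurate, or 'unsure' if you cannot tell.\n\n"
        f"Headline: {headline}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the ChatGPT model used in the study
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()
```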

Methodological Approach

The authors conducted an experiment with a representative sample of 1,548 U.S. participants, presenting them with 40 real political news headlines, half true and half false. Participants were divided into groups that assessed either belief in the headlines or intent to share them. Three experimental conditions were established: control, forced exposure to LLM-generated fact checks, and optional access to those fact checks. ChatGPT provided the fact-checking information for the treatment conditions.
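
As a rough illustration of this design (not the authors' materials or code), the snippet below simulates the random assignment of participants to the two response types and three conditions described above:

```python
# Hypothetical simulation of the 2 (response type) x 3 (condition) assignment;
# the sample size and labels mirror the description above, nothing more.
import random

N_PARTICIPANTS = 1548
RESPONSE_TYPES = ["belief", "sharing_intent"]
CONDITIONS = ["control", "forced_fact_check", "optional_fact_check"]

random.seed(42)
assignments = [
    {
        "participant_id": i,
        "response_type": random.choice(RESPONSE_TYPES),
        "condition": random.choice(CONDITIONS),
    }
    for i in range(N_PARTICIPANTS)
]
# Every participant then rates the same 40 headlines (20 true, 20 false).
```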

Principal Findings

  1. LLM Performance on Fact-Checking: ChatGPT performed well on false headlines, correctly identifying 90% of them. Its performance on true headlines was far less reliable, however: it correctly labeled only 15%, rating the majority as "unsure".
  2. Impact on Audience Discernment: Surprisingly, the availability of LLM-generated fact checks did not enhance participants' ability to discern true from false headlines in either the forced or the optional condition. Discernment, the difference in belief (or sharing intent) between true and false news, did not differ significantly between treatment and control groups (see the sketch after this list).
  3. Scenario-Specific Effects: Further analysis revealed scenario-specific, and sometimes adverse, effects of the fact-checking information. When the model mislabeled true headlines as false, participants' belief in those headlines decreased, demonstrating the harm that mislabeling can cause. Moreover, when the model was unsure about false headlines, belief in their veracity sometimes increased.
  4. User Behavior in Optional Condition: Participants who opted to view the fact-checking information were more likely to share both true and false headlines but only more likely to believe the false ones, suggesting that making fact checks available can deepen engagement with headlines without improving discernment.
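
The discernment measure referenced in point 2 can be made concrete with a small computation: mean belief in true headlines minus mean belief in false headlines, per condition. The sketch below uses a toy pandas table with hypothetical column names and values; it is not the study's data or analysis code.

```python
# Minimal sketch of the discernment measure; the data frame layout,
# column names, and numbers are hypothetical toy values.
import pandas as pd

responses = pd.DataFrame({
    "condition": ["control", "control", "forced", "forced"],
    "headline_veracity": ["true", "false", "true", "false"],
    "belief": [0.70, 0.40, 0.65, 0.35],  # average belief ratings, scaled to [0, 1]
})

mean_belief = (
    responses
    .groupby(["condition", "headline_veracity"])["belief"]
    .mean()
    .unstack("headline_veracity")
)
discernment = mean_belief["true"] - mean_belief["false"]
print(discernment)  # one discernment score per condition
```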

Implications and Future Directions

The findings open a critical discussion on the deployment of LLMs to combat misinformation, highlighting drawbacks such as misclassification risks and the unpredictable dynamics of human-AI interaction. The data suggest that deployment of AI fact-checking should be weighed carefully against the accuracy the models can actually deliver, particularly given the complex effects introduced when models express uncertainty or misclassify content.

Policy development appears necessary to mitigate unintended consequences stemming from AI fact-checking interventions. Moreover, the selective engagement by individuals with fact-checking information points to a need for understanding user intention and the motivational aspects of their interaction with AI-generated corrections.

As LLMs continue to advance, future research could explore LLM configurations tailored specifically for fact-checking, use larger and more diverse datasets, and investigate different representations of misinformation. Refining how users interact with these systems so that discernment improves without introducing bias or misplaced trust in AI is equally important.

In conclusion, while this research underscores promising aspects of AI applications in news credibility assessment, it simultaneously cautions against unintended impairments to news discernment, advocating for informed integration strategies for LLM technology within digital ecosystems.

Authors (4)
  1. Matthew R. DeVerna
  2. Harry Yaojun Yan
  3. Kai-Cheng Yang
  4. Filippo Menczer