
The Perils & Promises of Fact-checking with Large Language Models (2310.13549v2)

Published 20 Oct 2023 in cs.CL, cs.CY, and cs.HC

Abstract: Automated fact-checking, using machine learning to verify claims, has grown vital as misinformation spreads beyond human fact-checking capacity. LLMs like GPT-4 are increasingly trusted to write academic papers, lawsuits, and news articles and to verify information, emphasizing their role in discerning truth from falsehood and the importance of being able to verify their outputs. Understanding the capacities and limitations of LLMs in fact-checking tasks is therefore essential for ensuring the health of our information ecosystem. Here, we evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions. Importantly, in our framework, agents explain their reasoning and cite the relevant sources from the retrieved context. Our results show the enhanced prowess of LLMs when equipped with contextual information. GPT-4 outperforms GPT-3, but accuracy varies based on query language and claim veracity. While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy. Our investigation calls for further research, fostering a deeper comprehension of when agents succeed and when they fail.
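The workflow the abstract describes (phrase a query, retrieve context, decide, then explain and cite) can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`Evidence`, `Verdict`, `fact_check`, and the `stub_*` helpers) is hypothetical, and the stubs are toy stand-ins for the LLM and the search engine that a real agent would call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    url: str
    snippet: str


@dataclass
class Verdict:
    label: str            # "true", "false", or "unverified"
    explanation: str
    cited_urls: List[str]


def fact_check(
    claim: str,
    phrase_query: Callable[[str], str],
    search: Callable[[str], List[Evidence]],
    decide: Callable[[str, List[Evidence]], Verdict],
) -> Verdict:
    """Hypothetical query -> retrieve -> decide loop (illustration only)."""
    query = phrase_query(claim)      # step 1: turn the claim into a search query
    evidence = search(query)         # step 2: retrieve contextual data
    if not evidence:
        return Verdict("unverified", "No context was retrieved.", [])
    verdict = decide(claim, evidence)  # step 3: decide and explain
    # Keep only citations that point at sources that were actually retrieved.
    retrieved = {e.url for e in evidence}
    verdict.cited_urls = [u for u in verdict.cited_urls if u in retrieved]
    return verdict


# --- Toy stand-ins (assumptions); a real agent would call an LLM and a search API ---

def stub_phrase_query(claim: str) -> str:
    return claim.rstrip(".?!")


def stub_search(query: str) -> List[Evidence]:
    corpus = [
        Evidence(
            "https://example.org/a",
            "Water boils at 100 degrees Celsius at sea level.",
        )
    ]
    words = query.lower().split()
    return [e for e in corpus if any(w in e.snippet.lower() for w in words)]


def stub_decide(claim: str, evidence: List[Evidence]) -> Verdict:
    needle = claim.lower().rstrip(".")
    supported = any(needle in e.snippet.lower() for e in evidence)
    label = "true" if supported else "false"
    explanation = f"Compared the claim against {len(evidence)} retrieved source(s)."
    return Verdict(label, explanation, [e.url for e in evidence])


if __name__ == "__main__":
    result = fact_check(
        "Water boils at 100 degrees Celsius at sea level.",
        stub_phrase_query,
        stub_search,
        stub_decide,
    )
    print(result)
```

Passing the three steps in as callables keeps the loop independent of any particular model or search backend, so the same skeleton could be run with and without retrieved context to compare the two conditions.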

Authors (2)
  1. Dorian Quelle (5 papers)
  2. Alexandre Bovet (22 papers)