
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation (2401.06920v1)

Published 12 Jan 2024 in cs.CL

Abstract: Recent LLMs have been shown to be effective for misinformation detection. However, the choice of LLMs for experiments varies widely, leading to uncertain conclusions. In particular, GPT-4 is known to be strong in this domain, but it is closed source, potentially expensive, and can show instability between different versions. Meanwhile, alternative LLMs have given mixed results. In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5. This provides the research community with a solid open-source option and shows open-source models are gradually catching up on this task. We then highlight how GPT-3.5 exhibits unstable performance, such that this very widely used model could provide misleading results in misinformation detection. Finally, we validate new tools including approaches to structured output and the latest version of GPT-4 (Turbo), showing they do not compromise performance, thus unlocking them for future research and potentially enabling more complex pipelines for misinformation mitigation.
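The abstract mentions validating "approaches to structured output" for LLM-based misinformation detection. As a rough illustration only (not the paper's actual pipeline), the sketch below shows one common pattern: prompt a model for a JSON verdict on a claim, then parse and validate the reply defensively. The prompt wording, label set, and `parse_verdict` helper are all hypothetical choices for this example.

```python
import json
import re


def build_prompt(claim: str) -> str:
    """Build a zero-shot misinformation prompt requesting a JSON verdict.

    The JSON schema here (label + confidence) is an illustrative choice,
    not the format used in the paper.
    """
    return (
        "Assess the following claim for misinformation.\n"
        f"Claim: {claim}\n"
        'Respond with JSON only, e.g. {"label": "true", "confidence": 0.9}. '
        'Allowed labels: "true", "false", "unverifiable".'
    )


def parse_verdict(model_output: str) -> dict:
    """Extract the first JSON object from a model reply and validate it.

    Falls back to an 'unverifiable' verdict when the output is malformed,
    which is one simple way to keep a detection pipeline robust to
    free-form model chatter around the structured answer.
    """
    match = re.search(r"\{.*\}", model_output, re.DOTALL)
    if match:
        try:
            verdict = json.loads(match.group(0))
            if verdict.get("label") in {"true", "false", "unverifiable"}:
                return verdict
        except json.JSONDecodeError:
            pass
    return {"label": "unverifiable", "confidence": 0.0}


# Example with a mocked model reply (no model call is made here):
reply = 'Sure, here is my assessment: {"label": "false", "confidence": 0.9}'
print(parse_verdict(reply)["label"])  # false
```

In practice the reply would come from whichever model is under test (e.g. an open-source model such as Zephyr-7b or a GPT-4 variant); the parsing step is what makes the output usable downstream regardless of which backend produced it.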


