ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model (2403.06765v3)

Published 11 Mar 2024 in cs.CL

Abstract: The internet has brought both benefits and harms to society. A prime example of the latter is misinformation, including conspiracy theories, which flood the web. Recent advances in natural language processing, particularly the emergence of LLMs, have improved the prospects of accurate misinformation detection. However, most LLM-based approaches to conspiracy theory detection focus only on binary classification and fail to account for the important relationship between misinformation and affective features (i.e., sentiment and emotions). Driven by a comprehensive analysis of conspiracy text that reveals its distinctive affective features, we propose ConspEmoLLM, the first open-source LLM that integrates affective information and is able to perform diverse tasks relating to conspiracy theories. These tasks include not only conspiracy theory detection, but also classification of theory type and detection of related discussion (e.g., opinions towards theories). ConspEmoLLM is fine-tuned based on an emotion-oriented LLM using our novel ConDID dataset, which includes five tasks to support LLM instruction tuning and evaluation. We demonstrate that when applied to these tasks, ConspEmoLLM largely outperforms several open-source general domain LLMs and ChatGPT, as well as an LLM that has been fine-tuned using ConDID, but which does not use affective features. This project will be released on https://github.com/lzw108/ConspEmoLLM/.
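The abstract describes fine-tuning an emotion-oriented LLM on the ConDID dataset, which frames conspiracy-related tasks (detection, theory-type classification, related-discussion detection) as instruction-tuning examples. A minimal sketch of how such multi-task instruction examples might be packed is shown below; the task names, prompt wording, and field layout are illustrative assumptions, not the released ConDID format.

```python
# Hedged sketch: packing (instruction, input, output) triples for
# multi-task instruction tuning, in the spirit of the tasks the
# abstract lists. All task names and prompt texts are assumptions.

TASKS = {
    "detection": (
        "Determine whether the following text contains a conspiracy "
        "theory. Answer 'conspiracy' or 'non-conspiracy'."
    ),
    "type_classification": (
        "Classify the type of conspiracy theory discussed in the "
        "following text."
    ),
    "related_discussion": (
        "Identify the stance of the following text towards the "
        "conspiracy theory it discusses."
    ),
}

def build_example(task: str, text: str, label: str) -> dict:
    """Pack one instruction-tuning example for the given task."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    return {"instruction": TASKS[task], "input": text, "output": label}

sample = build_example(
    "detection",
    "5G towers are secretly spreading the virus.",
    "conspiracy",
)
print(sample["output"])  # prints: conspiracy
```

In a real pipeline, such triples would be serialized (e.g. to JSONL) and fed to a standard supervised fine-tuning loop over the base emotion-oriented model.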
