
Large Language Models for Propaganda Detection (2310.06422v2)

Published 10 Oct 2023 in cs.CL and cs.AI

Abstract: The prevalence of propaganda in our digital society poses a challenge to societal harmony and the dissemination of truth. Detecting propaganda in text with NLP is difficult due to subtle manipulation techniques and contextual dependencies. To address this issue, we investigate the effectiveness of modern LLMs such as GPT-3 and GPT-4 for propaganda detection. We conduct experiments on the SemEval-2020 Task 11 dataset, which features news articles labeled with 14 propaganda techniques, framed as a multi-label classification problem. Five variants of GPT-3 and GPT-4 are employed, combining different prompt-engineering and fine-tuning strategies across the models. We evaluate the models by $F_1$ score, precision, and recall, comparing the results against the current state-of-the-art approach based on RoBERTa. Our findings show that GPT-4 achieves results comparable to the current state of the art. Finally, this study analyzes the potential and challenges of LLMs in complex tasks such as propaganda detection.
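Since the task is multi-label classification over 14 techniques, predictions are typically scored with micro-averaged precision, recall, and $F_1$. Below is a minimal sketch of that evaluation, assuming gold labels and model outputs are available as binary indicator matrices; the toy spans and label assignments are illustrative assumptions, not data from the paper.

```python
# Micro-averaged evaluation sketch for the 14-technique multi-label setup.
# The toy y_true / y_pred values below are made up for illustration.
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

NUM_TECHNIQUES = 14  # SemEval-2020 Task 11 defines 14 propaganda techniques

# Binary indicator matrices: one row per text span, one column per technique.
y_true = np.zeros((3, NUM_TECHNIQUES), dtype=int)
y_pred = np.zeros((3, NUM_TECHNIQUES), dtype=int)
y_true[0, [1, 4]] = 1   # span 0 carries techniques 1 and 4
y_pred[0, [1]] = 1      # model recovers only technique 1
y_true[1, [7]] = 1
y_pred[1, [7, 9]] = 1   # one correct label plus one false positive
# span 2: no techniques in either gold or prediction

# Micro-averaging pools decisions over all (span, technique) pairs,
# a common choice when the 14 labels are heavily imbalanced.
p = precision_score(y_true, y_pred, average="micro", zero_division=0)
r = recall_score(y_true, y_pred, average="micro", zero_division=0)
f = f1_score(y_true, y_pred, average="micro", zero_division=0)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```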

