DWReCO at CheckThat! 2023: Enhancing Subjectivity Detection through Style-based Data Sampling (2307.03550v1)

Published 7 Jul 2023 in cs.CL, cs.CY, and cs.LG

Abstract: This paper describes our submission for the subjectivity detection task at the CheckThat! Lab. To tackle class imbalances in the task, we have generated additional training materials with GPT-3 models using prompts of different styles from a subjectivity checklist based on journalistic perspective. We used the extended training set to fine-tune language-specific transformer models. Our experiments in English, German and Turkish demonstrate that different subjective styles are effective across all languages. In addition, we observe that the style-based oversampling is better than paraphrasing in Turkish and English. Lastly, the GPT-3 models sometimes produce lacklustre results when generating style-based texts in non-English languages.
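The style-based oversampling idea in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the style names, the `SUBJ`/`OBJ` labels, the prompt wording, and the function `plan_style_prompts` are all assumptions made for the example, and no GPT-3 call is made. The sketch only works out how many extra minority-class texts would be needed to balance a two-class training set and rotates through a list of styles to build one generation prompt per missing example.

```python
from collections import Counter
from itertools import cycle

# Hypothetical style names; the paper's actual subjectivity checklist
# is not reproduced here.
STYLES = ["style_a", "style_b", "style_c"]


def plan_style_prompts(labels, styles=STYLES):
    """Plan prompts that would balance a two-class label list.

    Assumes exactly two classes are present. Returns the minority label
    and one prompt per example missing from the minority class, cycling
    through the given styles.
    """
    counts = Counter(labels)
    (_, n_major), (minority, n_minor) = counts.most_common(2)
    deficit = n_major - n_minor  # number of texts to generate
    style_cycle = cycle(styles)
    prompts = [
        f"Write one {minority} news sentence in the '{next(style_cycle)}' style."
        for _ in range(deficit)
    ]
    return minority, prompts


# Illustrative label distribution, not taken from the paper.
labels = ["OBJ"] * 5 + ["SUBJ"] * 2
minority, prompts = plan_style_prompts(labels)
for p in prompts:
    print(p)
```

In a real setup, each prompt would be sent to a text-generation model and the returned sentences appended to the training set before fine-tuning; that step is left out here because the abstract does not specify it.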

