AI-UPV at EXIST 2023 -- Sexism Characterization Using Large Language Models Under The Learning with Disagreements Regime (2307.03385v1)
Abstract: With the increasing influence of social media platforms, it has become crucial to develop automated systems capable of detecting sexism and other disrespectful and hateful behaviors in order to promote a more inclusive and respectful online environment. These tasks are nevertheless challenging, given the variety of hate categories and author intentions, especially under the learning with disagreements regime. This paper describes the AI-UPV team's participation in the EXIST (sEXism Identification in Social neTworks) Lab at CLEF 2023. The proposed approach addresses sexism identification and characterization under the learning with disagreements paradigm by training directly on the data with disagreements, without using any aggregated label. Performance is nonetheless reported under both soft and hard evaluations. The proposed system uses LLMs (i.e., mBERT and XLM-RoBERTa) and ensemble strategies for sexism identification and classification in English and Spanish, and is organized into three different pipelines. The ensemble approach outperformed the individual LLMs, obtaining the best performance under both soft and hard label evaluation. The team participated in all three EXIST tasks. Under the soft evaluation, the system obtained fourth place in Task 2 and first place in Task 3, with the highest ICM-Soft of -2.32 and a normalized ICM-Soft of 0.79. The source code of our approaches is publicly available at https://github.com/AngelFelipeMP/Sexism-LLM-Learning-With-Disagreement.
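To make the idea of training on disagreements without an aggregated label more concrete, the following is a minimal sketch and not the authors' implementation. It assumes six hypothetical annotator votes per tweet, a cross-entropy loss against the resulting vote distribution, and a simple mean over model outputs as the ensemble rule; the abstract does not specify the loss or the ensemble strategy, and all function names and logit values here are illustrative.

```python
import math
from collections import Counter

def soft_label(votes, classes):
    """Turn raw annotator votes into a probability distribution over classes."""
    counts = Counter(votes)
    return [counts[c] / len(votes) for c in classes]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def ensemble_mean(distributions):
    """Average several predicted distributions (assumed ensemble rule)."""
    n = len(distributions)
    size = len(distributions[0])
    return [sum(d[i] for d in distributions) / n for i in range(size)]

classes = ["YES", "NO"]
# Hypothetical votes from six annotators for one tweet.
votes = ["YES", "YES", "NO", "YES", "NO", "YES"]
target = soft_label(votes, classes)

# Made-up logits standing in for two fine-tuned models (e.g. mBERT, XLM-RoBERTa).
model_a = softmax([1.2, 0.3])
model_b = softmax([0.4, 0.1])
ensemble = ensemble_mean([model_a, model_b])

# Soft evaluation compares distributions; hard evaluation takes the argmax.
loss = soft_cross_entropy(target, ensemble)
hard = classes[max(range(len(classes)), key=ensemble.__getitem__)]
```

In this sketch the soft target keeps the 4-to-2 split among annotators instead of collapsing it to a majority label, which is the property the learning with disagreements setting is meant to preserve. A hard prediction can still be derived from the same output for hard-label evaluation.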