Homograph Attacks on Maghreb Sentiment Analyzers (2402.03171v1)
Published 5 Feb 2024 in cs.CL, cs.CR, and cs.LG
Abstract: We examine the impact of homograph attacks on the Sentiment Analysis (SA) task for different Arabic dialects from the Maghreb countries of North Africa. When the data is written in "Arabizi", homograph attacks cause a 65.3% relative decrease in transformer classification performance, from an F1-score of 0.95 to 0.33. The goal of this study is to highlight LLMs' weaknesses and to prioritize ethical and responsible Machine Learning.
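To make the attack concrete, the following is a minimal sketch of a homograph substitution on Latin-script (Arabizi-style) text, together with the arithmetic behind the reported relative drop. The substitution table, the function names `homograph_attack` and `relative_decrease`, and the sample sentence are illustrative assumptions; the abstract does not specify the character mapping or data used in the study.

```python
# Hypothetical homoglyph table: Latin letters mapped to visually similar
# Cyrillic letters. The mapping used in the paper is not given in the
# abstract, so this table is an assumption for illustration only.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
    "y": "\u0443",  # CYRILLIC SMALL LETTER U
}


def homograph_attack(text: str) -> str:
    """Replace each mapped Latin letter with its look-alike code point."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def relative_decrease(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Made-up Arabizi-style sample; not taken from the paper's dataset.
original = "had el film zwin bezzaf"
perturbed = homograph_attack(original)

# The two strings look alike on screen but differ at the code-point level,
# so a tokenizer sees different input.
print(original == perturbed)            # False
print(len(original) == len(perturbed))  # True

# F1-scores reported in the abstract: 0.95 before, 0.33 after the attack.
print(round(relative_decrease(0.95, 0.33), 1))  # 65.3
```

Because the perturbed text renders almost identically to the original, a human reader would likely assign the same sentiment, while a model's subword tokenizer receives unfamiliar code points. Normalizing confusable characters before tokenization is one plausible defense, though the abstract does not state which mitigations, if any, were evaluated.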