Emotion Classification in Low and Moderate Resource Languages (2402.18424v2)
Abstract: It is important to be able to analyze the emotional state of people around the globe. More than 7,100 languages are actively spoken around the world, and building an emotion classifier for each one is labor-intensive. This is particularly challenging for low-resource and endangered languages. We present a cross-lingual emotion classifier: we train an emotion classifier on a resource-rich language (English in our work) and transfer what it learns to low- and moderate-resource languages. We compare and contrast two approaches to transfer learning from a high-resource language to a low- or moderate-resource language. The first projects annotations from the high-resource language onto the low- or moderate-resource language through parallel corpora; the second transfers directly from the high-resource language to the other languages. We show the efficacy of our approaches on six languages: Farsi, Arabic, Spanish, Ilocano, Odia, and Azerbaijani. Our results indicate that both approaches outperform random baselines and transfer emotions across languages successfully. For all languages, direct cross-lingual transfer yields better results. We also create emotion-annotated resources for four languages: Farsi, Azerbaijani, Ilocano, and Odia.
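The annotation-projection idea described in the abstract can be illustrated with a minimal sentence-level sketch. This is not the authors' implementation: the keyword-based `toy_english_classifier`, the helper `project_annotations`, and the two example sentence pairs are all hypothetical stand-ins, and the paper's actual projection procedure may differ (for instance, it may operate on aligned words rather than whole sentences).

```python
from typing import Callable, List, Tuple


def toy_english_classifier(sentence: str) -> str:
    """Hypothetical stand-in for a trained English emotion classifier."""
    keywords = {"happy": "joy", "afraid": "fear", "angry": "anger"}
    for word in sentence.lower().split():
        # Strip trailing punctuation so "happy." still matches "happy".
        word = word.strip(".,!?")
        if word in keywords:
            return keywords[word]
    return "neutral"


def project_annotations(
    parallel_pairs: List[Tuple[str, str]],
    classify_english: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Give each target-language sentence the label predicted for its English side."""
    return [(target, classify_english(english)) for english, target in parallel_pairs]


# Illustrative (English, Spanish) sentence pairs, invented for this sketch.
pairs = [
    ("I am happy today.", "Estoy feliz hoy."),
    ("She was afraid of the storm.", "Ella tenía miedo de la tormenta."),
]

projected = project_annotations(pairs, toy_english_classifier)
for sentence, label in projected:
    print(label, sentence)
```

The projected pairs could then serve as training data for a classifier in the target language. Direct transfer, by contrast, would apply a model trained on English to target-language text without this intermediate labeling step.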