Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models (2401.10845v3)
Abstract: Emotion recognition in software engineering texts is critical for understanding developer expressions and improving collaboration. This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. We evaluate six transformer models (BERT, RoBERTa, ALBERT, DeBERTa, CodeBERT, and GraphCodeBERT) against the current best-performing tool, SEntiMoji. Our analysis reveals consistent improvements ranging from 1.17% to 16.79% in macro-averaged and micro-averaged F1 scores, with general-domain models outperforming specialized ones. To further enhance the PTMs, we incorporate polarity features into the attention layer during training, yielding additional average gains of 1.0% to 10.23% over the baseline PTM approaches. Our work provides strong evidence for the advances afforded by PTMs in recognizing nuanced emotions such as Anger, Love, Fear, Joy, Sadness, and Surprise in software engineering contexts. Through comprehensive benchmarking and error analysis, we also outline the scope for improvements to address contextual gaps.
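The abstract describes injecting polarity features into the attention layer during training, but does not specify the exact mechanism. As a hedged illustration only, the sketch below shows one common way such an injection can work: scaled dot-product attention for a single query, with an additive bias so that tokens carrying stronger sentiment polarity (e.g. scored by a lexicon) receive more attention mass. The function name, the `alpha` hyperparameter, and the use of `|polarity|` as the bias are assumptions for this sketch, not the paper's confirmed design.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def polarity_biased_attention(q, keys, values, polarity, alpha=1.0):
    """Scaled dot-product attention for one query vector, with an
    additive bias toward sentiment-bearing tokens (illustrative sketch).

    q        : query vector (list of floats)
    keys     : list of key vectors, one per token
    values   : list of value vectors, one per token
    polarity : per-token polarity scores in [-1, 1]
    alpha    : weight of the polarity bias (hypothetical hyperparameter)

    Returns (context_vector, attention_weights).
    """
    d = len(q)
    # Standard scaled dot-product scores: q . k / sqrt(d)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    # Additive polarity bias: emotion-bearing tokens get larger scores.
    biased = [s + alpha * abs(p) for s, p in zip(scores, polarity)]
    weights = softmax(biased)
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
    return context, weights
```

With `alpha=0` this reduces to plain attention; raising `alpha` shifts attention toward tokens with nonzero polarity, which is the intuition behind sentiment-enhanced attention schemes.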