
ChatGPT vs Gemini vs LLaMA on Multilingual Sentiment Analysis (2402.01715v1)

Published 25 Jan 2024 in cs.CL and cs.AI

Abstract: Automated sentiment analysis using LLM-based models like ChatGPT, Gemini or LLaMA2 is becoming widespread, both in academic research and in industrial applications. However, assessment and validation of their performance on ambiguous or ironic text is still poor. In this study, we constructed nuanced and ambiguous scenarios, translated them into 10 languages, and predicted their associated sentiment using popular LLMs. The results are validated against post-hoc human responses. ChatGPT and Gemini often handle ambiguous scenarios well, but we recognise significant biases and inconsistent performance across models and evaluated human languages. This work provides a standardised methodology for automated sentiment analysis evaluation and makes a call for action to further improve the algorithms and their underlying data, in order to enhance their performance, interpretability and applicability.
