Multimodal Sentiment Analysis: A Survey (2305.07611v3)
Abstract: Multimodal sentiment analysis has become an important research area in artificial intelligence, and recent advances in deep learning have pushed the technology to new heights. Its strong potential for both application and research has made it a popular topic. This survey reviews the definition, background, and development of multimodal sentiment analysis; covers recent datasets and state-of-the-art models; and highlights the field's challenges, open problems, and future research directions. It also offers constructive suggestions for promising research directions and for building better-performing multimodal sentiment analysis models, which can help researchers in this field.
- Songning Lai
- Xifeng Hu
- Haoxuan Xu
- Zhaoxia Ren
- Zhi Liu