Deep learning for affective computing: text-based emotion recognition in decision support (1803.06397v6)
Abstract: Emotions widely affect human decision-making. This fact is taken into account by affective computing with the goal of tailoring decision support to the emotional states of individuals. However, the accurate recognition of emotions within narrative documents presents a challenging undertaking due to the complexity and ambiguity of language. Performance improvements can be achieved through deep learning; yet, as demonstrated in this paper, the specific nature of this task requires the customization of recurrent neural networks with regard to bidirectional processing, dropout layers as a means of regularization, and weighted loss functions. In addition, we propose sent2affect, a tailored form of transfer learning for affective computing: here the network is pre-trained for a different task (i.e. sentiment analysis), while the output layer is subsequently tuned to the task of emotion recognition. The resulting performance is evaluated in a holistic setting across 6 benchmark datasets, where we find that both recurrent neural networks and transfer learning consistently outperform traditional machine learning. Altogether, the findings have considerable implications for the use of affective computing.
Summary
- The paper demonstrates that customized deep learning architectures, particularly BiLSTM with dropout and weighted loss, effectively address class imbalance in emotion recognition tasks.
- It introduces sent2affect, a transfer learning method that leverages large sentiment datasets to improve performance on smaller emotion recognition benchmarks.
- Comprehensive evaluations show that tailored RNN models significantly outperform traditional approaches in both F1 score and MSE across diverse text sources.
This paper, "Deep learning for affective computing: text-based emotion recognition in decision support" (2018), addresses the challenge of accurately recognizing emotions from text, a task crucial for building decision support systems tailored to individuals' emotional states. It highlights that while deep learning offers performance improvements for sequence data like text, the specific nature of text-based emotion recognition requires significant customization beyond standard architectures. The paper demonstrates that customized recurrent neural networks (RNNs), particularly Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) networks, consistently outperform traditional machine learning methods. Furthermore, it introduces sent2affect, a novel transfer learning strategy that leverages knowledge from the related task of sentiment analysis to boost performance on emotion recognition datasets.
Problem Context and Motivation
Emotions significantly influence human decision-making, communication, and cognitive processes. Affective computing aims to detect, recognize, and predict human emotions to adapt computational systems accordingly, enabling decision support systems that can, for example, provide empathetic responses or tailor information based on detected emotional states. Text is a prevalent medium for communication, making text-based emotion recognition a key area. However, text is complex and ambiguous, making accurate emotion detection difficult. Traditional machine learning methods have been used, but they often rely on handcrafted features like bag-of-words with tf-idf, which may not capture the nuanced, context-dependent nature of emotional expression in text. Deep learning, particularly RNNs, has shown promise for sequential data but needs adaptation for the specific challenges of emotion recognition datasets, such as limited size and severe class imbalances.
Key Contributions
The main contributions of the paper are:
- Tailored Deep Learning Architecture: Identifying and implementing key modifications to standard RNNs (LSTMs/BiLSTMs) to improve performance on text-based emotion recognition datasets. These modifications include:
  - Bidirectional Processing (BiLSTM): Using two LSTMs to process text in both forward and backward directions, allowing the model to capture dependencies from both past and future words in a sequence.
  - Dropout Layers: Applying dropout within recurrent layers and between layers to prevent overfitting, which is particularly important for the relatively small datasets common in affective computing.
  - Weighted Loss Function: Implementing a loss function that weights the error for each sample inversely to the size of its class. This is crucial for handling the high class imbalance often found in emotion datasets, preventing the model from simply predicting the majority class.
- sent2affect Transfer Learning: Proposing and evaluating a novel transfer learning approach where a neural network is first trained on a large sentiment analysis dataset (classifying text as positive or negative). The final output layer is then replaced, and the network is fine-tuned on a specific emotion recognition dataset. This transfers semantic understanding gained from sentiment analysis to the more fine-grained task of emotion recognition, leveraging the larger dataset availability in sentiment analysis.
- Holistic Evaluation: Conducting a comprehensive evaluation of the proposed deep learning models against traditional machine learning baselines across six diverse benchmark datasets. These datasets cover various sources (literary tales, tweets, headlines, Facebook posts), linguistic styles, sizes, and underlying affect theories (categorical classification and dimensional regression).
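To make the bidirectional processing item above concrete, here is a minimal sketch in plain Python. A toy scalar tanh recurrence stands in for the LSTM cell (a simplification for brevity, not the paper's model), and the names `run_rnn` and `bidirectional_states`, the weights 0.5 and 1.0, and the input values are all hypothetical. It shows only how a forward and a backward state are paired at each position.

```python
import math

def run_rnn(inputs, w_h=0.5, w_x=1.0):
    """Toy recurrence h_i = tanh(w_h * h_{i-1} + w_x * x_i); stands in for an LSTM."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_h * h + w_x * x)
        states.append(h)
    return states

def bidirectional_states(inputs):
    """Pair each position's forward state with its backward state."""
    forward = run_rnn(inputs)
    # Run over the reversed sequence, then flip back so index i lines up.
    backward = run_rnn(inputs[::-1])[::-1]
    return list(zip(forward, backward))

sequence = [0.2, -0.1, 0.7]  # hypothetical scalar "embeddings"
states = bidirectional_states(sequence)
```

In this sketch, `states[0][0]` depends only on the first input, while `states[0][1]` has already seen the whole sequence read from the end, which is the extra context a unidirectional model lacks at that position.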
Methods and Implementation Details
The paper compares traditional machine learning baselines (Random Forest, Support Vector Machine) using tf-idf weighted bag-of-words features against various deep learning models based on LSTMs.
Traditional Machine Learning Baselines:
- Used Random Forest and Support Vector Machines (SVM for classification, Support Vector Regression for regression).
- Feature engineering based on bag-of-words with tf-idf weighting.
- Preprocessing included tokenization, lowercasing, punctuation/number/stop word removal, and stemming using NLTK.
- Hyperparameters optimized via manual tuning (RF) and grid search (SVM). Weighted loss used for imbalanced datasets.
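The tf-idf weighted bag-of-words features used by the baselines can be illustrated with a short, standard-library sketch. It assumes raw term frequency multiplied by log(N / document frequency), which is one common variant and not necessarily the one used in the paper; the function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dict per tokenized document.

    Weight = raw term count * log(N / document frequency). This variant is an
    assumption; libraries often add smoothing or normalization.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical documents, already lowercased, stop-word filtered, and stemmed.
docs = [["happi", "day"], ["sad", "day"], ["happi", "happi", "news"]]
vectors = tfidf(docs)
```

Terms that occur in every document receive a weight of zero under this variant, and word order is discarded entirely, which is the limitation the recurrent models are meant to address.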
Deep Learning Models:
- Based on a three-layer architecture:
  - Embedding Layer: Maps words to low-dimensional dense vectors. Can be randomly initialized (learned during training) or use pre-trained embeddings (e.g., GloVe).
  - Recurrent Layer: Processes the sequence of word embeddings. Uses LSTM or BiLSTM cells. The hidden state $h_i$ and output $o_i$ at step $i$ depend on the previous hidden state $h_{i-1}$ and the current input $e_i$. For BiLSTM, the forward and backward hidden states are concatenated: $[h_i^{\text{forward}}, h_i^{\text{backward}}]$.
  - Dense Layer: Takes the final hidden state from the recurrent layer and maps it to the output prediction.
- Customizations for Affective Computing:
  - Dropout: Applied to the recurrent layer connections (recurrent dropout) and between the recurrent and dense layers. Randomly sets a percentage of neuron outputs to zero during training.
  - Bidirectional LSTM (BiLSTM): Explained above, processes text in both directions for richer context representation.
  - Weighted Loss Function: For classification tasks (cross-entropy loss), the loss for each sample $x_i$ with ground truth label $y_i$ is weighted by $w_i$. The weight is calculated as $w_{i} = \frac{N}{K \sum_{j} \mathds{1}_{y_j=y_i}}$, where $N$ is the total number of samples, $K$ is the number of classes, and the sum in the denominator counts the samples in the class of $y_i$. This up-weights the contribution of minority classes to the total loss. For regression tasks (mean squared error), this weighting is not typically applied in the same manner.
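The class-weighting formula above can be checked with a small standard-library sketch. The function names, the toy labels, and the choice to average the weighted loss over N are assumptions made for illustration; the sketch also takes K to be the number of classes that actually occur in the labels.

```python
import math
from collections import Counter

def class_weights(labels):
    """Per-sample weights w_i = N / (K * n_c), where n_c is the size of sample i's class."""
    n_samples = len(labels)
    counts = Counter(labels)
    n_classes = len(counts)  # assumption: K = number of classes present in the data
    return [n_samples / (n_classes * counts[y]) for y in labels]

def weighted_cross_entropy(true_class_probs, weights):
    """Weighted mean of -log p(true class); dividing by N is an assumption."""
    losses = [-w * math.log(p) for p, w in zip(true_class_probs, weights)]
    return sum(losses) / len(losses)

# Hypothetical imbalanced labels: 6 majority samples, 2 minority samples.
labels = ["joy"] * 6 + ["anger"] * 2
weights = class_weights(labels)
```

With these labels, each "joy" sample gets weight 8 / (2 * 6) and each "anger" sample gets 8 / (2 * 2), so both classes contribute the same total weight and the weights sum to N. An error on a minority-class sample therefore costs three times as much as the same error on a majority-class sample.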
sent2affect Transfer Learning Implementation:
- A BiLSTM network is initialized.
- The network is trained on a large sentiment analysis dataset (100,000 tweets labeled positive/negative).
- The final dense prediction layer is removed.
- A new, randomly initialized dense layer is added, sized for the target emotion recognition task (number of emotion classes or dimensions).
- The network is then trained on the target emotion recognition dataset, either as a whole or by training the new dense layer and optionally fine-tuning the earlier layers. The transferred knowledge lies in the weights of the embedding and recurrent layers, which have learned general linguistic patterns related to affect from the large sentiment dataset.
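The layer-replacement step in the list above can be sketched without any deep learning library. The `TinyNetwork` container, the helper names, and all sizes (50-word vocabulary, 8-dimensional embeddings, 16 hidden units, 5 emotion classes) are hypothetical, and no training is performed; the sketch only shows which weights are carried over and which are reinitialized.

```python
import random
from dataclasses import dataclass

@dataclass
class TinyNetwork:
    """Stand-in for embedding, recurrent, and dense layers as plain weight matrices."""
    embedding: list
    recurrent: list
    dense: list  # one weight row per output unit

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def replace_output_layer(source, n_outputs, rng):
    """Copy embedding and recurrent weights; attach a freshly initialized dense layer."""
    hidden_size = len(source.dense[0])
    return TinyNetwork(
        embedding=[row[:] for row in source.embedding],
        recurrent=[row[:] for row in source.recurrent],
        dense=random_matrix(n_outputs, hidden_size, rng),
    )

rng = random.Random(0)
# Pretend this network was already trained on binary sentiment (one output unit).
sentiment_net = TinyNetwork(
    embedding=random_matrix(50, 8, rng),
    recurrent=random_matrix(8, 16, rng),
    dense=random_matrix(1, 16, rng),
)
emotion_net = replace_output_layer(sentiment_net, n_outputs=5, rng=rng)
# emotion_net would next be fine-tuned on the emotion dataset (not shown here).
```

Copying the weights rather than sharing them keeps the sentiment network intact, so the same pre-trained source could be reused for several target emotion datasets.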
Evaluation and Results
The evaluation covers both categorical classification and dimensional regression tasks.
- Classification: Performance is measured using the weighted average F1-score, sensitivity, and specificity across classes to account for class imbalance.
  - Results show that deep learning models consistently outperform traditional machine learning baselines.
  - BiLSTM models with pre-trained GloVe embeddings generally yield the best performance, achieving F1-score improvements of 1.6% to 23.2% over the best traditional baseline across the datasets.
  - The performance gains were generally higher on cleaner datasets (e.g., headlines) and lower on noisier datasets with severe imbalance (e.g., election tweets).
  - The proposed architectural modifications (BiLSTM, dropout, weighted loss) were necessary for deep learning models to perform effectively, as a naive LSTM often failed by predicting only the majority class.
- Regression: Performance is measured using Mean Squared Error (MSE).
  - Deep learning models, particularly BiLSTM with pre-trained embeddings, consistently outperform traditional machine learning baselines.
  - Improvements in MSE range up to 11.6% across different datasets and emotional dimensions.
- sent2affect Transfer Learning: Evaluated on two tweet datasets.
  - Showed additional performance improvements (F1-score increases of 5.6% and 6.6%) compared to a BiLSTM with pre-trained embeddings but without task transfer learning.
  - This demonstrates the value of transferring knowledge not just from different datasets but from a semantically related task like sentiment analysis, especially for smaller target datasets.
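The two headline metrics used in the evaluation can be computed as in the sketch below. It assumes that "weighted average" means per-class F1 weighted by class support (the number of true samples per class), which is the usual convention but is not spelled out in this summary; the function names and toy labels are hypothetical.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        # The denominator is positive because every label here has n >= 1.
        total += n * (2 * tp / (2 * tp + fp + fn))
    return total / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical majority-class predictor on an imbalanced label set.
y_true = ["joy"] * 8 + ["fear"] * 2
y_pred = ["joy"] * 10
score = weighted_f1(y_true, y_pred)
```

Here the minority class contributes an F1 of zero, so `score` falls below the plain accuracy of 0.8, which is how the metric penalizes a model that only ever predicts the majority class.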
Practical Applications and Implications
The findings have significant implications for practitioners and researchers:
- Implementation in Decision Support Systems: The improved emotion recognition accuracy enabled by customized deep learning can enhance various decision support applications. Examples include:
  - Customer Support/Marketing: Analyzing customer reviews or social media posts to understand emotions towards products/services, informing product development, advertising strategies, or reputation management.
  - Human-Computer Interaction: Building more empathetic chatbots or personal assistants that can detect and respond appropriately to user emotions.
  - Finance: Identifying emotional content in news or social media to inform trading decisions or predict economic trends.
  - Healthcare: Detecting emotional states related to health conditions (e.g., depression, suicidal ideation) from text communication for early intervention or diagnosis support.
  - Education: Analyzing e-learner emotions to adapt teaching strategies or provide tailored support.
  - Politics/Public Monitoring: Detecting polarizing language or hate speech in social media.
- Necessity of Customization: Practitioners adopting deep learning for affective computing should be aware that standard architectures may not suffice. Customizations like BiLSTM, dropout, and weighted loss are essential for achieving good performance, even on relatively small datasets.
- Value of Transfer Learning: When facing limited labeled data for a specific emotion recognition task, leveraging sent2affect transfer learning from sentiment analysis can be a highly effective strategy to improve model performance by utilizing larger sentiment datasets.
- Need for Standardized Datasets: The paper highlights the heterogeneity of existing emotion datasets (annotation schemes, dimensions, size). Standardized, large-scale datasets would significantly benefit future research and application development in this field, facilitating benchmarking and more effective transfer learning.
In summary, the paper provides practical guidance and empirical evidence demonstrating that customized deep learning models, specifically BiLSTMs with dropout and weighted loss functions, significantly advance the state of the art in text-based emotion recognition. The proposed sent2affect transfer learning method offers a valuable technique for leveraging widely available sentiment data to further boost performance, making accurate text-based emotion recognition more feasible for real-world decision support applications.