
Robust Emotion Recognition in Context Debiasing (2403.05963v3)

Published 9 Mar 2024 in cs.CV and cs.LG

Abstract: Context-aware emotion recognition (CAER) has recently boosted the practical applications of affective computing techniques in unconstrained environments. Mainstream CAER methods invariably extract ensemble representations from diverse contexts and subject-centred characteristics to perceive the target person's emotional state. Despite these advancements, context bias interference remains the biggest challenge. This harmful bias forces models to rely on spurious correlations between background contexts and emotion labels in likelihood estimation, causing severe performance bottlenecks and confounding valuable context priors. In this paper, we propose a counterfactual emotion inference (CLEF) framework to address this issue. Specifically, we first formulate a generalized causal graph to decouple the causal relationships among the variables in CAER. Following the causal graph, CLEF introduces a non-invasive context branch to capture the adverse direct effect caused by the context bias. During inference, we eliminate the direct context effect from the total causal effect by comparing factual and counterfactual outcomes, resulting in bias mitigation and robust prediction. As a model-agnostic framework, CLEF can be readily integrated into existing methods, bringing consistent performance gains.
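The inference step described in the abstract, subtracting the direct context effect from the total effect by comparing factual and counterfactual outcomes, can be illustrated with a small sketch. This is not the authors' implementation: the log-sigmoid fusion, the constant `reference` value standing in for the "no ensemble evidence" counterfactual input, and all function names (`fuse`, `debiased_scores`, `argmax`) are assumptions made for illustration only.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def fuse(ensemble_logits, context_logits):
    # Assumed fusion rule (hypothetical): log-sigmoid of the summed logits
    # from the main ensemble branch and the context-only branch.
    return [log_sigmoid(e + c) for e, c in zip(ensemble_logits, context_logits)]


def debiased_scores(ensemble_logits, context_logits, reference=0.0):
    # Factual outcome: both branches see their real inputs.
    factual = fuse(ensemble_logits, context_logits)
    # Counterfactual outcome: the ensemble branch is replaced by a constant
    # reference (an assumption), so only the direct context effect remains.
    blocked = [reference] * len(ensemble_logits)
    counterfactual = fuse(blocked, context_logits)
    # Removing the direct context effect from the factual outcome.
    return [f - c for f, c in zip(factual, counterfactual)]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


# Toy example with three emotion classes (made-up numbers).
ensemble = [2.0, 0.5, -1.0]   # evidence from subject and context together
context = [-1.0, 3.0, 0.0]    # context-only branch, biased towards class 1

biased_prediction = argmax(fuse(ensemble, context))
debiased_prediction = argmax(debiased_scores(ensemble, context))
```

In this toy setting the factual scores favour class 1 because the context-only branch dominates, while the scores after subtraction favour class 0, the class supported by the ensemble evidence. The numbers are invented and say nothing about the paper's reported results.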


