Compound Expression Recognition via Multi Model Ensemble (2403.12572v1)

Published 19 Mar 2024 in cs.CV and cs.AI

Abstract: Compound Expression Recognition (CER) plays a crucial role in interpersonal interactions. Because compound expressions exist, human emotional expressions are complex, and judging them requires considering both local and global facial expressions. To address this, we propose an ensemble learning approach to Compound Expression Recognition. We treat the task as classification and train three expression classification models, based on convolutional networks, Vision Transformers, and multi-scale local attention networks. We then ensemble the models through late fusion, merging their outputs to predict the final result. Our method achieves high accuracy on RAF-DB and can recognize expressions zero-shot on certain portions of C-EXPR-DB.
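
The late-fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three models' outputs are merged, so this example assumes one common choice: a weighted average of each model's softmax probabilities, followed by an argmax. The function names (`softmax`, `late_fusion`, `predict_class`), the weights, and the example logits are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(
    model_logits: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Fuse per-model logits by a weighted average of softmax probabilities.

    Assumption: averaging probabilities is only one possible late-fusion
    rule; the paper's exact rule is not given in the abstract.
    """
    if not model_logits:
        raise ValueError("need at least one model output")
    n_classes = len(model_logits[0])
    if any(len(logits) != n_classes for logits in model_logits):
        raise ValueError("all models must score the same number of classes")
    if weights is None:
        weights = [1.0] * len(model_logits)  # equal weighting by default
    if len(weights) != len(model_logits):
        raise ValueError("one weight per model is required")
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = [0.0] * n_classes
    for logits, weight in zip(model_logits, weights):
        for i, prob in enumerate(softmax(logits)):
            fused[i] += (weight / weight_sum) * prob
    return fused


def predict_class(fused_probs: Sequence[float]) -> int:
    """Return the index of the most probable class."""
    return max(range(len(fused_probs)), key=lambda i: fused_probs[i])


# Hypothetical logits over three classes from three models
# (CNN, Vision Transformer, multi-scale local attention network).
example_logits = [
    [2.0, 1.0, 0.1],  # CNN favours class 0
    [0.5, 2.5, 0.3],  # ViT favours class 1
    [0.2, 2.0, 0.4],  # local attention model favours class 1
]
fused_example = late_fusion(example_logits)
predicted_example = predict_class(fused_example)
```

With equal weights, the two models that favour class 1 outweigh the one that favours class 0, so the fused prediction is class 1. Passing `weights` lets a stronger model count for more; how such weights would be chosen (for example, from validation accuracy) is left open here.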
