The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition (2402.19344v3)

Published 29 Feb 2024 in cs.CV

Abstract: This paper describes the 6th Affective Behavior Analysis in-the-wild (ABAW) Competition, which is part of the respective Workshop held in conjunction with IEEE CVPR 2024. The 6th ABAW Competition addresses contemporary challenges in understanding human emotions and behaviors, which are crucial for the development of human-centered technologies. In more detail, the Competition focuses on affect-related benchmarking tasks and comprises five sub-challenges: i) Valence-Arousal Estimation (the target is to estimate two continuous affect dimensions, valence and arousal), ii) Expression Recognition (the target is to recognise the mutually exclusive classes of the 7 basic expressions and 'other'), iii) Action Unit Detection (the target is to detect 12 action units), iv) Compound Expression Recognition (the target is to recognise the 7 mutually exclusive compound expression classes), and v) Emotional Mimicry Intensity Estimation (the target is to estimate six continuous emotion dimensions). In the paper, we present these Challenges, describe their respective datasets and challenge protocols (outlining the evaluation metrics), and present the baseline systems together with their obtained performance. More information on the Competition can be found at: https://affective-behavior-analysis-in-the-wild.github.io/6th.
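The abstract only outlines the evaluation metrics. For context, the Valence-Arousal track of previous ABAW editions has scored submissions with the Concordance Correlation Coefficient (CCC), averaged over the valence and arousal dimensions; a minimal sketch of that score is below (function names are illustrative, and readers should check the official protocol for the exact metric definitions used in the 6th edition):

```python
import numpy as np

def concordance_cc(preds: np.ndarray, labels: np.ndarray) -> float:
    """Concordance Correlation Coefficient (CCC) between two 1-D series.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    It is 1 only for perfect agreement, penalising both scale and
    location shifts, unlike plain Pearson correlation.
    """
    mean_p, mean_l = preds.mean(), labels.mean()
    cov = ((preds - mean_p) * (labels - mean_l)).mean()
    return 2 * cov / (preds.var() + labels.var() + (mean_p - mean_l) ** 2)

def va_score(val_pred, val_gt, aro_pred, aro_gt) -> float:
    """Challenge-style score: mean CCC over valence and arousal."""
    return 0.5 * (concordance_cc(val_pred, val_gt)
                  + concordance_cc(aro_pred, aro_gt))
```

The other tracks (Expression Recognition, Action Unit Detection, Compound Expression Recognition) have historically been scored with macro F1 variants, for which standard library implementations exist.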

