Hierarchical Space-Time Attention for Micro-Expression Recognition (2405.03202v1)
Abstract: Micro-expression recognition (MER) aims to recognize the brief, subtle facial movements in micro-expression (ME) video clips, which reveal genuine emotions. Most recent MER methods use only special frames from ME video clips, or optical flow extracted from those frames. They therefore neglect the relationship between facial movements and space-time, even though facial cues are hidden within that relationship. To address this, we propose Hierarchical Space-Time Attention (HSTA). Specifically, we first process the ME video frames and the special frames or data in parallel with our cascaded Unimodal Space-Time Attention (USTA), which establishes connections between subtle facial movements and specific facial areas. We then design Crossmodal Space-Time Attention (CSTA) to achieve higher-quality fusion of crossmodal data. Finally, we hierarchically integrate USTA and CSTA to capture deeper facial cues. Our model emphasizes temporal modeling without neglecting the special data, and it fuses content from different modalities while preserving what is unique to each. Extensive experiments on four benchmarks show the effectiveness of HSTA. In particular, compared with the latest method on the CASME3 dataset, it improves the score by about 3% in seven-category classification.
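The two-stage idea in the abstract (attention within each modality, then attention across modalities) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the token counts, the feature width, the residual connections, the absence of learned projections, and the function names `unimodal_attention` and `crossmodal_attention` are all assumptions made for illustration, and the inputs are random stand-ins for video-frame tokens and special-data tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Plain scaled dot-product attention; returns the output and the weights.
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights

def unimodal_attention(tokens):
    # Assumed stand-in for a unimodal stage: tokens attend to tokens of the
    # same modality, with a residual connection (no learned projections).
    out, _ = attention(tokens, tokens, tokens)
    return tokens + out

def crossmodal_attention(target, source):
    # Assumed stand-in for a crossmodal stage: queries come from one modality,
    # keys and values from the other. The residual keeps the target's own
    # content alongside what it gathers from the source.
    out, weights = attention(target, source, source)
    return target + out, weights

rng = np.random.default_rng(0)
frame_tokens = rng.normal(size=(64, 32))   # hypothetical: 16 frames x 4 patches, width 32
special_tokens = rng.normal(size=(4, 32))  # hypothetical: 4 patches of special data

# Stage 1: each modality is processed on its own, in parallel.
frame_tokens = unimodal_attention(frame_tokens)
special_tokens = unimodal_attention(special_tokens)

# Stage 2: each modality queries the other.
fused_frames, w_fs = crossmodal_attention(frame_tokens, special_tokens)
fused_special, w_sf = crossmodal_attention(special_tokens, frame_tokens)
```

Each stream keeps its own token count after fusion (64 and 4 here), which is one simple way to mix modalities without collapsing them into a single representation.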