Multi-Energy Guided Image Translation with Stochastic Differential Equations for Near-Infrared Facial Expression Recognition (2312.05908v1)
Abstract: Illumination variation has been a long-standing challenge in real-world facial expression recognition (FER). Under uncontrolled or non-visible light conditions, near-infrared (NIR) imaging can provide a simple alternative for obtaining high-quality images and supplementing the geometric and texture details that are missing in the visible (VIS) domain. Because large-scale NIR facial expression datasets are lacking, directly extending VIS FER methods to the NIR spectrum may be ineffective. Additionally, previous heterogeneous image synthesis methods are restricted by low controllability without prior task knowledge. To tackle these issues, we present the first approach, called NIR-FER Stochastic Differential Equations (NFER-SDE), that transforms facial expression appearance between heterogeneous modalities to address the overfitting problem on small-scale NIR data. NFER-SDE takes the whole VIS source image as input and, together with domain-specific knowledge, guides the preservation of modality-invariant information in the high-frequency content of the image. Extensive experiments and ablation studies show that NFER-SDE significantly improves the performance of NIR FER and achieves state-of-the-art results on the only two available NIR FER datasets, Oulu-CASIA and Large-HFE.
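As a rough illustration of what preserving high-frequency content could look like in practice, the sketch below scores a candidate image by how far its high-frequency residual lies from that of the source image. It is not the NFER-SDE energy function, whose exact form the abstract does not give. The box-filter low-pass, the function names (`low_pass`, `high_freq`, `high_freq_energy`), and the kernel size `k` are all assumptions chosen for illustration.

```python
import numpy as np

def low_pass(img, k=5):
    """Blur a 2D image with a k x k mean filter (edge-padded). Hypothetical helper."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    # Sum the k*k shifted copies of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def high_freq(img, k=5):
    """High-frequency residual: the image minus its low-pass version."""
    return img - low_pass(img, k)

def high_freq_energy(candidate, source, k=5):
    """Mean squared difference between the high-frequency residuals.

    Assumed stand-in for a guidance energy: lower values mean the candidate
    keeps more of the source's fine detail (edges, texture).
    """
    diff = high_freq(candidate, k) - high_freq(source, k)
    return float(np.mean(diff ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.random((16, 16))
    # A uniform brightness shift is a low-frequency change, so the energy stays near zero.
    print(high_freq_energy(src + 0.3, src))
    # Added pixel noise changes fine detail, so the energy grows.
    print(high_freq_energy(src + 0.1 * rng.standard_normal((16, 16)), src))
```

In an energy-guided sampler, the gradient of such a score with respect to the candidate image would typically be used to steer each denoising step. That coupling is omitted here because it depends on details not stated in the abstract.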