
Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition (2401.11085v1)

Published 20 Jan 2024 in cs.CV and cs.AI

Abstract: Domain shift poses a significant challenge in Cross-Domain Facial Expression Recognition (CD-FER) due to the distribution variation across different domains. Current works mainly focus on learning domain-invariant features through global feature adaptation, while neglecting the transferability of local features. Additionally, these methods lack discriminative supervision during training on target datasets, resulting in deteriorated feature representations in the target domain. To address these limitations, we propose an Adaptive Global-Local Representation Learning and Selection (AGLRLS) framework. The framework incorporates global-local adversarial adaptation and semantic-aware pseudo label generation to enhance the learning of domain-invariant and discriminative features during training. Meanwhile, global-local prediction consistency learning is introduced to improve classification results during inference. Specifically, the framework consists of separate global-local adversarial learning modules that learn domain-invariant global and local features independently. We also design a semantic-aware pseudo label generation module, which computes semantic labels from global and local features. Moreover, a novel dynamic threshold strategy is employed to learn optimal thresholds by leveraging the independent predictions of the global and local features, ensuring that unreliable pseudo labels are filtered out while reliable ones are retained. These labels are utilized for model optimization through the adversarial learning process in an end-to-end manner. During inference, a global-local prediction consistency module is developed to automatically learn an optimal result from the multiple predictions. We conduct comprehensive experiments and analysis on a fair evaluation benchmark. The results demonstrate that the proposed framework outperforms the current competing methods by a substantial margin.
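The abstract describes two concrete training mechanisms: per-stream adversarial alignment of global and local features, and confidence-thresholded pseudo-label selection on the target domain. The paper's implementation is not reproduced here; the following is a minimal PyTorch sketch of those two ideas under stated assumptions. The gradient reversal layer is the standard device for adversarial domain alignment (not necessarily the authors' exact choice), and the fixed threshold tau stands in for the framework's learned dynamic thresholds; domain_disc, lam, and tau are illustrative names, not the paper's API.

import torch
import torch.nn.functional as F

class GradientReversal(torch.autograd.Function):
    # Identity in the forward pass; negates (and scales) the gradient in the
    # backward pass, so the feature extractor is trained to fool the domain
    # discriminator -- the standard trick for adversarial domain alignment.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def domain_adversarial_loss(feat, domain_disc, domain_label, lam=1.0):
    # Align one feature stream (the global feature, or one local region)
    # across domains by pushing it through the reversal layer into a
    # binary source-vs-target domain critic.
    logits = domain_disc(GradientReversal.apply(feat, lam)).squeeze(-1)
    return F.binary_cross_entropy_with_logits(logits, domain_label.float())

@torch.no_grad()
def select_pseudo_labels(global_logits, local_logits_list, tau=0.9):
    # Fuse the global and per-region predictions, then keep a pseudo label
    # only when every stream agrees on the class and the fused confidence
    # clears the threshold. A fixed tau is an assumption made here for
    # simplicity; the paper learns dynamic thresholds from the streams'
    # independent predictions.
    probs = F.softmax(global_logits, dim=1)
    for ll in local_logits_list:
        probs = probs + F.softmax(ll, dim=1)
    probs = probs / (1 + len(local_logits_list))
    conf, label = probs.max(dim=1)
    keep = conf >= tau
    for ll in local_logits_list:
        keep &= ll.argmax(dim=1) == label
    return label, keep

In a training loop, domain_adversarial_loss would be applied once per feature stream (the global feature and each local region feature), and select_pseudo_labels would supply the target-domain labels feeding the classification loss, mirroring the end-to-end optimization the abstract describes.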
