From Modalities to Styles: Rethinking the Domain Gap in Heterogeneous Face Recognition (2404.14247v1)
Abstract: Heterogeneous Face Recognition (HFR) focuses on matching faces from different domains, for instance, thermal to visible images, making Face Recognition (FR) systems more versatile for challenging scenarios. However, the domain gap between these modalities and the limited availability of large-scale datasets in the target HFR modalities make it challenging to develop robust HFR models from scratch. In our work, we view different modalities as distinct styles and propose a method to modulate feature maps of the target modality to address the domain gap. We present a new Conditional Adaptive Instance Modulation (CAIM) module that seamlessly fits into existing FR networks, turning them into HFR-ready systems. The CAIM block modulates intermediate feature maps, efficiently adapting to the style of the source modality and bridging the domain gap. Our method enables end-to-end training using a small set of paired samples. We extensively evaluate the proposed approach on various challenging HFR benchmarks, showing that it outperforms state-of-the-art methods. The source code and protocols for reproducing the findings will be made publicly available.
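The abstract describes CAIM as a block that modulates intermediate feature maps of a pre-trained FR network so that the target-modality "style" is aligned with the visible domain. Below is a minimal PyTorch sketch of the general idea, assuming a conditional AdaIN-style modulation; the class name, the `is_target_modality` flag, and the overall structure are illustrative assumptions and do not reproduce the paper's exact CAIM architecture.

```python
import torch
import torch.nn as nn


class ConditionalStyleModulation(nn.Module):
    """Illustrative AdaIN-style block: instance-normalizes intermediate
    feature maps and applies a learned per-channel scale/shift, applied
    only to the target-modality branch. Hypothetical sketch, not the
    paper's exact CAIM design."""

    def __init__(self, num_channels: int):
        super().__init__()
        # Instance normalization strips per-sample "style" statistics.
        self.norm = nn.InstanceNorm2d(num_channels, affine=False)
        # Learned per-channel scale and shift re-inject a reference style.
        self.gamma = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_channels, 1, 1))

    def forward(self, x: torch.Tensor, is_target_modality: bool) -> torch.Tensor:
        # Visible (reference) images pass through unchanged; only the
        # target-modality branch is modulated to bridge the style gap.
        if not is_target_modality:
            return x
        return self.gamma * self.norm(x) + self.beta


if __name__ == "__main__":
    block = ConditionalStyleModulation(num_channels=64)
    feats = torch.randn(2, 64, 56, 56)           # e.g. an early ResNet stage output
    out = block(feats, is_target_modality=True)  # thermal / NIR branch
    print(out.shape)                             # torch.Size([2, 64, 56, 56])
```

In a setup like this, blocks of this kind would be inserted between the frozen stages of an existing FR backbone and trained end-to-end on a small set of paired samples, which is consistent with the training regime the abstract describes.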