
Unsupervised Visible-Infrared ReID via Pseudo-label Correction and Modality-level Alignment (2404.06683v1)

Published 10 Apr 2024 in cs.CV

Abstract: Unsupervised visible-infrared person re-identification (UVI-ReID) has recently gained great attention due to its potential for enhancing human detection in diverse environments without labeling. Previous methods utilize intra-modality clustering and cross-modality feature matching to achieve UVI-ReID. However, two challenges remain: 1) the clustering process may generate noisy pseudo labels, and 2) cross-modality feature alignment that matches the marginal distributions of the visible and infrared modalities may misalign different identities across the two modalities. In this paper, we first conduct a theoretical analysis in which an interpretable generalization upper bound is introduced. Based on the analysis, we then propose a novel unsupervised cross-modality person re-identification framework (PRAISE). Specifically, to address the first challenge, we propose a pseudo-label correction strategy that utilizes a Beta Mixture Model to predict the probability of mis-clustering based on the network's memory effect and rectifies the correspondence by adding a perceptual term to contrastive learning. Next, to address the second challenge, we introduce a modality-level alignment strategy that generates paired visible-infrared latent features and reduces the modality gap by aligning the labeling functions of visible and infrared features, so as to learn identity-discriminative and modality-invariant features. Experimental results on two benchmark datasets demonstrate that our method achieves state-of-the-art performance compared with unsupervised visible-ReID methods.
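
To make the pseudo-label correction idea more concrete, the following is a minimal sketch of how a two-component Beta Mixture Model could score samples as likely mis-clustered. It is not the authors' implementation. It assumes that the input is a per-sample training loss, that the memory effect shows up as higher loss for wrongly clustered samples, and that the mixture is fitted with EM using a weighted method-of-moments update; the function names `beta_pdf` and `mislabel_probability` are hypothetical.

```python
import math
import random


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def mislabel_probability(losses, n_iter=30, eps=1e-4):
    """Fit a two-component Beta mixture to per-sample losses with EM and
    return, per sample, the posterior of the high-loss component.
    Assumption: high loss indicates a likely wrong pseudo label."""
    lo, hi = min(losses), max(losses)
    # Min-max normalise into (0, 1); the Beta density needs 0 < x < 1.
    x = [min(max((v - lo) / (hi - lo + 1e-12), eps), 1 - eps) for v in losses]

    # Initial responsibilities: hard split at the midpoint of the range.
    resp = [[1.0 if v <= 0.5 else 0.0 for v in x],
            [0.0 if v <= 0.5 else 1.0 for v in x]]
    params = [(1.0, 1.0), (1.0, 1.0)]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # M-step: weighted method-of-moments estimate of each Beta(a, b).
        for k in range(2):
            total = sum(resp[k]) + 1e-12
            mean = sum(r * v for r, v in zip(resp[k], x)) / total
            var = sum(r * (v - mean) ** 2 for r, v in zip(resp[k], x)) / total
            mean = min(max(mean, eps), 1 - eps)
            var = max(var, 1e-6)
            common = max(mean * (1 - mean) / var - 1, 1e-3)
            params[k] = (mean * common, (1 - mean) * common)
            weights[k] = total / len(x)
        # E-step: posterior responsibility of each component per sample.
        for i, v in enumerate(x):
            lik = [weights[k] * beta_pdf(v, *params[k]) + 1e-300
                   for k in range(2)]
            z = lik[0] + lik[1]
            resp[0][i], resp[1][i] = lik[0] / z, lik[1] / z

    # The component with the larger mean models the high-loss samples.
    means = [a / (a + b) for a, b in params]
    noisy = 0 if means[0] > means[1] else 1
    return resp[noisy]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic losses: 400 low-loss samples and 100 high-loss samples.
    demo = ([random.betavariate(2, 8) for _ in range(400)]
            + [random.betavariate(8, 2) for _ in range(100)])
    probs = mislabel_probability(demo)
    print(sum(p > 0.5 for p in probs), "samples flagged as likely mislabeled")
```

In a pipeline of this kind, the returned probabilities could be used to down-weight or re-assign suspicious pseudo labels before the contrastive loss is computed; how PRAISE combines them with its perceptual term is not specified in the abstract.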

