VI-Diff: Unpaired Visible-Infrared Translation Diffusion Model for Single Modality Labeled Visible-Infrared Person Re-identification (2310.04122v1)
Abstract: Visible-Infrared person re-identification (VI-ReID) in real-world scenarios poses a significant challenge due to the high cost of cross-modality data annotation. Different sensing cameras, such as RGB/IR cameras for good/poor lighting conditions, make it costly and error-prone to identify the same person across modalities. To overcome this, we explore the use of single-modality labeled data for the VI-ReID task, which is more cost-effective and practical. By labeling pedestrians in only one modality (e.g., visible images) and retrieving in another modality (e.g., infrared images), we aim to create a training set containing both originally labeled and modality-translated data using unpaired image-to-image translation techniques. In this paper, we propose VI-Diff, a diffusion model that effectively addresses the task of Visible-Infrared person image translation. Through comprehensive experiments, we demonstrate that VI-Diff outperforms existing diffusion and GAN models, making it a promising solution for VI-ReID with single-modality labeled data. Our approach can be a promising solution to the VI-ReID task with single-modality labeled data and serves as a good starting point for future study. Code will be available.
- One-sided unsupervised domain mapping. Advances in neural information processing systems, 30, 2017.
- Ilvr: Conditioning method for denoising diffusion probabilistic models. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 14347–14356, 2021.
- Hi-cmd: Hierarchical cross-modality disentanglement for visible-infrared person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10257–10266, 2020a.
- Stargan v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8188–8197, 2020b.
- Cross-modality person re-identification with generative adversarial training. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, page 677–683. AAAI Press, 2018.
- Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 994–1003, 2018.
- Diffusion models beat gans on image synthesis. Advances in neural information processing systems, 34:8780–8794, 2021.
- Cross-modal cross-domain dual alignment network for rgb-infrared person re-identification. IEEE Transactions on Circuits and Systems for Video Technology, 32(10):6874–6887, 2022.
- Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598, 2022.
- Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
- Sbsgan: Suppression of inter-domain background shift for person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9527–9536, 2019.
- Learning to discover cross-domain relations with generative adversarial networks. In International conference on machine learning, pages 1857–1865. PMLR, 2017.
- Diverse image-to-image translation via disentangled representations. In European Conference on Computer Vision, 2018.
- Homogeneous-to-heterogeneous: Unsupervised learning for rgb-infrared person re-identification. IEEE Transactions on Image Processing, 30:6392–6407, 2021a.
- Homogeneous-to-heterogeneous: Unsupervised learning for rgb-infrared person re-identification. IEEE Transactions on Image Processing, 30:6392–6407, 2021b.
- Cross-modality person re-identification with shared-specific feature transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13379–13389, 2020.
- SDEdit: Guided image synthesis and editing with stochastic differential equations. In International Conference on Learning Representations, 2022.
- Person recognition system based on a combination of body images from visible light and thermal cameras. Sensors, 17(3):605, 2017.
- Learning by aligning: Visible-infrared person re-identification using cross-modal correspondences. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12046–12055, 2021.
- Contrastive learning for unpaired image-to-image translation. In Computer Vision – ECCV 2020, pages 319–345, Cham, 2020. Springer International Publishing.
- U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pages 234–241. Springer, 2015.
- Palette: Image-to-image diffusion models. In ACM SIGGRAPH 2022 Conference Proceedings, pages 1–10, 2022.
- Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502, 2020.
- Dual diffusion implicit bridges for image-to-image translation. In International Conference on Learning Representations, 2023.
- Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2818–2826, 2016.
- Laurens Van Der Maaten. Accelerating t-sne using tree-based algorithms. The journal of machine learning research, 15(1):3221–3245, 2014.
- Optimal transport for label-efficient visible-infrared person re-identification. In European Conference on Computer Vision, pages 93–109. Springer, 2022.
- Person transfer gan to bridge domain gap for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 79–88, 2018.
- Rgb-infrared cross-modality person re-identification. In Proceedings of the IEEE international conference on computer vision, pages 5380–5389, 2017.
- Discover cross-modality nuances for visible-infrared person re-identification. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4328–4337, 2021.
- Holistically-nested edge detection. In Proceedings of IEEE International Conference on Computer Vision, 2015.
- Learning with twin noisy labels for visible-infrared person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14308–14317, 2022.
- Bi-directional center-constrained top-ranking for visible thermal person re-identification. IEEE Transactions on Information Forensics and Security, 15:407–419, 2019.
- Dynamic dual-attentive aggregation learning for visible-infrared person re-identification. In European Conference on Computer Vision, pages 229–247. Springer, 2020.
- Channel augmented joint learning for visible-infrared recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 13567–13576, 2021a.
- Deep learning for person re-identification: A survey and outlook. IEEE transactions on pattern analysis and machine intelligence, 44(6):2872–2893, 2021b.
- Dualgan: Unsupervised dual learning for image-to-image translation. In Proceedings of the IEEE international conference on computer vision, pages 2849–2857, 2017.
- Diverse embedding expansion network and low-light cross-modality benchmark for visible-infrared person re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2153–2162, 2023.
- Generalized cross entropy loss for training deep neural networks with noisy labels. Advances in neural information processing systems, 31, 2018.
- Egsde: Unpaired image-to-image translation via energy-guided stochastic differential equations. Advances in Neural Information Processing Systems, 35:3609–3623, 2022.
- Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pages 2223–2232, 2017.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.