Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification
The work presented in "Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification" tackles a central challenge in person re-identification: the substantial appearance gap between RGB and infrared (IR) images. Traditional single-modality approaches that match RGB images against RGB images falter in poorly lit environments, where IR cameras are typically used instead.
The paper introduces a method called Joint Set-Level and Instance-Level Alignment Re-ID (JSIA-ReID), which aligns the two modalities both globally at the set level and at the fine-grained instance level, addressing the imprecision that arises when only set-level alignment is used, as in previous methods. Notably, the proposed system generates cross-modality paired images that substantially strengthen the feature alignment process.
Key Contributions
- Set-Level and Instance-Level Alignment: The central innovation is a two-pronged alignment strategy. First, set-level alignment is achieved by disentangling images into modality-specific and modality-invariant features; the disentanglement explicitly suppresses modality-specific attributes while retaining the shared attributes that matter for matching identities across modalities (a minimal sketch of this idea appears after this list).
- Cross-Modality Paired-Images Generation: The paper introduces an approach that generates paired images across modalities, removing the dependence on manually paired data that existing methods would otherwise require. A generation model reconstructs images and translates them into the other modality, so that point-to-point, instance-level alignment can be applied between each image and its generated counterpart (see the second sketch after this list).
- Superior Performance on Standard Protocols: The empirical results confirm the effectiveness of the cross-modality approach. On the SYSU-MM01 dataset, the method improves Rank-1 accuracy by 9.2% and mAP by 7.7% over existing state-of-the-art techniques. These results reinforce the value of combining generative models with instance-level refinement for cross-modal representation learning.
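To make the set-level idea more concrete, below is a minimal PyTorch-style sketch of feature disentanglement with a distribution-matching loss. The module names (SharedEncoder, ModalityEncoder), the network shapes, and the moment-matching loss are illustrative assumptions for this summary, not the paper's actual architecture or objective.

```python
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Captures modality-specific cues (e.g., colour style vs. thermal style)."""

    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class SharedEncoder(nn.Module):
    """Captures modality-invariant identity features shared by RGB and IR."""

    def __init__(self, out_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, out_dim),
        )

    def forward(self, x):
        return self.net(x)


def set_level_alignment_loss(feat_rgb, feat_ir):
    """Match the feature distributions of the two modalities by their first
    and second moments -- a simple stand-in for a set-level objective."""
    mean_gap = (feat_rgb.mean(dim=0) - feat_ir.mean(dim=0)).pow(2).mean()
    var_gap = (feat_rgb.var(dim=0) - feat_ir.var(dim=0)).pow(2).mean()
    return mean_gap + var_gap


if __name__ == "__main__":
    rgb = torch.randn(8, 3, 128, 64)  # toy RGB batch
    ir = torch.randn(8, 3, 128, 64)   # toy IR batch (single channel replicated to 3)
    shared = SharedEncoder()
    spec_rgb = ModalityEncoder()(rgb)            # modality-specific part of RGB
    spec_ir = ModalityEncoder()(ir)              # modality-specific part of IR
    inv_rgb, inv_ir = shared(rgb), shared(ir)    # modality-invariant parts
    print(set_level_alignment_loss(inv_rgb, inv_ir).item())
```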
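The second sketch illustrates how a decoder could recombine a modality-invariant code from an RGB image with an IR-specific code to synthesize an IR-style counterpart, and how an instance-level loss could then pull together the features of an image and its generated pair. The Decoder design and the plain L1 alignment loss are again assumptions made for illustration; the paper's generation model is more elaborate than this.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Decoder(nn.Module):
    """Reconstructs an image from a modality-invariant (identity) code plus a
    modality-specific (style) code, so content and style can be recombined."""

    def __init__(self, inv_dim=256, spec_dim=128, out_hw=(128, 64)):
        super().__init__()
        self.out_hw = out_hw
        self.fc = nn.Linear(inv_dim + spec_dim, 64 * 8 * 4)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, inv_code, spec_code):
        z = self.fc(torch.cat([inv_code, spec_code], dim=1)).view(-1, 64, 8, 4)
        return F.interpolate(self.up(z), size=self.out_hw,
                             mode="bilinear", align_corners=False)


def instance_level_alignment_loss(feat_image, feat_generated_pair):
    """Point-to-point alignment: pull together the features of an image and its
    generated cross-modality counterpart, which depict the same person."""
    return F.l1_loss(feat_image, feat_generated_pair)


if __name__ == "__main__":
    # In the full method these codes would come from the disentangling encoders
    # applied to a real RGB image and a real IR image; random tensors stand in here.
    inv_from_rgb = torch.randn(8, 256)   # identity content of an RGB image
    spec_from_ir = torch.randn(8, 128)   # IR-specific style code
    ir_decoder = Decoder()
    fake_ir = ir_decoder(inv_from_rgb, spec_from_ir)  # IR-style pair of the RGB image
    print(fake_ir.shape)                              # torch.Size([8, 3, 128, 64])

    # Features of the original image and of its generated pair (mocked here)
    # would come from the Re-ID backbone; the loss aligns them one-to-one.
    loss = instance_level_alignment_loss(torch.randn(8, 256), torch.randn(8, 256))
    print(loss.item())
```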
Practical and Theoretical Implications
Practically, this methodology improves the robustness of person re-identification systems under varied lighting conditions, which is crucial for surveillance and security applications. The ability to match subjects accurately across RGB and IR modalities, even in challenging settings, can inform the deployment of such systems in smart cities and security-sensitive areas.
Theoretically, the paper advances the understanding of modality transformation and alignment, setting a precedent for future research on image feature disentanglement and image synthesis models. It also encourages exploration of deep learning architectures that fuse generative models with feature extraction, hinting at potential advances in other computer vision tasks that require cross-modality adaptation.
Future Prospects in AI
Looking ahead, the framework could be extended to a wider array of modalities, such as integrating LiDAR or depth information into person re-identification. Exploring adversarial training schemes that adapt dynamically to changing environments and lighting conditions could further broaden the research's applicability. More broadly, the paper lays the groundwork for more nuanced cross-modal alignment methods with applications across fields that rely on cross-spectrum recognition.
In summary, this work makes a substantial contribution to cross-modality person re-identification by addressing the limitations of prior models through its paired-image generation approach and its joint focus on set-level and instance-level alignment.