Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification (1912.01230v3)

Published 3 Dec 2019 in cs.CV

Abstract: Visible-infrared person re-identification (VI-ReID) is an important task in night-time surveillance applications, since visible cameras are difficult to capture valid appearance information under poor illumination conditions. Compared to traditional person re-identification that handles only the intra-modality discrepancy, VI-ReID suffers from additional cross-modality discrepancy caused by different types of imaging systems. To reduce both intra- and cross-modality discrepancies, we propose a Hierarchical Cross-Modality Disentanglement (Hi-CMD) method, which automatically disentangles ID-discriminative factors and ID-excluded factors from visible-thermal images. We only use ID-discriminative factors for robust cross-modality matching without ID-excluded factors such as pose or illumination. To implement our approach, we introduce an ID-preserving person image generation network and a hierarchical feature learning module. Our generation network learns the disentangled representation by generating a new cross-modality image with different poses and illuminations while preserving a person's identity. At the same time, the feature learning module enables our model to explicitly extract the common ID-discriminative characteristic between visible-infrared images. Extensive experimental results demonstrate that our method outperforms the state-of-the-art methods on two VI-ReID datasets. The source code is available at: https://github.com/bismex/HiCMD.

Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification

This essay gives an overview of the paper "Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification," which addresses visible-infrared person re-identification (VI-ReID): matching images of the same person across visible and infrared cameras. The task matters most in night-time surveillance, where visible cameras capture little usable appearance information under poor illumination. The paper proposes a method, Hi-CMD, that targets both the intra-modality discrepancy of ordinary re-identification and the cross-modality discrepancy specific to VI-ReID.

Problem Context

Person re-identification (ReID) means recognizing the same individual across different camera views, a problem of significant interest in security and surveillance. Traditional ReID methods work with visible-spectrum images only, so they face only intra-modality discrepancies. VI-ReID must additionally contend with the cross-modality discrepancy between visible and infrared images, which come from different types of imaging systems. This makes matching person images across the two modalities harder than in the traditional setting.

Proposed Method: Hi-CMD

The Hi-CMD framework seeks to tackle these challenges by using a Hierarchical Cross-Modality Disentanglement approach. The main contributions of this method are:

Hierarchical Disentanglement: The approach separates ID-discriminative factors from ID-excluded factors in visible and infrared images. Only the ID-discriminative factors are used for matching, so ID-excluded factors such as pose and illumination do not influence the result.
ID-preserving Person Image Generation Network: This component learns the disentangled representation by generating new cross-modality images in which pose and illumination change while the person's identity is preserved.
Hierarchical Feature Learning (HFL) Module: This module is trained together with the image generation network and explicitly extracts the ID-discriminative characteristics that visible and infrared images of the same person have in common, which are then used for cross-modality matching.
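
To make the idea of matching on ID-discriminative factors alone more concrete, the following is a minimal toy sketch, not the authors' network. The class `DisentangledCode`, the helper functions, and all numeric values are hypothetical and hand-made for illustration; in Hi-CMD the codes would be produced by learned encoders. The `swap_excluded` helper assumes that cross-modality generation can be pictured as exchanging the ID-excluded parts of two codes, which is an assumption of this sketch rather than a detail stated above.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class DisentangledCode:
    """Hypothetical container for a disentangled image code."""
    id_discriminative: tuple  # factors tied to the person's identity
    id_excluded: tuple        # factors such as pose or illumination


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def id_only_score(query, gallery):
    """Match using the ID-discriminative part only."""
    return cosine(query.id_discriminative, gallery.id_discriminative)


def entangled_score(query, gallery):
    """Baseline for comparison: match on both parts concatenated."""
    return cosine(query.id_discriminative + query.id_excluded,
                  gallery.id_discriminative + gallery.id_excluded)


def swap_excluded(a, b):
    """Keep each identity part, exchange the ID-excluded parts (assumed setup)."""
    return (DisentangledCode(a.id_discriminative, b.id_excluded),
            DisentangledCode(b.id_discriminative, a.id_excluded))


# Hand-made toy codes: same person in two modalities, plus a different person
# whose ID-excluded factors happen to resemble the visible query's.
visible = DisentangledCode((0.9, 0.1, 0.4), (1.0, 0.0))
infrared_same = DisentangledCode((0.8, 0.2, 0.5), (0.0, 1.0))
infrared_other = DisentangledCode((0.1, 0.9, 0.2), (1.0, 0.0))

if __name__ == "__main__":
    for name, candidate in [("same person", infrared_same),
                            ("other person", infrared_other)]:
        print(name,
              "id-only:", round(id_only_score(visible, candidate), 3),
              "entangled:", round(entangled_score(visible, candidate), 3))
```

With these toy values, the entangled score ranks the other person above the true match because the ID-excluded parts dominate, while the ID-only score ranks the true match first. This is only meant to show why discarding ID-excluded factors can help, not to reproduce any result from the paper.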

Experimental Results

The authors evaluated Hi-CMD on two VI-ReID datasets and report that it outperforms state-of-the-art methods. The gains appear in both rank-1 identification rate and mean Average Precision (mAP), which supports the effectiveness of the disentanglement approach. By separating ID-discriminative factors from ID-excluded ones, Hi-CMD reduces both the cross-modality and the intra-modality discrepancy, which leads to better matching accuracy.
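
For readers unfamiliar with the two metrics, here is a minimal sketch of how rank-1 rate and mAP can be computed from a query-to-gallery distance matrix. The function name `rank1_and_map` and the small distance matrix are illustrative assumptions; dataset-specific evaluation rules (for example, camera-based filtering of gallery images) are omitted, so this is not the evaluation protocol used in the paper.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (rank-1 rate, mAP) for a retrieval task.

    dist[q][g] is the distance from query q to gallery item g;
    smaller means more similar. Queries with no true match in the
    gallery are skipped.
    """
    rank1_hits = 0
    ap_sum = 0.0
    valid_queries = 0
    for q, row in enumerate(dist):
        # Gallery indices sorted from most to least similar.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue
        valid_queries += 1
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each correct match.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        ap_sum += sum(precisions) / len(precisions)
    if valid_queries == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / valid_queries, ap_sum / valid_queries


# Toy example: two queries (say, infrared) against three gallery images
# (say, visible). Identities and distances are made up.
query_ids = [1, 2]
gallery_ids = [1, 2, 1]
dist = [
    [0.2, 0.5, 0.9],  # query of identity 1
    [0.3, 0.6, 0.1],  # query of identity 2
]
rank1, mean_ap = rank1_and_map(dist, query_ids, gallery_ids)

if __name__ == "__main__":
    print("rank-1:", rank1, "mAP:", mean_ap)
```

In this toy case the first query retrieves a correct match at rank 1 and the second only at rank 3, so rank-1 is 0.5 while mAP also reflects how far down the ranking the remaining correct matches sit.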

Implications and Future Directions

Hi-CMD advances VI-ReID by tackling cross-modality discrepancies systematically through hierarchical disentanglement. Its results suggest potential for broader use in surveillance and security systems that must operate under poor illumination, where visible cameras alone are insufficient.

In future work, the research could explore extensions of the framework to other challenging cross-modality problems, as well as its application in real-world scenarios where variable environmental factors further complicate image capture and analysis. Additionally, extending this approach to other domains, such as medical imaging or multimodal sensor fusion, could offer similar advantages in disentangling complex image attributes.

In conclusion, Hi-CMD offers a structured approach to VI-ReID: it disentangles ID-discriminative factors from ID-excluded ones and matches on the former alone, contributing to both cross-modality image processing and person re-identification.

Authors (5)

Seokeon Choi
Sumin Lee
Youngeun Kim
Taekyung Kim
Changick Kim

Citations (216)

GitHub

GitHub - bismex/HiCMD: [CVPR2020] Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification (78 stars)