Cross-modality Person re-identification with Shared-Specific Feature Transfer (2002.12489v3)

Published 28 Feb 2020 in cs.CV

Abstract: Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis. Existing works mainly focus on learning common representation by embedding different modalities into a same feature space. However, only learning the common characteristics means great information loss, lowering the upper bound of feature distinctiveness. In this paper, we tackle the above limitation by proposing a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics to boost the re-identification performance. We model the affinities of different modality samples according to the shared features and then transfer both shared and specific features among and across modalities. We also propose a complementary feature learning strategy including modality adaption, project adversarial learning and reconstruction enhancement to learn discriminative and complementary shared and specific features of each modality, respectively. The entire cm-SSFT algorithm can be trained in an end-to-end manner. We conducted comprehensive experiments to validate the superiority of the overall algorithm and the effectiveness of each component. The proposed algorithm significantly outperforms state-of-the-arts by 22.5% and 19.3% mAP on the two mainstream benchmark datasets SYSU-MM01 and RegDB, respectively.

Cross-modality Person Re-identification with Shared-Specific Feature Transfer

The paper develops a novel approach to cross-modality person re-identification (cm-ReID), a challenging task in intelligent video analysis that involves matching identities across RGB and infrared cameras. It presents a cross-modality shared-specific feature transfer (cm-SSFT) algorithm that exploits both modality-shared and modality-specific characteristics to improve re-identification performance.

Problem Definition and Approach

Cross-modality ReID is difficult primarily because of the modality discrepancy arising from different imaging processes, which causes the loss of distinctive cues, such as color, when moving from RGB to infrared images. Previous methods either learned a unified feature space while neglecting modality-specific information, or used generative adversarial networks (GANs) to approximate the missing specific information across modalities. Both strategies limit the discriminative capacity of the learned features.

The cm-SSFT algorithm addresses this limitation with a systematic feature transfer mechanism built on affinity modeling: pairwise affinities between samples are computed from the shared features and then used to propagate both shared and specific features within and across modalities. The result is a more comprehensive representation that captures both inter-modality and intra-modality relationships.
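Concretely, the propagation can be sketched as follows. This is a minimal illustration assuming a mixed-modality batch, cosine affinities, and top-k sparsification; these choices are assumptions for exposition, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def transfer_features(shared, specific, k=5):
    """Propagate features along an affinity graph built from shared features.

    shared:   (N, d) modality-shared features for a batch mixing both modalities
    specific: (N, d) modality-specific features for the same samples
    Returns transferred features of shape (N, 2 * d).
    """
    # Affinities come from shared features only, since those are comparable
    # across modalities; specific features are not directly comparable.
    sim = F.cosine_similarity(shared.unsqueeze(1), shared.unsqueeze(0), dim=-1)  # (N, N)

    # Keep each sample's k strongest neighbours, zero the rest (illustrative sparsification).
    topk = sim.topk(k, dim=1).indices
    affinity = sim * torch.zeros_like(sim).scatter_(1, topk, 1.0)

    # Row-normalize so propagation is a weighted average over neighbours.
    affinity = affinity / affinity.sum(dim=1, keepdim=True).clamp(min=1e-6)

    # Transfer shared and specific information together along the graph, so each
    # sample is compensated with its neighbours' features from both modalities.
    fused = torch.cat([shared, specific], dim=1)  # (N, 2d)
    return affinity @ fused                       # (N, 2d)
```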

Methodology

The proposed method introduces several key components:

  1. Two-stream Feature Extractor: This module produces a modality-shared feature and a modality-specific feature for each input image, using shallow convolutional layers to capture modality-specific cues before the streams separate shared from specific information (a minimal architectural sketch follows this list).
  2. Shared-Specific Transfer Network (SSTN): SSTN models intra- and inter-modality affinities from the shared features and uses them to exchange and reinforce both shared and specific features across modalities, compensating each sample with information from its neighbors and thereby enhancing discriminability.
  3. Complementary Feature Learning Strategy: This strategy combines modality adaptation, project adversarial learning, and reconstruction enhancement to disentangle the shared and specific features and keep them discriminative and complementary (an illustrative combination of these training signals appears after the next paragraph).
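As referenced in item 1 above, here is a minimal PyTorch sketch of such a two-stream extractor. The layer sizes, the placement of the specific/shared split, and the head design are illustrative assumptions, with a small convolutional stack standing in for the deeper backbone a real implementation would use.

```python
import torch
import torch.nn as nn

class TwoStreamExtractor(nn.Module):
    """Sketch: per-modality shallow layers capture specific low-level cues;
    a weight-shared trunk and two heads emit shared and specific features."""

    def __init__(self, feat_dim=256):
        super().__init__()
        def shallow():  # modality-specific low-level stage (one copy per modality)
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.shallow = nn.ModuleDict({"rgb": shallow(), "ir": shallow()})
        self.trunk = nn.Sequential(  # weights shared by both modalities
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.shared_head = nn.Linear(128, feat_dim)  # single head for both modalities
        self.specific_head = nn.ModuleDict({         # separate head per modality
            "rgb": nn.Linear(128, feat_dim),
            "ir": nn.Linear(128, feat_dim),
        })

    def forward(self, x, modality):  # modality is "rgb" or "ir"
        h = self.trunk(self.shallow[modality](x))
        return self.shared_head(h), self.specific_head[modality](h)

# Usage: shared, specific = TwoStreamExtractor()(images, "rgb")
```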

The entire structure is trained end-to-end. On the two mainstream benchmarks, SYSU-MM01 and RegDB, cm-SSFT delivers significant gains in both single-shot and multi-shot settings, surpassing previous state-of-the-art methods by 22.5% and 19.3% mAP respectively.
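As a rough illustration of how the identification, project-adversarial, and reconstruction signals might be combined into one end-to-end objective: every head, batch key, and loss weight below is a hypothetical placeholder rather than the paper's formulation.

```python
import torch
import torch.nn.functional as F

def cm_ssft_losses(shared, specific, transferred, backbone_feat, batch, nets,
                   lam_adv=0.1, lam_rec=0.1):
    """Illustrative combination of the training signals described above.
    `nets` bundles hypothetical heads: id_classifier, modality_disc, decoder."""

    # 1) Re-identification supervision on the transferred features.
    loss_id = F.cross_entropy(nets["id_classifier"](transferred), batch["person_ids"])

    # 2) Project-adversarial term: shared features should carry no modality cue.
    #    The sign flip below stands in for a gradient-reversal layer, so the
    #    extractor is rewarded when the modality discriminator fails.
    loss_adv = F.cross_entropy(nets["modality_disc"](shared), batch["modality_labels"])

    # 3) Reconstruction enhancement: shared + specific together should be able
    #    to reproduce the backbone feature, so no information is discarded.
    recon = nets["decoder"](torch.cat([shared, specific], dim=1))
    loss_rec = F.mse_loss(recon, backbone_feat)

    return loss_id - lam_adv * loss_adv + lam_rec * loss_rec
```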

Experimental Results

The comprehensive experiments demonstrate the efficacy of cm-SSFT: the algorithm outperforms existing state-of-the-art methods across evaluation settings on both Rank-1 accuracy and mAP. Notably, the inclusion of modality-specific features contributes strongly to re-identification accuracy; the paper's analysis of feature contributions validates their utility, despite the traditionally greater emphasis on shared features in prior work.
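For context on the metrics: in cross-modality retrieval, gallery images of one modality are ranked for each query of the other; Rank-1 accuracy asks whether the nearest gallery image shares the query's identity, while mAP averages precision over all true matches. A compact sketch, assuming L2-normalizable features and that every query has at least one true match in the gallery:

```python
import numpy as np

def rank1_and_map(query_feats, query_ids, gallery_feats, gallery_ids):
    """Rank-1 accuracy and mean average precision for cross-modality retrieval
    (e.g. infrared queries against an RGB gallery)."""
    # Cosine distance between every query and every gallery sample.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    dist = 1.0 - q @ g.T  # (num_queries, num_gallery)

    rank1_hits, aps = [], []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])                   # nearest gallery first
        matches = gallery_ids[order] == query_ids[i]  # boolean relevance mask
        rank1_hits.append(matches[0])
        hit_pos = np.flatnonzero(matches)             # 0-indexed ranks of true matches
        # Average precision = mean of precision at each true-match position.
        precisions = (np.arange(len(hit_pos)) + 1) / (hit_pos + 1)
        aps.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(aps))
```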

Furthermore, the paper examines a single-query setting, showing that cm-SSFT remains effective when auxiliary images are limited, underscoring its robustness and practical relevance.

Implications and Future Work

This research contributes a pivotal step towards optimizing cm-ReID tasks, underlining the dual importance of exploiting both shared and specific modal features. The promising results indicate potential advancements in refining and applying such algorithms in real-world surveillance and security applications, where varying camera modalities are prevalent.

Future research could explore broader applicability to other cross-modality contexts, improve model optimization efficiency, and potentially leverage unsupervised and semi-supervised learning paradigms to further enhance adaptability and performance across varying domains. Additionally, integrating broader contextual data sources could refine and improve the algorithm’s real-time applicability and efficiency in large-scale environments.

In essence, the cm-SSFT not only marks a significant step in cross-modality identification systems but also sets a foundation for subsequent research targeting diverse modality interaction and information propagation for improved AI understanding and applications.

Authors (7)
  1. Yan Lu
  2. Yue Wu
  3. Bin Liu
  4. Tianzhu Zhang
  5. Baopu Li
  6. Qi Chu
  7. Nenghai Yu
Citations (237)