Unsupervised Person Re-identification by Soft Multilabel Learning (1903.06325v2)

Published 15 Mar 2019 in cs.CV

Abstract: Although unsupervised person re-identification (RE-ID) has drawn increasing research attention due to its potential to address the scalability problem of supervised RE-ID models, it is very challenging to learn discriminative information in the absence of pairwise labels across disjoint camera views. To overcome this problem, we propose a deep model for the soft multilabel learning for unsupervised RE-ID. The idea is to learn a soft multilabel (real-valued label likelihood vector) for each unlabeled person by comparing (and representing) the unlabeled person with a set of known reference persons from an auxiliary domain. We propose the soft multilabel-guided hard negative mining to learn a discriminative embedding for the unlabeled target domain by exploring the similarity consistency of the visual features and the soft multilabels of unlabeled target pairs. Since most target pairs are cross-view pairs, we develop the cross-view consistent soft multilabel learning to achieve the learning goal that the soft multilabels are consistently good across different camera views. To enable efficient soft multilabel learning, we introduce the reference agent learning to represent each reference person by a reference agent in a joint embedding. We evaluate our unified deep model on Market-1501 and DukeMTMC-reID. Our model outperforms the state-of-the-art unsupervised RE-ID methods by clear margins. Code is available at https://github.com/KovenYu/MAR.

Citations (236)

Summary

The paper presents a soft multilabel approach that leverages reference agents to extract latent discriminative features from unlabeled data.
It employs soft multilabel-guided hard negative mining and cross-view consistency to enhance the discriminative embedding across camera perspectives.
Experiments on Market-1501 and DukeMTMC-reID show clear gains over prior unsupervised methods, including a Rank-1 accuracy improvement of over 20% on DukeMTMC-reID.

An Analytical Overview of "Unsupervised Person Re-identification by Soft Multilabel Learning"

The paper "Unsupervised Person Re-identification by Soft Multilabel Learning" tackles unsupervised person re-identification (RE-ID) through soft multilabel learning. The authors propose a model that extracts discriminative information from unlabeled RE-ID data by means of soft multilabels, improving the robustness and accuracy of RE-ID systems without requiring manually labeled pairwise data across camera views.

Technical Contribution

The essence of the approach lies in representing each unlabeled target person through comparisons with a set of known reference persons in an auxiliary source dataset. This deviation from traditional methods facilitates an effective unsupervised learning paradigm that circumvents the constraints of pairwise labels, which are typically tedious and impractical to obtain at a large scale.
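The comparison with reference persons described above can be illustrated with a small sketch. It assumes, for illustration only, that the soft multilabel is a softmax over inner products between a target feature and one vector per reference person; the function name `soft_multilabel` and the toy vectors are hypothetical and not taken from the authors' code.

```python
import math

def soft_multilabel(feature, reference_agents):
    """Return a label likelihood vector for one target feature.

    Assumption: similarity to each reference agent is an inner product,
    and the similarities are normalized with a softmax.
    """
    scores = [sum(f * a for f, a in zip(feature, agent))
              for agent in reference_agents]
    # Subtract the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: one target feature compared with three reference agents.
target = [1.0, 0.0]
agents = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
label = soft_multilabel(target, agents)
```

Each entry of `label` lies in (0, 1) and the entries sum to 1, so the vector can be read as how strongly the unlabeled person resembles each reference person.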

The key contributions of the paper are synthesized in their proposed framework called deep soft multilabel reference learning (MAR), which embodies the following innovative techniques:

Soft Multilabel-Guided Hard Negative Mining: This strategy separates visually similar pairs into truly matching pairs (positives) and visually similar but different persons (hard negatives) by comparing their soft multilabels. It is the basis for learning a discriminative embedding of identities across camera views.
Cross-View Consistent Soft Multilabel Learning: Consistency across views is crucial because most RE-ID pairs are cross-view pairs. The authors propose a mechanism that keeps the learned soft multilabels consistently good across camera views, which matters in practical multi-camera surveillance systems.
Reference Agent Learning: By representing reference persons as reference agents in a joint embedding space, the method ensures that the comparisons between unlabeled target data and reference agents are meaningful and discriminative. This facilitates robust comparisons across domains, mitigating the domain shift issue often seen in RE-ID tasks.
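The hard negative mining idea in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: visual similarity is cosine similarity, soft multilabel agreement is the sum of element-wise minima, and both thresholds are fixed by hand. The names `agreement`, `mine_pairs`, and the threshold values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def agreement(y_a, y_b):
    """Assumed agreement measure: overlap of two soft multilabels."""
    return sum(min(p, q) for p, q in zip(y_a, y_b))

def mine_pairs(features, multilabels, sim_threshold, agree_threshold):
    """Split visually similar pairs into positives and hard negatives."""
    positives, hard_negatives = [], []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine(features[i], features[j]) < sim_threshold:
                continue  # not visually similar, so not considered here
            if agreement(multilabels[i], multilabels[j]) >= agree_threshold:
                positives.append((i, j))
            else:
                hard_negatives.append((i, j))
    return positives, hard_negatives

# Toy data: samples 0, 1, 2 look alike, but sample 2 has a different multilabel.
features = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
multilabels = [[0.7, 0.2, 0.1], [0.65, 0.25, 0.1],
               [0.1, 0.2, 0.7], [0.3, 0.4, 0.3]]
positives, hard_negatives = mine_pairs(features, multilabels, 0.9, 0.6)
```

In this toy data, pair (0, 1) is visually similar and has agreeing multilabels, while pairs (0, 2) and (1, 2) are visually similar but disagree, so they are treated as hard negatives.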

Experimental Evaluation

The model was evaluated on two standard benchmarks, Market-1501 and DukeMTMC-reID. It outperformed existing state-of-the-art unsupervised RE-ID models in both Rank-1 accuracy and mean average precision (mAP). For instance, on the DukeMTMC-reID dataset, MAR improved Rank-1 accuracy by over 20% compared with previous methods, demonstrating the model's efficacy in extracting and leveraging latent discriminative features from unlabeled data.
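For readers unfamiliar with the two metrics, the following sketch shows how Rank-1 accuracy and mAP are commonly computed from ranked retrieval results. It is a simplified illustration: the function name `rank1_and_map` is hypothetical, the input is assumed to be already sorted by similarity, and the benchmark-specific filtering of same-camera gallery images is omitted.

```python
def rank1_and_map(ranked_matches):
    """Compute Rank-1 accuracy and mean average precision.

    ranked_matches: one list per query, holding booleans in ranking order
    (True means the gallery image shares the query's identity).
    """
    rank1_hits = 0
    ap_sum = 0.0
    for matches in ranked_matches:
        if matches and matches[0]:
            rank1_hits += 1
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)  # precision at this hit
        # Average precision for this query (0 if it has no true match).
        ap_sum += sum(precisions) / len(precisions) if precisions else 0.0
    n = len(ranked_matches)
    return rank1_hits / n, ap_sum / n

# Toy example with two queries.
rank1, mean_ap = rank1_and_map([[True, False, True], [False, True]])
```

Rank-1 only checks whether the top-ranked gallery image is correct, whereas mAP rewards placing all true matches near the top of the ranking.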

Implications and Future Directions

From a theoretical standpoint, soft multilabels offer a way to represent unlabeled data in a detailed, comparative framework, which could inform unsupervised learning in other tasks where obtaining labels is labor-intensive or impractical. On the practical side, the approach is relevant to large-scale surveillance systems, where robust and scalable RE-ID is essential.

Looking forward, several avenues for future exploration emerge. Further refinement of soft multilabeling techniques to explore additional auxiliary datasets or more sophisticated label alignment strategies could be beneficial. Moreover, integrating domain adaptation techniques to enhance the cross-domain performance of soft multilabel frameworks remains an intriguing prospect. This work sets a valuable precedent for subsequent unsupervised and semi-supervised learning models in person re-identification and related fields.

The paper thus lays crucial groundwork for advancements in unsupervised learning paradigms, offering a substantive leap in addressing scalability and generalization issues inherent in person re-identification.
