ICE: Inter-instance Contrastive Encoding for Unsupervised Person Re-identification (2103.16364v2)

Published 30 Mar 2021 in cs.CV

Abstract: Unsupervised person re-identification (ReID) aims at learning discriminative identity features without annotations. Recently, self-supervised contrastive learning has gained increasing attention for its effectiveness in unsupervised representation learning. The main idea of instance contrastive learning is to match the same instance in different augmented views. However, the relationship between different instances has not been fully explored in previous contrastive methods, especially for instance-level contrastive loss. To address this issue, we propose Inter-instance Contrastive Encoding (ICE) that leverages inter-instance pairwise similarity scores to boost previous class-level contrastive ReID methods. We first use pairwise similarity ranking as one-hot hard pseudo labels for hard instance contrast, which aims at reducing intra-class variance. Then, we use similarity scores as soft pseudo labels to enhance the consistency between augmented and original views, which makes our model more robust to augmentation perturbations. Experiments on several large-scale person ReID datasets validate the effectiveness of our proposed unsupervised method ICE, which is competitive with even supervised methods. Code is made available at https://github.com/chenhao2345/ICE.

Summary

The paper introduces Inter-instance Contrastive Encoding (ICE), a novel method for unsupervised person re-identification that leverages inter-instance affinities and contrastive encoding principles.
ICE utilizes a hard instance contrastive loss that mines the hardest positive sample in a batch using pairwise similarity ranking to reduce intra-class variance.
The method proposes a soft instance consistency loss which employs inter-instance pairwise similarity scores as soft labels to ensure consistency across data augmentations.

Inter-instance Contrastive Encoding for Unsupervised Person Re-identification

The paper introduces Inter-instance Contrastive Encoding (ICE), an approach to unsupervised person re-identification (ReID). Person re-identification plays a crucial role in video surveillance by enabling the tracking of individuals across multiple camera feeds. ICE extends existing unsupervised strategies by combining contrastive encoding with inter-instance affinities, improving ReID performance without requiring manual identity annotations.

Key Contributions

Contrastive Learning and Inter-instance Affinities: ICE builds on recent contrastive learning techniques, which focus on matching different augmented views of the same instance. These methods traditionally treat each image as its own class, which is limiting for tasks such as ReID that require fine-grained distinctions between instances. The paper addresses this by incorporating inter-instance affinities, which previous methods largely ignore, into the learning process.
Hard Instance Contrastive Loss: ICE introduces a hard instance contrastive loss by mining the hardest positive sample within a mini-batch using inter-instance pairwise similarity ranking. This approach aids in reducing intra-class variance, thereby promoting more compact cluster formations within the unsupervised learning framework. By focusing on hard samples, the method enhances robustness against variations in appearance due to viewpoint and other environmental factors.
Soft Instance Consistency Loss: The model further proposes a soft instance consistency loss, which utilizes inter-instance pairwise similarity scores as soft labels to ensure consistency before and after data augmentation. By maintaining such consistency, ICE becomes more resilient to perturbations introduced during the augmentation process, effectively managing both natural and artificial variations.
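The two losses described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract). The function names, the temperature defaults, the use of pseudo labels from some clustering step, the choice of treating every differently labeled sample in the batch as a negative, and the direction of the KL divergence are all assumptions made here for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(z):
    """Row-wise log-softmax, stabilized by subtracting the row maximum."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def hard_instance_contrastive_loss(f, m, labels, tau=0.1):
    """Contrast each anchor against its hardest positive in the batch.

    f      : (B, D) anchor features
    m      : (B, D) features of a second view of the same batch
    labels : (B,) pseudo labels (assumed to come from a clustering step)
    tau    : temperature (hypothetical default)
    """
    f, m = l2_normalize(f), l2_normalize(m)
    sim = f @ m.T                                  # (B, B) pairwise similarities
    same = labels[:, None] == labels[None, :]      # True where pseudo labels match
    # Similarity ranking: the hardest positive is the same-label sample
    # with the LOWEST similarity to the anchor.
    hard_pos = np.where(same, sim, np.inf).min(axis=1)
    # Assumption: all differently labeled samples act as negatives;
    # same-label entries are masked out with -inf.
    neg = np.where(same, -np.inf, sim)
    logits = np.concatenate([hard_pos[:, None], neg], axis=1) / tau
    # One-hot hard pseudo label: the positive always sits in column 0.
    return float(-log_softmax(logits)[:, 0].mean())


def soft_instance_consistency_loss(f_aug, m_aug, f_orig, m_orig, tau=0.4):
    """Match the similarity distribution of augmented views to that of
    original views, using the latter as soft pseudo labels.

    The KL direction KL(original || augmented) is an assumption.
    """
    log_p = log_softmax(l2_normalize(f_aug) @ l2_normalize(m_aug).T / tau)
    log_q = log_softmax(l2_normalize(f_orig) @ l2_normalize(m_orig).T / tau)
    q = np.exp(log_q)                              # soft labels from original views
    return float((q * (log_q - log_p)).sum(axis=1).mean())
```

In this sketch the hard loss is small when same-label features are already close and grows when a same-label sample is far from its anchor, which is the pressure toward lower intra-class variance described above. The soft loss is zero when the augmented and original views produce identical similarity distributions.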

Results and Implications

The experimental results validate the efficacy of ICE across several large-scale person ReID datasets, where its performance is competitive even with supervised methods. ICE also outperforms state-of-the-art unsupervised domain adaptation (UDA) and fully unsupervised methods, which suggests it could support large-scale deployments where labels are unavailable.

ICE’s combination of hard and soft label mechanisms suggests a direction for models better suited to real-world surveillance tasks, which involve significant environmental variability and demand robustness without labeled data. However, ICE should be applied with care in settings with significant intra-class variance or strong camera-specific characteristics.

Conclusion

The paper advances unsupervised ReID by exploiting inter-instance relationships through contrastive learning. Its combined use of hard and soft instance-based losses is a step toward highly discriminative and robust identity representations. Future research could extend ICE to scenarios involving significant appearance changes, further improving the versatility and practicality of unsupervised ReID.