Weakly Supervised Person Re-ID: Differentiable Graphical Learning and A New Benchmark (1904.03845v3)

Published 8 Apr 2019 in cs.CV and cs.AI

Abstract: Person re-identification (Re-ID) benefits greatly from the accurate annotations of existing datasets (e.g., CUHK03 [1] and Market-1501 [2]), which are quite expensive because each image in these datasets has to be assigned with a proper label. In this work, we ease the annotation of Re-ID by replacing the accurate annotation with inaccurate annotation, i.e., we group the images into bags in terms of time and assign a bag-level label for each bag. This greatly reduces the annotation effort and leads to the creation of a large-scale Re-ID benchmark called SYSU-30$k$. The new benchmark contains $30k$ individuals, which is about $20$ times larger than CUHK03 ($1.3k$ individuals) and Market-1501 ($1.5k$ individuals), and $30$ times larger than ImageNet ($1k$ categories). It sums up to 29,606,918 images. Learning a Re-ID model with bag-level annotation is called the weakly supervised Re-ID problem. To solve this problem, we introduce a differentiable graphical model to capture the dependencies from all images in a bag and generate a reliable pseudo label for each person image. The pseudo label is further used to supervise the learning of the Re-ID model. When compared with the fully supervised Re-ID models, our method achieves state-of-the-art performance on SYSU-30$k$ and other datasets. The code, dataset, and pretrained model will be available at https://github.com/wanggrun/SYSU-30k.

Summary

The paper demonstrates that weakly supervised learning with bag-level annotations effectively reduces labor costs while maintaining competitive accuracy.
The paper introduces a novel differentiable graphical model that generates pseudo-labels and integrates seamlessly with deep neural networks.
The paper validates the approach on the SYSU-30k benchmark, showcasing performance that rivals and sometimes surpasses fully supervised models.

Insights into Weakly Supervised Person Re-Identification and Differentiable Graphical Learning

The paper "Weakly Supervised Person Re-ID: Differentiable Graphical Learning and A New Benchmark" by Guangrun Wang et al. addresses person re-identification (Re-ID) with weak supervision. The authors introduce SYSU-30k, a large-scale benchmark annotated only at the bag level, and develop a differentiable graphical model that turns those bag-level labels into per-image pseudo-labels. Their experiments indicate that weak supervision can cut annotation cost substantially while keeping accuracy competitive with fully supervised training.

Overview of Key Contributions

Weakly Supervised Learning Formulation: Because labeling every image is expensive, the authors replace per-image identity labels with bag-level labels. Images are grouped into bags by capture time, and each bag receives a single bag-level label, which greatly reduces annotation effort. Learning a Re-ID model from such bag-level annotation is what the paper calls the weakly supervised Re-ID problem.
SYSU-30k Benchmark: SYSU-30k contains 29,606,918 images of 30k individuals, about 20 times as many identities as CUHK03 (1.3k individuals) and Market-1501 (1.5k individuals). It provides a platform for evaluating Re-ID models under realistic, large-scale conditions.
Differentiable Graphical Model: The authors introduce a differentiable graphical model that captures the dependencies among all images in a bag and produces pseudo-label probabilities for each person image. The resulting pseudo-labels then supervise the Re-ID model. Because the graphical model is differentiable, it can be optimized with standard backpropagation and trained end to end together with a deep neural network.
Experimental Validation and Comparison: On SYSU-30k and other established datasets, the proposed approach performs competitively with, and in some cases better than, state-of-the-art fully supervised models. The experiments compare the weakly supervised method against existing Re-ID paradigms to illustrate its efficacy and scalability.
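
To make the bag-level setting concrete, here is a minimal Python sketch of the two ideas described above: grouping images into bags by time, and turning a bag-level label into per-image pseudo-labels. It is not the authors' graphical model. It models no dependencies between images, and it swaps in a much simpler stand-in: a softmax restricted to the identities named in the bag label, followed by an argmax. The function names, the fixed time window, and the representation of a bag label as a set of identity indices are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def group_into_bags(records, window_seconds):
    """Group (image_id, timestamp) records into bags by a fixed time window.

    Hypothetical helper: the fixed-window rule is an assumption, chosen only
    to illustrate "grouping images into bags in terms of time".
    """
    bags = defaultdict(list)
    for image_id, timestamp in records:
        bags[int(timestamp // window_seconds)].append(image_id)
    return [bags[key] for key in sorted(bags)]


def bag_constrained_pseudo_labels(logits, bag_label):
    """Pick, for each image in a bag, the most probable identity in bag_label.

    logits:    list of per-image score lists, one score per identity
               (e.g. the output of a classifier head; assumed here).
    bag_label: set of identity indices annotated for the whole bag (assumed).
    Returns (pseudo_labels, probabilities), where probabilities[i] maps each
    allowed identity to its restricted-softmax probability for image i.
    """
    allowed = sorted(bag_label)
    pseudo_labels, probabilities = [], []
    for scores in logits:
        # Softmax over the allowed identities only; subtracting the peak
        # keeps the exponentials numerically stable.
        peak = max(scores[k] for k in allowed)
        exps = {k: math.exp(scores[k] - peak) for k in allowed}
        total = sum(exps.values())
        probs = {k: value / total for k, value in exps.items()}
        pseudo_labels.append(max(allowed, key=lambda k: probs[k]))
        probabilities.append(probs)
    return pseudo_labels, probabilities


# Three images; the first two fall into the same 10-second window.
records = [("a.jpg", 0.5), ("b.jpg", 3.0), ("c.jpg", 12.0)]
bags = group_into_bags(records, window_seconds=10)
print(bags)  # [['a.jpg', 'b.jpg'], ['c.jpg']]

# Scores for the two images of the first bag over three identities.
# Identity 2 scores highest but is not in the bag label, so it is excluded.
logits = [[2.0, 0.1, 5.0], [0.3, 1.5, 4.0]]
labels, probs = bag_constrained_pseudo_labels(logits, bag_label={0, 1})
print(labels)  # [0, 1]
```

The point of the restriction is that the bag label rules out identities the annotator did not see in that bag, even when the classifier scores them highly. The paper's model goes further by also capturing dependencies among the images in a bag, which this sketch deliberately leaves out.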

Implications and Future Directions

The implications of this work are both practical and theoretical. From a practical standpoint, the reduction in annotation costs through weakly supervised learning makes it more feasible to deploy Re-ID systems at large scale without prohibitive labeling effort. Furthermore, because the graphical model is differentiable, it fits into existing deep learning frameworks, which is a step towards systems that still learn effectively when annotations are incomplete.

On a theoretical level, the success of weak annotations challenges the assumption that Re-ID requires complete per-image supervision, suggesting that models can learn effectively from coarser, less precise labels. One can anticipate future research on learning features that remain discriminative under varying camera conditions, such as occlusion and illumination changes, while relying entirely on weak annotations.

Lastly, the SYSU-30k dataset is a valuable research asset that may inspire work in areas beyond Re-ID, such as object detection and video surveillance, that can also benefit from weakly supervised settings. Results on this benchmark may become reference points for later work on model robustness and accuracy in dynamic real-world settings.

In summary, Guangrun Wang et al.'s contributions facilitate a significant stride in person Re-ID research, demonstrating that weak supervision is not only feasible but highly effective, potentially signaling a shift in focus towards more economically viable and scalable machine learning paradigms.