Person Re-identification: Past, Present and Future (1610.02984v1)

Published 10 Oct 2016 in cs.CV

Abstract: Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; in both tasks, hand-crafted and deep learning systems will be reviewed. Moreover, two new re-ID tasks which are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) finally briefs some important yet under-developed issues.

Authors (3)

Liang Zheng (181 papers)
Yi Yang (856 papers)
Alexander G. Hauptmann (40 papers)

Citations (1,079)

Summary

An Overview of "Person Re-identification: Past, Present and Future"

The paper "Person Re-identification: Past, Present and Future" by Liang Zheng, Yi Yang, and Alexander G. Hauptmann presents a comprehensive survey of the developments and techniques in person re-identification (re-ID). This task is critical for surveillance and public safety applications and involves identifying a person of interest across different camera feeds. The authors structure their survey chronologically and methodologically, covering various approaches and providing a detailed examination of the future directions in the field.

Historical Context and Conceptual Foundation

The question of re-identifying an individual was discussed in metaphysics, psychology, and logic long before it became a computer vision problem. The underlying idea, recognizing the same individual across different instances, ties back to Leibniz's Law. In computer vision, re-ID was initially entangled with multi-camera tracking, where the goal was to follow an individual's movement across different camera views. In 1997, Huang and Russell used a Bayesian formulation to predict the appearance of objects in one camera given evidence observed in other cameras, setting an early precedent for later re-ID formulations.

Categorization of Approaches

The paper categorizes re-ID methods into image-based and video-based approaches, further segmenting them into hand-crafted systems and deeply-learned systems.

Image-based Re-identification

Hand-crafted Systems: These methods rely on building discriminative descriptors from images using color, texture, and spatial cues. Classic techniques include color histograms, texture descriptors such as LBP, and foreground segmentation to isolate the person from the background. Multi-scale and part-based approaches are used to mitigate the effects of occlusion and viewpoint variation. One example is SDALF (Symmetry-Driven Accumulation of Local Features), which exploits the symmetry and asymmetry of the human body to weight local features and make matching more robust to pose changes.
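As a rough illustration of the color-histogram descriptors mentioned above, the sketch below splits a pedestrian image into horizontal stripes, builds a normalized RGB histogram per stripe, and compares two images with an L1 distance. The function names, the stripe count, and the bin count are assumptions chosen for illustration; this is not SDALF or any specific method from the survey.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Image = List[List[Pixel]]      # rows of pixels, top to bottom


def stripe_histogram(image: Image, stripes: int = 2, bins: int = 4) -> List[float]:
    """Concatenate one L1-normalized RGB histogram per horizontal stripe."""
    height = len(image)
    feature: List[float] = []
    for s in range(stripes):
        top = s * height // stripes
        bottom = (s + 1) * height // stripes
        hist = [0.0] * (bins ** 3)
        count = 0
        for row in image[top:bottom]:
            for r, g, b in row:
                # Quantize each channel into `bins` levels and flatten to one index.
                idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
                hist[idx] += 1.0
                count += 1
        feature.extend(h / count if count else 0.0 for h in hist)
    return feature


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences; smaller means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Keeping one histogram per stripe preserves coarse spatial layout (for example, shirt color versus trouser color), which a single global histogram would discard.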

Deep Learning Systems: Deep learning frameworks, particularly Convolutional Neural Networks (CNNs), brought a significant leap in re-ID by learning features directly from data. Early works used Siamese networks to learn embeddings from pairs of images, with later advancements incorporating more complex models such as triplet networks and hybrid approaches combining CNNs with RNN components to account for sequential dependencies in data.
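The pairwise and triplet objectives behind Siamese and triplet networks can be sketched on plain embedding vectors. The example below assumes Euclidean distance, a margin of 1.0, and one common variant of the contrastive loss (without the 1/2 factor); in a real system the embeddings would come from a CNN, which is omitted here.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a: Sequence[float], b: Sequence[float],
                     same: bool, margin: float = 1.0) -> float:
    """Siamese-style pair loss: pull same-identity pairs together,
    push different-identity pairs at least `margin` apart."""
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 1.0) -> float:
    """Triplet loss: the anchor should be closer to the positive
    than to the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

The triplet form constrains relative distances rather than absolute ones, which matches how re-ID is evaluated: by ranking gallery images for each query.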

Video-based Re-identification

Hand-crafted Systems: Building upon temporal information, early video-based techniques synthesized spatial and movement features to better distinguish individuals across frames. For instance, the GEI and HOG3D descriptors captured gait dynamics, which were pivotal in stable person matching across sequences.
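For intuition about GEI (Gait Energy Image), it is computed as the pixel-wise average of aligned binary silhouettes over a gait cycle. The sketch below assumes the silhouettes are already segmented, size-normalized, and aligned, and represents each one as a nested list of 0/1 values; the function name is illustrative.

```python
from typing import List

Silhouette = List[List[int]]  # binary mask: 1 = person, 0 = background


def gait_energy_image(silhouettes: List[Silhouette]) -> List[List[float]]:
    """Average aligned binary silhouettes pixel by pixel."""
    if not silhouettes:
        raise ValueError("at least one silhouette is required")
    n = len(silhouettes)
    height = len(silhouettes[0])
    width = len(silhouettes[0][0])
    return [
        [sum(frame[y][x] for frame in silhouettes) / n for x in range(width)]
        for y in range(height)
    ]
```

Pixels that are always foreground (the torso) end up near 1, while pixels covered only during part of the cycle (swinging limbs) take intermediate values, so the single averaged image encodes both shape and motion.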

Deep Learning Systems: Deep learning approaches to video-based re-ID include recurrent models, such as LSTM-based networks, that capture temporal dynamics across the frames of a sequence. RNNs and more recent models such as Gated Siamese Networks are cited as improving matching accuracy by producing a more discriminative sequence-level representation than single-frame features.
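Whatever model produces the per-frame features, a video-based system must aggregate them into one sequence-level descriptor. The sketch below shows the simplest such aggregation, temporal mean and max pooling; it is a stand-in for the recurrent models described above, not an implementation of them, and the function names are illustrative.

```python
from typing import List


def temporal_mean_pool(frame_features: List[List[float]]) -> List[float]:
    """Average each feature dimension over all frames of a sequence."""
    if not frame_features:
        raise ValueError("sequence must contain at least one frame")
    n = len(frame_features)
    return [sum(dim) / n for dim in zip(*frame_features)]


def temporal_max_pool(frame_features: List[List[float]]) -> List[float]:
    """Keep the strongest response of each feature dimension over time."""
    if not frame_features:
        raise ValueError("sequence must contain at least one frame")
    return [max(dim) for dim in zip(*frame_features)]
```

A recurrent model differs in that each frame's contribution depends on the frames before it, whereas pooling treats the sequence as an unordered set.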

Emerging Trends and Future Directions

The authors delineate critical future directions which include:

End-to-End Re-ID Systems: Integrating detection, tracking, and re-ID into cohesive frameworks is vital. Efforts like those by Xu et al. emphasize the need for joint optimization between detection and re-ID for consistent performance in dynamic environments.
Scalability in Large Galleries: Expanding re-ID to city-scale applications requires systems that remain robust and efficient as the gallery grows. Inverted indices and hashing techniques are noted as promising ways to search large galleries, keeping re-ID computationally feasible while limiting the loss in accuracy.
Unlabeled Data Utilization: Leveraging unsupervised and semi-supervised learning to pre-train models using abundant unlabeled data sources like tracking outputs could address data scarcity issues, thus enhancing model robustness across different domains and conditions.
Re-ranking and Post-processing Techniques: Advanced re-ranking strategies utilizing contextual information and human-in-the-loop systems could refine initial search results, thereby enhancing the precision of re-ID systems.
Open-world Re-ID: Extending beyond closed-set assumptions, open-world re-ID introduces the challenge of dynamically updating galleries and employing robust detection mechanisms to identify whether a person of interest exists within the system. Novel probabilistic models and adaptable learning frameworks are essential for progress in this area.
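To make the hashing idea from the scalability item concrete, the sketch below maps real-valued features to short binary codes with random hyperplanes and ranks the gallery by Hamming distance. Random-hyperplane hashing is used here only because it is easy to show; the survey does not prescribe a particular hashing method, and the function names, code length, and seed are assumptions.

```python
import random
from typing import Dict, List


def make_hyperplanes(dim: int, bits: int, seed: int = 0) -> List[List[float]]:
    """Draw `bits` random hyperplanes in a `dim`-dimensional feature space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def hash_code(feature: List[float], planes: List[List[float]]) -> int:
    """One bit per hyperplane: which side of the plane the feature lies on."""
    code = 0
    for plane in planes:
        dot = sum(f * p for f, p in zip(feature, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def rank_gallery(query: List[float], gallery_codes: Dict[str, int],
                 planes: List[List[float]]) -> List[str]:
    """Return gallery ids sorted by Hamming distance to the query code."""
    q = hash_code(query, planes)
    return sorted(gallery_codes, key=lambda gid: hamming(q, gallery_codes[gid]))
```

Gallery codes are computed once offline, so each query costs one hashing pass plus XOR and bit-count operations per gallery entry, which is far cheaper than comparing high-dimensional floating-point features.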

Conclusion

This paper offers an in-depth overview of person re-identification, covering its historical roots, methodological advances, and future prospects. As the re-ID field advances, integrating multimodal data, enhancing scalability, and leveraging novel learning paradigms will be critical to meet the demands of real-world applications in surveillance and public safety. The survey by Zheng, Yang, and Hauptmann stands as a pivotal resource, guiding future research trajectories in person re-ID.
