Learning to Align Multi-Camera Domains using Part-Aware Clustering for Unsupervised Video Person Re-Identification (1909.13248v4)

Published 29 Sep 2019 in cs.CV

Abstract: Most video person re-identification (re-ID) methods are based on supervised learning, which requires cross-camera ID labeling. Since the cost of labeling increases dramatically as the number of cameras grows, it is difficult to apply re-identification algorithms to large camera networks. In this paper, we address this scalability issue by presenting a deep representation learning method that does not require ID information across multiple cameras. Technically, we train neural networks to generate features that are both ID-discriminative and camera-invariant. To make the embedding features ID-discriminative, we maximize feature distances between different person IDs within a camera using a metric learning approach. At the same time, treating each camera as a separate domain, we apply adversarial learning across multiple camera domains to generate camera-invariant features. We also propose a part-aware adaptation module, which effectively performs multi-camera domain-invariant feature learning in different spatial regions. We carry out comprehensive experiments on three public re-ID datasets (PRID-2011, iLIDS-VID, and MARS). Our method outperforms state-of-the-art methods by a large margin of about 20% in rank-1 accuracy on the large-scale MARS dataset.
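The two training signals described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract names only "a metric learning approach" and "adversarial learning", so the triplet loss, the softmax camera classifier, the sign-flipped combination, and every name and default below (`triplet_loss`, `camera_cross_entropy`, `extractor_objective`, `margin`, `lam`) are assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Assumed metric-learning term: within one camera, pull features of the
    same person together and push different persons at least `margin` apart."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def camera_cross_entropy(logits, camera_id):
    """Softmax cross-entropy of a hypothetical camera classifier that tries
    to predict which camera a feature came from."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[camera_id]


def extractor_objective(id_loss, camera_loss, lam=0.1):
    """Assumed adversarial combination: the feature extractor minimizes the
    ID loss while *maximizing* the camera classifier's loss, so that its
    features carry little camera-specific information."""
    return id_loss - lam * camera_loss


# Toy example: 2-D features from one camera, classifier logits over 3 cameras.
id_loss = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
cam_loss = camera_cross_entropy([0.0, 0.0, 0.0], camera_id=1)
total = extractor_objective(id_loss, cam_loss)
```

In this toy case the positive sample is farther from the anchor than the negative, so the triplet term is active, and the uniform logits mean the camera classifier is maximally confused, which is the state the feature extractor is pushed toward under this assumed formulation.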

Authors (5)

Youngeun Kim
Seokeon Choi
Taekyung Kim
Sumin Lee
Changick Kim
