A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking (1711.10378v2)

Published 28 Nov 2017 in cs.CV

Abstract: Person re-identification is a challenging retrieval task that requires matching a person's acquired image across non-overlapping camera views. In this paper we propose an effective approach that incorporates both the fine and coarse pose information of the person to learn a discriminative embedding. In contrast to the recent direction of explicitly modeling body parts or correcting for misalignment based on these, we show that a rather straightforward inclusion of acquired camera view and/or the detected joint locations into a convolutional neural network helps to learn a very effective representation. To increase retrieval performance, re-ranking techniques based on computed distances have recently gained much attention. We propose a new unsupervised and automatic re-ranking framework that achieves state-of-the-art re-ranking performance. We show that, in contrast to the current state-of-the-art re-ranking methods, our approach does not require computing new rank lists for each image pair (e.g., based on reciprocal neighbors) and performs well by using a simple direct rank-list-based comparison or even by just using the already computed Euclidean distances between the images. We show that both our learned representation and our re-ranking method achieve state-of-the-art performance on a number of challenging surveillance image and video datasets. The code is available online at: https://github.com/pse-ecn/pose-sensitive-embedding

Authors (4)

M. Saquib Sarfraz
Arne Schumann
Andreas Eberle
Rainer Stiefelhagen

Citations (481)

Summary

Overview of "A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking"

The paper presents a person re-identification (re-id) method for surveillance imagery that makes the learned embedding more discriminative by incorporating pose information. It pairs this embedding with a new unsupervised re-ranking framework that improves retrieval accuracy.

Pose-Sensitive Embedding

Person re-id remains a challenging task due to variations in camera angles and individual poses. Traditional methods often rely on convolutional neural networks (CNNs) to capture global appearance features or explicitly model body parts for better alignment across different views. This paper proposes a more straightforward, yet effective solution, leveraging both coarse and fine pose cues to build a robust embedding.

Coarse Pose Integration: The person's view relative to the camera is categorized as 'front', 'side', or 'back'. A side-branch network predicts this view, and its predictions weight view-specific feature maps, so the model can learn specialized features for each orientation.
Fine Pose Integration: Body joint locations, supplied as confidence maps from a pose estimator, are fed to the network as additional input channels, letting it learn which body parts to attend to. This contrasts with more rigid methods that depend on predefined alignments or normalizations.
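
The two pose cues can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, tensor shapes, and the joint count `J = 14` are assumptions chosen for illustration, and the real network applies these ideas inside a CNN rather than on raw arrays.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def view_weighted_fusion(view_logits, view_features):
    """Coarse pose: blend view-specific feature maps by predicted view.

    view_logits:   (batch, 3) scores for front / side / back (hypothetical order)
    view_features: (batch, 3, H, W, C) one feature map per view
    returns:       (batch, H, W, C) weighted sum over the view axis
    """
    weights = softmax(view_logits)  # (batch, 3), rows sum to 1
    return np.einsum("bv,bvhwc->bhwc", weights, view_features)

def add_pose_channels(rgb, joint_maps):
    """Fine pose: append joint confidence maps as extra input channels.

    rgb:        (batch, H, W, 3)
    joint_maps: (batch, H, W, J), one confidence map per body joint
    returns:    (batch, H, W, 3 + J)
    """
    return np.concatenate([rgb, joint_maps], axis=-1)

# Toy data; all sizes are assumptions for illustration only.
rng = np.random.default_rng(0)
B, H, W, C, J = 2, 8, 4, 16, 14
rgb = rng.random((B, H, W, 3))
joint_maps = rng.random((B, H, W, J))
feats = rng.normal(size=(B, 3, H, W, C))
logits = rng.normal(size=(B, 3))

net_input = add_pose_channels(rgb, joint_maps)
fused = view_weighted_fusion(logits, feats)
```

Using soft weights rather than a hard view choice means an uncertain view prediction yields a blend of the view-specific features instead of committing to a single branch.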

Expanded Cross Neighborhood Re-Ranking

The paper introduces a novel re-ranking technique called Expanded Cross Neighborhood (ECN) distance, which improves rank accuracy without the necessity of recalculating rank lists per image pair:

Expanded Neighborhoods: For each probe-gallery pair, the method aggregates the distances between each image's close neighbors (its top-k nearest) and the other image of the pair. These can be the original Euclidean distances or rank-list-based distances, which keeps the computation simple.
Efficient Comparison: A straightforward rank-list comparison measure is enough to reach competitive performance. This avoids computing new rank lists for each image pair, a requirement of existing techniques such as k-reciprocal encoding.
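
The neighborhood aggregation described above can be sketched as follows. This is a minimal illustration built on plain Euclidean distances, not the authors' code: the two-level neighborhood construction, the parameter names `t` and `q`, and their values are assumptions, and a rank-list-based distance matrix could be passed in place of `dist`.

```python
import numpy as np

def pairwise_euclidean(x):
    """Euclidean distance matrix between all rows of x, shape (n, n)."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d = np.sqrt(np.maximum(d2, 0.0))
    np.fill_diagonal(d, 0.0)  # remove floating-point noise on the diagonal
    return d

def expanded_neighbors(dist, t=3, q=2):
    """Indices of each image's expanded neighborhood, shape (n, t + t*q).

    The top-t nearest images are taken first, then the top-q nearest
    images of each of those. Column 0 of the sorted order is assumed to
    be the image itself and is skipped. Duplicates are not removed.
    """
    order = np.argsort(dist, axis=1)
    first = order[:, 1:t + 1]        # (n, t)
    second = order[first, 1:q + 1]   # (n, t, q)
    return np.concatenate([first, second.reshape(len(dist), -1)], axis=1)

def ecn_distance(dist, num_probe, t=3, q=2):
    """Expanded-cross-neighborhood style distance, probes x gallery.

    dist is the full (n, n) matrix over probe images followed by
    gallery images. For a pair (p, g) the result averages the distances
    from p's expanded neighbors to g and from g's expanded neighbors to p.
    """
    nbrs = expanded_neighbors(dist, t, q)   # (n, M)
    mean_to = dist[nbrs].mean(axis=1)       # [i, j]: mean d(neighbor of i, j)
    ecn = 0.5 * (mean_to + mean_to.T)
    return ecn[:num_probe, num_probe:]

# Toy example: 4 probe and 8 gallery embeddings (random, for illustration).
rng = np.random.default_rng(0)
features = rng.normal(size=(12, 32))
dist = pairwise_euclidean(features)
ecn = ecn_distance(dist, num_probe=4, t=3, q=2)
ranking = np.argsort(ecn, axis=1)  # re-ranked gallery indices per probe
```

Because every term is read from the already computed distance matrix, the only extra work is one sort per image and an averaging step; no new rank list is built per probe-gallery pair.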

Performance Evaluation

The proposed system is evaluated on established benchmarks, including Market-1501 and DukeMTMC-reID. The pose-sensitive embedding improves accuracy over the same network without pose information, and ECN re-ranking raises retrieval performance further at modest computational cost. According to the paper, both components reach state-of-the-art results on multiple image and video datasets.

Implications and Future Work

The implications of integrating pose information into person re-id systems are significant, offering insights into improving neural network design by leveraging intuitive, yet powerful cues like body orientation. This work paves the way for further exploration into embedded pose estimation and adaptive feature weighting directly within neural architectures.

Future research could integrate the pose-related components into a unified end-to-end system, which may improve computational efficiency and make real-time use more practical. The ECN framework could also be extended to retrieval tasks beyond re-id in image and video analysis.

In conclusion, the paper shows that simple pose cues, namely camera view and joint locations, can noticeably improve the accuracy and robustness of person re-identification, and that a lightweight neighborhood-based re-ranking step adds further gains.

GitHub

GitHub - pse-ecn/pose-sensitive-embedding: Pose Sensitive Embedding for Person Re-Identification (PSE) (112 stars)