A Dual-Path Model With Adaptive Attention For Vehicle Re-Identification (1905.03397v3)

Published 9 May 2019 in cs.CV

Abstract: In recent years, attention models have been extensively used for person and vehicle re-identification. Most re-identification methods are designed to focus attention on key-point locations. However, depending on the orientation, the contribution of each key-point varies. In this paper, we present a novel dual-path adaptive attention model for vehicle re-identification (AAVER). The global appearance path captures macroscopic vehicle features while the orientation conditioned part appearance path learns to capture localized discriminative features by focusing attention on the most informative key-points. Through extensive experimentation, we show that the proposed AAVER method is able to accurately re-identify vehicles in unconstrained scenarios, yielding state of the art results on the challenging dataset VeRi-776. As a byproduct, the proposed system is also able to accurately predict vehicle key-points and shows an improvement of more than 7% over state of the art. The code for key-point estimation model is available at https://github.com/Pirazh/Vehicle_Key_Point_Orientation_Estimation.

Authors (6)

Pirazh Khorramshahi (10 papers)
Amit Kumar (224 papers)
Neehar Peri (22 papers)
Sai Saketh Rambhatla (15 papers)
Jun-Cheng Chen (42 papers)
Rama Chellappa (190 papers)

Citations (198)

Summary

The paper proposes a novel dual-path model with adaptive attention that integrates global appearance and local orientation-based features.
It employs a ResNet-based global feature extractor alongside a two-stage key-point detection network to capture subtle vehicle details.
The model outperforms state-of-the-art methods on benchmarks like VeRi-776 and VehicleID, and its key-point estimator improves key-point prediction by more than 7% over the prior state of the art.

Overview of "A Dual-Path Model With Adaptive Attention For Vehicle Re-Identification"

The paper addresses the challenge of vehicle re-identification (re-id) by proposing a novel dual-path model, known as Adaptive Attention for Vehicle Re-Identification (AAVER). Motivated by the limitations of existing re-id methods, which struggle to differentiate vehicles of similar make, model, and color, the authors introduce an adaptive attention mechanism that combines global appearance features with localized discriminative features conditioned on vehicle orientation.

Methodology

The AAVER model operates through two main paths:

Global Appearance Path: This path uses a Deep Convolutional Neural Network (DCNN), specifically ResNet-50 or ResNet-101, to extract macroscopic features of vehicles. The network is trained with an $L_2$ softmax loss, which constrains the feature vectors to lie on a hypersphere in the feature space and thus makes it easier to distinguish between different vehicle identities. This path alone, while useful, often misses subtle vehicle distinctions crucial for re-id.
Orientation Conditioned Local Path: This path complements the global path by adapting its attention to the vehicle's orientation. It employs a two-stage vehicle key-point detection model that estimates key-points and classifies the vehicle's orientation. The first stage provides a coarse estimate using a VGG-16 backbone, while the second stage refines these estimates and predicts the orientation using a two-stack hourglass network.
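
As an illustration of the $L_2$ softmax idea mentioned for the global path, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the usual formulation in which the feature vector is rescaled to a fixed norm before the softmax classifier; the function names and the scale value `alpha=16.0` are assumptions chosen for the example.

```python
import math


def l2_normalize(x, alpha=1.0, eps=1e-12):
    """Rescale vector x so that its Euclidean norm equals alpha."""
    norm = math.sqrt(sum(v * v for v in x))
    return [alpha * v / max(norm, eps) for v in x]


def l2_softmax_loss(feature, weights, biases, label, alpha=16.0):
    """Cross-entropy loss computed on an L2-normalized, rescaled feature.

    feature: list of floats (the embedding of one image)
    weights: one weight vector per identity class
    biases:  one bias per identity class
    label:   index of the ground-truth identity
    alpha:   radius of the hypersphere (assumed value, for illustration)
    """
    z = l2_normalize(feature, alpha)
    logits = [sum(w_i * z_i for w_i, z_i in zip(w, z)) + b
              for w, b in zip(weights, biases)]
    # Numerically stable log-sum-exp
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[label]
```

Because the feature is normalized first, rescaling the raw embedding leaves the loss unchanged; only its direction on the hypersphere matters.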

The localized feature extraction relies on adaptively selected key-points determined by inferred vehicle orientation, integrating features from earlier layers of the global ResNet network. This approach ensures attention is placed on the most informative vehicle parts, such as unique logos or configurations, which are pivotal for precise re-identification.
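
The orientation-conditioned selection described above can be pictured with a small sketch. This is a simplified illustration under stated assumptions, not the paper's architecture: the lookup table `VISIBLE_KEYPOINTS`, its orientation names, and its key-point indices are hypothetical, and the pooling is shown on a single-channel feature map represented as nested lists.

```python
# Hypothetical lookup: orientation label -> indices of the key-points
# assumed to be most informative from that viewpoint.
VISIBLE_KEYPOINTS = {
    "front": [0, 1, 2],
    "rear": [3, 4, 5],
}


def select_heatmaps(heatmaps, orientation):
    """Keep only the key-point heatmaps associated with the orientation."""
    return [heatmaps[i] for i in VISIBLE_KEYPOINTS[orientation]]


def attention_pool(feature_map, heatmap):
    """Average a 2-D feature map, weighting each location by the heatmap."""
    total = sum(sum(row) for row in heatmap)
    if total == 0:
        return 0.0
    weighted = sum(f * h
                   for f_row, h_row in zip(feature_map, heatmap)
                   for f, h in zip(f_row, h_row))
    return weighted / total


def local_descriptor(feature_map, heatmaps, orientation):
    """One pooled value per selected key-point, for a single channel."""
    return [attention_pool(feature_map, h)
            for h in select_heatmaps(heatmaps, orientation)]
```

A heatmap concentrated on one location returns the feature value at that location, while a uniform heatmap reduces to plain average pooling, which is the sense in which the heatmaps act as attention.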

Results

The proposed model outperforms baseline methods and competitive state-of-the-art approaches across several datasets, notably VeRi-776 and VehicleID, improving retrieval accuracy as measured by mean average precision (mAP) and cumulative match characteristic (CMC) scores. Additionally, the key-point estimation model used in the orientation-conditioned path improves key-point prediction accuracy by more than 7% over existing methods.
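
For readers unfamiliar with the two retrieval metrics, the sketch below shows how mAP and CMC at rank k are commonly computed from ranked gallery lists. It is a generic illustration rather than the paper's evaluation code; it assumes each query is given as a list of booleans ordered by decreasing similarity, covering the whole gallery, where `True` marks a gallery image with the same identity as the query.

```python
def average_precision(ranked_matches):
    """Average precision for one query's ranked list of match flags."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def cmc_at_k(ranked_matches, k):
    """1.0 if a correct match appears in the top k results, else 0.0."""
    return 1.0 if any(ranked_matches[:k]) else 0.0


def evaluate(queries, k=1):
    """Return (mAP, CMC@k) averaged over all queries."""
    n = len(queries)
    mean_ap = sum(average_precision(q) for q in queries) / n
    cmc = sum(cmc_at_k(q, k) for q in queries) / n
    return mean_ap, cmc
```

mAP rewards ranking every true match early, whereas CMC at rank k only asks whether at least one true match appears in the top k, which is why the two numbers are usually reported together.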

Implications and Future Directions

The work is relevant to surveillance and intelligence applications that depend on accurate vehicle tracking and identification. By incorporating vehicle orientation and adaptively focusing on the most informative vehicle parts, the approach improves re-identification accuracy without requiring additional temporal or location-based data.

Future research might delve into integrating 3D vehicle modeling to further refine vehicle re-id, perhaps in conjunction with speed estimation and real-time application within dynamic urban surveillance environments. Furthermore, extending the model to account for more complex scenarios, such as occlusions or rapid changes in vehicle appearance due to environmental factors, would broaden its application and robustness.

The AAVER model thus presents a compelling step forward in vehicle re-identification, significantly elevating the capability to distinguish vehicles beyond superficial similarities.