- The paper presents a novel multi-layer embedding fusion strategy that integrates detailed and semantic features for precise person re-identification.
- It applies deep supervision across convolutional network layers so that each feature level yields a discriminative embedding on its own, which is what makes early stopping for efficiency possible.
- Extensive evaluations on standard benchmarks demonstrate state-of-the-art performance, enabling flexible anytime and budgeted re-ID in resource-constrained scenarios.
Resource Aware Person Re-identification across Multiple Resolutions: An Overview
This essay provides an overview of the paper "Resource Aware Person Re-identification across Multiple Resolutions," which presents a person re-identification (re-ID) model designed with resource awareness in mind. The research addresses a limitation of conventional re-ID systems: by applying the same amount of computation to every query, they spend too little on hard examples and waste resources on easy ones. The paper describes a methodology that combines information from multiple convolutional network layers, improving the model's discriminative power while keeping it computationally efficient.
Problem Statement and Approach
Person re-identification is a core task in computer vision: given a query image of a person, the goal is to retrieve images of the same person from a gallery captured by other cameras, despite changes in pose, viewpoint, illumination, and occlusion. Methods that rely only on a high-level embedding from the final network layer lose the fine-grained detail needed for challenging examples, while spending the same computation on easy ones. The paper introduces a model that combines embeddings from layers at different semantic levels, improving both accuracy and computational efficiency.
Methodology
The model proposed in this research is built on standard deep convolutional network architectures, specifically ResNet and DenseNet, with two primary modifications:
- Multi-layer Embedding Fusion: The model captures embeddings at various network layers, thereby encapsulating both detailed, low-level features and abstract, high-level features. This allows the model to integrate texture and fine-grained appearance cues alongside semantic information, supporting a more nuanced identification process.
- Deep Supervision: Rather than applying a single loss only to the final output, the proposed model attaches a loss to the embedding produced at each tapped layer, ensuring that every level contributes discriminatively to the task and that intermediate representations remain task-relevant. A minimal sketch of this design is given after the list.
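The combination of stage-wise embeddings and per-stage supervision can be illustrated in PyTorch roughly as follows. This is a minimal sketch, not the authors' released code: the ResNet-50 backbone split, the embedding dimension, the `MultiStageEmbeddingNet` and `deep_supervision_loss` names, and the use of a per-stage identity cross-entropy loss are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50


class MultiStageEmbeddingNet(nn.Module):
    """Tap the output of each ResNet stage so every stage yields an embedding
    that is supervised on its own and also fused into the final descriptor."""

    def __init__(self, num_ids, emb_dim=256):
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V1")
        # Stem and the four residual stages, kept separate so intermediate
        # feature maps can be pooled into embeddings.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])
        stage_channels = [256, 512, 1024, 2048]  # output widths of ResNet-50 stages
        self.pool = nn.AdaptiveAvgPool2d(1)
        # One projection per stage maps low-level texture cues and high-level
        # semantic cues into embeddings of the same size.
        self.embed = nn.ModuleList([nn.Linear(c, emb_dim) for c in stage_channels])
        # Deep supervision heads: an identity classifier per stage embedding
        # plus one on the fused (concatenated) embedding.
        self.stage_cls = nn.ModuleList([nn.Linear(emb_dim, num_ids)
                                        for _ in stage_channels])
        self.fused_cls = nn.Linear(emb_dim * len(stage_channels), num_ids)

    def forward(self, x):
        x = self.stem(x)
        stage_embs = []
        for stage, proj in zip(self.stages, self.embed):
            x = stage(x)
            stage_embs.append(proj(self.pool(x).flatten(1)))
        fused = torch.cat(stage_embs, dim=1)  # fused multi-resolution descriptor
        return stage_embs, fused


def deep_supervision_loss(model, images, labels):
    """Sum an identity-classification loss over every stage embedding and the
    fused one, so each intermediate representation stays discriminative."""
    criterion = nn.CrossEntropyLoss()
    stage_embs, fused = model(images)
    loss = criterion(model.fused_cls(fused), labels)
    for emb, head in zip(stage_embs, model.stage_cls):
        loss = loss + criterion(head(emb), labels)
    return loss
```

At test time, the fused embedding (or whichever stage embeddings have been computed) can serve as the descriptor for nearest-neighbor matching against the gallery.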
Performance Evaluation
Experiments are conducted on several benchmark datasets, including Market-1501, MARS, CUHK03, and DukeMTMC-reID. The results show that the model surpasses prior state-of-the-art methods across these datasets, supporting the effectiveness of the multi-layer embedding fusion and deep supervision strategies.
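For context, accuracy on these benchmarks is conventionally reported as rank-1 CMC accuracy and mean average precision (mAP) over a query/gallery split. The sketch below shows a bare-bones rank-1 computation from embedding matrices; it is illustrative only and omits details of the official protocols, such as excluding same-camera, same-identity gallery entries and computing mAP over the full ranked list.

```python
import torch
import torch.nn.functional as F


def rank1_accuracy(query_emb, query_ids, gallery_emb, gallery_ids):
    """Rank-1 CMC: fraction of queries whose nearest gallery embedding
    (cosine similarity on L2-normalized vectors) shares the query identity."""
    q = F.normalize(query_emb, dim=1)
    g = F.normalize(gallery_emb, dim=1)
    sim = q @ g.t()               # [num_query, num_gallery] similarity matrix
    top1 = sim.argmax(dim=1)      # index of the closest gallery image per query
    return (gallery_ids[top1] == query_ids).float().mean().item()
```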
Resource-constrained Scenarios
The paper further explores applications of the proposed re-ID model under resource-constrained settings:
- Anytime Re-ID: In scenarios where predictions must be produced on demand and computation may be interrupted at any point, the model returns an anytime prediction based on the most recently computed layer embedding, so accuracy degrades gracefully as the computational allowance shrinks rather than collapsing.
- Budgeted Re-ID: In an online setting where the average computational cost per query is constrained, the model makes learned decisions about when to stop computing further embeddings, letting easy queries exit at shallow layers while harder ones continue deeper, so that accuracy is maximized within the given budget. A sketch of this early-exit behavior follows the list.
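Assuming the multi-stage model sketched earlier, anytime inference amounts to running the stages in order and keeping the most recent embedding when computation stops. The function below is an illustrative sketch of that idea; the `stop_after` argument and the decision of where to stop are left external, whereas the paper learns how queries should be distributed across exit points to meet a budget.

```python
import torch


@torch.no_grad()
def anytime_embedding(model, image, stop_after=None):
    """Run the ResNet stages sequentially and return the embedding of the
    deepest stage reached. `stop_after` (1-4) emulates an interrupted or
    budget-limited computation; None runs all stages."""
    x = model.stem(image)
    emb = None
    for i, (stage, proj) in enumerate(zip(model.stages, model.embed), start=1):
        x = stage(x)
        emb = proj(model.pool(x).flatten(1))  # descriptor available right now
        if stop_after is not None and i >= stop_after:
            break
    return emb
```

Under a fixed average budget, a controller could, for example, send easy queries through only two stages and hard queries through all four, keeping the mean cost at the target while preserving most of the accuracy.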
Implications
The presented resource-aware person re-ID model advances the practice of computationally efficient deep learning. By demonstrating that competitive, state-of-the-art accuracy can be maintained while computation is managed adaptively, the research has significant implications for edge computing and real-time vision systems, particularly those deployed in power- and performance-sensitive contexts such as mobile devices and surveillance systems.
Future Directions
Future work could explore extending this model to other computer vision tasks, such as object detection or tracking, where similar levels of flexibility and efficiency may be beneficial. Additionally, further exploration of how adaptive methodologies in re-ID can be incorporated into end-to-end neural architecture search could yield additional insights into automatic model refinement, balancing computational resources and accuracy.
This research underscores the importance of resource-aware architectures in deep learning, especially as deployments increasingly occur in environments with stringent computational and power constraints.