- The paper introduces LADCF, which integrates group lasso-based spatial feature selection with temporal consistency to improve visual tracking.
- The method adaptively selects approximately 5% of hand-crafted and 20% of deep features, mitigating spatial boundary effects and background clutter.
- An augmented Lagrangian optimization framework yields state-of-the-art results on benchmarks like OTB and VOT2018, enhancing tracking stability.
Overview of Adaptive Discriminative Correlation Filters for Visual Object Tracking
The paper addresses two persistent weaknesses of Discriminative Correlation Filter (DCF) frameworks for visual object tracking: spatial boundary effects and temporal filter degradation. It introduces temporal consistency-preserving spatial feature selection, which enables joint spatial-temporal filter learning on a low-dimensional discriminative manifold.
Key Contributions
The proposed method, Learning Adaptive Discriminative Correlation Filters (LADCF), contributes three main enhancements:
- Spatial Feature Selection: Structured spatial sparsity, enforced through group lasso regularization, adaptively selects informative spatial positions. Only about 5% of hand-crafted features and 20% of deep features are retained, improving performance while mitigating boundary effects and background clutter.
- Temporal Consistency Enhancement: By reinforcing temporal consistency, the filter model maintains alignment with its historical values, enhancing stability across frames. This approach preserves the global structure in the discriminative manifold, reducing the risk of filter degradation.
- Optimization Framework: A unified framework based on the augmented Lagrangian method achieves efficient joint learning, coupling spatial feature selection (carried out in the spatial domain) with filter learning (carried out in the frequency domain).
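The three ingredients above can be combined in an alternating-minimization loop: a closed-form frequency-domain filter update, a spatial-domain group-lasso shrinkage that zeroes out entire channel groups per pixel, and a temporal term that pulls the filter toward its previous value. The following NumPy sketch illustrates this ADMM-style scheme; it is not the authors' implementation, and the function names, parameter values, and simplified single-patch formulation are my own assumptions:

```python
import numpy as np

def soft_group_threshold(G, tau):
    """Group-lasso proximal operator: at each spatial position, shrink the
    vector of channel coefficients; groups whose l2 norm falls below tau
    are zeroed entirely, realizing spatial feature selection."""
    norms = np.linalg.norm(G, axis=2, keepdims=True)            # (H, W, 1)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return G * scale

def learn_filter(x, y, theta_prev, lam1=0.05, lam2=1.0, mu=1.0, iters=30):
    """ADMM-style sketch of temporal-consistency-preserving filter learning.
    x: (H, W, C) features, y: (H, W) desired response,
    theta_prev: filter from the previous frame (temporal anchor)."""
    xf = np.fft.fft2(x, axes=(0, 1))            # per-channel feature spectra
    yf = np.fft.fft2(y)[..., None]              # desired response spectrum
    pf = np.fft.fft2(theta_prev, axes=(0, 1))   # previous filter spectrum
    xx = np.sum(np.abs(xf) ** 2, axis=2, keepdims=True)  # per-bin energy
    g = np.zeros_like(x)                        # auxiliary sparse variable
    h = np.zeros_like(x)                        # scaled Lagrange multiplier
    theta = np.zeros_like(x)
    a = mu + lam2
    for _ in range(iters):
        # 1) frequency-domain filter update: each frequency bin yields a
        #    rank-1 linear system, inverted in closed form (Sherman-Morrison)
        rhs = np.conj(xf) * yf + mu * np.fft.fft2(g - h, axes=(0, 1)) + lam2 * pf
        xtr = np.sum(xf * rhs, axis=2, keepdims=True)
        tf = (rhs - np.conj(xf) * xtr / (a + xx)) / a
        theta = np.real(np.fft.ifft2(tf, axes=(0, 1)))
        # 2) spatial-domain update: group shrinkage selects spatial positions
        g = soft_group_threshold(theta + h, lam1 / mu)
        # 3) dual ascent on the scaled multiplier
        h = h + theta - g
    return theta, g
```

Each iteration alternates between the two domains the paper describes: the filter is fit against the label spectrum in the frequency domain (with the `lam2 * pf` term keeping it near the previous frame's filter), while the sparsity pattern of `g` encodes which spatial positions survive the group-lasso selection.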
Method and Results
LADCF combines hand-crafted features, such as HOG and Colour Names, with deep neural network features in its multi-channel tracking framework. Experimental results on OTB2013, OTB50, OTB100, Temple-Colour, UAV123, and VOT2018 show LADCF surpassing state-of-the-art methods across metrics. Evaluation with standard measures, overlap precision (OP), distance precision (DP), and area under the curve (AUC), demonstrates superior tracking performance, with its expected average overlap (EAO) score on the VOT2018 benchmark being particularly strong.
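The OTB-style metrics mentioned above have conventional definitions: OP is the fraction of frames whose bounding-box overlap (IoU) exceeds 0.5, DP is the fraction whose centre-location error is under 20 pixels, and AUC averages the success rate over overlap thresholds. A small illustrative implementation (function names and box format `[x, y, w, h]` are my own choices) might look like:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def otb_metrics(pred, gt):
    """OP: share of frames with IoU > 0.5; DP: share with centre error
    < 20 px; AUC: mean success rate over IoU thresholds in [0, 1]."""
    ious = np.array([iou(p, g) for p, g in zip(pred, gt)])
    cp = np.array([[p[0] + p[2] / 2, p[1] + p[3] / 2] for p in pred])
    cg = np.array([[g[0] + g[2] / 2, g[1] + g[3] / 2] for g in gt])
    cerr = np.linalg.norm(cp - cg, axis=1)
    op = float(np.mean(ious > 0.5))
    dp = float(np.mean(cerr < 20.0))
    auc = float(np.mean([np.mean(ious > t) for t in np.linspace(0, 1, 21)]))
    return op, dp, auc
```

A perfect tracker (predictions equal to ground truth) scores OP = DP = 1.0, while the AUC saturates just below 1.0 because the success curve drops to zero at the IoU = 1 threshold.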
Implications and Future Work
The work's implications are broad in the field of computer vision and pattern recognition, particularly in real-time visual object tracking under challenging scenarios such as motion blur, occlusion, and diverse environmental conditions. The methods introduced promise improvements in tracking stability and accuracy, which are crucial for applications in surveillance, autonomous vehicles, and augmented reality.
With the introduction of embedded feature selection and adaptive learning strategies within the correlation filter paradigm, this work paves the way for further explorations into real-time optimization and deeper learning integration. Future research could capitalize on these enhancements by exploring higher-dimensional data or leveraging graph-based techniques to further enhance feature selection and robustness in dynamic environments.
The paper effectively presents a nuanced yet practical approach to improving the efficacy of DCF in visual tracking, establishing a foundation for innovation in both theoretical and applied contexts.