High-Speed Tracking with Kernelized Correlation Filters (1404.7584v3)

Published 30 Apr 2014 in cs.CV

Abstract: The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies -- any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the Discrete Fourier Transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new Kernelized Correlation Filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call Dual Correlation Filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite running at hundreds of frames-per-second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.

Citations (5,016)

Summary

  • The paper presents a method for high-speed visual tracking that exploits the circulant structure of translated training patches, using the Discrete Fourier Transform (DFT) to train correlation filters efficiently.
  • The authors extend correlation filtering to a kernelized version (KCF), enabling non-linear regression at the same computational complexity as the linear filter.
  • On a 50-video benchmark, KCF with HOG features reaches a mean precision of 73.2% at 172 FPS, and the DCF variant runs at 292 FPS, fast enough for real-time tracking.

The paper "High-Speed Tracking with Kernelized Correlation Filters" by Henriques et al. introduces a novel method for efficient visual tracking using kernelized correlation filters. The authors address the redundancy inherent in discriminative learning methods for tracking, where multiple translated patches of a target object are used as training samples. By leveraging the circulant property of the data matrix, they show that the computational and storage requirements can be drastically reduced through the Discrete Fourier Transform (DFT).
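
As a rough illustration of that saving (a NumPy sketch, not the authors' code), ridge regression over all cyclic shifts of a 1-D signal can be solved either with the explicit data matrix or element-wise in the Fourier domain. The function names, the 1-D setting, and the random test signals are assumptions made for this example. The placement of complex conjugates depends on the shift and DFT conventions; the version below follows NumPy's FFT and the right-shift row ordering defined in `circulant`.

```python
import numpy as np

def circulant(x):
    """Stack all cyclic shifts of x as rows: row i is x shifted right by i."""
    return np.stack([np.roll(x, i) for i in range(len(x))])

def ridge_dense(x, y, lam):
    """Ridge regression on the explicit n-by-n data matrix (O(n^3) time, O(n^2) memory)."""
    X = circulant(x)
    n = len(x)
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def ridge_fft(x, y, lam):
    """Same solution computed element-wise in the Fourier domain (O(n log n) time, O(n) memory)."""
    xf = np.fft.fft(x)
    yf = np.fft.fft(y)
    wf = xf * yf / (np.abs(xf) ** 2 + lam)
    return np.real(np.fft.ifft(wf))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)   # base sample (hypothetical 1-D "patch")
y = rng.standard_normal(64)   # one regression target per cyclic shift
print(np.allclose(ridge_dense(x, y, 0.1), ridge_fft(x, y, 0.1)))
```

The dense route never has to be taken in practice: only the base sample `x` is stored, and the n-by-n matrix of shifted samples is implicit.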

Core Contributions and Methodology

The primary contribution of this paper is the development of an analytic model for handling large datasets derived from translated image patches. Key to their approach is the observation that the data matrix becomes circulant when dealing with these translated patches. This structure allows significant computational optimizations:

  1. Reduction via DFT: The circulant data matrix can be diagonalized with the DFT, which reduces both storage and computation by several orders of magnitude.
  2. Kernelized Correlation Filter (KCF): Extending the idea to kernel regression, the authors derive a Kernelized Correlation Filter that, unlike other kernel algorithms, has exactly the same complexity as its linear counterpart.
  3. Dual Correlation Filter (DCF): For linear regression, the formulation turns out to be equivalent to a correlation filter. Building on KCF with a linear kernel, the authors derive a fast multi-channel extension of linear correlation filters, which they call the Dual Correlation Filter.
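
The kernelized case can be sketched in the same spirit. The code below is a minimal 1-D illustration, assuming a Gaussian kernel, a Gaussian-shaped regression target, and toy parameter values (`sigma`, `lam`, the signal itself); none of these come from the paper, and the function names are hypothetical. Training and detection each cost a few FFTs, the same order as the linear filter.

```python
import numpy as np

def gaussian_correlation(a, b, sigma):
    """Gaussian kernel between a and every cyclic shift of b.

    Returns k with k[d] = exp(-||a - roll(b, d)||^2 / sigma^2), using one
    FFT-based cross-correlation instead of n explicit comparisons.
    """
    cross = np.real(np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))))
    sq_dist = np.maximum(a @ a + b @ b - 2.0 * cross, 0.0)  # clip round-off
    return np.exp(-sq_dist / sigma ** 2)

def train(x, y, sigma, lam):
    """Kernel ridge regression over all cyclic shifts of x.

    Returns the dual coefficients in the Fourier domain.
    """
    k = gaussian_correlation(x, x, sigma)
    return np.fft.fft(y) / (np.fft.fft(k) + lam)

def detect(alpha_f, x, z, sigma):
    """Regression output for every cyclic shift of z: response[i] scores roll(z, i)."""
    k = gaussian_correlation(x, z, sigma)
    return np.real(np.fft.ifft(np.fft.fft(k) * alpha_f))

# Toy 1-D "patch": a bump on top of a sinusoid (assumed data).
n = 128
t = np.arange(n)
x = np.exp(-0.5 * ((t - 40) / 4.0) ** 2) + 0.3 * np.sin(2 * np.pi * 5 * t / n)
# Target: close to 1 for the unshifted sample, decaying with cyclic shift distance.
y = np.exp(-0.5 * (np.minimum(t, n - t) / 2.0) ** 2)

alpha_f = train(x, y, sigma=2.0, lam=1e-2)
z = np.roll(x, 7)                       # the "next frame": target moved by 7 samples
response = detect(alpha_f, x, z, sigma=2.0)
# The peak marks the shift of z that best matches x, so the motion is its negation.
estimated_shift = (-int(np.argmax(response))) % n
print(estimated_shift)
```

With this convention the response peaks at the shift that undoes the motion, which is why the estimate negates the peak index. Other conventions place the peak at the motion itself.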

A prominent feature of the proposed method is its ability to maintain high-speed processing while incorporating a large number of training samples. The proposed framework is also made available as open-source, encouraging further research and development in the field.

Experimental Results

The experimental validation was conducted on a benchmark consisting of 50 video sequences. Key results include:

  • Performance Metrics: The KCF tracker demonstrated superior accuracy with a mean precision of 73.2% at a 20-pixel error threshold when using HOG features. This performance outstrips contemporary trackers like Struck and TLD.
  • Frame Rates: One of the standout aspects of the method is its efficiency. The KCF implementation runs at 172 FPS using HOG features, and the DCF variant achieves 292 FPS, making it suitable for real-time applications.
  • Robustness: The KCF and DCF trackers showed significant robustness against common challenges in tracking such as non-rigid deformations, occlusions, and background clutter. Their ability to incorporate numerous negative samples efficiently stands out as a key factor in their resilient performance.
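
For readers unfamiliar with the precision metric quoted above, the sketch below shows one common way to compute it: the fraction of frames in which the predicted target center lies within a pixel threshold of the ground-truth center. The function name and the sample coordinates are made up for illustration, and the exact evaluation protocol of the benchmark may differ in detail.

```python
import numpy as np

def precision_at(pred_centers, gt_centers, threshold=20.0):
    """Fraction of frames whose center location error is within `threshold` pixels."""
    pred = np.asarray(pred_centers, dtype=float)
    gt = np.asarray(gt_centers, dtype=float)
    errors = np.linalg.norm(pred - gt, axis=1)  # Euclidean distance per frame
    return float(np.mean(errors <= threshold))

# Hypothetical (x, y) centers for four frames of one video.
pred = [(0, 0), (10, 0), (30, 40), (3, 4)]
gt = [(0, 0), (0, 0), (0, 0), (0, 0)]
print(precision_at(pred, gt))  # errors are 0, 10, 50, 5 pixels -> 0.75
```

A mean precision such as the one reported would then be the average of this per-video value over all sequences in the benchmark.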

Implications and Future Directions

From a theoretical perspective, the paper shows that circulant structure and the DFT allow a kernelized learner to train on all translated samples at the cost of a linear filter. This combination of computational efficiency and tracking performance is relevant to any domain that requires real-time processing.

In practical terms, the introduced techniques could be beneficial in applications such as autonomous driving, robotics, and surveillance, where tracking accuracy and speed are critical.

Future developments could focus on extending the current framework to handle non-periodic boundaries, which could further enhance tracking accuracy. Additionally, exploring the applicability of these concepts to other types of transformations beyond translations, such as affine transformations or non-rigid deformations, could lead to more versatile tracking systems.

Conclusion

Henriques et al. present an efficient and robust method for visual tracking that exploits the circulant structure of the data matrix and the DFT for both linear and non-linear regression. The proposed Kernelized Correlation Filter and Dual Correlation Filter outperform top-ranking trackers such as Struck and TLD in accuracy while running at hundreds of frames per second, offering a promising avenue for future research and application in real-time tracking scenarios.

The paper not only advances the fundamental understanding of correlation filters in machine learning but also provides practical tools and open-source implementations that are poised to impact a wide range of real-world applications.