MKCFup: Dual Filter Updates in Tracking & Kalman
- MKCFup refers to two distinct methods: a multi-kernel correlation filter update for high-speed visual tracking, and the update step of the Maximum Correntropy Kalman Filter for robust state estimation.
- The tracker uses an upper-bound multi-kernel formulation with FFT-based alternating optimization, reaching 83.5% precision and 64.1% AUC at 150 fps on OTB2013.
- Per-kernel temporal adaptation reduces interference between kernels in the tracker, while the correntropy-based Kalman update provides resilience to non-Gaussian noise.
MKCFup denotes two distinct methods within the correlation filter and robust Kalman filtering literature: (1) the Multi-Kernel Correlation Filter update for high-speed visual object tracking (Tang et al., 2018), and (2) the “update” step in the Maximum Correntropy Kalman Filter, a robust state-estimation methodology (Chen et al., 2015). Both exploit kernel-based formulations and efficient iterative optimization. The following account details the MKCFup tracker and the Maximum Correntropy Kalman Filter update, including mathematical underpinnings, algorithmic schemes, evaluation, and performance insights.
1. Multi-Kernel Correlation Filter Frameworks
KCF and MKCF Formulations
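The Kernelized Correlation Filter (KCF) learns a discriminative function $f$ in a reproducing-kernel Hilbert space (RKHS) $\mathcal{H}$ by minimizing a Tikhonov-regularized empirical risk:

$$\min_{f \in \mathcal{H}} \sum_{i} \bigl(f(x_i) - y_i\bigr)^2 + \lambda \|f\|_{\mathcal{H}}^2,$$

where the inputs $x_i$ are circulant shifts of a base patch $x$, the targets $y_i$ are Gaussian-shaped labels, and the regularization parameter $\lambda$ penalizes the RKHS norm $\|f\|_{\mathcal{H}}$. By the Representer Theorem, $f(\cdot) = \sum_i \alpha_i \kappa(x_i, \cdot)$. Letting $K$ with $K_{ij} = \kappa(x_i, x_j)$ be the kernel matrix, the dual problem is quadratic with closed-form solution $\alpha = (K + \lambda I)^{-1} y$; in the circulant setting, FFT diagonalization reduces this to element-wise operations in the Fourier domain (Tang et al., 2018).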
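The closed-form solution above can be illustrated on a one-dimensional signal. The following is a minimal sketch, not the authors' implementation: it assumes the standard KCF Fourier-domain form (dual coefficients obtained by dividing the label spectrum by the kernel spectrum plus the regularizer), uses a naive DFT in place of an FFT so that it needs only the standard library, and all function names, the toy signal, and the parameter values are hypothetical.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(n^2) DFT; a real tracker would call an FFT library instead."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def gaussian_correlation(x, z, sigma):
    """Return k with k[i] = kappa(x cyclically shifted right by i, z)."""
    xf, zf = dft(x), dft(z)
    # Inverse DFT of conj(xf) * zf gives the inner product of every shift of x with z.
    cross = dft([a.conjugate() * b for a, b in zip(xf, zf)], inverse=True)
    sq_norms = sum(v * v for v in x) + sum(v * v for v in z)
    return [math.exp(-max(sq_norms - 2.0 * c.real, 0.0) / sigma ** 2)
            for c in cross]

def train(x, y, sigma, lam):
    """Dual coefficients in the Fourier domain: y_hat / (k_hat + lambda)."""
    kf = dft(gaussian_correlation(x, x, sigma))
    yf = dft(y)
    return [b / (a + lam) for a, b in zip(kf, yf)]

def detect(alpha_f, x, z, sigma):
    """Response over all cyclic shifts: inverse DFT of k_hat * alpha_hat."""
    kf = dft(gaussian_correlation(x, z, sigma))
    response = dft([a * b for a, b in zip(kf, alpha_f)], inverse=True)
    return [r.real for r in response]

# Toy example (hypothetical values): the target moves 3 samples to the right.
n = 16
x = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 3.0, 1.0,
     0.0, 0.0, 0.0, 2.0, 5.0, 1.0, 0.0, 0.0]
y = [math.exp(-min(t, n - t) ** 2 / 2.0) for t in range(n)]  # peak at shift 0
alpha_f = train(x, y, sigma=2.0, lam=1e-4)
z = x[-3:] + x[:-3]  # x cyclically shifted right by 3
response = detect(alpha_f, x, z, sigma=2.0)
shift = max(range(n), key=response.__getitem__)
```

The index of the response peak is the estimated translation. Every step after the kernel evaluation is element-wise in the Fourier domain, which is what makes the circulant formulation fast once the naive DFT is replaced by an FFT.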
The Multi-Kernel Correlation Filter (MKCF) extends KCF to convex combinations of base kernels, $\kappa = \sum_{m=1}^{M} d_m \kappa_m$ with $d_m \ge 0$ and $\sum_m d_m = 1$. MKCF alternates between solving for the dual weights $\alpha$ (with the kernel weights $d$ fixed) and updating $d$ (with $\alpha$ fixed) within a convex quadratic program, but it is computationally more expensive than KCF and yields only moderate accuracy improvements (Tang et al., 2018).
2. Upper-Bound Reformulation and Decoupled Optimization
MKCFup replaces the MKCF objective with an upper-bound surrogate that decouples the kernel contributions and suppresses negative interference between kernels. This reformulation permits per-kernel loss evaluation, leading to more stable alternating block-coordinate optimization. Temporal adaptation and historical influence are controlled by per-kernel learning rates and exponential weights, so that the influence of frames 1 through the current frame decays according to per-kernel exponential forgetting.
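To make the idea of per-kernel exponential forgetting concrete, here is a small sketch. It assumes the common running-average rule `acc = (1 - gamma) * acc + gamma * new`, which may differ in detail from the update in Tang et al. (2018); the symbol `gamma`, the function names, and the numeric values are hypothetical and stand in for the per-kernel learning rate.

```python
def forgetting_weights(gamma, p):
    """Implicit weight of frames 1..p after p running-average updates."""
    weights = [(1.0 - gamma) ** (p - 1)]  # first frame is never scaled by gamma
    weights += [gamma * (1.0 - gamma) ** (p - j) for j in range(2, p + 1)]
    return weights

def running_average(values, gamma):
    """Recursive form: only the accumulator is stored, not the history."""
    acc = values[0]
    for v in values[1:]:
        acc = (1.0 - gamma) * acc + gamma * v
    return acc

# Two kernels with different (hypothetical) learning rates over 10 frames.
slow = forgetting_weights(0.1, 10)  # long memory
fast = forgetting_weights(0.5, 10)  # short memory
```

A smaller learning rate keeps more weight on early frames, so each kernel can be given its own effective memory length while the tracker stores only one accumulator per kernel.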
3. Alternating Block-Coordinate Algorithm
MKCFup employs a highly efficient alternating optimization per frame, summarized as:
- Initialize: the kernel weights $d_m$ for $m = 1, \dots, M$.
- Alternate a fixed number of times:
  - Fix the kernel weights $d$, solve for the dual coefficients $\alpha_m$ via FFT-based element-wise updates in the Fourier domain, exploiting running accumulators for the numerator and denominator terms of each kernel.
  - Fix $\alpha_m$, update $d$ in closed form using the current responses and running historical averages.
Detection in the next frame combines the per-kernel filter responses using the estimated weights, and the target is located at the peak of the combined response map. All operations remain in the Fourier domain, maintaining low computational overhead.
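The detection step can be sketched as follows. This is an illustration under an assumption: the combined response is taken to be the weight-averaged sum of the per-kernel response maps, and the exact expression used by the tracker is not reproduced here. The helper names and the toy response maps are hypothetical.

```python
def combine_responses(responses, weights):
    """Weighted sum of per-kernel response maps (each a list of rows)."""
    rows, cols = len(responses[0]), len(responses[0][0])
    return [[sum(w * r[i][j] for w, r in zip(weights, responses))
             for j in range(cols)]
            for i in range(rows)]

def locate_peak(response):
    """Return (row, col) of the largest response value."""
    _, i, j = max((v, i, j)
                  for i, row in enumerate(response)
                  for j, v in enumerate(row))
    return i, j

# Toy response maps for two kernels (hypothetical values).
color_response = [[0.1, 0.9, 0.2],
                  [0.0, 0.3, 0.4]]
hog_response = [[0.2, 0.1, 0.0],
                [0.1, 0.2, 1.0]]
peak = locate_peak(combine_responses([color_response, hog_response], [0.7, 0.3]))
```

Changing the weights changes which kernel dominates the decision, which is why the weight update in the alternating scheme matters for localization.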
4. Feature Extraction, Kernels, and Historical Adaptation
MKCFup builds its base kernels on two feature types:
- Color Name (CN) features: 13-D, projected to 4-D by PCA.
- HOG features: 9 orientation bins per cell, PCA-reduced to 4-D.
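Both use Gaussian base kernels

$$\kappa_m(x, x') = \exp\!\left(-\frac{\|x - x'\|^2}{\sigma_m^2}\right),$$

with kernel widths $\sigma_m$ chosen by cross-validation (one setting for color sequences, reduced for grayscale). Each kernel has its own historical learning rate, and the kernel weights start from fixed initial values and adapt automatically thereafter (Tang et al., 2018).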
5. Empirical Performance and Implementation Considerations
On OTB2013, MKCFup achieves 83.5% precision@20px and 64.1% AUC at 150 fps. It is more accurate than both KCF (70.9% precision, 50.7% AUC, 297 fps) and the original MKCF (76.7%, 57.0%, 30 fps), and it runs five times faster than MKCF, although KCF remains the fastest. The table below summarizes the empirical metrics:
| Tracker | Precision@20px | AUC | fps |
|---|---|---|---|
| KCF | 70.9% | 50.7% | 297 |
| MKCF | 76.7% | 57.0% | 30 |
| fMKCF | 78.6% | 58.0% | 50 |
| MKCFup | 83.5% | 64.1% | 150 |
Key practical considerations:
- Search region: 2.5× object bounding-box (same as KCF)
- Hann window applied to the feature patch to suppress the boundary discontinuities (and the resulting FFT artifacts) of the cyclic-shift model
- Gaussian response label matching feature patch shape
- Regularization: fixed constants, as specified in Tang et al. (2018)
- Scale estimation: fDSST applied post-tracking
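Two of the items above, the Hann window and the Gaussian response label, are simple to construct. The sketch below uses only the standard library; the patch size and the label bandwidth `sigma` are hypothetical values chosen for illustration, and the label is centered at shift (0, 0) with cyclic wrap-around, one common convention for circulant-shift training.

```python
import math

def hann(n):
    """1-D Hann window with zeros at both ends."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1))) for i in range(n)]

def hann_2d(rows, cols):
    """Separable 2-D window: outer product of two 1-D Hann windows."""
    wr, wc = hann(rows), hann(cols)
    return [[a * b for b in wc] for a in wr]

def gaussian_labels(rows, cols, sigma):
    """Gaussian-shaped regression target, peak at shift (0, 0), cyclic distance."""
    def cyclic(i, n):
        return min(i, n - i)
    return [[math.exp(-(cyclic(i, rows) ** 2 + cyclic(j, cols) ** 2)
                      / (2.0 * sigma ** 2))
             for j in range(cols)]
            for i in range(rows)]

window = hann_2d(5, 7)                       # multiply feature channels by this
labels = gaussian_labels(5, 7, sigma=1.0)    # same shape as the feature patch
```

The window tapers the patch to zero at its borders so that cyclic shifts do not introduce sharp edges, and the label has the same shape as the windowed patch so that element-wise Fourier-domain training applies directly.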
6. Theoretical and Practical Advantages
MKCFup’s upper-bound reformulation suppresses mutual interference between kernels, enabling more discriminative per-feature optimization. Temporal weighting via exponential weights and per-kernel learning rates gives each feature its own adaptive memory length. All updates rely on FFT-based operations, facilitating real-time throughput (150 fps).
The search region size is tuned for small inter-frame translations (a small offset ratio), which avoids unnecessary background and increases robustness in “small-move/high-speed” scenarios. Empirically, MKCFup is more accurate than single-kernel KCF, and both more accurate and faster than standard MKL-based correlation filters (MKCF) (Tang et al., 2018).
7. MKCFup in the Maximum Correntropy Kalman Filter
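In robust state estimation, “MKCFup” also refers to the update step in the Maximum Correntropy Kalman Filter (MCKF) (Chen et al., 2015). Here, the MMSE criterion of classic Kalman filtering is replaced with the Maximum Correntropy Criterion (MCC), leading to a fixed-point update of the posterior mean:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + \tilde{K}_k \bigl(y_k - H_k \hat{x}_{k|k-1}\bigr),$$

with modified covariances and gain reflecting data-driven residual weights:

$$\tilde{K}_k = \tilde{P}_{k|k-1} H_k^{\top} \bigl(H_k \tilde{P}_{k|k-1} H_k^{\top} + \tilde{R}_k\bigr)^{-1}, \qquad \tilde{P}_{k|k-1} = B_p \tilde{C}_x^{-1} B_p^{\top}, \qquad \tilde{R}_k = B_r \tilde{C}_y^{-1} B_r^{\top},$$

where $B_p$ and $B_r$ are Cholesky factors of the prior covariance $P_{k|k-1}$ and the measurement-noise covariance $R_k$, and $\tilde{P}_{k|k-1}$ and $\tilde{R}_k$ depend on the diagonal correntropy matrices $\tilde{C}_x$, $\tilde{C}_y$, whose entries are evaluations at the whitened residuals $e_i$ of the Gaussian kernel

$$G_\sigma(e_i) = \exp\!\left(-\frac{e_i^2}{2\sigma^2}\right).$$

The fixed-point equations down-weight outliers via $G_\sigma(e_i)$, yielding robustness to heavy-tailed or impulsive noise. Convergence is guaranteed for a sufficiently large kernel bandwidth $\sigma$. As $\sigma \to \infty$, MCKF reduces to the classical Kalman update. This robustification adds only a few fixed-point iterations per timestep and yields better empirical error distributions than the classical Kalman filter in non-Gaussian settings (Chen et al., 2015).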
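The fixed-point update can be sketched for the scalar case (one state, one measurement), where the Cholesky factors reduce to square roots. This is an illustrative sketch under stated assumptions rather than a reference implementation: the function name, the stopping rule, the iteration cap, and the small floor on the kernel weights (a numerical safeguard against division by zero) are all hypothetical choices.

```python
import math

def gaussian_kernel(e, sigma):
    """Gaussian kernel weight of a whitened residual."""
    return math.exp(-e * e / (2.0 * sigma * sigma))

def mckf_update_scalar(x_prior, p_prior, y, h, r, sigma, tol=1e-9, max_iter=50):
    """Scalar MCKF measurement update via fixed-point iteration."""
    b_p, b_r = math.sqrt(p_prior), math.sqrt(r)  # scalar "Cholesky factors"
    x_t = x_prior                                # start from the prediction
    gain = 0.0
    for _ in range(max_iter):
        # Whitened residuals evaluated at the current iterate.
        e_x = (x_prior - x_t) / b_p
        e_y = (y - h * x_t) / b_r
        # Correntropy weights; the floor is an added numerical safeguard.
        c_x = max(gaussian_kernel(e_x, sigma), 1e-300)
        c_y = max(gaussian_kernel(e_y, sigma), 1e-300)
        p_tilde = p_prior / c_x   # inflated when the prior residual is large
        r_tilde = r / c_y         # inflated when the measurement is an outlier
        gain = p_tilde * h / (h * p_tilde * h + r_tilde)
        x_new = x_prior + gain * (y - h * x_prior)
        converged = abs(x_new - x_t) <= tol * max(1.0, abs(x_t))
        x_t = x_new
        if converged:
            break
    # Posterior variance in Joseph form, using the final gain.
    p_post = (1.0 - gain * h) ** 2 * p_prior + gain ** 2 * r
    return x_t, p_post

# An outlying measurement (y = 10) with unit prior and noise variances.
x_robust, _ = mckf_update_scalar(0.0, 1.0, 10.0, 1.0, 1.0, sigma=2.0)
x_large_sigma, _ = mckf_update_scalar(0.0, 1.0, 10.0, 1.0, 1.0, sigma=1e6)
```

With a small bandwidth the outlier receives a tiny kernel weight, the effective measurement noise is inflated, and the estimate stays near the prediction. With a very large bandwidth all weights approach one and the result matches the classical Kalman update, which for these inputs is 5.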