FAMNet: Joint Learning of Feature, Affinity and Multi-dimensional Assignment for Online Multiple Object Tracking (1904.04989v1)

Published 10 Apr 2019 in cs.CV

Abstract: Data association-based multiple object tracking (MOT) involves multiple separated modules processed or optimized differently, which results in complex method design and requires non-trivial tuning of parameters. In this paper, we present an end-to-end model, named FAMNet, where Feature extraction, Affinity estimation and Multi-dimensional assignment are refined in a single network. All layers in FAMNet are designed to be differentiable and thus can be optimized jointly to learn the discriminative features and higher-order affinity model for robust MOT, supervised by a loss computed directly from the assignment ground truth. We also integrate a single object tracking technique and a dedicated target management scheme into the FAMNet-based tracking system to further recover false negatives and inhibit noisy target candidates generated by the external detector. The proposed method is evaluated on a diverse set of benchmarks including MOT2015, MOT2017, KITTI-Car and UA-DETRAC, and achieves promising performance on all of them in comparison with state-of-the-art methods.


Summary

The paper presents an integrated model that simultaneously refines feature extraction, estimates higher-order affinities, and solves multi-dimensional assignment through joint optimization.
It combines appearance and motion cues with single object tracking to recover false negatives and suppress noisy detections.
Evaluation on MOT2015, MOT2017, KITTI-Car and UA-DETRAC shows performance competitive with state-of-the-art trackers, suggesting potential for applications such as surveillance and autonomous driving.

FAMNet: Joint Learning of Feature, Affinity, and Multi-dimensional Assignment for Online Multiple Object Tracking

The paper "FAMNet: Joint Learning of Feature, Affinity, and Multi-dimensional Assignment for Online Multiple Object Tracking" introduces an integrated deep learning model for multiple object tracking (MOT). Traditional data association-based MOT methods split the problem into distinct stages, typically feature extraction, affinity estimation, and assignment, each processed or optimized separately. This separation leads to complex method design and requires extensive parameter tuning. The paper proposes FAMNet, a unified, end-to-end architecture in which these components are optimized jointly within a single network, with the aim of simplifying the tracking pipeline and improving robustness.

Key Contributions

Integrated Deep Learning Model: FAMNet presents an architecture that simultaneously refines feature extraction, affinity estimation, and multi-dimensional assignment. Because all network layers are differentiable, the whole model can be optimized jointly with a loss computed directly from the assignment ground truth.
Higher-Order Affinity Model: The paper exploits higher-order discriminative cues, such as appearance changes over time and motion context, which span more than two frames and therefore cannot be captured by traditional pairwise affinity models.
Incorporation of Single Object Tracking (SOT): To recover false negatives and filter out noisy detections from the external detector, the FAMNet-based tracking system incorporates an SOT technique and a dedicated target management scheme.
Innovative Assignment Solution: FAMNet employs a modified rank-1 tensor approximation using power iteration, adapted for deep learning, to solve the multi-dimensional assignment (MDA) problem.
Evaluation Across Benchmarks: The model demonstrates its efficacy across various benchmark datasets, including MOT2015, MOT2017, KITTI-Car, and UA-DETRAC, achieving competitive performance against state-of-the-art techniques.
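
The rank-1 tensor approximation named under "Innovative Assignment Solution" can be illustrated with a small NumPy sketch. This is not the paper's network layer, only a minimal toy version under stated assumptions: three frames with the same number of targets, a non-negative affinity tensor `C[i, j, l]` scoring the trajectory hypothesis (detection `i` in frame 0, `j` in frame 1, `l` in frame 2), and two soft assignment matrices `X1` and `X2` that are updated multiplicatively and then L1-normalized per row. The function names, the toy affinity, and the row-only normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def toy_affinity(p1, p2, base=0.1):
    """Hypothetical affinity tensor that favors frame-0 -> frame-1 matches
    given by permutation p1 and frame-1 -> frame-2 matches given by p2."""
    n = len(p1)
    C = np.full((n, n, n), base)
    for i in range(n):
        C[i, p1[i], :] += 1.0  # reward the correct (i, j) pair
    for j in range(n):
        C[:, j, p2[j]] += 1.0  # reward the correct (j, l) pair
    return C


def rank1_power_iteration(C, n_iters=20):
    """Maximize sum_{ijl} C[i,j,l] * X1[i,j] * X2[j,l] over soft assignments
    with a power-iteration-style multiplicative update (illustrative only)."""
    n0, n1, n2 = C.shape
    X1 = np.full((n0, n1), 1.0 / n1)  # uniform initial assignment, frames 0 -> 1
    X2 = np.full((n1, n2), 1.0 / n2)  # uniform initial assignment, frames 1 -> 2
    for _ in range(n_iters):
        # Contract the tensor with the other assignment, then rescale.
        X1 = X1 * np.einsum("ijl,jl->ij", C, X2)
        X1 /= X1.sum(axis=1, keepdims=True)  # L1-normalize each row
        X2 = X2 * np.einsum("ijl,ij->jl", C, X1)
        X2 /= X2.sum(axis=1, keepdims=True)
    return X1, X2


# Toy example: the "true" associations are two fixed permutations.
C = toy_affinity(p1=[1, 2, 0], p2=[2, 0, 1])
X1, X2 = rank1_power_iteration(C)
print(X1.argmax(axis=1), X2.argmax(axis=1))
```

Every step here is a composition of multiplications, sums, and divisions, which is why an iteration of this kind can in principle be unrolled as differentiable layers and trained with a loss on the resulting assignments; how FAMNet modifies the iteration for that purpose is described in the paper itself.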

Implications and Future Directions

The introduction of FAMNet signifies progress on several fronts within the domain of MOT:

Practical Improvements: By reducing the design complexity and tuning overhead of conventional multi-module pipelines, FAMNet offers a simpler and more adaptable approach for real-world applications such as surveillance systems and autonomous driving.
Advancements in Deep Learning Application: The model illustrates how task-specific priors can be learned implicitly from data rather than hand-tuned, which should make it easier to apply across different scenarios with less manual intervention.
Potential for Further Optimization: While the reported results are promising, alternative architectures or training regimes could yield further gains in speed and accuracy.
Integration with Other AI Modules: Future research might couple FAMNet more tightly with other perception modules, for example semantic segmentation, to better distinguish and track occluded or overlapping targets.

Conclusion

FAMNet addresses the fragmentation of existing data association-based trackers by folding feature extraction, affinity estimation, and multi-dimensional assignment into one jointly trained network. This design is in line with the broader shift towards end-to-end learned systems, and it may serve as a reference point for future work that unifies perceptual tasks in complex dynamic environments.
