Joint Object Detection and Multi-Object Tracking with Graph Neural Networks (2006.13164v3)

Published 23 Jun 2020 in cs.CV, cs.LG, cs.MA, and cs.RO

Abstract: Object detection and data association are critical components in multi-object tracking (MOT) systems. Despite the fact that the two components are dependent on each other, prior works often design detection and data association modules separately which are trained with separate objectives. As a result, one cannot back-propagate the gradients and optimize the entire MOT system, which leads to sub-optimal performance. To address this issue, recent works simultaneously optimize detection and data association modules under a joint MOT framework, which has shown improved performance in both modules. In this work, we propose a new instance of joint MOT approach based on Graph Neural Networks (GNNs). The key idea is that GNNs can model relations between variable-sized objects in both the spatial and temporal domains, which is essential for learning discriminative features for detection and data association. Through extensive experiments on the MOT15/16/17/20 datasets, we demonstrate the effectiveness of our GNN-based joint MOT approach and show state-of-the-art performance for both detection and MOT tasks. Our code is available at: https://github.com/yongxinw/GSDT

Citations (39)

Summary

The paper introduces a joint optimization framework using Graph Neural Networks to integrate object detection and tracking, enabling error back-propagation through the entire system.
It leverages relational modeling to extract discriminative features, reducing identity switches and improving data association accuracy.
Extensive experiments on MOT15/16/17/20 datasets demonstrate significant improvements in MOTA and IDF1 scores compared to previous methods.

Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

The paper "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks" addresses a structural weakness in multi-object tracking (MOT) systems: object detection and data association depend on each other, yet they are usually designed and trained separately. The proposed approach uses Graph Neural Networks (GNNs) to combine both components in a single framework, so that the MOT system can be optimized as a whole.

A central observation in this work is that previous MOT methods have often treated object detection and data association as two independent modules trained with separate objectives. This separation prevents gradients from being back-propagated through the entire system, so each module is optimized only for its own objective rather than for the overall MOT objective. To address this, the authors propose a joint optimization approach based on GNNs, which can model the relationships between objects across both space and time.
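The effect of a joint objective can be illustrated with a deliberately tiny example. The sketch below is not the paper's loss: the scalar weight `w`, the two squared-error terms standing in for detection and association losses, and the balancing factor `lam` are all assumptions made for illustration. It only shows that when two losses share a parameter, the gradient of their sum carries signal from both tasks into that parameter.

```python
def joint_loss_and_grad(w, x, y_det, y_assoc, lam=1.0):
    """Return the joint loss and its gradient with respect to the shared weight w.

    All names here are hypothetical; the squared errors are stand-ins for
    real detection and association losses.
    """
    f = w * x                          # shared feature used by both "heads"
    loss_det = (f - y_det) ** 2        # stand-in for a detection loss
    loss_assoc = (f - y_assoc) ** 2    # stand-in for an association loss
    loss = loss_det + lam * loss_assoc
    # Chain rule: both loss terms send gradient into the same weight.
    grad = 2 * x * (f - y_det) + lam * 2 * x * (f - y_assoc)
    return loss, grad


def train(w=0.0, x=1.0, y_det=1.0, y_assoc=3.0, lam=1.0, lr=0.1, steps=200):
    """Plain gradient descent on the joint loss."""
    for _ in range(steps):
        _, grad = joint_loss_and_grad(w, x, y_det, y_assoc, lam)
        w -= lr * grad
    return w


if __name__ == "__main__":
    print(train())
```

With the default values, the shared weight settles between the two targets instead of fitting either one alone, which is the compromise a jointly trained shared representation has to make.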

GNNs make it possible to learn more discriminative features by capturing relations among variable-sized objects. This relational modeling matters in MOT, where spatial-temporal context affects both detection and association. In the GNN layers, nodes represent objects and edges connect related objects, so features are shared between connected nodes and the model can reason over object-object relations.
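Feature sharing between connected nodes can be sketched as a single round of message passing. The version below is a minimal illustration, not the paper's layer: it has no learned weights or nonlinearity, the mean aggregation and the 0.5/0.5 mixing are arbitrary choices, and the function name, feature vectors, and edge list are hypothetical.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing.

    features: list of per-node feature vectors (lists of floats).
    edges: list of undirected (i, j) pairs linking related objects.
    Returns a new list of updated feature vectors.
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    updated = []
    for i, feat in enumerate(features):
        if not neighbors[i]:
            updated.append(list(feat))  # isolated node keeps its feature
            continue
        dim = len(feat)
        # Average the features of all neighbors, dimension by dimension.
        mean_msg = [
            sum(features[j][d] for j in neighbors[i]) / len(neighbors[i])
            for d in range(dim)
        ]
        # Mix the node's own feature with the aggregated neighbor message.
        updated.append([0.5 * feat[d] + 0.5 * mean_msg[d] for d in range(dim)])
    return updated


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
    print(message_passing_step(feats, edges=[(0, 1)]))
```

In this example nodes 0 and 1 are connected and pull toward each other, while node 2 has no edges and is left unchanged. Because the update is defined per node and per edge, the same function works for any number of objects.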

Experiments were conducted on the MOT15, MOT16, MOT17, and MOT20 benchmarks to demonstrate the efficacy of the approach. The results consistently show that the GNN-based joint MOT framework achieves state-of-the-art (SOTA) performance in both object detection and tracking. In particular, MOTA and IDF1 scores improve over previously published work across the different MOT challenges.
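For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions: MOTA is one minus the ratio of false negatives, false positives, and identity switches to the number of ground-truth objects, and IDF1 is the F1 score over identity-level true positives, false positives, and false negatives. The function names and the counts in the usage lines are illustrative assumptions, not results from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy; counts are summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 from identity true positives, false positives, false negatives."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denom


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(mota(false_negatives=100, false_positives=50, id_switches=10, num_gt=1000))
    print(idf1(idtp=800, idfp=100, idfn=200))
```

MOTA is dominated by detection errors, while IDF1 rewards keeping the same identity on the same object over time, which is why the two are usually reported together.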

Based on these experiments, the authors point to two benefits of their approach: robust object detection and more accurate data association. The GNN's ability to account for relational dependencies between objects contributes to fewer identity switches and to more objects being tracked correctly over time.
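To make the link between discriminative features and identity switches concrete, here is a minimal sketch of appearance-based data association. It is not the paper's association procedure: greedy matching on cosine similarity is substituted here for brevity (trackers commonly use an optimal assignment such as the Hungarian algorithm), and the function names, the `min_sim` threshold, and the feature vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def greedy_associate(track_feats, det_feats, min_sim=0.5):
    """Match existing tracks to new detections, most similar pairs first.

    Returns (matches, unmatched_detections), where matches maps a track
    index to a detection index.
    """
    pairs = sorted(
        (
            (cosine(t, d), ti, di)
            for ti, t in enumerate(track_feats)
            for di, d in enumerate(det_feats)
        ),
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for sim, ti, di in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    unmatched = [di for di in range(len(det_feats)) if di not in used_dets]
    return matches, unmatched


if __name__ == "__main__":
    tracks = [[1.0, 0.0], [0.0, 1.0]]
    dets = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
    print(greedy_associate(tracks, dets))
```

When features of different objects are well separated, as in this toy input, each track finds its own detection and the leftover detection can start a new track. When features are similar across objects, the similarity scores become ambiguous and wrong matches, that is, identity switches, become more likely.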

From a theoretical standpoint, the work shows that detection and data association can be learned jointly within a single neural MOT pipeline rather than as separate stages. Practically, it suggests improvements for applications such as autonomous driving, surveillance, and other systems that rely on accurate and reliable MOT.

Looking ahead, the focus will likely shift towards more comprehensive models that not only consider object relationships but also learn from multi-modal data across various domains. Scalability and computational efficiency also remain open challenges as GNNs are developed further for real-time applications and larger datasets.

In conclusion, by combining detection and data association in a joint GNN-based framework, this paper makes a considerable contribution to MOT, with relevance to robotics and autonomous systems. It opens avenues for further joint techniques that build on recent advances in neural networks, with the potential for additional gains in tracking accuracy and efficiency.

GitHub

GitHub - yongxinw/GSDT: Official PyTorch implementation of "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks" (462 stars)
