ByteTrack: Multi-Object Tracking by Associating Every Detection Box (2110.06864v3)

Published 13 Oct 2021 in cs.CV

Abstract: Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos. Most methods obtain identities by associating detection boxes whose scores are higher than a threshold. The objects with low detection scores, e.g. occluded objects, are simply thrown away, which brings non-negligible true object missing and fragmented trajectories. To solve this problem, we present a simple, effective and generic association method, tracking by associating almost every detection box instead of only the high score ones. For the low score detection boxes, we utilize their similarities with tracklets to recover true objects and filter out the background detections. When applied to 9 different state-of-the-art trackers, our method achieves consistent improvement on IDF1 score ranging from 1 to 10 points. To put forwards the state-of-the-art performance of MOT, we design a simple and strong tracker, named ByteTrack. For the first time, we achieve 80.3 MOTA, 77.3 IDF1 and 63.1 HOTA on the test set of MOT17 with 30 FPS running speed on a single V100 GPU. ByteTrack also achieves state-of-the-art performance on MOT20, HiEve and BDD100K tracking benchmarks. The source code, pre-trained models with deploy versions and tutorials of applying to other trackers are released at https://github.com/ifzhang/ByteTrack.

Authors: Yifu Zhang, Peize Sun, Yi Jiang, Dongdong Yu, Fucheng Weng, Zehuan Yuan, Ping Luo, Wenyu Liu, Xinggang Wang

Summary

ByteTrack: A Multi-Object Tracking Method

In "ByteTrack: Multi-Object Tracking by Associating Every Detection Box", Yifu Zhang et al. introduce an association method for Multi-Object Tracking (MOT) that uses almost every detection box, including those with low detection scores, instead of discarding boxes that fall below a score threshold. The authors apply the method to nine existing state-of-the-art trackers and report consistent IDF1 gains. They also build a new tracker, ByteTrack, which reaches state-of-the-art results on the MOT17, MOT20, HiEve and BDD100K benchmarks.

Overview of ByteTrack

Most MOT methods associate only detection boxes whose scores exceed a threshold, in order to keep false positives out of the tracks. This practice also discards genuine objects that receive low scores, such as occluded ones, which leads to missed objects and fragmented trajectories. ByteTrack addresses this by associating almost every detection box, so that true objects can be recovered even from low-confidence detections.

The core of ByteTrack's association method, BYTE, comprises two main stages:

Associating high-confidence detection boxes with tracklets.
Associating the low-confidence detection boxes with the tracklets left unmatched after the first association.

Because a low-confidence box is kept only when it is similar to an existing tracklet, this two-step approach recovers true objects while filtering out background detections. When BYTE is applied to nine different state-of-the-art trackers, IDF1 improves consistently, by 1 to 10 points.
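The two association stages can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses plain IoU as the similarity and a greedy matcher in place of an optimal assignment, it takes tracklet boxes as given rather than predicting them with a motion model, and the names `byte_associate`, `greedy_match`, `high_thresh`, `low_thresh` and `min_iou`, along with their default values, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    box: Box
    score: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], dets: List[Detection], min_iou: float):
    """Greedily pair tracklets and detections by descending IoU.

    Simplification for this sketch: an optimal assignment solver could be
    substituted here without changing the two-stage structure.
    """
    pairs = sorted(
        ((iou(box, d.box), tid, j) for tid, box in tracks.items() for j, d in enumerate(dets)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for sim, tid, j in pairs:
        if sim < min_iou:
            break  # all remaining pairs are even less similar
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    unmatched_tracks = [tid for tid in tracks if tid not in matches]
    unmatched_dets = [j for j in range(len(dets)) if j not in used]
    return matches, unmatched_tracks, unmatched_dets


def byte_associate(tracks, detections, high_thresh=0.6, low_thresh=0.1, min_iou=0.3):
    """Two-stage association; threshold values are assumed defaults."""
    high = [d for d in detections if d.score >= high_thresh]
    low = [d for d in detections if low_thresh <= d.score < high_thresh]

    # Stage 1: high-confidence boxes against all tracklets.
    m1, left_tracks, left_high = greedy_match(tracks, high, min_iou)

    # Stage 2: low-confidence boxes against tracklets still unmatched.
    remaining = {tid: tracks[tid] for tid in left_tracks}
    m2, lost_tracks, _ = greedy_match(remaining, low, min_iou)  # unmatched low boxes are dropped

    assigned = {tid: high[j] for tid, j in m1.items()}
    assigned.update({tid: low[j] for tid, j in m2.items()})
    new_candidates = [high[j] for j in left_high]  # unmatched high boxes may start new tracks
    return assigned, lost_tracks, new_candidates
```

In this sketch a low-confidence box survives only if it overlaps a tracklet that the first stage could not explain, which is how an occluded object keeps its identity while isolated low-score boxes are treated as background.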

Results and Performance

ByteTrack, equipped with the high-performance YOLOX detector, reports state-of-the-art results on several benchmarks:

MOT17: ByteTrack achieves 80.3 MOTA, 77.3 IDF1, and 63.1 HOTA on the test set at 30 FPS on a single V100 GPU, which the authors state is the first time these scores have been reached.
MOT20: Similar performance gains are observed with 77.8 MOTA, 75.2 IDF1, and 61.3 HOTA despite the more crowded scenes.
HiEve and BDD100K: ByteTrack continues to show state-of-the-art performance, demonstrating its robustness across different tracking challenges.

Beyond accuracy, the method is computationally efficient: the reported 30 FPS on MOT17 is a running speed suited to real-world applications.
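For readers unfamiliar with the metrics quoted in this section, the sketch below computes MOTA and IDF1 from their standard definitions. The function names and the counts in the example are made up for illustration and are not taken from the paper. HOTA is omitted because it involves averaging over localization thresholds and does not reduce to a single formula of this kind.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), an F1 score over identity-matched detections."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_mota = mota(false_negatives=120, false_positives=60, id_switches=20, num_gt=1000)
example_idf1 = idf1(idtp=750, idfp=250, idfn=250)
```

MOTA is driven mostly by detection errors (misses and false positives), while IDF1 rewards keeping the same identity on an object over time, which is why recovering occluded objects from low-score boxes shows up most clearly in IDF1.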

Implications and Future Directions

The implications of ByteTrack are substantial for both practical and theoretical advancements in MOT:

Practical Impact: ByteTrack's ability to effectively handle low-confidence detections makes it particularly valuable for applications requiring high reliability in varying conditions, such as surveillance, autonomous driving, and social computing.
Theoretical Contributions: BYTE's two-stage association process provides insights into the importance of accounting for low-confidence detections, challenging the conventional high-confidence-only paradigm prevalent in tracking methodologies.

Speculating on future developments, the principles underlying BYTE could potentially be extended to newer generation trackers that incorporate advanced feature extraction and learning mechanisms. Integration with more sophisticated Re-ID modules or motion prediction models may further enhance its performance, particularly in extremely dense and fast-moving scenarios.

Overall, ByteTrack is an effective and efficient solution for MOT that combines high accuracy with real-time speed. Its central idea, keeping low-score detections and letting their similarity to tracklets decide which ones are real, could inform tracking systems that stay reliable across a broader range of real-world conditions.

GitHub

GitHub - ifzhang/ByteTrack: [ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box (4,373 stars)
