One More Check: Making "Fake Background" Be Tracked Again (2104.09441v2)

Published 19 Apr 2021 in cs.CV

Abstract: The one-shot multi-object tracking, which integrates object detection and ID embedding extraction into a unified network, has achieved groundbreaking results in recent years. However, current one-shot trackers solely rely on single-frame detections to predict candidate bounding boxes, which may be unreliable when facing disastrous visual degradation, e.g., motion blur, occlusions. Once a target bounding box is mistakenly classified as background by the detector, the temporal consistency of its corresponding tracklet will be no longer maintained. In this paper, we set out to restore the bounding boxes misclassified as "fake background" by proposing a re-check network. The re-check network innovatively expands the role of ID embedding from data association to motion forecasting by effectively propagating previous tracklets to the current frame with a small overhead. Note that the propagation results are yielded by an independent and efficient embedding search, preventing the model from over-relying on detection results. Eventually, it helps to reload the "fake background" and repair the broken tracklets. Building on a strong baseline CSTrack, we construct a new one-shot tracker and achieve favorable gains by 70.7 $\rightarrow$ 76.4, 70.6 $\rightarrow$ 76.3 MOTA on MOT16 and MOT17, respectively. It also reaches a new state-of-the-art MOTA and IDF1 performance. Code is released at https://github.com/JudasDie/SOTS.

Summary

The paper introduces a re-check network that restores targets misclassified as background ("fake background") in one-shot multi-object tracking.
It employs a transductive detection module with cross-correlation and a refinement module to effectively filter false positives.
The approach raises MOTA from 70.7 to 76.4 on MOT16 and from 70.6 to 76.3 on MOT17, and is adaptable for real-time tracking applications.

Overview of "One More Check: Making 'Fake Background' Be Tracked Again"

The paper "One More Check: Making 'Fake Background' Be Tracked Again" addresses a limitation of one-shot multi-object tracking (MOT): current one-shot trackers rely solely on single-frame detections, which become unreliable under visual degradation such as motion blur and occlusion. When a target's bounding box is mistakenly classified as background, the temporal consistency of its tracklet is broken.

Core Contributions

The authors introduce a re-check network that enhances one-shot trackers by restoring bounding boxes misclassified as "fake background." The network extends ID embeddings, traditionally used only for data association, to motion forecasting: previous tracklets are propagated to the current frame through an embedding search that runs independently of the detector and adds only a small computational overhead.

Methodology

The method builds on the CSTrack baseline. It adds a double-check mechanism that complements the initial single-frame detections with the re-check network's predictions, so the tracker draws on temporal context from previous frames rather than on the current frame alone. Targets missed by the detector can thus be re-evaluated and restored.
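The double-check idea can be sketched as a simple fusion step. This is a minimal illustration, not the authors' implementation: it assumes boxes in `(x1, y1, x2, y2)` format, and the function names `iou_matrix` and `double_check` and the `iou_thresh` value are hypothetical. A propagated box that no current detection overlaps sufficiently is treated as a candidate "fake background" target and appended to the detections.

```python
import numpy as np

def iou_matrix(a, b):
    """Pairwise IoU between boxes a (N, 4) and b (M, 4), format (x1, y1, x2, y2)."""
    a = np.asarray(a, dtype=float).reshape(-1, 4)
    b = np.asarray(b, dtype=float).reshape(-1, 4)
    # Intersection corners, broadcast to shape (N, M).
    x1 = np.maximum(a[:, None, 0], b[None, :, 0])
    y1 = np.maximum(a[:, None, 1], b[None, :, 1])
    x2 = np.minimum(a[:, None, 2], b[None, :, 2])
    y2 = np.minimum(a[:, None, 3], b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.maximum(union, 1e-9)

def double_check(detections, propagated, iou_thresh=0.5):
    """Append propagated boxes that no current detection already covers.

    iou_thresh is an assumed value chosen for illustration only.
    Returns (merged boxes, restored boxes).
    """
    detections = np.asarray(detections, dtype=float).reshape(-1, 4)
    propagated = np.asarray(propagated, dtype=float).reshape(-1, 4)
    if len(propagated) == 0:
        return detections, propagated
    if len(detections) == 0:
        return propagated.copy(), propagated
    # Best overlap of each propagated box with any detection.
    max_iou = iou_matrix(propagated, detections).max(axis=1)
    restored = propagated[max_iou < iou_thresh]
    return np.vstack([detections, restored]), restored
```

In this sketch, a propagated box that coincides with an existing detection is dropped as redundant, while one that lands in a region the detector left empty is kept as a restored target.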

Key components include:

Transductive Detection Module: Uses a modified cross-correlation operation to perform a global search in ID embedding space, propagating previous tracklets to the current frame with only a small overhead.
Refinement Module: Combines visual features with similarity maps to filter false positives effectively, preserving the accuracy of restored bounding boxes.
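As a rough illustration of a global search in ID embedding space, the sketch below correlates one tracklet's embedding with a dense embedding map of the current frame. The array shapes, the cosine normalization, and the names `similarity_map` and `peak_location` are assumptions made for this example; the paper's modified cross-correlation and its learned refinement module are not reproduced here.

```python
import numpy as np

def similarity_map(track_embedding, frame_embeddings):
    """Cosine-similarity response of one tracklet embedding over a frame.

    track_embedding:  (C,) ID embedding of a tracklet from the previous frame.
    frame_embeddings: (C, H, W) dense ID embedding map of the current frame.
    Returns an (H, W) map. With a 1x1 kernel, cross-correlation reduces to
    a dot product at every spatial location, i.e. a search over the whole frame.
    """
    e = track_embedding / (np.linalg.norm(track_embedding) + 1e-12)
    norms = np.linalg.norm(frame_embeddings, axis=0, keepdims=True)
    f = frame_embeddings / (norms + 1e-12)
    return np.einsum("c,chw->hw", e, f)

def peak_location(sim):
    """(row, col) index of the strongest response in the similarity map."""
    return np.unravel_index(np.argmax(sim), sim.shape)

# Usage: plant a known embedding in a random map and recover its position.
rng = np.random.default_rng(0)
frame = rng.normal(size=(8, 5, 6))
track = frame[:, 2, 4].copy()
sim = similarity_map(track, frame)
row, col = peak_location(sim)
```

Because the search runs over every location of the embedding map rather than over detected boxes, the response does not depend on whether the detector fired at that position, which is the property the re-check network relies on.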

Strong Numerical Results

Adding the re-check network to CSTrack yields substantial gains on established MOT benchmarks: MOTA rises from 70.7 to 76.4 on MOT16 and from 70.6 to 76.3 on MOT17. According to the authors, the resulting tracker also reaches new state-of-the-art MOTA and IDF1 performance.
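For context, MOTA in its standard CLEAR MOT form (the definition is not restated in this summary, so it is given here as background) penalizes misses, false positives, and identity switches relative to the number of ground-truth objects:

$$
\mathrm{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t}
$$

Restoring a target that was misclassified as background removes a false negative, while a wrongly restored box adds a false positive, which is why filtering the restored boxes matters. The reported gains amount to $76.4 - 70.7 = 5.7$ points on MOT16 and $76.3 - 70.6 = 5.7$ points on MOT17.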

Implications and Future Directions

Practically, the re-check network's "plug-and-play" capability, as demonstrated with integration into other one-shot trackers like FairMOT, underscores its generalizability and potential for widespread adoption. This flexibility allows the methodology to be easily incorporated into existing frameworks, making it valuable for applications requiring real-time tracking, such as autonomous driving and surveillance.

Theoretically, this work suggests a promising direction in leveraging temporal cues within MOT systems, potentially stimulating further research into embedding-based motion forecasting methods. Future developments could explore the integration of additional contextual data or adaptive thresholding mechanisms to enhance robustness further.

In conclusion, this paper makes noteworthy contributions to the field of multi-object tracking, presenting a method that not only improves performance metrics but also enriches the underlying framework of one-shot trackers with temporal dynamics. The proposed re-check network represents a significant advancement in addressing the challenges posed by visual degradation in dynamic environments.
