A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds (2203.04232v2)

Published 8 Mar 2022 in cs.CV

Abstract: Recent works on 3D single object tracking treat the task as a target-specific 3D detection task, where an off-the-shelf 3D detector is commonly employed for the tracking. However, it is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete. In this paper, we address this issue by explicitly leveraging temporal motion cues and propose DMT, a Detector-free Motion-prediction-based 3D Tracking network that completely removes the usage of complicated 3D detectors and is lighter, faster, and more accurate than previous trackers. Specifically, the motion prediction module is first introduced to estimate a potential target center of the current frame in a point-cloud-free manner. Then, an explicit voting module is proposed to directly regress the 3D box from the estimated target center. Extensive experiments on KITTI and NuScenes datasets demonstrate that our DMT can still achieve better performance (~10% improvement over the NuScenes dataset) and a faster tracking speed (i.e., 72 FPS) than state-of-the-art approaches without applying any complicated 3D detectors. Our code is released at https://github.com/jimmy-dq/DMT

Summary

The paper introduces a novel detector-free motion-prediction-based 3D tracker that eliminates reliance on traditional 3D detectors.
It employs a motion prediction module to estimate target centers and an explicit voting module to regress 3D bounding boxes with enhanced precision.
Experiments on KITTI and NuScenes report better performance than prior trackers (roughly a 10% improvement on NuScenes) at a tracking speed of 72 FPS, which suits real-time uses such as autonomous driving and robotics.

The paper "A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds" targets a specific weakness of existing 3D single object trackers. Those trackers typically treat tracking as target-specific detection and rely on an off-the-shelf 3D detector, which struggles because objects in raw LiDAR scans are usually sparse and incomplete. The paper removes the detector entirely and tracks from temporal motion cues instead.

Key Contributions and Methodology

Detector-Free Motion-Prediction-Based 3D Tracking Network (DMT): The proposed DMT network eliminates the need for cumbersome 3D detectors, presenting a lightweight and faster alternative. The primary innovation lies in leveraging temporal motion cues to predict the motion of targets in a detector-free manner.
Motion Prediction Module: This module estimates a potential target center for the current frame. According to the abstract, it does so in a point-cloud-free manner, that is, from temporal motion cues rather than from the current scan.
Explicit Voting Module: Following motion prediction, an explicit voting module directly regresses the 3D bounding box from the estimated target center, so the final box is anchored on the predicted target location rather than on the output of a separate detector.
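The two-stage flow described above (first a coarse center from motion, then a box built around that center) can be illustrated with a deliberately simple sketch. This is not the authors' network: constant-velocity extrapolation stands in for the learned motion prediction module, and averaging nearby points stands in for the voting module. The function names, the `radius` parameter, and the choice to carry the box size over from an earlier frame are all assumptions made for illustration.

```python
from math import dist


def predict_center(past_centers):
    """Coarse center for the current frame, using only past box centers.

    Stand-in for the learned motion prediction module: a constant-velocity
    extrapolation from the last two centers. No points from the current
    scan are used here.
    """
    if len(past_centers) < 2:
        return tuple(past_centers[-1])
    prev, last = past_centers[-2], past_centers[-1]
    return tuple(2 * l - p for l, p in zip(last, prev))


def refine_center(points, coarse_center, radius=2.0):
    """Refine the coarse center with current-frame points.

    Stand-in for the voting module: average the points that fall within
    `radius` of the coarse center. Falls back to the coarse center when
    no points are nearby (a very sparse scan).
    """
    nearby = [p for p in points if dist(p, coarse_center) <= radius]
    if not nearby:
        return tuple(coarse_center)
    n = len(nearby)
    return tuple(sum(axis) / n for axis in zip(*nearby))


def track_step(past_centers, points, box_size, radius=2.0):
    """One tracking step: motion-based coarse center, then refinement.

    The box size is assumed to be known from an earlier frame.
    """
    coarse = predict_center(past_centers)
    center = refine_center(points, coarse, radius)
    return {"center": center, "size": tuple(box_size)}


# Hypothetical data: a target moving 1 m per frame along x.
history = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
scan = [(2.5, 0.0, 0.0), (1.5, 0.0, 0.0), (50.0, 50.0, 0.0)]
box = track_step(history, scan, box_size=(4.0, 2.0, 1.5))
print(box["center"])
```

The point of the sketch is the ordering: the coarse center never looks at the current scan, and the refinement only looks at points near that center, so no scene-wide detection step is needed.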

Performance and Experiments

The authors conducted extensive experiments using the KITTI and NuScenes datasets to validate DMT's performance. The key findings from these experiments are:

Performance Improvement:

DMT achieves approximately a 10% improvement over state-of-the-art approaches on the NuScenes dataset, and the abstract also reports better performance on KITTI. The authors attribute the gain to using temporal motion cues rather than target-specific detection on sparse, incomplete point clouds.

Tracking Speed:

DMT runs at 72 frames per second (FPS), which the authors report is faster than the state-of-the-art approaches they compare against and fast enough for real-time applications.
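To put the 72 FPS figure in context, it helps to convert it to a per-frame time and compare it with the interval between sensor frames. The sketch below does that arithmetic. The 10 Hz sensor rate is an assumption chosen for illustration (a common rate for spinning LiDAR units), not a number taken from the paper, and `frame_budget_ms` is a hypothetical helper name.

```python
def frame_budget_ms(rate_hz):
    """Time available (or spent) per frame, in milliseconds, at a given rate."""
    return 1000.0 / rate_hz


tracker_ms = frame_budget_ms(72)   # time per frame at the reported 72 FPS
sensor_ms = frame_budget_ms(10)    # assumed 10 Hz LiDAR, not from the paper
headroom_ms = sensor_ms - tracker_ms

print(f"tracker: {tracker_ms:.1f} ms per frame")
print(f"sensor interval: {sensor_ms:.1f} ms")
print(f"headroom: {headroom_ms:.1f} ms")
```

Under that assumption the tracker uses about 14 ms of a 100 ms frame interval, leaving the remainder for other stages of a perception pipeline.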

Practical Implications

The results are most relevant to autonomous driving and robotics, where object tracking must be both fast and accurate. Because DMT does not run a complex 3D detector, it avoids that processing overhead, which makes it a candidate for systems where both accuracy and efficiency matter.

The implementation of DMT is available on GitHub at https://github.com/jimmy-dq/DMT, so researchers and practitioners can inspect and build on it.

Overall, the paper indicates that 3D single object tracking on point clouds can be done without a 3D detector while improving on the reported accuracy and speed of prior trackers.
