UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation (2312.08952v2)
Abstract: Multi-object tracking (MOT) in video sequences remains a challenging task, especially in scenarios with significant camera movements. This is because targets can drift considerably on the image plane, leading to erroneous tracking outcomes. Addressing such challenges typically requires supplementary appearance cues or Camera Motion Compensation (CMC). While these strategies are effective, they also introduce a considerable computational burden, posing challenges for real-time MOT. In response to this, we introduce UCMCTrack, a novel motion model-based tracker robust to camera movements. Unlike conventional CMC that computes compensation parameters frame-by-frame, UCMCTrack consistently applies the same compensation parameters throughout a video sequence. It employs a Kalman filter on the ground plane and introduces the Mapped Mahalanobis Distance (MMD) as an alternative to the traditional Intersection over Union (IoU) distance measure. By leveraging projected probability distributions on the ground plane, our approach efficiently captures motion patterns and adeptly manages uncertainties introduced by homography projections. Remarkably, UCMCTrack, relying solely on motion cues, achieves state-of-the-art performance across a variety of challenging datasets, including MOT17, MOT20, DanceTrack and KITTI. More details and code are available at https://github.com/corfyi/UCMCTrack
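The ground-plane association idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the first-order (Jacobian) propagation of pixel noise, and the plain squared Mahalanobis form are assumptions made for illustration, and the paper's exact definition of MMD may differ. The sketch assumes a known 3x3 homography `H` that maps image pixels to ground-plane coordinates, a detection reference point `uv` (for example the bottom centre of a bounding box), and a track whose predicted ground-plane position and covariance come from a Kalman filter.

```python
import numpy as np

def project_to_ground(H, uv):
    """Map a pixel (u, v) to ground-plane (x, y) using homography H (image -> ground)."""
    p = H @ np.array([uv[0], uv[1], 1.0])
    return p[:2] / p[2]

def projection_jacobian(H, uv):
    """2x2 Jacobian of the ground-plane projection with respect to (u, v)."""
    x, y, w = H @ np.array([uv[0], uv[1], 1.0])
    # Quotient rule applied to x/w and y/w.
    return np.array([
        [H[0, 0] * w - x * H[2, 0], H[0, 1] * w - x * H[2, 1]],
        [H[1, 0] * w - y * H[2, 0], H[1, 1] * w - y * H[2, 1]],
    ]) / w**2

def mapped_mahalanobis_sq(H, uv, pixel_cov, track_xy, track_cov):
    """Squared Mahalanobis distance between a detection and a track on the ground plane.

    pixel_cov : 2x2 detection noise covariance in pixels (assumed known).
    track_xy  : predicted ground-plane position of the track.
    track_cov : 2x2 covariance of that prediction.
    """
    z = project_to_ground(H, uv)
    J = projection_jacobian(H, uv)
    R = J @ pixel_cov @ J.T          # pixel noise mapped onto the ground plane
    S = track_cov + R                # combined (innovation) covariance
    r = z - np.asarray(track_xy, dtype=float)
    return float(r @ np.linalg.solve(S, r))
```

Because the pixel noise is mapped through the homography, a detection far from the camera (where one pixel covers more ground) receives a larger ground-plane uncertainty than a nearby one, which is the kind of projection-induced uncertainty the abstract refers to. A cost matrix built from such distances could then be passed to any standard assignment solver.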