
EarlyBird: Early-Fusion for Multi-View Tracking in the Bird's Eye View (2310.13350v1)

Published 20 Oct 2023 in cs.CV

Abstract: Multi-view aggregation promises to overcome the challenges of occlusion and missed detections in multi-object detection and tracking. Recent approaches in multi-view detection and 3D object detection made a huge performance leap by projecting all views to the ground plane and performing the detection in the Bird's Eye View (BEV). In this paper, we investigate whether tracking in the BEV can also bring the next performance breakthrough in Multi-Target Multi-Camera (MTMC) tracking. Most current approaches in multi-view tracking perform the detection and tracking tasks in each view and use graph-based approaches to associate each pedestrian across views. This spatial association is already solved by detecting each pedestrian once in the BEV, leaving only the problem of temporal association. For the temporal association, we show how to learn strong Re-Identification (re-ID) features for each detection. The results show that early fusion in the BEV achieves high accuracy for both detection and tracking. EarlyBird outperforms the state-of-the-art methods and improves the current state of the art on Wildtrack by +4.6 MOTA and +5.6 IDF1.
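To make the idea of temporal association in the BEV concrete, here is a minimal sketch under stated assumptions. It is not the paper's implementation: the abstract does not specify the matching procedure, so this example substitutes a simple greedy matcher that pairs existing tracks with new BEV detections using cosine distance between re-ID features, gated by ground-plane distance. The function names (`cosine_distance`, `associate`), the dictionary layout, and the thresholds `max_dist_m` and `max_cos` are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def associate(tracks, detections, max_dist_m=1.0, max_cos=0.5):
    """Greedily match tracks to detections on the ground plane.

    Each track/detection is a dict with:
      'pos'  - (x, y) ground-plane position in metres (assumed layout)
      'feat' - re-ID feature vector (assumed layout)
    Returns a list of (track_index, detection_index) pairs.
    """
    candidates = []
    for ti, t in enumerate(tracks):
        for di, d in enumerate(detections):
            ground = math.dist(t["pos"], d["pos"])
            appearance = cosine_distance(t["feat"], d["feat"])
            # Gate: keep only pairs that are close in space and appearance.
            if ground <= max_dist_m and appearance <= max_cos:
                candidates.append((appearance, ground, ti, di))

    # Best appearance match first; ground distance breaks ties.
    candidates.sort()
    matches, used_tracks, used_dets = [], set(), set()
    for _, _, ti, di in candidates:
        if ti not in used_tracks and di not in used_dets:
            matches.append((ti, di))
            used_tracks.add(ti)
            used_dets.add(di)
    return matches


# Toy data: two tracks from the previous frame, three new BEV detections.
tracks = [
    {"pos": (0.0, 0.0), "feat": [1.0, 0.0]},
    {"pos": (2.0, 0.0), "feat": [0.0, 1.0]},
]
detections = [
    {"pos": (2.1, 0.1), "feat": [0.1, 0.9]},
    {"pos": (0.2, 0.0), "feat": [0.9, 0.1]},
    {"pos": (5.0, 5.0), "feat": [1.0, 1.0]},  # too far from any track
]
print(sorted(associate(tracks, detections)))
```

Because each pedestrian is detected only once in the BEV, the matcher needs no cross-camera step; unmatched detections (the third one here) would typically start new tracks. A production tracker would more likely use an optimal assignment solver and a motion model instead of greedy matching.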
