
SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking (2211.08824v4)

Published 16 Nov 2022 in cs.CV

Abstract: Despite recent progress in Multiple Object Tracking (MOT), several obstacles such as occlusions, similar objects, and complex scenes remain open challenges. Meanwhile, a systematic study of the cost-performance tradeoff for the popular tracking-by-detection paradigm is still lacking. This paper introduces SMILEtrack, an innovative object tracker that addresses these challenges by integrating an efficient object detector with a Siamese network-based Similarity Learning Module (SLM). The technical contributions of SMILEtrack are twofold. First, we propose an SLM that calculates the appearance similarity between two objects, overcoming the limitations of feature descriptors in Separate Detection and Embedding (SDE) models. The SLM incorporates a Patch Self-Attention (PSA) block inspired by the Vision Transformer, which generates reliable features for accurate similarity matching. Second, we develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames, further enhancing MOT performance. Together, these innovations help SMILEtrack achieve an improved trade-off between cost (e.g., running speed) and performance (e.g., tracking accuracy) compared with several state-of-the-art methods, including the popular ByteTrack. SMILEtrack outperforms ByteTrack by 0.4-0.8 MOTA and 2.1-2.2 HOTA points on the MOT17 and MOT20 datasets. Code is available at https://github.com/pingyang1117/SMILEtrack_Official
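To make the abstract's core idea concrete, the following is a minimal schematic sketch (not the paper's implementation) of a Siamese similarity module with patch self-attention: each object crop is split into patches, a single-head self-attention layer re-weights the patches, the result is pooled into a normalized embedding, and appearance similarity is the cosine of the two embeddings. All names, the use of random (untrained) projection weights, the grayscale single-crop input, and the mean-pooling step are illustrative assumptions; the actual SLM/PSA design is learned end-to-end and defined in the paper and repository.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def patch_self_attention(patches, Wq, Wk, Wv):
    # patches: (N, d) matrix, one row per image patch.
    # Single-head scaled dot-product attention over the patches,
    # loosely mirroring the PSA block's role of relating patches.
    Q, K, V = patches @ Wq, patches @ Wk, patches @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)
    return attn @ V  # (N, d): patches re-weighted by pairwise relevance

def embed(crop, patch_size, Wq, Wk, Wv):
    # crop: (H, W) grayscale array; split into non-overlapping patches,
    # attend over them, then mean-pool to a unit-norm appearance vector.
    H, W = crop.shape
    ps = patch_size
    patches = np.array([
        crop[i:i + ps, j:j + ps].ravel()
        for i in range(0, H - ps + 1, ps)
        for j in range(0, W - ps + 1, ps)
    ])
    attended = patch_self_attention(patches, Wq, Wk, Wv)
    v = attended.mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-8)

def similarity(crop_a, crop_b, patch_size=8, seed=0):
    # Siamese structure: the SAME weights embed both crops; similarity
    # is the cosine (dot product of unit vectors). Weights are random
    # here purely for illustration — the real module is trained.
    d = patch_size * patch_size
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    return float(embed(crop_a, patch_size, Wq, Wk, Wv)
                 @ embed(crop_b, patch_size, Wq, Wk, Wv))
```

With shared weights, identical crops score a cosine similarity of 1, while distinct crops fall somewhere in [-1, 1]; in a tracker, such scores would feed the association step (e.g., the cascade matching the abstract describes) alongside motion cues.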

