
JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments (2404.01686v1)

Published 2 Apr 2024 in cs.CV

Abstract: Autonomous robot systems have attracted increasing research attention in recent years, where environment understanding is a crucial step for robot navigation, human-robot interaction, and decision-making. Real-world robot systems usually collect visual data from multiple sensors and are required to recognize numerous objects and their movements in complex, human-crowded settings. Traditional benchmarks, with their reliance on single sensors and limited object classes and scenarios, fail to provide the comprehensive environmental understanding robots need for accurate navigation, interaction, and decision-making. As an extension of the JRDB dataset, we unveil JRDB-PanoTrack, a novel open-world panoptic segmentation and tracking benchmark aimed at more comprehensive environmental perception. JRDB-PanoTrack includes (1) varied data covering indoor and outdoor crowded scenes, with comprehensive synchronized 2D and 3D data modalities; (2) high-quality 2D spatial panoptic segmentation and temporal tracking annotations, with additional 3D label projections for further spatial understanding; (3) diverse object classes for closed- and open-world recognition benchmarks, with OSPA-based metrics for evaluation. Extensive evaluation of leading methods shows that our dataset poses significant challenges.
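
The abstract refers to OSPA-based metrics without defining them. As background only, the sketch below computes the classic OSPA (Optimal Sub-Pattern Assignment) distance between two small sets of points. It is not the paper's metric, which is applied to segmentation and tracking outputs and is not specified on this page. The function name `ospa`, the Euclidean base distance, and the defaults `c=1.0`, `p=1` are assumptions chosen for illustration, and the brute-force search over assignments is only practical for very small sets.

```python
from itertools import permutations
from math import dist  # Euclidean distance, Python 3.8+


def ospa(X, Y, c=1.0, p=1):
    """Illustrative OSPA distance between two finite sets of points.

    c is the cutoff (penalty for an unmatched point), p is the order.
    Hypothetical helper for explanation, not the paper's evaluation code.
    """
    # Make X the smaller set so every point in X gets a partner in Y.
    if len(X) > len(Y):
        X, Y = Y, X
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0  # two empty sets are identical

    # Brute-force search over all ways to match X to m distinct points of Y,
    # with each pairwise distance clipped at the cutoff c.
    best = min(
        sum(min(c, dist(x, Y[j])) ** p for x, j in zip(X, perm))
        for perm in permutations(range(n), m)
    )
    # Each of the n - m unmatched points in Y costs the full cutoff c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (10.0, 10.0)]
    estimate = [(0.0, 0.0)]
    print(ospa(estimate, truth, c=1.0, p=1))
```

The distance combines a localization term (how far matched points are from each other) with a cardinality term (how many points were missed or spurious), which is why a single number can penalize both kinds of error.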

Authors (7)
  1. Duy-Tho Le
  2. Chenhui Gou
  3. Stavya Datta
  4. Hengcan Shi
  5. Ian Reid
  6. Jianfei Cai
  7. Hamid Rezatofighi
