RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features (2403.01731v1)

Published 4 Mar 2024 in cs.CV and cs.RO

Abstract: In order to successfully perform manipulation tasks in new environments, such as grasping, robots must be proficient in segmenting unseen objects from the background and/or other objects. Previous works perform unseen object instance segmentation (UOIS) by training deep neural networks on large-scale data to learn RGB/RGB-D feature embeddings, where cluttered environments often result in inaccurate segmentations. We build upon these methods and introduce a novel approach to correct inaccurate segmentation, such as under-segmentation, of static image-based UOIS masks by using robot interaction and a designed body frame-invariant feature. We demonstrate that the relative linear and rotational velocities of frames randomly attached to rigid bodies due to robot interactions can be used to identify objects and accumulate corrected object-level segmentation masks. By introducing motion to regions of segmentation uncertainty, we are able to drastically improve segmentation accuracy in an uncertainty-driven manner with minimal, non-disruptive interactions (ca. 2-3 per scene). We demonstrate the effectiveness of our proposed interactive perception pipeline in accurately segmenting cluttered scenes by achieving an average object segmentation accuracy rate of 80.7%, an increase of 28.2% when compared with other state-of-the-art UOIS methods.
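The core observation behind the body frame-invariant feature is that any two frames attached to the same rigid body keep a constant relative pose, and hence consistent relative linear and rotational velocities, while the body moves under a robot push. Tracked features can therefore be grouped by motion consistency. The sketch below is an illustrative simplification, not the paper's implementation: instead of comparing twists of attached frames, it clusters tracked 3D points by whether their pairwise distances are preserved across an interaction, which is the distance-preservation consequence of rigid-body motion. All function names and the noise tolerance are assumptions for the example.

```python
import numpy as np

def rigidity_graph(pts_before: np.ndarray,
                   pts_after: np.ndarray,
                   tol: float = 0.005) -> np.ndarray:
    """Adjacency matrix over N tracked 3D points (N x 3 arrays, metres).

    Two points are connected if their pairwise distance is preserved
    across the interaction -- a necessary condition for lying on the
    same rigid body. `tol` is an illustrative noise threshold.
    """
    d_before = np.linalg.norm(pts_before[:, None] - pts_before[None, :], axis=-1)
    d_after = np.linalg.norm(pts_after[:, None] - pts_after[None, :], axis=-1)
    return np.abs(d_before - d_after) < tol

def cluster_rigid_bodies(pts_before, pts_after, tol=0.005):
    """Assign each tracked point a rigid-body label via connected components."""
    adj = rigidity_graph(pts_before, pts_after, tol)
    n = len(pts_before)
    labels = np.full(n, -1, dtype=int)
    next_label = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:  # depth-first flood fill over the rigidity graph
            i = stack.pop()
            for j in np.flatnonzero(adj[i]):
                if labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

if __name__ == "__main__":
    # Two points translate together (one object); a third stays put (background).
    before = np.array([[0.0, 0.0, 0.5], [0.1, 0.0, 0.5], [0.5, 0.5, 0.5]])
    after = before + np.array([[0.03, 0.0, 0.0], [0.03, 0.0, 0.0], [0.0, 0.0, 0.0]])
    print(cluster_rigid_bodies(before, after))  # -> [0 0 1]
```

In the paper's pipeline, labels of this kind are accumulated over a small number of targeted, uncertainty-driven pushes (roughly 2-3 per scene) to split under-segmented regions of the initial UOIS mask; the full method tracks randomly attached body frames and compares their relative velocities, which is more discriminative than the raw point-distance test sketched here.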
