
Bin-picking of novel objects through category-agnostic segmentation: RGB matters (2312.16741v1)

Published 27 Dec 2023 in cs.RO

Abstract: This paper addresses category-agnostic instance segmentation for robotic manipulation, focusing on segmenting objects independently of their class to enable versatile applications such as bin-picking in dynamic environments. Existing methods often lack generalizability and object-specific information, leading to grasp failures. We present a novel approach that leverages object-centric instance segmentation and simulation-based training for effective transfer to real-world scenarios. Notably, our strategy overcomes the challenges posed by noisy depth sensors, enhancing the reliability of learning, and accommodates transparent and semi-transparent objects, which are historically difficult for depth-based grasping methods. Contributions include domain randomization for successful sim-to-real transfer, a dataset collected for warehouse applications, and an integrated framework for efficient bin-picking. Our trained instance segmentation model achieves state-of-the-art performance on the WISDOM public benchmark [1] as well as on our custom dataset. In a challenging real-world bin-picking setup, our framework achieves 98% accuracy for opaque objects and 97% accuracy for non-opaque objects, outperforming state-of-the-art baselines by a significant margin.
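The two ingredients the abstract names are (a) a segmentation model whose class head is collapsed to a single generic "object" label, so it transfers to unseen items, and (b) domain randomization on the RGB input so a simulation-trained model survives the move to real sensors. The sketch below illustrates both in PyTorch/torchvision, assuming a Mask R-CNN backbone (the references build on Mask R-CNN [19] and PyTorch [46]); the layer sizes, augmentation values, and helper names are illustrative assumptions, not the authors' reported settings.

```python
# Minimal sketch, NOT the paper's implementation: a class-agnostic
# Mask R-CNN plus photometric RGB domain randomization. All numeric
# values below are assumed for illustration.
import torchvision
from torchvision import transforms as T
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 2  # background + one generic "object" class (category-agnostic)

def build_category_agnostic_model():
    """Mask R-CNN whose box and mask heads predict only object-vs-background."""
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    # Replace the classification head with a two-class predictor.
    in_feats = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, NUM_CLASSES)
    # Replace the mask head to match the two-class setup.
    mask_feats = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(mask_feats, 256, NUM_CLASSES)
    return model

# Photometric domain randomization applied to synthetic RGB frames, so the
# model cannot overfit to renderer-specific appearance (cf. Tobin et al. [18]).
rgb_domain_randomization = T.Compose([
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.05),
    T.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
    T.RandomGrayscale(p=0.1),
])
```

Because the class head predicts only object-vs-background, the trained weights require no per-category labels at test time, which is what makes this style of model usable for bin-picking of novel objects.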

References (47)
  1. M. Danielczuk, M. Matl, S. Gupta, A. Li, A. Lee, J. Mahler, and K. Goldberg, “Segmenting unknown 3d objects from real depth images using mask r-cnn trained on synthetic data,” in 2019 International Conference on Robotics and Automation (ICRA).   IEEE, 2019, pp. 7283–7290.
  2. D. Morrison, A. W. Tow, M. Mctaggart, R. Smith, N. Kelly-Boxall, S. Wade-Mccue, J. Erskine, R. Grinover, A. Gurman, T. Hunn et al., “Cartman: The low-cost cartesian manipulator that won the amazon robotics challenge,” in 2018 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2018, pp. 7757–7764.
  3. A. Kumar and L. Behera, “Semi supervised deep quick instance detection and segmentation,” in 2019 International Conference on Robotics and Automation (ICRA).   IEEE, 2019, pp. 8325–8331.
  4. J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. A. Ojea, and K. Goldberg, “Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics,” in Proceedings of Robotics: Science and Systems (RSS), 2017.
  5. J. Mahler, M. Matl, V. Satish, M. Danielczuk, B. DeRose, S. McKinley, and K. Goldberg, “Learning ambidextrous robot grasping policies,” Science Robotics, vol. 4, no. 26, 2019.
  6. P. Raj, A. Kumar, V. Sanap, T. Sandhan, and L. Behera, “Towards object agnostic and robust 4-dof table-top grasping,” in IEEE 18th International Conference on Automation Science and Engineering (CASE), 2022.
  7. W. Wang, W. Liu, J. Hu, Y. Fang, Q. Shao, and J. Qi, “Graspfusionnet: a two-stage multi-parameter grasp detection network based on rgb–xyz fusion in dense clutter,” Machine Vision and Applications, vol. 31, no. 7, pp. 1–19, 2020.
  8. K. Tung, J. Su, J. Cai, Z. Wan, and H. Cheng, “Uncertainty-based exploring strategy in densely cluttered scenes for vacuum cup grasping,” in 2022 International Conference on Robotics and Automation (ICRA).   IEEE, 2022, pp. 3483–3489.
  9. T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft coco: Common objects in context,” in Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V.   Springer, 2014, pp. 740–755.
  10. P. Raj, V. P. Namboodiri, and L. Behera, “Learning to switch cnns with model agnostic meta learning for fine precision visual servoing,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2020, pp. 10210–10217.
  11. D. Horváth, G. Erdős, Z. Istenes, T. Horváth, and S. Földi, “Object detection using sim2real domain randomization for robotic applications,” IEEE Transactions on Robotics, 2022.
  12. C. Xie, Y. Xiang, A. Mousavian, and D. Fox, “The best of both modes: Separately leveraging rgb and depth for unseen object instance segmentation,” in Conference on Robot Learning.   PMLR, 2020, pp. 1369–1378.
  13. S. Höfer, K. Bekris, A. Handa, J. C. Gamboa, M. Mozifian, F. Golemo, C. Atkeson, D. Fox, K. Goldberg, J. Leonard et al., “Sim2real in robotics and automation: Applications and challenges,” IEEE Transactions on Automation Science and Engineering, vol. 18, no. 2, pp. 398–400, 2021.
  14. Y. Xiang, C. Xie, A. Mousavian, and D. Fox, “Learning rgb-d feature embeddings for unseen object instance segmentation,” in Conference on Robot Learning.   PMLR, 2021, pp. 461–470.
  15. C. Xie, Y. Xiang, A. Mousavian, and D. Fox, “Unseen object instance segmentation for robotic environments,” IEEE Transactions on Robotics, vol. 37, no. 5, pp. 1343–1359, 2021.
  16. S. Back, J. Lee, T. Kim, S. Noh, R. Kang, S. Bak, and K. Lee, “Unseen object amodal instance segmentation via hierarchical occlusion modeling,” in 2022 International Conference on Robotics and Automation (ICRA).   IEEE, 2022, pp. 5085–5092.
  17. S. Back, J. Kim, R. Kang, S. Choi, and K. Lee, “Segmenting unseen industrial components in a heavy clutter using rgb-d fusion and synthetic data,” in 2020 IEEE International Conference on Image Processing (ICIP), 2020, pp. 828–832.
  18. J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain randomization for transferring deep neural networks from simulation to the real world,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2017, pp. 23–30.
  19. K. He, G. Gkioxari, P. Dollar, and R. Girshick, “Mask r-cnn,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017.
  20. S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Proceedings of the Conference on Neural Information Processing Systems (NIPS), 2015.
  21. X. Zhang, X. Zhou, M. Lin, and J. Sun, “Shufflenet v2: Practical guidelines for efficient cnn architecture design,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
  22. T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  23. J. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “Big transfer (bit): General visual representation learning,” in Proceedings of the International Conference on Learning Representations (ICLR), 2021.
  24. A. Kirillov, R. Girshick, K. He, and P. Dollár, “Panoptic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  25. A. Mousavian, A. Toshev, and A. Fathi, “Joint object and pose recognition using conditional latent-variable models,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7753–7761.
  26. Y. Zhou, H. Zheng, B. Zhao, and J. Liu, “An object recognition and localization method based on category-agnostic instance segmentation,” in Proceedings of the 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA).   IEEE, 2020, pp. 51–55.
  27. Y. Li, X. Wang, W. Cao, J. Li, and H. Lu, “A real-time system for robotic bin picking based on category-agnostic instance segmentation,” in 2020 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2020, pp. 10432–10438.
  28. L. Qin, S. Yang, and H. Liu, “An efficient and robust category-agnostic instance segmentation method for bin-picking robots,” Robotics and Autonomous Systems, vol. 120, pp. 104–116, 2019.
  29. X. Yin, Y. Huang, X. Li, Y. Zhang, J. Li, S. Li, and S. Chen, “Bin-picking for multiple objects based on category-agnostic instance segmentation and hand–eye calibration,” IEEE Transactions on Automation Science and Engineering, vol. 18, no. 3, pp. 1427–1438, 2021.
  30. S. James, A. J. Davison, and E. Johns, “Transferring end-to-end visuomotor control from simulation to real world for a multi-stage task,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 2217–2226.
  31. C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol. 6, no. 1, p. 60, 2019.
  32. S. James, A. Gupta, and S. Levine, “Sim-to-real transfer of robotic control with dynamics randomization,” in 2019 International Conference on Robotics and Automation (ICRA).   IEEE, 2019, pp. 6914–6920.
  33. F. Sadeghi and S. Levine, “Sim2real view invariant visual servoing by recurrent control,” in Robotics: Science and Systems (RSS), 2018.
  34. X. B. Peng, G. Berseth, C. Yin, and S. Schaal, “Deep reinforcement learning for robotic locomotion with asynchronous advantage actor-critic (a3c),” in IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2018, pp. 1–8.
  35. M. Johnson-Roberson, C. Barto, R. Mehta, S. N. Sridhar, K. Rosaen, and R. Vasudevan, “Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks?” in 2017 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2017, pp. 746–753.
  36. M. Fujita, Y. Domae, A. Noda, G. Garcia Ricardez, T. Nagatani, A. Zeng, S. Song, A. Rodriguez, A. Causo, I.-M. Chen et al., “What are the important technologies for bin picking? technology analysis of robots in competitions based on a set of performance metrics,” Advanced Robotics, vol. 34, no. 7-8, pp. 560–574, 2020.
  37. A. Cordeiro, L. F. Rocha, C. Costa, P. Costa, and M. F. Silva, “Bin picking approaches based on deep learning techniques: A state-of-the-art survey,” in 2022 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), 2022, pp. 110–117.
  38. S. Kumra, S. Joshi, and F. Sahin, “Antipodal robotic grasping using generative residual convolutional neural network,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2020, pp. 9626–9633.
  39. S. V. Pharswan, M. Vohra, A. Kumar, and L. Behera, “Domain-independent unsupervised detection of grasp regions to grasp novel objects,” in 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019, pp. 640–645.
  40. S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen, “Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection,” The International Journal of Robotics Research, vol. 37, no. 4-5, pp. 421–436, 2018.
  41. P. Raj, A. Singhal, V. Sanap, L. Behera, and R. Sinha, “Domain-independent disperse and pick method for robotic grasping,” in 2022 International Joint Conference on Neural Networks (IJCNN), 2022.
  42. P. Raj, L. Behera, and T. Sandhan, “Scalable and time-efficient bin-picking for unknown objects in dense clutter,” IEEE Transactions on Automation Science and Engineering, pp. 1–13, 2023.
  43. L. D. Hanh and K. T. G. Hieu, “3d matching by combining cad model and computer vision for autonomous bin picking,” International Journal on Interactive Design and Manufacturing (IJIDeM), vol. 15, pp. 239–247, 2021.
  44. L. Downs, A. Francis, N. Koenig, B. Kinman, R. Hickman, K. Reymann, T. B. McHugh, and V. Vanhoucke, “Google scanned objects: A high-quality dataset of 3d scanned household items,” arXiv preprint arXiv:2204.11918, 2022.
  45. M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, and A. Vedaldi, “Describing textures in the wild,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 3606–3613.
  46. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32.   Curran Associates, Inc., 2019, pp. 8024–8035.
  47. T. Lin, M. Maire, S. J. Belongie, L. D. Bourdev, R. B. Girshick, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: common objects in context,” CoRR, vol. abs/1405.0312, 2014.
Authors (5)
  1. Prem Raj (3 papers)
  2. Sachin Bhadang (1 paper)
  3. Gaurav Chaudhary (31 papers)
  4. Laxmidhar Behera (31 papers)
  5. Tushar Sandhan (5 papers)
Citations (1)