Get a Grip: Multi-Finger Grasp Evaluation at Scale Enables Robust Sim-to-Real Transfer (2410.23701v1)

Published 31 Oct 2024 in cs.RO

Abstract: This work explores conditions under which multi-finger grasping algorithms can attain robust sim-to-real transfer. While numerous large datasets facilitate learning generative models for multi-finger grasping at scale, reliable real-world dexterous grasping remains challenging, with most methods degrading when deployed on hardware. An alternate strategy is to use discriminative grasp evaluation models for grasp selection and refinement, conditioned on real-world sensor measurements. This paradigm has produced state-of-the-art results for vision-based parallel-jaw grasping, but remains unproven in the multi-finger setting. In this work, we find that existing datasets and methods have been insufficient for training discriminative models for multi-finger grasping. To train grasp evaluators at scale, datasets must provide on the order of millions of grasps, including both positive and negative examples, with corresponding visual data resembling measurements at inference time. To that end, we release a new, open-source dataset of 3.5M grasps on 4.3K objects annotated with RGB images, point clouds, and trained NeRFs. Leveraging this dataset, we train vision-based grasp evaluators that outperform both analytic and generative modeling-based baselines on extensive simulated and real-world trials across a diverse range of objects. We show via numerous ablations that the key factor for performance is indeed the evaluator, and that its quality degrades as the dataset shrinks, demonstrating the importance of our new dataset. Project website at: https://sites.google.com/view/get-a-grip-dataset.

Summary

  • The paper introduces a novel sim-to-real transfer framework that leverages a 3.5M grasp dataset to train robust discriminative evaluators.
  • It employs real-world sensor inputs in a vision-based evaluation pipeline to outperform traditional analytic and learning-based grasp models.
  • The findings emphasize the critical role of large-scale data and evaluator design in advancing reliable multi-finger robotic manipulation.

Multi-Finger Grasp Evaluation and Sim-to-Real Transfer in Robotics

This paper advances multi-finger robotic grasping by proposing a robust framework for sim-to-real transfer built on discriminative grasp evaluators. The authors address the persistent gap between simulated and real-world performance in dexterous grasping, a well-known challenge in robotic manipulation.

Contributions of the Paper

  1. Large-Scale Grasp Dataset: The authors release a comprehensive dataset of 3.5 million grasps across 4,300 unique objects. The dataset stands out by offering both positive and negative grasp examples, annotated with realistic perceptual data: RGB images, point clouds, and trained NeRF models. This scale enables the training of robust discriminative models that generalize well from simulation to real-world deployment.
  2. Evaluation Pipeline: The paper centers on discriminative grasp evaluation conditioned on real-world sensor inputs, an approach that is well established for parallel-jaw grasping but remains underexplored in multi-finger setups. The pipeline is trained on data that reflect real-world measurement conditions and that cover both grasp successes and failures; a minimal sketch of the evaluator and selection loop appears after this list.
  3. Outperformance of Baselines: The authors demonstrate that their vision-based grasp evaluators outperform both traditional physics-based analytic models and recent learning-based generative models. The results are substantiated with empirical evaluations in both simulation and physical trials, confirming the models' robustness across a diverse object set.
  4. Implications of Evaluator Reliance: Through comprehensive ablation studies, the authors highlight the pivotal role of the evaluator in the grasping pipeline. Performance degrades significantly as the training set shrinks, reinforcing the importance of scale for effective discriminative grasp evaluation.
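
To make the evaluator-plus-selection recipe concrete, the following is a minimal, hypothetical PyTorch sketch. The architecture, dimensions, and record fields are illustrative assumptions, not the paper's actual implementation, which conditions on richer vision-based inputs.

```python
import torch
import torch.nn as nn

class GraspEvaluator(nn.Module):
    """Predicts a success logit for a grasp given a fixed-size scene encoding."""
    def __init__(self, scene_dim=4096, grasp_dim=23, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(scene_dim + grasp_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # logit for P(grasp succeeds | observation)
        )

    def forward(self, scene, grasp):
        return self.net(torch.cat([scene, grasp], dim=-1)).squeeze(-1)

# One hypothetical dataset record: grasp parameters, a point-cloud
# encoding of the scene, and a simulated success label.
record = {
    "grasp": torch.randn(23),    # e.g. wrist pose + finger joint configuration
    "scene": torch.randn(4096),  # e.g. a basis-point-set point-cloud encoding
    "label": torch.tensor(1.0),  # 1.0 = grasp succeeded in simulation
}

evaluator = GraspEvaluator()

# Training step: binary cross-entropy against the simulated label.
logit = evaluator(record["scene"].unsqueeze(0), record["grasp"].unsqueeze(0))
loss = nn.functional.binary_cross_entropy_with_logits(
    logit, record["label"].unsqueeze(0)
)

# Test-time grasp selection: rank sampled candidates by predicted success.
candidates = torch.randn(1024, 23)                     # e.g. from a generative sampler
scene = record["scene"].unsqueeze(0).expand(1024, -1)  # shared observation
with torch.no_grad():
    scores = torch.sigmoid(evaluator(scene, candidates))
best = candidates[scores.argmax()]
```

Under this reading, the generative model only proposes candidates; the discriminative evaluator, trained on millions of labeled grasps, decides which one to execute.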

Practical and Theoretical Implications

The proposed methodology sharpens our understanding of how a dataset's scale and composition drive model performance in robotic grasping, reaffirming the critical role of large datasets in algorithm training, analogous to trends observed in other areas of AI. Practically, the presented system can be adapted for applications ranging from anthropomorphic manipulation in service robotics to autonomous field robots handling irregular objects.

Theoretically, the work offers new evidence for the efficacy of data-driven approaches over traditional models that rely on precise geometric information, which is often unreliable under real-world sensor noise. The paper suggests a promising direction for future research in refining evaluator-based methods and integrating multi-modal sensor data to augment grasp evaluation; a sketch of evaluator-guided refinement follows.
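
Continuing the hypothetical sketch above, evaluator-guided refinement can be cast as gradient ascent on the predicted success logit with respect to the grasp parameters. The variables `evaluator`, `scene`, and `best` carry over from the previous sketch; the optimizer, step size, and iteration count are arbitrary assumptions rather than the paper's procedure.

```python
# Freeze the evaluator's weights; gradients still flow to the input grasp.
evaluator.requires_grad_(False)

grasp = best.clone().requires_grad_(True)
optimizer = torch.optim.Adam([grasp], lr=1e-2)
for _ in range(50):
    optimizer.zero_grad()
    # Maximize the success logit (minimize its negation).
    loss = -evaluator(scene[:1], grasp.unsqueeze(0)).mean()
    loss.backward()
    optimizer.step()
refined = grasp.detach()  # a real system would also enforce joint limits
```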

Future Directions

Future research could extend dataset diversity to non-rigid and articulated objects and address the complexities of grasping in cluttered or occluded environments. There is also potential to explore representation learning methods that improve grasp prediction accuracy without an extensive data collection phase, and to develop faster evaluators that meet real-time requirements in more demanding manipulation tasks.

This paper contributes valuable knowledge and resources to the field of robotic manipulation, particularly in overcoming barriers to achieving reliable sim-to-real transfer in complex grasping scenarios.
