
Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation (2403.03949v3)

Published 6 Mar 2024 in cs.RO, cs.AI, and cs.LG

Abstract: Imitation learning methods need significant human supervision to learn policies robust to changes in object poses, physical disturbances, and visual distractors. Reinforcement learning, on the other hand, can explore the environment autonomously to learn robust behaviors but may require impractical amounts of unsafe real-world data collection. To learn performant, robust policies without the burden of unsafe real-world data collection or extensive human supervision, we propose RialTo, a system for robustifying real-world imitation learning policies via reinforcement learning in "digital twin" simulation environments constructed on the fly from small amounts of real-world data. To enable this real-to-sim-to-real pipeline, RialTo proposes an easy-to-use interface for quickly scanning and constructing digital twins of real-world environments. We also introduce a novel "inverse distillation" procedure for bringing real-world demonstrations into simulated environments for efficient fine-tuning, with minimal human intervention and engineering required. We evaluate RialTo across a variety of robotic manipulation problems in the real world, such as robustly stacking dishes on a rack, placing books on a shelf, and six other tasks. RialTo increases policy robustness by over 67% without requiring extensive human data collection. Project website and videos at https://real-to-sim-to-real.github.io/RialTo/

Authors (7)
  1. Marcel Torne
  2. Anthony Simeonov
  3. Zechu Li
  4. April Chan
  5. Tao Chen
  6. Abhishek Gupta
  7. Pulkit Agrawal

Summary

Enhancing Robustness in Real-World Robotic Manipulation with Simulation-Based Reinforcement Learning: Introducing RialTo

Overview

Within the field of robot learning, a central challenge is achieving the flexibility and robustness that allow robots to handle the many variations and disturbances of real-world environments. To this end, we examine the synthesis of robust robotic manipulation policies through simulation. At the core of this investigation is RialTo, a system that combines real-world demonstration data with simulation-based reinforcement learning (RL) to produce policies with notable robustness, bridging the gap between the predictable, structured world of simulation and the variability of real-world interaction.

Real-to-Sim-to-Real Pipeline

The principal innovation of this work is a multi-stage pipeline for improving the real-world efficacy of robotic manipulation policies, sketched in the toy example below. The process begins with the rapid construction of digital twins of real-world environments from small amounts of real-world data. RialTo then fine-tunes imitation learning policies at scale with RL inside these simulated environments, substantially improving their robustness. A further cornerstone is the inverse distillation procedure, which transfers real-world demonstrations into the simulated environments and thereby seeds the RL fine-tuning phase.
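The following is a minimal, runnable sketch of this real-to-sim-to-real loop on a toy 1-D reaching task. Every name and number is an illustrative placeholder, not RialTo's actual API: the real system builds physics-based twins in a full simulator and runs large-scale RL, for which the hill-climbing step here merely stands in.

```python
import random

GOAL = 1.0  # target position of the toy 'manipulation' task

def build_digital_twin(scan_noise=0.05):
    """Stage 1 (real-to-sim): here a 'twin' is just a randomized start state."""
    return random.uniform(-scan_noise, scan_noise)

def rollout(gain, start):
    """Run a proportional policy u = gain * (GOAL - x) for 20 steps."""
    x = start
    for _ in range(20):
        x += gain * (GOAL - x) * 0.1
    return -abs(GOAL - x)  # reward: negative final distance to the goal

gain = 0.5  # Stage 2: a weak policy, standing in for one seeded from demos

# Stage 3 (RL in sim): fine-tune under many randomized twins.
for _ in range(200):
    candidate = gain + random.gauss(0.0, 0.05)
    starts = [build_digital_twin() for _ in range(8)]
    if sum(rollout(candidate, s) for s in starts) > \
       sum(rollout(gain, s) for s in starts):
        gain = candidate

# Stage 4 (sim-to-real): deploy the fine-tuned policy on the 'real' scene.
print(f"fine-tuned gain: {gain:.2f}, real-world reward: {rollout(gain, 0.0):.4f}")
```

The structural point the sketch preserves is that the fine-tuning stage optimizes over many randomized copies of the twin, which is what buys robustness when the policy returns to the single real scene.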

Methodological Distinctions

Distinctly, RialTo introduces a graphical user interface (GUI) that streamlines converting real-world scans into manipulable digital twins, lowering the barrier to constructing simulation environments. In addition, the "inverse distillation" procedure brings real-world demonstrations into simulation with minimal human intervention (a toy version appears below). Together, these components feed the RL refinement stage, which yields policies with markedly improved adaptability and resilience.
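As a hedged illustration of the inverse-distillation idea, the toy below replays a real demonstration's actions inside a stand-in digital twin and keeps the trajectory, now expressed in the simulator's state space, only if it still completes the task. The `TwinSim` class and all parameters are hypothetical, not RialTo's implementation.

```python
class TwinSim:
    """Stand-in digital twin: a 1-D point mass driven by velocity commands."""
    def __init__(self, start=0.0, dt=0.1):
        self.x, self.dt = start, dt

    def step(self, action):
        self.x += action * self.dt
        return self.x  # the simulator's privileged state

def inverse_distill(real_actions, start_state, goal, tol=0.05):
    """Replay a real demo's actions in the twin; keep the resulting
    sim-state trajectory only if it still solves the task."""
    sim = TwinSim(start=start_state)
    sim_demo = []
    for a in real_actions:
        sim_demo.append((sim.x, a))  # record (sim state, real action)
        sim.step(a)
    success = abs(sim.x - goal) < tol
    return sim_demo if success else None

# One real demo: ten constant velocity commands toward a goal at x = 1.0.
demo = inverse_distill([1.0] * 10, start_state=0.0, goal=1.0)
print(f"kept demo with {len(demo)} transitions" if demo else "demo rejected")
```

The payoff of this step is that demonstrations collected once in the real world become simulator-native trajectories, so RL fine-tuning can start from behavior that already solves the task rather than from scratch.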

Experimental Insights

RialTo's efficacy is validated empirically across a range of robotic manipulation tasks. Policies fine-tuned with RialTo improve success rates by over 67% relative to the baseline imitation policies, and they cope substantially better with scene perturbations, physical disturbances, and visual distractors, underscoring the practicality and scalability of the method.

Theoretical and Practical Implications

RialTo's approach carries both theoretical and practical implications. Theoretically, it opens further exploration of hybridizing imitation learning and RL for robotic manipulation. Practically, it points toward inexpensive, rapid policy development cycles by reducing the need for extensive real-world data collection or intricate simulation engineering.

Future Horizons in AI and Robotics

Looking ahead, RialTo sets the stage for tighter integration of simulation-based learning into real-world robotic applications. Its success motivates future work on improving the fidelity of digital twin simulations and on new paradigms for policy transfer and fine-tuning. RialTo marks a significant step in robot learning and illuminates a path toward versatile, dynamically adaptive robotic systems.

In conclusion, RialTo embodies a meaningful stride toward reconciling the disparate worlds of simulation and reality within the domain of robotic manipulation. Through its innovative methodologies and promising experimental outcomes, RialTo offers a glimpse into the future of robust, real-world robotics, where adaptability and resilience are paramount.