TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction (2405.10315v3)
Abstract: Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort. Videos and code are available at https://transic-robot.github.io/
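The core mechanism the abstract describes is a residual policy, learned from human online corrections, that augments a base policy trained in simulation at deployment time. Below is a minimal Python sketch of that integration; the class names, the 7-DoF action dimensionality, and the additive combination with a scaling factor are illustrative assumptions for clarity, not the paper's actual implementation (see the linked code for that).

```python
# Minimal sketch: combining a simulation-trained base policy with a residual
# policy learned from human corrections. All names and shapes are assumptions.
import numpy as np


class BasePolicy:
    """Stand-in for a policy trained in simulation (e.g., with RL)."""

    def act(self, obs: np.ndarray) -> np.ndarray:
        # Placeholder: a real policy would run a learned network on `obs`.
        return np.zeros(7)  # e.g., a 7-DoF arm command


class ResidualPolicy:
    """Stand-in for a policy trained on human online-correction data."""

    def act(self, obs: np.ndarray, base_action: np.ndarray) -> np.ndarray:
        # Placeholder: predicts a small corrective action conditioned on the
        # observation and the base policy's proposed action.
        return np.zeros_like(base_action)


def integrated_action(obs: np.ndarray,
                      base: BasePolicy,
                      residual: ResidualPolicy,
                      residual_scale: float = 1.0) -> np.ndarray:
    """Combine the simulation policy with the learned correction.

    The residual is added to the base action, mirroring the idea that human
    corrections refine, rather than replace, the simulation policy during
    autonomous execution.
    """
    a_base = base.act(obs)
    a_res = residual.act(obs, a_base)
    return a_base + residual_scale * a_res
```

A deployment loop would simply call `integrated_action` at each control step, so the base policy handles the bulk of the task while the residual closes the remaining sim-to-real gaps.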
Authors: Yunfan Jiang, Chen Wang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei