MILES: Making Imitation Learning Easy with Self-Supervision

Published 25 Oct 2024 in cs.RO, cs.AI, and cs.LG | arXiv:2410.19693v1

Abstract: Data collection in imitation learning often requires significant, laborious human supervision, such as numerous demonstrations, and/or frequent environment resets for methods that incorporate reinforcement learning. In this work, we propose an alternative approach, MILES: a fully autonomous, self-supervised data collection paradigm, and we show that this enables efficient policy learning from just a single demonstration and a single environment reset. MILES autonomously learns a policy for returning to and then following the single demonstration, whilst being self-guided during data collection, eliminating the need for additional human interventions. We evaluated MILES across several real-world tasks, including tasks that require precise contact-rich manipulation such as locking a lock with a key. We found that, under the constraints of a single demonstration and no repeated environment resetting, MILES significantly outperforms state-of-the-art alternatives like imitation learning methods that leverage reinforcement learning. Videos of our experiments and code can be found on our webpage: www.robot-learning.uk/miles.
