One ACT Play: Single Demonstration Behavior Cloning with Action Chunking Transformers (2309.10175v1)

Published 18 Sep 2023 in cs.RO, cs.AI, and cs.LG

Abstract: Learning from human demonstrations (behavior cloning) is a cornerstone of robot learning. However, most behavior cloning algorithms require a large number of demonstrations to learn a task, especially for general tasks with a wide variety of initial conditions. Humans, by contrast, can learn to complete tasks, even complex ones, after seeing only one or two demonstrations. Our work seeks to emulate this ability, using behavior cloning to learn a task given only a single human demonstration. We achieve this by using linear transforms to augment the single demonstration, generating a set of trajectories covering a wide range of initial conditions. With these demonstrations, we train a behavior cloning agent that successfully completes three block manipulation tasks. Additionally, we develop a novel extension to the temporal ensembling method used by action chunking agents during inference. By incorporating the standard deviation of the action predictions into the ensembling method, our approach is more robust to unforeseen changes in the environment, yielding significant performance improvements.
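The two techniques named in the abstract can be made concrete with short sketches (ours, not the authors' released code; Python with NumPy assumed throughout).

First, the augmentation idea. This sketch assumes demonstrations store absolute end-effector poses with xyz position in the first three columns and absolute position-target actions, and it applies a pure translation, the simplest of the linear transforms the abstract describes; the name `translate_demo` and the layout of `states`/`actions` are our assumptions:

```python
import numpy as np

def translate_demo(states: np.ndarray, actions: np.ndarray,
                   new_object_pos: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Shift one recorded demonstration so it starts at a new object position.

    states, actions: (T, d) arrays with xyz in columns 0..2 (assumed layout).
    new_object_pos: (3,) xyz position of the object in the new episode.
    """
    offset = new_object_pos - states[0, :3]   # displacement of the initial condition
    new_states, new_actions = states.copy(), actions.copy()
    new_states[:, :3] += offset               # translate the recorded poses
    new_actions[:, :3] += offset              # translate the position targets
    return new_states, new_actions
```

Sampling many `new_object_pos` values yields the set of trajectories covering a wide range of initial conditions that the abstract refers to.

Second, the ensembling change. ACT's published temporal ensembling averages every chunk's prediction for the current timestep with exponential weights w_i = exp(-m * i), where i = 0 indexes the oldest prediction. The sketch below keeps that weighting and folds in an illustrative standard-deviation term that down-weights predictions far from the ensemble mean; the exact form of the std term is our assumption, not the paper's published rule:

```python
def ensemble_step(preds: np.ndarray, m: float = 0.01) -> np.ndarray:
    """Fuse all stored chunk predictions for the current timestep.

    preds: (n, action_dim) array; row 0 is the oldest prediction.
    m: decay rate of ACT's exponential weighting w_i = exp(-m * i).
    """
    n, _ = preds.shape
    w = np.exp(-m * np.arange(n))             # ACT recency weights, oldest first
    mu = preds.mean(axis=0)
    sigma = preds.std(axis=0) + 1e-8          # guard against zero spread
    z = np.abs(preds - mu) / sigma            # per-dimension deviation in std units
    w = w[:, None] * np.exp(-z)               # assumed rule: outliers count less
    return (w * preds).sum(axis=0) / w.sum(axis=0)
```

If the environment changes mid-rollout, stale predictions drift away from newer ones; a deviation-aware weight lets the ensemble track the change faster than pure exponential averaging, which matches the robustness claim in the abstract.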

Authors (2)
  1. Abraham George (13 papers)
  2. Amir Barati Farimani (121 papers)
Citations (9)
