So You Think You Can Scale Up Autonomous Robot Data Collection? (2411.01813v1)
Abstract: A long-standing goal in robot learning is to develop methods for robots to acquire new skills autonomously. While reinforcement learning (RL) comes with the promise of enabling autonomous data collection, it remains challenging to scale in the real world, partly due to the significant effort required for environment design and instrumentation, including the need to design reset functions and accurate success detectors. Imitation learning (IL) methods, on the other hand, require little to no environment design effort, but instead demand significant human supervision in the form of collected demonstrations. To address these shortcomings, recent works in autonomous IL start from an initial seed dataset of human demonstrations that an autonomous policy can bootstrap from. While autonomous IL approaches promise to address the challenges of both autonomous RL and pure IL, in this work we posit that such techniques do not deliver on this promise and remain unable to scale up autonomous data collection in the real world. Through a series of real-world experiments, we demonstrate that these approaches, when scaled up to realistic settings, face many of the same environment-design challenges as prior attempts in RL. Further, we perform a rigorous study of autonomous IL methods across different data scales and seven simulation and real-world tasks, and demonstrate that while autonomous data collection can modestly improve performance, simply collecting more human data often yields significantly larger improvements. Our work suggests a negative result: scaling up autonomous data collection for learning robot policies on real-world tasks is more challenging, and less practical, than prior work suggests. We hope these insights into the core challenges of scaling up data collection help inform future efforts in autonomous learning.
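The autonomous IL paradigm the abstract critiques can be summarized as a simple self-improvement loop. Below is a minimal sketch of that loop, not the paper's actual implementation: all names (`train`, `collect_rollout`, `is_success`) are hypothetical placeholders for a behavior-cloning trainer, an environment rollout, and an instrumented success detector.

```python
# Minimal sketch of an autonomous imitation-learning loop: bootstrap a policy
# from seed human demonstrations, let it collect rollouts autonomously, keep
# the rollouts a success detector accepts, and retrain on the grown dataset.
# All helper names here are hypothetical, not APIs from the paper.

from typing import Callable, List

Trajectory = List[dict]  # e.g., [{"obs": ..., "action": ...}, ...]

def autonomous_il(
    seed_demos: List[Trajectory],                        # initial human demos
    train: Callable[[List[Trajectory]], Callable],       # BC-style training
    collect_rollout: Callable[[Callable], Trajectory],   # one policy rollout
    is_success: Callable[[Trajectory], bool],            # success detector
    rounds: int = 5,
    rollouts_per_round: int = 100,
) -> Callable:
    dataset = list(seed_demos)
    policy = train(dataset)                   # bootstrap from human data
    for _ in range(rounds):
        # Autonomous data collection: the robot gathers its own experience.
        new_trajs = [collect_rollout(policy) for _ in range(rollouts_per_round)]
        # Self-imitation: keep only rollouts judged successful.
        dataset += [t for t in new_trajs if is_success(t)]
        policy = train(dataset)               # retrain on the grown dataset
    return policy
```

Note that even this idealized loop presupposes a reliable `is_success` and automatic resets between rollouts; instrumenting exactly these components at realistic scale is the environment-design burden the paper argues autonomous IL inherits from RL.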
Authors: Suvir Mirchandani, Suneel Belkhale, Joey Hejna, Evelyn Choi, Md Sazzad Islam, Dorsa Sadigh