Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation (2404.01867v1)
Abstract: Efficiently tackling multiple tasks within complex environments, such as those found in robot manipulation, remains an ongoing challenge in robotics and an opportunity for data-driven solutions such as reinforcement learning (RL). Model-based RL, by building a dynamics model of the robot, enables data reuse and transfer learning between tasks with the same robot and similar environments. Furthermore, data gathering in robotics is expensive, so we must rely on data-efficient approaches such as model-based RL, where policy learning is mostly conducted on cheaper simulations based on the learned model. The quality of this model is therefore fundamental to the performance on subsequent tasks. In this work, we focus on improving the quality of the model while maintaining data efficiency by performing active learning of the dynamics model during a preliminary exploration phase that maximizes information gathering. We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and the information encoded in the dynamics model during exploration. Our strategies actively estimate the novelty of each transition and use it as the exploration reward. We compare several Bayesian inference methods for neural networks, some of which have never been used in a robotics context, and evaluate them in a realistic robot manipulation setup. Our experiments show the advantages of our Bayesian model-based RL approach: it achieves results of similar quality to relevant alternatives while requiring far fewer robot execution steps. Unlike related previous studies, whose validation focused solely on toy problems, our work takes a step towards more realistic setups, tackling robotic arm end-tasks.
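The core idea (a Bayesian dynamics model whose predictive uncertainty supplies a novelty reward for exploration) can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: it uses a deep ensemble, one of the Bayesian inference methods compared in the paper, and treats the variance of the ensemble's next-state predictions as the exploration reward. State/action dimensions, network sizes, and class names are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above): ensemble dynamics model with
# disagreement-based exploration reward.
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts the next state as current state plus a learned delta."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return state + self.net(torch.cat([state, action], dim=-1))

class EnsembleDynamics:
    """Deep ensemble of dynamics models; disagreement acts as intrinsic reward."""
    def __init__(self, state_dim: int, action_dim: int, n_members: int = 5, lr: float = 1e-3):
        self.members = [DynamicsModel(state_dim, action_dim) for _ in range(n_members)]
        self.optims = [torch.optim.Adam(m.parameters(), lr=lr) for m in self.members]

    def exploration_reward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Novelty proxy: variance of the ensemble's next-state predictions,
        # averaged over state dimensions (one scalar per transition).
        with torch.no_grad():
            preds = torch.stack([m(state, action) for m in self.members])  # (M, B, S)
        return preds.var(dim=0).mean(dim=-1)  # (B,)

    def train_step(self, state, action, next_state):
        # Each member is fitted to the observed transitions independently
        # (per-member bootstrapping/noise injection omitted for brevity).
        losses = []
        for model, opt in zip(self.members, self.optims):
            loss = nn.functional.mse_loss(model(state, action), next_state)
            opt.zero_grad()
            loss.backward()
            opt.step()
            losses.append(loss.item())
        return sum(losses) / len(losses)

if __name__ == "__main__":
    ens = EnsembleDynamics(state_dim=7, action_dim=7)  # e.g., a 7-DoF arm
    s, a, s_next = torch.randn(32, 7), torch.randn(32, 7), torch.randn(32, 7)
    print("model loss:", ens.train_step(s, a, s_next))
    print("exploration reward:", ens.exploration_reward(s, a)[:5])
```

During the exploration phase, transitions with high ensemble disagreement would be favored by the exploration policy; other inference schemes discussed in the paper (e.g., MC dropout or Laplace approximations) could replace the ensemble as the source of uncertainty.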
- A survey on deep reinforcement learning algorithms for robotic manipulation. Sensors, 23(7):3762, 2023.
- Robot learning towards smart robotic manufacturing: A review. Robotics and Computer-Integrated Manufacturing, 77:102360, 2022.
- Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In ICRA, pages 3389–3396. IEEE, 2017.
- Deep dynamics models for learning dexterous manipulation. In CoRL, pages 1101–1112. PMLR, 2020.
- How to train your robot with deep reinforcement learning: lessons we have learned. IJRR, 40(4-5):698–721, 2021.
- Transfer learning in deep reinforcement learning: A survey. TPAMI, 2023.
- Sharing knowledge in multi-task deep reinforcement learning. arXiv preprint arXiv:2401.09561, 2024.
- A survey on model-based reinforcement learning. Science China Information Sciences, 67(2):121101, 2024.
- Safe trajectory sampling in model-based reinforcement learning. In CASE, 2023.
- Robust policy search for robot navigation. RA-L, 6(2):2389–2396, 2021.
- PILCO: A model-based and data-efficient approach to policy search. In ICML, 2011.
- When to trust your model: Model-based policy optimization. NeurIPS, 32, 2019.
- Deep reinforcement learning in a handful of trials using probabilistic dynamics models. NeurIPS, 31, 2018.
- Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning. In IEEE ICRA, 2018.
- MOReL: Model-based offline reinforcement learning. NeurIPS, 33:21810–21823, 2020.
- World models. In NeurIPS, 2018.
- Model-based reinforcement learning via imagination with derived memory. NeurIPS, 34:9493–9505, 2021.
- Simple and scalable predictive uncertainty estimation using deep ensembles. NeurIPS, 30, 2017.
- Evaluating scalable Bayesian deep learning methods for robust computer vision. In CVPR Workshops, pages 318–319, 2020.
- Can you trust your model’s uncertainty? evaluating predictive uncertainty under dataset shift. NeurIPS, 32, 2019.
- Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In ICML, pages 1050–1059. PMLR, 2016.
- David JC MacKay. Bayesian interpolation. Neural computation, 4(3):415–447, 1992.
- Deep exploration via bootstrapped DQN. NeurIPS, 29, 2016.
- VIME: Variational information maximizing exploration. NeurIPS, 29, 2016.
- Model-based active exploration. In ICML, pages 5779–5788. PMLR, 2019.
- Active learning for autonomous intelligent agents: Exploration, curiosity, and interaction. arXiv preprint arXiv:1403.1497, 2014.
- Active learning in robotics: A review of control principles. Mechatronics, 77:102576, 2021.
- An experimental design perspective on model-based reinforcement learning. In ICLR, 2021.
- Unifying count-based exploration and intrinsic motivation. NeurIPS, 29, 2016.
- Curiosity-driven exploration by self-supervised prediction. In ICML, pages 2778–2787. PMLR, 2017.
- Jürgen Schmidhuber. Curious model-building control systems. In IJCNN, pages 1458–1463, 1991.
- Exploration in model-based reinforcement learning by empirically estimating learning progress. NeurIPS, 25, 2012.
- Diversity is all you need: Learning skills without a reward function. arXiv preprint arXiv:1802.06070, 2018.
- Abandoning objectives: Evolution through the search for novelty alone. Evolutionary computation, 19(2):189–223, 2011.
- Self-supervised exploration via disagreement. In ICML, pages 5062–5071, 2019.
- Laplace redux - effortless Bayesian deep learning. NeurIPS, 34:20089–20103, 2021. https://github.com/AlexImmer/Laplace.
- Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In ICML, pages 1861–1870, 2018.
- OpenAI Gym. arXiv preprint arXiv:1606.01540, 2016.
- RLBench: The robot learning benchmark & learning environment. RA-L, 5(2):3019–3026, 2020.
- MuJoCo: A physics engine for model-based control. In IEEE/RSJ IROS, 2012.
- V-REP: A versatile and scalable robot simulation framework. In IEEE/RSJ IROS, 2013.
- Carlos Plou
- Ana C. Murillo
- Ruben Martinez-Cantin