Less is more -- the Dispatcher/ Executor principle for multi-task Reinforcement Learning (2312.09120v1)
Abstract: Humans instinctively know how to neglect details when it comes to solving complex decision-making problems in environments with unforeseeable variations. This abstraction process seems to be a vital property of most biological systems: it helps to 'abstract away' unnecessary details and boosts generalisation. In this work we introduce the dispatcher/ executor principle for the design of multi-task Reinforcement Learning controllers. It suggests partitioning the controller into two entities, one that understands the task (the dispatcher) and one that computes the controls for the specific device (the executor), and connecting these two by a strongly regularizing communication channel. The core rationale behind this position paper is that changes in structure and design principles can improve generalisation properties and drastically increase data efficiency. It is in some sense a 'yes, and ...' response to the current trend of training large neural networks on vast amounts of data and betting on emergent generalisation properties. While we agree on the power of scaling, in the sense of Sutton's 'bitter lesson', we give some evidence that considering structure and adding design principles can be a valuable and critical component, in particular when data is not abundant and infinite but a precious resource.
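The partition described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: both components are hand-coded rather than learned, and the names `Command`, `dispatcher`, `executor`, the `"verb object"` task format, and the proportional gain are all hypothetical choices made for illustration. The only point shown is that the executor receives a narrow, fixed-format message and never sees the task description itself.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Command:
    """Hypothetical narrow message sent from dispatcher to executor."""
    skill: str
    target: Point


def dispatcher(task: str, scene: Dict[str, Point]) -> Command:
    """Interprets the task and emits only what the executor needs.

    Assumes (for illustration) that a task is written as "<verb> <object name>".
    """
    verb, obj = task.split(maxsplit=1)
    return Command(skill=verb, target=scene[obj])


def executor(cmd: Command, position: Point, gain: float = 0.5) -> Point:
    """Computes device controls from the command alone.

    A simple proportional step toward the target stands in for a learned
    low-level policy; the task text and object names are not visible here.
    """
    dx = cmd.target[0] - position[0]
    dy = cmd.target[1] - position[1]
    return (gain * dx, gain * dy)


# Toy scene: object names mapped to 2D positions (made-up values).
scene = {"red block": (1.0, 2.0), "blue block": (-1.0, 0.0)}

cmd = dispatcher("reach red block", scene)
action = executor(cmd, position=(0.0, 0.0))
```

Because the executor depends only on `Command`, changing the object named in the task changes the message but not the executor's code, which is the kind of abstraction the channel is meant to enforce.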
- Maximum a posteriori policy optimisation. In International Conference on Learning Representations, 2018.
- Do as I can, not as I say: Grounding language in robotic affordances, 2022.
- Robocat: A self-improving foundation agent for robotic manipulation. arXiv preprint arXiv:2306.11706, 2023.
- Rt-1: Robotics transformer for real-world control at scale. arXiv preprint arXiv:2212.06817, 2022.
- Magnetic control of tokamak plasmas through deep reinforcement learning. Nat., 602(7897):414–419, 2022. doi: 10.1038/S41586-021-04301-9. URL https://doi.org/10.1038/s41586-021-04301-9.
- Goal-conditioned end-to-end visuomotor control for versatile skill primitives. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 1319–1325. IEEE, 2021.
- Deep hierarchical planning from pixels, 2022.
- Reinforcement learning in feedback control - challenges and benchmarks from technical process control. Mach. Learn., 84(1-2):137–169, 2011. doi: 10.1007/S10994-011-5235-X. URL https://doi.org/10.1007/s10994-011-5235-x.
- Learning an embedding space for transferable robot skills. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rk07ZXZRb.
- Imagenet classification with deep convolutional neural networks. Commun. ACM, 60(6):84–90, 2017. doi: 10.1145/3065386. URL https://doi.org/10.1145/3065386.
- Mastering stacking of diverse shapes with large-scale iterative reinforcement learning on real robots. arXiv preprint arXiv:2312.abcde, 2023.
- Batch reinforcement learning. In Reinforcement learning: State-of-the-art, pp. 45–73. Springer, 2012.
- Beyond pick-and-place: Tackling robotic stacking of diverse shapes. arXiv preprint arXiv:2110.06192, 2021.
- End-to-end training of deep visuomotor policies, 2016.
- Data-efficient hierarchical reinforcement learning, 2018.
- OpenAI. GPT-4 technical report, 2023.
- Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.
- Equivariant data augmentation for generalization in offline reinforcement learning. arXiv preprint arXiv:2309.07578, 2023.
- A generalist agent. Transactions on Machine Learning Research, 2022.
- Learning by playing - solving sparse reward tasks from scratch. In International Conference on Machine Learning, pp. 4344–4353. PMLR, 2018.
- Collect & infer - a fresh look at data-efficient reinforcement learning. In Aleksandra Faust, David Hsu, and Gerhard Neumann (eds.), Conference on Robot Learning, 8-11 November 2021, London, UK, volume 164 of Proceedings of Machine Learning Research, pp. 1736–1744. PMLR, 2021. URL https://proceedings.mlr.press/v164/riedmiller22a.html.
- Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484, 2016.
- Richard Sutton. The bitter lesson. Blog Post, 2019. URL http://www.incompleteideas.net/IncIdeas/BitterLesson.html.
- Attention is all you need. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pp. 5998–6008, 2017. URL https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
- Skills: Adaptive skill sequencing for efficient temporally-extended exploration. CoRR, abs/2211.13743, 2022. doi: 10.48550/ARXIV.2211.13743. URL https://doi.org/10.48550/arXiv.2211.13743.
- Scaling robot learning with semantically imagined experience, 2023.
- Hierarchical task learning from language instructions with unified transformers and self-monitoring. arXiv preprint arXiv:2106.03427, 2021.
- Rt-2: Vision-language-action models transfer web knowledge to robotic control. In 7th Annual Conference on Robot Learning, 2023.