Sharing Knowledge in Multi-Task Deep Reinforcement Learning (2401.09561v1)
Abstract: We study the benefit of sharing representations among tasks to enable the effective use of deep neural networks in Multi-Task Reinforcement Learning. We leverage the assumption that learning from different tasks that share common properties helps to generalize knowledge across them, resulting in more effective feature extraction than learning a single task alone. Intuitively, the resulting set of features offers performance benefits when used by Reinforcement Learning algorithms. We prove this by providing theoretical guarantees that highlight the conditions under which it is convenient to share representations among tasks, extending the well-known finite-time bounds of Approximate Value-Iteration to the multi-task setting. In addition, we complement our analysis by proposing multi-task extensions of three Reinforcement Learning algorithms, which we empirically evaluate on widely used Reinforcement Learning benchmarks, showing significant improvements over their single-task counterparts in terms of sample efficiency and performance.
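The shared-representation idea described above can be sketched as a single feature extractor feeding several task-specific output heads. The sketch below is a minimal illustration of that architecture, not the paper's implementation: the layer sizes, number of tasks, and use of plain NumPy for an untrained forward pass are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, FEATURE_DIM, N_ACTIONS, N_TASKS = 8, 16, 4, 3

# Shared parameters: in multi-task training these would be updated
# from the transitions of all tasks, so features generalize across them.
W_shared = rng.normal(scale=0.1, size=(STATE_DIM, FEATURE_DIM))

# Task-specific heads: one Q-value output layer per task.
W_heads = [rng.normal(scale=0.1, size=(FEATURE_DIM, N_ACTIONS))
           for _ in range(N_TASKS)]

def q_values(state, task_id):
    """Q-value estimates for `state` under task `task_id`."""
    features = np.maximum(state @ W_shared, 0.0)  # shared ReLU features
    return features @ W_heads[task_id]            # task-specific head

state = rng.normal(size=STATE_DIM)
qs = [q_values(state, t) for t in range(N_TASKS)]
```

Each task reads a different head but the same features, which is what lets gradients from every task shape the common representation.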