
Task and Domain Adaptive Reinforcement Learning for Robot Control (2404.18713v3)

Published 29 Apr 2024 in cs.RO, cs.AI, cs.SY, and eess.SY

Abstract: Deep reinforcement learning (DRL) has shown remarkable success in simulation domains, yet its application in designing robot controllers remains limited due to its single-task orientation and insufficient adaptability to environmental changes. To overcome these limitations, we present a novel adaptive agent that leverages transfer learning techniques to dynamically adapt its policy in response to different tasks and environmental conditions. The approach is validated on the blimp control challenge, where multitasking capabilities and environmental adaptability are essential. The agent is trained using a custom, highly parallelized simulator built on IsaacGym. We then perform zero-shot transfer to the real world, flying the blimp to solve various tasks. We share our code at https://github.com/robot-perception-group/adaptive_agent.
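To make the idea of a policy that adapts across tasks and environmental conditions concrete, the sketch below shows a hypothetical task-conditioned policy in PyTorch: one network that receives a learned task embedding and an environment-context vector alongside the observation, evaluated over a large batch of parallel simulator instances in the style of IsaacGym training. All names, dimensions, and the architecture here are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

class TaskConditionedPolicy(nn.Module):
    """Hypothetical sketch of a multi-task, condition-adaptive policy.

    The network conditions on (a) a learned per-task embedding selected by
    task_id and (b) an environment-context vector (e.g. an estimate of wind
    or dynamics parameters), so one set of weights can serve several tasks.
    """

    def __init__(self, obs_dim: int, act_dim: int, num_tasks: int,
                 ctx_dim: int = 8, hidden: int = 256):
        super().__init__()
        # One learned embedding per task; switched at run time via task_id.
        self.task_embed = nn.Embedding(num_tasks, ctx_dim)
        # Policy head consumes observation + task embedding + context.
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 2 * ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # bounded actuator commands
        )

    def forward(self, obs: torch.Tensor, task_id: torch.Tensor,
                env_ctx: torch.Tensor) -> torch.Tensor:
        z = self.task_embed(task_id)              # task conditioning
        x = torch.cat([obs, z, env_ctx], dim=-1)  # join obs, task, context
        return self.net(x)

# Usage over a batch of 4096 parallel simulator instances (IsaacGym-style);
# the dimensions below are placeholders, not the blimp's real state/action sizes.
policy = TaskConditionedPolicy(obs_dim=24, act_dim=4, num_tasks=5)
obs = torch.randn(4096, 24)
task = torch.randint(0, 5, (4096,))
ctx = torch.randn(4096, 8)                        # e.g. wind/dynamics estimate
actions = policy(obs, task, ctx)                  # shape: (4096, 4)
```

Under these assumptions, switching tasks at deployment only means changing task_id, and adaptation to conditions such as wind enters through env_ctx; the actual training loop, rewards, and adaptation mechanism are in the linked repository.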

Authors (3)
  1. Yu Tang Liu
  2. Nilaksh Singh
  3. Aamir Ahmad