
Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing (2312.14472v2)

Published 22 Dec 2023 in cs.AI

Abstract: Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy. To enhance data efficiency by sharing parameters across multiple tasks, a common practice segments the network into distinct modules and trains a routing network to recombine these modules into task-specific policies. However, existing routing approaches employ a fixed number of modules for all tasks, neglecting that tasks with varying difficulties commonly require varying amounts of knowledge. This work presents a Dynamic Depth Routing (D2R) framework, which learns strategic skipping of certain intermediate modules, thereby flexibly choosing different numbers of modules for each task. Under this framework, we further introduce a ResRouting method to address the issue of disparate routing paths between behavior and target policies during off-policy training. In addition, we design an automatic route-balancing mechanism to encourage continued routing exploration for unmastered tasks without disturbing the routing of mastered ones. We conduct extensive experiments on various robotics manipulation tasks in the Meta-World benchmark, where D2R achieves state-of-the-art performance with significantly improved learning efficiency.
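To make the routing idea concrete, below is a minimal sketch of dynamic depth routing, not the authors' implementation: a router conditioned on a task embedding emits one soft gate per shared module, and residual-style gating lets a gate near zero skip that module entirely, so different tasks traverse different effective depths. All names (DepthRoutedPolicy, num_modules, the sigmoid gating) are illustrative assumptions; the paper's actual router, its ResRouting correction for off-policy training, and its route-balancing mechanism are not reproduced here.

```python
# Illustrative sketch of dynamic depth routing (NOT the official D2R code).
# A task-conditioned router gates each shared module; a gate near 0 skips
# the module via a residual bypass, so easy tasks can use fewer modules.
import torch
import torch.nn as nn


class DepthRoutedPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, task_dim, hidden=256, num_modules=4):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        # Shared intermediate modules that the router may skip.
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
             for _ in range(num_modules)]
        )
        # Router: task embedding -> one gate per module (soft stand-in for
        # the paper's discrete skip decisions).
        self.router = nn.Linear(task_dim, num_modules)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, task_emb):
        gates = torch.sigmoid(self.router(task_emb))   # (batch, num_modules)
        h = torch.relu(self.encoder(obs))
        for i, block in enumerate(self.blocks):
            g = gates[:, i:i + 1]
            # Residual gating: g -> 0 bypasses the block, g -> 1 applies it,
            # so per-task depth is chosen by the router.
            h = g * block(h) + (1.0 - g) * h
        return self.head(h)


# Usage: two tasks with one-hot task embeddings (dims chosen arbitrarily).
policy = DepthRoutedPolicy(obs_dim=39, act_dim=4, task_dim=10)
obs = torch.randn(2, 39)
task_emb = torch.eye(10)[:2]
actions = policy(obs, task_emb)   # shape (2, 4)
```

The residual bypass is what makes skipping well-defined: a skipped module leaves the hidden state untouched rather than zeroing it, so the network degrades gracefully to a shallower policy for tasks whose gates stay near zero.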
