Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning (2403.05066v2)

Published 8 Mar 2024 in cs.LG and cs.AI

Abstract: We argue that the negative transfer problem, which occurs when a new task to learn arrives, is an important issue that must not be overlooked when developing effective Continual Reinforcement Learning (CRL) algorithms. Through comprehensive experimental validation, we demonstrate that this issue frequently arises in CRL and cannot be effectively addressed by several recent works on mitigating the plasticity loss of RL agents. To that end, we develop Reset & Distill (R&D), a simple yet highly effective method for overcoming the negative transfer problem in CRL. R&D combines a strategy of resetting the agent's online actor and critic networks to learn a new task with an offline learning step that distills knowledge from the online actor and the previous experts' action probabilities. We carried out extensive experiments on long sequences of Meta-World tasks and show that our method consistently outperforms recent baselines, achieving significantly higher success rates across a range of tasks. Our findings highlight the importance of considering negative transfer in CRL and emphasize the need for robust strategies like R&D to mitigate its detrimental effects.
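The abstract describes a two-phase recipe: (1) reset the online actor and critic to freshly initialized networks before each new task, so stale parameters cannot drag down online learning, and (2) consolidate the new expert into a single continual policy by offline distillation against the action probabilities of the current and previous experts. The minimal Python sketch below illustrates that loop under stated assumptions; it is not the authors' implementation. Helper names such as make_actor, make_critic, train_online_rl, and collect_states are hypothetical placeholders, and a discrete action space is assumed for brevity (Meta-World itself is continuous control).

```python
import torch
import torch.nn.functional as F

def reset_and_distill(tasks, make_actor, make_critic,
                      train_online_rl, collect_states,
                      distill_epochs=100, lr=3e-4):
    """Hypothetical sketch of the Reset & Distill (R&D) loop from the abstract."""
    continual_actor = make_actor()   # single policy retained across all tasks
    expert_buffers = []              # per-task (states, expert action logits)

    for task in tasks:
        # --- Reset phase: fresh online networks sidestep negative transfer ---
        online_actor, online_critic = make_actor(), make_critic()
        train_online_rl(online_actor, online_critic, task)  # e.g., SAC or PPO

        # Cache states and the new expert's action distribution for offline replay.
        states = collect_states(online_actor, task)
        with torch.no_grad():
            expert_logits = online_actor(states)  # assumed to return action logits
        expert_buffers.append((states, expert_logits))

        # --- Distill phase: offline KL matching to current and past experts ---
        opt = torch.optim.Adam(continual_actor.parameters(), lr=lr)
        for _ in range(distill_epochs):
            for s, teacher_logits in expert_buffers:
                student_log_probs = F.log_softmax(continual_actor(s), dim=-1)
                teacher_probs = F.softmax(teacher_logits, dim=-1)
                loss = F.kl_div(student_log_probs, teacher_probs,
                                reduction="batchmean")
                opt.zero_grad()
                loss.backward()
                opt.step()

    return continual_actor
```

Replaying every past expert's buffer during distillation is one way to read the abstract's claim that forgetting is handled offline; the actual buffer sizes, losses, and RL algorithm choices are specified only in the paper itself.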
