Data-Incremental Continual Offline Reinforcement Learning (2404.12639v3)

Published 19 Apr 2024 in cs.LG

Abstract: In this work, we propose a new continual learning setting: data-incremental continual offline reinforcement learning (DICORL), in which an agent must continually learn from a sequence of datasets of a single offline reinforcement learning (RL) task, rather than from a sequence of offline RL tasks each with its own dataset. We then show that this setting introduces a challenge unique to continual learning: active forgetting, in which the agent actively forgets skills it has already learned. The main cause of active forgetting is the conservative learning used by offline RL methods to counter value overestimation. Under conservative learning, an offline RL method suppresses the values of all actions, whether previously learned or not, unless they appear in the dataset currently being learned. As a result, inferior data can overwrite premium data simply because of the order in which datasets arrive. To address this problem, we propose a new algorithm, experience-replay-based ensemble implicit Q-learning (EREIQL), which introduces multiple value networks to keep initial value estimates low and thereby avoids conservative learning, and uses experience replay to relieve catastrophic forgetting. Our experiments show that EREIQL relieves active forgetting in DICORL and performs well.
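The abstract names two ingredients: an ensemble of value networks whose pessimistic aggregate keeps initial value estimates low (so an explicit conservative penalty is not needed), and an experience-replay buffer that carries transitions from earlier datasets of the same task into later updates. The sketch below illustrates these two pieces in PyTorch. It is not the authors' implementation; the network sizes, ensemble size, minimum-based aggregation, and buffer-trimming policy are assumptions made only for illustration.

```python
# Hedged sketch of the two components the abstract describes; not the EREIQL
# reference code. Hyperparameters and structural details are assumptions.
import random
import torch
import torch.nn as nn


class ValueNet(nn.Module):
    """A small state-value network V(s)."""

    def __init__(self, state_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)


class EnsembleValue(nn.Module):
    """Ensemble of value networks aggregated by a minimum.

    Taking the minimum over independently initialized members keeps early
    value estimates pessimistic by construction, without a loss term that
    suppresses the values of actions missing from the current dataset.
    """

    def __init__(self, state_dim: int, n_members: int = 5):
        super().__init__()
        self.members = nn.ModuleList([ValueNet(state_dim) for _ in range(n_members)])

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        values = torch.stack([m(s) for m in self.members], dim=0)  # (n_members, B, 1)
        return values.min(dim=0).values                            # (B, 1)


class ReplayBuffer:
    """Keeps a bounded random sample of transitions from every dataset seen so far,
    so that updates on a later (possibly inferior) dataset still rehearse earlier data."""

    def __init__(self, capacity: int = 50_000):
        self.capacity = capacity
        self.storage: list = []

    def add_dataset(self, transitions: list) -> None:
        self.storage.extend(transitions)
        if len(self.storage) > self.capacity:
            self.storage = random.sample(self.storage, self.capacity)

    def sample(self, batch_size: int):
        return random.sample(self.storage, min(batch_size, len(self.storage)))
```

The minimum over several randomly initialized value networks starts low and only rises where the data supports it, which is the abstract's stated alternative to conservative value suppression; mixing replayed transitions from earlier datasets into each update is the standard rehearsal remedy for catastrophic forgetting.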

