P2DT: Mitigating Forgetting in Task-Incremental Learning with Progressive Prompt Decision Transformer (2401.11666v1)
Abstract: Catastrophic forgetting poses a substantial challenge for intelligent agents controlled by a large model, causing performance degradation when these agents face new tasks. In our work, we propose a novel solution: the Progressive Prompt Decision Transformer (P2DT). This method enhances a transformer-based model by dynamically appending decision tokens during new task training, thus fostering task-specific policies. Our approach mitigates forgetting in continual and offline reinforcement learning scenarios. Moreover, P2DT leverages trajectories collected via traditional reinforcement learning on all tasks and generates new task-specific tokens during training, thereby retaining knowledge from previously learned tasks. Preliminary results demonstrate that our model effectively alleviates catastrophic forgetting and scales well as the number of task environments increases.
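The abstract describes appending learnable, task-specific decision tokens to a Decision Transformer as new tasks arrive. Below is a minimal, hypothetical PyTorch sketch of that idea: a prompt pool that grows one learnable prompt per task, with prompts prepended to the embedded trajectory sequence before the transformer backbone. The class name `P2DTSketch`, the prompt length, the progressive concatenation of earlier prompts, and the freeze-old-prompts training scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class P2DTSketch(nn.Module):
    """Sketch of progressive prompt tokens for a Decision Transformer.
    Shapes and hyperparameters are illustrative assumptions."""

    def __init__(self, embed_dim: int, prompt_len: int,
                 num_layers: int = 3, num_heads: int = 1):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(encoder_layer, num_layers)
        self.embed_dim = embed_dim
        self.prompt_len = prompt_len
        # One learnable prompt per task, grown as new tasks arrive.
        self.prompts = nn.ParameterList()

    def add_task(self):
        """Append a fresh prompt for a new task; earlier prompts (and
        optionally the backbone) can be frozen to retain old policies."""
        self.prompts.append(
            nn.Parameter(torch.randn(self.prompt_len, self.embed_dim) * 0.02))

    def forward(self, tokens: torch.Tensor, task_id: int) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim) -- already-embedded
        # (return-to-go, state, action) trajectory tokens.
        batch = tokens.shape[0]
        # Progressively concatenate all prompts up to this task (assumed
        # here, following the progressive-prompts style the paper cites).
        prompt = torch.cat(list(self.prompts[: task_id + 1]), dim=0)
        prompt = prompt.unsqueeze(0).expand(batch, -1, -1)
        out = self.backbone(torch.cat([prompt, tokens], dim=1))
        # Drop prompt positions; predictions come from trajectory slots.
        return out[:, prompt.shape[1]:, :]

# Usage: grow one prompt per task; train only the current task's prompt.
model = P2DTSketch(embed_dim=128, prompt_len=5)
for t in range(3):
    model.add_task()
    for i, p in enumerate(model.prompts):
        p.requires_grad_(i == t)
traj = torch.randn(4, 60, 128)   # dummy embedded trajectory batch
preds = model(traj, task_id=2)   # -> shape (4, 60, 128)
```

Freezing earlier prompts while adding new ones is one plausible way to keep old task policies intact, consistent with the abstract's claim of mitigating forgetting without revisiting old-task parameters.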