Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making (2310.03022v3)
Abstract: The recent success of Transformer in natural language processing has sparked its adoption in various domains. In offline reinforcement learning (RL), Decision Transformer (DT) has emerged as a promising Transformer-based model. However, we find that the attention module of DT is ill-suited to capturing the inherent local dependence patterns in RL trajectories modeled as Markov decision processes. To overcome this limitation of DT, we propose Decision ConvFormer (DC), a new action sequence predictor built on the MetaFormer architecture, a general structure that processes multiple entities in parallel and models their interrelationships. DC employs local convolution filtering as the token mixer, thereby effectively capturing the inherent local associations in RL datasets. In extensive experiments, DC achieved state-of-the-art performance across various standard RL benchmarks while requiring fewer resources. Furthermore, we show that DC better captures the underlying meaning of the data and exhibits enhanced generalization capability.
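To make the token-mixer substitution concrete, below is a minimal PyTorch sketch of a MetaFormer-style block that uses a depthwise causal 1D convolution as the token mixer instead of attention, in the spirit of DC. The embedding width, window size, and exact block layout here are illustrative assumptions, not the authors' reference implementation.

```python
# Sketch of a MetaFormer block with a local causal convolution token mixer,
# in the spirit of Decision ConvFormer. Hyperparameters (embed_dim,
# window_size) and the block layout are illustrative assumptions.
import torch
import torch.nn as nn


class ConvTokenMixer(nn.Module):
    """Depthwise 1D convolution over the token (time) axis.

    Each channel is filtered independently with a short causal window,
    so every token mixes only with its recent local context.
    """

    def __init__(self, embed_dim: int, window_size: int = 6):
        super().__init__()
        self.window_size = window_size
        self.conv = nn.Conv1d(
            embed_dim, embed_dim,
            kernel_size=window_size,
            groups=embed_dim,  # depthwise: one filter per channel
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, embed_dim) -> (batch, embed_dim, seq_len)
        x = x.transpose(1, 2)
        # Left-pad so the convolution is causal (no future leakage).
        x = nn.functional.pad(x, (self.window_size - 1, 0))
        return self.conv(x).transpose(1, 2)


class ConvFormerBlock(nn.Module):
    """MetaFormer block: pre-norm token mixer + pre-norm channel MLP."""

    def __init__(self, embed_dim: int, window_size: int = 6):
        super().__init__()
        self.norm1 = nn.LayerNorm(embed_dim)
        self.mixer = ConvTokenMixer(embed_dim, window_size)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim),
            nn.GELU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.mixer(self.norm1(x))   # local token mixing
        return x + self.mlp(self.norm2(x))  # channel mixing


if __name__ == "__main__":
    # Tokens for (return-to-go, state, action) triples over a context of
    # 20 timesteps -> 60 tokens, each embedded into 128 dimensions.
    tokens = torch.randn(2, 60, 128)
    block = ConvFormerBlock(embed_dim=128)
    print(block(tokens).shape)  # torch.Size([2, 60, 128])
```

The key design point illustrated is that the attention token mixer of a Transformer block is the only part replaced: the depthwise convolution restricts each token's receptive field to a fixed local window, matching the Markovian locality the abstract argues attention fails to exploit.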
Authors: Jeonghye Kim, Suyoung Lee, Woojun Kim, Youngchul Sung