Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces (2403.19925v1)

Published 29 Mar 2024 in cs.LG and cs.AI

Abstract: Decision Transformer, a promising approach that applies Transformer architectures to reinforcement learning, relies on causal self-attention to model sequences of states, actions, and rewards. Although that method has shown competitive results, this paper investigates whether integrating the Mamba framework, known for its efficient and effective sequence modeling, into the Decision Transformer architecture can further improve performance on sequential decision-making tasks. The study systematically evaluates this integration through experiments across a range of decision-making environments, comparing the modified model, Decision Mamba, with its traditional counterpart. This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of a neural network can significantly affect its performance on complex tasks and highlighting Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios.
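The abstract describes swapping the Decision Transformer's causal self-attention backbone for Mamba's selective state-space layers while keeping the (return-to-go, state, action) token sequence. The sketch below is a minimal illustration of that idea, not the authors' released code: the DecisionMamba class name, layer sizes, residual wiring, and the use of the mamba_ssm package's Mamba block are assumptions made for the example.

```python
# Minimal sketch (assumptions noted above): a Decision Transformer-style
# policy whose attention stack is replaced by Mamba selective SSM blocks.
# Requires: pip install torch mamba-ssm

import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumed available; any causal (B, L, D) sequence layer works


class DecisionMamba(nn.Module):
    def __init__(self, state_dim, act_dim, d_model=128, n_layers=3, max_len=20):
        super().__init__()
        # One token each for return-to-go, state, and action per timestep,
        # as in Decision Transformer.
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.embed_time = nn.Embedding(max_len, d_model)
        # Mamba blocks stand in for the causal self-attention stack.
        self.blocks = nn.ModuleList(
            [Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2) for _ in range(n_layers)]
        )
        self.norm = nn.LayerNorm(d_model)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions, timesteps):
        # rtg: (B, T, 1), states: (B, T, state_dim),
        # actions: (B, T, act_dim), timesteps: (B, T) long tensor
        B, T = timesteps.shape
        t_emb = self.embed_time(timesteps)
        r = self.embed_rtg(rtg) + t_emb
        s = self.embed_state(states) + t_emb
        a = self.embed_action(actions) + t_emb
        # Interleave tokens as (r_1, s_1, a_1, ..., r_T, s_T, a_T).
        x = torch.stack([r, s, a], dim=2).reshape(B, 3 * T, -1)
        for block in self.blocks:
            x = x + block(x)  # residual connection around each Mamba block
        x = self.norm(x)
        # Predict the next action from each state token.
        return self.predict_action(x[:, 1::3, :])
```

At evaluation time such a model would be conditioned on a target return-to-go and the most recent context window, exactly as in Decision Transformer; only the sequence-modeling backbone changes.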

Authors (1)
  1. Toshihiro Ota
Citations (13)