
Recurrent Action Transformer with Memory (2306.09459v5)

Published 15 Jun 2023 in cs.LG and cs.AI

Abstract: The use of transformers in offline reinforcement learning has recently become a rapidly developing area. This is due to their ability to treat the agent's trajectory in the environment as a sequence, thereby reducing the policy learning problem to sequence modeling. In environments where the agent's decisions depend on past events (POMDPs), it is essential to capture both the event itself and the subsequent decision point within the model's context. However, the quadratic complexity of the attention mechanism limits the potential for context expansion. One solution to this problem is to extend transformers with memory mechanisms. This paper proposes a Recurrent Action Transformer with Memory (RATE), a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention. To evaluate our model, we conducted extensive experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments. The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments. We believe that our results will stimulate research on memory mechanisms for transformers applicable to offline reinforcement learning.
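
The abstract describes the mechanism only at a high level: a long trajectory is split into segments, and a set of memory embeddings is carried from one segment to the next so the transformer can retain information beyond its attention window. The following is a minimal, hypothetical PyTorch sketch of that general segment-recurrent memory-token idea, not the paper's actual RATE implementation; the module name, dimensions, number of memory tokens, and the omission of causal masking and of return/state/action embeddings are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class RecurrentMemoryBlock(nn.Module):
    """Illustrative sketch (not RATE itself): memory tokens are concatenated
    with each trajectory segment, processed jointly by a transformer encoder,
    and the updated memory embeddings are passed on to the next segment."""

    def __init__(self, d_model=128, n_heads=4, n_layers=2, n_mem_tokens=8):
        super().__init__()
        self.n_mem_tokens = n_mem_tokens
        # Learnable initial memory, shared across trajectories.
        self.mem_init = nn.Parameter(torch.randn(1, n_mem_tokens, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def init_memory(self, batch_size):
        return self.mem_init.expand(batch_size, -1, -1)

    def forward(self, segment_tokens, memory):
        # Prepend memory tokens to the current segment of trajectory embeddings.
        x = torch.cat([memory, segment_tokens], dim=1)
        h = self.encoder(x)
        new_memory = h[:, : self.n_mem_tokens]    # carried over to the next segment
        segment_out = h[:, self.n_mem_tokens :]   # would feed an action-prediction head
        return segment_out, new_memory


if __name__ == "__main__":
    torch.manual_seed(0)
    model = RecurrentMemoryBlock()
    memory = model.init_memory(batch_size=2)
    # A long trajectory split into 3 segments of 20 token embeddings each.
    for segment in torch.randn(3, 2, 20, 128):
        out, memory = model(segment, memory)
    print(out.shape, memory.shape)  # torch.Size([2, 20, 128]) torch.Size([2, 8, 128])
```

In this reading, the recurrence over segments keeps per-step attention cost bounded by the segment length while the memory tokens act as a learned, fixed-size summary of everything seen so far; how RATE regulates what is written into that summary is specified in the paper itself rather than in this sketch.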
