Sequential Action-Induced Invariant Representation for Reinforcement Learning (2309.12628v1)
Abstract: Accurately learning task-relevant state representations from high-dimensional observations with visual distractions is a realistic and challenging problem in visual reinforcement learning. Recently, unsupervised representation learning methods based on bisimulation metrics, contrast, prediction, and reconstruction have shown promise for extracting task-relevant information. However, prediction-, contrast-, and reconstruction-based approaches lack an explicit mechanism for isolating task information, and bisimulation-based methods struggle in domains with sparse rewards, so these methods remain difficult to extend to environments with distractions. To alleviate these problems, this paper incorporates action sequences, which carry task-intensive signals, into representation learning. Specifically, we propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to preserve only the components that follow the control signals of sequential actions, inducing the agent to learn representations that are robust to distractions. We conduct extensive experiments on DeepMind Control suite tasks with distractions and achieve the best performance over strong baselines. We also demonstrate the effectiveness of our method at disregarding task-irrelevant information by deploying SAR to realistic CARLA-based autonomous driving with natural distractions. Finally, we analyze generalization through generalization decay and t-SNE visualization. Code and demo videos are available at https://github.com/DMU-XMU/SAR.git.
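The abstract describes an auxiliary learner that keeps only the latent components predictable from a sequence of actions. As a rough, hedged illustration of that general idea (not the authors' implementation), the NumPy sketch below computes a simple action-conditioned latent-prediction loss: a toy linear encoder maps observations to latents, and the loss measures how well a short action sequence explains the latent transition. All names, shapes, and the linear-model form are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(obs, W_enc):
    """Toy linear 'encoder': maps a flattened observation to a latent vector."""
    return obs @ W_enc

def action_prediction_loss(z_t, z_next, actions, W_act):
    """Illustrative auxiliary loss: predict the latent transition from a short
    action sequence. Minimizing it w.r.t. the encoder would encourage latents
    to keep only components that follow the control signals of the actions."""
    a_seq = actions.reshape(actions.shape[0], -1)  # flatten the action sequence
    pred_delta = a_seq @ W_act                     # toy linear transition model
    return float(np.mean((z_next - (z_t + pred_delta)) ** 2))

# Hypothetical sizes: batch of 8, 32-dim observations, 6-dim latent,
# sequences of 3 actions of dimension 2 (so the flattened sequence is 6-dim).
obs_t = rng.normal(size=(8, 32))
obs_next = rng.normal(size=(8, 32))
actions = rng.normal(size=(8, 3, 2))
W_enc = rng.normal(size=(32, 6)) * 0.1
W_act = rng.normal(size=(6, 6)) * 0.1

loss = action_prediction_loss(encode(obs_t, W_enc), encode(obs_next, W_enc),
                              actions, W_act)
print(loss)
```

In the paper this role is played by a learned auxiliary objective that shapes the encoder during training; the sketch only shows the shape of such a loss, not SAR's actual formulation.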
Authors: Dayang Liang, Qihang Chen, Yunlong Liu