MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning
Abstract: The visual world provides an abundance of information, but many input pixels received by agents often contain distracting stimuli. Autonomous agents need the ability to distinguish useful information from task-irrelevant perceptions, enabling them to generalize to unseen environments with new distractions. Existing works approach this problem using data augmentation or large auxiliary networks with additional loss functions. We introduce MaDi, a novel algorithm that learns to mask distractions using only the reward signal. In MaDi, the conventional actor-critic structure of deep reinforcement learning agents is complemented by a small third sibling, the Masker. This lightweight neural network generates a mask that determines what the actor and critic receive, so that they can focus on learning the task. The masks are created dynamically, depending on the current input. We run experiments on the DeepMind Control Generalization Benchmark, the Distracting Control Suite, and a real UR5 Robotic Arm. Our algorithm improves the agent's focus with useful masks, while its efficient Masker network adds only 0.2% more parameters to the original structure, in contrast to previous work. MaDi consistently achieves generalization results better than or competitive with state-of-the-art methods.
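To make the masking mechanism concrete, here is a minimal NumPy sketch of the idea the abstract describes: a small network maps the observation to a per-pixel soft mask in [0, 1], and the actor and critic then receive the elementwise product of the observation and the mask. This is an illustration only; the `TinyMasker` class, its single 1x1-convolution weights, and all shapes are hypothetical stand-ins, not the paper's actual Masker architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyMasker:
    """Hypothetical sketch of a Masker: maps an (H, W, C) observation
    to an (H, W, 1) soft mask with values in [0, 1]."""

    def __init__(self, channels=3):
        # A single 1x1 "convolution": one weight per input channel plus a bias.
        self.w = rng.normal(0.0, 0.1, size=(channels,))
        self.b = 0.0

    def mask(self, obs):
        # Per-pixel logit, squashed to [0, 1] so it acts as a soft gate.
        logits = obs @ self.w + self.b
        return sigmoid(logits)[..., None]

obs = rng.uniform(0.0, 1.0, size=(84, 84, 3))  # dummy pixel observation
masker = TinyMasker()
m = masker.mask(obs)
masked_obs = obs * m  # this gated input is what actor and critic would see
```

In MaDi itself the mask is produced dynamically per input and the Masker is trained end-to-end from the reward signal alone; the sketch above only shows the gating step, not the learning.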