
Learning Generalizable Agents via Saliency-Guided Features Decorrelation (2310.05086v2)

Published 8 Oct 2023 in cs.AI

Abstract: In visual-based Reinforcement Learning (RL), agents often struggle to generalize well to environmental variations in the state space that were not observed during training. The variations can arise in both task-irrelevant features, such as background noise, and task-relevant features, such as robot configurations, that are related to the optimal decisions. To achieve generalization in both situations, agents are required to accurately understand the impact of changed features on the decisions, i.e., establishing the true associations between changed features and decisions in the policy model. However, due to the inherent correlations among features in the state space, the associations between features and decisions become entangled, making it difficult for the policy to distinguish them. To this end, we propose Saliency-Guided Features Decorrelation (SGFD) to eliminate these correlations through sample reweighting. Concretely, SGFD consists of two core techniques: Random Fourier Functions (RFF) and the saliency map. RFF is utilized to estimate the complex non-linear correlations in high-dimensional images, while the saliency map is designed to identify the changed features. Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features, thereby achieving decorrelation in visual RL tasks. Our experimental results demonstrate that SGFD can generalize well on a wide range of test environments and significantly outperforms state-of-the-art methods in handling both task-irrelevant variations and task-relevant variations.
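The core mechanism the abstract describes, estimating non-linear dependence between two feature groups with Random Fourier features and then learning sample weights that minimize that dependence, can be sketched as follows. This is a minimal illustrative toy in numpy/scipy, not the authors' implementation: the feature maps, loss, and optimizer choice are assumptions, and the saliency-map guidance that selects which features to decorrelate is omitted.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def rff_map(x, W, b):
    """Random Fourier feature map approximating an RBF kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(x @ W + b)

def decorrelation_loss(logits, phi_a, phi_b):
    """Squared Frobenius norm of the weighted cross-covariance
    between two RFF embeddings; zero means (approximate) independence."""
    w = np.exp(logits - logits.max())
    w /= w.sum()
    A = phi_a - w @ phi_a              # weighted centering
    B = phi_b - w @ phi_b
    C = A.T @ (w[:, None] * B)         # weighted cross-covariance (D x D)
    return np.sum(C ** 2)

# Toy data: two scalar "features" with a strong non-linear dependence,
# standing in for two groups of state features.
n = 200
a = rng.normal(size=(n, 1))
b = a ** 2 + 0.1 * rng.normal(size=(n, 1))

D = 16  # number of random Fourier features
Wa, ba = rng.normal(size=(1, D)), rng.uniform(0, 2 * np.pi, D)
Wb, bb = rng.normal(size=(1, D)), rng.uniform(0, 2 * np.pi, D)
phi_a, phi_b = rff_map(a, Wa, ba), rff_map(b, Wb, bb)

# Learn per-sample weights (via softmax over logits) that shrink
# the estimated correlation between the two feature groups.
before = decorrelation_loss(np.zeros(n), phi_a, phi_b)
res = minimize(decorrelation_loss, np.zeros(n),
               args=(phi_a, phi_b), method="L-BFGS-B")
after = decorrelation_loss(res.x, phi_a, phi_b)
print(before, after)
```

In SGFD these learned weights would reweight the RL training batch so the policy sees a distribution in which changed features (as flagged by the saliency map) are decorrelated from the rest of the state; here the optimizer is simply left free to concentrate mass on whichever samples break the dependence.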
