Boosting Efficiency in Task-Agnostic Exploration through Causal Knowledge (2407.20506v1)
Abstract: The effectiveness of model training depends heavily on the quality of the available training resources. However, budget constraints often limit data collection efforts. To tackle this challenge, we introduce causal exploration, a strategy that leverages underlying causal knowledge for both data collection and model training. In particular, we focus on improving the sample efficiency and reliability of world model learning in task-agnostic reinforcement learning. During the exploration phase, the agent actively selects the actions expected to yield the causal insights most beneficial for world model training. Concurrently, the causal knowledge is acquired and incrementally refined as data collection proceeds. We show that causal exploration learns accurate world models from less data and provide theoretical guarantees for its convergence. Empirical experiments on both synthetic data and real-world applications further validate the benefits of causal exploration.
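The abstract describes two interleaved loops: an exploration loop that scores actions by their expected contribution to world-model learning, and a learning loop that fits a world model restricted by the causal structure discovered so far. The sketch below illustrates that coupling on a toy linear system. It is a minimal sketch under stated assumptions, not the authors' implementation: the ensemble-disagreement score (a stand-in for the paper's causal exploration objective), the magnitude-based mask pruning (a crude placeholder for causal discovery), and all names and hyperparameters are illustrative choices.

```python
# Illustrative causal-exploration loop: score candidate actions by
# ensemble disagreement, train a causally masked world model, and
# periodically prune the mask. All details here are assumptions.
import numpy as np

rng = np.random.default_rng(0)
D_S, D_A = 3, 2                      # state / action dimensions
D_IN = D_S + D_A

# Ground-truth sparse linear dynamics: row i lists the causal parents
# of next-state variable i among (s, a).
TRUE_W = np.array([[0.9, 0.0, 0.0, 0.5, 0.0],
                   [0.0, 0.8, 0.0, 0.0, 0.3],
                   [0.0, 0.4, 0.7, 0.0, 0.0]])

def env_step(s, a):
    """One transition of the (hidden) environment."""
    return TRUE_W @ np.concatenate([s, a]) + 0.01 * rng.standard_normal(D_S)

# Two-member ensemble of masked linear world models.
ensemble = [0.1 * rng.standard_normal((D_S, D_IN)) for _ in range(2)]
mask = np.ones((D_S, D_IN))          # 1 = edge kept; pruned during learning

def predict(W, s, a):
    return (W * mask) @ np.concatenate([s, a])

def disagreement(s, a):
    """Intrinsic score: ensemble disagreement on the masked prediction."""
    p0, p1 = (predict(W, s, a) for W in ensemble)
    return float(np.sum((p0 - p1) ** 2))

s = rng.standard_normal(D_S)
lr = 0.05
for t in range(2000):
    # Exploration: pick the candidate action the models disagree on most.
    candidates = rng.uniform(-1.0, 1.0, size=(16, D_A))
    a = max(candidates, key=lambda c: disagreement(s, c))
    s_next = env_step(s, a)

    # World-model training: masked gradient step on squared error.
    x = np.concatenate([s, a])
    for W in ensemble:
        err = (W * mask) @ x - s_next
        W -= lr * np.outer(err, x) * mask

    # Causal refinement (placeholder): prune consistently tiny edges.
    if t % 500 == 499:
        mean_W = np.mean([np.abs(W) for W in ensemble], axis=0)
        mask *= (mean_W > 0.05)

    s = s_next

print("recovered mask:\n", mask.astype(int))
print("true structure:\n", (np.abs(TRUE_W) > 0).astype(int))
```

The design point the sketch mirrors is the coupling between the two loops: the causal mask shrinks the model class, so each sample constrains fewer parameters, while disagreement-driven action selection concentrates data collection on transitions the model has not yet pinned down.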