DIDA: Denoised Imitation Learning based on Domain Adaptation (2404.03382v1)
Abstract: Imitating skills from low-quality datasets, such as sub-optimal demonstrations and observations with distractors, is common in real-world applications. In this work, we focus on the problem of Learning from Noisy Demonstrations (LND), where the imitator is required to learn from data corrupted by noise that often arises during data collection or transmission. Previous imitation learning (IL) methods improve the robustness of learned policies by injecting adversarially learned Gaussian noise into pure expert data or by utilizing additional ranking information, but they may fail in the LND setting. To address these limitations, we propose Denoised Imitation learning based on Domain Adaptation (DIDA), which uses two discriminators to distinguish the noise level and the expertise level of data, helping a feature encoder learn task-related but domain-agnostic representations. Experimental results on MuJoCo demonstrate that DIDA can successfully handle challenging imitation tasks from demonstrations with various types of noise, outperforming most baseline methods.
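The two-discriminator idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the linear encoder, the logistic discriminators, the binary labels, the weight `lam`, and every function name below are assumptions made for illustration. It follows the general domain-adversarial recipe, in which the encoder is rewarded when the expertise discriminator succeeds (task-related features) and when the noise discriminator fails (domain-agnostic features).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y):
    # Binary cross-entropy for one prediction p in (0, 1) and label y in {0, 1}.
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def encode(W, x):
    # Hypothetical encoder: one linear layer followed by tanh.
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def discriminate(v, z):
    # Hypothetical discriminator: logistic regression on the encoded feature.
    return sigmoid(sum(vi * zi for vi, zi in zip(v, z)))


def dida_style_losses(W, v_noise, v_expert, batch, lam=1.0):
    """Compute the three losses of a domain-adversarial setup.

    batch is a list of (x, noise_label, expert_label) triples, where
    noise_label = 1 marks a noisy sample and expert_label = 1 marks
    expert data. Both labelling conventions are assumptions.
    """
    noise_loss = 0.0
    expert_loss = 0.0
    for x, y_noise, y_expert in batch:
        z = encode(W, x)
        noise_loss += bce(discriminate(v_noise, z), y_noise)
        expert_loss += bce(discriminate(v_expert, z), y_expert)
    noise_loss /= len(batch)
    expert_loss /= len(batch)
    # Each discriminator minimizes its own loss. The encoder minimizes the
    # expertise loss while maximizing the noise loss (weighted by lam).
    encoder_loss = expert_loss - lam * noise_loss
    return {
        "noise_disc": noise_loss,
        "expert_disc": expert_loss,
        "encoder": encoder_loss,
    }


# Toy batch: 2-D observations with (noise_label, expert_label).
batch = [
    ([0.5, -1.0], 1, 1),   # noisy expert demonstration
    ([0.2, 0.3], 0, 0),    # clean imitator rollout
]
W = [[0.1, -0.2], [0.3, 0.4]]   # encoder weights (2 features)
result = dida_style_losses(W, [0.5, -0.5], [1.0, 1.0], batch, lam=0.5)
print(result)
```

In a full training loop the three losses would be minimized in alternation with a gradient-based optimizer, and the expertise discriminator's output would typically serve as the reward signal for the policy. Those steps are omitted here because the abstract does not specify them.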