Auto-Encoding Adversarial Imitation Learning (2206.11004v5)
Abstract: Reinforcement learning (RL) provides a powerful framework for decision-making, but applying it in practice often requires a carefully designed reward function. Adversarial Imitation Learning (AIL) enables automatic policy acquisition without access to a reward signal from the environment. In this work, we propose Auto-Encoding Adversarial Imitation Learning (AEAIL), a robust and scalable AIL framework. To induce expert policies from demonstrations, AEAIL uses the reconstruction error of an auto-encoder as the reward signal, which provides a more informative learning signal for policy optimization than prior discriminator-based rewards. We then use the derived objective functions to train the auto-encoder and the agent policy. Experiments show that AEAIL outperforms state-of-the-art methods on both state-based and image-based environments. More importantly, AEAIL is markedly more robust when the expert demonstrations are noisy.
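To make the core idea concrete, here is a minimal sketch of using an auto-encoder's reconstruction error as a reward signal. It is illustrative only: it uses a linear tied-weight auto-encoder trained on synthetic "expert" states lying near a low-dimensional subspace, and the exponential reward shaping, the function names (`reconstruct`, `reward`), and all dimensions are assumptions, not the paper's exact adversarial objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: expert states lie near a 2-D subspace of an 8-D
# observation space, so an auto-encoder fit on expert data reconstructs
# expert-like states well and off-distribution states poorly.
obs_dim, latent_dim = 8, 2
basis = rng.normal(size=(latent_dim, obs_dim))
expert_states = rng.normal(size=(500, latent_dim)) @ basis

# Linear auto-encoder with tied weights: encode with W, decode with W.T.
W = rng.normal(scale=0.1, size=(obs_dim, latent_dim))

def reconstruct(x, W):
    return (x @ W) @ W.T

# Fit W by gradient descent on the mean squared reconstruction error.
lr = 0.01
for _ in range(300):
    err = reconstruct(expert_states, W) - expert_states       # (N, obs_dim)
    grad = (expert_states.T @ err @ W
            + err.T @ expert_states @ W) / len(expert_states)
    W -= lr * grad

def reward(x, W):
    # Low reconstruction error on expert-like states -> high reward.
    e = np.sum((reconstruct(x, W) - x) ** 2, axis=-1)
    return np.exp(-e)

expert_like = rng.normal(size=(100, latent_dim)) @ basis  # near expert manifold
off_policy = rng.normal(size=(100, obs_dim)) * 2.0        # arbitrary states
print(reward(expert_like, W).mean() > reward(off_policy, W).mean())
```

In the full method the auto-encoder is updated adversarially against the policy rather than fit once on a fixed dataset, but the reward construction above captures why reconstruction error carries more information than a binary discriminator score: it varies smoothly with distance from the expert distribution.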
Authors: Kaifeng Zhang, Rui Zhao, Ziming Zhang, Yang Gao