
Auto-Encoding Adversarial Imitation Learning (2206.11004v5)

Published 22 Jun 2022 in cs.LG

Abstract: Reinforcement learning (RL) provides a powerful framework for decision-making, but applying it in practice often requires a carefully designed reward function. Adversarial Imitation Learning (AIL) enables automatic policy acquisition without access to a reward signal from the environment. In this work, we propose Auto-Encoding Adversarial Imitation Learning (AEAIL), a robust and scalable AIL framework. To induce expert policies from demonstrations, AEAIL uses the reconstruction error of an auto-encoder as the reward signal, which carries more information for policy optimization than prior discriminator-based rewards. We then use the derived objective functions to train the auto-encoder and the agent policy. Experiments show that AEAIL outperforms state-of-the-art methods in both state-based and image-based environments. More importantly, AEAIL is substantially more robust when the expert demonstrations are noisy.
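The core idea can be sketched compactly: the auto-encoder plays the role of the discriminator, trained adversarially (in the spirit of energy-based GANs) to reconstruct expert samples well and policy samples poorly, while the policy is rewarded for visiting states the auto-encoder reconstructs well. The following minimal PyTorch sketch illustrates this loop; the network sizes, the plain difference-of-errors objective, and the negative-error reward transform are illustrative assumptions, not the paper's exact architecture or hyperparameters.

```python
# Minimal sketch of the AEAIL idea: auto-encoder reconstruction error as reward.
# Architecture, loss weighting, and reward shaping are assumptions for illustration.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, obs_dim, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, obs_dim))

    def recon_error(self, x):
        # Per-sample squared reconstruction error ||x - D(E(x))||^2.
        return ((self.decoder(self.encoder(x)) - x) ** 2).mean(dim=-1)

ae = AutoEncoder(obs_dim=17)  # e.g. a MuJoCo state vector
opt = torch.optim.Adam(ae.parameters(), lr=3e-4)

def update_autoencoder(expert_batch, policy_batch):
    # Adversarial objective (EBGAN-style): drive reconstruction error down on
    # expert samples and up on policy samples, so that low error marks
    # expert-like states.
    loss = ae.recon_error(expert_batch).mean() - ae.recon_error(policy_batch).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

def reward(policy_batch):
    # The policy is rewarded for states the auto-encoder reconstructs well,
    # i.e. states that look expert-like. The negative-error transform here is
    # one simple choice; the exact shaping in the paper may differ.
    with torch.no_grad():
        return -ae.recon_error(policy_batch)
```

In use, `reward` replaces the environment reward inside any standard RL algorithm (e.g. PPO or SAC), and `update_autoencoder` alternates with policy updates, mirroring the discriminator/generator alternation in GAIL-style methods.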

Authors (4)
  1. Kaifeng Zhang
  2. Rui Zhao
  3. Ziming Zhang
  4. Yang Gao