
Visual Imitation Learning with Calibrated Contrastive Representation (2401.11396v1)

Published 21 Jan 2024 in cs.LG and cs.CV

Abstract: Adversarial Imitation Learning (AIL) allows an agent to reproduce expert behavior from low-dimensional states and actions. However, challenges arise when handling visual states, whose representations are less distinguishable than low-dimensional proprioceptive features. Existing methods either adopt complex network architectures or separate representation learning from decision-making, and they overlook valuable intra-agent information within demonstrations. To address this problem, this paper proposes a simple and effective solution that incorporates calibrated contrastive representation learning into the visual AIL framework. Specifically, we present an image encoder for visual AIL that combines unsupervised and supervised contrastive learning to extract valuable features from visual states. Based on the observation that an improving agent produces demonstrations of varying quality, we propose to calibrate the contrastive loss by treating each agent demonstration as a mixed sample. The contrastive learning objective can be jointly optimized with the AIL framework without modifying the architecture or incurring significant computational cost. Experimental results on the DMControl Suite demonstrate that the proposed method is sample efficient and outperforms the compared methods across multiple aspects.
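The calibration idea in the abstract, treating each agent demonstration as a mixed sample whose contrastive loss interpolates between the expert-positive and agent-positive cases, can be sketched in a toy NumPy form. This is an illustrative sketch, not the authors' implementation: the function names, the cosine similarity choice, and the mixing weight `alpha` (a stand-in for the estimated demonstration quality) are all assumptions.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def calibrated_contrastive_loss(agent_feat, expert_feats, peer_feats,
                                alpha, temperature=0.1):
    """InfoNCE-style loss for one agent sample treated as a mixture:
    with weight `alpha` it is aligned with expert features, and with
    weight 1 - alpha with fellow agent (peer) features."""
    candidates = np.vstack([expert_feats, peer_feats])
    logits = cosine_sim(agent_feat[None, :], candidates)[0] / temperature
    log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    n_exp = len(expert_feats)
    loss_expert = -np.mean(log_probs[:n_exp])  # expert-as-positive term
    loss_peer = -np.mean(log_probs[n_exp:])    # agent-as-positive term
    return alpha * loss_expert + (1.0 - alpha) * loss_peer
```

With `alpha` close to 1, a high-quality agent sample is pulled toward the expert cluster; with `alpha` close to 0, it is kept with the agent cluster, which is the calibration behavior the abstract describes.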

Authors (5)
  1. Yunke Wang
  2. Linwei Tao
  3. Bo Du
  4. Yutian Lin
  5. Chang Xu