
Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective (2312.11214v1)

Published 18 Dec 2023 in cs.LG and cs.AI

Abstract: Generative Adversarial Imitation Learning (GAIL) stands as a cornerstone approach in imitation learning. This paper investigates gradient explosion in two types of GAIL: GAIL with a deterministic policy (DE-GAIL) and GAIL with a stochastic policy (ST-GAIL). We begin with the observation that DE-GAIL training can be highly unstable at the start of the training phase and may end up diverging. Conversely, the ST-GAIL training trajectory remains consistent, reliably converging. To shed light on these disparities, we provide a theoretical explanation. By establishing a probabilistic lower bound for GAIL, we demonstrate that gradient explosion is an inevitable outcome for DE-GAIL due to occasionally large expert-imitator policy disparity, whereas ST-GAIL does not suffer from this issue. To substantiate this assertion, we illustrate how modifications to the reward function can mitigate the gradient explosion challenge. Finally, we propose CREDO, a simple yet effective strategy that clips the reward function during training, allowing GAIL to enjoy high data efficiency and stable trainability.
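
The sketch below illustrates, in plain Python/NumPy, the kind of reward clipping the abstract attributes to CREDO. The reward form -log(1 - D(s, a)), the clip threshold, and the function names are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np

def gail_reward(d_prob, eps=1e-8):
    """Common GAIL surrogate reward r = -log(1 - D(s, a)).

    d_prob: discriminator output in (0, 1) for an imitator transition.
    The reward grows without bound as D(s, a) -> 1, which is where the
    gradient-explosion issue described in the abstract originates.
    """
    return -np.log(1.0 - d_prob + eps)

def clipped_gail_reward(d_prob, clip_max=10.0, eps=1e-8):
    """Hypothetical reward clipping in the spirit of CREDO: cap the reward
    so that occasional large expert-imitator disparities cannot produce
    unbounded values. The threshold clip_max is an illustrative choice,
    not a value from the paper."""
    return np.clip(-np.log(1.0 - d_prob + eps), None, clip_max)

# As D(s, a) approaches 1, the unclipped reward diverges while the
# clipped reward saturates at clip_max.
for p in (0.5, 0.9, 0.999999):
    print(p, gail_reward(p), clipped_gail_reward(p))
```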

Authors (8)
  1. Wanying Wang (3 papers)
  2. Yichen Zhu (51 papers)
  3. Yirui Zhou (7 papers)
  4. Chaomin Shen (25 papers)
  5. Jian Tang (326 papers)
  6. Zhiyuan Xu (47 papers)
  7. Yaxin Peng (22 papers)
  8. Yangchun Zhang (6 papers)
Citations (2)