RAIL: Risk-Averse Imitation Learning (1707.06658v4)

Published 20 Jul 2017 in cs.LG and cs.AI

Abstract: Imitation learning algorithms learn viable policies by imitating an expert's behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's behavior is available as a fixed set of trajectories. We evaluate GAIL-agents in terms of the expert's cost function and observe that the distribution of trajectory-costs is often more heavy-tailed for GAIL-agents than for the expert on a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by the GAIL-agents than by the expert. This makes the reliability of GAIL-agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that the policies learned with RAIL show lower tail-end risk than those of vanilla GAIL. Thus, the proposed RAIL algorithm appears to be a potent alternative to GAIL for improved reliability in risk-sensitive applications.
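To make the tail-risk measure in the abstract concrete, the following is a minimal sketch of how Value-at-Risk (VaR) and CVaR might be estimated from a sample of trajectory costs. It is a simple sorted-sample estimator written for illustration, not the estimator or training objective used by the RAIL authors. The function name `empirical_var_cvar`, the confidence level `alpha`, and the example costs are all assumptions introduced here.

```python
import math


def empirical_var_cvar(costs, alpha=0.9):
    """Estimate VaR and CVaR at level `alpha` from sampled trajectory costs.

    Hypothetical helper for illustration only. The tail is taken to be the
    worst ceil((1 - alpha) * n) samples; VaR is the smallest cost in that
    tail and CVaR is the mean cost over the tail.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(float(c) for c in costs)
    n = len(ordered)
    if n == 0:
        raise ValueError("costs must be non-empty")
    # Round before ceil to guard against floating-point noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1.0 - alpha) * n, 9)))
    tail = ordered[-k:]
    var = tail[0]
    cvar = sum(tail) / len(tail)
    return var, cvar


# Made-up costs: nine ordinary trajectories and one catastrophic failure.
sample_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
var, cvar = empirical_var_cvar(sample_costs, alpha=0.8)
print(var, cvar)  # 9.0 54.5
```

In this toy sample the mean cost is 14.5, while the CVaR at `alpha=0.8` is 54.5: a single rare, very costly trajectory barely moves the average but dominates the tail measure. That is the kind of heavy-tailed behavior the abstract describes for GAIL-agents.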


Authors (7)

Anirban Santara (13 papers)
Abhishek Naik (9 papers)
Balaraman Ravindran (100 papers)
Dipankar Das (86 papers)
Dheevatsa Mudigere (35 papers)
Sasikanth Avancha (20 papers)
Bharat Kaul (23 papers)

Citations (16)
