
Addressing reward bias in Adversarial Imitation Learning with neutral reward functions (2009.09467v1)

Published 20 Sep 2020 in cs.LG, cs.RO, and stat.ML

Abstract: Generative Adversarial Imitation Learning suffers from a fundamental problem of reward bias stemming from the choice of reward function used in the algorithm. Different types of bias affect different types of environments, which are broadly divided into survival-based and task-based environments. We provide a theoretical sketch of why existing reward functions fail in imitation learning scenarios in task-based environments with multiple terminal states. We also propose a new reward function for GAIL that outperforms existing GAIL methods on task-based environments with single and multiple terminal states and effectively overcomes both survival and termination bias.
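The bias the abstract refers to can be illustrated with the discriminator-based rewards commonly discussed in the GAIL/AIRL literature (this sketch uses standard formulations from that literature, not necessarily the paper's proposed reward): a strictly positive reward implicitly encourages long episodes (survival bias), a strictly negative one encourages early termination (termination bias), while a reward whose sign depends on the discriminator output avoids a built-in preference for episode length.

```python
import math

def positive_reward(d):
    # r = -log(1 - D): always > 0 for D in (0, 1),
    # so the agent is paid just for staying alive (survival bias)
    return -math.log(1.0 - d)

def negative_reward(d):
    # r = log(D): always < 0 for D in (0, 1),
    # so the agent prefers to end the episode early (termination bias)
    return math.log(d)

def neutral_reward(d):
    # r = log(D) - log(1 - D): positive when D > 0.5, negative when
    # D < 0.5, so episode length alone confers no reward advantage
    return math.log(d) - math.log(1.0 - d)

for d in (0.2, 0.5, 0.8):
    print(f"D={d}: +bias={positive_reward(d):+.3f}, "
          f"-bias={negative_reward(d):+.3f}, neutral={neutral_reward(d):+.3f}")
```

Here `d` stands for the discriminator's probability that a state-action pair came from the expert; the neutral form is the one used in AIRL, shown here only to make the sign argument concrete.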

Authors (3)
  1. Rohit Jena (16 papers)
  2. Siddharth Agrawal (7 papers)
  3. Katia Sycara (93 papers)
Citations (6)