DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback (1810.11748v1)

Published 28 Oct 2018 in cs.HC and cs.LG

Abstract: Exploration has been one of the greatest challenges in reinforcement learning (RL) and a major obstacle to applying RL to robotics. Even with state-of-the-art RL algorithms, training a capable agent often requires a prohibitive number of trials, mainly because of the difficulty of associating actions with rewards that arrive far in the future. A remedy is to train the agent with real-time feedback from a human observer who immediately rewards some of its actions. This study tackles a series of challenges in introducing such a human-in-the-loop RL scheme. Our first contribution is a set of experiments with a precisely modeled human observer, capturing five properties of real feedback: it is binary, delayed, stochastic, unsustainable, and expressed through natural reactions. We also propose an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards. We find that DQN-TAMER agents outperform their baselines in simulated Maze and Taxi environments. Furthermore, we demonstrate a real-world human-in-the-loop RL application in which a camera automatically recognizes a user's facial expressions as feedback to the agent while the agent explores a maze.
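
The core idea of combining DQN with TAMER is to maintain two estimators: a Q-function trained on (possibly distant) environment rewards, as in DQN, and a feedback function H trained on immediate human reactions, as in TAMER, with actions chosen from a combination of the two. The sketch below is a minimal tabular illustration of that idea, not the authors' implementation: the paper uses deep networks and explicitly models delayed and stochastic feedback, and the weight BETA, the learning rates, and the environment sizes here are all hypothetical.

```python
# Minimal tabular sketch of the DQN-TAMER idea (assumptions noted above):
# keep a Q-table for environment rewards and an H-table for human feedback,
# and act greedily on their weighted sum.

import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 25, 4          # hypothetical: a 5x5 maze with 4 moves
Q = np.zeros((N_STATES, N_ACTIONS))  # long-horizon value estimate (DQN role)
H = np.zeros((N_STATES, N_ACTIONS))  # human-feedback estimate (TAMER role)

ALPHA_Q, ALPHA_H = 0.1, 0.2          # hypothetical learning rates
GAMMA = 0.99                         # discount for environment rewards
BETA = 1.0                           # hypothetical weight on human feedback


def act(state: int, epsilon: float = 0.1) -> int:
    """Epsilon-greedy action over the combined score Q + BETA * H."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state] + BETA * H[state]))


def update(state: int, action: int, reward: float,
           next_state: int, feedback) -> None:
    """One update step.

    `reward` is the (possibly sparse) environment reward; `feedback` is
    binary human feedback (+1/-1) or None when the observer did not react.
    """
    # Q-learning update from the environment reward.
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state, action] += ALPHA_Q * (td_target - Q[state, action])
    # TAMER-style supervised update from human feedback, when present.
    if feedback is not None:
        H[state, action] += ALPHA_H * (feedback - H[state, action])
```

In this sketch the human-feedback term dominates early on, while Q is still near zero, which matches the intuition that immediate human reactions can guide exploration before long-horizon rewards become informative. The full method additionally handles the delay and stochasticity of real feedback (e.g. crediting a late reaction to an earlier state-action pair), which this tabular sketch omits.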

Authors (5)
  1. Riku Arakawa (17 papers)
  2. Sosuke Kobayashi (19 papers)
  3. Yuya Unno (3 papers)
  4. Yuta Tsuboi (4 papers)
  5. Shin-ichi Maeda (29 papers)
Citations (71)
