
Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference (2105.00822v2)

Published 3 May 2021 in cs.LG, cs.AI, and cs.IR

Abstract: Recent advances in reinforcement learning have spurred growing interest in learning user models adaptively through dynamic interactions, e.g., in reinforcement-learning-based recommender systems. The reward function is crucial to most reinforcement learning applications because it guides the optimization. However, current reinforcement-learning-based methods rely on manually defined reward functions, which cannot adapt to dynamic and noisy environments; moreover, they generally use task-specific reward functions that sacrifice generalization ability. To address these issues, we propose a generative inverse reinforcement learning approach for user behavioral preference modelling. Instead of using predefined reward functions, our model automatically learns rewards from users' actions via a discriminative actor-critic network and a Wasserstein GAN. Our model provides a general way of characterizing and explaining underlying behavioral tendencies, and our experiments show that it outperforms state-of-the-art methods in a variety of scenarios, namely traffic signal control, online recommender systems, and scanpath prediction.
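The core idea in the abstract, learning a reward from demonstrated actions with a WGAN-style discriminator instead of hand-designing it, can be sketched in a few lines. The toy setup below (1-D states, a linear critic, synthetic "expert" and "policy" actions) is a hypothetical illustration of adversarial reward learning, not the paper's actual architecture: the critic is trained to score expert state-action pairs above policy ones, and its output then serves as the learned reward.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy environment: 1-D states in [0, 1].
# The "expert" deterministically plays a = 2*s; the untrained "policy" acts randomly.
def expert_action(s):
    return 2.0 * s

def policy_action(s):
    return rng.uniform(-1.0, 1.0, size=s.shape)

# Linear critic D(s, a) = w0*s + w1*a + w2; its output doubles as the learned reward.
w = np.zeros(3)

def critic(s, a, w):
    return w[0] * s + w[1] * a + w[2]

lr = 0.05
for step in range(500):
    s = rng.uniform(0.0, 1.0, size=64)
    a_exp, a_pol = expert_action(s), policy_action(s)
    feats_exp = np.stack([s, a_exp, np.ones_like(s)], axis=1)
    feats_pol = np.stack([s, a_pol, np.ones_like(s)], axis=1)
    # WGAN-style critic objective: maximize E[D(expert)] - E[D(policy)].
    # For a linear critic, the gradient is the difference of mean feature vectors.
    grad = feats_exp.mean(axis=0) - feats_pol.mean(axis=0)
    w += lr * grad
    w = np.clip(w, -1.0, 1.0)  # weight clipping as a crude Lipschitz constraint

# The learned reward should now rank expert behavior above random behavior.
s_test = rng.uniform(0.0, 1.0, size=1000)
r_exp = critic(s_test, expert_action(s_test), w).mean()
r_pol = critic(s_test, policy_action(s_test), w).mean()
```

In the full method, the critic would be a neural network over state-action pairs and the policy would simultaneously be optimized against the learned reward by an actor-critic algorithm; this sketch isolates only the reward-learning half of that loop.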

Authors (6)
  1. Xiaocong Chen (24 papers)
  2. Lina Yao (194 papers)
  3. Xianzhi Wang (49 papers)
  4. Aixin Sun (99 papers)
  5. Wenjie Zhang (138 papers)
  6. Quan Z. Sheng (91 papers)
Citations (8)
