
MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL (2305.19923v1)

Published 31 May 2023 in cs.LG and cs.AI

Abstract: Diffusion models have recently emerged as a promising backbone for the sequence-modeling paradigm in offline reinforcement learning (RL). However, existing work mostly lacks the ability to generalize across tasks whose rewards or dynamics change. To tackle this challenge, we propose MetaDiffuser, a task-oriented conditioned diffusion planner for offline meta-RL, which treats the generalization problem as a conditional trajectory generation task with contextual representation. The key is to learn a context-conditioned diffusion model that can generate task-oriented trajectories for planning across diverse tasks. To enhance the dynamics consistency of the generated trajectories while encouraging them to achieve high returns, we further design a dual-guided module in the sampling process of the diffusion model. The proposed framework is robust to the quality of the warm-start data collected from the testing task and can flexibly incorporate different task representation methods. Experimental results on MuJoCo benchmarks show that MetaDiffuser outperforms other strong offline meta-RL baselines, demonstrating the strong conditional generation ability of the diffusion architecture.
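
The abstract does not spell out the dual-guided update rule, so the following is only a minimal sketch of the general idea, assuming a classifier-guidance-style step in which the denoised trajectory mean is nudged toward higher return and lower dynamics error. Every name here (`dual_guided_step`, `grad_return`, `grad_dynamics_error`, the two scale parameters) is hypothetical, and the toy return and toy dynamics are illustrative assumptions rather than the paper's learned models.

```python
import numpy as np

def dynamics_error(traj):
    # Toy dynamics assumption: consecutive states should be equal.
    diffs = traj[1:] - traj[:-1]
    return float(np.sum(diffs ** 2))

def grad_dynamics_error(traj):
    # Analytic gradient of the toy dynamics error above.
    diffs = traj[1:] - traj[:-1]
    grad = np.zeros_like(traj)
    grad[1:] += 2.0 * diffs
    grad[:-1] -= 2.0 * diffs
    return grad

def grad_return(traj):
    # Toy return assumption: return = sum of states, so the gradient is all ones.
    return np.ones_like(traj)

def dual_guided_step(mean, sigma, grad_return, grad_dynamics_error,
                     reward_scale=1.0, dynamics_scale=1.0, rng=None):
    # `mean` stands in for the denoised mean that a context-conditioned
    # diffusion model would predict (hypothetical; no model is trained here).
    guidance = (reward_scale * grad_return(mean)
                - dynamics_scale * grad_dynamics_error(mean))
    guided_mean = mean + sigma ** 2 * guidance
    # With rng=None the step is deterministic, which is convenient for checks.
    noise = rng.standard_normal(mean.shape) if rng is not None else 0.0
    return guided_mean + sigma * noise

traj = np.array([0.0, 1.0, 0.0])
guided = dual_guided_step(traj, 0.1, grad_return, grad_dynamics_error)
print(guided, dynamics_error(traj), dynamics_error(guided))
```

In this toy setting, a single small step raises the toy return and lowers the toy dynamics error simultaneously; the two scale parameters control how strongly each guidance term pulls on the sample.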

Authors (7)
  1. Fei Ni (21 papers)
  2. Jianye Hao (185 papers)
  3. Yao Mu (58 papers)
  4. Yifu Yuan (19 papers)
  5. Yan Zheng (102 papers)
  6. Bin Wang (750 papers)
  7. Zhixuan Liang (14 papers)
Citations (34)
