Papers
Topics
Authors
Recent
Search
2000 character limit reached

PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning

Published 4 Aug 2025 in cs.LG | (2508.02159v1)

Abstract: Partial observability presents a significant challenge for safe reinforcement learning, as it impedes the identification of potential risks and rewards. Leveraging specific types of privileged information during training to mitigate the effects of partial observability has yielded notable empirical successes. In this paper, we propose Asymmetric Constrained Partially Observable Markov Decision Processes (ACPOMDPs) to theoretically examine the advantages of incorporating privileged information. Building upon ACPOMDPs, we propose the Privileged Information Guided Dreamer, a model-based safe reinforcement learning approach that leverages privileged information to enhance the agent's safety and performance through privileged representation alignment and an asymmetric actor-critic structure. Our empirical results demonstrate that our approach significantly outperforms existing methods in terms of safety and task-centric performance. Meanwhile, compared to alternative privileged model-based reinforcement learning methods, our approach exhibits superior performance and ease of training.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.