Safe Reinforcement Learning via Projection on a Safe Set: How to Achieve Optimality? (2004.00915v1)

Published 2 Apr 2020 in eess.SY, cs.AI, cs.LG, and cs.SY

Abstract: For all its successes, Reinforcement Learning (RL) still struggles to deliver formal guarantees on the closed-loop behavior of the learned policy. Among other things, guaranteeing the safety of RL with respect to safety-critical systems is a very active research topic. Some recent contributions propose to rely on projections of the inputs delivered by the learned policy into a safe set, ensuring that the system's safety is never jeopardized. Unfortunately, it is unclear whether this operation can be performed without disrupting the learning process. This paper addresses this issue. The problem is analysed in the context of $Q$-learning and policy gradient techniques. We show that the projection approach is generally disruptive in the context of $Q$-learning, though a simple alternative solves the issue, while simple corrections can be used in the context of policy gradient methods to ensure that the policy gradients are unbiased. The proposed results extend to safe projections based on robust MPC techniques.
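To make the projection idea concrete, the sketch below (not taken from the paper) shows the basic interface: the learned policy proposes an action, a Euclidean projection maps it into a safe set, and only the projected action is applied to the system. Here the safe set is a simple box, so the projection is a clip; for the polytopic or robust-MPC-based safe sets discussed in the paper, the projection would instead be a small optimization problem, and the paper's contribution concerns how this extra step interacts with $Q$-learning targets and policy-gradient estimates. The policy, gain matrix, and dummy dynamics below are hypothetical, for illustration only.

```python
# Minimal sketch, assuming a box-shaped safe set [low, high].
# The environment only ever receives the projected (safe) action.
import numpy as np

def project_to_box(action, low, high):
    """Euclidean projection of `action` onto the box [low, high]."""
    return np.clip(action, low, high)

def safe_step(env_step, policy, state, low, high):
    """Query the (possibly unsafe) policy, project its action, then act."""
    raw_action = policy(state)                      # may violate constraints
    safe_action = project_to_box(raw_action, low, high)
    return env_step(safe_action), raw_action, safe_action

# Hypothetical usage with a linear policy and dummy one-step dynamics.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K = rng.normal(size=(1, 2))                     # hypothetical policy gain
    policy = lambda s: K @ s
    env_step = lambda a: {"next_state": rng.normal(size=2),
                          "reward": -float(a @ a)}
    out, raw, safe = safe_step(env_step, policy, np.array([1.0, -0.5]),
                               low=np.array([-0.1]), high=np.array([0.1]))
    print("raw action:", raw, "projected:", safe)
```

Note that this only illustrates the projection interface; whether the learner should be trained on the raw action, the projected action, or a corrected gradient is exactly the question the paper analyses.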

Citations (46)
