Predictive Safety Filter (PSF)
- Predictive Safety Filter is a modular safety mechanism that filters and minimally adjusts control actions to prevent constraint violations.
- It leverages model predictive control principles to compute safe input trajectories while accommodating state- and input-dependent uncertainty.
- Demonstrated in tasks like inverted pendulum swing-up and quadrotor landing, PSF enables reinforcement learning systems to operate safely.
A Predictive Safety Filter (PSF) is a modular safety layer for learning-based and general control of nonlinear dynamical systems subject to state and input constraints. Its principal aim is to prevent violation of physical or safety-critical system constraints during both learning and deployment by filtering proposed control inputs online and replacing or modifying them only as needed. The PSF operates in real time alongside any controller—such as a reinforcement learning (RL) agent—enforcing probabilistic safety guarantees without interfering with performance except when necessary.
1. Principles and Formulation
The PSF is fundamentally based on the model predictive control (MPC) paradigm but specializes in safety rather than performance optimization. The core technique is to solve, at each time step, a finite-horizon optimal control problem which predicts the evolution of the system under a sequence of inputs—beginning with the input proposed by the RL agent or controller—and seeks the "closest" safe input sequence that ensures constraint satisfaction with high probability, possibly in the presence of state- and input-dependent uncertainty.
Mathematically, consider a discrete-time, nonlinear system
$$x(k+1) = f(x(k), u(k), \theta(k)),$$
where $x(k) \in \mathbb{R}^n$ is the state, $u(k) \in \mathbb{R}^m$ is the control input, and $\theta(k)$ represents uncertain or random system parameters.
State and input constraints are defined as
$$x(k) \in \mathbb{X}, \qquad u(k) \in \mathbb{U} \qquad \forall k \geq 0.$$
The PSF enforces these with a high probability $p$:
$$\Pr\left[\, x(k) \in \mathbb{X} \ \text{and} \ u(k) \in \mathbb{U}, \ \forall k \geq 0 \,\right] \geq p.$$
At each step, the PSF solves
$$\min_{v_{0|k}, \dots, v_{N-1|k}} \left\| u_L(k) - v_{0|k} \right\|^2 \quad \text{s.t.} \quad \bar{x}_{0|k} = x(k), \quad \bar{x}_{i+1|k} = \hat{f}(\bar{x}_{i|k}, v_{i|k}), \quad \bar{x}_{i|k} \in \mathbb{X}_i, \quad v_{i|k} \in \mathbb{U}_i, \quad \bar{x}_{N|k} \in \mathcal{X}_f,$$
where $u_L(k)$ is the RL-suggested action, $\hat{f}$ is the nominal (learned) model, $\mathcal{X}_f$ is a terminal safe set, and the constraint sets $\mathbb{X}_i$, $\mathbb{U}_i$ are subject to probabilistic tightening (Section 3) to compensate for model uncertainty. The first optimized input $v_{0|k}$ is then applied to the system.
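To make the formulation concrete, here is a minimal sketch of such a filter for a linear double-integrator model, solved with cvxpy. The dynamics, bounds, tightening schedule, and terminal set are illustrative assumptions, not values from the PSF literature.

```python
# Minimal PSF sketch for a linear double integrator, solved with cvxpy.
# All matrices, bounds, and the tightening schedule are illustrative.
import cvxpy as cp
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # position/velocity dynamics, dt = 0.1
B = np.array([[0.005], [0.1]])
N = 20                                    # prediction horizon
x_max = np.array([1.0, 0.5])              # |position|, |velocity| bounds
u_max = 2.0
eps = 0.01 * np.arange(N + 1)             # illustrative tightening margins

def safety_filter(x0, u_rl):
    """Return the minimally modified first input for proposed action u_rl."""
    x = cp.Variable((2, N + 1))
    v = cp.Variable((1, N))
    cons = [x[:, 0] == x0]
    for i in range(N):
        cons.append(x[:, i + 1] == A @ x[:, i] + B @ v[:, i])
        cons.append(cp.abs(x[:, i + 1]) <= x_max - eps[i + 1])  # tightened sets
        cons.append(cp.abs(v[:, i]) <= u_max)
    cons.append(cp.abs(x[1, N]) <= 0.05)  # crude terminal safe set: standstill
    prob = cp.Problem(cp.Minimize(cp.sum_squares(v[:, 0] - u_rl)), cons)
    prob.solve()
    if prob.status not in ("optimal", "optimal_inaccurate"):
        return None                        # infeasible: trigger recovery logic
    return v[:, 0].value

# Near the position bound, an aggressive proposal gets scaled back.
print(safety_filter(np.array([0.9, 0.4]), np.array([2.0])))
```

The standstill terminal condition is a simple stand-in for an invariant set: with zero velocity, applying zero input keeps a double integrator inside the state constraints indefinitely.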
2. Modular Integration with Reinforcement Learning
The PSF is algorithmically agnostic: it sits as a filter between any RL or otherwise potentially unsafe controller and the physical system. The RL agent computes an action $u_L(k)$, and the PSF determines whether this action is certifiably safe given the system's current state and learned uncertainty model. If so, the action is applied; if not, the PSF computes a minimally modified action, as close as possible to the proposal (by minimizing $\|u_L(k) - v_{0|k}\|$), that admits a safe backup plan to a terminal invariant set.
This decoupling allows the RL agent to treat the safety-constrained system as if it were unconstrained; the learning process is accelerated since unsafe exploratory actions are filtered online, and catastrophic failures are prevented during training and deployment. The RL agent receives unmodified feedback from the environment except at times of PSF intervention.
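A sketch of this decoupling in a training loop, assuming a Gymnasium-style environment API and reusing the illustrative `safety_filter` above; `agent.act` and `agent.observe` are stand-ins for any RL library's interface.

```python
# Sketch of PSF-in-the-loop training, assuming a Gymnasium-style env API
# and the illustrative safety_filter() defined earlier; agent.act() and
# agent.observe() are hypothetical stand-ins for an RL library's interface.
def run_episode(env, agent, safety_filter, fallback_policy):
    obs, _ = env.reset()
    done = False
    while not done:
        u_rl = agent.act(obs)                  # possibly unsafe proposal
        u_safe = safety_filter(obs, u_rl)      # certified or minimally modified
        if u_safe is None:                     # PSF infeasible: recovery action
            u_safe = fallback_policy(obs)
        obs, reward, terminated, truncated, _ = env.step(u_safe)
        done = terminated or truncated
        agent.observe(obs, reward)             # agent learns from real feedback
```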
3. Handling Model Uncertainty and Constraint Tightening
Practical deployment demands robust handling of uncertainty in the system model, which is typically learned from data. The PSF does this via predictive constraint tightening, using a set-valued model confidence map
$$\Sigma : \mathbb{X} \times \mathbb{U} \to 2^{\mathbb{R}^n},$$
which supplies, at each state-action pair, a region in which the model error lies with at least probability $p$. Constraints are correspondingly tightened along the predicted horizon, e.g. $\mathbb{X}_i = \mathbb{X} \ominus \mathcal{B}_{\epsilon_i}$ for $i = 0, \dots, N$ (and analogously for the input sets $\mathbb{U}_i$), where $\ominus$ denotes the Pontryagin set difference, $\mathcal{B}_{\epsilon_i}$ is a ball of radius $\epsilon_i$, and the schedule for the margins $\epsilon_i$ reflects the contraction properties of the system dynamics and the magnitude of tolerated model error.
The confidence map is usually constructed using a Bayesian learner (such as a Gaussian Process model), so that model-induced risk is state and input dependent. This enables the PSF to be more permissive in confident regions while remaining conservative in uncertain regions.
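As a sketch of such a confidence map, the following example bounds the one-step model error with a Gaussian Process via scikit-learn; the synthetic residual data and the scaling factor `beta` are assumptions made for illustration.

```python
# Sketch: GP-based confidence map for one-step model error, assuming a
# scalar residual between true and nominal dynamics; the data is synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
Z = rng.uniform(-1, 1, size=(200, 2))            # (state, input) training pairs
err = 0.1 * np.sin(3 * Z[:, 0]) * Z[:, 1] + 0.01 * rng.standard_normal(200)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(Z, err)

def confidence_radius(x, u, beta=2.0):
    """Radius of the model-error set at (x, u): |error - mean| <= beta * std
    holds with high probability under the GP model (beta is a design choice)."""
    _, std = gp.predict(np.array([[x, u]]), return_std=True)
    return beta * float(std[0])

# Tightening is larger where the model is uncertain, smaller where data is dense.
print(confidence_radius(0.0, 0.0), confidence_radius(0.9, -0.9))
```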
4. Intervention Logic and Recovery
The PSF modifies controller inputs only when the RL-proposed action cannot be certified to yield a feasible trajectory through the tightened constraint sets to a known terminal safe set. Upon detecting infeasibility, the PSF applies the first input of the nearest safe plan; if no new feasible plan can be constructed at the next step, it "shrinks" its backup horizon, eventually reverting to a predefined terminal controller if necessary. This mechanism ensures that, even in cases of unexpected model error or process disturbance, the system falls back to an action plan that recursively guarantees safety.
A simplified pseudocode for the supervisory PSF logic is:
```
function π_S(k, x(k), u_L(k)):
    if PSF MPC problem feasible with full horizon N:
        k_f ← k                          # store k as last feasible step
        return v_{0|k}                   # first input of the safe plan
    else if k < N + k_f:
        solve with reduced horizon N - (k - k_f)
        return v_{0|k}                   # first input of the shortened plan
    else:
        return π_S^t(x(k))               # predefined terminal safe policy
```
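The same logic rendered in Python, under the assumption of a hypothetical `solve_psf_mpc(x, u, horizon)` returning a feasibility flag and an input plan (e.g. wrapping the cvxpy sketch from Section 1):

```python
# Supervisory PSF logic in Python form; solve_psf_mpc is hypothetical,
# and N / last_feasible are illustrative module-level state.
N = 20
last_feasible = 0

def supervisor(k, x_k, u_rl, solve_psf_mpc, terminal_policy):
    global last_feasible
    feasible, plan = solve_psf_mpc(x_k, u_rl, horizon=N)
    if feasible:
        last_feasible = k                 # full-horizon backup plan exists
        return plan[0]
    if k < N + last_feasible:             # stored backup still covers step k
        _, plan = solve_psf_mpc(x_k, u_rl, horizon=N - (k - last_feasible))
        return plan[0]
    return terminal_policy(x_k)           # hand over to terminal safe controller
```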
5. Experimental Demonstrations
Empirical validation involved two primary case studies:
- Inverted Pendulum Swing-up: With the PSF filtering an RL-based exploration policy, the pendulum swing-up task was completed without any constraint violation over 120 episodes, in contrast to frequent constraint breaches under pure RL.
- Quadrotor Landing: The PSF enabled a learning agent to land a quadrotor at a specified target while respecting altitude and velocity constraints, completely avoiding unsafe trajectories that would otherwise lead to ground collisions.
Both experiments showed that the PSF only modified agent actions when required, that constraint tightness and model conservatism could be adaptively relaxed as state-action confidence grew with accumulating data, and that overall learning efficiency was not degraded.
6. Theoretical Properties and Scalability
The PSF delivers several theoretically grounded guarantees:
- Universal compatibility: Applicable to any RL or model-free/model-based controller.
- Online, minimally invasive operation: Modifies actions only as necessary.
- Data-driven, adaptive safety: Uncertainty is state- and input-dependent, enabling less conservative operation in well-learned regions.
- Probability-guaranteed constraint satisfaction: Safety is certified at a tunable probability level $p$.
- No precomputed global sets required: Unlike Hamilton-Jacobi reachability or invariant-set methods, the PSF requires only local, finite-horizon computation, scaling well with dimensionality and facilitating online deployment.
7. Design Parameters and Practical Considerations
Design of a PSF involves selecting the following (a configuration sketch follows the list):
- Prediction horizon $N$
- Contraction rate and per-step model error tolerance (shaping the tightening margins $\epsilon_i$)
- Terminal safe set and terminal safe controller policy
- Safety probability $p$ and model confidence map $\Sigma$
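One way to make these choices concrete is to gather them in a configuration object; the following dataclass is purely illustrative, with hypothetical field names and defaults.

```python
# Hypothetical PSF design parameters gathered in one place; the field
# names and default values are illustrative, not from the original work.
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class PSFConfig:
    horizon: int = 20                    # prediction horizon N
    contraction_rate: float = 0.9        # assumed contraction of the dynamics
    error_tolerance: float = 0.05        # tolerated one-step model error
    safety_probability: float = 0.95     # probability level p
    terminal_set_radius: float = 0.1     # crude terminal safe set ||x|| <= r
    terminal_policy: Callable[[np.ndarray], np.ndarray] = lambda x: -x  # placeholder

    def tightening_schedule(self) -> np.ndarray:
        """Margins eps_i growing along the horizon: the geometric sum of
        propagated one-step errors under the assumed contraction rate."""
        i = np.arange(self.horizon + 1)
        return self.error_tolerance * (1 - self.contraction_rate ** i) / (1 - self.contraction_rate)
```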
Offline statistical validation of the achieved safety probability $p$ is possible via Monte Carlo sampling of closed-loop trajectories, as in the schematic check below.
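A schematic Monte Carlo check, assuming a user-supplied closed-loop simulator `simulate_closed_loop(rng)` that reports whether any constraint was violated in a sampled episode:

```python
# Schematic offline Monte Carlo validation: estimate the empirical
# constraint-violation rate of the PSF-filtered closed loop.
import numpy as np

def estimate_violation_rate(simulate_closed_loop, n_trials=1000, seed=0):
    """simulate_closed_loop(rng) -> True if any constraint was violated
    (hypothetical user-supplied simulator of the filtered system)."""
    rng = np.random.default_rng(seed)
    violations = sum(simulate_closed_loop(rng) for _ in range(n_trials))
    rate = violations / n_trials
    # Compare against the design level 1 - p, with a simple standard error.
    stderr = np.sqrt(rate * (1 - rate) / n_trials)
    return rate, stderr
```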
Computational complexity is primarily determined by the MPC optimization and model inference (for uncertainty quantification), but the absence of global set computations and the decoupling of safety from controller logic support real-time application.
The Predictive Safety Filter provides a modular, scalable, and rigorous approach to enforcing safety in learning-based control of nonlinear constrained systems. Its MPC-inspired architecture, data-driven uncertainty quantification, and minimally invasive intervention policy establish it as a foundational framework for the safe deployment of reinforcement learning and other black-box control methods in real-world, safety-critical tasks.