Robust Action Governor (RAG)
- Robust Action Governor (RAG) is a supervisory scheme that enforces safety in uncertain control systems with non-convex state-input constraints.
- It uses online mixed-integer quadratic programming to adjust nominal controller actions, ensuring recursive feasibility and robust invariance.
- RAG computes robust invariant sets offline to manage uncertainties, bridging nominal control and safe execution in applications like reinforcement learning.
The Robust Action Governor (RAG) is an add-on supervisory scheme designed to enforce strict safety specifications for systems subject to uncertainties and hard state-input constraints, including non-convex constraints. Acting as an intermediary between a nominal controller (classical, model-based, or learned) and the plant, RAG modifies proposed control actions in real time so that safety requirements are satisfied robustly at every time step, with recursive feasibility of the correction itself. This covers both parametric and additive uncertainties prevalent in piecewise affine (PWA) and linear systems, as well as reinforcement learning (RL) scenarios where unsafe exploration would otherwise occur (Li et al., 2022, Li et al., 2022, Li et al., 2021).
1. System Model and Safety Constraints
RAG is formulated for discrete-time systems with state-dependent mode switching, commonly represented as PWA dynamics with uncertainties. The system state evolves as

$$x_{k+1} = A_{\sigma(x_k)}(\theta)\,x_k + B_{\sigma(x_k)}(\theta)\,u_k + w_k,$$

where $\sigma(x_k)$ indicates the active mode, $\theta \in \Theta$ parameterizes the matrices (with $\Theta$ a unit simplex), and $w_k \in W$ is the additive disturbance ($W$ a bounded polytope). The matrices $A_i(\theta) = \sum_j \theta_j A_i^{(j)}$ and $B_i(\theta) = \sum_j \theta_j B_i^{(j)}$ are convex combinations over vertices indexed by $j$.
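As a minimal sketch of these dynamics, the snippet below simulates one step of an uncertain PWA system with two modes, two vertex matrices per mode, and a box-bounded disturbance. All numerical values, the mode-switching rule, and the mode-independent input matrix are illustrative assumptions, not taken from the papers.

```python
import numpy as np

# Illustrative vertex matrices A_i^(j) for each mode i (assumed values).
A_vertices = {
    0: [np.array([[1.0, 0.10], [0.0, 1.00]]),
        np.array([[1.0, 0.12], [0.0, 0.95]])],
    1: [np.array([[1.0, 0.10], [-0.20, 1.00]]),
        np.array([[1.0, 0.12], [-0.25, 0.90]])],
}
B = np.array([[0.0], [0.1]])   # input matrix, assumed mode-independent here
W = 0.05                       # |w_k|_inf <= W (box disturbance bound)

def mode(x):
    """State-dependent mode switch (illustrative: sign of first state)."""
    return 0 if x[0] >= 0 else 1

def step(x, u, theta, w):
    """x_{k+1} = A_sigma(theta) x + B u + w, with A a convex combination."""
    A = sum(t * Aj for t, Aj in zip(theta, A_vertices[mode(x)]))
    return A @ x + (B @ u).ravel() + w

x = np.array([0.5, -0.1])
theta = np.array([0.7, 0.3])             # a point in the unit simplex
w = np.random.uniform(-W, W, size=2)     # admissible disturbance sample
x_next = step(x, np.array([0.2]), theta, w)
```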
Safety constraints are expressed as general non-convex, pointwise state constraints and polyhedral input constraints:
- $x_k \in X = \bigcup_j X_j$, with each $X_j$ polyhedral;
- $u_k \in U$, with $U$ a polytope.
This structure accommodates high expressiveness, including constraints that vary with state, mode, and operating region.
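The membership tests induced by this constraint structure can be sketched as follows, with each polyhedron in halfspace form $\{z : Hz \le h\}$; the specific sets are illustrative assumptions.

```python
import numpy as np

# X is a union of polyhedra (non-convex); U is a single polytope.
X_pieces = [
    (np.array([[1.0, 0.0], [-1.0, 0.0]]), np.array([2.0, 0.5])),  # -0.5 <= x1 <= 2
    (np.array([[0.0, 1.0], [0.0, -1.0]]), np.array([1.0, 3.0])),  # -3 <= x2 <= 1
]
U_H, U_h = np.array([[1.0], [-1.0]]), np.array([1.0, 1.0])        # |u| <= 1

def in_polyhedron(z, H, h, tol=1e-9):
    return bool(np.all(H @ z <= h + tol))

def in_X(x):
    # Non-convex membership: x is safe if it lies in ANY piece of the union.
    return any(in_polyhedron(x, H, h) for H, h in X_pieces)

def in_U(u):
    return in_polyhedron(u, U_H, U_h)
```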
2. Robust Control-Invariant Sets and the RAG Principle
At the core of RAG is the robust maximal control-invariant set (the "viability kernel") $C_\infty$, defined for the uncertain, possibly switching, dynamics as

$$C_\infty = \{x_0 \in X : \exists\, u_k \in U \text{ for all } k \ge 0 \text{ such that } x_k \in X \text{ for all } k, \text{ for every admissible } \theta \in \Theta \text{ and } w_k \in W\}.$$

Offline, $C_\infty$ is approximated by a decreasing sequence of sets $\{\Omega_k\}$:
- $\Omega_0 = X$;
- $\Omega_{k+1} = \{x \in \Omega_k : \exists\, u \in U \text{ s.t. } A_{\sigma(x)}(\theta)x + B_{\sigma(x)}(\theta)u + w \in \Omega_k \ \forall\, \theta \in \Theta,\ w \in W\}$.
Each $\Omega_k$ is a (possibly non-convex) union of polyhedra, enabling precise accommodation of complex constraints and uncertainties (Li et al., 2022).
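As a minimal illustration of this decreasing-set recursion, the sketch below computes a gridded inner approximation of the robust control-invariant set for a scalar uncertain system. The dynamics, bounds, and grid resolution are illustrative assumptions; the papers' exact method uses polyhedral set operations instead of gridding.

```python
import numpy as np

# Scalar system x+ = a(theta) x + b u + w with a in [0.9, 1.1],
# |u| <= 1, |w| <= 0.1, and state constraint |x| <= 2 (all assumed values).
a_lo, a_hi = 0.9, 1.1
b, W, X_MAX = 1.0, 0.1, 2.0
xs = np.linspace(-X_MAX, X_MAX, 401)        # state grid
us = np.linspace(-1.0, 1.0, 41)             # input grid

def robustly_stays(x, u, lo, hi):
    """Worst-case successor interval must land inside [lo, hi]."""
    nexts = np.array([a_lo * x, a_hi * x]) + b * u
    return (nexts.min() - W >= lo) and (nexts.max() + W <= hi)

# Since every Omega_k here is an interval, it suffices to track endpoints.
lo, hi = -X_MAX, X_MAX                      # Omega_0 = X
for _ in range(50):                         # iterate the recursion to a fixed point
    keep = [x for x in xs if lo <= x <= hi and
            any(robustly_stays(x, u, lo, hi) for u in us)]
    new_lo, new_hi = min(keep), max(keep)
    if (new_lo, new_hi) == (lo, hi):
        break
    lo, hi = new_lo, new_hi
# [lo, hi] approximates the robust control-invariant interval
```

For this particular system the full constraint interval is already robustly invariant, so the recursion converges immediately; shrinking the input bound or enlarging the disturbance makes the sequence strictly decrease.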
3. Online Optimization: Mixed-Integer Quadratic Programming Formulation
At runtime, RAG operates as a filter between the nominal controller and the actuator. Given the nominal action $\hat{u}_k$, at each time step $k$ it solves

$$u_k = \arg\min_{u \in U} \; (u - \hat{u}_k)^\top Q \,(u - \hat{u}_k)$$

subject to

$$A_{\sigma(x_k)}^{(j)} x_k + B_{\sigma(x_k)}^{(j)} u \in \Omega \ominus W \quad \text{for every vertex } j,$$

where $Q$ is a positive definite weighting matrix, $\Omega$ is the safe (robustly invariant) set, and $\ominus$ denotes the Pontryagin difference. Since $\Omega$ is non-convex (a union of polyhedra), integer variables and big-M constraints are introduced, resulting in a mixed-integer quadratic program (MIQP). This MIQP is solved efficiently online (typical runtime: 15-30 ms per step on modern CPUs) using solvers such as Gurobi or SCIP (Li et al., 2022, Li et al., 2021).
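For intuition, the filter can be sketched for a scalar input by enumerating the pieces of the non-convex target set and solving a one-dimensional QP (a projection) per piece; this brute-force enumeration is equivalent to the big-M MIQP for this toy setup. All dynamics and sets below are illustrative assumptions.

```python
import numpy as np

# Scalar uncertain dynamics x+ = a x + b u + w, a in [a_lo, a_hi], |w| <= W.
a_lo, a_hi, b, W = 0.9, 1.1, 1.0, 0.1
U_MIN, U_MAX = -1.0, 1.0
Omega_pieces = [(-2.0, -0.5), (0.5, 2.0)]   # union of intervals, standing in
                                            # for a union of polyhedra

def filter_action(x, u_nom):
    """Return the admissible u closest to u_nom, or None if infeasible."""
    best = None
    for lo, hi in Omega_pieces:
        # Robust containment a*x + b*u + w in [lo, hi] for all a, w yields
        # an interval of admissible u, tightened by the disturbance bound W
        # (the interval analogue of the Pontryagin difference Omega - W).
        u_lo = (lo + W - min(a_lo * x, a_hi * x)) / b
        u_hi = (hi - W - max(a_lo * x, a_hi * x)) / b
        u_lo, u_hi = max(u_lo, U_MIN), min(u_hi, U_MAX)
        if u_lo > u_hi:
            continue                          # this piece is unreachable
        u = min(max(u_nom, u_lo), u_hi)       # projection = 1-D QP solution
        if best is None or abs(u - u_nom) < abs(best - u_nom):
            best = u
    return best

u_safe = filter_action(x=1.0, u_nom=0.0)      # nominal action already safe
```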
4. Theoretical Properties and Robustness Guarantees
RAG ensures:
- Recursive Feasibility: If the MIQP is feasible at step $k$ with $x_k \in \Omega$, the resulting $u_k$ ensures $x_{k+1} \in \Omega$ for all allowable uncertainties; hence the MIQP remains feasible at all subsequent steps.
- Robust Safety: All closed-loop trajectories satisfy $x_k \in X$ for all $k \ge 0$, across all admissible realizations of $\theta$ and $w_k$.
- Safe Set Convergence: The set sequence $\{\Omega_k\}$ converges to a limit set, proven robustly invariant under mild compactness assumptions (Li et al., 2022, Li et al., 2022).
The supervisor's action corrections trade a margin of performance for guaranteed invariance, with larger uncertainties inducing more conservative feasible sets.
5. Integration with Safe Reinforcement Learning
RAG enables safe RL by robustly decoupling safety enforcement from exploratory policy learning. The process proceeds as follows:
- The RL agent observes the state $x_k$ and proposes a candidate action $\hat{u}_k$ (potentially through ε-greedy selection over its Q-function).
- RAG filters $\hat{u}_k$ to produce $u_k$ via the MIQP, ensuring one-step (and hence recursive) safety.
- The plant receives $u_k$, the reward is observed, and the RL agent updates its value function using the applied action $u_k$, so that learning remains consistent with the executed closed-loop behavior.
- All-time constraint violations are precluded throughout both exploration and exploitation phases (Li et al., 2022, Li et al., 2021).
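The loop above can be sketched with a tabular Q-learner on a scalar system. Here `filter_action` is an illustrative stand-in for the MIQP-based governor (it robustly keeps $|x_{k+1}| \le 2$ despite $|w| \le 0.05$); the dynamics, reward, and discretization are assumptions, not the papers' benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)
actions = np.linspace(-1.0, 1.0, 5)
Q = np.zeros((21, len(actions)))                 # Q over a coarse state grid

def to_idx(x):
    return int(np.clip(round((x + 2.0) / 0.2), 0, 20))

def filter_action(x, u_hat):
    # Keep x + u in [-1.95, 1.95] so that any |w| <= 0.05 stays in [-2, 2].
    return float(np.clip(u_hat, -1.95 - x, 1.95 - x))

x, eps, alpha, gamma = 0.0, 0.2, 0.1, 0.95
for k in range(1000):
    i = to_idx(x)
    a = int(rng.integers(len(actions))) if rng.random() < eps \
        else int(np.argmax(Q[i]))
    u = filter_action(x, float(actions[a]))      # governor may override u_hat
    x_next = x + u + rng.uniform(-0.05, 0.05)    # x+ = x + u + w (assumed)
    r = -x_next ** 2                             # illustrative regulation reward
    a_app = int(np.argmin(np.abs(actions - u)))  # update with applied action
    Q[i, a_app] += alpha * (r + gamma * Q[to_idx(x_next)].max() - Q[i, a_app])
    x = x_next
    assert abs(x) <= 2.0 + 1e-9                  # constraint holds throughout
```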
To further reduce online computational demands, an explicit safe policy can be obtained by imitation learning from RAG-filtered offline data, allowing near-instantaneous (e.g., 0.5 ms per step) action evaluation with minor safety approximation error.
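This distillation step can be sketched as follows: sample states, record governor-filtered actions, and fit a cheap function approximator (polynomial least squares here) for near-instant online evaluation. The `filter_action` stand-in, nominal policy, and feature choice are all illustrative assumptions.

```python
import numpy as np

def filter_action(x, u_hat):
    """Stand-in for the MIQP-based governor on a scalar toy system."""
    return float(np.clip(u_hat, -1.95 - x, 1.95 - x))

xs = np.linspace(-2.0, 2.0, 200)
u_nom = 0.5 * np.ones_like(xs)                       # assumed nominal policy
u_safe = np.array([filter_action(x, u) for x, u in zip(xs, u_nom)])

# Features [x^3, x^2, x, 1]; coefficients by linear least squares.
Phi = np.vander(xs, 4)
coef, *_ = np.linalg.lstsq(Phi, u_safe, rcond=None)

def explicit_policy(x):
    # Evaluating a fitted polynomial takes microseconds, versus solving an
    # MIQP online; the approximation error is the source of the minor
    # safety relaxation mentioned above.
    return float(np.vander(np.array([x]), 4) @ coef)
```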
6. Computational Approach
Offline Phase: The robust invariant set computations utilize polyhedral operations (Minkowski sums, Pontryagin differences, intersections, and projections) implemented in toolboxes such as MPT3. For PWA systems and non-convex $X$, vertex enumeration is employed to resolve the universal quantification over uncertainties.
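For axis-aligned boxes the two set operations named above reduce to interval arithmetic; the sketch below shows this reduction (general polyhedra require a toolbox such as MPT3). Boxes are represented as (lower, upper) corner pairs, an assumed encoding.

```python
import numpy as np

def minkowski_sum(A, B):
    """A (+) B = {a + b : a in A, b in B}, for boxes (lower, upper)."""
    return (A[0] + B[0], A[1] + B[1])

def pontryagin_diff(A, B):
    """A (-) B = {x : x + b in A for all b in B}: shrinks A by B."""
    lo, hi = A[0] - B[0], A[1] - B[1]
    if np.any(lo > hi):
        raise ValueError("difference is empty: B is too large for A")
    return (lo, hi)

X = (np.array([-2.0]), np.array([2.0]))      # state box
Wset = (np.array([-0.1]), np.array([0.1]))   # disturbance box
X_shrunk = pontryagin_diff(X, Wset)          # ([-1.9], [1.9])
```

Shrinking a target set by the disturbance box in exactly this way is what makes the online constraint on the nominal successor state robust to every admissible $w_k$.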
Online Phase: MIQP solution for action correction is performed at every step. For moderate system dimensions and sampling rates (e.g., ≤20 ms per MIQP), RAG is compatible with real-time operation in practical safety-critical control loops (Li et al., 2022, Li et al., 2021).
7. Example Applications and Performance
RAG has been evaluated on PWA models, such as a mass-spring-damper system with uncertain mass and adversarial disturbances. In soft-landing tasks with non-convex velocity-position safety regions and input force constraints, RAG achieves a violation rate of zero over 500 disturbance trials under adversarial injection, compared to frequent violations from a nominal RL controller.
In RL-driven adaptation scenarios (e.g., shifting system parameters $\theta$), RAG-based safe RL maintains zero constraint violations from the first episode, whereas conventional RL may require hundreds of episodes and still suffer occasional violations. When learned explicit safe policies are deployed via imitation learning, per-step compute time drops by more than 95% while tolerating only negligible or minor constraint relaxation (Li et al., 2022).
In automotive adaptive cruise control, RAG strictly enforces distance and actuator limits both during training and deployment, yielding zero violations and faster RL convergence—demonstrating the general applicability of the approach (Li et al., 2021).
References
- "Robust Action Governor for Uncertain Piecewise Affine Systems with Non-convex Constraints and Safe Reinforcement Learning" (Li et al., 2022)
- "Safe Control and Learning Using the Generalized Action Governor" (Li et al., 2022)
- "Safe Reinforcement Learning Using Robust Action Governor" (Li et al., 2021)