Distributionally Robust PAC-Bayesian Control

Published 12 Apr 2026 in cs.LG and eess.SY | (2604.10588v1)

Abstract: We present a distributionally robust PAC-Bayesian framework for certifying the performance of learning-based finite-horizon controllers. While existing PAC-Bayes control literature typically assumes bounded losses and matching training and deployment distributions, we explicitly address unbounded losses and environmental distribution shifts (the sim-to-real gap). We achieve this by drawing on two modern lines of research, namely the PAC-Bayes generalization theory and distributionally robust optimization via the type-1 Wasserstein distance. By leveraging the System Level Synthesis (SLS) reparametrization, we derive a sub-Gaussian loss proxy and a bound on the performance loss due to distribution shift. Both are tied directly to the operator norm of the closed-loop map. For linear time-invariant systems, this yields a computationally tractable optimization-based framework together with high-probability safety certificates for deployment in real-world environments that differ from those used in training.


Authors (2)

Summary

The paper introduces a PAC-Bayesian framework that integrates Wasserstein DRO with operator-norm penalties to bound worst-case deployment risks for controllers.
It employs system level synthesis to parameterize finite-horizon LTI control as a tractable, differentiable optimization problem with explicit generalization guarantees.
Empirical evaluations on a double integrator task demonstrate that the robust bound effectively captures test risk under adversarial distribution shifts.

Distributionally Robust PAC-Bayesian Control: A Technical Examination

Introduction

"Distributionally Robust PAC-Bayesian Control" (2604.10588) introduces a PAC-Bayesian framework for learning finite-horizon controllers endowed with explicit, high-probability deployment guarantees robust to distribution shift. The framework uniquely combines PAC-Bayesian generalization theory with distributionally robust optimization (DRO) via 1-Wasserstein distance, resulting in a tractable procedure applicable to linear time-invariant (LTI) systems. A notable aspect is the avoidance of restrictive bounded-loss assumptions, handling unbounded losses common in control under realistic disturbances.

Theoretical Formulation: PAC-Bayes + Wasserstein DRO

The principal innovation is an upper bound on distributionally robust population (DROP) risk for learning-based controllers, measured under a worst-case distribution shift constrained by Wasserstein radius $\rho$ from the empirical (training) data distribution. The bound holds for general Lipschitz (potentially unbounded) loss functions and calibrates its complexity and robustness penalty terms directly through controller-dependent operator norms.

Formally, for any posterior $Q$ over the controller space, the DROP risk is bounded as

$\mathbb{E}_{K \sim Q}\left[R_\rho(K)\right] \leq \mathbb{E}_{K \sim Q}\left[\widehat{R}_S(K) + L(K)\rho\right] + C(Q,P,\sigma)$

where $L(K)$ is the Lipschitz constant of the loss map for controller $K$, $\widehat{R}_S(K)$ is the empirical loss on the training sample $S$, and $C(Q,P,\sigma)$ is a PAC-Bayes complexity term that depends on the posterior $Q$, the prior $P$, and the controller-dependent sub-Gaussian variance proxy $\sigma(K)^2$. This structure departs from earlier control-specific PAC-Bayes formulations that imposed bounded or saturated losses, used loose fixed penalty terms, or ignored environment shift.
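
One way to see where the $L(K)\rho$ term comes from is Kantorovich-Rubinstein duality for the type-1 Wasserstein distance. The sketch below assumes the ambiguity set is a $W_1$ ball of radius $\rho$ around a nominal disturbance distribution, written here as $\mathbb{P}$. The symbols $\mathbb{P}$, $\mathbb{Q}$, $R_{\mathbb{P}}$ and $R_{\mathbb{Q}}$ are notation introduced for illustration (distinct from the prior $P$ and posterior $Q$) and may differ from the paper's. For a loss that is $L(K)$-Lipschitz in the disturbance $w$,

$R_{\mathbb{Q}}(K) - R_{\mathbb{P}}(K) = \mathbb{E}_{w \sim \mathbb{Q}}\left[\ell(K,w)\right] - \mathbb{E}_{w \sim \mathbb{P}}\left[\ell(K,w)\right] \leq L(K)\, W_1(\mathbb{Q},\mathbb{P})$

so taking the supremum over the ball gives

$R_\rho(K) = \sup_{\mathbb{Q}:\, W_1(\mathbb{Q},\mathbb{P}) \leq \rho} R_{\mathbb{Q}}(K) \leq R_{\mathbb{P}}(K) + L(K)\rho$

Under this reading, the remaining gap between the nominal risk and the empirical risk $\widehat{R}_S(K)$, averaged over $K \sim Q$, is what the complexity term $C(Q,P,\sigma)$ would account for. The exact form of that term is not reproduced here.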

Specialization to System Level Synthesis (SLS) for LTI Control

To operationalize these theoretical results, the framework is instantiated in the SLS paradigm for finite-horizon LTI systems. Controllers are parameterized via affine constraints encoding causality and feasibility for the lifted system, so that both the operator-norm-based concentration and robustness certificates can be computed directly from the closed-loop map $M(\theta)$. The loss is defined as $\ell(\theta, w) = \|M(\theta)w\|$. Using Gaussian concentration and operator-norm analysis, the paper shows that the complexity and DRO penalty terms admit tractable, closed-form expressions for both Gaussian and bounded disturbances.
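
The link between the loss $\|M(\theta)w\|$ and the operator norm can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it assumes a discrete-time double integrator with unit time step, a hypothetical static state-feedback gain `K` in place of a full SLS parameterization, a zero initial state, and a closed-loop map that stacks states only. The helper names `closed_loop_map` and `loss` are invented for this example.

```python
import numpy as np

def closed_loop_map(A, B, K, T):
    """Lifted map M with [x_1; ...; x_T] = M @ [w_0; ...; w_{T-1}].

    Assumes x_{t+1} = (A - B K) x_t + w_t and x_0 = 0 (illustrative setup).
    Block (i, j) of M equals (A - B K)^(i - j) for i >= j, zero otherwise,
    so M is block lower triangular (causal).
    """
    n = A.shape[0]
    A_cl = A - B @ K
    M = np.zeros((T * n, T * n))
    for i in range(T):
        for j in range(i + 1):
            M[i * n:(i + 1) * n, j * n:(j + 1) * n] = np.linalg.matrix_power(A_cl, i - j)
    return M

def loss(M, w):
    """Loss of the form ||M w|| (Euclidean norm) for one disturbance sequence."""
    return np.linalg.norm(M @ w)

# Double integrator with unit time step; the gain K is a hypothetical choice.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.5, 1.0]])
T = 10

M = closed_loop_map(A, B, K, T)

# The spectral norm of M is a Lipschitz constant of w -> ||M w||.
lipschitz = np.linalg.norm(M, 2)

# DRO penalty L * rho for an illustrative Wasserstein radius.
rho = 0.08
dro_penalty = lipschitz * rho
print(f"operator norm = {lipschitz:.3f}, DRO penalty = {dro_penalty:.3f}")
```

Because $w \mapsto \|Mw\|$ is Lipschitz with constant $\|M\|_2$, Gaussian concentration for Lipschitz functions implies that, under standard Gaussian disturbances, the centered loss is sub-Gaussian with variance proxy at most $\|M\|_2^2$. Whether the paper's $\sigma(K)$ takes exactly this form is an assumption here.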

This parameterization yields a finite-dimensional, differentiable optimization problem over the posterior $Q$ (e.g., an isotropic Gaussian in SLS coordinates), efficiently solvable via modern automatic differentiation and (stochastic) gradient methods. The KL complexity term is analytically available for Gaussian families, and posterior sampling is facilitated by reparameterization.
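
The two ingredients named above, a closed-form KL term and reparameterized sampling, can be sketched for isotropic Gaussians as follows. This is an illustration under stated assumptions rather than the paper's code: both prior and posterior are taken to be isotropic Gaussians over an unconstrained parameter vector, and the function names `kl_isotropic` and `sample_posterior` are hypothetical.

```python
import numpy as np

def kl_isotropic(mu_q, s_q, mu_p, s_p):
    """KL( N(mu_q, s_q^2 I) || N(mu_p, s_p^2 I) ) in closed form.

    s_q and s_p are positive scalar standard deviations.
    """
    d = mu_q.size
    return 0.5 * (
        d * s_q**2 / s_p**2
        + np.sum((mu_q - mu_p) ** 2) / s_p**2
        - d
        + 2.0 * d * np.log(s_p / s_q)
    )

def sample_posterior(mu_q, s_q, rng, n):
    """Draw n samples theta = mu_q + s_q * eps with eps ~ N(0, I).

    Writing the sample as a deterministic function of (mu_q, s_q) and noise
    is the reparameterization that lets gradients flow to mu_q and s_q
    when an automatic-differentiation framework is used.
    """
    eps = rng.standard_normal((n, mu_q.size))
    return mu_q + s_q * eps

# Illustrative values only.
rng = np.random.default_rng(0)
mu_p, s_p = np.zeros(4), 1.0
mu_q, s_q = np.full(4, 0.3), 0.5
kl = kl_isotropic(mu_q, s_q, mu_p, s_p)
thetas = sample_posterior(mu_q, s_q, rng, n=8)
print(f"KL = {kl:.4f}, samples shape = {thetas.shape}")
```

In a gradient-based pipeline the same expressions would typically be written in an automatic-differentiation framework so that the posterior mean and scale can be optimized jointly; NumPy is used here only to keep the example self-contained.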

Empirical Evaluation and Certificate Analysis

The framework is empirically validated on a finite-horizon double integrator control task subject to random disturbances. The optimization calibrates both the empirical risk and the DRO penalty term, demonstrating the self-balancing property of the PAC-Bayes complexity when endowed with robust certificates.

Figure 1: Decomposition of the robust PAC-Bayes certificate cost for various training sample sizes, showing the interplay between empirical Gibbs risk and complexity as a function of data.

The comparison between the standard PAC-Bayes bound ($\rho = 0$) and the robust DRO-PAC-Bayes bound ($\rho = 0.08$) under distribution shift shows that the vanilla PAC-Bayesian bound fails to capture test-time risk when the deployment distribution is unmodeled. In contrast, the robust variant correctly upper-bounds the test risk under adversarially shifted deployment conditions while attaining lower empirical loss, as Figure 2 illustrates.

Figure 2: Comparison of vanilla PAC-Bayes (rho = 0) and robust PAC-Bayes (rho = 0.08) on test risk under deployment distribution shift; only the robust version delivers valid risk certificates and improved test cost.

Experimental results highlight:

The PAC-Bayes complexity term contracts predictably with increased sample size, validating finite-sample performance guarantees;
The robust DRO penalty acts as an explicit regularizer directly tied to closed-loop sensitivity, thus optimizing for controllers that generalize under worst-case environmental drift;
When the deployment distribution shift remains inside the certified Wasserstein radius, only the robust bound maintains a valid upper certificate, thereby highlighting the necessity of explicit distributional robustness in safety-critical learning-based control.
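
The role of the Wasserstein radius in the last point can be checked on a toy example. The sketch below is not a reproduction of the paper's experiment: it uses a random matrix as a hypothetical stand-in for the closed-loop map, models the deployment shift as a translation of every disturbance by a vector `delta` with norm equal to `rho`, and omits the PAC-Bayes complexity term entirely. A translation by `delta` moves a distribution by at most $\|\delta\|$ in type-1 Wasserstein distance, so the shifted distribution stays inside the radius.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a closed-loop map M(theta).
M = rng.standard_normal((6, 6)) / 3.0
lipschitz = np.linalg.norm(M, 2)  # spectral norm of M

# Radius matching the value quoted in the Figure 2 caption.
rho = 0.08

# Training disturbances, and a deployment set translated along the most
# sensitive input direction (top right singular vector of M).
w_train = rng.standard_normal((5000, 6))
direction = np.linalg.svd(M)[2][0]  # unit-norm row of Vh
delta = rho * direction
w_deploy = w_train + delta

def empirical_risk(M, W):
    """Average of ||M w|| over the rows w of W."""
    return np.mean(np.linalg.norm(W @ M.T, axis=1))

train_risk = empirical_risk(M, w_train)
deploy_risk = empirical_risk(M, w_deploy)
robust_bound = train_risk + lipschitz * rho

print(f"train risk    = {train_risk:.4f}")
print(f"deploy risk   = {deploy_risk:.4f}")
print(f"train + L*rho = {robust_bound:.4f}")
```

The inequality `deploy_risk <= robust_bound` holds sample by sample here, since $\|M(w+\delta)\| \leq \|Mw\| + \|M\|_2\|\delta\|$ by the triangle inequality. The training risk alone carries no such guarantee once the disturbances are shifted.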

Implications and Relationship to Prior Work

This framework explicitly decouples concentration and distributional robustness in the PAC-Bayesian context, allowing them to be calibrated in a controller- and task-dependent fashion. This differentiates it from prior PAC-Bayesian control work relying on divergence-based ambiguity sets with fixed penalties, which do not scale with system sensitivity. Via SLS, the analysis delivers numerically tractable, operator-norm-based proxies suitable for deployment in learning-based synthesis pipelines.

On the theoretical front, the explicit coupling of controller performance to robustness penalties sets the stage for systematic controller regularization under epistemic uncertainty. Practically, this enables deployment of learning-based controllers with strong statistical certificates even under adversarial environment perturbation, which is critical for real-world, high-assurance applications (robotics, automotive, financial systems, etc.).

Future Directions

Natural generalizations include extension to:

Robust control under more complex ambiguity sets (e.g., divergence-based, non-convex, or tailored uncertainty classes),
Nonlinear systems, where certificate and proxy computation demands tighter relaxations or scalable sampling methods,
Model predictive control settings with adaptive online update of robustness radii,
Sub-exponential or heavy-tailed cost structures,
Integration with data-driven system identification and robust estimator design.

Conclusion

The distributionally robust PAC-Bayesian control framework delivers tractable and explicit deployment risk certificates for learning-based controllers subject to real-world distributional uncertainties. By using SLS to tie both the concentration and robustness terms to the closed-loop map, the framework provides worst-case certificates under Wasserstein-constrained shifts while maintaining sample-efficient learning. In the reported experiments, the approach yields valid guarantees and strong empirical performance under adversarial environment drift, a meaningful advance in robust control synthesis under statistical uncertainty.