Bridging Data-Driven Reachability Analysis and Statistical Estimation via Constrained Matrix Convex Generators

Published 6 Apr 2026 in eess.SY | (2604.04822v1)

Abstract: Data-driven reachability analysis enables safety verification when first-principles models are unavailable. This requires constructing sets of system models consistent with measured trajectories and noise assumptions. Existing approaches rely on zonotopic or box-based approximations, which do not fit the geometry of common noise distributions such as Gaussian disturbances and can lead to significant conservatism, especially in high-dimensional settings. This paper builds on ellipsotope-based representations to introduce mixed-norm uncertainty sets for data-driven reachability. The highest-density region defines the exact minimum-volume noise confidence set, while Constrained Convex Generators (CCG) and their matrix counterpart (CMCG) provide compatible geometric representations at the noise and parameter level. We show that the resulting CMCG coincides with the maximum-likelihood confidence ellipsoid for Gaussian disturbances, while remaining strictly tighter than constrained matrix zonotopes for mixed bounded-Gaussian noise. For non-convex noise distributions such as Gaussian mixtures, a minimum-volume enclosing ellipsoid provides a tractable convex surrogate. We further prove containment of the CMCG times CCG product and bound the conservatism of the Gaussian-Gaussian interaction. Numerical examples demonstrate substantially tighter reachable sets compared to box-based approximations of Gaussian disturbances. These results enable less conservative safety verification and improve the accuracy of uncertainty-aware control design.

Summary

  • The paper presents a novel framework that uses constrained matrix convex generators to reduce conservatism in reachability and parameter estimation.
  • It employs mixed-norm uncertainty representations to preserve the true geometry of Gaussian noise, yielding much tighter confidence sets.
  • Empirical results show up to 220× volume reduction and over 1000× faster computations compared to traditional box-based approaches.

Bridging Data-Driven Reachability and Statistical Estimation with Constrained Matrix Convex Generators

Introduction and Motivation

The paper "Bridging Data-Driven Reachability Analysis and Statistical Estimation via Constrained Matrix Convex Generators" (2604.04822) addresses a key bottleneck in data-driven reachability analysis: the conservatism introduced by standard box-based and zonotopic uncertainty approximations, especially under Gaussian noise. By leveraging mixed-norm uncertainty representations through Constrained Convex Generators (CCG) and their matrix counterparts (CMCG), the authors introduce a framework that preserves the geometry of the underlying noise distribution, achieving tighter confidence sets and improved computational properties. The highest-density region (HDR) is used as the statistically exact noise confidence set, and a formal connection is established between statistical estimation (MLE) and reachability via CMCG representations.

Zonotopic, Ellipsotopic, and Mixed-Norm Uncertainty Representations

Standard zonotopes, matrix zonotopes, and their constrained versions (CMZ) rely on $\infty$-norm bounds, which over-approximate the true geometry of Gaussian disturbances. For high-dimensional systems, such over-approximation causes exponential inflation in volume relative to the actual confidence region, as previously reported (e.g., $310\times$ for $q=10$ dimensions). Ellipsotopes, as introduced in prior work, unify ellipsoidal and zonotopic forms, and the present paper generalizes these sets to mixed-$p$ CCGs, assigning $2$-norm constraints to Gaussian generators and $\infty$-norm constraints to bounded generators.
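To make the exponential growth concrete, here is a minimal sketch that compares the volume of the tightest axis-aligned cube around a Euclidean ball with the volume of the ball itself. This is a simplified setup chosen for illustration (the function names are hypothetical), and it is not meant to reproduce the $310\times$ figure quoted above, whose exact construction is not given in this summary.

```python
import math

def unit_ball_volume(q: int) -> float:
    """Volume of the Euclidean unit ball in R^q."""
    return math.pi ** (q / 2) / math.gamma(q / 2 + 1)

def box_to_ball_ratio(q: int) -> float:
    """Volume of the cube [-r, r]^q divided by the volume of the radius-r ball.

    The radius cancels, so the ratio depends only on the dimension q.
    """
    return 2.0 ** q / unit_ball_volume(q)

if __name__ == "__main__":
    for q in (1, 2, 5, 10):
        print(f"q={q:2d}  box/ball volume ratio = {box_to_ball_ratio(q):.2f}")
```

The ratio is 1 in one dimension and grows rapidly with $q$, which is the qualitative effect the paragraph above describes for $\infty$-norm bounds on Gaussian noise.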

Mixed-norm CCGs accommodate arbitrary partitioning of generator coefficients, allowing the uncertainty representation to match the underlying noise model. The matrix variant, CMCG, further encodes parameter-level uncertainty sets consistent with input-state trajectory data and noise assumptions.

Figure 1: Mixed bounded-Gaussian truncation, illustrating how the CCG (solid) avoids the box over-approximation imposed by the probabilistic zonotope (dashed) and preserves the $2$-norm geometry for Gaussian disturbances.
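The effect of assigning different norms to different generator groups can be sketched through the support function. The example below assumes a generator set of the form $\{c + \sum_J G_J \xi_J : \|\xi_J\|_{p_J} \le 1\}$ with no additional equality constraints, which is a simplification of the constrained sets discussed in the paper. Under that assumption the support function in direction $d$ is $d^\top c + \sum_J \|G_J^\top d\|_{p_J^*}$, where the dual of the $2$-norm is the $2$-norm and the dual of the $\infty$-norm is the $1$-norm. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# A group is (p, generators): p is 2 or math.inf, generators are column vectors.
Group = Tuple[float, List[Vector]]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def support(center: Vector, groups: List[Group], d: Vector) -> float:
    """Support function of an unconstrained mixed-norm generator set."""
    value = dot(center, d)
    for p, gens in groups:
        proj = [dot(g, d) for g in gens]          # entries of G_J^T d
        if p == 2:
            value += math.sqrt(sum(v * v for v in proj))   # dual of 2-norm
        elif p == math.inf:
            value += sum(abs(v) for v in proj)             # dual of inf-norm
        else:
            raise ValueError("only p = 2 and p = inf are handled in this sketch")
    return value

if __name__ == "__main__":
    gens = [(1.0, 0.0), (0.0, 1.0)]
    d = (1.0, 1.0)
    print("2-norm group  :", support((0.0, 0.0), [(2, gens)], d))
    print("inf-norm group:", support((0.0, 0.0), [(math.inf, gens)], d))
```

With the same generators, the $2$-norm group never has a larger support value than the $\infty$-norm group, since $\|v\|_2 \le \|v\|_1$. This is the geometric reason a box treatment of Gaussian generators is looser.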

Highest Density Regions (HDR) and Exact Confidence Sets

The HDR is the smallest-volume region covering $1-\alpha$ probability and thus serves as the natural confidence set for both bounded and Gaussian noise. For convex HDRs, the CCG representation is exact; for non-convex HDRs, as with Gaussian mixtures, the minimum-volume enclosing ellipsoid (MVEE) provides a tractable convex surrogate.
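For a Gaussian, the HDR is a Mahalanobis ball whose squared radius is a chi-square quantile. The sketch below restricts itself to a standard two-dimensional Gaussian, where that quantile has the closed form $-2\ln\alpha$, and compares the HDR area with a centered square of equal joint coverage. The two-dimensional restriction, the identity covariance, and the function names are assumptions made to keep the example free of external libraries.

```python
import math
import random
from statistics import NormalDist

def gaussian_hdr_radius_sq_2d(alpha: float) -> float:
    """Squared radius of the (1 - alpha) HDR of a standard 2-D Gaussian.

    Closed-form chi-square quantile for 2 degrees of freedom.
    """
    return -2.0 * math.log(alpha)

def equal_coverage_box_half_width(alpha: float) -> float:
    """Half-width of the centered square with joint coverage 1 - alpha."""
    per_axis = math.sqrt(1.0 - alpha)   # coverage required on each axis
    return NormalDist().inv_cdf((1.0 + per_axis) / 2.0)

def empirical_coverage(alpha: float, n_samples: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the probability mass inside the HDR disc."""
    rng = random.Random(seed)
    r2 = gaussian_hdr_radius_sq_2d(alpha)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if x * x + y * y <= r2:
            hits += 1
    return hits / n_samples

if __name__ == "__main__":
    alpha = 0.05
    hdr_area = math.pi * gaussian_hdr_radius_sq_2d(alpha)
    box_area = (2.0 * equal_coverage_box_half_width(alpha)) ** 2
    print(f"HDR disc area : {hdr_area:.3f}")
    print(f"square area   : {box_area:.3f}")
    print(f"disc coverage : {empirical_coverage(alpha):.3f}")
```

Both sets carry the same probability mass, yet the disc is smaller, which is what "smallest-volume region" means in practice.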

From Noise Geometry to Parameter Uncertainty: CMCG Pullback

A central theoretical contribution is the pullback theorem mapping noise-level CCGs to parameter-level CMCGs. Under Gaussian disturbances, the CMCG coincides with the classical MLE confidence ellipsoid. The parameter uncertainty set depends only on the directions observable in the data, dramatically reducing conservatism compared to box-based approaches, which inflate volume in all noise dimensions regardless of parameter relevance.

Figure 2: Parameter-set comparison for scalar system ($n=1$, $T=30$): CMCG (green, solid) and MLE ellipsoid (blue, dashed) are identical; CMZ (red, dash-dot) is a much larger polytope due to box-based inflation.
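The classical MLE confidence set referred to above can be sketched for the simplest possible case. The example assumes an autonomous scalar model $x_{k+1} = a\,x_k + w_k$ with $w_k \sim \mathcal{N}(0, \sigma^2)$ and known $\sigma$, so the "ellipsoid" collapses to an interval $(a - \hat a)^2 \sum_k x_k^2 \le \sigma^2 \chi^2_{1,1-\alpha}$. The model, the parameter values, and the function names are illustrative assumptions, not the setup used for Figure 2.

```python
import math
import random
from statistics import NormalDist

def mle_interval(xs, sigma, alpha=0.05):
    """Return (a_hat, half_width) for x[k+1] = a*x[k] + w[k], w ~ N(0, sigma^2)."""
    sxx = sum(x * x for x in xs[:-1])
    sxy = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    a_hat = sxy / sxx                        # least-squares / Gaussian MLE
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z**2 = chi-square(1) quantile
    return a_hat, z * sigma / math.sqrt(sxx)

def simulate(a, sigma, T, x0=1.0, seed=0):
    """Generate T transitions (T + 1 states) of the assumed scalar model."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(T):
        xs.append(a * xs[-1] + rng.gauss(0.0, sigma))
    return xs

if __name__ == "__main__":
    xs = simulate(a=0.9, sigma=0.1, T=30)
    a_hat, half = mle_interval(xs, sigma=0.1)
    print(f"a_hat = {a_hat:.4f}, 95% interval = [{a_hat - half:.4f}, {a_hat + half:.4f}]")
```

The half-width shrinks as $\sum_k x_k^2$ grows, which matches the statement that the parameter set depends on how strongly the data excite each direction.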

For bounded noise, the CMCG reduces to the CMZ, matching set-membership feasible-set representations. For mixed bounded-Gaussian noise, the CMCG preserves the orthogonal sum, yielding confidence regions strictly tighter than the CMZ by retaining the true $2$-norm geometry for the stochastic component.

Propagation, Product Operations, and Containment

The framework supports forward propagation of uncertainty via CMCG $\times$ CCG products and Minkowski sums, retaining mixed $p$-norm constraints at each step. The containment theorem guarantees that reachable-set over-approximations are valid outer bounds, and the wrapping error from bilinear generator products is explicitly bounded. The construction avoids the exponential volume inflation typical of box-based Gaussian$\times$Gaussian blocks.

Figure 3: Reachable-set comparison over five propagation steps in a 5D system. The CMCG-based sets (green) remain consistently tighter than CMZ-based sets (red), and closely track the true reachable set (blue).
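The set operations used in propagation can be sketched for the easy part of the problem: a known matrix $A$ applied to a generator set, followed by a Minkowski sum with a noise set. The sketch deliberately leaves out the product of a matrix set with a state set, and it assumes unconstrained generator groups, so it illustrates only how the per-group norms survive each step. `GenSet` and the helper names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, ...]

@dataclass
class GenSet:
    center: Vec
    groups: List[Tuple[float, List[Vec]]]   # (p, generators), p in {2, math.inf}

def matvec(A, v: Vec) -> Vec:
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def linear_map(A, s: GenSet) -> GenSet:
    """Image of the set under x -> A x: map the center and every generator."""
    return GenSet(matvec(A, s.center),
                  [(p, [matvec(A, g) for g in gens]) for p, gens in s.groups])

def minkowski_sum(s: GenSet, t: GenSet) -> GenSet:
    """Add centers and keep both sets' groups side by side, norms unchanged."""
    center = tuple(a + b for a, b in zip(s.center, t.center))
    return GenSet(center, s.groups + t.groups)

def support(s: GenSet, d: Vec) -> float:
    value = sum(c * x for c, x in zip(s.center, d))
    for p, gens in s.groups:
        proj = [sum(gi * di for gi, di in zip(g, d)) for g in gens]
        if p == 2:
            value += math.sqrt(sum(v * v for v in proj))
        else:                                 # p == math.inf
            value += sum(abs(v) for v in proj)
    return value

def interval_hull(s: GenSet) -> List[Tuple[float, float]]:
    n = len(s.center)
    hull = []
    for i in range(n):
        e = tuple(1.0 if j == i else 0.0 for j in range(n))
        neg = tuple(-x for x in e)
        hull.append((-support(s, neg), support(s, e)))
    return hull

if __name__ == "__main__":
    A = ((0.9, 0.1), (0.0, 0.8))
    X = GenSet((1.0, 1.0), [(2, [(0.1, 0.0), (0.0, 0.1)])])
    W = GenSet((0.0, 0.0), [(2, [(0.05, 0.0), (0.0, 0.05)])])
    for step in range(1, 6):
        X = minkowski_sum(linear_map(A, X), W)
        print(step, [(round(lo, 3), round(hi, 3)) for lo, hi in interval_hull(X)])
```

Each step adds one generator group, so a practical implementation would also need an order-reduction step; that is outside the scope of this sketch.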

Numerical Results and Computational Efficiency

Empirical studies confirm strong numerical advantages:

  • For parameter estimation from data, the CMCG coincides exactly with the MLE ellipsoid and is drastically smaller than the CMZ, with volume reductions of up to $220\times$ in high dimension.
  • In reachability analysis, CMCG-based propagation yields reachable sets with much tighter interval hulls and reduced conservatism, validated over multi-step propagation.
  • Computation time for CMCG-based reachability is more than $1000\times$ faster than for CMZ, because linear-algebraic operations replace expensive LP-based kernel constraints.
  • For non-convex HDRs from Gaussian-mixture noise, the MVEE surrogate greatly improves over single-Gaussian approximations by convexifying the non-convex confidence region and maintaining valid coverage.

    Figure 4: Gaussian-mixture case study demonstrates that MVEE-based CMCG (green) closely approximates the true HDR (shaded), outperforming conservative single-Gaussian surrogates (red).
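The non-convexity of a mixture HDR, and the idea of a convex surrogate, can be sketched in one dimension. The example below estimates the HDR of a two-component Gaussian mixture on a grid by keeping the highest-density cells until their mass reaches $1-\alpha$, then takes the enclosing interval as the convex surrogate. In one dimension the enclosing interval plays the role of the MVEE; the paper's ellipsoidal construction in higher dimension is not reproduced here, and the mixture parameters and function names are illustrative assumptions.

```python
import math

def mixture_pdf(x, comps):
    """Density of a 1-D Gaussian mixture; comps = [(weight, mean, std), ...]."""
    return sum(w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
               for w, m, s in comps)

def grid_hdr(comps, alpha, lo, hi, n):
    """Mark grid cells belonging to an approximate (1 - alpha) HDR."""
    step = (hi - lo) / n
    xs = [lo + (i + 0.5) * step for i in range(n)]      # cell midpoints
    dens = [mixture_pdf(x, comps) for x in xs]
    order = sorted(range(n), key=lambda i: dens[i], reverse=True)
    inside = [False] * n
    mass = 0.0
    for i in order:                                     # highest density first
        if mass >= 1.0 - alpha:
            break
        inside[i] = True
        mass += dens[i] * step
    return xs, inside, step

def to_intervals(xs, inside, step):
    """Merge consecutive marked cells into (left, right) intervals."""
    out, start = [], None
    for i, flag in enumerate(inside):
        if flag and start is None:
            start = xs[i] - step / 2.0
        if not flag and start is not None:
            out.append((start, xs[i] - step / 2.0))
            start = None
    if start is not None:
        out.append((start, xs[-1] + step / 2.0))
    return out

if __name__ == "__main__":
    comps = [(0.5, -3.0, 1.0), (0.5, 3.0, 1.0)]
    xs, inside, step = grid_hdr(comps, alpha=0.05, lo=-10.0, hi=10.0, n=4000)
    pieces = to_intervals(xs, inside, step)
    hull = (pieces[0][0], pieces[-1][1])
    print("HDR pieces      :", [(round(a, 2), round(b, 2)) for a, b in pieces])
    print("convex surrogate:", (round(hull[0], 2), round(hull[1], 2)))
```

For well-separated components the HDR splits into disjoint pieces, and the convex surrogate covers them at the cost of including the low-density gap in between.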

Implications and Outlook

The theoretical implications extend to both statistical estimation and safety-critical model verification. By bridging statistical confidence sets (MLE ellipsoids, HDRs) and data-driven reachable sets, the framework enables sharper, less conservative uncertainty handling in identification, verification, and uncertainty-aware control. In mixed-noise regimes, the partitioned generator structure avoids unnecessary coupling, maximizing tightness and interpretability.

Practically, the CMCG approach preserves tractability for high-dimensional systems, enabling real-time reachable-set computation and enhanced safety verification in settings where bounded and stochastic disturbances co-occur.

Future developments will aim at:

  • Richer convex representations (polynomial CCGs) for exact non-convex HDRs
  • Distribution-free guarantees via sign-perturbed sums (SPS) and conformal prediction
  • Integration into uncertainty-aware control design and adaptive safety verification pipelines in autonomous systems and cyber-physical applications

Conclusion

The paper establishes a principled methodology that aligns uncertainty representations with the underlying noise geometry, guaranteeing coverage while minimizing conservatism. Through mixed-$p$-norm CCG/CMCG sets, data-driven reachability analysis becomes both statistically sound and computationally efficient, with direct connections to statistical estimation theory. The tightness and tractability of CMCG-based sets render them especially valuable for safety-critical systems, with immediate impact on verification and robust control in realistic, uncertain environments.
