Data-Driven Receding Horizon Control

Updated 1 March 2026

Data-Driven Receding Horizon Control is a data-centric methodology that leverages real-time measurements to update and optimize predictive control actions without relying on explicit parametric models.
It employs techniques like set-membership, Hankel matrix methods, and stochastic frameworks to provide guarantees on stability, robustness, and constraint satisfaction.
Its iterative receding horizon strategy progressively refines decision policies for dynamic systems, reducing uncertainty and enhancing performance even in distributed, safety-critical settings.

Data-Driven Receding Horizon Control (DDRHC) refers to a class of receding-horizon (often “model predictive”) control methodologies that use measured data directly in controller design, update, and execution, often without an explicit parametric model. DDRHC architectures iteratively refine decision policies by incorporating new data within each receding window, and typically come with guarantees on stability and constraint satisfaction, and sometimes with quantifiable robustness against uncertainty and noise. Recent developments have extended DDRHC to both deterministic and stochastic systems, including fully distributed, robust, and safety-critical settings.

1. Foundations of Data-Driven Receding Horizon Control

DDRHC departs from classical model-based RHC/MPC by obtaining a predictive model or feasible trajectory set directly from observed system input/output data, either using a batch of historical measurements or through an online execution-driven process. Core foundational mechanisms include:

Set-membership approaches: The controller maintains and updates a polyhedral (or semidefinite) set of system matrices consistent with all observed data under bounded noise/disturbances, thereby capturing feasible system behaviors without explicit identification (Zheng et al., 7 Oct 2025, Zheng et al., 7 Oct 2025).
Fundamental lemma methods: Construct block Hankel matrices from persistently exciting data, exploiting the trajectory inclusion result of Willems et al. to parameterize all system-consistent behaviors and formulate predictive control laws (Allibhoy et al., 2020, Li et al., 2023).
Distributionally robust and stochastic frameworks: Use data-driven ambiguity sets for probabilistic system quantities (e.g., Markov transition matrices, stochastic disturbance models), and solve min-max or chance-constrained optimization problems over these sets (Schuurmans et al., 2021, Li et al., 2023).
Recursive estimation and online adaptation: Real-time updating of data dictionaries, model sets, or behavioral models using newly collected input-output measurements within each RHC iteration, leading to shrinking uncertainty and improved control performance (Zheng et al., 7 Oct 2025, Ebenbauer et al., 2020).

In all DDRHC variants, receding-horizon execution means only the first action from the optimized sequence is applied, after which data and window are shifted forward for the next control period.
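The execution pattern described above can be sketched in a few lines. This is a minimal illustration, not any cited author's algorithm: the scalar plant `x+ = a*x + b*u`, the weights `q` and `r`, and the function names `plan` and `receding_horizon` are all assumptions, and the model-based LQ planner merely stands in for whichever data-driven predictor a given DDRHC variant would use.

```python
def plan(x, horizon, a, b, q=1.0, r=0.1):
    """Finite-horizon LQ input sequence for x+ = a*x + b*u via a backward Riccati pass."""
    p, gains = q, []
    for _ in range(horizon):
        k = a * b * p / (r + b * b * p)      # stage feedback gain
        p = q + a * a * p - a * b * p * k    # Riccati recursion
        gains.append(k)
    gains.reverse()                          # gains[0] belongs to the current step
    seq = []
    for k in gains:
        u = -k * x
        seq.append(u)
        x = a * x + b * u                    # predicted next state
    return seq


def receding_horizon(x0, a=1.2, b=1.0, steps=20, horizon=5):
    """Apply only the first planned input, record the sample, then re-plan."""
    x, data = x0, []
    for _ in range(steps):
        u = plan(x, horizon, a, b)[0]        # first action of the optimized sequence
        x_next = a * x + b * u               # plant response (measured in practice)
        data.append((x, u, x_next))          # new sample joins the data set
        x = x_next                           # window shifts forward by one step
    return x, data
```

Calling `receding_horizon(1.0)` returns the final state together with the list of `(x, u, x_next)` samples, which is the kind of execution data the later sections consume.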

2. Key Methodologies and Core Algorithms

DDRHC encompasses several distinct algorithmic templates, tightly linked to the modeling and robustness requirements:

2.1 Set-Membership Polytope Controllers

Represent the set of all linear systems $(A, B)$ compatible with the execution data $\mathcal{D}_k$, i.e.,

$\mathcal{C}(\mathcal{D}_k) = \{(A, B): V(Ax_i + Bu_i - x_{i+1}) \leq 1, \forall (x_i, u_i, x_{i+1}) \in \mathcal{D}_k \}$

With this uncertainty set, the DDRHC law at time $k$ seeks a control input $u_k$ and contractivity factor $\lambda$ such that all feasible system realizations satisfy

$\psi_{\mathcal X}(A x_k + B u_k + v) \leq \lambda \, \psi_{\mathcal X}(x_k),\quad \forall v \in \mathcal V,\, (A,B) \in \mathcal{C}(\mathcal{D}_k)$

This constraint is convexified using dualization, yielding a tractable LP or SDP (Zheng et al., 7 Oct 2025, Zheng et al., 7 Oct 2025).
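To make the structure of $\mathcal{C}(\mathcal{D}_k)$ concrete, the sketch below stacks the per-sample inequalities into a single system $G\theta \leq h$ in the flattened parameters $\theta$ of $[A\ B]$, and tests whether a candidate model is consistent with the data. It assumes that $V$ is a matrix whose rows describe a polytopic disturbance bound (here a box $|v_j| \leq \bar v$); the example system, the helper names `membership_rows` and `is_consistent`, and the numerical tolerance are illustrative choices, not taken from the cited work. The dualization step that produces the LP or SDP is not shown.

```python
import numpy as np


def membership_rows(data, V):
    """Stack the inequalities G @ theta <= h that define C(D_k).

    theta is the row-major flattening of [A B]. Each sample (x, u, x_next)
    contributes V @ (A x + B u - x_next) <= 1, which is linear in theta.
    """
    G, h = [], []
    for x, u, x_next in data:
        z = np.concatenate([x, u])
        G.append(np.kron(V, z[None, :]))     # V @ [A B] @ z as a linear map of theta
        h.append(1.0 + V @ x_next)
    return np.vstack(G), np.concatenate(h)


def is_consistent(A, B, data, V, tol=1e-9):
    """True if (A, B) satisfies every data-induced inequality."""
    G, h = membership_rows(data, V)
    theta = np.hstack([A, B]).ravel()
    return bool(np.all(G @ theta <= h + tol))


# Hypothetical two-state example with box-bounded disturbance |v_j| <= v_bar.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
v_bar = 0.1
V = np.vstack([np.eye(2), -np.eye(2)]) / v_bar

data, x = [], np.array([1.0, 0.0])
for _ in range(10):
    u = rng.uniform(-1.0, 1.0, size=1)
    v = rng.uniform(-v_bar, v_bar, size=2)
    x_next = A_true @ x + B_true @ u + v
    data.append((x, u, x_next))
    x = x_next

print(is_consistent(A_true, B_true, data, V))                    # true system
print(is_consistent(A_true + 10 * np.eye(2), B_true, data, V))   # distant candidate
```

Because every inequality is linear in $\theta$, appending a sample only adds rows to $G$ and $h$, which is why the set can be maintained incrementally.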

2.2 Hankel Matrix and Fundamental Lemma Methods

Employ block Hankel matrices $H_\tau(w^d)$ constructed from measured data to encode all system-consistent trajectories. In the centralized case:

Any extension of past inputs $u_{\text{ini}}$ and states $x_{\text{ini}}$ over a horizon of length $N$ can be reproduced as

$u = U_f \alpha,\, x = X_f \alpha,$

under $\left[U_p; X_p\right] \alpha = [u_{\text{ini}}; x_{\text{ini}}]$ for some coefficient vector $\alpha$.

Distributed DDRHC uses these decompositions at each agent, enforcing local consistency constraints in a network (Allibhoy et al., 2020).
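The centralized parameterization can be illustrated with a short noise-free sketch. It builds block Hankel matrices from one recorded trajectory, splits them into past and future blocks, solves for $\alpha$ by least squares, and reads off the predicted states $X_f \alpha$. The example system, the data length, the single-step initial window, and the helper names `block_hankel`, `simulate`, and `predict` are assumptions made for illustration. Fixing the future inputs, as done here, gives pure prediction; a controller would instead optimize over them.

```python
import numpy as np


def block_hankel(w, depth):
    """Block Hankel matrix with `depth` block rows from a signal w of shape (T, dim)."""
    T, _ = w.shape
    cols = T - depth + 1
    return np.vstack([w[i:i + cols].T for i in range(depth)])


def simulate(A, B, x0, u):
    """States x_0, ..., x_{T-1} of x+ = A x + B u for an input array u of shape (T, m)."""
    xs = [x0]
    for ut in u[:-1]:
        xs.append(A @ xs[-1] + B @ ut)
    return np.array(xs)


def predict(u_d, x_d, u_ini, x_ini, u_future):
    """Predict future states from recorded data (u_d, x_d) without using (A, B)."""
    t_ini, n_f = len(u_ini), len(u_future)
    m, n = u_d.shape[1], x_d.shape[1]
    Hu = block_hankel(u_d, t_ini + n_f)
    Hx = block_hankel(x_d, t_ini + n_f)
    U_p, U_f = Hu[:m * t_ini], Hu[m * t_ini:]
    X_p, X_f = Hx[:n * t_ini], Hx[n * t_ini:]
    lhs = np.vstack([U_p, X_p, U_f])
    rhs = np.concatenate([u_ini.ravel(), x_ini.ravel(), u_future.ravel()])
    alpha, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)   # any exact solution works
    return (X_f @ alpha).reshape(n_f, n)


# Hypothetical data-generating system, used only to record the trajectory.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
rng = np.random.default_rng(1)
u_d = rng.uniform(-1.0, 1.0, size=(40, 1))      # random, hence exciting, input
x_d = simulate(A, B, np.array([0.5, -0.5]), u_d)
```

With these data, `predict(u_d, x_d, u_ini, x_ini, u_future)` should agree with a direct simulation of the same inputs up to numerical precision, provided the recorded input is sufficiently exciting for the chosen window length.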

2.3 Stochastic and Chance-Constrained Data-Driven RHC

For unknown stochastic LTI models with measurement and process noise, DDRHC reconstructs data-driven state-space models through Hankel-based least-squares, leading to auxiliary models $(\mathbf{A}, \mathbf{B}, \mathbf{C})$ (Li et al., 2023). Kalman filtering and affine disturbance-feedback policies are implemented as in standard SMPC, but all filter and controller matrices are constructed from data. Chance constraints are imposed on predicted input and output trajectories:

$\mathbb{P}(E^{\mathrm{u}} u_t \leq f^{\mathrm{u}}) \geq 1-p^{\mathrm{u}}, \quad \mathbb{P}(E^{\mathrm{y}} y_t \leq f^{\mathrm{y}}) \geq 1-p^{\mathrm{y}}$

with closed-form risk allocation.
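One common way to turn such chance constraints into deterministic ones, shown here as an assumption rather than as the cited method, is Gaussian constraint tightening with a uniform split of the risk budget across constraint rows (Boole's inequality). The function name `tightened_bounds` and the example numbers are hypothetical; the cited work's own risk allocation may differ.

```python
import numpy as np
from statistics import NormalDist


def tightened_bounds(E, f, Sigma, p):
    """Deterministic surrogate of P(E y <= f) >= 1 - p for Gaussian y.

    Returns f_tight such that E @ mean <= f_tight implies the joint chance
    constraint, using Boole's inequality with uniform risk p / rows per row.
    """
    rows = E.shape[0]
    z = NormalDist().inv_cdf(1.0 - p / rows)            # per-row Gaussian quantile
    std = np.sqrt(np.einsum("ij,jk,ik->i", E, Sigma, E))  # sqrt(e_i^T Sigma e_i)
    return f - z * std


# Hypothetical scalar output with |y| <= 1, variance 0.04, total risk 10 %.
E = np.array([[1.0], [-1.0]])
f = np.array([1.0, 1.0])
Sigma = np.array([[0.04]])
f_tight = tightened_bounds(E, f, Sigma, p=0.1)
```

The tightened bounds are then imposed on the predicted mean trajectory, so the optimization problem itself stays deterministic.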

2.4 Recursive Estimation and Receding-Horizon Learning

Recursive estimation schemes combine proximity-based system identification with receding-horizon QP or SDP control, iteratively refining predictors and minimizing worst-case or expected costs using only the last $N$ samples (Ebenbauer et al., 2020).
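The "last $N$ samples" idea can be sketched with a sliding-window estimator. As a simplification, the sketch uses plain batch least squares over the window rather than the proximity-based identification of the cited work, and the class name `SlidingWindowEstimator` and its interface are hypothetical.

```python
from collections import deque

import numpy as np


class SlidingWindowEstimator:
    """Least-squares estimate of (A, B) from the most recent `window` samples."""

    def __init__(self, n, window):
        self.n = n                               # state dimension
        self.buffer = deque(maxlen=window)       # older samples drop out automatically

    def update(self, x, u, x_next):
        self.buffer.append((np.concatenate([x, u]), x_next))

    def estimate(self):
        Z = np.array([z for z, _ in self.buffer])       # regressors [x; u], one per row
        Y = np.array([y for _, y in self.buffer])       # successor states, one per row
        theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # Y ~= Z @ [A B]^T
        AB = theta.T
        return AB[:, :self.n], AB[:, self.n:]
```

In a receding-horizon loop, `update` would be called once per control period and `estimate` would refresh the predictor used by the QP or SDP at the next step.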

2.5 Data-Driven Estimation and Robust Control for Switched and Uncertain Systems

For Markov jump linear systems (MJLS) with unknown modes and transition probabilities, a receding-horizon estimator identifies consistent mode-sequences and state trajectories online, while ambiguity sets $\mathcal P_{t,i}$ for Markov rows are recursively updated, leading to robust LMIs for mode-dependent stabilizing feedback (Schuurmans et al., 2021).
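A minimal sketch of how row-wise ambiguity sets might be maintained is given below. It assumes the mode sequence has already been estimated, and it uses an $\ell_1$ ball around the empirical transition row with a radius from a Weissman-type concentration bound at confidence $1-\beta$. That particular choice of set, the function name `transition_ambiguity`, and the default `beta` are assumptions for illustration and need not match the construction of $\mathcal P_{t,i}$ in the cited work.

```python
import math

import numpy as np


def transition_ambiguity(modes, num_modes, beta=0.05):
    """Empirical transition rows and L1 radii from an (estimated) mode sequence.

    Requires num_modes >= 2. Rows never visited keep a uniform centre and the
    maximal radius 2, i.e. the whole probability simplex.
    """
    counts = np.zeros((num_modes, num_modes))
    for i, j in zip(modes[:-1], modes[1:]):
        counts[i, j] += 1
    visits = counts.sum(axis=1)
    p_hat = np.full((num_modes, num_modes), 1.0 / num_modes)
    radius = np.full(num_modes, 2.0)         # L1 distance between distributions is at most 2
    for i in range(num_modes):
        if visits[i] > 0:
            p_hat[i] = counts[i] / visits[i]
            r = math.sqrt(2.0 * math.log((2 ** num_modes - 2) / beta) / visits[i])
            radius[i] = min(2.0, r)
    return p_hat, radius
```

The radius of each row decays like the inverse square root of the number of visits to that mode, so the robust LMIs are solved over progressively smaller sets as data accumulate.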

3. Robustness, Stability, and Theoretical Guarantees

Modern DDRHC frameworks provide rigorous robust stability and feasibility results that hold uniformly for every compatible model (or distribution) given the data:

Set-invariance and robustness: Provided the controlled-invariant set $\mathcal{X}_I$ is contractive for all $(A,B)$ in the membership polytope, recursive feasibility and robust ultimate boundedness are guaranteed; the closed-loop state contracts toward $\lambda^* \mathcal{X}_I$ for some $\lambda^* < 1$ (Zheng et al., 7 Oct 2025).
Stochastic and chance-constrained guarantees: Under idealized conditions, the data-driven SMPC is nominally equivalent to its model-based counterpart, including state and input constraint satisfaction with desired risk levels (Li et al., 2023).
Online uncertainty reduction: Each new data sample can only shrink (never enlarge) the feasible set $\mathcal{C}$, tightening the worst-case cost bounds and typically improving closed-loop performance (Zheng et al., 7 Oct 2025, Zheng et al., 7 Oct 2025).
Regret and learning: While specific statistical regret guarantees are developed in the meta-learning community, the central point is that closed-loop cost or regret improves as compatible model sets are pruned over time; see Muthirayan et al. (2020) for associated guarantees.
Safety and infeasibility handling: In safety-prioritized DDRHC, such as mixed traffic or platoon control, hard safety constraints are enforced and always take precedence, ensuring feasibility with only mild assumptions on the environment (Mahbub et al., 2022).
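The online uncertainty reduction property listed above is easy to see in a scalar example. The sketch below assumes a system $x^+ = a x + b u + v$ with $|v| \leq \bar v$, a known input gain $b$, and a prior interval for $a$; all of these, along with the names `update_interval` and `run_demo`, are illustrative assumptions. Each sample intersects the current interval with the set of values of $a$ that explain it, so the width never increases.

```python
import numpy as np


def update_interval(lo, hi, x, u, x_next, b, v_bar):
    """Intersect [lo, hi] with the values of a satisfying |x_next - a*x - b*u| <= v_bar."""
    if x == 0.0:
        return lo, hi                        # sample carries no information about a
    centre = (x_next - b * u) / x
    half = v_bar / abs(x)
    return max(lo, centre - half), min(hi, centre + half)


def run_demo(a_true=0.8, b=1.0, v_bar=0.1, steps=30, seed=0):
    """Simulate the scalar system and track the width of the consistent interval."""
    rng = np.random.default_rng(seed)
    lo, hi, x = -5.0, 5.0, 1.0               # assumed prior interval and initial state
    widths = []
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)
        v = rng.uniform(-v_bar, v_bar)
        x_next = a_true * x + b * u + v
        lo, hi = update_interval(lo, hi, x, u, x_next, b, v_bar)
        widths.append(hi - lo)
        x = x_next
    return lo, hi, widths
```

Samples with a large state magnitude relative to the disturbance bound are the most informative, which hints at why excitation matters for these methods.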

4. Computational Approaches and Practical Implementation

DDRHC protocols are tailored for tractability:

Polytope- and dual-based LPs/SDPs: Membership polytopes typically yield tractable LPs in low dimensions; input-constraint or LQR problems use robust SDP relaxations with dual or sum-of-squares multipliers, scaling polynomially in the polytope facets and degree (Zheng et al., 7 Oct 2025).
Recursive updating: Consistency sets are incrementally shrunk as data accrue; per step, this may require only a limited number of least-squares projections and robust LMI or LP solves (Zheng et al., 7 Oct 2025, Schuurmans et al., 2021).
Distributed algorithms: Primal-dual flows and consensus schemes decompose centralized problems into local updates requiring only neighbor communication (Allibhoy et al., 2020).
Data-handling: Data dictionaries are updated online; sliding or growing-horizon implementations control complexity and enable adaptation (Ebenbauer et al., 2020).
Monte Carlo and path-integral methods: In nonlinear or diffusion systems, Monte Carlo approximations of the Feynman-Kac representation enable control synthesis directly from sample-driven path integrals (Bertoli et al., 2015).
Demand and mobility models: Large-scale urban DDRHC, as in taxi dispatch, leverages real-time data assimilation and tractable convex programs, e.g., for assignment and balancing in networks with stochastic demands (Miao et al., 2016).

5. Representative Applications

DDRHC frameworks are deployed in a variety of settings:

LTI and networked systems: Distributed stabilization, output tracking, and robust LQR control under full or partial-state feedback (Allibhoy et al., 2020, Zheng et al., 7 Oct 2025).
Stochastic and partially observed systems: Output-tracking under process/measurement noise and uncertain initial conditions, with equivalence to classical SMPC (Li et al., 2023).
MJLS and hybrid systems: Robust stabilization when mode transitions are hidden and Markov probabilities are unknown, with online mode and distribution estimation (Schuurmans et al., 2021).
Mobility and logistics: Taxi dispatching in large urban environments, minimizing idle driving and service mismatch via real-time demand/mobility inference (Miao et al., 2016).
Autonomous transportation: Platoon formation involving both automated and human-driven vehicles, with RLS-based estimation of human agent dynamics and safety-prioritized MPC (Mahbub et al., 2022).
Nonlinear stochastic control: Sample-based DDRHC for controlled diffusions using path-integral representations to bypass high-dimensional HJB solution (Bertoli et al., 2015).

6. Strengths, Limitations, and Research Directions

DDRHC provides an explicit, data-driven pathway to robust and adaptive RHC, with several notable advantages:

Model-independence: Directly leverages available data without explicit system identification or parametric uncertainty modeling.
Robust constraint handling: Guaranteed satisfaction of state, input, and safety constraints across all system realizations consistent with observed data.
Online learning: Each new trajectory sample is assimilated, shrinking the robust uncertainty set and typically improving cost performance and contractivity.
Distributed and scalable implementations: Enables decentralized and scalable control of large-scale networked systems.

However, limitations and open challenges include:

Scaling to high-dimensional systems: Robust LP/SDP complexity grows with the number of polytope facets and degree of sum-of-squares constraints (Zheng et al., 7 Oct 2025, Zheng et al., 7 Oct 2025).
Conservatism under severe noise or unmodeled dynamics: Set-membership bounds may be conservative if the disturbance or noise is large and data are limited.
Persistent excitation requirements: Hankel-based and some set-membership methods require persistently exciting data for full behavior characterization.
Integration with nonlinear and hybrid dynamics: While Monte Carlo methods and finite-memory observers are effective, routine robustification for nonlinear, switched, or non-stationary systems remains an active area (Bertoli et al., 2015, Schuurmans et al., 2021).
Safety-prioritized infeasibility management: Ensuring feasibility under hard safety constraints or in highly uncertain, adversarial environments is complex and demands further research (Mahbub et al., 2022).

DDRHC continues to evolve, with ongoing work on scalable robustification, reduced conservatism, automated excitation design, and rigorous learning-convergence analysis across diverse dynamical regimes.