Relaxed Terminal-Cost Formulation

Updated 2 December 2025

Relaxed Terminal-Cost Formulation is a control paradigm that replaces strict terminal constraints with soft penalties or probabilistic enforcement to achieve stability and flexibility.
It employs scenario-based synthesis and learning techniques to reduce online computational complexity while maintaining robust, quantifiable performance in MPC.
The approach enhances multiobjective and stochastic control by leveraging metrics like Wasserstein and Hilbert–Schmidt distances to balance control effort and terminal performance.

A relaxed terminal-cost formulation is an optimal control or predictive control paradigm in which the traditional hard requirement, namely an exact terminal state (or distribution) or a terminal Lyapunov property, is relaxed in one of two ways. Either it is replaced by a soft or probabilistic penalty, or descent and stability conditions are imposed on a broad class of approximators at only a subset of states, with statistical guarantees covering the rest. This approach increases flexibility, reduces online computational complexity, and is central to modern data-driven, learning-based, and multiobjective model predictive control (MPC) methods. Recent research spans settings from nonlinear MPC and covariance steering to adversarial differential games and stochastic network routing.

1. Conceptual Foundation and Definitions

Classic MPC and optimal control formulations achieve closed-loop stability and performance by explicitly enforcing a terminal constraint or terminal cost—often a control Lyapunov function (CLF) or a fixed final state distribution. In a relaxed terminal-cost approach, this requirement is softened:

Soft penalty: Rather than enforcing $x_N = x_{\mathrm{target}}$ or $\Sigma_N = \Sigma_{\mathrm{target}}$ exactly, a penalty term such as $W_2^2$ (Wasserstein), Hilbert–Schmidt, or Gromov–Wasserstein distance is added to the cost functional, allowing terminal deviations at finite cost.
Probabilistic or scenario-based enforcement: The stability/descent condition on the terminal cost is imposed only on a finite set of sampled states, with scenario-theoretic guarantees covering the rest of the state space.
Relaxed Lyapunov/CLF conditions: The terminal cost function is not required to be a global Lyapunov function, but must satisfy a decrease property on a statistically significant set of states or under specific policies.

A canonical discrete-time relaxed terminal-cost MPC problem takes the form (following (Baltussen et al., 7 Aug 2025)):

$\begin{aligned} &\min_{u_{0|t},\ldots,u_{N-1|t}} \sum_{k=0}^{N-1}\ell(x_{k|t},u_{k|t}) + V_{\text{term}}(x_{N|t}) \\ &\text{subject to: } x_{k+1|t} = f(x_{k|t},u_{k|t}), \; x_{k|t}\in\mathcal{X},\, u_{k|t}\in\mathcal{U}. \end{aligned}$

with $V_{\text{term}}$ learned or designed via a relaxed formulation.
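As a minimal sketch of how the objective above is evaluated for one candidate input sequence, the following Python function rolls the model forward, sums the stage costs, and adds the terminal cost. The scalar dynamics, stage cost, and quadratic `v_term` are hypothetical placeholders chosen only for illustration, and the state and input constraints are omitted for brevity.

```python
def mpc_objective(x0, controls, f, stage_cost, terminal_cost):
    """Sum of stage costs along the predicted trajectory plus V_term at the end.

    Constraints on states and inputs are not checked in this sketch.
    """
    x = x0
    total = 0.0
    for u in controls:
        total += stage_cost(x, u)   # ell(x_k, u_k)
        x = f(x, u)                 # x_{k+1} = f(x_k, u_k)
    return total + terminal_cost(x)  # + V_term(x_N)


# Hypothetical scalar example (not taken from the cited work).
f = lambda x, u: 0.9 * x + u
stage = lambda x, u: x ** 2 + u ** 2
v_term = lambda x: 2.0 * x ** 2     # stand-in for a learned terminal cost

cost = mpc_objective(1.0, [0.0, 0.0], f, stage, v_term)
print(round(cost, 4))
```

An optimizer would minimize this quantity over the input sequence; the relaxed formulations discussed here change only how `v_term` is obtained, not the structure of the objective.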

2. Scenario-Based Terminal Cost Synthesis

A key methodology for developing relaxed terminal costs in nonlinear MPC is the scenario approach (Baltussen et al., 7 Aug 2025). Here, the terminal cost is parameterized in a function class (e.g., linear in basis functions):

$\widehat{V}(x;\theta) = \theta^\top \phi(x).$

Instead of enforcing the Lyapunov decrease condition

$\widehat{V}(f(x,u^*(x));\theta) - \widehat{V}(x;\theta) \leq -\ell(x,u^*(x)) \qquad \forall x \in \mathcal{X},$

at every state, the condition is imposed only on $M$ states $\{x^{(i)}\}$ sampled i.i.d. from a distribution on $\mathcal{X}$, with $u^{(i)}$ the input applied at $x^{(i)}$:

$\widehat{V}(f(x^{(i)},u^{(i)});\theta) - \widehat{V}(x^{(i)};\theta) \leq -\ell(x^{(i)},u^{(i)}), \quad i=1,\ldots,M.$

The scenario theorem (Campi–Garatti) ensures that, with probability at least $1-\beta$ over the drawn samples, the probability measure of the violating states is no greater than $\epsilon$, provided $M$ is chosen large enough as a function of the parameter dimension $d$, the violation level $\epsilon$, and the confidence level $\beta$:

$\Pr^M\left\{\Pr_{x\sim\mathcal{X}}\left\{\text{descent holds at } x\right\} \ge 1-\epsilon\right\}\ge 1-\beta.$

With confidence at least $1-\beta$, the closed-loop MPC therefore inherits a relaxed Lyapunov property: the value function decreases everywhere except on a set of states whose measure is at most $\epsilon$ (Baltussen et al., 7 Aug 2025).
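The sketch below illustrates two pieces of the scenario procedure under stated assumptions. The sample-size rule $M \ge \tfrac{2}{\epsilon}\left(\ln\tfrac{1}{\beta} + d\right)$ is one commonly quoted sufficient condition for convex scenario programs; the text above does not say which bound the cited work uses, so treat it as an assumption. The function `violation_rate` only checks the sampled descent condition for a given $\theta$ and does not perform the synthesis itself. The scalar closed loop, stage cost, and basis `phi` are hypothetical.

```python
import math
import random


def scenario_sample_size(eps, beta, d):
    """Assumed sufficient sample count: M >= (2/eps) * (ln(1/beta) + d)."""
    return math.ceil(2.0 / eps * (math.log(1.0 / beta) + d))


def violation_rate(theta, samples, step, stage, phi):
    """Fraction of samples where V(f(x,u)) - V(x) <= -ell(x,u) fails."""
    def v_hat(x):
        # V_hat(x; theta) = theta^T phi(x)
        return sum(t * p for t, p in zip(theta, phi(x)))

    violated = sum(
        1 for x in samples
        if v_hat(step(x)) - v_hat(x) > -stage(x) + 1e-12
    )
    return violated / len(samples)


# Hypothetical closed loop x+ = 0.5 x with stage cost x^2 and basis phi(x) = [x^2].
step = lambda x: 0.5 * x
stage = lambda x: x ** 2
phi = lambda x: [x ** 2]

M = scenario_sample_size(eps=0.05, beta=1e-6, d=3)
rng = random.Random(0)
samples = [rng.uniform(-1.0, 1.0) for _ in range(M)]

print(M)
print(violation_rate([1.5], samples, step, stage, phi))
print(violation_rate([1.0], samples, step, stage, phi))
```

For this toy system the descent condition reduces to $\theta \ge 4/3$, so $\theta = 1.5$ satisfies it at every sample while $\theta = 1.0$ fails at essentially all of them. In a full synthesis, $\theta$ would be the solution of an optimization problem with the sampled inequalities as constraints.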

3. Relaxed Terminal Cost in Multiobjective and Stochastic Control

Relaxed terminal-cost approaches appear in advanced multiobjective and stochastic problems:

Multiobjective Nonlinear MPC: Imposing a compatible terminal cost and dissipativity only for a single “master” objective suffices for asymptotic stability and performance bounds for all objectives. The remaining objectives need no terminal costs and float freely as long as the master’s criterion does not worsen (Eichfelder et al., 2022).
Covariance Steering with Wasserstein or Gromov–Wasserstein Terminal Costs: Classical covariance steering enforces hard constraints on terminal distributions. Relaxed formulations replace this with soft penalties—typically squared Wasserstein (Balci et al., 2022) or Gromov–Wasserstein (Morimoto et al., 2024) distances—yielding convex (or difference-of-convex, DC) programs. These formulations allow for minimum control effort solutions and enable matching the “shape” of the terminal distribution up to isometry, which is critical for swarming and formation applications (Morimoto et al., 2024).
Continuous-Time LQ Covariance Steering with Hilbert–Schmidt Terminal Cost: Here, the penalty is the squared Frobenius norm of the difference between the achieved and desired final covariances, leading to a tractable matrix-valued boundary-value problem. The penalty parameter provides a trade-off: as its weight increases, the solution approaches the hard terminal constraint (Sial et al., 24 Oct 2025).
Relaxed Schrödinger Bridges: In network routing or entropy-optimal control, enforcing a soft terminal cost via relative entropy (KL divergence) or maximum entropy formulations yields iterative algorithms with provable contraction and fast convergence, as developed in (Chen et al., 2018).
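To make two of the terminal penalties above concrete, the following sketch evaluates them for small Gaussian examples. It assumes diagonal covariances for the Wasserstein term, where the squared 2-Wasserstein distance reduces to the squared mean difference plus the squared difference of per-axis standard deviations; the general case needs a matrix square root and is not shown. Function names and numbers are illustrative only.

```python
def w2_squared_diag(mean_a, std_a, mean_b, std_b):
    """Squared 2-Wasserstein distance between two Gaussians with diagonal
    covariances, given per-axis means and standard deviations."""
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    spread_term = sum((sa - sb) ** 2 for sa, sb in zip(std_a, std_b))
    return mean_term + spread_term


def hilbert_schmidt_squared(cov_a, cov_b):
    """Squared Frobenius norm of cov_a - cov_b (matrices as nested lists)."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(cov_a, cov_b)
        for a, b in zip(row_a, row_b)
    )


# Achieved terminal distribution vs. desired one (hypothetical values).
w2 = w2_squared_diag([0.0, 0.0], [1.0, 2.0], [1.0, 0.0], [1.0, 1.0])
hs = hilbert_schmidt_squared([[1.0, 0.0], [0.0, 4.0]],
                             [[1.0, 0.0], [0.0, 1.0]])
print(w2, hs)
```

Either quantity, scaled by a penalty weight, would be added to the control-effort cost in place of an exact terminal-distribution constraint. Note that the Hilbert–Schmidt term compares covariances only, whereas the Wasserstein term also accounts for the means.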

4. Learning-Based and Adaptive Terminal Costs

Model predictive control schemes increasingly incorporate learning of the terminal cost, using offline value function approximation (e.g., approximate dynamic programming, value iteration) or online adaptation:

Learning via Value Iteration: The terminal cost is fit (e.g., by regression) to an approximate cost-to-go over a sampled state-action dataset, imposing a Lyapunov-like decrease only on the data, and using tolerance bounds to guarantee closed-loop stability for sufficiently large prediction horizons (Moreno-Mora et al., 2022).
Online Adaptive Terminal Cost: The terminal cost may be updated at runtime using adaptive dynamic programming, continually recalibrating the one-step decrease condition and improving the estimated suboptimality index for the closed-loop cost (Beckenbach et al., 2021).
Relaxed CLF via Finite-Tail Cost: The finite-tail cost method replaces the standard terminal CLF with the cost accumulated under a locally stabilizing feedback over a short tail horizon, yielding significant reductions in the required open-loop horizon and computational effort, proven to maintain stability and performance under realistic assumptions (Köhler et al., 2021).
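The finite-tail idea in the last item can be sketched as follows: the terminal cost at a state is the stage cost accumulated while a local feedback law is simulated for a fixed number of tail steps. This is an illustrative reading of the description above, not the cited authors' implementation; the scalar model and the feedback gain `kappa` are hypothetical.

```python
def finite_tail_cost(x, tail_steps, f, kappa, stage_cost):
    """Cost accumulated over `tail_steps` steps under the local feedback kappa."""
    total = 0.0
    for _ in range(tail_steps):
        u = kappa(x)                # locally stabilizing feedback
        total += stage_cost(x, u)
        x = f(x, u)
    return total


# Hypothetical integrator x+ = x + u with feedback u = -0.5 x.
f = lambda x, u: x + u
kappa = lambda x: -0.5 * x
stage = lambda x, u: x ** 2 + u ** 2

v3 = finite_tail_cost(1.0, 3, f, kappa, stage)
v20 = finite_tail_cost(1.0, 20, f, kappa, stage)
print(v3, v20)
```

With nonnegative stage costs the tail cost grows with the tail length and, in this example, approaches the infinite-horizon cost of the feedback law (here $5/3$). The tail is evaluated by simulation only, so lengthening it adds no decision variables to the online optimization.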

5. Theoretical Guarantees and Trade-Offs

Relaxed terminal-cost approaches are accompanied by probabilistic, analytical, or empirical performance and stability certificates:

Probabilistic Descent: Scenario-based synthesis provides explicit $(\epsilon,\beta)$ guarantees: violations of the Lyapunov decrease occur in at most an $\epsilon$-fraction of the state space with probability at least $1-\beta$ (Baltussen et al., 7 Aug 2025).
Suboptimality Bounds: The predicted closed-loop infinite-horizon cost is at most $1/\alpha$ times the theoretical optimum, with $\alpha$ quantifiable from the relaxation parameters and learning error (Moreno-Mora et al., 2022, Beckenbach et al., 2021).
Performance vs. Horizon Length: An improved (possibly learned) terminal cost allows a shorter prediction horizon, which reduces online complexity; the horizon can often be reduced by an order of magnitude without sacrificing closed-loop performance. Accuracy and robustness then hinge on the fidelity of the approximation and the measure of the violation set (Baltussen et al., 7 Aug 2025).
Soft Penalty Trade-Offs: The terminal penalty weight (e.g., in Wasserstein, Hilbert–Schmidt, or GW cost) controls the balance between tracking the terminal constraint and minimizing control effort. As the penalty weight rises, the relaxation approaches the hard constraint at the expense of increased effort/cost (Sial et al., 24 Oct 2025, Morimoto et al., 2024).
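The soft-penalty trade-off can be seen in closed form on a deliberately simple, deterministic toy problem (an assumption made for illustration, not one of the cited covariance-steering settings): minimize $\sum_k u_k^2 + \lambda (x_N - x_{\mathrm{target}})^2$ for the integrator $x_{k+1} = x_k + u_k$. The optimal inputs are equal, and the terminal error is $(x_{\mathrm{target}} - x_0)/(1 + \lambda N)$.

```python
def soft_terminal_solution(x0, x_target, horizon, weight):
    """Closed-form minimizer of sum(u_k^2) + weight * (x_N - x_target)^2
    for x_{k+1} = x_k + u_k. Returns (input, terminal_error, effort)."""
    gap = x_target - x0
    total_input = weight * horizon * gap / (1.0 + weight * horizon)
    u = total_input / horizon          # equal inputs are optimal
    terminal_error = gap - total_input
    effort = horizon * u ** 2
    return u, terminal_error, effort


# Sweep the penalty weight (hypothetical values).
sweep = [soft_terminal_solution(0.0, 1.0, 4, w) for w in (0.1, 1.0, 10.0, 100.0)]
for u, err, effort in sweep:
    print(f"u={u:.4f}  terminal error={err:.4f}  effort={effort:.4f}")
```

As the weight grows, the terminal error shrinks toward zero while the control effort rises toward the value required by the hard constraint (here $1/N = 0.25$), which mirrors the behavior described in the item above.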

6. Applications and Case Studies

Relaxed terminal-cost formulations have demonstrable impact in diverse advanced applications:

Nonlinear Chemical Process Control (CSTR): Relaxed, scenario-based terminal-cost MPC matches long-horizon expert MPC performance with online solve times reduced by nearly an order of magnitude, and the observed rate of descent violations is consistent with the prescribed $\epsilon$ (Baltussen et al., 7 Aug 2025, Eichfelder et al., 2022).
Four-Tank Benchmark: Finite-tail MPC achieves stability at far shorter combined horizon than classical methods, supported by theoretical and simulation evidence (Köhler et al., 2021).
Network Flow and Routing: Schrödinger bridge with relaxed terminal cost supports robust routing under network uncertainties and admits scalable fixed-point algorithms with guaranteed convergence (Chen et al., 2018).
Formation and Density Steering in Robotics: Relaxed terminal costs using the GW distance enable direct specification of desired shapes for swarm distributions, invariant up to rotation and translation, permitting efficient energy use (Morimoto et al., 2024).
Pursuit–Evasion Games: The generalization to locally Lipschitz terminal costs with relaxed convexity yields mathematically universal viscosity solution results, enabling value computation via tractable mathematical programming even with nonconvex targets (Huang et al., 31 Oct 2025).

7. Summary

Relaxed terminal-cost formulations trade rigid terminal constraints for tractable, data-driven, and flexible frameworks with quantifiable guarantees. They reduce horizon lengths, computational complexity, and design conservatism, with results established for nonlinear MPC, covariance steering, multiobjective control, stochastic optimal transport, and adversarial games. The theoretical foundation leverages scenario optimization, probabilistic analysis, convex relaxations, and Lyapunov-based inequalities. These formulations underpin many emerging applications in modern control, learning-based robotics, and distributed systems (Baltussen et al., 7 Aug 2025, Eichfelder et al., 2022, Morimoto et al., 2024, Moreno-Mora et al., 2022, Köhler et al., 2021, Sial et al., 24 Oct 2025, Huang et al., 31 Oct 2025, Balci et al., 2022, Beckenbach et al., 2021, Chen et al., 2018).