
Exit Time Relaxed Control Problems

Updated 19 January 2026
  • Exit time relaxed control problems are defined by optimizing a cost functional up to a system exit, using measure-valued relaxed controls to guarantee solution existence under nonconvex and singular conditions.
  • They leverage Hamilton–Jacobi–Bellman equations and Pontryagin-type maximum principles to derive robust feedback laws for both deterministic and stochastic dynamical systems.
  • Key applications include stabilization, safety verification, and reach-avoid games, with numerical approaches like direct HJB solvers and symbolic abstractions mitigating the curse of dimensionality.

Exit time relaxed control problems concern the synthesis, analysis, and numerical approximation of feedback control strategies for controlled dynamical systems, where the objective is to optimize a cost functional up to the (system-dependent) time at which the state exits a prescribed set. The introduction of relaxed controls—controls valued in probability measures over the admissible control set—allows for the existence of optimal solutions and provides analytic and numerical advantages in the presence of nonconvexity, singularities, and regularity issues. Exit time problems arise naturally in stabilization, safety verification, safe exploration, minimum-time synthesis, reach-avoid games, and stochastic reachability.

1. Mathematical Formulation of Exit Time Relaxed Control Problems

Consider a controlled system

$$\dot x(t) = f(x(t), u(t)),\quad x(0)=x_0\in G\subset\mathbb{R}^n,\quad u(\cdot)\in \mathcal{U} := L^\infty_{\rm loc}([0,\infty), U),$$

where $U\subset\mathbb{R}^m$ is compact and $f:G\times U\to\mathbb{R}^n$ is continuous, locally Lipschitz in $x$, and convex in $u$ (Yegorov et al., 2019). The first time the state reaches a closed target set $Q\subset\mathbb{R}^n$ (or, more generally, a sublevel set of a candidate Lyapunov function), equivalently the exit time from its complement, is

$$T_{Q}(x_0, u(\cdot)) = \inf\{T\ge 0 : x(T; x_0, u(\cdot)) \in Q\}.$$

The value function for the exit-time optimal control problem is

$$V(x_0) = \inf_{u(\cdot)\,:\,T_Q(x_0, u) < \infty} \left\{ \int_0^{T_Q} g(x(t), u(t))\,dt + G(x(T_Q)) \right\},$$

where $g$ and $G$ are the running and terminal costs.
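As a concrete illustration of these definitions, the following sketch integrates a one-dimensional system under a fixed control signal and records the exit time $T_Q$ and the accumulated cost. The dynamics `f`, costs `g` and `G`, and the target radius are illustrative assumptions, not taken from the cited papers.

```python
# Toy sketch (illustrative, not from the cited papers): estimate the exit
# time T_Q and the cost functional for one control signal on a 1-D system
# xdot = f(x, u) via forward Euler. The target set is Q = {|x| <= r}.

def f(x, u):          # dynamics (illustrative): unstable drift + control
    return x + u

def g(x, u):          # running cost (illustrative)
    return x**2 + 0.1 * u**2

def G(x):             # terminal cost on reaching Q (illustrative)
    return 0.0

def exit_time_cost(x0, u_of_t, r=0.05, dt=1e-3, t_max=10.0):
    """Integrate until the state enters Q = {|x| <= r}; return (T_Q, cost)."""
    x, t, cost = x0, 0.0, 0.0
    while t < t_max:
        if abs(x) <= r:                 # trajectory reached the target set Q
            return t, cost + G(x)
        u = u_of_t(t, x)
        cost += g(x, u) * dt            # accumulate running cost
        x += f(x, u) * dt               # forward Euler step
        t += dt
    return float("inf"), float("inf")   # Q never reached: T_Q = +infinity

# The stabilizing feedback u = -2x gives xdot = -x, so T_Q is finite;
# the zero control leaves the unstable drift, so T_Q = +infinity.
T, J = exit_time_cost(1.0, lambda t, x: -2.0 * x)
```

Note that the infimum in the value function ranges only over controls with finite exit time; the zero control above is simply excluded from that set.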

In the stochastic setting, the system dynamics are given by

$$dX_t = b(X_t, \alpha_t)\,dt + \sigma(X_t, \alpha_t)\,dW_t,$$

with exit time $\tau := \inf\{t \ge 0 : X_t \notin O\}$ for a bounded open domain $O$, and the objective

$$J(x, \alpha) = \mathbb{E}\left[\int_0^{\tau} \Gamma_s f(X_s, \alpha_s)\,ds + \Gamma_\tau g(X_\tau)\right],$$

where $\Gamma_s$ denotes an optional discount factor (Reisinger et al., 2020).

Relaxed controls are measure-valued processes $\lambda_t$ taking values in the set of probability measures on $U$ (the probability simplex when $U$ is finite), replacing deterministic actions by randomized policies. The (averaged) controlled dynamics become

$$\dot x(t) = \int_U f(x(t), v) \,\lambda_t(dv).$$

This extension ensures the existence of optimal controls and, under convexity, recovers the classical (non-relaxed) formulation (Lou et al., 2013).
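The averaged dynamics can be made concrete with the classical chattering example, $\dot x = u$ with $U=\{-1,+1\}$ (an illustrative system, not from the cited papers): no pure control holds the state still, but the uniform relaxed control does, and fast switching between the pure controls approximates the relaxed trajectory.

```python
import numpy as np

# Sketch (illustrative): with U = {-1, +1} and dynamics xdot = u, the relaxed
# control lambda = (1/2, 1/2) averages the vector field to 0, producing an
# equilibrium that no single pure control can realize pointwise.

U = np.array([-1.0, 1.0])

def f(x, v):
    return v                          # velocities are the control values

def relaxed_rhs(x, lam):
    """Averaged dynamics: integrate f(x, .) against the measure lam on U."""
    return sum(l * f(x, v) for l, v in zip(lam, U))

lam = np.array([0.5, 0.5])            # uniform relaxed control
rhs = relaxed_rhs(0.0, lam)           # = 0: equilibrium under relaxation

# The same average arises as the limit of fast chattering between -1 and +1:
dt, x = 1e-3, 0.0
for k in range(10_000):
    x += f(x, U[k % 2]) * dt          # alternate pure controls every step
# x stays within one Euler step of 0, tracking the relaxed trajectory.
```

This is exactly the mechanism by which relaxed minimizing sequences of ordinary controls converge to a measure-valued limit when the velocity set is nonconvex.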

2. Analysis: Hamilton–Jacobi–Bellman Equations and Pontryagin-Type Conditions

The value function for exit-time problems is characterized as the unique viscosity solution of a Hamilton–Jacobi–Bellman (HJB) PDE with boundary conditions on the exit set. In deterministic continuous-time settings, for $x \notin Q$, the HJB equation is

$$\min_{u\in U} \{ \langle DV(x), f(x, u) \rangle + g(x, u) \} = 0, \qquad V|_{\partial Q} = G$$

(Yegorov et al., 2019). For stochastic controlled diffusions, the HJB incorporates the second-order operator:
$$0 = \max_{\lambda \in \Delta_K} \Big\{ L^\lambda u(x) + \sum_k f(x, k)\,\lambda_k - \rho(\lambda) \Big\}$$
with a boundary condition on $\partial O$ (Reisinger et al., 2020). Here $L^\lambda$ is the infinitesimal generator averaged over the relaxed control and $\rho$ is a convex regularization (e.g., entropy) penalty.
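The deterministic HJB can be illustrated on a toy minimum-time problem whose exact value function is known in closed form; the grid, dynamics, and costs below are illustrative assumptions, and the scheme is a basic semi-Lagrangian value iteration rather than the methods used in the cited papers.

```python
import numpy as np

# Sketch (illustrative): semi-Lagrangian value iteration for the exit-time HJB
#   min_u { <DV, f(x,u)> + g(x,u) } = 0,  V = G on the target boundary,
# on the 1-D minimum-time problem xdot = u, u in {-1, +1}, g = 1, G = 0,
# target Q = {0}, domain [0, 1]. The exact value function is V(x) = x.

h = 0.01
xs = np.arange(0.0, 1.0 + h / 2, h)
V = np.full_like(xs, 1e6)             # large initial guess
V[0] = 0.0                            # boundary condition V = G = 0 on Q

for _ in range(2 * len(xs)):          # fixed-point (value) iteration
    W = V.copy()
    for i in range(1, len(xs)):
        left = h + V[i - 1]           # apply u = -1 for time h
        right = h + (V[i + 1] if i + 1 < len(xs) else np.inf)  # u = +1
        W[i] = min(left, right)       # Bellman update: minimize over controls
    V = W

# V now approximates the minimum exit time V(x) = x on the grid.
```

Each Bellman sweep propagates correct values one cell further from the target, so a number of sweeps on the order of the grid size suffices here; real solvers use fast-marching or sweeping orderings to avoid this cost.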

For non-smooth or singular systems, Pontryagin-type Maximum Principles can be established for relaxed controls. The existence of an optimal relaxed trajectory and adjoint $p(\cdot)$ is guaranteed, with

$$\dot p(t) = -\int_U f_y(t, y^*(t), v)^\top p(t)\, \mu_t^*(dv)$$

and the support of $\mu_t^*$ concentrated on maximizers of the Hamiltonian (arXiv:1309.5800):
$$\operatorname{supp} \mu^*_t \subset \arg\max_{v\in U} \langle p(t), f(t, y^*(t), v)\rangle.$$
In convex settings, optimal classical controls are recovered as extremal points of the relaxed (measure-valued) control.

3. Feedback Synthesis and Control Lyapunov Function Construction

A central application of exit-time control is the synthesis of global control Lyapunov functions (CLFs) for feedback stabilization. A local CLF $V_{\text{loc}}$ and an associated feedback can be computed over a neighborhood. Formulating an exit-time optimal control problem with respect to a sublevel set of $V_{\text{loc}}$, one concatenates $V_{\text{loc}}$ inside the local region with the exit-time value function outside, yielding a global CLF satisfying the requisite decrease condition
$$\inf_{u\in U} \partial^- V(x; f(x, u)) \le -W(x)$$
over the domain of asymptotic null-controllability (Yegorov et al., 2019).

The feedback law is synthesized via

$$u^*(x) = \arg\min_{u\in U} \{ \langle DV(x), f(x, u) \rangle + g(x, u)\},$$

exhibiting Lyapunov (strict) decrease along optimal closed-loop trajectories. In degenerate or stochastic settings, feedback relaxed controls—measurable selections of the optimal measure-valued policy—admit regularity and stability properties (e.g., Hölder/Lipschitz continuity with respect to data or regularization parameters) (Reisinger et al., 2020).
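A sketch of this argmin feedback extraction on a hypothetical one-dimensional example, with $V$, $f$, and $g$ chosen so the minimizer is known in closed form ($V(x)=x^2$, $\dot x = u$, $g = x^2 + u^2$ gives $u^*(x) = -x$); the discretized control set is an illustrative device.

```python
import numpy as np

# Sketch (illustrative): extract u*(x) = argmin_u { <DV(x), f(x,u)> + g(x,u) }
# from a known value function. Here V(x) = x^2 (so DV = 2x), dynamics
# xdot = u, running cost g = x^2 + u^2; the Hamiltonian is (u + x)^2 up to
# terms independent of u, so the exact minimizer is u*(x) = -x.

def DV(x):
    return 2.0 * x                    # gradient of V(x) = x^2

def f(x, u):
    return u

def g(x, u):
    return x**2 + u**2

def feedback(x, U=np.linspace(-2.0, 2.0, 401)):
    """Minimize the Hamiltonian over a discretized control set."""
    vals = DV(x) * f(x, U) + g(x, U)  # <DV, f> + g for each candidate u
    return U[np.argmin(vals)]

# Along the closed loop u*(x) = -x we get <DV, f> = -2x^2 < 0 for x != 0,
# i.e. strict Lyapunov decrease of V.
u_star = feedback(0.7)
```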

If no explicit local CLF is available, the construction is adapted by considering a family of exit-time problems with small target balls, yielding practical CLFs and uniform convergence as the ball radius vanishes.

4. Existence, Regularity, and Stability Results

  • Existence and Regularity: Under convexity and regularity hypotheses on dynamics and costs, exit-time optimal control problems admit (relaxed) optimal solutions and value functions that are continuous or possess higher regularity (e.g., $C^{2,\alpha}$) (Reisinger et al., 2020, Yegorov et al., 2019). For nonconvex or singular systems, relaxed solutions exist and satisfy suitable maximum principles (Lou et al., 2013).
  • Stability: Feedback and value functions for relaxed exit-time problems are Lipschitz stable with respect to parameter perturbations, with bounds on the deviation in value and synthesized feedback under model mismatch (Reisinger et al., 2020).
  • Exploration–Exploitation Interpolation: Regularization via convex penalties (e.g., entropy or general mixing penalties) induces continuous feedback laws, interpolating between pure exploitation (as the penalty vanishes) and robust exploratory policies. Monotone convergence and recovery of pure strategies in the limit are rigorously established (Reisinger et al., 2020).
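For a finite control set, this entropy-regularized pointwise minimization has the familiar Gibbs/softmax solution, which makes the exploration–exploitation interpolation explicit. The per-action Hamiltonian values `q` below are illustrative numbers, not from the cited papers.

```python
import numpy as np

# Sketch (illustrative): minimizing <q, lambda> - tau * H(lambda) over the
# probability simplex, with H the Shannon entropy, yields the Gibbs policy
# lambda_u ∝ exp(-q_u / tau). As tau -> 0 the policy concentrates on the
# pure argmin (exploitation); as tau -> inf it tends to uniform (exploration).

def gibbs_policy(q, tau):
    z = np.exp(-(q - q.min()) / tau)  # shift by the min for numerical stability
    return z / z.sum()

q = np.array([1.0, 0.3, 2.0])         # per-action Hamiltonian values (made up)
lam_hot = gibbs_policy(q, 10.0)       # near-uniform, exploratory
lam_cold = gibbs_policy(q, 0.01)      # essentially all mass on argmin q
```

The continuity of `gibbs_policy` in both `q` and `tau` (for `tau > 0`) is the finite-action face of the regularity and stability results cited above.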

5. Numerical Approaches and Approximate Solution Schemes

The solution of exit-time relaxed control problems involves high-dimensional HJB PDEs or dynamic programming recursions, susceptible to the curse of dimensionality. Several computational strategies are employed:

  • Direct Numerical Solution of HJB: For low dimensions, methods of characteristics or direct collocation (e.g., ACADO Toolkit) solve the exit-time HJB with boundary/terminal conditions (Yegorov et al., 2019).
  • Symbolic (Finite-State) Abstraction: In discrete-time settings with continuous state/control, symbolic abstractions (finite covers of the state space and quantized inputs) yield finite minimax recursions whose solutions upper/lower bound the value function. Algorithmic schemes similar to Dijkstra's algorithm can compute these symbolic value functions and feedbacks, with guaranteed hypo-convergence as abstraction granularity improves (Reissig et al., 2017).
| Numerical Method | Key Features | Reference |
|---|---|---|
| Collocation & characteristics | PDE-based direct value function computation | (Yegorov et al., 2019) |
| Symbolic abstraction | Grid-based finite-state synthesis, minimax recursion | (Reissig et al., 2017) |
| Policy iteration + TT format | SDEs, high dimensions, sample-based TT compression | (Fackeldey et al., 2020) |
  • Tensor Train/Policy Iteration for SDEs: For stochastic exit-time problems, approximate policy iteration on polynomial function spaces with Tensor Train (TT) decomposition is effective. Least-squares Monte-Carlo projections enable policy evaluation, and TT format storage mitigates exponential complexity in moderate dimensions. Monte Carlo integration approximates value and gradient calculations efficiently, with demonstrated scalability up to dimension $n=6$ (Fackeldey et al., 2020).
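The symbolic minimax recursion can be sketched on a hypothetical four-state abstraction. Each abstract state and quantized input maps to a set of successors (nondeterminism from over-approximating the continuous dynamics); for brevity this uses plain value iteration rather than the Dijkstra-like ordering of Reissig et al.

```python
# Sketch (illustrative, hypothetical abstraction): the symbolic value obeys
#   V(s) = min_u max_{s' in Post(s, u)} (1 + V(s')),   V = 0 on the target,
# i.e. the controller picks the input, an adversary picks the successor.

Post = {                      # made-up 4-state abstraction with 2 inputs
    "A": {"u0": {"B"}, "u1": {"B", "C"}},
    "B": {"u0": {"T"}, "u1": {"C"}},
    "C": {"u0": {"T", "B"}},
}
TARGET = {"T"}
INF = float("inf")

states = set(Post) | TARGET
V = {s: (0.0 if s in TARGET else INF) for s in states}

for _ in range(len(states)):                     # enough sweeps to converge
    for s, inputs in Post.items():
        best = INF
        for u, succs in inputs.items():
            worst = max(V[sp] for sp in succs)   # adversarial successor
            best = min(best, 1.0 + worst)        # best input choice
        V[s] = best

# Symbolic feedback: for each state, the input minimizing the worst case.
fb = {s: min(Post[s], key=lambda u: 1.0 + max(V[sp] for sp in Post[s][u]))
      for s in Post}
```

Because successor sets over-approximate the true reachable sets, `V` upper-bounds the continuous value function, and the bound tightens (hypo-converges) as the abstraction is refined.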

6. Theoretical and Practical Implications

Exit-time relaxed control theory rigorously bridges classical optimal control, measure-valued and randomized policies, and robust feedback synthesis under regularity, nonconvexity, and high-dimensional scenarios. Measure-valued control formulations (Young measures) provide general existence and regularity results even when classical (pointwise) optimal policies fail to exist or are not robust. Regularization by exploration rewards and entropy terms ensures continuity and practical implementability of feedback policies, explaining the robustness of entropy-regularized reinforcement learning heuristics observed empirically (Reisinger et al., 2020).

Constructed value and control functions serve dual roles: as Lyapunov certificates for stabilization and as templates for correct-by-construction symbolic or numerical controllers. The curse of dimensionality is addressed via grid/coarsening refinements, symbolic abstractions, or low-rank function representations; however, the curse of complexity (computational, not just memory) can persist as a limiting factor (Yegorov et al., 2019).

7. Extensions and Open Directions

  • Infinite-Horizon and Minimax Formulations: Many results transfer directly to infinite-horizon, reach-avoid, minimum-time, or minimax settings. Non-smooth and singular systems can be handled via auxiliary or perturbed relaxed formulations, with the maximum principle holding at least for some optimal relaxed control (Lou et al., 2013).
  • Robust and Adaptive Control: Stability and first-order sensitivity formulae for value and relaxed controls underpin robust and adaptive control design under model uncertainty (Reisinger et al., 2020).
  • Algorithmic Enhancements: Multi-level Monte Carlo, control variates, sparse grids, kernel expansions, and neural-network-based policies serve as platforms for further computational improvements in high-dimensional or non-smooth domains (Fackeldey et al., 2020).

Exit time relaxed control theory synthesizes advanced functional analytic tools, stochastic process theory, optimization, and numerical analysis, constituting a foundational framework for modern robust, high-dimensional feedback control synthesis, analysis, and computation across deterministic, stochastic, and non-smooth systems.
