Discrete-Time Optimal Control
- Discrete-time optimal control is a framework that defines optimal strategies for systems evolving in discrete steps under various constraints.
- It employs rigorous methods like the Pontryagin maximum principle, dynamic programming, and Riccati recursions to achieve optimality in deterministic and stochastic settings.
- Advanced numerical techniques and geometric discretizations enable practical solutions for applications in robotics, economics, aerospace, and model predictive control.
Discrete-time optimal control concerns the design and analysis of control strategies for dynamical systems governed by discrete-time evolution equations, formulated to optimize a given performance criterion under constraints. The field has broad applicability covering deterministic, stochastic, constrained, partially observed, mean-field, and structure-preserving settings, with rigorous methodologies encompassing the Pontryagin maximum principle, dynamic programming, convexification, duality, acceleration techniques, and modern machine learning-based approaches.
1. Problem Formulations and Canonical Models
Let $x_k \in \mathbb{R}^n$ denote the system state evolving via discrete-time dynamics
$$x_{k+1} = f_k(x_k, u_k, w_k), \qquad k = 0, 1, \ldots, N-1,$$
where $u_k \in U$ is the control and $w_k$ may represent known inputs, exogenous disturbances, or stochastic noise. The objective is to choose a sequence $\{u_k\}_{k=0}^{N-1}$ (possibly adapted to observations or stochastic filtrations) to optimize a cost functional. Canonical cost forms include:
- Bolza/finite-horizon form: $J(u) = \sum_{k=0}^{N-1} \ell_k(x_k, u_k) + \Phi(x_N)$
- Stochastic mean-field cost: $J(u) = \mathbb{E}\big[\sum_{k=0}^{N-1} \ell_k(x_k, \mathcal{L}(x_k), u_k) + \Phi(x_N, \mathcal{L}(x_N))\big]$, where $\mathcal{L}(x_k)$ denotes the law of $x_k$
- Constrained objectives: quadratic constraints $J_i(u) \le 0$ or state/input constraints $x_k \in X_k$, $u_k \in U_k$.
Admissible control sets may be deterministic, stochastic (mean-field, Markov jump), subject to information or delay constraints, or required to adapt to partial/noisy observations (Chichportich et al., 2023, Li et al., 2023).
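The finite-horizon (Bolza) formulation above can be made concrete with a short rollout routine that propagates the dynamics and accumulates stage plus terminal cost. The double-integrator model and cost weights below are illustrative assumptions, not taken from the cited works:

```python
import numpy as np

def rollout_cost(f, stage_cost, terminal_cost, x0, controls):
    """Roll out x_{k+1} = f(k, x_k, u_k) and accumulate the Bolza cost
    J = sum_k l_k(x_k, u_k) + Phi(x_N). Returns (J, x_N)."""
    x, J = np.asarray(x0, dtype=float), 0.0
    for k, u in enumerate(controls):
        J += stage_cost(k, x, u)
        x = f(k, x, u)
    return J + terminal_cost(x), x

# Toy double-integrator with quadratic costs (illustrative choices).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])

f = lambda k, x, u: A @ x + B @ u
l = lambda k, x, u: float(x @ x + 0.1 * (u @ u))   # stage cost l_k
Phi = lambda x: float(10.0 * (x @ x))              # terminal cost

J, xN = rollout_cost(f, l, Phi, np.array([1.0, 0.0]), [np.array([0.0])] * 20)
```

With zero control and an initial offset in position only, the state is stationary under these dynamics, so the cost is just the accumulated stage penalty plus the terminal penalty.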
2. Necessary Conditions: Discrete-Time Maximum Principle
The discrete-time Pontryagin Maximum Principle (PMP) and geometric variants constitute foundational necessary conditions for optimality:
- Adjoint recursion: For costate (adjoint) variables $\lambda_k$, the backward difference recursion is
$$\lambda_k = \nabla_x H_k(x_k, u_k, \lambda_{k+1}), \qquad \lambda_N = \nabla_x \Phi(x_N).$$
- Stationarity: The optimal control at each stage $k$ satisfies
$$\nabla_u H_k(x_k^*, u_k^*, \lambda_{k+1}) = 0,$$
(or, for constrained control sets, $u_k^* \in \arg\min_{u \in U} H_k(x_k^*, u, \lambda_{k+1})$), where $H_k(x, u, \lambda) = \ell_k(x, u) + \lambda^\top f_k(x, u)$ is the stage (discrete) Hamiltonian.
- Extensions: The PMP generalizes to settings with manifold-valued states, pointwise or frequency-domain constraints, and mean-field interactions (Kipka et al., 2017, K et al., 2018, Ahmadova et al., 2022).
Complementary sensitivity results and (when convexity holds) sufficient optimality conditions can also be derived, typically via duality on the Hamiltonian or via the dynamic programming principle (DPP) (Ahmadova et al., 2022, Kipka et al., 2017).
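The adjoint recursion doubles as an exact cost-gradient computation: sweeping the costate backward yields $\nabla_{u_k} J = \nabla_u H_k$ at every stage. The sketch below, for an assumed linear-quadratic example under the minimization convention $H_k = \ell_k + \lambda^\top f_k$, computes this gradient; it can be verified against finite differences:

```python
import numpy as np

# Linear dynamics x_{k+1} = A x_k + B u_k, quadratic stage/terminal cost.
# Costate recursion: lam_k = grad_x H_k = 2 Q x_k + A^T lam_{k+1},
# with lam_N = grad Phi(x_N) = 2 QN x_N; grad_u H_k = 2 R u_k + B^T lam_{k+1}.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, QN = np.eye(2), 0.1 * np.eye(1), 10.0 * np.eye(2)
N, x0 = 10, np.array([1.0, 0.0])

def cost_and_gradient(U):
    # Forward pass: roll out the state trajectory and accumulate cost.
    xs = [x0]
    for k in range(N):
        xs.append(A @ xs[-1] + B @ U[k])
    J = sum(xs[k] @ Q @ xs[k] + U[k] @ R @ U[k] for k in range(N))
    J += xs[N] @ QN @ xs[N]
    # Backward pass: costate sweep gives the exact gradient w.r.t. each u_k.
    lam = 2.0 * QN @ xs[N]
    grads = [None] * N
    for k in reversed(range(N)):
        grads[k] = 2.0 * R @ U[k] + B.T @ lam   # grad_u H_k
        lam = 2.0 * Q @ xs[k] + A.T @ lam       # grad_x H_k -> lam_k
    return J, np.array(grads)

U = [np.zeros(1) for _ in range(N)]
J, g = cost_and_gradient(U)
```

At an optimum, the returned gradients all vanish, which is precisely the stationarity condition above.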
3. Methods for Deterministic and Stochastic Problems
3.1 Linear-Quadratic (LQ) and LQG Problems
- Standard Riccati recursion: For deterministic or Gaussian (LQG) systems, the optimal controller is found via backward Riccati equations, yielding a linear or affine state-feedback law (Kipka et al., 2017).
- Stochastic and Mean-field settings: If noise is additive or multiplicative, optimal feedback gains arise from a generalized algebraic or difference Riccati equation (SARE), requiring mean-square stability of the closed-loop map (Lai et al., 2020, Li et al., 2019).
- Input Delay/Markov Jump: For systems with input delays or MJLS, the optimal gain is given by solving coupled Riccati recursions with appropriate augmented state representations and information patterns (Liu et al., 16 Nov 2024, Han et al., 2018).
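For the deterministic finite-horizon case, the backward Riccati recursion admits a compact implementation; a minimal sketch with illustrative system matrices:

```python
import numpy as np

def lqr_backward(A, B, Q, R, QN, N):
    """Backward Riccati difference equation for finite-horizon LQR:
    P_N = QN,
    K_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A,
    P_k = Q + A^T P_{k+1} (A - B K_k).
    Returns the gain sequence K_0..K_{N-1} and P_0."""
    P, Ks = QN, []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        Ks.append(K)
    return list(reversed(Ks)), P

# Illustrative double integrator, dt = 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R, QN = np.eye(2), np.eye(1), 10.0 * np.eye(2)
Ks, P0 = lqr_backward(A, B, Q, R, QN, N=50)

# Closed-loop rollout under the time-varying affine-free feedback u_k = -K_k x_k.
x = np.array([1.0, 0.0])
for K in Ks:
    x = (A - B @ K) @ x
```

The optimal cost-to-go from $x_0$ is the quadratic $x_0^\top P_0 x_0$, and the closed-loop state is driven toward the origin over the horizon.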
3.2 Nonlinear Systems and State/Control Constraints
- Forward-Backward Difference Equations (FBDEs): Nonlinear discrete-time OCPs with general constraints are approached via augmented Lagrangian methods and FBDEs; first-order (gradient) and second-order (Hessian) quantities are obtained at O(Nn²) and O(N²n²) cost, respectively, and Newton-type iterations afford superlinear convergence (Lv et al., 20 Mar 2025).
- Exact Penalization and Sensitivity: For inequality or path constraints, exact penalization and non-smooth analysis are used to relate constraint violation and the value function landscape, aiding sensitivity quantification and global optimization (Kipka et al., 2017).
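The structure of an augmented-Lagrangian outer loop with a Newton-type inner solve can be sketched on a toy equality-constrained problem. This is a generic ALM template, not the specific method of Lv et al.; the test problem and all names are illustrative:

```python
import numpy as np

def alm_equality(f_grad, f_hess, g, g_jac, x0, iters=20, rho=10.0):
    """Augmented Lagrangian method for min f(x) s.t. g(x) = 0 (sketch).
    Outer loop: multiplier update lam <- lam + rho * g(x).
    Inner loop: Newton steps on L(x) = f(x) + lam^T g(x) + (rho/2)||g(x)||^2
    (Hessian term for g is omitted, i.e. g is assumed affine here)."""
    x, lam = np.asarray(x0, float), np.zeros(len(g(x0)))
    for _ in range(iters):
        for _ in range(5):                       # Newton-type inner iterations
            Jg = g_jac(x)
            grad = f_grad(x) + Jg.T @ (lam + rho * g(x))
            hess = f_hess(x) + rho * Jg.T @ Jg
            x = x - np.linalg.solve(hess, grad)
        lam = lam + rho * g(x)                   # multiplier (outer) update
    return x, lam

# Toy problem: min x^2 + y^2  s.t.  x + y = 1  (optimum at (0.5, 0.5)).
f_grad = lambda x: 2.0 * x
f_hess = lambda x: 2.0 * np.eye(2)
g = lambda x: np.array([x[0] + x[1] - 1.0])
g_jac = lambda x: np.array([[1.0, 1.0]])
x_star, lam_star = alm_equality(f_grad, f_hess, g, g_jac, [0.0, 0.0])
```

In the OCP setting, the inner minimization is carried out over the control trajectory with gradients and Hessians supplied by the FBDE recursions.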
4. Stochastic, Mean-Field, and Partially Observed Control
- Backward Stochastic Difference Equations (BS∆Es): In mean-field settings, adjoint processes satisfy discrete-time BS∆Es. Necessary and sufficient (under convexity) optimality conditions are derived, accounting for law-dependent dynamics and cost (Ahmadova et al., 2022).
- Partial Observation: For partially observed or output-feedback control, problems are formulated on conditional probability measures (filters). Verification theorems and explicit Riccati/Kalman-type filters provide implementable solutions, with quantization or grid-based approximations enabling tractability even on infinite-dimensional law-space (Chichportich et al., 2023, Li et al., 2023).
- Colored/Multiplicative Noise: Systems with temporally correlated (colored) or multiplicative noise require FBSDEs and path-dependent Riccati recursions for controller synthesis (Li et al., 2019, Lai et al., 2020, Liu et al., 16 Nov 2024).
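In the partially observed linear-Gaussian case, the filter referenced above is the finite-dimensional Kalman recursion, which feeds the certainty-equivalent controller. A minimal predict/update sketch with assumed system matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])                 # observe position only
W, V = 0.01 * np.eye(2), 0.1 * np.eye(1)   # process / measurement noise cov.

def kalman_step(x_hat, P, y):
    """One predict/update step of the discrete-time Kalman filter."""
    x_pred = A @ x_hat                     # predict
    P_pred = A @ P @ A.T + W
    S = C @ P_pred @ C.T + V               # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)  # update with measurement
    P_new = (np.eye(2) - K @ C) @ P_pred
    return x_new, P_new

# Simulate the true system and run the filter on noisy outputs.
x = np.array([1.0, -0.5])
x_hat, P = np.zeros(2), np.eye(2)
for _ in range(200):
    x = A @ x + rng.multivariate_normal(np.zeros(2), W)
    y = C @ x + rng.multivariate_normal(np.zeros(1), V)
    x_hat, P = kalman_step(x_hat, P, y)
err = np.linalg.norm(x - x_hat)
```

The conditional-measure formulations above generalize exactly this idea: the filter state (here the Gaussian pair $(\hat{x}, P)$) becomes the information state on which control is synthesized.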
5. Numerical and Algorithmic Methods
- Globally Convergent Homotopy Methods: Homotopy continuation methods construct solution paths from an auxiliary (easy, e.g., LQR) problem to the original nonconvex problem. Robust predictor-corrector schemes (arc-length, Newton refinement) ensure convergence under mild regularity, providing a tractable route to KKT points of nonconvex OCPs (Esterhuizen et al., 2023).
- Convexification and Lossless Relaxation: When control constraints are nonconvex but stage costs are convex in magnitude, lossless convexification (LCvx) converts the problem into a convex conic form amenable to efficient convex programming. Discrete-time LCvx is shown to be "nearly exact": after a random infinitesimal perturbation, constraint violation occurs at only a bounded (dimension-dependent) number of grid points. For long-horizon degenerate cases, a bisection strategy on the control horizon guarantees convergence up to a prescribed tolerance (Luo et al., 13 Oct 2024).
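The continuation idea can be conveyed with a stripped-down sketch: deform from a trivial problem to the stationarity condition of a nonconvex objective, correcting with Newton steps at each homotopy parameter value. This toy version uses fixed parameter steps; the cited method uses robust arc-length predictor-corrector tracking:

```python
import numpy as np

def homotopy_solve(F, dF, x0, steps=100, newton_iters=10):
    """Trace the zero path of H(x, t) = (1 - t)(x - x0) + t F(x)
    from the trivial problem at t = 0 to F(x) = 0 at t = 1.
    (Simplified fixed-step continuation, scalar case.)"""
    x = x0
    for t in np.linspace(0.0, 1.0, steps + 1)[1:]:
        H  = lambda z: (1 - t) * (z - x0) + t * F(z)
        dH = lambda z: (1 - t) + t * dF(z)
        for _ in range(newton_iters):      # corrector: Newton on H(., t) = 0
            x = x - H(x) / dH(x)
    return x

# Stationary points of the nonconvex f(x) = x^4/4 - x^2/2 + 0.3 x.
F  = lambda x: x**3 - x + 0.3              # f'(x): the stationarity condition
dF = lambda x: 3 * x**2 - 1
x_star = homotopy_solve(F, dF, x0=1.0)
```

Starting the path at the easy problem's solution steers Newton toward a particular stationary point of the nonconvex objective, mirroring how the full method reaches KKT points of nonconvex OCPs from an auxiliary LQR-like problem.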
| Methodology | Setting | Key Recursion/Approach |
|---|---|---|
| Riccati recursion | LQ/LQG deterministic/stochastic | Feedback gain $K_k$ via backward Riccati difference equation for $P_k$ |
| FBSDE/BS∆E | Stochastic, colored, mean-field | Coupled forward/backward stochastic difference equations |
| Dynamic programming/DPP | General, finite state/control | Backward Bellman, grid-quantization on law-space |
| Homotopy continuation | Nonconvex, path-planning | Arc-length predictor-corrector path-tracking |
| LCvx | Nonconvex control bounds | Convex relaxation, SOCP, bisection on horizon |
| Augmented Lagrangian, FBDE | Nonlinear, constraints | ALM outer loop, Newton-type optimization inner loop |
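The backward Bellman recursion in the table can be illustrated on a finite state/control space. The two-state example below is a hypothetical toy, not drawn from the cited works:

```python
import numpy as np

def backward_dp(P_list, cost, terminal, N):
    """Backward Bellman recursion on a finite state space:
    V_k(x) = min_u [ c(x, u) + sum_{x'} P_u(x, x') V_{k+1}(x') ],
    where P_list[u] is the transition matrix under control u.
    Returns V_0 and the optimal policy (stage x state -> action)."""
    nS = len(terminal)
    V = np.asarray(terminal, float)
    policy = np.zeros((N, nS), dtype=int)
    for k in reversed(range(N)):
        # Q-values for every (state, action) pair at stage k.
        Qsa = np.stack([cost[:, u] + P_list[u] @ V for u in range(len(P_list))], axis=1)
        policy[k] = np.argmin(Qsa, axis=1)
        V = Qsa[np.arange(nS), policy[k]]
    return V, policy

# Toy chain: state 0 is "good" (stage cost 0), state 1 is "bad" (stage cost 1).
# Action 0 stays put; action 1 jumps to state 0 at an extra cost of 0.5.
cost = np.array([[0.0, 0.5],
                 [1.0, 1.5]])
P_list = [np.eye(2), np.array([[1.0, 0.0], [1.0, 0.0]])]
V0, pol = backward_dp(P_list, cost, terminal=[0.0, 0.0], N=5)
```

From the bad state it is optimal to pay the jump cost once and then stay, so the value from state 1 is 1.5 whenever at least two stages remain; the grid-quantization approaches in the table replace the finite state space with a quantized law-space.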
6. Structure-preserving and Geometric Discretizations
- Geometric Discretization (SO(3), pH Systems): Problems defined on Lie groups or port-Hamiltonian manifolds employ geometric discretizations (e.g., discrete mechanics, variational integrators) to preserve invariants and constraints at the numerical level (Phogat et al., 2015, Kipka et al., 2017, Sarkar et al., 1 Sep 2025).
- Strict dissipativity and turnpike property: Discrete-time port-Hamiltonian systems require care: energy-preserving integrators (e.g., implicit midpoint) may fail to maintain (strict) dissipativity. Discretizations via difference or differential representations (DDR) restore strict dissipativity, ensuring that most of the optimal trajectory remains close to an invariant manifold (turnpike property) (Sarkar et al., 1 Sep 2025).
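The energy-preservation property of the implicit midpoint rule, which is exactly what can conflict with strict dissipativity in the OCP setting, is easy to exhibit on a conservative linear system. A minimal sketch on the harmonic oscillator (step size and horizon are arbitrary choices):

```python
import numpy as np

# Implicit midpoint step for the linear Hamiltonian system xdot = S x with
# S skew-symmetric (harmonic oscillator, H(x) = 0.5 * ||x||^2). Solving
# (I - h/2 S) x_{k+1} = (I + h/2 S) x_k gives a Cayley-transform map,
# which is orthogonal for skew-symmetric S and hence exactly energy-preserving.
h = 0.1
S = np.array([[0.0, 1.0], [-1.0, 0.0]])
M = np.linalg.solve(np.eye(2) - 0.5 * h * S, np.eye(2) + 0.5 * h * S)

x = np.array([1.0, 0.0])
H0 = 0.5 * (x @ x)
for _ in range(1000):
    x = M @ x
H_end = 0.5 * (x @ x)
```

For a dissipative port-Hamiltonian system, by contrast, a discretization must strictly decrease such a storage function along trajectories, which is what the DDR construction above is designed to guarantee.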
7. Extensions: Information Regularization, Switched and Hybrid Systems
- Mutual Information-regularized Control: In stochastic systems, optimal control can be cast with information usage regularization, e.g., mutual information between state and control policies. Alternating minimization over (Gaussian) policy and prior yields closed-form recursions interpolating between deterministic LQR and maximally random control depending on the regularization weight (Enami et al., 7 Jul 2025).
- Switched and Hybrid Systems: Switched system optimal control, with exponentially many mode sequences, can be solved efficiently via continuous parameterization and block-sparsity-inducing convex surrogates, avoiding combinatorial complexity with near-optimal empirical performance (Kreiss et al., 2017).
8. Applications and Computational Aspects
Discrete-time optimal control theory and algorithms have broad impact in robotics, economics, power systems, autonomous vehicles, spacecraft guidance, and model predictive control (MPC). Techniques from this domain underpin fast embedded solvers for real-time control (accelerated FBDE+ALM for AGV trajectory tracking), probabilistic planning (LQG, mean-field, mutual information control), and large-scale hybrid systems.
Computational efficiency is achieved by leveraging problem structure (Riccati/sparse Jacobians, law-space quantization, block sparsity), and by modern convexification and homotopy methods, with polynomial complexity guarantees in discretization size and manageable memory requirements (Lv et al., 20 Mar 2025, Esterhuizen et al., 2023, Luo et al., 13 Oct 2024, Kreiss et al., 2017).
References to Principal Developments
- Duality and Riccati-based ellipsoidal methods in discrete-time optimal disturbance rejection (Dogadin et al., 18 Sep 2024)
- Discrete-time geometric maximum principle and penalization theory for constraints (Kipka et al., 2017, K et al., 2018)
- Mean-field/distribution-dependent control, stochastic maximum principles, and dynamic programming (Ahmadova et al., 2022, Chichportich et al., 2023)
- Homotopy continuation algorithms for nonconvex discrete-time OCP (Esterhuizen et al., 2023)
- Discrete-time lossless convexification and feasibility guarantees (Luo et al., 13 Oct 2024)
- Block-sparsity methods for switched/hybrid discrete-time systems (Kreiss et al., 2017)
- Delay and Markov jump effects in Riccati-type difference equations (Liu et al., 16 Nov 2024, Han et al., 2018)
- Strict dissipativity and turnpike in port-Hamiltonian discretizations (Sarkar et al., 1 Sep 2025)
- Information-constrained and mutual-information-regularized discrete control (Enami et al., 7 Jul 2025)
These developments underscore the confluence of modern optimization, geometry, stochastic analysis, and computational methods in advancing the theory and practice of discrete-time optimal control.