Sequential Linear-Quadratic Approach
- SLQ is a numerical optimal control method that iteratively approximates nonlinear or stochastic systems as linear dynamics with quadratic cost, ensuring dynamic feasibility.
- The approach leverages Riccati recursions to update control policies while handling constraints, stochasticity, and switched dynamics across varied applications including robotics and dynamic games.
- Its efficiency stems from a per-iteration computational complexity that is linear in the horizon length and cubic in the state-input dimension, enabling real-time trajectory optimization and robust convergence under standard convexity assumptions.
The Sequential Linear-Quadratic (SLQ) approach is a numerical technique for solving nonlinear or stochastic optimal control and dynamic game problems by iteratively approximating them as sequences of linear dynamics and quadratic cost (LQ) subproblems. SLQ algorithms underlie a spectrum of modern optimal control and trajectory optimization methods, particularly in robotics, control of SPDEs, switched systems, and dynamic game theory. The SLQ family spans both open-loop and feedback implementations and supports various propagation, discretization, and constraint-handling strategies.
1. Core Mathematical Structure of SLQ
At its foundation, the SLQ approach sequentially solves optimal control problems of the form

$$\min_{u(\cdot)}\; J(x,u) \;=\; \Phi\big(x(T)\big) \;+\; \int_{0}^{T} L\big(x(t),u(t),t\big)\,dt,$$

subject to continuous- or discrete-time nonlinear dynamics,

$$\dot{x}(t) = f\big(x(t),u(t),t\big), \qquad x(0) = x_0.$$

SLQ constructs, at each iteration, a local time-varying LQ subproblem via first-order (linear) expansion of $f$ about a nominal trajectory $(\bar{x},\bar{u})$ and second-order (quadratic) expansion of $L$ and $\Phi$. The resulting LQ subproblem features recursively defined Riccati backward equations and admits an affine-incremental optimal control law,

$$\delta u^{*}(t) \;=\; u_{\mathrm{ff}}(t) \;+\; K(t)\,\delta x(t),$$

where $u_{\mathrm{ff}}$ (feedforward) and $K$ (feedback) are determined via Riccati recursions parameterized by the linearization and quadratization data (Sleiman et al., 2021, Abhijeet et al., 3 Oct 2025).
The objective in SLQ is to iteratively update the trajectory such that each step is a steepest descent in the true nonlinear cost while maintaining dynamic feasibility; this is achieved by forward-propagation of corrected control along the nonlinear system, followed by local model updates (Abhijeet et al., 3 Oct 2025).
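To make the recursion concrete, below is a minimal discrete-time sketch of the backward pass in Python. It is an illustration, not code from the cited papers: the Jacobians `A, B` and cost expansions `q, r, Q, R` are assumed precomputed along the nominal trajectory, and regularization, line search, and constraint handling are omitted.

```python
import numpy as np

def slq_backward_pass(A, B, q, r, Q, R, q_N, Q_N):
    """Riccati backward pass for one time-varying LQ subproblem (sketch).

    A, B       : lists of dynamics Jacobians f_x, f_u along the nominal trajectory
    q, r, Q, R : lists of stage-cost gradients/Hessians w.r.t. state and input
    q_N, Q_N   : gradient and Hessian of the terminal cost
    Returns the feedforward terms k_t and feedback gains K_t of
    delta_u_t = k_t + K_t @ delta_x_t.
    """
    N = len(A)
    V_x, V_xx = q_N, Q_N                     # terminal value-function expansion
    k, K = [None] * N, [None] * N
    for t in reversed(range(N)):
        Q_x  = q[t] + A[t].T @ V_x
        Q_u  = r[t] + B[t].T @ V_x
        Q_xx = Q[t] + A[t].T @ V_xx @ A[t]
        Q_uu = R[t] + B[t].T @ V_xx @ B[t]   # must stay positive definite
        Q_ux =        B[t].T @ V_xx @ A[t]
        Quu_inv = np.linalg.inv(Q_uu)
        K[t] = -Quu_inv @ Q_ux               # feedback gain
        k[t] = -Quu_inv @ Q_u                # feedforward increment
        # propagate the quadratic value-function model one step backward
        V_x  = Q_x  + K[t].T @ Q_uu @ k[t] + K[t].T @ Q_u  + Q_ux.T @ k[t]
        V_xx = Q_xx + K[t].T @ Q_uu @ K[t] + K[t].T @ Q_ux + Q_ux.T @ K[t]
    return k, K
```

This is the Gauss-Newton form: only first-order dynamics information enters, which is precisely what distinguishes SLQ/iLQR from DDP.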
2. Algorithmic Approaches and Variants
Multiple algorithmic instantiations of SLQ exist, adapting the methodology to discrete-time, continuous-time, stochastic, game-theoretic, and switched-systems settings.
- Classic SLQ/iLQR/RTI: These schemes solve sequences of LQ subproblems using Riccati recursions, followed by a line search and forward rollout to enforce dynamic constraints (a driver-loop sketch follows this list). SQP-based perspectives clarify that SLQ, or iterative LQR (iLQR), is not a mere approximation to DDP but a fully consistent and globally convergent scheme under standard convexity and regularity assumptions (Abhijeet et al., 3 Oct 2025, Sleiman et al., 2021).
- Stochastic SLQ (SPDEs): In the context of infinite-dimensional SPDEs, SLQ is realized via open-loop (forward-backward SPDE, FBSPDE) resolution and closed-loop (feedback via discretized Riccati equations) strategies, each adapted for discretization in space (Galerkin/FEM) and time (Euler-type schemes), yielding rigorous convergence and rate estimates (Prohl et al., 2024).
- Constrained SLQ: Inequality and equality constraints are handled using projection-based or augmented-Lagrangian techniques, introducing dual updates and penalty functions into the Riccati recursion and control computation (Sleiman et al., 2021).
- SLQ for Switched Systems: For problems involving switching dynamics, SLQ is embedded in a two-stage optimization involving inner loop SLQ and outer loop switching-time update, leveraging sensitivity ODEs for gradients with respect to switch times (Farshidian et al., 2016).
- Sequential LQ in Games: In sequential/Stackelberg zero-sum linear-quadratic games, SLQ alternates inner (follower) and outer (leader) policy updates, exploiting natural-gradient or quasi-Newton steps for convergence and stabilization, and connects to Nash equilibria through Riccati-based constructions (Bu et al., 2019, Sun et al., 2021).
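As referenced above, a hedged skeleton of the classic SLQ/iLQR driver loop might look as follows; it reuses `slq_backward_pass` from the Section 1 sketch, `rollout` and `cost_expansion` are hypothetical helpers (the latter returning the linearization and quadratization data), and `forward_rollout_linesearch` is sketched in Section 3 below.

```python
def slq_solve(f, cost, cost_expansion, x0, u_init, max_iters=50, tol=1e-6):
    """Classic SLQ/iLQR outer loop (sketch; helper names are assumptions).

    f(x, u) -> x_next : discrete-time nonlinear dynamics
    cost(x, u)        : true nonlinear objective of a trajectory
    cost_expansion    : returns (A, B, q, r, Q, R, q_N, Q_N) about (x, u)
    """
    u = list(u_init)
    x = rollout(f, x0, u)              # nominal trajectory (hypothetical helper)
    for _ in range(max_iters):
        A, B, q, r, Q, R, q_N, Q_N = cost_expansion(x, u)
        k, K = slq_backward_pass(A, B, q, r, Q, R, q_N, Q_N)
        # feedforward magnitude as a simple stationarity measure
        if max(float(abs(ki).max()) for ki in k) < tol:
            break
        x_new, u_new, accepted = forward_rollout_linesearch(f, cost, x, u, k, K)
        if not accepted:
            break                      # no descent step found
        x, u = x_new, u_new
    return x, u
```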
3. Sequential Linear-Quadratic Policy Update and Convergence
The key SLQ procedure involves alternating forward and backward passes:
- Forward pass: Nonlinear system rollout with control increments.
- Backward pass: Local Riccati recursion for value approximation and policy update.
The backward Riccati equation yields, in the continuous-time setting,

$$-\dot{S}(t) = Q(t) + A(t)^{\top}S(t) + S(t)A(t) - \big(S(t)B(t) + P(t)\big)R(t)^{-1}\big(B(t)^{\top}S(t) + P(t)^{\top}\big), \qquad S(T) = \Phi_{xx},$$

with an analogous difference equation in discrete time, where $A, B$ are derived from the linearizations and $Q, R, P$ from the cost Hessians. The resulting affine control-law correction ensures feasibility and decreases the cost under sufficient conditions on positive definiteness of the input Hessian (i.e., $Q_{uu} \succ 0$) and controllability (Abhijeet et al., 3 Oct 2025, Sleiman et al., 2021). The guaranteed existence of descent directions is a distinct feature versus DDP/Newton, whose second-order dynamics expansions may generate indefinite Hessian approximations far from optimality.
Line-search or Armijo-type conditions are imposed during the forward rollout to guarantee convergence. Under assumptions of cost convexity (Lipschitz Hessians, positive-definite Hessians) and feasibility of the linearized dynamics, global convergence to a first-order KKT point is established (Abhijeet et al., 3 Oct 2025). Quadratic convergence can be recovered in the local regime using second-order expansions of the dynamics (as in DDP/Newton), albeit at the risk of global instability (Abhijeet et al., 3 Oct 2025, Sleiman et al., 2021).
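A minimal sketch of such a forward rollout with a backtracking Armijo test is given below. It is an assumption-laden illustration, not the cited papers' exact rule: a practical solver would use the LQ model's predicted cost reduction, whereas this sketch substitutes a crude feedforward-norm surrogate.

```python
def forward_rollout_linesearch(f, cost, x, u, k, K,
                               alphas=(1.0, 0.5, 0.25, 0.1, 0.05), c1=1e-4):
    """Forward pass with backtracking Armijo line search (sketch).

    f(x_t, u_t) -> x_{t+1} is the *nonlinear* dynamics, so every accepted
    iterate is dynamically feasible by construction; cost(x, u) evaluates
    the true nonlinear objective of a full trajectory.
    """
    J0 = cost(x, u)
    # Surrogate for the model-predicted decrease (assumption: feedforward
    # magnitude is proportional to the attainable descent).
    pred = sum(float(ki @ ki) for ki in k)
    for alpha in alphas:
        x_new, u_new = [x[0]], []
        for t in range(len(u)):
            du = alpha * k[t] + K[t] @ (x_new[t] - x[t])  # affine update law
            u_new.append(u[t] + du)
            x_new.append(f(x_new[t], u_new[t]))           # nonlinear rollout
        if cost(x_new, u_new) <= J0 - c1 * alpha * pred:  # Armijo condition
            return x_new, u_new, True
    return x, u, False                                    # no acceptable step
```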
In the sequential policy setting (e.g., SLQ in dynamic games), convergence theorems establish sublinear (natural gradient) and quadratic (quasi-Newton) rates under spectral step-size rules and stabilization conditions, all while eliminating the need for explicit projection to the stabilizing policy set (Bu et al., 2019).
4. Extensions: Constraints, Stochasticity, and Switching
SLQ supports various generalizations:
- Constraint Handling: Equality constraints are enforced via backward-pass projection in the Riccati recursion, while inequalities are managed by adding penalization terms (PHR, smooth-PHR, non-slack quadratic) to the stage cost and updating dual variables in an augmented-Lagrangian, real-time iteration framework (a minimal penalty-update sketch follows this list). Such approaches address the ill-conditioning under hard penalty tightening that besets log-barrier methods (Sleiman et al., 2021).
- Stochastic Setting / SPDEs: For stochastic systems driven by SPDEs, the discrete SLQ approach is realized through spatial finite element and temporal Euler discretizations. In the open-loop regime, this yields a coupled system of forward and backward SDEs, solved iteratively with conditional expectations and gradient descent. In the closed-loop (feedback) approach, the stochastic Riccati equation is solved discretely, facilitating direct feedback control of the SPDE. Convergence results guarantee optimality rates, in terms of the spatial and temporal mesh sizes, in additive noise settings (Prohl et al., 2024).
- Nonlinear Switched Systems: The SLQ framework extends to switched systems via a two-stage approach: an inner SLQ loop optimizes controls for fixed switching times, while the outer loop updates switching times using computed cost gradients from trajectory and Riccati sensitivity ODEs. This two-stage process (e.g., OCS2 algorithm) offers substantial computational advantages over boundary value problem-based solvers for both moderate and high-dimensional switched systems (Farshidian et al., 2016).
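As referenced in the constraint-handling item above, here is a minimal sketch of the PHR (Powell-Hestenes-Rockafellar) augmented-Lagrangian machinery for an elementwise inequality path constraint $g(x,u) \le 0$; penalty scheduling of $\rho$ and the embedding of the penalty's derivatives into the Riccati recursion are omitted.

```python
import numpy as np

def phr_penalty(g, lam, rho):
    """PHR augmented-Lagrangian term added to the stage cost for g <= 0.

    g   : constraint values at the current (x, u)
    lam : current multiplier estimates (elementwise, nonnegative)
    rho : penalty parameter
    Smooth in g, this avoids the ill-conditioning that hard penalty
    tightening induces in log-barrier methods.
    """
    shifted = np.maximum(0.0, lam + rho * g)
    return float((shifted**2 - lam**2).sum()) / (2.0 * rho)

def dual_update(g, lam, rho):
    """First-order multiplier update performed between SLQ iterations."""
    return np.maximum(0.0, lam + rho * g)
```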
5. SLQ in Differential Games and Stackelberg/Nash Equilibria
In dynamic games, especially zero-sum linear-quadratic settings, SLQ is naturally suited to Stackelberg leader-follower architectures. The inner loop computes the follower's optimal response via a standard SLQ (Riccati) solution; the leader, treating the follower's response as given, then solves a backward SLQ problem whose cost is informed by the follower's Riccati equation. The result is two coupled Riccati ODEs whose difference yields the Nash equilibrium Riccati equation, implying Stackelberg and Nash equilibria coincide under uniform convexity-concavity (Sun et al., 2021). Feedback policies for both leader and follower are constructed explicitly in terms of the Riccati solutions, and the closed-loop system exhibits the classical feedback Nash (saddle-point) structure.
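For intuition, below is a discrete-time, time-invariant sketch of the saddle-point (Nash) Riccati recursion for a zero-sum LQ game. The continuous-time Stackelberg construction of (Sun et al., 2021) instead couples two Riccati ODEs, so this illustrates the structure rather than that paper's algorithm; the terminal weight is taken equal to $Q$ purely for brevity.

```python
import numpy as np

def zero_sum_lq_riccati(A, B, D, Q, Ru, Rw, N):
    """Backward Riccati recursion for a finite-horizon zero-sum LQ game.

    Minimizer u and maximizer w enter x+ = A x + B u + D w; the stage cost
    is x'Qx + u'Ru u - w'Rw w. Returns per-stage feedback gains (Ku, Kw).
    """
    P = Q.copy()                       # terminal value matrix (assumption)
    gains = []
    for _ in range(N):
        Huu = Ru + B.T @ P @ B         # must be positive definite
        Hww = -Rw + D.T @ P @ D        # must be negative definite (concavity)
        M = np.block([[Huu,          B.T @ P @ D],
                      [D.T @ P @ B,  Hww]])
        G = np.vstack([B.T @ P @ A, D.T @ P @ A])
        K = -np.linalg.solve(M, G)     # stacked saddle-point gains [Ku; Kw]
        nu = B.shape[1]
        Ku, Kw = K[:nu], K[nu:]
        Acl = A + B @ Ku + D @ Kw      # closed-loop dynamics at the saddle
        P = Q + Ku.T @ Ru @ Ku - Kw.T @ Rw @ Kw + Acl.T @ P @ Acl
        gains.append((Ku, Kw))
    gains.reverse()
    return gains, P
```

Solvability at every stage requires `Huu` positive definite and `Hww` negative definite, the discrete-time analogue of the uniform convexity-concavity condition above.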
6. Computational Complexity, Practical Aspects, and Performance
The per-iteration complexity of SLQ is typically linear in the number of horizon steps and cubic in the state-input dimension, i.e., $\mathcal{O}\big(N(n_x+n_u)^3\big)$, due to the Riccati backward/forward sweeps (Abhijeet et al., 3 Oct 2025, Sleiman et al., 2021). For moderate-dimensional systems (tens of states and inputs), each SLQ pass runs in milliseconds, enabling real-time applications in robot MPC (Sleiman et al., 2021). In high-dimensional PDE/SPDE settings, complexity is governed by the number of finite elements, samples, and iterations: open-loop approaches are more flexible but more expensive due to repeated forward-backward simulation and conditional-expectation evaluation, while closed-loop Riccati methods offer substantial computational reductions for 1D/2D domains at the expense of increased memory for storing the Riccati matrices (Prohl et al., 2024).
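A rough flop accounting behind this claim, assuming a horizon of $N$ steps, state dimension $n_x$, and input dimension $n_u$, with costs dominated by the matrix products and factorization in each backward step:

$$N \cdot \Big(\underbrace{\mathcal{O}(n_x^3)}_{A^{\top} V_{xx} A} \;+\; \underbrace{\mathcal{O}(n_x^2 n_u + n_x n_u^2)}_{\text{cross terms}} \;+\; \underbrace{\mathcal{O}(n_u^3)}_{\text{factorizing } Q_{uu}}\Big) \;=\; \mathcal{O}\big(N\,(n_x+n_u)^3\big),$$

so the sweep is linear in the horizon and cubic in the combined state-input dimension.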
SLQ (via OCS2) outperforms boundary-value-problem-based approaches for switched systems, enabling real-time trajectory optimization in quadrupedal robots and other high-dimensional platforms (Farshidian et al., 2016). In augmented-Lagrangian constrained SLQ, the rate and robustness of convergence exceed that of log-barrier methods, supporting high-frequency receding horizon control in practice (Sleiman et al., 2021).
7. Summary Table
| Domain/Variant | Solution Mechanism | Key Advantage |
|---|---|---|
| Classic (iLQR/SLQ, deterministic) | Riccati (backward pass), forward rollout | Guaranteed descent, feasibility |
| Constrained SLQ | Augmented Lagrangian, projection | Handles path (in)equality constraints robustly |
| Stochastic SLQ (SPDEs) | Finite Element (FEM), Euler, Riccati | Provable rates for high-dimensional PDEs |
| SLQ for Switched Systems | Inner SLQ + outer time optimization | Real-time-capable, scalable |
| Game-theoretic (Stackelberg/Nash) | Coupled Riccati forward/backward | Feedback Nash/Stackelberg equilibria |
The Sequential Linear-Quadratic approach provides a coherent, algorithmically robust, and versatile framework for broad classes of nonlinear, stochastic, constrained, and multi-agent optimal control problems. Its principled reliance on local linearization and quadratic approximation, together with its Riccati-based recursion and adaptability to feedback structure, underpins both its theoretical convergence properties and its widespread practical adoption across modern control, robotics, and dynamic game theory (Abhijeet et al., 3 Oct 2025, Sleiman et al., 2021, Prohl et al., 2024, Sun et al., 2021, Bu et al., 2019, Farshidian et al., 2016).