Interior-Point Differential Dynamic Programming
- IPDDP is a class of algorithms that embed interior‐point techniques into Differential Dynamic Programming to directly handle nonlinear equality and inequality constraints.
- It employs a backward–forward recursion with stagewise KKT systems and primal–dual Newton steps, achieving local quadratic convergence while keeping per-iteration complexity linear in the horizon length.
- IPDDP is applied in robotics and trajectory optimization, offering robust, efficient solutions for contact-implicit and hybrid-dynamical systems relative to traditional solvers.
Interior-Point Differential Dynamic Programming (IPDDP) is a class of algorithms for solving discrete-time, finite-horizon optimal control problems with nonlinear equality and inequality constraints. By embedding primal–dual interior-point methodology within the Differential Dynamic Programming (DDP) framework, IPDDP retains the efficiency of second-order DDP while enforcing feasibility through barrier-augmented cost functions, slack variables, and primal–dual Newton steps. This approach is distinguished by its ability to directly handle state and control constraints, including those arising in contact-implicit or hybrid-dynamical systems, with local quadratic convergence and per-iteration complexity linear in the time horizon.
1. Mathematical Foundations
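IPDDP addresses the nonlinear constrained optimal control problem:

$$\min_{u_0,\dots,u_{N-1}} \; \sum_{t=0}^{N-1} \ell(x_t, u_t) + \ell_N(x_N) \quad \text{s.t.} \quad x_{t+1} = f(x_t, u_t), \quad h(x_t, u_t) = 0, \quad g(x_t, u_t) \le 0, \quad t = 0,\dots,N-1,$$

with $x_0$ given. Here, $\ell$ and $\ell_N$ are the stage and terminal costs, $f$ is the (possibly nonlinear) dynamics, $h$ are equality constraints, and $g$ are inequality constraints. Control and state variables are denoted $u_t \in \mathbb{R}^m$, $x_t \in \mathbb{R}^n$.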
The constrained problem is converted to a series of unconstrained surrogates via log-barrier augmentation. For inequality constraints, slack variables $s_t > 0$ are introduced, yielding $g(x_t, u_t) + s_t = 0$, and a barrier term $-\mu \sum_i \log s_t^{(i)}$ is added for barrier parameter $\mu > 0$. Equality constraints are handled through Lagrange multipliers. The algorithm solves a sequence of barrier problems that approach the Karush-Kuhn-Tucker (KKT) conditions of the original problem as $\mu \to 0$ (Cao et al., 2021, Prabhu et al., 18 Sep 2024, Pavlov et al., 2020, Xu et al., 11 Apr 2025, Kim et al., 2022).
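As a minimal illustration of the barrier continuation described above, the sketch below applies it to a hypothetical one-dimensional problem (minimize $(x-2)^2$ subject to $x \le 1$) rather than to a full trajectory. The function names, the starting point, the reduction factor 0.2, and the tolerances are all assumptions chosen for the example, not values taken from the cited papers.

```python
import math

def solve_barrier_path(x=0.0, mu=1.0, kappa=0.2, mu_min=1e-6, tau=0.995):
    """Toy barrier continuation for: min (x - 2)^2  s.t.  x <= 1.

    The slack is s = 1 - x > 0 and the barrier objective is
    (x - 2)^2 - mu * log(1 - x). All parameter values are illustrative.
    """
    path = []
    while mu > mu_min:
        for _ in range(100):                 # damped Newton on the barrier objective
            s = 1.0 - x
            grad = 2.0 * (x - 2.0) + mu / s
            if abs(grad) < 1e-8:
                break
            hess = 2.0 + mu / s**2           # always positive for this problem
            dx = -grad / hess
            # Fraction-to-boundary: keep the new slack >= (1 - tau) * s.
            alpha = 1.0
            if dx > 0.0:
                alpha = min(1.0, tau * s / dx)
            x += alpha * dx
        path.append((mu, x))
        mu *= kappa                          # geometric barrier reduction
    return path

def barrier_minimizer(mu):
    """Closed-form minimizer of the toy barrier problem, used as a check."""
    return (6.0 - math.sqrt(4.0 + 8.0 * mu)) / 4.0

path = solve_barrier_path()
for mu, x in path:
    print(f"mu = {mu:.2e}   x = {x:.8f}   slack = {1.0 - x:.2e}")
```

Each barrier solution stays strictly inside the constraint, and the slack shrinks roughly in proportion to $\mu$, so the iterates approach the constrained minimizer $x = 1$.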
2. Algorithmic Structure and Primal–Dual Updates
The distinguishing algorithmic hallmark of IPDDP is its synthesis of DDP’s backward–forward second-order recursion with primal–dual Newton steps arising from the barrier-augmented Lagrangian. Each DDP sweep consists of:
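- Backward Pass: For each stage $t = N-1$ down to $0$, the local Q-function is constructed as a quadratic expansion in deviations $(\delta x_t, \delta u_t)$ about a nominal trajectory. Incorporating the value function’s Taylor series and the augmented Lagrangian (including barrier and Lagrange terms) yields the required derivatives:
  $$\begin{aligned} Q_x &= \hat{\ell}_x + f_x^\top V'_x, & Q_u &= \hat{\ell}_u + f_u^\top V'_x, \\ Q_{xx} &= \hat{\ell}_{xx} + f_x^\top V'_{xx} f_x + V'_x \cdot f_{xx}, & Q_{ux} &= \hat{\ell}_{ux} + f_u^\top V'_{xx} f_x + V'_x \cdot f_{ux}, \\ Q_{uu} &= \hat{\ell}_{uu} + f_u^\top V'_{xx} f_u + V'_x \cdot f_{uu}, \end{aligned}$$
  where $V'$ is the value function at stage $t+1$, with “$\hat{\cdot}$” denoting inclusion of barrier and dual terms.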
- Solving the Stagewise KKT System: The primal–dual step computes updates to $u_t$ (controls), $s_t$ (slacks), and multipliers for equality and inequality constraints by solving a symmetric indefinite linear system at each stage; its size is on the order of $m + n_h + n_g$, where $m$ is the control dimension, $n_h$ is the number of equality and $n_g$ of inequality constraints (Xu et al., 11 Apr 2025, Prabhu et al., 18 Sep 2024, Kim et al., 2022). For feasible-IPDDP, $g(x_t, u_t) < 0$ and strict primal feasibility is maintained; infeasible-IPDDP admits transitory infeasibility but maintains $s_t > 0$ and $z_t > 0$ (where $z_t$ are dual slacks).
- Forward Pass and Line-Search: The computed affine feedback policy for controls and slacks is applied via a forward rollout, updating the trajectory. Acceptance is determined using a merit function or IPOPT-style filter, testing both cost reduction and constraint violation improvement.
- Barrier Parameter Update: Once the optimum for a given $\mu$ is approached (when optimality and feasibility residuals fall below $\kappa_\epsilon \mu$ for some $\kappa_\epsilon > 0$), $\mu$ is reduced geometrically ($\mu \leftarrow \kappa_\mu \mu$ with a fixed $\kappa_\mu \in (0, 1)$), and the process repeats until the KKT conditions for the original constrained problem are met (Cao et al., 2021, Pavlov et al., 2020, Xu et al., 11 Apr 2025, Prabhu et al., 18 Sep 2024).
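The stagewise primal–dual step in the list above can be sketched for a single stage as follows. This is a simplified illustration under stated assumptions, not the formulation of any cited implementation: it keeps only inequality constraints written as $g(u) + s = 0$, drops the state-feedback terms, and uses a general dense solve where a symmetric indefinite factorization would normally be used. The function name `primal_dual_step` and all numerical values are hypothetical.

```python
import numpy as np

def primal_dual_step(Quu, Qu, G, g, s, z, mu):
    """One perturbed-KKT Newton step at a single stage (feedforward part only).

    Illustrative model: min_u q(u) s.t. g(u) + s = 0, s > 0, with
    gradient Qu, Hessian Quu, constraint Jacobian G, slacks s, duals z.
    Equality constraints and state feedback are omitted for brevity.
    """
    m = Quu.shape[0]
    r_dual = Qu + G.T @ z            # stationarity residual
    r_prim = g + s                   # primal feasibility residual
    # Eliminate ds to obtain a symmetric indefinite system in (du, dz).
    K = np.block([[Quu, G.T],
                  [G, -np.diag(s / z)]])
    rhs = -np.concatenate([r_dual, g + mu / z])
    sol = np.linalg.solve(K, rhs)    # an LDL^T factorization would be typical here
    du, dz = sol[:m], sol[m:]
    ds = -r_prim - G @ du            # recover the slack update
    return du, ds, dz

# Hypothetical stage data: two controls, one inequality constraint.
Quu = np.array([[2.0, 0.0], [0.0, 1.0]])
Qu = np.array([-1.0, 0.5])
G = np.array([[1.0, 1.0]])
g = np.array([-0.3])
s = np.array([0.2])
z = np.array([0.5])
mu = 0.1

du, ds, dz = primal_dual_step(Quu, Qu, G, g, s, z, mu)
print("du =", du, " ds =", ds, " dz =", dz)
```

Eliminating the slack update keeps the linear system symmetric and limits its size to the number of controls plus constraints, which is what keeps the per-stage cost independent of the horizon length.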
3. Computational Complexity and Convergence
IPDDP exhibits per-iteration complexity linear in the horizon length $N$, i.e. $O(N k^3)$, dominated by factorizations of stagewise KKT matrices of size $k = m$ or $k = m + n_h + n_g$ (controls alone, or controls plus constraints, with $m$ the control dimension and $n_h$, $n_g$ the numbers of equality and inequality constraints). This preserves DDP’s favorable scaling with trajectory length. By contrast, general-purpose direct solvers such as IPOPT can scale up to cubically in the total trajectory dimension when the linear algebra does not exploit the stagewise structure, which gives IPDDP a considerable performance advantage for long-horizon or high-dimensional problems (Xu et al., 11 Apr 2025, Cao et al., 2021). Empirically, convergence is typically achieved in a few to a few dozen iterations, with total wall-clock times of tens of milliseconds on modern workstations for the reported horizon lengths (Prabhu et al., 18 Sep 2024, Pavlov et al., 2020, Cao et al., 2021).
Local quadratic convergence is established by showing that the stagewise DDP + Newton step constitutes a local Newton iteration on the perturbed KKT system. Under the linear independence constraint qualification (LICQ), boundedness of KKT system inverses, and strict complementarity, the iterates satisfy:

$$\|w^{k+1} - w^\star\| \le C \, \|w^{k} - w^\star\|^2$$

for some norm $\|\cdot\|$ and constant $C > 0$, up to step-size regularization and numerical precision, where $w^k$ collects the primal–dual iterate at iteration $k$ and $w^\star$ solves the perturbed KKT system (Xu et al., 11 Apr 2025, Pavlov et al., 2020).
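The quadratic rate can be observed numerically on a small example. The sketch below is an illustration only: it runs plain Newton iterations on the perturbed KKT system of a hypothetical scalar problem (minimize $(x-2)^2$ subject to $x + s = 1$, $s > 0$, with dual variable $z$ and a fixed barrier parameter), starting from a point assumed to lie close to the solution. The problem, the starting point, and the barrier value are assumptions made for the example.

```python
import numpy as np

MU = 0.1  # fixed barrier parameter (illustrative value)

def residual(w):
    x, s, z = w
    return np.array([2.0 * (x - 2.0) + z,   # stationarity
                     x + s - 1.0,           # primal feasibility: x + s = 1
                     s * z - MU])           # perturbed complementarity

def jacobian(w):
    x, s, z = w
    return np.array([[2.0, 0.0, 1.0],
                     [1.0, 1.0, 0.0],
                     [0.0, z,   s]])

# Exact solution of the perturbed KKT system for this toy problem.
s_star = (-1.0 + np.sqrt(1.0 + 2.0 * MU)) / 2.0
w_star = np.array([1.0 - s_star, s_star, 2.0 * (1.0 + s_star)])

w = np.array([0.9, 0.1, 2.0])               # start close to the solution
errors = [np.linalg.norm(w - w_star)]
for _ in range(5):
    w = w + np.linalg.solve(jacobian(w), -residual(w))
    errors.append(np.linalg.norm(w - w_star))

for k, e in enumerate(errors):
    print(f"iteration {k}: error = {e:.3e}")
```

The printed error is roughly squared from one iteration to the next until it reaches floating-point precision, which is the behavior the bound above describes.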
4. Comparison with Other Constrained DDP Methods
IPDDP diverges fundamentally from active-set or augmented-Lagrangian DDP extensions by continuously enforcing feasibility via interior-point penalties rather than combinatorial switching or penalty terms. Compared to classic CLDDP (control-limited DDP, which handles box constraints on the controls via a box-QP), IPDDP:
- Handles general (nonlinear) equality and inequality constraints without recourse to solving many subproblems or managing constraint activity patterns.
- Exhibits more predictable convergence properties—avoiding phenomena such as “staircase” progress or stagnation near the boundary, as observed with relaxed log-barrier DDP.
- Accepts large primal-dual steps after each barrier reduction, in contrast to the shrinking steps required by standard log-barrier-only methods (Pavlov et al., 2020, Prabhu et al., 18 Sep 2024).
Tables from benchmarking studies demonstrate competitive or superior performance of IPDDP2 relative to IPOPT and AL–iLQR, especially on hard, hybrid, or contact-implicit problems:
| Problem | Method | Iterations | Cost | Constraint violation | Wall time (ms) | Solver time (ms) |
|---|---|---|---|---|---|---|
| Car obstacle avoidance | IPOPT | 51 | 23.97 | | 104 | 71 |
| Car obstacle avoidance | IPDDP2 | 73 | 19.26 | | 97 | 78 |
| Cartpole swing-up | IPOPT | 35 | 0.1253 | | 30 | 20.5 |
| Cartpole swing-up | IPDDP2 | 33 | 0.1253 | | 13 | 11.6 |
Further, AL–iLQR methods can fail on complementarity/contact cases that IPDDP handles robustly, owing to its perturbed (relaxed) complementary slackness conditions (Xu et al., 11 Apr 2025).
5. Implementation Techniques and Practical Stabilization
Several recent implementations of IPDDP demonstrate the following best practices:
- Preallocation of arrays for trajectories, derivatives, and multipliers for efficient memory access.
- Use of symbolic differentiation tools (such as Symbolics.jl) to supply exact second derivatives for the cost, dynamics, and constraints (Xu et al., 11 Apr 2025).
- Regularization of $Q_{uu}$ or enlarged KKT blocks via Levenberg–Marquardt diagonal shifts to guarantee positive definiteness and stable Cholesky/LDL factorizations (Prabhu et al., 18 Sep 2024, Kim et al., 2022).
- IPOPT-style filter line search, where a trial iterate is accepted if either the cost or the constraint violation improves sufficiently relative to the filter entries. Inertia correction of the KKT system, based on a Bunch–Kaufman factorization with rook pivoting, ensures the correct inertia (counts of positive and negative eigenvalues) of the indefinite system (Xu et al., 11 Apr 2025).
- Fraction-to-boundary rules keep iterates strictly in the interior of the feasible region, i.e., $s_t + \alpha\,\delta s_t \ge (1 - \tau)\, s_t$ for step size $\alpha$, for small $1 - \tau > 0$.
- Efficient implementations yield wall times in the range of 10–50 ms per DDP sweep on standard CPUs for the reported horizon lengths and moderate control/state dimensions (Cao et al., 2021, Prabhu et al., 18 Sep 2024).
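Two of the stabilization devices listed above, the diagonal regularization shift and the fraction-to-boundary rule, can be sketched in a few lines. The helper names `regularized_cholesky` and `fraction_to_boundary`, the initial shift, the growth factor, and the value of `tau` are assumptions chosen for illustration and are not taken from the cited implementations.

```python
import numpy as np

def regularized_cholesky(Q, reg0=1e-8, growth=10.0, max_tries=20):
    """Add a growing diagonal shift until Q + reg * I admits a Cholesky factor."""
    reg = 0.0
    eye = np.eye(Q.shape[0])
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(Q + reg * eye), reg
        except np.linalg.LinAlgError:
            reg = reg0 if reg == 0.0 else reg * growth
    raise RuntimeError("regularization failed to produce a positive definite matrix")

def fraction_to_boundary(s, ds, tau=0.995):
    """Largest alpha in (0, 1] with s + alpha * ds >= (1 - tau) * s elementwise."""
    shrinking = ds < 0.0
    if not np.any(shrinking):
        return 1.0
    return float(min(1.0, np.min(-tau * s[shrinking] / ds[shrinking])))

Quu = np.array([[1.0, 0.0], [0.0, -0.5]])   # indefinite example block
L, reg = regularized_cholesky(Quu)
alpha = fraction_to_boundary(np.array([1.0, 0.5]), np.array([-2.0, 0.1]))
print(f"shift = {reg:g}, step size = {alpha:.4f}")
```

The shift grows only until the factorization succeeds, so well-conditioned stages are left unmodified, and the step-size rule only restricts components of the slack that are decreasing.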
6. Application Domains and Representative Experiments
IPDDP has been demonstrated on a range of robotic and process control benchmarks:
- Differentially flat systems and trajectory generation: Smooth, segment-constrained polynomial trajectories for drone or vehicle path planning, where IPDDP is used to jointly optimize polynomial coefficients and segment times subject to box or polyhedral collision/actuator bounds (Cao et al., 2021).
- Autonomous robotics: Collision-free planning for mobile robots and quadrotors, via hybrid MPPI-IPDDP approaches—first obtaining coarse exploratory trajectories and then performing local smoothing within a convex corridor using IPDDP (Kim et al., 2022).
- Contact-implicit motion and hybrid systems: Robust trajectory optimization for acrobots or block-pushing systems with unilateral constraints and joint limit impulses, scenarios where AL–iLQR and general constrained DDP methods can fail or stagnate (Xu et al., 11 Apr 2025).
- Classical control: Nonlinear and constrained problems such as inverted pendulum, continuously stirred tank reactors, and obstacle-avoidance for unicycle or car models. IPDDP converges within a few tens of DDP sweeps, with all constraints satisfied to within the reported tolerances (Pavlov et al., 2020, Prabhu et al., 18 Sep 2024).
7. Extensions and Contemporary Directions
Recent developments focus on algorithmic generality, exploiting system structure, and robust globalization:
- Structure-exploiting IPDDP2 enables fast solution of high-dimensional or contact-implicit robotic OCPs via specialized Julia implementations (InteriorPointDDP.jl), manipulating compact stagewise KKT systems rather than direct transcriptions (Xu et al., 11 Apr 2025).
- Hybrid methods, such as MPPI-IPDDP, combine sampling-based exploration for nonconvex, high-dimensional space coverage with IPDDP for local trajectory refinement, yielding improved reliability and smoothness over pure sampling solvers (Kim et al., 2022).
- Regularization and filter globalization strategies inspired by recent advances in interior-point NLP solvers (e.g., IPOPT) are critical for robustness on challenging problems and from infeasible or remote initializations (Kim et al., 2022, Xu et al., 11 Apr 2025).
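The filter globalization mentioned above can be illustrated with a simplified acceptance test. This is a sketch in the spirit of IPOPT-style filters, not the exact rule of any cited solver: the function names and the margin `gamma` are assumptions, and practical filters add further conditions (such as switching and sufficient-decrease tests) that are omitted here.

```python
def acceptable_to_filter(filter_entries, cost, violation, gamma=1e-5):
    """Simplified filter test: the trial point must sufficiently improve either
    the cost or the constraint violation relative to every stored entry."""
    return all(
        cost <= c - gamma * v or violation <= (1.0 - gamma) * v
        for c, v in filter_entries
    )

def add_to_filter(filter_entries, cost, violation):
    """Insert a new pair and drop entries that it dominates."""
    kept = [(c, v) for c, v in filter_entries if c < cost or v < violation]
    return kept + [(cost, violation)]

entries = [(10.0, 1.0)]                          # (cost, constraint violation) pairs
print(acceptable_to_filter(entries, 9.0, 2.0))   # trial with lower cost
print(acceptable_to_filter(entries, 11.0, 0.5))  # trial with lower violation
print(acceptable_to_filter(entries, 11.0, 2.0))  # trial worse in both measures
```

Treating cost and violation as two separate criteria avoids having to tune a penalty weight that trades one against the other, which is the usual motivation for preferring a filter to a single merit function.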
A plausible implication is that IPDDP’s flexibility and efficiency make it a natural candidate for embedded, real-time, and contact-rich robotic control applications. These methods are now competitive alternatives to both classical sequential quadratic programming and augmented-Lagrangian-based DDP in academic and applied settings.