
Sequential Quadratic Programming (SQP)

Updated 9 December 2025
  • Sequential Quadratic Programming (SQP) is an iterative method that solves nonlinear constrained optimization problems by approximating them with quadratic subproblems.
  • It uses second-order derivative approximations and linearized constraints to generate effective search directions and achieve local quadratic convergence under regularity conditions.
  • Advanced SQP variants integrate merit functions, sampling strategies, and regularization techniques to handle nonsmooth, stochastic, and large-scale optimization challenges.

Sequential Quadratic Programming (SQP) is a class of iterative algorithms for nonlinear constrained optimization, recognized for their strong local convergence properties and wide applicability to smooth, nonconvex, and increasingly nonsmooth and stochastic programming problems. The core principle is to approximate the nonlinear optimization problem at each iteration by a quadratic programming (QP) subproblem that models the second-order local behavior of the Lagrangian and linearizes the constraints. Advanced variants exploit model reduction, merit functions, relaxed feasibility, and modern sampling strategies to address ill-conditioning, nonsmoothness, or high computational cost.

1. Mathematical Foundations and Subproblem Structure

In the canonical SQP framework, the general nonlinear program considered is

\min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad h(x) = 0, \quad g(x) \geq 0,

where $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is the objective, and $h: \mathbb{R}^n \rightarrow \mathbb{R}^{m_e}$, $g: \mathbb{R}^n \rightarrow \mathbb{R}^{m_i}$ are the equality and inequality constraints, respectively. At each major iterate $x_k$, a QP is formed by expanding the Lagrangian to second order and linearizing the constraints:

\begin{aligned}
\min_{d \in \mathbb{R}^n} \quad & \tfrac{1}{2} d^T B_k d + \nabla f(x_k)^T d, \\
\text{s.t.} \quad & \nabla h(x_k)^T d + h(x_k) = 0, \\
& \nabla g(x_k)^T d + g(x_k) \geq 0,
\end{aligned}

where $B_k$ is a positive-definite (or semi-definite) approximation to $\nabla_{xx}^2 L(x_k, \lambda_k)$, the Hessian of the Lagrangian. The solution $d_k$ provides a search direction, and a line search or trust-region scheme determines the step length.
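For the equality-constrained case, one SQP iteration reduces to a Newton step on the KKT system of the QP above. The following is a minimal illustrative sketch in NumPy; the toy problem and the exact Lagrangian Hessian are assumptions chosen for the example, not taken from the cited papers:

```python
import numpy as np

def sqp_equality(f_grad, h, h_jac, lagr_hess, x0, lam0, iters=20):
    """Basic SQP for min f(x) s.t. h(x) = 0: each iteration solves the
    QP's KKT system [[B, J^T], [J, 0]] [d, lam+] = [-grad f, -h]."""
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    for _ in range(iters):
        g, c, J = f_grad(x), h(x), h_jac(x)
        B = lagr_hess(x, lam)
        n, m = x.size, c.size
        K = np.block([[B, J.T], [J, np.zeros((m, m))]])
        rhs = np.concatenate([-g, -c])
        sol = np.linalg.solve(K, rhs)
        d, lam = sol[:n], sol[n:]      # full Newton step and new multipliers
        x = x + d
    return x, lam

# Toy problem: min x1^2 + x2^2  s.t.  x1^2 + x2 - 1 = 0
f_grad = lambda x: 2.0 * x
h = lambda x: np.array([x[0]**2 + x[1] - 1.0])
h_jac = lambda x: np.array([[2.0 * x[0], 1.0]])
lagr_hess = lambda x, lam: np.diag([2.0 + 2.0 * lam[0], 2.0])
x_star, lam_star = sqp_equality(f_grad, h, h_jac, lagr_hess, [1.0, 1.0], [0.0])
```

With the exact Hessian, the iterates converge quadratically to the constrained minimizer $(1/\sqrt{2},\, 1/2)$ with multiplier $\lambda^* = -1$.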

For equality constraints only, or in abstract Banach space settings, the KKT system corresponding to the QP can be solved directly (Nguyen et al., 2017; Yamakawa, 15 Mar 2025), resulting in fast Newton-type or stabilized Newton-type updates. In nonsmooth or upper-$\mathcal{C}^2$ settings, subgradients or generalized subdifferentials replace classical gradients (Wang et al., 2023).

2. Convergence Theory and Stabilization Techniques

Local quadratic convergence of SQP is guaranteed if the Hessian approximation approaches the true second derivative of the Lagrangian and standard regularity conditions (e.g., the Linear Independence Constraint Qualification and Second-Order Sufficient Conditions) hold (Nguyen et al., 2017; Yamakawa, 15 Mar 2025). When high-quality Hessians are not available or the problem is degenerate, stabilized or relaxed variants, such as introducing a positive regularization term or solving a stabilized subproblem of the form

\min_{d, \mu}\left\{ \langle f'(x_k), d \rangle + \tfrac{1}{2} \langle L_{xx}(x_k, \lambda_k)\, d, d \rangle + \frac{\sigma_k}{2} \|\mu\|^2 \right\}

with shifted constraints, ensure solvability and robust convergence in Banach spaces and degenerate finite-dimensional problems (Yamakawa, 15 Mar 2025).

A simple, effective remedy for poor Hessian approximations is the direction interpolation method. Here, the step is a convex combination of the standard SQP direction and a purely feasible direction:

\Delta x_k = \alpha\, \Delta x_k^{\text{SQP}} + (1 - \alpha)\, \Delta x_k^f,

where $\Delta x_k^f$ is a minimum-norm feasible direction (Nguyen et al., 2017). This approach ensures local linear convergence regardless of Hessian accuracy and can be tuned by adjusting $\alpha$.
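A minimal sketch of the interpolated step, assuming the feasible direction is taken as the least-norm solution of the linearized constraints $J d = -c$ (the Jacobian `J`, residual `c`, and the numbers below are illustrative, not from the cited paper):

```python
import numpy as np

def interpolated_step(d_sqp, J, c, alpha):
    """Convex combination of the SQP direction and a minimum-norm direction
    that restores linearized feasibility (J d + c = 0)."""
    d_feas = -J.T @ np.linalg.solve(J @ J.T, c)  # least-norm solution of J d = -c
    return alpha * d_sqp + (1.0 - alpha) * d_feas

# Illustrative data: one equality constraint linearized at the current iterate
J = np.array([[2.0, 1.0]])       # constraint Jacobian
c = np.array([1.0])              # constraint value h(x_k)
d_sqp = np.array([-0.2, -0.6])   # SQP direction from the QP subproblem
d = interpolated_step(d_sqp, J, c, alpha=0.5)
```

Since both components satisfy the linearized constraints, any convex combination does too, so feasibility of the model is preserved while $\alpha$ trades off optimality information against robustness.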

Global convergence is typically achieved by the use of merit functions—most often an exact penalty of the form $\phi(x; \rho) = f(x) + \rho \|c(x)\|_1$ with adaptive penalty parameter updates—and adequate line search rules (Grundvig et al., 8 Jul 2025; Joshy et al., 5 Dec 2025). For nonsmooth or nonconvex upper-$\mathcal{C}^2$ objectives, convergence to generalized KKT points is established via potential/Kurdyka–Łojasiewicz arguments (Wang et al., 2023).
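The $\ell_1$-merit globalization can be sketched as Armijo backtracking on $\phi(x;\rho)$. The directional-derivative bound `D` used below is a standard upper estimate for linearized-feasible directions, and the toy problem is an illustrative assumption:

```python
import numpy as np

def l1_merit(fx, hx, rho):
    """Exact l1 penalty merit: phi(x; rho) = f(x) + rho * ||h(x)||_1."""
    return fx + rho * np.abs(hx).sum()

def backtrack(f, h, x, d, grad_f, rho, c1=1e-4, shrink=0.5, max_iter=30):
    """Armijo backtracking on the l1 merit along direction d."""
    phi0 = l1_merit(f(x), h(x), rho)
    D = grad_f(x) @ d - rho * np.abs(h(x)).sum()  # merit directional-derivative bound
    t = 1.0
    for _ in range(max_iter):
        if l1_merit(f(x + t * d), h(x + t * d), rho) <= phi0 + c1 * t * D:
            return t
        t *= shrink
    return t

f = lambda x: x[0]**2 + x[1]**2
h = lambda x: np.array([x[0]**2 + x[1] - 1.0])
grad_f = lambda x: 2.0 * x
x = np.array([1.0, 1.0])
d = np.array([-0.2, -0.6])   # SQP direction at x
t = backtrack(f, h, x, d, grad_f, rho=2.0)
```

Here the full step is accepted ($t = 1$), which is the behavior needed to preserve fast local convergence once iterates are near a solution.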

3. Model Inexactness and Function Approximations

For problems where function and gradient evaluations are computationally expensive, such as PDE-constrained optimization, SQP can operate on reduced-order or surrogate models subject to certified error bounds. The generalized $\ell_1$-merit SQP algorithm requires the approximations $m_k(x) \approx f(x)$ and $h_k(x) \approx c(x)$ to meet controlled error tolerances at each iteration: absolute value-error bounds of the form

|f(x) - m_k(x)| \leq M_f\, e^f_k(x),

together with relative gradient-error bounds on $\|\nabla m_k(x_k) - \nabla f(x_k)\|$ governed by tolerances $\tau^{f,g}_k$, all driven to zero according to algorithmic progress (Grundvig et al., 8 Jul 2025). Acceptance of steps and models is enforced via error-aware sufficient decrease conditions, guaranteeing that model-based iterates converge to true problem solutions under mild regularity.

In practice, this enables a dramatic reduction in costly function/gradient computations (e.g., evaluating the Boussinesq system for flow control), with reduced-order models constructed by projection/snapshot techniques. The trust region for the reduced model is dynamically adjusted based on online error estimation.
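One common way to realize such a dynamic adjustment is the standard actual-versus-predicted reduction ratio; the thresholds `eta_low`/`eta_high` below are conventional illustrative values, not taken from the cited paper:

```python
def update_radius(actual_decrease, predicted_decrease, radius,
                  eta_low=0.1, eta_high=0.75, shrink=0.5, grow=2.0):
    """Ratio test: shrink the model trust region when the surrogate
    over-promises, expand it when the surrogate tracks the truth well."""
    rho = actual_decrease / predicted_decrease
    if rho < eta_low:    # poor agreement: distrust the surrogate
        return radius * shrink
    if rho > eta_high:   # good agreement: allow larger surrogate steps
        return radius * grow
    return radius

r1 = update_radius(0.01, 1.0, radius=1.0)  # surrogate over-promised
r2 = update_radius(0.9, 1.0, radius=1.0)   # surrogate was accurate
```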

4. SQP in Nonsmooth, Multi-objective, and Stochastic Optimization

SQP has been generalized to handle nonconvex, nonsmooth, and multi-objective programs:

  • For upper-$\mathcal{C}^2$ objectives and nonsmooth constraint sets, the subproblem employs generalized gradients and penalizes infeasibility via exact penalty terms, with global and local convergence analyzed via KL-potentials and subanalyticity (Wang et al., 2023).
  • In multi-objective optimization, linearly combining descent and feasibility via a QP in extended variables (with a slack parameter) allows simultaneous descent in all objectives. A scalar penalty parameter penalizes constraint violation, and global convergence to weak or strong Pareto-optimal points is established (Ansary et al., 2018).
  • Stochastic and sample-based variants incorporate adaptive sampling for function/gradient estimates. Convergence in expectation or with high probability is ensured under variance reduction and adaptive sampling rules (Wang et al., 2023, Berahas et al., 2022, Na et al., 2021). In these methods, both sample size and solution accuracy of subproblems are dynamically adjusted based on estimated variances and inexactness controls, providing efficiency and robustness in large-scale stochastic settings.
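The adaptive-sampling idea in the last bullet can be sketched with a simple "norm test": grow the batch until the estimated standard error of the averaged gradient is small relative to its norm. The noisy-gradient model and constants below are illustrative assumptions:

```python
import numpy as np

def adaptive_sample_gradient(grad_sample, rng, n0=8, theta=0.5, n_max=4096):
    """Grow the sample size until the squared standard error of the mean
    gradient is at most (theta * ||mean gradient||)^2, a basic norm test."""
    n = n0
    while True:
        g_i = np.stack([grad_sample(rng) for _ in range(n)])
        g_bar = g_i.mean(axis=0)
        se2 = g_i.var(axis=0, ddof=1).sum() / n   # squared standard error
        if se2 <= (theta * np.linalg.norm(g_bar)) ** 2 or n >= n_max:
            return g_bar, n
        n *= 2

# Illustrative: noisy gradient of f(x) = ||x||^2 / 2 at x = (3, 4)
x = np.array([3.0, 4.0])
rng = np.random.default_rng(0)
grad_sample = lambda rng: x + rng.normal(scale=1.0, size=2)
g_hat, n_used = adaptive_sample_gradient(grad_sample, rng)
```

Far from stationarity the gradient norm is large, so small batches pass the test cheaply; the sample size grows only as the iterates approach a solution and the signal-to-noise ratio drops.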

5. Algorithmic Enhancements and Large-scale Implementation

Modern SQP implementations integrate a range of enhancements:

  • Merit and Filter Functions: Advanced merit functions, such as the smooth augmented Lagrangian or composite filter-penalty functions, allow robust globalization and line search (Joshy et al., 5 Dec 2025).
  • Hessian Approximations: Quasi-Newton BFGS or limited-memory updates are default, but exact Hessians or reduced second-order surrogates (e.g., in log-space or control applications) can be swapped for robustness or structure exploitation (Karcher, 2021, Abhijeet et al., 3 Oct 2025).
  • Relaxation and Regularization: Hybrid relaxation techniques for inconsistent subproblems (e.g., combining modified-Powell and Nowak relaxations for path feasibility) yield superior robustness on ill-conditioned or degenerate NLPs (Ma et al., 16 Feb 2024).
  • Sparse and Decomposed QPs: Large-scale problems (e.g., in optimal power flow or control) require efficient solution of large, sparse QP subproblems. Active-set, ADMM, and partitioned solves on GPUs are key to scaling to tens of thousands of variables and achieving real-time performance (Li et al., 2023; Verheijen et al., 2023).
  • Module-based Architectures: Recent frameworks (e.g., OpenSQP) expose modular interfaces to allow swapping of merit functions, Hessian updates, QP solvers, and even variable transformations to exploit log-convex models or custom constraints (Joshy et al., 5 Dec 2025, Karcher, 2021).
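As an example of the Hessian-approximation bullet, the widely used Powell-damped BFGS update keeps the quasi-Newton matrix positive definite even when the raw curvature condition $s^T y > 0$ fails (common near active constraints); the damping constant 0.2 is the conventional choice:

```python
import numpy as np

def damped_bfgs_update(B, s, y, damping=0.2):
    """Powell-damped BFGS: blend y with B s so the curvature condition
    s^T y_hat >= damping * s^T B s holds, keeping B positive definite."""
    sBs = s @ B @ s
    if s @ y < damping * sBs:
        theta = (1.0 - damping) * sBs / (sBs - s @ y)
        y = theta * y + (1.0 - theta) * (B @ s)   # damped secant vector
    Bs = B @ s
    return B - np.outer(Bs, Bs) / sBs + np.outer(y, y) / (s @ y)

# Negative-curvature pair: an undamped BFGS update would destroy definiteness
B_new = damped_bfgs_update(np.eye(2), np.array([1.0, 0.0]),
                           np.array([-1.0, 0.0]))
```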

6. Applications and Empirical Performance

SQP algorithms have demonstrated strong empirical results across domains:

  • In PDE-constrained control, projection-based reduced-order models within SQP yield order-of-magnitude speedups and allow tractable solution to problems with large discretizations (Grundvig et al., 8 Jul 2025).
  • GPU-accelerated and decomposed SQP methods approach or surpass the performance of state-of-the-art interior point solvers on large ACOPF instances, with warm-start and high-accuracy solutions (Li et al., 2023).
  • Real-time control tasks (MPC, MPCC) benefit from parallel shooting and LPV-embedded SQP, achieving 30–40% reductions in major iteration count and low single-step latency compatible with hardware-in-the-loop operation (Verheijen et al., 2023, Floch et al., 12 Nov 2025).
  • Stochastic, nonsmooth, and multi-objective extensions enable SQP to efficiently handle realistic nonconvexities, stochasticity, and constraint coupling, as evidenced by results on high-dimensional regression, power-grid optimization, and process design (Wang et al., 2023, Ma et al., 16 Feb 2024).
  • Modular, open-source SQP implementations achieve performance competitive with established solvers (SNOPT, IPOPT, SLSQP) on CUTEst test sets and support user-driven tailoring for specialized requirements (Joshy et al., 5 Dec 2025).

7. Future Directions and Quantum-accelerated SQP

Emergent directions include the integration of quantum linear algebra as an inner solver (notably for the Schur complement in barrier-SQP) (Dehaghani et al., 20 Oct 2025). By leveraging block-encoding and Quantum Singular Value Transformation, these methods offer polylogarithmic complexity in problem dimension for core linear algebra operations, leading to exponential speedup potential in sufficiently large nonlinear control problems, subject to quantum hardware limitations. The hybrid classical/quantum SQP retains local input-to-state stability and global convergence to a neighborhood of the KKT point, with explicit error bounds in terms of quantum solver accuracy and barrier parameters (Dehaghani et al., 20 Oct 2025).

Another trend is unification of SQP and LPV-based model predictive control through the lens of the differential Fundamental Theorem of Calculus embedding, which enables identical subproblem structure and analysis under certain choices of LPV scheduling (Floch et al., 12 Nov 2025). This framework systematically enables hybridization of classical SQP, zero-order propagation, and exact LPV embeddings for computational and robustness gains in real-time MPC.

Ongoing research is focused on: further reducing the cost of function/gradient evaluations via structure-exploiting surrogates; developing fully certified inexact and sample-based global convergence theories; and scalable, reconfigurable solver architectures that natively accommodate nonsmoothness, stochasticity, and domain-specific structure.
