S-HLSP: Hierarchical Least-Squares Optimization
- S-HLSP is a structured optimization framework that solves lexicographic multi-level least-squares programs by hierarchically prioritizing objectives and constraints.
- It leverages advanced techniques such as hierarchical Newton/Gauss–Newton linearizations, adaptive step filters, and nullspace-based solvers to attain global convergence and computational efficiency.
- Its applications span robot control, trajectory optimization, and non-linear optimal control, where maintaining strict priority ordering is critical for performance.
Sequential Hierarchical Least-Squares Programming (S-HLSP) is a structured optimization framework for solving lexicographic (prioritized) multi-level least-squares programs, especially tailored for applications requiring strict prioritization of objectives and constraints, such as robot control, trajectory optimization, and prioritized non-linear optimal control. S-HLSP operates by stacking multiple least-squares subproblems in a strict hierarchy, ensuring that each lower-priority objective is optimized only over the set of solutions that preserve the optimal (in)feasibility achieved by all higher priorities. Advanced algorithmic components—including hierarchical Newton/Gauss–Newton linearizations, adaptive step filters, and efficient nullspace-based interior-point or ADMM solvers—enable S-HLSP to achieve both global convergence and high computational efficiency in large-scale and structured instances (Pfeiffer et al., 2024, Pfeiffer et al., 2023).
1. Formal Problem Statement and Lexicographic Structure
S-HLSP addresses hierarchical least-squares programs where levels of quadratic objectives are stacked lexicographically:

$$\begin{aligned}
\operatorname*{lex\,min}_{x,\,v_1,\ldots,\,v_p}\;&\Big( \tfrac{1}{2}\|v_{\mathbb{E}_1}\|^2,\; \tfrac{1}{2}\|v_{\mathbb{I}_1}\|^2,\;\ldots,\; \tfrac{1}{2}\|v_{\mathbb{E}_p}\|^2,\; \tfrac{1}{2}\|v_{\mathbb{I}_p}\|^2 \Big) \\
\text{subject to: }&
\begin{cases}
f_{\mathbb{E}_l}(x) = v_{\mathbb{E}_l},\\
f_{\mathbb{I}_l}(x) \leq v_{\mathbb{I}_l},\\
f_{\mathbb{E}_{\cup l-1}}(x) = v_{\mathbb{E}_{\cup l-1}}^*,\\
f_{\mathbb{I}_{\cup l-1}}(x) \leq v_{\mathbb{I}_{\cup l-1}}^*,
\end{cases}
\qquad l = 1, \ldots, p.
\end{aligned}$$

At each hierarchy level $l$, the solution preserves (within strict feasibility or optimal infeasibility) the best possible values $v_{\cup l-1}^*$ of the upper priorities, and seeks a least-squares minimization on the new slack variables $v_{\mathbb{E}_l}, v_{\mathbb{I}_l}$. This structure is critical for tasks such as task-prioritized motion generation in robotics or sequential convexification in non-linear optimal control (Pfeiffer et al., 2024, Pfeiffer et al., 2023).
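To make the lexicographic structure concrete, the following is a minimal sketch of hierarchical least squares for equality-only levels $A_l x = b_l$: each level is solved in the nullspace of all higher-priority levels, so lower priorities can never degrade higher ones. The nullspace is obtained here with a dense SVD for clarity; `hierarchical_lsq` and its tolerance are illustrative, not the solvers from the cited papers.

```python
import numpy as np

def hierarchical_lsq(levels, n):
    """Lexicographic least squares for equality tasks A_l x = b_l (sketch).

    Each level is minimized over the nullspace of all previously solved
    levels, so lower priorities cannot perturb higher-priority residuals.
    """
    x = np.zeros(n)
    Z = np.eye(n)                      # basis of directions still free
    for A, b in levels:
        AZ = A @ Z
        r = b - A @ x                  # residual of this level at current x
        z, *_ = np.linalg.lstsq(AZ, r, rcond=None)
        x = x + Z @ z                  # step stays in higher-priority nullspace
        # shrink the free subspace by the directions this level now fixes
        _, s, Vt = np.linalg.svd(AZ)
        rank = int((s > 1e-10).sum())
        Z = Z @ Vt[rank:].T
        if Z.shape[1] == 0:
            break
    return x

# Two conflicting levels in R^2: level 1 enforces x0 + x1 = 1 (priority),
# level 2 would like x0 = x1 = 0 (resolved in the leftover nullspace).
levels = [(np.array([[1.0, 1.0]]), np.array([1.0])),
          (np.eye(2), np.zeros(2))]
x = hierarchical_lsq(levels, 2)
print(x)  # ≈ [0.5, 0.5]: level 1 holds exactly, level 2 in least squares
```

The same nullspace recursion underlies the HLSP subsolvers discussed below; the full S-HLSP additionally handles inequalities via slacks and resolves the levels within a trust region.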
Linearizations are performed at each sequential outer iteration using hierarchical Newton or Gauss–Newton steps, reducing the original non-linear problem to a series of hierarchical linear least-squares subproblems (HLSP) on the step $\Delta x$.
2. Algorithmic Framework: Sequential HLSP and Step-Filters
Each S-HLSP iteration proceeds as follows:
- Outer iteration: Linearize the non-linear HLSP (NL-HLSP) about the current iterate $x$.
- Subproblem formation: Formulate a linear, lexicographic HLSP on the step and slack variables $(\Delta x, v)$, with blockwise structure reflecting the task prioritization.
- Hierarchical step-filter (HSF): For each priority $l$, maintain a filter of pairs $(h_l, \phi_l)$, where $h_l$ measures the constraint violation with respect to the higher priorities and $\phi_l$ the objective of level $l$. A new candidate step is accepted only if it is not dominated by any stored pair, i.e., it sufficiently improves either feasibility $h_l$ or optimality $\phi_l$; accepted pairs augment the filter.
- Trust region adaptation: The trust-region radius $\Delta_l$ at each level is dynamically adjusted according to step acceptance, balancing global convergence guarantees and efficient progress (Pfeiffer et al., 2023).
High-Level Algorithm (paraphrased from (Pfeiffer et al., 2023))
- Initialize $x$ and a large initial trust-region radius $\Delta$.
- For each level $l = 1, \ldots, p$:
- Solve HLSPs (with trust region) using subsolvers, applying HSF.
- Accept/reject steps, adapt $\Delta_l$, update the filter.
- Upon convergence at level $l$, propagate the feasible/optimal slacks $v_l^*$ to level $l+1$.
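The outer loop above can be sketched for one priority level as follows. The filter test and margin `gamma`, the helper names (`solve_hlsp`, `h_of`, `f_of`), and the radius-update constants are illustrative simplifications of the step-filter in the cited work, not its exact rules.

```python
import numpy as np

def filter_acceptable(filt, h, f, gamma=1e-4):
    """Fletcher–Leyffer-style filter test for one priority level (sketch).

    A candidate (h, f) = (constraint violation, objective) is accepted if
    it sufficiently improves h or f against every stored filter pair.
    """
    return all(h <= (1 - gamma) * hj or f <= fj - gamma * h
               for hj, fj in filt)

def trust_region_step(x, solve_hlsp, h_of, f_of, radius=1.0, iters=20):
    """One level of the S-HLSP outer loop: solve, filter, adapt radius."""
    filt = []
    for _ in range(iters):
        dx = solve_hlsp(x, radius)          # HLSP step within trust region
        h, f = h_of(x + dx), f_of(x + dx)
        if filter_acceptable(filt, h, f):
            filt.append((h, f))             # augment filter, accept step
            x = x + dx
            radius = min(2.0 * radius, 10.0)
        else:
            radius *= 0.5                   # reject: shrink trust region
    return x

# Toy usage: drive x toward 1 with radius-clipped steps; h ≡ 0 (feasible)
x = trust_region_step(np.array([0.0]),
                      lambda x, r: np.clip(np.array([1.0]) - x, -r, r),
                      lambda x: 0.0,
                      lambda x: float((x - 1.0) @ (x - 1.0)))
print(x)  # ≈ [1.0]
```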
3. Hierarchical Newton/Gauss–Newton and Threshold Adaptation
The core of S-HLSP is the choice of local model for each HLSP subproblem:
- Hierarchical Newton: Utilize full second-order (Lagrangian) information, regularized for stability.
- Hierarchical Gauss–Newton: Drop second-order terms, using only the first-order model $J_l^\top J_l$ built from the task Jacobians.
An adaptive per-level threshold governs the choice: if the current slack norm $\|v_l^*\|$ exceeds the threshold, Newton regularization is activated; otherwise, the Gauss–Newton approximation is used. This approach ensures numerical stability when infeasibilities remain, while avoiding unnecessary variable locking at lower levels when constraints are already nearly satisfied. The threshold is automatically adapted based on filter progress and rejection counts (Pfeiffer et al., 2024).
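The switching rule can be sketched as a per-level model selector. The threshold `nu`, the regularization constant, and the callback `hess_residuals` (which stands in for the second-order term $\sum_i r_i \nabla^2 f_i$) are illustrative assumptions, not the adaptive rule of the paper.

```python
import numpy as np

def level_hessian(J, r, hess_residuals, nu=1e-3):
    """Choose the local model for one priority level (sketch).

    If the level's slack norm ||r|| exceeds the threshold nu, add a
    regularized second-order term (hierarchical Newton); otherwise fall
    back to the plain Gauss-Newton approximation J^T J.
    """
    H_gn = J.T @ J
    if np.linalg.norm(r) > nu:
        H = H_gn + hess_residuals(r)      # Newton: keep curvature of residuals
        return H + 1e-8 * np.eye(H.shape[0])  # Tikhonov term for stability
    return H_gn                           # Gauss-Newton: first-order only

# Near-feasible level (tiny slack): Gauss-Newton model is used
H_small = level_hessian(np.eye(2), np.zeros(2), lambda r: np.eye(2))
# Infeasible level (large slack): regularized Newton model is used
H_large = level_hessian(np.eye(2), np.array([1.0, 0.0]), lambda r: np.eye(2))
```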
4. HLSP Solvers: Nullspace-Projected ADMM and Interior-Point Methods
Two major classes of algorithms efficiently solve the HLSP subproblems:
- NADM2 (Nullspace ADMM): Projects the HLSP onto the nullspace of the active higher-priority constraints using a computationally efficient turnback algorithm. The projected system is then solved via an alternating direction method of multipliers (ADMM) in the reduced nullspace coordinates, exploiting closed-form slack projections and dual updates.
- s-NIPM-HLSP (Sparse Reduced-Hessian IPM): Employs a reduced-Hessian interior-point method in nullspace coordinates for each subproblem, using the turnback algorithm to exploit the banded structure of Jacobians in large-scale optimal control.
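The ADMM pattern of the first solver class — a cached linear solve, a closed-form slack projection, and a dual update — can be sketched on a box-constrained least-squares toy. The problem, parameters (`rho`, iteration count), and function name are illustrative; the actual NADM2 iteration operates on the projected hierarchical subproblem.

```python
import numpy as np

def admm_box_lsq(A, b, lo, hi, rho=1.0, iters=200):
    """Scaled-form ADMM for min ||Ax - b||^2 s.t. lo <= x <= hi (sketch)."""
    n = A.shape[1]
    M = A.T @ A + rho * np.eye(n)      # formed once, reused every iteration
    Atb = A.T @ b
    x = z = u = np.zeros(n)
    for _ in range(iters):
        x = np.linalg.solve(M, Atb + rho * (z - u))  # primal: linear solve
        z = np.clip(x + u, lo, hi)                   # closed-form projection
        u = u + x - z                                # dual (scaled) update
    return z

# Toy: unconstrained optimum [2, -1] gets clipped into the box [0, 1]^2
A = np.eye(2)
b = np.array([2.0, -1.0])
x = admm_box_lsq(A, b, 0.0, 1.0)
print(x)  # ≈ [1, 0]
```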
Nullspace basis computation is critical and relies on either efficient block-banded LU factorization or the turnback algorithm, avoiding expensive rank-revealing factorizations. For systems with discretized dynamics (Euler or multiple-shooting), upper bounds on nullspace basis bandwidth ensure scalability (Pfeiffer et al., 2024, Pfeiffer et al., 2023).
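For intuition only, a generic (dense) nullspace basis can be computed via SVD; unlike the turnback algorithm described above, this baseline destroys bandedness and does not scale, which is precisely why turnback or block-banded LU is preferred. The toy Jacobian below encodes two steps of single-integrator dynamics $x_{k+1} = x_k + u_k$ over stacked variables $(x_0, u_0, x_1, u_1, x_2, u_2)$.

```python
import numpy as np

def nullspace_basis(A, tol=1e-10):
    """Dense nullspace basis via SVD (baseline, not the turnback method)."""
    _, s, Vt = np.linalg.svd(A)
    rank = int((s > tol * s[0]).sum()) if s.size else 0
    return Vt[rank:].T                 # columns span null(A)

# Block-banded Jacobian of the discretized dynamics x_{k+1} = x_k + u_k
A = np.array([[1., 1., -1., 0., 0., 0.],
              [0., 0., 1., 1., -1., 0.]])
Z = nullspace_basis(A)
print(A @ Z)   # ≈ 0: columns of Z lie in the nullspace of A
```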
Solver Summary
| Solver | HLSP Method | Nullspace Basis |
|---|---|---|
| NADM2 | ADMM + nullspace projection | Turnback algorithm |
| s-NIPM-HLSP | Reduced-Hessian IPM | LU / turnback |
NADM2 is especially effective for sparse, banded structured systems, and rapidly provides moderate-accuracy solutions. These can then warm-start higher-accuracy interior-point solvers.
5. Convergence Properties and Global Optimality
The S-HLSP methodology satisfies global convergence to a KKT point of the NL-HLSP under two key conditions:
- Filter radius condition: If, at each priority, the trust region is not expanded past the maximum filter-accepted radius, convergence is guaranteed.
- Filter initialization: If the filter at each level is initialized with strictly feasible front points, S-HLSP converges regardless of the initial trust-region radius.
No restoration or feasibility repair is required because the slack-based formulation ensures feasibility of every HLSP subproblem, even in the presence of infeasible higher-level objectives (Pfeiffer et al., 2023).
The adaptation of nullspace trust regions at each priority level ensures that while moving through the hierarchy, progress at each level does not violate the KKT set of higher priorities.
6. Computational Performance and Scalability
Extensive benchmarks on non-linear test functions, whole-body inverse kinematics, and multi-contact optimal control demonstrate the numerical efficacy and scalability of S-HLSP methods:
- For moderate to large-scale tasks (e.g., robot inverse kinematics and whole-body optimal control), NADM2 and s-NIPM-HLSP provide competitive or superior solve times compared to state-of-the-art commercial (H-MOSEK, Gurobi) and open-source (H-OSQP, H-PIQP) solvers, while achieving lower KKT residuals in hierarchical settings.
- Nullspace-projected ADMM yields fast moderate-accuracy solutions within milliseconds, while interior-point methods can reach high-accuracy KKT points in larger but still tractable runtimes.
- The turnback algorithm preserves sparsity and banded structure, enabling scalability to high-dimensional, multiple-shooting discretizations with low computational overhead per stage.
7. Warm-Starting and Combined Solver Strategies
An effective computational strategy is to use a first-order HLSP solver such as NADM2 to obtain a moderate-accuracy warm start, followed by a high-accuracy interior-point method (e.g., H-MOSEK) to rapidly reach tight KKT residuals. This two-stage approach reduces both the total number of outer sequential iterations and the number of expensive inner iterations in the high-precision solver. The benefit is especially pronounced in large-scale under-actuated systems and structured optimal control problems (Pfeiffer et al., 2024).
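The two-stage pattern can be illustrated on a plain least-squares toy: cheap first-order iterations (standing in for the ADMM warm start) to a coarse tolerance, then a second-order correction (standing in for the interior-point refinement) from that point. The function name, tolerances, and step rule are illustrative assumptions.

```python
import numpy as np

def solve_two_stage(A, b, coarse_tol=1e-2):
    """Warm-start pattern on min ||Ax - b||^2 (sketch, not NADM2/IPM)."""
    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of gradient
    # Stage 1: first-order iterations to moderate accuracy only
    while np.linalg.norm(A.T @ (A @ x - b)) > coarse_tol:
        x -= (1.0 / L) * A.T @ (A @ x - b)
    # Stage 2: one Newton correction from the warm start (exact here,
    # since the toy objective is quadratic)
    dx = np.linalg.solve(A.T @ A, A.T @ (b - A @ x))
    return x + dx

A = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
x = solve_two_stage(A, b)
print(x)   # ≈ [0.5556, 2.2222], the least-squares solution
```

In the actual pipeline the second stage is an iterative interior-point solve whose iteration count shrinks when started from the first-stage point, rather than a single exact step.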
8. Applications and Extensions
S-HLSP is particularly suited for:
- Prioritized robot control, generating kinematically and dynamically feasible trajectories respecting task hierarchies (position, force, redundancy resolution).
- Discrete-time non-linear optimal control with complex multi-level constraints (multi-contact locomotion, time-optimal control).
- Sparse and structured system identification, where hierarchical recursive least squares formulations also emerge (Slavakis, 2018).
The methodology is extensible to stochastic and adaptive settings, where variants such as S-FM-HSDM and hierarchical RLS address noisy, online, or composite convex versions of S-HLSP.
References
- "Efficient Lexicographic Optimization for Prioritized Robot Control and Planning" (Pfeiffer et al., 2024)
- "Sequential Hierarchical Least-Squares Programming for Prioritized Non-Linear Optimal Control" (Pfeiffer et al., 2023)
- "The Stochastic Fejér-Monotone Hybrid Steepest Descent Method and the Hierarchical RLS" (Slavakis, 2018)