
Separable Non-linear Least Squares (SNLLS)

Updated 25 January 2026
  • SNLLS is a subclass of nonlinear least squares where the objective function is separable into linear and nonlinear parameters, enabling efficient variable elimination.
  • Variable projection reduces dimensionality by solving for linear parameters in closed form, improving conditioning and optimization speed.
  • Robust SNLLS algorithms employ block-structured and dual formulations to handle constraints and non-quadratic losses, with applications in system identification, signal processing, and biological modeling.

Separable Non-linear Least Squares (SNLLS) is a specific subclass of nonlinear least squares problems in which the objective depends linearly on a subset of parameters and nonlinearly on another subset. This structure enables fundamentally different algorithmic strategies compared to general nonlinear least squares, including explicit variable elimination (“variable projection”), block-structured Newton-type methods, and dual formulations for robust norms. SNLLS models arise extensively in system identification, biology, signal processing, and high-dimensional inverse problems.

1. Mathematical Formulation and Problem Structure

The canonical SNLLS problem minimizes a squared residual of the form

\min_{x \in \mathbb{R}^k,\, y \in \mathbb{R}^n}\ \Phi(x, y) = \lVert F(x, y) \rVert_2^2, \quad F(x, y) = A(x) y - b(x),

where:

  • x is the vector of nonlinear parameters,
  • y is the vector of linear parameters,
  • A(x) \in \mathbb{R}^{m \times n} is a matrix whose entries involve nonlinear functions of x (often constructed via basis functions \phi_j(x, t_i)),
  • b(x) \in \mathbb{R}^m is the data vector (potentially nonlinear in x),
  • m is the number of observations.

For fixed x, the residual is affine (linear) in y, allowing the elimination of y via

y^*(x) = A(x)^+ b(x),

with A(x)^+ denoting the Moore–Penrose pseudoinverse. Substituting y^*(x) back yields the reduced variable-projection functional

\psi(x) = \lVert [I - P(x)] b(x) \rVert_2^2, \quad P(x) = A(x) A(x)^+,

which now depends only on the nonlinear parameters x (Gharibi et al., 2011, Herrera-Gomez et al., 2017, Dattner et al., 2019).

This structure generalizes broadly. For models f(t; c, \alpha) = \sum_j c_j \phi_j(t; \alpha), c enters linearly and \alpha nonlinearly, and for any fixed \alpha, the optimal c is given by linear least squares (Herrera-Gomez et al., 2017).
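As a concrete sketch (a hypothetical two-term exponential model, not taken from the cited papers), the elimination step amounts to one linear least-squares solve per evaluation of the reduced functional:

```python
import numpy as np

def reduced_residual(x, t, b):
    """Variable-projection residual r(x) = [I - P(x)] b for the model
    f(t) = sum_j y_j * exp(-x_j * t) (hypothetical example)."""
    # Design matrix A(x): columns are the nonlinear basis functions.
    A = np.exp(-np.outer(t, x))                      # shape (m, n)
    # Optimal linear parameters y*(x) = A(x)^+ b via least squares.
    y_star, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Residual of the reduced functional psi(x) = ||b - A(x) y*(x)||^2.
    return b - A @ y_star, y_star

t = np.linspace(0.0, 2.0, 50)
b = 2.0 * np.exp(-1.0 * t) + 0.5 * np.exp(-3.0 * t)  # exact model data

r, y_star = reduced_residual(np.array([1.0, 3.0]), t, b)
print(np.linalg.norm(r))   # ~0: data generated exactly by the model
print(y_star)              # recovers [2.0, 0.5]
```

At the true nonlinear parameters, the reduced residual vanishes and the linear block is recovered for free by the inner solve.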

2. Algorithmic Approaches: Variable Projection and Semi-reduced Schemes

Variable Projection

The variable-projection method exploits the separability by analytically solving for the linear parameters. The resulting reduced optimization, expressed solely in the nonlinear variables, dramatically reduces dimensionality and often improves conditioning. The steps are:

  1. For the current nonlinear parameter x, form A(x) and b(x).
  2. Solve for the optimal linear parameters y^*(x) via standard linear least squares.
  3. Compute the reduced cost \psi(x) = \lVert [I - P(x)] b(x) \rVert_2^2.
  4. Update x via gradient-based (Gauss–Newton, Levenberg–Marquardt) methods, using a projected reduced Jacobian (Herrera-Gomez et al., 2017, Shearer et al., 2013).

The gradient and approximate Hessian of \psi(x) are given by

\nabla_x \psi(x) = -2 J_{\text{nonlin}}(x)^\top r^*(x),

H(x) = 2 J_{\text{nonlin}}(x)^\top [I - P(x)] J_{\text{nonlin}}(x),

where J_{\text{nonlin}} is the Jacobian of the residual with respect to the nonlinear parameters, computed at the optimal linear fit (Herrera-Gomez et al., 2017).
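The four steps can be sketched end to end. This minimal version drives the reduced residual with SciPy's trust-region least-squares solver, using a finite-difference Jacobian in place of the analytic projected Jacobian, on a hypothetical two-term exponential model:

```python
import numpy as np
from scipy.optimize import least_squares

def reduced_residual(x, t, b):
    """Variable-projection residual for the hypothetical model
    f(t) = y1*exp(-x1*t) + y2*exp(-x2*t); the linear block is
    eliminated by an inner linear least-squares solve."""
    A = np.exp(-np.outer(t, x))
    y_star, *_ = np.linalg.lstsq(A, b, rcond=None)
    return A @ y_star - b

# Synthetic data from known parameters x = (1, 3), y = (2, 0.5).
t = np.linspace(0.0, 2.0, 50)
b = 2.0 * np.exp(-1.0 * t) + 0.5 * np.exp(-3.0 * t)

# Outer iteration on the nonlinear block only; the Jacobian is
# finite-difference rather than the analytic projected Jacobian.
sol = least_squares(reduced_residual, x0=np.array([0.5, 2.0]), args=(t, b))
print(np.sort(sol.x))   # approaches [1.0, 3.0]
```

Note that the outer solver only ever sees the two nonlinear parameters; the linear coefficients are re-fit inside every residual evaluation.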

Semi-reduced and Generalized Variable Elimination

For cases where closed-form elimination is impossible (Poisson likelihoods, bound constraints, non-quadratic losses), semi-reduced methods generalize the classical variable projection:

  • The overall Newton-type system is partitioned into blocks associated with linear and nonlinear parameters.
  • Block Gaussian elimination (Schur complement) is used to solve for the nonlinear parameters, with trial-point adjustment in the linear parameters—possibly via partial or exact inner minimization.
  • This interpolates between full parameter joint updates and reduced variable projection, maintaining robust convergence properties, especially in large-scale or ill-conditioned regimes (Shearer et al., 2013).

For k = 0, 1, 2,...
  1. Compute gradient g and Hessian B (partitioned by y,z).
  2. Solve reduced system for nonlinear block using Schur complement.
  3. Solve for linear block given the nonlinear update.
  4. Optionally adjust linear parameters via inner solves.
  5. Line search using Armijo or projected Newton criteria.
Repeat until convergence.
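The Schur-complement solve in step 2 can be illustrated on a generic partitioned Newton system; the matrices below are random placeholders, not derived from a specific model:

```python
import numpy as np

def schur_newton_step(B_yy, B_yz, B_zz, g_y, g_z):
    """Solve the partitioned Newton system
        [B_yy  B_yz ] [dy]    [g_y]
        [B_yz' B_zz ] [dz] = -[g_z]
    by block Gaussian elimination of the linear block y."""
    # Schur complement of B_yy in the full Hessian.
    S = B_zz - B_yz.T @ np.linalg.solve(B_yy, B_yz)
    # Reduced right-hand side for the nonlinear block z.
    rhs_z = -g_z + B_yz.T @ np.linalg.solve(B_yy, g_y)
    dz = np.linalg.solve(S, rhs_z)
    # Back-substitute for the linear block y.
    dy = np.linalg.solve(B_yy, -g_y - B_yz @ dz)
    return dy, dz

# Random symmetric positive-definite test system (3 linear, 2 nonlinear).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
B = M @ M.T + 5 * np.eye(5)
g = rng.standard_normal(5)

dy, dz = schur_newton_step(B[:3, :3], B[:3, 3:], B[3:, 3:], g[:3], g[3:])
step = np.concatenate([dy, dz])
print(np.allclose(B @ step, -g))   # True: matches the full Newton solve
```

The elimination reproduces the full joint Newton step exactly; the practical gain is that the two blocks can be solved with structure-exploiting methods of different kinds.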

3. Norm Choices, Duality, and Robust Formulations

SNLLS can be posed under both the Euclidean (2-norm) and Chebyshev (∞-norm) metrics:

  • 2-norm: Differentiable, conventional Gauss–Newton/LM methods apply. Variable projection is efficient and robust in this setting.
  • ∞-norm: Nondifferentiable at maxima; classical gradient-based approaches break down. Dual or subgradient-based schemes are required.

A Lagrangian-dual algorithm is introduced for minimax (∞-norm) problems, transforming

\min_{x, y} \max_i \lvert [A(y) x - b(y)]_i \rvert

into an iterated sequence:

  1. At each dual iterate \lambda, solve a weighted nonlinear least squares:

\min_{x, y} \sum_{i=1}^m \lambda_i [A(y) x - b(y)]_i^2

  2. Update \lambda via a subgradient step (projected onto the probability simplex).
  3. Iterate until the duality-gap/convergence criteria are satisfied.

This approach preserves separable structure within each weighted subproblem and is applicable in robust regression and Chebyshev approximation settings where maximum residual control is required (Gharibi et al., 2011).
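A compact sketch of the dual loop, using a fixed linear model so that the inner weighted subproblem is an ordinary weighted least squares (in the separable setting this inner solve would itself be a weighted SNLLS problem):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u > css / idx)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def minimax_fit(A, b, iters=500, step=0.5):
    """Dual subgradient sketch for min_x max_i |(A x - b)_i|, with a fixed
    linear model standing in for the weighted inner subproblem."""
    m = A.shape[0]
    lam = np.full(m, 1.0 / m)            # dual iterate on the simplex
    best_x, best_val = None, np.inf
    for k in range(iters):
        # Inner solve: weighted LS  min_x sum_i lam_i (A x - b)_i^2.
        w = np.sqrt(lam)
        x, *_ = np.linalg.lstsq(A * w[:, None], b * w, rcond=None)
        r = A @ x - b
        val = np.abs(r).max()
        if val < best_val:
            best_x, best_val = x, val
        # Dual ascent: diminishing subgradient step, projected to the simplex.
        lam = project_simplex(lam + (step / np.sqrt(k + 1)) * r**2)
    return best_x, best_val

# Best constant fit to {0, 1, 0} in the max-norm: the Chebyshev answer is 0.5.
A = np.ones((3, 1))
b = np.array([0.0, 1.0, 0.0])
x, val = minimax_fit(A, b)
print(x, val)   # x near [0.5], max residual near 0.5
```

The dual weights concentrate on the residuals that attain the maximum, which is exactly the equioscillation behavior expected of a Chebyshev fit.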

4. Computational Complexity, Convergence Properties, and Statistical Interpretation

Computational Complexity

  • Classical SNLLS permits dimensional reduction: for p = n + k total parameters, the variable-projection method reduces the outer optimization to k variables, with linear solves in n (Dattner et al., 2019).
  • Per iteration costs: forming and solving the linear least squares (dominant for large nn), plus Jacobian computations and outer nonlinear updates.
  • Semi-reduced methods leverage block structure in the Hessian, facilitating sparsity-exploiting direct solvers (block-diagonal, circulant, banded) and parallelization (Shearer et al., 2013, Fodor et al., 2023).
Method               | Outer Dim. | Inner Solve  | Overall Complexity
Variable Projection  | k          | n \times n   | O(n^2 k + k^3)
Semi-reduced         | k + n      | Block LS/CG  | Leverages structure

Convergence Theory

  • SNLLS inherits local superlinear (or quadratic) convergence of Gauss–Newton-type methods when residual norms are small and Jacobians well-conditioned.
  • Semi-reduced methods admit global convergence proofs via Armijo and monotonic adjustment operators.
  • Dual subgradient methods for ∞-norm formulations require O(1/\varepsilon^2) iterations to reach \varepsilon-accuracy in nonsmooth settings, with rapid initial residual decrease (Gharibi et al., 2011, Shearer et al., 2013).

Statistical Significance

  • The Schur complement formula for the reduced Hessian ensures that covariance estimates of the nonlinear parameters are identical to what would be obtained from the full nonlinear least squares, preserving statistical validity (Herrera-Gomez et al., 2017).
  • Variable projection does not bias estimation, nor does it alter variance properties, provided the noise is Gaussian and the model structure holds.
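The equivalence underlying the first point can be checked numerically at the Gauss–Newton level (ignoring second-order terms of the projected Jacobian): the Schur complement of the linear block in the full Gauss–Newton Hessian coincides with the reduced variable-projection Hessian. The Jacobian blocks below are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 30, 3, 2
A = rng.standard_normal((m, n))      # linear-block Jacobian A(x)
Jx = rng.standard_normal((m, k))     # nonlinear-block Jacobian at the fit

# Schur complement of A^T A in the full Gauss-Newton Hessian [A, Jx]^T [A, Jx].
AtA = A.T @ A
schur = Jx.T @ Jx - Jx.T @ A @ np.linalg.solve(AtA, A.T @ Jx)

# Reduced (variable-projection) Hessian Jx^T (I - P) Jx with P = A A^+.
P = A @ np.linalg.pinv(A)
reduced = Jx.T @ (np.eye(m) - P) @ Jx

print(np.allclose(schur, reduced))   # True: identical nonlinear-block curvature
```

Since covariance estimates are built from the inverse of this block, the reduced problem yields the same uncertainty for the nonlinear parameters as the full one.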

5. Practical Implementations and Extensions

SNLLS methodologies have led to specialized packages and scalable solvers:

  • Non-gradient grid-search methods (e.g., nlstac): exploit separability for robustness and initialization-free parameter estimation, particularly in models such as sums of exponentials, Gaussians, and exponential+sinusoid composites. These methods solve for linear parameters at every grid point over bounded intervals of nonlinear parameters, yielding globally reliable fits but at exponential cost in nonlinear parameter dimension (Torvisco et al., 2024).
  • Nonnegative least squares (NNLS)-driven SNLLS: basis functions parameterized nonlinearly, with nonnegative linear coefficients. Grid discretization in nonlinear parameters and NNLS solvers select an optimal sparse basis expansion (illustrated in rational/exponential function approximation) (Vabishchevich, 2023).
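A minimal sketch of the grid-plus-NNLS idea, assuming a hypothetical exponential-sum target whose true decay rates happen to lie on the grid:

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical target: two decaying exponentials with rates 1.0 and 3.0.
t = np.linspace(0.0, 2.0, 100)
b = 2.0 * np.exp(-1.0 * t) + 0.5 * np.exp(-3.0 * t)

# Grid the nonlinear (rate) parameter; each grid point contributes one
# dictionary column, and NNLS selects a sparse nonnegative combination.
rates = np.linspace(0.1, 5.0, 50)
A = np.exp(-np.outer(t, rates))
coef, resid = nnls(A, b)

print(resid)                   # small: the true rates lie on the grid
print(np.count_nonzero(coef))  # NNLS activates only a few columns
```

The nonnegativity constraint acts as an implicit sparsifier here: most grid columns receive zero weight, so the active columns identify candidate nonlinear parameters without any gradient-based search.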

Some SNLLS solvers are designed for large-scale, nearly-block-separable problems, using parallel fixed-point iteration for block-structured Levenberg–Marquardt systems (Fodor et al., 2023). This is relevant in high-dimensional geospatial applications (cadastral map refinement, bundle adjustment) and any scenario with block-sparse Jacobians.

6. Applications and Illustrative Case Studies

Key domains leveraging SNLLS structure include:

  • System identification: best Chebyshev-fit of parameterized models for robust system parameter recovery (Gharibi et al., 2011).
  • Biological modeling: parameter inference in ODE models with linearly embedded rates and nonlinear kinetic orders. SNLLS provides substantial speedups (2–10× over vanilla NLLS) and improved robustness in biochemical systems and epidemic modeling (Dattner et al., 2019).
  • Signal fitting: multi-channel regression, e.g., fitting sums of damped exponentials in spectroscopy or sensor analysis.
  • Computer vision/robotics: reprojection error minimization (with linear pose or scale parameters) under robust norms or constraints (Gharibi et al., 2011, Fodor et al., 2023).
  • Function approximation: rational or exponential sum approximations implemented via NNLS-based SNLLS workflows, guiding selection of basis parameters for high-precision fits of functions like x^{-\alpha} or \exp(-x^\alpha) (Vabishchevich, 2023).
  • Large-scale inverse problems: block-separable and nearly-separable NLS solvers scale efficiently via parallel fixed-point inner iterations and are empirically validated on million-variable test problems (Fodor et al., 2023).

7. Limitations, Extensions, and Open Problems

SNLLS algorithms rely on model separability and an explicit closed-form solution for the linear parameter block. When constraints (e.g., y \geq 0), non-Gaussian likelihoods (Poisson), or nonquadratic losses are present, classical elimination is infeasible; semi-reduced and adjustment-based extensions become necessary (Shearer et al., 2013). Unseparated approaches are useful when A(x) is ill-conditioned or y is constrained, as they avoid repeated inversion and enable handling of general constraints. Robust formulations under the ∞-norm require dual or nonsmooth optimization strategies, as standard variable projection fails in nondifferentiable regimes (Gharibi et al., 2011).

Grid-search and NNLS-based SNLLS methods have exponential complexity in the number of nonlinear parameters and become impractical for high-dimensional nonlinear blocks (Torvisco et al., 2024, Vabishchevich, 2023). In such cases, hybrid approaches using SNLLS for initialization followed by local gradient-based methods are recommended.

The continuing development of SNLLS methodologies is situated in the context of large-scale data-driven modeling, leveraging separable structure for both algorithmic efficiency and statistical reliability across disciplinary boundaries.
