Variable Projection Methods

Updated 27 March 2026

Variable Projection Methods are a set of numerical optimization techniques that separate linear and nonlinear variables to improve problem conditioning and scalability.
They leverage analytic elimination of linear parameters to reduce problem dimensionality, enabling rapid convergence with quasi-Newton and Gauss–Newton approaches.
Applications include imaging, system identification, low-rank modeling, and neural network design, emphasizing efficient inner solvers and structured regularization.

Variable projection methods refer to a class of numerical optimization techniques that exploit the separable structure of many inverse and parameter estimation problems. These methods enable the analytic elimination of a subset of variables—typically those entering the model linearly—reducing the dimensionality and improving the conditioning of the remaining nonlinear optimization problem. Originating from the Golub–Pereyra approach to separable nonlinear least squares, variable projection has evolved to address a diverse variety of challenges, including the incorporation of general regularization, non-smooth constraints, large-scale linear subproblems, and advanced applications in imaging, system identification, and machine learning (Salzer et al., 8 Jan 2026, Español et al., 2024, Chen et al., 2024).

1. Mathematical Foundations of Variable Projection

The classical context for variable projection is the separable nonlinear least squares problem: $\min_{x\in\mathbb{R}^n,\,y\in\mathbb{R}^r}\;\frac12\|A(y)x-b\|_2^2 + R(y) + \frac12\|Lx\|_2^2,$ where $A(y)$ is a parameterized linear operator, $b$ are measurements, $R(y)$ is a regularization term for nonlinear parameters, and $L$ is a Tikhonov regularizer for $x$ .

For fixed $y$ , the minimizer in $x$ is explicitly

$x(y) = \left(A(y)^\top A(y) + L^\top L \right)^{-1} A(y)^\top b.$

Substitution yields the reduced objective in $y$ only: $\varphi(y) = \frac12\|A(y) x(y)-b\|^2 + R(y) + \frac12\|Lx(y)\|^2 = \frac12\|(I-P_{K(y)})[b;0]\|^2 + R(y)$ with $P_{K(y)}$ the orthogonal projector onto the range of the augmented matrix $K(y)=[A(y);\lambda L]$ (Salzer et al., 8 Jan 2026).

The gradient and (Gauss–Newton/quasi-Newton) Hessian with respect to $y$ are computed via analytic formulas involving the derivative of the projection operator, and closed-form expressions for the Jacobian and Hessian approximations underpin rapid convergence in well-conditioned regimes (Español et al., 2024, Chen et al., 2024).

2. Algorithmic Frameworks and Extensions

2.1. Quasi-Newton and Gauss–Newton VarPro

The quasi-Newton (RGenVarPro) and Gauss–Newton schemes apply directly to the reduced objective. At each iteration:

Solve for $x(y)$ via (regularized) least squares.
Compute residuals, Jacobians, and the Hessian approximation.
Take a Newton-type step in $y$ :

$s^{(k)} = -[J^{(k)\top} J^{(k)} + \nabla^2 R(y^{(k)})]^{-1}[J^{(k)\top} f^{(k)} + \nabla R(y^{(k)})].$

Update $y$ (line search optional; $\alpha=1$ often sufficient for mild nonlinearity).

2.2. Inexact and Large-Scale Variants

For large-scale problems where direct linear solves are infeasible, an iterative LSQR-based algorithm (iRGenVarPro) is used:

Approximate $x(y)$ up to a tolerance in the subproblem using LSQR.
Construct approximate Jacobian and Hessian via the current LSQR solution.
Drive inner solve tolerance $\epsilon^{(k)}$ to zero geometrically (e.g., $\epsilon^{(k+1)} = \epsilon^{(k)}/2$ ) to guarantee convergence (Español et al., 2024, Salzer et al., 8 Jan 2026).

This framework yields algorithms suitable for problems with $n \gtrsim 10^4$ unknowns, with computational cost comparable to a modest number of matrix-vector products per outer iteration.

2.3. Non-smooth and Block-separable Extensions

When the problem contains non-smooth regularizers (e.g., $\ell_1$ , constrained variables), the projected function remains smooth in the outer variables provided the inner subproblem is strongly convex (or has a unique minimum). Inexact variants employ adaptive stopping criteria for approximate inner solves, with convergence rates depending on the accuracy control between inner and outer iterations (Leeuwen et al., 2016).

Augmented Lagrangian schemes incorporating variable projection have been developed for generalized Lasso and sparse-inverse problems, enabling smooth descent on the reduced objective after eliminating variables via soft-thresholding (Aleotti et al., 28 Oct 2025, Solomon et al., 27 Feb 2025).

3. Theoretical Analysis and Convergence Properties

Variable projection methods achieve rapid, often superlinear or quadratic local convergence under standard regularity assumptions:

$A(y)$ and $R(y)$ are $C^2$ in a neighborhood of the solution,
the projected Jacobian $J(y^*)$ has full rank,
the Gauss–Newton Hessian is positive definite at the solution.

For inexact schemes, local linear convergence is retained provided the inner solving tolerance is decreased sufficiently rapidly, typically geometrically. Error recursions explicitly track deviations due to inexactness in both the inner variable and the Jacobian; these are bounded by constants determined from problem conditioning (Salzer et al., 8 Jan 2026, Español et al., 2024, Chen et al., 2024).

In the presence of large residuals, standard (Kaufman-type) variable projection may converge slowly due to neglected second-order terms; the VPLR algorithm augments the Hessian approximation adaptively to restore fast convergence in ill-conditioned scenarios (Chen et al., 2024).

4. Applications Across Scientific Domains

Variable projection has been successfully deployed in a wide range of inverse, estimation, and machine learning problems, exploiting separability to reduce computational complexity and improve solution quality:

Application Area	Variable Projection Structure	Reference
Semi-blind image deblurring	Linear unknown: image; nonlinear: PSF parameters	(Salzer et al., 8 Jan 2026)
Weighted low-rank matrix approximation	Linear: factors; nonlinear: low-rank constraint	(Terray, 6 May 2025, Usevich et al., 2012)
Structured low-rank and common divisor	Inner: quotients/gcds of polynomials	(Usevich et al., 2013)
Seismic sparse deconvolution	Linear: reflectivity; nonlinear: source parameters	(Aravkin et al., 2012)
Optimized dynamic mode decomposition	Linear: mode amplitudes; nonlinear: eigenvalues	(Askham et al., 2017)
EEG/MEG source localization	Linear: sources; nonlinear: auxiliary ( $\ell_1$ )	(Solomon et al., 27 Feb 2025, Aleotti et al., 28 Oct 2025)
Tensor algebraic learning (star-M)	Linear: tensor representation; nonlinear: map $M$	(Newman et al., 2024)
Principal Component Analysis	Linear: loadings/scores; nonlinear: subspace	(Erichson et al., 2018)

Applications such as semi-blind deblurring demonstrate that parameter regularization within variable projection suppresses degenerate no-blur solutions, robustly identifies PSF parameters, and recovers images to high accuracy within modest iteration counts, even in large-scale settings (Salzer et al., 8 Jan 2026).

5. Computational Aspects and Implementation

Per-iteration cost of variable projection depends on the dimension of the eliminated variable and the structure of the linear subproblem. For Tikhonov-regularized least squares or structured low-rank approximation, operations can often be reduced to matrix-vector products, exploiting sparsity, block-circulant, or Hankel structures (Usevich et al., 2012, Terray, 6 May 2025). Key implementation elements include:

Accurate and efficient inner solvers (direct, LSQR, Krylov subspace, randomized SVD).
Analytic or algorithmically tractable Jacobian and Hessian approximations for the projected objective.
Inexact or adaptive schemes for inner solves with principled stopping rules providing complexity guarantees with minimal overhead (Leeuwen et al., 2016, Español et al., 2024).

When the projected Jacobian is rank-deficient (e.g., in weighted low-rank approximation), minimum-norm solutions or subspace constraints enforce uniqueness of the search direction, circumventing issues with non-isolated minima (Terray, 6 May 2025).

6. Recent Advances and Generalizations

Recent developments extend variable projection beyond classical smooth settings:

Non-smooth variable projection accommodates block-separable convex/non-convex regularizers with provable complexity bounds using inexact adaptive tolerance schedules for the inner problem (Leeuwen et al., 2016).
Variable-projected augmented Lagrangian (VPAL) methods show state-of-the-art performance for generalized Lasso, phase retrieval, and MRI, with preconditioned Newton-type variants delivering significant acceleration and sharp convergence (Aleotti et al., 28 Oct 2025, Solomon et al., 27 Feb 2025).
Model-driven neural network architectures embed variable projection layers to exploit analytic elimination of linear parameters, yielding interpretable, compact networks in classification and regression (Kovács et al., 2020).
Riemannian optimization and manifold-based approaches extend VP methodology to matrix-mimetic and tensor representations, with provable descent guarantees (Newman et al., 2024).

These advances render variable projection a versatile and foundational tool in modern scientific computing, suitable for high-dimensional, nonconvex, regularized, or non-smooth estimation and inverse problems.

7. Summary and Impact

Variable projection methods exploit separable structure to analytically eliminate linear variables, reducing the dimensionality and improving the conditioning of nonlinear optimization tasks. State-of-the-art VP algorithms—exact, inexact, and non-smooth—demonstrate robust convergence in theory and high computational efficiency in practice across a spectrum of scientific and engineering domains. Applications in image reconstruction, spectral estimation, low-rank modeling, sparse deconvolution, and neural network design leverage VP’s analytic tractability, scalability, and strong theoretical guarantees. Recent research further generalizes VP to non-smooth, manifold, and augmented Lagrangian settings, positioning variable projection as a central component of modern large-scale inverse problem and learning methodologies (Salzer et al., 8 Jan 2026, Español et al., 2024, Chen et al., 2024, Aleotti et al., 28 Oct 2025).