Gauss-Newton-Bartlett Estimator

Updated 7 November 2025
  • The Gauss-Newton-Bartlett estimator is a structured diagonal Hessian approximation technique designed for nonlinear least squares problems using secant-based updates and robust safeguards.
  • It exploits the Jacobian’s structure to enable efficient, matrix-free computations while preserving descent directions for scalable optimization.
  • The resulting algorithm is provably globally convergent, and empirical studies report fewer iterations and lower computational cost than conventional unstructured approaches.

The Gauss-Newton-Bartlett (GNB) estimator refers to a structured diagonal Hessian approximation used in iterative algorithms for large-scale nonlinear least squares (NLS) optimization. The GNB estimator, as detailed in (Awwal et al., 2020), exploits the particular form of the Hessian in NLS problems by constructing a diagonal approximation that is updated iteratively via a secant equation and robust safeguarding strategies. This makes it particularly suitable for scalable, matrix-free implementations, preserving descent directions and ensuring positive definiteness of the approximate Hessian.

1. Mathematical Foundation and Problem Structure

Nonlinear least squares problems take the standard form

$$\min_{x \in \mathbb{R}^n} f(x), \qquad f(x) = \frac{1}{2} \| F(x) \|^2,$$

where $F : \mathbb{R}^n \to \mathbb{R}^m$. The gradient and Hessian are structured as

$$g(x) = J(x)^\top F(x),$$

$$H(x) = J(x)^\top J(x) + C(x), \qquad C(x) = \sum_{i=1}^{m} F_i(x) \nabla^2 F_i(x),$$

with $J(x)$ the Jacobian matrix of $F$.

The GNB estimator constructs a diagonal matrix $D_k$ that approximates the Hessian at iteration $k$, aiming to accurately capture the leading curvature information necessary for Newton-type or quasi-Newton methods, while sidestepping the cost of storing or manipulating the full Hessian. This approach contrasts with traditional Gauss-Newton (GN) methods, which drop the second term and use $J^\top J$, and with generic diagonal BFGS updates that are agnostic to the Hessian's NLS structure.
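
The split of the Hessian into a Gauss-Newton part and a residual-curvature part can be checked numerically on a small problem. The sketch below uses the Rosenbrock function written as a two-residual NLS problem; the test function, the starting point, and all function names are illustrative choices, not taken from the referenced paper.

```python
import numpy as np

def residual(x):
    # Rosenbrock as NLS: f(x) = 0.5 * ||F(x)||^2 (illustrative test problem).
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def jacobian(x):
    # Rows are the gradients of F_1 and F_2.
    return np.array([[-20.0 * x[0], 10.0],
                     [-1.0, 0.0]])

def residual_hessians(x):
    # Second derivatives of each residual component; only F_1 is nonlinear.
    return [np.array([[-20.0, 0.0], [0.0, 0.0]]), np.zeros((2, 2))]

def nls_gradient(x):
    # g(x) = J(x)^T F(x)
    return jacobian(x).T @ residual(x)

def nls_hessian(x):
    # H(x) = J(x)^T J(x) + C(x), with C(x) = sum_i F_i(x) * Hessian(F_i)(x)
    F, J = residual(x), jacobian(x)
    C = sum(Fi * Hi for Fi, Hi in zip(F, residual_hessians(x)))
    return J.T @ J + C

x = np.array([-1.2, 1.0])
g = nls_gradient(x)
H = nls_hessian(x)
gauss_newton = jacobian(x).T @ jacobian(x)

print("gradient:", g)
print("full Hessian diagonal:", np.diag(H))
print("Gauss-Newton diagonal:", np.diag(gauss_newton))
```

The difference between the two diagonals is exactly the contribution of $C(x)$ that Gauss-Newton discards and that a structured diagonal approximation tries to retain.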

2. Structured Diagonal Hessian Approximation: GNB Update Rule

At each iteration, the update for the diagonal entries is derived from a componentwise secant condition specific to NLS structure. Define

$$s_{k-1} = x_k - x_{k-1}, \qquad y_{k-1} = J_k^\top J_k s_{k-1} + (J_k - J_{k-1})^\top F_k = \hat{y}_{k-1} + \overline{y}_{k-1}.$$

The GNB estimator seeks each diagonal entry of $D_k$ via

[…]

implemented by solving a constrained least-squares problem for each component. Robustness and numerical stability require a safeguarded projection of each entry onto an interval with explicit lower and upper bounds, and further adjustments if the numerator and denominator are of opposite sign (to enforce positivity).

This update incorporates the structure of the NLS Hessian (including changes in the Jacobian between iterates), making it distinct from standard iterative diagonal approximation rules used in generic quasi-Newton frameworks.
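
As a rough illustration of a safeguarded componentwise secant update, the sketch below builds the structured pair $(s_{k-1}, y_{k-1})$ from Jacobian products and then forms a projected ratio. The exact update and safeguard formulas are not reproduced in this text, so the rule used here (ratio $y^i / s^i$, fallback to the previous entry, clipping to `[lower, upper]`) is an assumption, as are the function names, the default bounds, and the toy residual.

```python
import numpy as np

def structured_secant_pair(jvp, vjp, x_prev, x_curr, F_curr):
    """Return (s, y) with y = J_k^T J_k s + (J_k - J_{k-1})^T F_k.

    jvp(x, v) and vjp(x, w) are hypothetical callables returning J(x) v and
    J(x)^T w, so the Jacobian itself is never formed.
    """
    s = x_curr - x_prev
    y_hat = vjp(x_curr, jvp(x_curr, s))                # J_k^T J_k s
    y_bar = vjp(x_curr, F_curr) - vjp(x_prev, F_curr)  # (J_k - J_{k-1})^T F_k
    return s, y_hat + y_bar

def safeguarded_diagonal(s, y, d_prev, lower=1e-8, upper=1e8, tiny=1e-12):
    """Assumed rule: entry i becomes y_i / s_i when s_i is not negligible and
    y_i, s_i share a sign; otherwise the previous entry is kept. The result is
    projected onto [lower, upper], so every entry stays positive."""
    s, y, d_prev = (np.asarray(a, dtype=float) for a in (s, y, d_prev))
    usable = (np.abs(s) > tiny) & (s * y > 0.0)
    ratio = y / np.where(usable, s, 1.0)  # dummy divisor where unusable
    return np.clip(np.where(usable, ratio, d_prev), lower, upper)

# Toy residual F(x) = x**2 (elementwise), so J(x) = diag(2x).
def toy_jvp(x, v):
    return 2.0 * x * v

def toy_vjp(x, w):
    return 2.0 * x * w

x_prev, x_curr = np.array([1.0, 2.0]), np.array([2.0, 3.0])
s, y = structured_secant_pair(toy_jvp, toy_vjp, x_prev, x_curr, x_curr ** 2)
D = safeguarded_diagonal(s, y, d_prev=np.ones(2))
print("s =", s, "y =", y, "D =", D)
```

For this toy residual the Jacobian is linear in $x$, so the ratio reproduces the exact Hessian diagonal $6x_i^2$ at the new iterate; in general it is only an approximation.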

3. Iterative Algorithm and Workflow

The GNB estimator is embedded in a globally convergent matrix-free optimization framework:

  1. Residual and Jacobian computations: At each step, compute the residual $F(x_k)$ and the required Jacobian–vector products (without storing $J(x_k)$ explicitly, enabling scalability).
  2. Search direction: Solve for $d_k = -D_k^{-1} g_k$, where $g_k = g(x_k) = J(x_k)^\top F(x_k)$.
  3. Non-monotone line search: Apply the Zhang–Hager rule to select an adaptive step size $\alpha_k$.
  4. Update iterate: $x_{k+1} = x_k + \alpha_k d_k$.
  5. Safeguard and update diagonal: Use the GNB estimator to update $D_{k+1}$.
  6. Convergence test: Stop when $\|g_k\|$ falls below a prescribed tolerance.
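
The workflow above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the published ASDH code: the function name `asdh_sketch`, the identity initial diagonal, step halving, the default parameters, and the fallback rule for unusable secant ratios are all choices made here. The Zhang–Hager part follows the standard recursion $Q_{k+1} = \eta Q_k + 1$, $C_{k+1} = (\eta Q_k C_k + f(x_{k+1})) / Q_{k+1}$.

```python
import numpy as np

def asdh_sketch(residual, jvp, vjp, x0, tol=1e-6, max_iter=500,
                eta=0.85, delta=1e-4, lower=1e-8, upper=1e8):
    """Diagonal structured quasi-Newton loop with a Zhang-Hager line search."""
    x = np.asarray(x0, dtype=float)
    F = residual(x)
    f = 0.5 * (F @ F)
    g = vjp(x, F)                # gradient J^T F, no Jacobian stored
    D = np.ones_like(x)          # assumed initial diagonal
    C, Q = f, 1.0                # Zhang-Hager reference value and weight
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        d = -g / D               # diagonal solve costs O(n)
        slope = g @ d
        alpha = 1.0
        while True:              # nonmonotone Armijo test against C, not f
            x_new = x + alpha * d
            F_new = residual(x_new)
            f_new = 0.5 * (F_new @ F_new)
            if f_new <= C + delta * alpha * slope or alpha < 1e-12:
                break
            alpha *= 0.5
        g_new = vjp(x_new, F_new)
        s = x_new - x
        # y = J_k^T J_k s + (J_k - J_{k-1})^T F_k, with k the new iterate
        y = vjp(x_new, jvp(x_new, s)) + g_new - vjp(x, F_new)
        usable = (np.abs(s) > 1e-12) & (s * y > 0.0)
        D = np.where(usable, y / np.where(usable, s, 1.0), D)
        D = np.clip(D, lower, upper)
        Q_next = eta * Q + 1.0
        C = (eta * Q * C + f_new) / Q_next
        Q = Q_next
        x, F, f, g = x_new, F_new, f_new, g_new
    return x, max_iter

# Toy usage: linear residual F(x) = a * x - b with a badly scaled diagonal Jacobian.
a = np.array([1.0, 10.0, 100.0])
b = np.array([1.0, 2.0, 3.0])

def toy_residual(x):
    return a * x - b

def toy_jvp(x, v):
    return a * v

def toy_vjp(x, w):
    return a * w

x_star, iters = asdh_sketch(toy_residual, toy_jvp, toy_vjp, np.zeros(3))
print("solution:", x_star, "iterations:", iters)
```

On this linear toy problem the secant ratio recovers the exact Hessian diagonal $a_i^2$ after the first accepted step, which is why the poor scaling does not slow the iteration down.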

4. Convergence Properties and Robustness

The GNB estimator is equipped with algorithmic guarantees:

  • Descent direction: The safeguarded diagonal ensures $D_k$ is positive definite, thus $d_k = -D_k^{-1} g(x_k)$ is always a descent direction.
  • Global convergence: Under standard NLS assumptions (bounded level sets and Lipschitz continuity conditions), the algorithm satisfies

$$\liminf_{k \to \infty} \| g(x_k) \| = 0.$$

With line search parameters constrained (the Zhang–Hager averaging parameter bounded above by some $\eta_{\max} < 1$), full convergence follows:

$$\lim_{k \to \infty} \| g(x_k) \| = 0.$$
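
The descent property can be verified in one line, assuming (as is usual for diagonal quasi-Newton schemes) that the search direction is $d_k = -D_k^{-1} g_k$ with $g_k = g(x_k)$, and that the safeguard keeps every diagonal entry in an interval $[\ell, u]$ with $0 < \ell \le u$. The symbols $\ell$ and $u$ are placeholders introduced here for the safeguard bounds:

$$g_k^\top d_k = -\sum_{i=1}^{n} \frac{(g_k^i)^2}{(D_k)_{ii}} \le -\frac{1}{u} \| g_k \|^2 < 0 \quad (g_k \neq 0), \qquad \| d_k \| \le \frac{1}{\ell} \| g_k \|.$$

These two bounds are the sufficient-descent and bounded-direction conditions that nonmonotone line-search convergence results typically require.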

5. Numerical Performance and Scaling

In the reported experiments, the GNB estimator performs strongly on high-dimensional NLS problems:

  • On a 30-problem test suite (dimensions up to […]), the GNB-based solver (ASDH) solves 80% of the problems with the least overall work (iterations, function evaluations, matrix–vector products), and 70% with the least CPU time.
  • Robustness: ASDH solves all instances without failure; a comparable structured diagonal method failed on three problems.
  • Efficiency: In nearly all cases, ASDH (using GNB) converges with fewer iterations and lower CPU cost than matrix-free alternatives utilizing less structured diagonal approximations.

6. Implementation and Practical Considerations

Implementation highlights:

  • Matrix-free: Only Jacobian–vector products are required; full storage or construction of the Jacobian or the Hessian is avoided.
  • Safeguarding: Critical for global convergence and stability—implemented componentwise with explicit bounds and positivity enforcement.
  • Parallelization: The structure naturally permits parallel application in each diagonal entry update and search direction computation.
  • Resource utilization: Suited for large-scale settings (potentially […]), benefitting from the reduced storage and computational complexity compared to full-matrix or even banded methods.
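
To make the matrix-free point concrete, the sketch below implements the products $J(x)v$ and $J(x)^\top w$ for a chained residual without ever forming $J$, then checks them against a dense Jacobian on a tiny instance. The residual and all function names are hypothetical examples chosen for illustration.

```python
import numpy as np

def residual(x):
    # Hypothetical chained residual: F_i(x) = x_i^2 + x_{i+1} - 1, i = 0..n-2.
    return x[:-1] ** 2 + x[1:] - 1.0

def jvp(x, v):
    # J(x) v, using dF_i/dx_i = 2 x_i and dF_i/dx_{i+1} = 1; O(n) work and memory.
    return 2.0 * x[:-1] * v[:-1] + v[1:]

def vjp(x, w):
    # J(x)^T w, accumulated into a length-n vector; O(n) work and memory.
    out = np.zeros_like(x)
    out[:-1] += 2.0 * x[:-1] * w
    out[1:] += w
    return out

def dense_jacobian(x):
    # Only for checking on a tiny problem; a matrix-free solver never builds this.
    n = x.size
    J = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    J[idx, idx] = 2.0 * x[:-1]
    J[idx, idx + 1] = 1.0
    return J

rng = np.random.default_rng(0)
x = rng.standard_normal(6)
v = rng.standard_normal(6)
w = rng.standard_normal(5)

J = dense_jacobian(x)
gradient = vjp(x, residual(x))  # g(x) = J(x)^T F(x) without storing J
print("max |Jv - jvp|    :", np.max(np.abs(J @ v - jvp(x, v))))
print("max |J^T w - vjp| :", np.max(np.abs(J.T @ w - vjp(x, w))))
```

Because each diagonal entry and each component of the search direction depends only on the matching components of $s$, $y$, and $g$, these vector operations also parallelize trivially.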

Limitations:

  • The diagonal structure limits the rate at which the method can exploit strong off-diagonal curvature (non-separability), particularly relevant for extremely ill-conditioned or highly coupled NLS problems.
  • The method is tailored for NLS; adaptation to general unconstrained smooth optimization may lose beneficial structure.

Deployment scenarios: The method is especially effective for large-scale NLS problems in scientific computing, model fitting, inverse problems, and machine learning applications where residual structure and Jacobian products can be exploited efficiently.

7. Comparison with Related Methods

  • Classic Gauss-Newton: Ignores the $C(x)$ term and uses $J^\top J$; GNB further exploits structure while retaining low computational cost.
  • Quasi-Newton/Diagonal BFGS: Updates are generic, lack NLS specificity, and typically do not enforce positive definiteness as robustly.
  • Structured Diagonal Heuristics: Previously proposed methods (e.g., SDHAM [Mohammad & Santos 2018]) lack the improved secant-based update and safeguarding components; GNB shows improved robustness and efficiency over these.

| Method | Hessian Approx. | Safeguarding | Matrix-free | NLS Structure Used | Global Convergence |
|---|---|---|---|---|---|
| Classic GN | $J^\top J$ | None | Yes | Yes | No |
| Quasi-Newton Diag | Diag approx., generic | Optional | Yes | No | Often weak |
| SDHAM | Structured diag. | Limited | Yes | Yes | Partial |
| GNB/ASDH | Structured diag. (secant) | Strong | Yes | Yes | Yes |

8. Summary

The Gauss-Newton-Bartlett estimator provides a principled, computationally efficient approach to diagonal Hessian approximation in large-scale nonlinear least squares problems, achieving matrix-free operation, robust safeguarding, and provable global convergence. Its design capitalizes on the separation of Jacobian and curvature terms unique to NLS and, in the reported experiments, outperforms less structured alternatives, making it a strong candidate for scalable second-order NLS solvers (Awwal et al., 2020).
