Full Gauss-Newton Method for Nonlinear Least Squares
- The full Gauss-Newton method is an iterative algorithm for solving nonlinear least squares problems using only first-order derivatives, which simplifies computation compared to Newton's method.
- It employs a majorant function to rigorously establish local convergence rates—quadratic with zero residuals and linear with small residuals—by bounding the nonlinear behavior of the Jacobian.
- Optimal convergence and uniqueness radii are derived under unified theoretical conditions, guiding effective initialization and adaptive termination in practical nonlinear optimization.
The full Gauss-Newton method is a fundamental iterative algorithm for solving nonlinear least squares problems of the form $\min_{x} \tfrac{1}{2}\|F(x)\|^2$, where $F:\Omega \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a continuously differentiable map. Distinct from Newton's method, which requires computation of second derivatives, the full Gauss-Newton method leverages only first-order derivative information—specifically, the Jacobian $F'$—thus simplifying the local quadratic model of the least-squares objective. The method's update,
$$x_{k+1} = x_k - \bigl(F'(x_k)^T F'(x_k)\bigr)^{-1} F'(x_k)^T F(x_k), \qquad k = 0, 1, \ldots,$$
defines a sequence $\{x_k\}$ that approximates the solution $x_*$, provided certain regularity and proximity conditions are met. The local convergence characteristics of the full Gauss-Newton method, including convergence rate, radius of convergence, and uniqueness domain, can be rigorously elucidated via the introduction of a majorant function—a scalar function that upper bounds the nonlinear behavior of the Jacobian. This approach both unifies and generalizes classical local convergence results, including those based on Lipschitz continuity and analytic (Smale-type) conditions (Ferreira et al., 2010).
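The iteration above can be sketched in a few lines; the exponential-fitting example and all names below are illustrative, not taken from the source, and the normal-equations step is computed via a least-squares solve for numerical stability.

```python
import numpy as np

def gauss_newton(F, J, x0, tol=1e-10, max_iter=50):
    """Full Gauss-Newton for min 0.5*||F(x)||^2, assuming J(x) has
    full column rank near the solution.

    F : callable returning the residual vector F(x), shape (m,)
    J : callable returning the Jacobian F'(x), shape (m, n)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Gauss-Newton step: minimize the linearized residual
        # ||F(x) + F'(x) d|| over d, i.e. d = -(J^T J)^{-1} J^T F(x),
        # computed stably via a least-squares solve.
        d, *_ = np.linalg.lstsq(J(x), -F(x), rcond=None)
        x = x + d
        if np.linalg.norm(d) < tol:  # iterates have stabilized
            break
    return x

# Illustrative zero-residual problem: fit y = exp(a*t) to exact data.
t = np.linspace(0.0, 1.0, 20)
y = np.exp(0.7 * t)
F = lambda a: np.exp(a[0] * t) - y             # residual map R^1 -> R^20
Jac = lambda a: (t * np.exp(a[0] * t))[:, None]  # Jacobian, shape (20, 1)
a_star = gauss_newton(F, Jac, x0=[0.0])
```

Because the data are exact, this is a zero-residual instance, the regime in which the convergence theory below predicts the fastest (Q-quadratic) local rate.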
1. Local Convergence Under Majorant Conditions
A central insight is that global assumptions (e.g., global Lipschitz continuity of the Jacobian $F'$) are often unnecessarily restrictive for local convergence analysis. Instead, the method's behavior near the solution $x_*$ can be characterized using a majorant function $f:[0,R) \to \mathbb{R}$, continuously differentiable, with
$$f(0) = 0, \qquad f'(0) = -1, \qquad f' \text{ convex and strictly increasing},$$
and satisfying a differential bounding condition:
$$\bigl\|F'(x) - F'\bigl(x_* + \tau(x - x_*)\bigr)\bigr\| \le f'\bigl(\|x - x_*\|\bigr) - f'\bigl(\tau\|x - x_*\|\bigr), \qquad \tau \in [0,1],$$
for all $x$ in a neighborhood of $x_*$. This condition generalizes the classical Lipschitz assumption and allows for a sharper, more flexible local analysis.
The main result (Theorem 7) states that, assuming $F'(x_*)$ has full rank, the sequence defined by the full Gauss-Newton iteration, started sufficiently close to $x_*$, remains within a ball $B(x_*, r)$ and converges to $x_*$. The resulting error estimate, schematically of the form
$$\|x_{k+1} - x_*\| \le C_1\,\|x_k - x_*\|^2 + C_2\,\|F(x_*)\|\,\|x_k - x_*\|,$$
captures both the nonlinear behavior of $F'$ (via $f$) and the local residual $\|F(x_*)\|$. When $\|F(x_*)\| = 0$ the method achieves Q-quadratic convergence; for small $\|F(x_*)\|$ the rate is Q-linear; and for large $\|F(x_*)\|$ convergence may fail.
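These two regimes can be observed numerically. The scalar exponential model below is an illustrative test problem (not from the source); errors are measured against the final iterate, once for exact data (zero residual) and once for slightly noisy data (small residual).

```python
import numpy as np

t = np.linspace(0.0, 1.0, 30)
a_true = 0.7

def gn_errors(y, a0=0.2, iters=8):
    """Run scalar full Gauss-Newton on min 0.5*||exp(a*t) - y||^2
    and return the error history |a_k - a_hat| against the limit."""
    F = lambda a: np.exp(a * t) - y
    Jc = lambda a: t * np.exp(a * t)   # Jacobian column as a 1-D array
    a = a0
    history = [a]
    for _ in range(iters):
        J = Jc(a)
        a = a - (J @ F(a)) / (J @ J)   # scalar Gauss-Newton step
        history.append(a)
    a_hat = history[-1]
    return [abs(ak - a_hat) for ak in history[:-1]]

# Zero residual: F(x_*) = 0, errors should collapse quadratically.
e_zero = gn_errors(np.exp(a_true * t))
# Small residual: noisy data, convergence to the perturbed minimizer.
rng = np.random.default_rng(0)
e_small = gn_errors(np.exp(a_true * t) + 1e-3 * rng.standard_normal(t.size))
```

In the zero-residual run the error drops to machine precision within a handful of iterations, consistent with the Q-quadratic regime; the noisy run still converges rapidly because the residual at the minimizer is small.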
2. Optimal Radius and Uniqueness Region
The majorant condition enables precise quantification of the optimal convergence radius. Constants derived from $f$ and the residual are introduced:
- $\nu := \sup\{t \in [0,R) : f'(t) < 0\}$, the endpoint up to which the majorant derivative remains negative,
- an auxiliary radius $\rho$, determined by a scalar equation involving $f$ and the residual $\|F(x_*)\|$,
- the convergence radius $r := \min\{\nu, \rho\}$, and the convergence ball is $B(x_*, r)$.
It is shown (Lemma 16) that, under a certain boundary condition on $f$, the radius $r$ is sharp—providing the largest possible convergence radius under the majorant structure. In situations where uniqueness of the solution is required, an additional bound involving $f$ defines an even smaller radius, ensuring that $x_*$ is the unique solution of the least-squares problem in the corresponding ball.
3. Unified Framework for Convergence Analyses
The majorant function approach subsumes previous, apparently disparate frameworks:
- Classical local convergence under a Lipschitz condition on $F'$ (with constant $L$) is recovered by selecting the quadratic majorant $f(t) = \tfrac{L}{2}t^2 - t$,
- Smale's point estimate theory for analytic mappings is recovered by taking the majorant $f(t) = \tfrac{t}{1 - \gamma t} - 2t$ for an appropriate $\gamma > 0$,
- Both yield optimal radii and uniqueness results under the same abstract machinery.
This unification demonstrates that the full Gauss-Newton method's local convergence theory is not tied to the classical Lipschitz-type bounds but adapts flexibly via the choice of an appropriate majorant.
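For concreteness, a short derivation (a standard check, not verbatim from the source) shows how the quadratic majorant reduces the condition of Section 1 to the classical Lipschitz bound:

```latex
% Take f(t) = (L/2) t^2 - t, so f(0) = 0, f'(t) = Lt - 1, f'(0) = -1,
% with f' affine, hence convex and strictly increasing. Then
f'\bigl(\|x - x_*\|\bigr) - f'\bigl(\tau \|x - x_*\|\bigr)
  = L (1 - \tau) \|x - x_*\|
  = L \bigl\| x - \bigl( x_* + \tau (x - x_*) \bigr) \bigr\| ,
% so the majorant condition specializes to
% \|F'(x) - F'(x_* + \tau(x - x_*))\| \le L \| x - (x_* + \tau(x - x_*)) \|,
% i.e. Lipschitz continuity of F' along the segment from x_* to x.
```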
4. Convergence Sensitivity and Rate
A key feature revealed by the analysis is the sensitivity of the convergence rate to both the behavior of $F'$ (through $f$) and the residual $\|F(x_*)\|$. The error bound displays explicit dependence on $\|F(x_*)\|$:
- Zero-residual regime: $\|F(x_*)\| = 0$; Q-quadratic convergence;
- Small-residual regime: $\|F(x_*)\|$ small but nonzero; Q-linear convergence, with the rate deteriorating smoothly as $\|F(x_*)\|$ increases;
- Large-residual regime: convergence may not be assured.
This delineation clarifies practical expectations on convergence based on proximity and the nature of the target solution (exact or inexact).
5. Implications for Method Design and Termination
The optimal convergence radius informs how close the initial iterate $x_0$ must lie to $x_*$ for guaranteed convergence. Practically, this determines initialization strategies and the reliability of the full Gauss-Newton approach in the presence of pronounced nonlinearity. Furthermore, the explicit error estimates can guide adaptive termination criteria and step-size control, especially in ill-conditioned or near-degenerate cases.
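A termination strategy consistent with this picture can be sketched as follows (an illustrative design, not prescribed by the source): stop either when the gradient $F'(x)^T F(x)$ is small (first-order stationarity, the right test when the residual at the solution is nonzero) or when the step stabilizes.

```python
import numpy as np

def gn_with_termination(F, J, x0, step_tol=1e-10, grad_tol=1e-8, max_iter=100):
    """Full Gauss-Newton with a two-part termination criterion."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        r, Jk = F(x), J(x)
        # Stationarity test: J^T F = 0 at a minimizer, even when F(x) != 0
        # (i.e. in the small-residual regime F itself never vanishes).
        if np.linalg.norm(Jk.T @ r) < grad_tol:
            return x, k, "stationary"
        d, *_ = np.linalg.lstsq(Jk, -r, rcond=None)
        x = x + d
        # Step test: iterates have stabilized inside the convergence ball.
        if np.linalg.norm(d) < step_tol:
            return x, k + 1, "step"
    return x, max_iter, "max_iter"

# Illustrative use on the exponential model y = exp(0.7*t) (zero residual).
t = np.linspace(0.0, 1.0, 20)
F = lambda a: np.exp(a[0] * t) - np.exp(0.7 * t)
J = lambda a: (t * np.exp(a[0] * t))[:, None]
x_hat, n_iter, status = gn_with_termination(F, J, x0=[0.0])
```

Returning the stopping reason alongside the iterate makes it easy to distinguish a genuine stationary point from an iteration-budget exhaustion in the large-residual regime.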
6. Broader Impact and Extension to Generalized Gauss-Newton Methods
The majorant framework directly inspires generalizations to composite and constrained settings. For example, proximal or projected variants of Gauss-Newton inherit analogous convergence properties when a suitable majorant condition controls the Jacobian's behavior in the regularized or projected domain. The results also indicate that strong Lipschitz conditions may be replaced with more tractable, locally verifiable majorant constructions, broadening the full Gauss-Newton method's effective domain of applicability.
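As one hypothetical illustration of such a variant (a sketch under simple box constraints, not an algorithm from the source), a projected Gauss-Newton method follows each full step with a Euclidean projection onto the feasible set:

```python
import numpy as np

def projected_gauss_newton(F, J, x0, lo, hi, tol=1e-10, max_iter=100):
    """Illustrative projected Gauss-Newton for box-constrained least
    squares: take the full Gauss-Newton step, then project onto
    [lo, hi] componentwise (the Euclidean projection for a box)."""
    x = np.clip(np.asarray(x0, dtype=float), lo, hi)
    for _ in range(max_iter):
        d, *_ = np.linalg.lstsq(J(x), -F(x), rcond=None)
        x_new = np.clip(x + d, lo, hi)  # projection after the full step
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Toy example with a separable linear residual F(x) = x - b, so the
# projected fixed point is simply the clipped unconstrained solution.
b = np.array([2.0, 0.5])
F = lambda x: x - b
J = lambda x: np.eye(2)
x_box = projected_gauss_newton(F, J, x0=[0.0, 0.0], lo=0.0, hi=1.0)
```

For this separable example the fixed point is the componentwise clipping of the unconstrained minimizer; for general Jacobians a proper analysis would, as the text notes, require a majorant condition adapted to the projected domain.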
7. Summary Table: Theoretical Guarantees under Majorant Conditions
Result Property | Majorant Condition on $f$ | Residual Regime | Convergence Rate | Guarantee Domain |
---|---|---|---|---|
Local existence/convergence | general majorant | $\lVert F(x_*)\rVert = 0$ | Q-quadratic | $B(x_*, r)$ |
Local existence/convergence | general majorant | $\lVert F(x_*)\rVert$ small | Q-linear | $B(x_*, r)$ |
Optimality of radius | boundary condition (see Lemma 16) | any | -- | $r$ is sharp |
Uniqueness of solution | additional bound on $f$ | -- | -- | uniqueness ball inside $B(x_*, r)$ |
Unification of theories | choice of $f$ | -- | -- | both Lipschitz/analytic |
These results provide a comprehensive local theory, supplanting and enhancing classical convergence analyses for nonlinear least squares problems.
The majorant-based local convergence framework developed in (Ferreira et al., 2010) provides the mathematical infrastructure for rigorous and generalizable guarantees for the full Gauss-Newton method in nonlinear least-squares settings. It enables practical assessment of convergence/uniqueness domains and convergence rates, offers a unified view of previously diverse results, and lays the foundation for further development of Newton-type and Gauss-Newton-type methods in broader nonlinear and composite settings.