GN-Prox-Linear Method Overview
- GN-Prox-Linear Method is an iterative optimization approach that linearizes nonlinear operators while integrating proximal mappings to handle convex penalties.
- It guarantees local convergence under generalized Lipschitz conditions, achieving linear to quadratic convergence depending on the problem's regularity.
- The method efficiently tackles penalized nonlinear least squares in both constrained and unconstrained settings while ensuring numerical stability and feasibility.
The GN-Prox-Linear method, also known as the proximal Gauss–Newton method, is an iterative optimization framework designed for solving penalized nonlinear least squares problems, particularly in the presence of convex constraints or nonsmooth regularization. By combining a linearization of the nonlinear operator with proximity mappings for convex penalties, it generalizes classical Gauss–Newton approaches and provides both theoretical guarantees and practical effectiveness for a variety of structured problems ranging from nonlinear equations to signal processing. The development of the method is characterized by local convergence results under generalized Lipschitz conditions, explicit analyses of the basin of attraction, and robust numerical performance in constrained and unconstrained settings (Salzo et al., 2011).
1. Formulation and Algorithmic Structure
The GN-Prox-Linear method addresses penalized nonlinear least squares problems of the form

$$\min_{x \in X} \; \tfrac{1}{2}\|F(x)\|^2 + J(x),$$

where $F : X \to Y$ is a differentiable nonlinear operator between Hilbert spaces and $J$ is a proper, lower semicontinuous convex penalty (e.g., an indicator function for constraints, or a convex regularizer).
At each iteration, the method linearizes $F$ around the current iterate $x_k$,

$$F(x) \approx F(x_k) + F'(x_k)(x - x_k),$$

and then incorporates the convex penalty via a proximity operator defined with respect to the metric induced by the linearized Jacobian, $H_k = F'(x_k)^* F'(x_k)$. The update formula is

$$x_{k+1} = \operatorname{prox}^{H_k}_{J}\!\bigl(x_k - H_k^{-1} F'(x_k)^* F(x_k)\bigr), \qquad \operatorname{prox}^{H}_{J}(z) = \arg\min_{x}\Bigl\{ J(x) + \tfrac{1}{2}\|x - z\|_{H}^{2} \Bigr\},$$

where $\operatorname{prox}^{H_k}_{J}$ denotes the proximity operator of $J$ with respect to the metric induced by $H_k$ (or a metric projection for indicator penalties). If $J \equiv 0$, the method reduces to classical Gauss–Newton.
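As a concrete illustration of the update, here is a minimal Python sketch, not taken from the source, assuming the penalty $J(x) = \lambda\|x\|_1$ and a small dense Jacobian; it performs one GN-Prox-Linear step by solving the linearized subproblem with an inner ISTA (forward–backward) loop, one possible way to evaluate the metric proximity map approximately.

```python
import numpy as np

def prox_l1(v, t):
    # Soft-thresholding: proximity operator of t*||.||_1 in the Euclidean metric.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def gn_prox_linear_step(F, Jac, x, lam, n_inner=200):
    """One GN-Prox-Linear step for min_x 0.5*||F(x)||^2 + lam*||x||_1 (illustrative setup).

    Solves the linearized subproblem
        min_z 0.5*||F(x) + Jac(x)(z - x)||^2 + lam*||z||_1
    with an inner ISTA (forward-backward) loop; this amounts to applying the
    proximity operator of lam*||.||_1 in the metric H = Jac(x)^T Jac(x).
    """
    r, A = F(x), Jac(x)
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the smooth part
    z = x.copy()
    for _ in range(n_inner):
        grad = A.T @ (r + A @ (z - x))       # gradient of the linearized least-squares term
        z = prox_l1(z - step * grad, step * lam)
    return z
```

With `lam=0` the inner loop simply minimizes the linearized least-squares model, so the step coincides (up to inner-solver accuracy) with a classical Gauss–Newton step.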
2. Convergence Theory and Generalized Lipschitz Conditions
Convergence is local: the sequence of iterates converges to a local solution $x^*$ provided the starting point is sufficiently close and specific regularity conditions are met. The analysis weakens the standard Lipschitz assumption, introducing "radius Lipschitz" and "center Lipschitz" conditions:
- Radius Lipschitz: for all $x$ in a ball around $x^*$ and all $\tau \in [0, 1]$, with $\rho(x) = \|x - x^*\|$,
  $$\|F'(x) - F'(x^* + \tau(x - x^*))\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\,du.$$
- Center Lipschitz:
  $$\|F'(x) - F'(x^*)\| \le \int_{0}^{\rho(x)} L(u)\,du,$$

where $L$ is an increasing, continuous function quantifying local smoothness.
A critical convergence condition requires a quantity combining the norm of the pseudoinverse $\|F'(x^*)^\dagger\|$, the local condition number of $F'(x^*)$, the residual at the solution, and the modulus $L$ to stay below an explicit threshold on the candidate convergence ball. Under these assumptions, the method guarantees

$$\|x_{k+1} - x^*\| \le \psi\bigl(\|x_k - x^*\|\bigr)\,\|x_k - x^*\|$$

with a strictly increasing contraction function $\psi$ satisfying $\psi(r) < 1$ on that ball. Linear convergence is established, and quadratic convergence occurs if the local residual vanishes ($F(x^*) = 0$). The approach provides explicit estimates for the radius of the local convergence ball.
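For intuition about the two regimes, the following small sketch (not from the source) estimates the empirical convergence order from a sequence of errors $e_k = \|x_k - x^*\|$; the two error sequences in the demonstration are synthetic, chosen to mimic the linear and the zero-residual quadratic behaviour.

```python
import numpy as np

def empirical_order(errors):
    """Estimate the local convergence order p from errors e_k = ||x_k - x*||,
    using p ~= log(e_{k+1}/e_k) / log(e_k/e_{k-1})."""
    e = np.asarray(errors, dtype=float)
    return np.log(e[2:] / e[1:-1]) / np.log(e[1:-1] / e[:-2])

# Synthetic sequences: geometric decay mimics the linear regime (p ~ 1),
# error squaring mimics the zero-residual, quadratic regime (p ~ 2).
print(empirical_order([1e-1, 5e-2, 2.5e-2, 1.25e-2]))  # ~ [1., 1.]
print(empirical_order([1e-1, 1e-2, 1e-4, 1e-8]))       # ~ [2., 2.]
```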
3. Handling Constraints and Penalties
For problems with convex constraints ($x \in C$, with $C$ closed and convex), the penalty is taken as the indicator function of $C$:

$$J(x) = \iota_C(x) = \begin{cases} 0 & x \in C, \\ +\infty & \text{otherwise.} \end{cases}$$

The proximity operator for indicator penalties is simply the metric projection onto $C$ under the metric $H_k$, $\operatorname{prox}^{H_k}_{\iota_C} = P_C^{H_k}$, so the update becomes

$$x_{k+1} = P_C^{H_k}\!\bigl(x_k - H_k^{-1} F'(x_k)^* F(x_k)\bigr).$$

Projections in non-Euclidean metrics may not admit a closed form, motivating the use of inner iterative routines (such as forward–backward algorithms) to compute them approximately. Empirically, the algorithm's structure ensures feasibility is preserved at each step, which often helps maintain good conditioning of the linearized operator $F'(x_k)$.
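When $C$ is a box, the Euclidean projection is a componentwise clip, but the projection in the $H_k$ metric generally is not; the following sketch (an illustrative construction, not code from the source) approximates such a metric projection with a projected-gradient (forward–backward) inner loop.

```python
import numpy as np

def metric_box_projection(z, H, lo, hi, n_inner=500):
    """Approximate the projection of z onto the box [lo, hi] in the metric induced by
    a symmetric positive definite H, i.e. argmin_{lo <= x <= hi} 0.5*(x-z)^T H (x-z),
    via a projected-gradient (forward-backward) inner loop."""
    step = 1.0 / np.linalg.norm(H, 2)   # 1 / largest eigenvalue of H
    x = np.clip(z, lo, hi)              # warm start: Euclidean projection
    for _ in range(n_inner):
        x = np.clip(x - step * (H @ (x - z)), lo, hi)
    return x
```

For a diagonal $H$ the metric and Euclidean projections onto a box coincide componentwise; the inner loop matters precisely when $H_k = F'(x_k)^* F'(x_k)$ couples the coordinates.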
4. Numerical Performance and Robustness
Empirical evaluation includes benchmark nonlinear least squares problems (Rosenbrock, Osborne 1 and 2, Kowalik) and truly constrained instances; a small constrained toy example is sketched after the list below. Key observations:
- The method produces feasible iterates and robust convergence regardless of initialization.
- For constrained problems, it maintains bounded condition numbers for the linearized operator $F'(x_k)$, yielding convergence even where Gauss–Newton alone may fail due to ill-conditioning.
- Typically, convergence to high precision requires only a small number of outer iterations (across 20 random initializations).
- Both solutions in the interior and at the boundary of feasible sets are handled efficiently.
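The toy example promised above: a self-contained sketch (illustrative only; the box bounds, starting point, and iteration counts are arbitrary choices, not the source's benchmark protocol) that runs the prox-linear update with a box-indicator penalty on the Rosenbrock problem written as nonlinear least squares.

```python
import numpy as np

def residual(x):
    # Rosenbrock as nonlinear least squares: r(x) = (10*(x2 - x1^2), 1 - x1).
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def jacobian(x):
    return np.array([[-20.0 * x[0], 10.0],
                     [-1.0, 0.0]])

def constrained_prox_linear(x0, lo, hi, n_outer=30, n_inner=2000):
    """GN-Prox-Linear with J = indicator of the box [lo, hi]: each outer step solves
        min_{lo <= z <= hi} 0.5*||r_k + A_k (z - x_k)||^2
    approximately via a projected-gradient inner loop, so every iterate stays feasible."""
    x = np.clip(np.asarray(x0, dtype=float), lo, hi)
    for _ in range(n_outer):
        r, A = residual(x), jacobian(x)
        step = 1.0 / np.linalg.norm(A, 2) ** 2
        z = x.copy()
        for _ in range(n_inner):
            z = np.clip(z - step * (A.T @ (r + A @ (z - x))), lo, hi)
        x = z
    return x

print(constrained_prox_linear([-1.2, 1.0], lo=0.0, hi=2.0))  # ~ [1., 1.] (interior solution)
```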
5. Theoretical and Practical Significance
The GN-Prox-Linear method unifies Gauss–Newton linearization with proximal mappings while permitting generalized regularity assumptions. Its main strengths are:
- Explicit convergence rates and radii independent of standard global Lipschitz assumptions.
- Applicability to nonsmooth convex penalties and constraints via proximity operators/projection.
- Empirical success in constrained environments, with feasibility and numerical stability maintained.
The method serves both as a generalization of classical nonlinear least squares optimization and as a robust algorithm for structured inverse problems with nonsmooth regularization. Its analysis (via generalized Lipschitz properties and proximity operators in non-Euclidean metrics) is foundational for later developments in composite optimization and for applications requiring guaranteed descent in the presence of constraints and regularization.
6. Relationship to Majorization, Proximal, and Splitting Methods
While the GN-Prox-Linear method leverages linearization in the style of Gauss–Newton, its use of proximity operators for nonsmooth convex terms aligns it with the broader class of proximal gradient and splitting algorithms. Unlike the proximal distance or majorization-minimization algorithms, which emphasize objective majorants and feasibility via projections, the GN-Prox-Linear method emphasizes a problem structure where the smooth and nonsmooth terms coexist, handled via metric-specific proximity maps after an explicit linearization of the nonlinear component. This architecture is also reflected in more recent composite and multiproximal frameworks, where subproblem structure and local metric adaptation are central (Bolte et al., 2017).
7. Key Mathematical Objects and Update Formulae
Object | Definition / Formula | Role in the Method
---|---|---
Linearized Model | $F(x_k) + F'(x_k)(x - x_k)$ | Subproblem for each iteration
Metric matrix | $H_k = F'(x_k)^* F'(x_k)$ | Defines proximity geometry
Update formula | $x_{k+1} = \operatorname{prox}^{H_k}_{J}\!\bigl(x_k - H_k^{-1} F'(x_k)^* F(x_k)\bigr)$ | Core iterative step
This formalism provides the basis for detailed implementation and analysis in both unconstrained and constrained settings. The explicit operator-level updates and regularization via proximity induce guaranteed descent and facilitate both theoretical convergence proofs and practical algorithmic stability.