
Consistency of the Universal Stochastic Newton algorithm with 1/n step size

Establish almost sure consistency of the Universal Stochastic Newton algorithm in which the parameter iterate is updated by theta_n = theta_{n-1} - (1/n) A_{n-1} ∇_h g(X_n, theta_{n-1}), where A_n is the online inverse-Hessian estimator updated via A_n = A_{n-1} - γ_n (P_n Q_n^T + Q_n P_n^T - 2 I_d), with P_n = A_{n-1} Z_n, Q_n = ∇_h^2 g(X_n, theta_{n-1}) Z_n, and (Z_n) a sequence of independent, centered random vectors satisfying E[Z_n Z_n^T] = I_d. Prove that theta_n converges almost surely to the unique minimizer theta of G(h) = E[g(X, h)] under the regularity assumptions on G used in the paper (convexity, twice differentiability, and a positive-definite Hessian at theta).
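To make the recursion concrete, here is a minimal, self-contained sketch (not the authors' implementation) on an assumed streaming least-squares model with g(X, h) = (1/2)(y - x·h)^2, for which the per-sample gradient and Hessian are available in closed form. The helper names, the Rademacher choice for Z_n, the constants in the step sequence γ_n, and the absence of any stabilization device are illustrative assumptions made for this sketch only.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
theta_star = rng.normal(size=d)          # true minimizer of G(h) = E[g(X, h)] in this toy model

def simulate_point():
    """Draw one observation X_n = (x_n, y_n) from a linear model (illustrative assumption)."""
    x = rng.normal(size=d)
    y = x @ theta_star + rng.normal()
    return x, y

def grad_g(x, y, h):
    """Per-sample gradient  ∇_h g(X, h) = -(y - x·h) x."""
    return -(y - x @ h) * x

def hess_g(x, y, h):
    """Per-sample Hessian  ∇²_h g(X, h) = x xᵀ (independent of h for squared loss)."""
    return np.outer(x, x)

def gamma(n, c_gamma=0.05, beta=0.75):
    """Step sequence for the inverse-Hessian recursion (constants chosen for illustration)."""
    return c_gamma / n ** beta

theta = np.zeros(d)                      # theta_0
A = np.eye(d)                            # A_0, initial inverse-Hessian estimate
I_d = np.eye(d)

for n in range(1, 20001):
    x, y = simulate_point()
    g_n = grad_g(x, y, theta)            # ∇_h g(X_n, theta_{n-1})
    H_n = hess_g(x, y, theta)            # ∇_h^2 g(X_n, theta_{n-1})

    # theta_n = theta_{n-1} - (1/n) A_{n-1} ∇_h g(X_n, theta_{n-1})  -- the 1/n step whose
    # consistency is the open question.
    theta_new = theta - (1.0 / n) * (A @ g_n)

    # A_n = A_{n-1} - γ_n (P_n Q_n^T + Q_n P_n^T - 2 I_d), with P_n = A_{n-1} Z_n and
    # Q_n = ∇_h^2 g(X_n, theta_{n-1}) Z_n.  Rademacher Z_n satisfies E[Z_n Z_n^T] = I_d.
    # Any additional stabilization used in the paper is omitted from this sketch.
    Z = rng.choice([-1.0, 1.0], size=d)
    P = A @ Z
    Q = H_n @ Z
    A = A - gamma(n) * (np.outer(P, Q) + np.outer(Q, P) - 2.0 * I_d)

    theta = theta_new

print("||theta_n - theta*||   =", np.linalg.norm(theta - theta_star))
# Here E[x x^T] = I_d, so the target inverse Hessian is the identity matrix.
print("||A_n - H^{-1}||_F     =", np.linalg.norm(A - np.eye(d)))
```

Running such a sketch only illustrates the recursion on one toy instance; it does not, of course, bear on the almost sure consistency that the question asks to establish.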


Background

The paper introduces a direct Robbins–Monro procedure to estimate the inverse Hessian online and then builds Universal Stochastic Newton algorithms on top of it. A simpler, non-averaged variant (termed the Universal Stochastic Newton algorithm) updates theta_n directly with the current inverse-Hessian estimate A_{n-1}, without the weighted averaging step.

For step sequences ν_n of order n^{-ν} with exponent ν in (1/2, 1−β), the authors establish almost sure convergence rates. For the classical Stochastic Newton choice ν_n = 1/n, however, they explicitly state that they could not prove consistency of the resulting estimator, leaving this as an unresolved question.

References

In addition, mention that following the reasoning presented by , one could take a step sequence of the form ν_n = 1/n leading to the Stochastic Newton algorithm. However, we are unfortunately not able to obtain the consistency of the estimates in this context.

Online estimation of the inverse of the Hessian for stochastic optimization with application to universal stochastic Newton algorithms (arXiv:2401.10923, Godichon-Baggioni et al., 15 Jan 2024), in a remark in Section 4: Universal Weighted Averaged Stochastic Newton Algorithm.