
Implement and evaluate the Jacobian-free linearized empirical Fisher variant

Develop and implement the Jacobian-free empirical Fisher (EF) approximation for the linearized gradient/Hessian estimator. This variant takes the gradient of the scalar objective −(1/2)(y − h_t(w))^T R^{-1}(y − h_t(w)) and replaces the Hessian with the outer product of that gradient, thereby avoiding any explicit Jacobian computation. Then empirically assess its accuracy and computational speed on high-dimensional observation models relative to the Hessian/Jacobian-based linearized approach.
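The core computation can be sketched in a few lines. The sketch below is a hypothetical implementation (the function name `jacobian_free_ef` and its signature are not from the paper): it differentiates the scalar objective −(1/2)(y − h_t(w))^T R^{-1}(y − h_t(w)) with a single backward pass, so the network Jacobian is never formed, and approximates the Hessian by the outer product of the resulting gradient.

```python
import jax
import jax.numpy as jnp


def jacobian_free_ef(h, w, y, R_inv):
    """Hypothetical sketch of the Jacobian-free EF estimator.

    h     : observation function h_t, mapping parameters w to predictions.
    w     : parameter vector at which to evaluate the estimator.
    y     : observed data vector.
    R_inv : inverse observation-noise covariance.

    Returns the gradient g of the scalar objective and the EF outer
    product g g^T used in place of the Hessian.
    """
    def objective(w):
        r = y - h(w)                 # residual
        return -0.5 * r @ R_inv @ r  # scalar log-likelihood term

    g = jax.grad(objective)(w)       # one backward pass; no Jacobian of h
    F = jnp.outer(g, g)              # empirical Fisher approximation
    return g, F
```

For a linear model h(w) = A w, the gradient reduces to A^T R^{-1}(y − A w), which gives a simple correctness check; the cost advantage over the Jacobian-based variant is expected to appear when the observation dimension is large.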


Background

The authors propose two empirical Fisher (EF) approximations: one that estimates the Hessian by an outer product of Monte Carlo gradients, and a second, Jacobian-free EF variant for the linearized case that avoids computing the network Jacobian. They provide formulas for the Jacobian-free EF gradient and Hessian (outer product of gradient) under the linearized likelihood.

Although the authors anticipate computational advantages for high-dimensional observations, they did not implement or evaluate this Jacobian-free EF combination in their experiments, so its empirical benefits relative to the Hessian/Jacobian-based linearized method remain unknown.

References

“We expect ef [the linearized EF variant] to be much faster than hess [the Hessian-based variant] with high-dimensional observations (since it avoids computing the Jacobian), but we do not report experimental results on this combination and leave its implementation to future work.”

Bayesian Online Natural Gradient (BONG) (2405.19681 - Jones et al., 30 May 2024) in Section “Empirical Fisher” (sec:EF)