
Analytically invert the Fisher block for the DLR Gaussian variational family

Derive an efficient analytical expression for the inverse of the Fisher information matrix block corresponding to the diagonal-plus-low-rank Gaussian variational parameterization (Σ, W), where Σ is diagonal and the precision matrix is Σ + W W^T, so that natural-gradient updates can be performed directly in the (Σ, W) parameterization without resorting to projection via singular value decomposition (SVD).
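For context, the DLR structure itself is cheap to work with: while the open question concerns the Fisher block, the precision matrix Σ + W W^T can be inverted in O(PR²) via the standard Woodbury identity. A minimal illustrative sketch in plain NumPy (not code from the paper; the function name is ours):

```python
import numpy as np

def dlr_solve(Sigma, W, v):
    """Apply (diag(Sigma) + W W^T)^{-1} to v via the Woodbury identity.

    Sigma: (P,) positive diagonal entries of the diagonal part.
    W:     (P, R) low-rank factor.
    Cost is O(P R^2) instead of the O(P^3) needed to form and
    invert the dense P x P precision matrix.
    """
    Sinv_v = v / Sigma                     # diag(Sigma)^{-1} v
    Sinv_W = W / Sigma[:, None]            # diag(Sigma)^{-1} W
    # R x R "capacitance" matrix: I + W^T diag(Sigma)^{-1} W
    K = np.eye(W.shape[1]) + W.T @ Sinv_W
    return Sinv_v - Sinv_W @ np.linalg.solve(K, W.T @ Sinv_v)
```

This is the identity that makes DLR Gaussian families attractive at scale; the open question is whether a comparably cheap closed form exists for the Fisher block in the (Σ, W) coordinates themselves.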


Background

The paper introduces a diagonal-plus-low-rank (DLR) Gaussian variational family with precision Σ + W W^T to enable scalable online Bayesian learning. For natural gradient descent in this parameterization, one needs the inverse of the Fisher information matrix in (Σ, W) coordinates. Unlike the full-covariance Gaussian case, the authors could not find an efficient analytical inversion for the relevant Fisher block.
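The contrast with the full-covariance case can be made concrete. The following is the textbook Gaussian Fisher metric in moment coordinates (not necessarily the paper's exact equation; $S$ here denotes the covariance, to avoid a clash with the diagonal factor Σ above). For $\mathcal{N}(\mu, S)$, the Fisher matrix is block-diagonal with

$$F_\mu = S^{-1}, \qquad F_S[V] = \tfrac{1}{2}\, S^{-1} V S^{-1},$$

and both blocks invert in closed form:

$$F_\mu^{-1} g = S\, g, \qquad F_S^{-1}[G] = 2\, S\, G\, S.$$

No analogous closed form is known once the precision is constrained to the DLR form Σ + W W^T and the Fisher is expressed in (Σ, W) coordinates.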

As a workaround, they update in full-covariance natural parameters and then project the precision back to low rank via SVD. An analytic inversion would allow direct and potentially faster natural-gradient updates in (Σ, W), simplifying the method and reducing computational overhead.
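The projection step can be illustrated schematically. Below is one eigendecomposition-based way to project a dense precision matrix onto DLR form, chosen so the diagonal of the approximation matches exactly; this is a sketch of the general technique, not necessarily the paper's exact SVD procedure:

```python
import numpy as np

def project_dlr(Lam, rank):
    """Project a symmetric PSD precision matrix onto diag + low-rank form.

    Illustrative sketch: take the top-`rank` eigenpairs of Lam as the
    low-rank factor W, then pick the diagonal Sigma so that
    diag(Sigma + W W^T) equals diag(Lam) exactly.
    """
    evals, evecs = np.linalg.eigh(Lam)             # ascending eigenvalues
    idx = np.argsort(evals)[::-1][:rank]           # top-`rank` eigenpairs
    W = evecs[:, idx] * np.sqrt(np.clip(evals[idx], 0.0, None))
    # Match the diagonal of Lam; for PSD Lam this difference is >= 0.
    Sigma = np.diag(Lam) - np.sum(W**2, axis=1)
    Sigma = np.clip(Sigma, 1e-8, None)             # keep strictly positive
    return Sigma, W
```

A closed-form Fisher inversion in (Σ, W) would remove the need for this projection from the update loop, along with its eigendecomposition cost.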

References

“The Fisher matrix can be decomposed as a block-diagonal with blocks for $\mu_{t,i}$ and for $(\Upsilon_{t,i}, W_{t,i})$, but (in contrast to the FC Gaussian case in \cref{eq:FC-mom-Fisher}) we have not found an efficient way to analytically invert the latter block, which has size $P + \rank P$.”

Bayesian Online Natural Gradient (BONG) (2405.19681 - Jones et al., 30 May 2024) in Section “Diagonal plus low rank”, Derivations (sec:DLR-deriv)