Natural Gradient for Combined Loss Using Wavelets (2006.15806v1)
Published 29 Jun 2020 in math.NA, cs.LG, cs.NA, and math.OC
Abstract: Natural gradients have been widely used to optimize loss functionals over probability space, with important examples such as Fisher-Rao gradient descent for the Kullback-Leibler divergence, Wasserstein gradient descent for transport-related functionals, and Mahalanobis gradient descent for quadratic loss functionals. This note considers the situation in which the loss is a convex combination of these examples. We propose a new natural gradient algorithm that uses compactly supported wavelets to approximately diagonalize the Hessian of the combined loss. Numerical results are included to demonstrate the efficiency of the proposed algorithm.
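
As a rough illustration of the idea described in the abstract, the combined loss and a wavelet-preconditioned natural gradient step might be written as below. This is only a sketch under stated assumptions, not the paper's formulation: the symbols $E_{\mathrm{KL}}$, $E_{\mathrm{W}}$, $E_{\mathrm{M}}$, the weights $\alpha_i$, the step size $\eta$, the wavelet transform $W$ (taken here to be orthonormal), and the diagonal matrix $D$ are all notation introduced for this example, and constraints such as mass conservation are ignored.

```latex
% Combined loss: a convex combination of the three example functionals
% (notation assumed for illustration; not taken from the paper).
\[
  E(p) \;=\; \alpha_1\, E_{\mathrm{KL}}(p) \;+\; \alpha_2\, E_{\mathrm{W}}(p) \;+\; \alpha_3\, E_{\mathrm{M}}(p),
  \qquad \alpha_i \ge 0, \quad \alpha_1 + \alpha_2 + \alpha_3 = 1 .
\]

% Natural gradient step preconditioned by the Hessian H(p) of the combined loss.
\[
  p_{k+1} \;=\; p_k \;-\; \eta\, H(p_k)^{-1}\, \nabla E(p_k) .
\]

% Approximate diagonalization in an (assumed orthonormal) wavelet basis:
% W is the wavelet transform, D is diagonal.
\[
  H(p_k) \;\approx\; W^{\top} D\, W
  \quad\Longrightarrow\quad
  H(p_k)^{-1}\, \nabla E(p_k) \;\approx\; W^{\top} D^{-1}\, W\, \nabla E(p_k) .
\]
```

Under these assumptions, applying the approximate inverse costs two wavelet transforms and one diagonal scaling per step, which is the practical appeal of a diagonal approximation in a wavelet basis.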