
Small errors in random zeroth-order optimization are imaginary (2103.05478v6)

Published 9 Mar 2021 in math.OC

Abstract: Most zeroth-order optimization algorithms mimic a first-order algorithm but replace the gradient of the objective function with some gradient estimator that can be computed from a small number of function evaluations. This estimator is constructed randomly, and its expectation matches the gradient of a smooth approximation of the objective function whose quality improves as the underlying smoothing parameter $\delta$ is reduced. Gradient estimators requiring a smaller number of function evaluations are preferable from a computational point of view. While estimators based on a single function evaluation can be obtained by use of the divergence theorem from vector calculus, their variance explodes as $\delta$ tends to $0$. Estimators based on multiple function evaluations, on the other hand, suffer from numerical cancellation when $\delta$ tends to $0$. To combat both effects simultaneously, we extend the objective function to the complex domain and construct a gradient estimator that evaluates the objective at a complex point whose coordinates have small imaginary parts of the order $\delta$. As this estimator requires only one function evaluation, it is immune to cancellation. In addition, its variance remains bounded as $\delta$ tends to $0$. We prove that zeroth-order algorithms that use our estimator offer the same theoretical convergence guarantees as the state-of-the-art methods. Numerical experiments suggest, however, that they often converge faster in practice.
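The estimator the abstract describes is the randomized analogue of the classical complex-step derivative, $\mathrm{Im}\,f(x + i\delta)/\delta \approx f'(x)$. Below is a minimal sketch of that idea in NumPy, assuming the objective extends analytically to complex arguments; the test function, step size, and iteration count are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

def complex_step_grad(f, x, delta=1e-12, rng=None):
    """One-evaluation random gradient estimator via the complex step.

    Draws a random unit direction u and returns
        n * Im(f(x + i*delta*u)) / delta * u.
    Since no subtraction of nearby function values occurs, the estimate
    is immune to numerical cancellation, so delta can be taken tiny.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.size)
    u /= np.linalg.norm(u)
    # Single complex evaluation: Im(f(x + i*delta*u))/delta approximates
    # the directional derivative u . grad f(x) with O(delta^2) bias.
    dir_deriv = np.imag(f(x + 1j * delta * u)) / delta
    # E[n u u^T] = I over the unit sphere, so scaling by n makes this an
    # (approximately) unbiased estimator of the full gradient.
    return x.size * dir_deriv * u

# Toy usage: zeroth-order gradient descent on a smooth analytic function.
if __name__ == "__main__":
    f = lambda z: np.sum(z ** 2) + np.exp(z[0])  # extends to complex input
    x = np.array([1.0, -2.0])
    rng = np.random.default_rng(0)
    for _ in range(2000):
        x -= 0.02 * complex_step_grad(f, x, rng=rng)
    print(x)  # approaches the minimizer (roughly [-0.35, 0.0])
```

Because the single evaluation $f(x + i\delta u)$ is never subtracted from another function value, $\delta$ can be driven far below the square root of machine precision, which is exactly where real-valued finite-difference schemes break down.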

Authors (3)
  1. Wouter Jongeneel
  2. Man-Chung Yue
  3. Daniel Kuhn
Citations (7)
