
Convergence of Langevin-Simulated Annealing algorithms with multiplicative noise (2109.11669v2)

Published 23 Sep 2021 in math.PR

Abstract: We study the convergence of Langevin-Simulated Annealing type algorithms with multiplicative noise, i.e. for $V : \mathbb{R}^d \to \mathbb{R}$ a potential function to minimize, we consider the stochastic equation $dY_t = -\sigma\sigma^\top \nabla V(Y_t)\,dt + a(t)\sigma(Y_t)\,dW_t + a(t)^2\Upsilon(Y_t)\,dt$, where $(W_t)$ is a Brownian motion, $\sigma : \mathbb{R}^d \to \mathcal{M}_d(\mathbb{R})$ is an adaptive (multiplicative) noise, $a : \mathbb{R}_+ \to \mathbb{R}_+$ is a function decreasing to $0$, and $\Upsilon$ is a correction term. This setting can be applied to optimization problems arising in Machine Learning. The case where $\sigma$ is a constant matrix has been extensively studied; however, little attention has been paid to the general case. We prove the convergence, in $L^1$-Wasserstein distance, of $Y_t$ and of the associated Euler scheme $\bar{Y}_t$ to some measure $\nu^\star$ supported by $\operatorname{argmin}(V)$, and give rates of convergence to the instantaneous Gibbs measure $\nu_{a(t)}$ of density $\propto \exp(-2V(x)/a(t)^2)$. To do so, we first consider the case where $a$ is a piecewise constant function. We recover the classical schedule $a(t) = A\log^{-1/2}(t)$. We then prove the convergence in the general case by bounding the Wasserstein distance to the piecewise constant case using ergodicity properties.
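For concreteness, the dynamics above can be simulated with an Euler-Maruyama discretization, which is essentially the Euler scheme $\bar{Y}_t$ the abstract refers to. The sketch below is illustrative only, not the paper's code: the callables `grad_V`, `sigma`, and `upsilon` standing in for $\nabla V$, $\sigma$, and $\Upsilon$ are hypothetical user-supplied functions, and the logarithm in the schedule $a(t) = A\log^{-1/2}(t)$ is shifted so that $a$ is well defined and finite near $t = 0$.

```python
import numpy as np

def langevin_simulated_annealing(grad_V, sigma, upsilon, y0, A=1.0,
                                 n_steps=100_000, dt=1e-3, seed=0):
    """Euler-Maruyama sketch of
        dY_t = -sigma sigma^T grad V(Y_t) dt + a(t) sigma(Y_t) dW_t
               + a(t)^2 Upsilon(Y_t) dt,
    with the classical schedule a(t) = A * log(t)^(-1/2).
    grad_V, sigma, upsilon are assumed callables (not from the paper):
    grad_V: R^d -> R^d, sigma: R^d -> d x d matrices, upsilon: R^d -> R^d.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y0, dtype=float).copy()
    d = y.size
    for k in range(n_steps):
        t = (k + 1) * dt
        a = A / np.sqrt(np.log(t + np.e))   # shifted so a(0) = A, decreasing to 0
        S = sigma(y)                        # multiplicative-noise matrix at Y_t
        dW = rng.standard_normal(d) * np.sqrt(dt)
        y = (y
             - S @ S.T @ grad_V(y) * dt     # drift toward argmin(V)
             + a * S @ dW                   # annealed multiplicative noise
             + a**2 * upsilon(y) * dt)      # correction term
    return y

# Toy usage: quadratic potential with constant sigma, so the correction vanishes.
grad_V = lambda y: y                        # V(x) = |x|^2 / 2, argmin(V) = {0}
sigma = lambda y: np.eye(2)
upsilon = lambda y: np.zeros(2)
y_final = langevin_simulated_annealing(grad_V, sigma, upsilon, y0=np.ones(2))
```

With a genuinely state-dependent `sigma`, the correction term can no longer be taken to be zero; it is exactly what makes the annealed dynamics target the instantaneous Gibbs measure $\nu_{a(t)}$ in the multiplicative-noise setting studied in the paper.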
