Papers
Topics
Authors
Recent
Search
2000 character limit reached

Stochastic Bilevel Optimization with Heavy-Tailed Noise

Published 18 Sep 2025 in cs.LG | (2509.14952v1)

Abstract: This paper considers the smooth bilevel optimization in which the lower-level problem is strongly convex and the upper-level problem is possibly nonconvex. We focus on the stochastic setting that the algorithm can access the unbiased stochastic gradient evaluation with heavy-tailed noise, which is prevalent in many machine learning applications such as training LLMs and reinforcement learning. We propose a nested-loop normalized stochastic bilevel approximation (N$2$SBA) for finding an $\epsilon$-stationary point with the stochastic first-order oracle (SFO) complexity of $\tilde{\mathcal{O}}\big(\kappa{\frac{7p-3}{p-1}} \sigma{\frac{p}{p-1}} \epsilon{-\frac{4 p - 2}{p-1}}\big)$, where $\kappa$ is the condition number, $p\in(1,2]$ is the order of central moment for the noise, and $\sigma$ is the noise level. Furthermore, we specialize our idea to solve the nonconvex-strongly-concave minimax optimization problem, achieving an $\epsilon$-stationary point with the SFO complexity of $\tilde{\mathcal O}\big(\kappa{\frac{2p-1}{p-1}} \sigma{\frac{p}{p-1}} \epsilon{-\frac{3p-2}{p-1}}\big)$. All above upper bounds match the best-known results under the special case of the bounded variance setting, i.e., $p=2$.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.