Perturbed Gradient Descent via Convex Quadratic Approximation for Nonconvex Bilevel Optimization (2504.17215v1)
Abstract: Bilevel optimization is a fundamental tool in hierarchical decision-making and has been widely applied to machine learning tasks such as hyperparameter tuning, meta-learning, and continual learning. While significant progress has been made in bilevel optimization, existing methods predominantly focus on the nonconvex-strongly-convex or the nonconvex-PL settings, leaving the more general nonconvex-nonconvex framework underexplored. In this paper, we address this gap by developing an efficient gradient-based method inspired by the recently proposed Relaxed Gradient Flow (RXGF) framework, a continuous-time dynamic. In particular, we introduce a discretized variant of RXGF and formulate convex quadratic programming subproblems that admit closed-form solutions. We provide a rigorous convergence analysis, demonstrating that under the existence of a KKT point and a regularity assumption (a PL condition on the lower-level gradient), our method achieves an iteration complexity of $\mathcal{O}(1/\epsilon^{1.5})$ in terms of the squared norm of the KKT residual for the reformulated problem. Moreover, even in the absence of the regularity assumption, we establish an iteration complexity of $\mathcal{O}(1/\epsilon^{3})$ for the same metric. Through extensive numerical experiments on convex and nonconvex synthetic benchmarks and a data hyper-cleaning task, we illustrate the efficiency and scalability of our approach.
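For context, the class of problems the abstract refers to can be written in the standard bilevel form below. This is a minimal sketch of the generic nonconvex-nonconvex formulation and its usual stationarity-based reformulation; it is assumed from the broader bilevel literature rather than quoted from the paper, and the functions $f$, $g$ and dimensions $d_x$, $d_y$ are placeholder symbols.

```latex
% Generic bilevel optimization problem (standard formulation; assumed for
% illustration, not taken verbatim from the paper).
\begin{equation*}
  \min_{x \in \mathbb{R}^{d_x},\; y \in \mathbb{R}^{d_y}} \; f(x, y)
  \quad \text{s.t.} \quad
  y \in \arg\min_{y' \in \mathbb{R}^{d_y}} g(x, y'),
\end{equation*}
% where f is the upper-level objective and g the lower-level objective; in
% the nonconvex-nonconvex setting neither function is assumed convex.
% A common single-level reformulation replaces the lower-level argmin with
% its first-order stationarity condition,
\begin{equation*}
  \min_{x, y} \; f(x, y)
  \quad \text{s.t.} \quad
  \nabla_y g(x, y) = 0,
\end{equation*}
% and convergence is then measured through the KKT residual of this
% constrained problem, which is the metric the abstract's complexity
% bounds refer to.
```

Under this reading, the $\mathcal{O}(1/\epsilon^{1.5})$ and $\mathcal{O}(1/\epsilon^{3})$ rates bound the number of iterations needed to drive the squared norm of that KKT residual below $\epsilon$, with and without the lower-level gradient PL assumption, respectively.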