
Perturbed Gradient Descent via Convex Quadratic Approximation for Nonconvex Bilevel Optimization (2504.17215v1)

Published 24 Apr 2025 in math.OC

Abstract: Bilevel optimization is a fundamental tool in hierarchical decision-making and has been widely applied to machine learning tasks such as hyperparameter tuning, meta-learning, and continual learning. While significant progress has been made in bilevel optimization, existing methods predominantly focus on the nonconvex-strongly convex or the nonconvex-PL settings, leaving the more general nonconvex-nonconvex framework underexplored. In this paper, we address this gap by developing an efficient gradient-based method inspired by the recently proposed Relaxed Gradient Flow (RXGF) framework with a continuous-time dynamic. In particular, we introduce a discretized variant of RXGF and formulate convex quadratic program subproblems with closed-form solutions. We provide a rigorous convergence analysis, demonstrating that under the existence of a KKT point and a regularity assumption (a lower-level gradient PL assumption), our method achieves an iteration complexity of $\mathcal{O}(1/\epsilon^{1.5})$ in terms of the squared norm of the KKT residual for the reformulated problem. Moreover, even in the absence of the regularity assumption, we establish an iteration complexity of $\mathcal{O}(1/\epsilon^{3})$ for the same metric. Through extensive numerical experiments on convex and nonconvex synthetic benchmarks and a hyper-data cleaning task, we illustrate the efficiency and scalability of our approach.
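
The abstract does not spell out the update rules, but the general flavor of the approach (taking steps given by the closed-form minimizer of a local convex quadratic model of a combined upper-/lower-level objective, with a small perturbation) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' algorithm: the toy bilevel problem, the penalty weight `sigma`, the step sizes, and the noise scale are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy bilevel problem (illustrative only, not from the paper):
#   upper level:  F(x, y) = 0.5 * ||y - a||^2 + 0.5 * ||x||^2
#   lower level:  g(x, y) = 0.5 * ||y - x||^2   (so y*(x) = x)
a = np.array([1.0, -2.0])

def grad_F(x, y):
    return x, y - a            # (dF/dx, dF/dy)

def grad_g(x, y):
    return -(y - x), y - x     # (dg/dx, dg/dy)

def quadratic_step(z, grad, step):
    # Closed-form minimizer of the convex quadratic model
    #   q(d) = <grad, d> + (1 / (2 * step)) * ||d||^2,
    # which is just the (perturbed) gradient-descent step z - step * grad.
    return z - step * grad

x = np.zeros(2)
y = np.zeros(2)
alpha, beta, sigma = 0.1, 0.1, 0.5   # step sizes and penalty weight (assumed)

for k in range(500):
    gFx, gFy = grad_F(x, y)
    ggx, ggy = grad_g(x, y)
    # Combine upper- and lower-level gradients (penalty/value-function style)
    # and add a small random perturbation, in the spirit of perturbed GD.
    x = quadratic_step(x, gFx + sigma * ggx + 1e-3 * np.random.randn(2), alpha)
    y = quadratic_step(y, gFy + sigma * ggy + 1e-3 * np.random.randn(2), beta)

print("x =", x, "y =", y)
```

In this sketch the quadratic subproblem is deliberately the simplest possible one, so its closed-form solution collapses to a gradient step; the paper's method builds more structured convex QP subproblems from the discretized RXGF dynamics and analyzes the KKT residual of the reformulated problem.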
