Improved Complexities for Stochastic Conditional Gradient Methods under Interpolation-like Conditions (2006.08167v2)
Published 15 Jun 2020 in math.OC, cs.LG, and stat.ML
Abstract: We analyze stochastic conditional gradient methods for constrained optimization problems arising in over-parametrized machine learning. We show that one can leverage the interpolation-like conditions satisfied by such models to obtain improved oracle complexities. Specifically, when the objective function is convex, we show that the conditional gradient method requires $\mathcal{O}(\epsilon^{-2})$ calls to the stochastic gradient oracle to find an $\epsilon$-optimal solution. Furthermore, by including a gradient sliding step, we show that the number of calls reduces to $\mathcal{O}(\epsilon^{-1.5})$.
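
To make the basic iteration concrete, below is a minimal sketch of a mini-batch stochastic conditional gradient (Frank-Wolfe) loop. It is not the authors' exact algorithm or analysis setting: the $\ell_1$-ball constraint, the least-squares objective (an illustrative interpolation-style problem), and the classic $2/(t+2)$ step size are assumptions chosen to keep the example self-contained, and the gradient sliding variant from the paper is omitted.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over the l1-ball:
    argmin_{||s||_1 <= radius} <grad, s>."""
    i = np.argmax(np.abs(grad))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def stochastic_frank_wolfe(A, b, radius=1.0, batch_size=32, iters=500, seed=0):
    """Minimize the least-squares loss 0.5 * ||A x - b||^2 / batch over the
    l1-ball, using mini-batch stochastic gradients and step size 2/(t+2)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for t in range(iters):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Mini-batch stochastic gradient of the least-squares loss.
        g = A[idx].T @ (A[idx] @ x - b[idx]) / batch_size
        # Call the linear minimization oracle and take a convex combination,
        # so the iterate stays inside the constraint set without projection.
        s = lmo_l1_ball(g, radius)
        gamma = 2.0 / (t + 2)
        x = (1 - gamma) * x + gamma * s
    return x
```

In an over-parametrized regime where the model interpolates the data, the stochastic gradient noise vanishes near the optimum, which is the structural property the paper exploits to sharpen the oracle-complexity bounds quoted above.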