Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
125 tokens/sec
GPT-4o
10 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
3 tokens/sec
DeepSeek R1 via Azure Pro
51 tokens/sec
2000 character limit reached

Implicit Interpretation of Importance Weight Aware Updates (2307.11955v1)

Published 22 Jul 2023 in cs.LG

Abstract: Due to its speed and simplicity, subgradient descent is one of the most used optimization algorithms in convex machine learning algorithms. However, tuning its learning rate is probably its most severe bottleneck to achieve consistent good performance. A common way to reduce the dependency on the learning rate is to use implicit/proximal updates. One such variant is the Importance Weight Aware (IWA) updates, which consist of infinitely many infinitesimal updates on each loss function. However, IWA updates' empirical success is not completely explained by their theory. In this paper, we show for the first time that IWA updates have a strictly better regret upper bound than plain gradient updates in the online learning setting. Our analysis is based on the new framework, generalized implicit Follow-the-Regularized-Leader (FTRL) (Chen and Orabona, 2023), to analyze generalized implicit updates using a dual formulation. In particular, our results imply that IWA updates can be considered as approximate implicit/proximal updates.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.