
Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach (2307.08360v3)

Published 17 Jul 2023 in cs.LG, math.OC, and stat.ML

Abstract: In this paper, we propose an online convex optimization approach with two different levels of adaptivity. On a higher level, our approach is agnostic to the unknown types and curvatures of the online functions, while at a lower level, it can exploit the unknown niceness of the environments and attain problem-dependent guarantees. Specifically, we obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$ and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave and convex loss functions, respectively, where $d$ is the dimension, $V_T$ denotes problem-dependent gradient variations and the $\hat{\mathcal{O}}(\cdot)$-notation omits $\log V_T$ factors. Our result not only safeguards the worst-case guarantees but also directly implies the small-loss bounds in analysis. Moreover, when applied to adversarial/stochastic convex optimization and game theory problems, our result enhances the existing universal guarantees. Our approach is based on a multi-layer online ensemble framework incorporating novel ingredients, including a carefully designed optimism for unifying diverse function types and cascaded corrections for algorithmic stability. Notably, despite its multi-layer structure, our algorithm necessitates only one gradient query per round, making it favorable when the gradient evaluation is time-consuming. This is facilitated by a novel regret decomposition equipped with carefully designed surrogate losses.
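The abstract describes an optimistic, multi-layer online ensemble that queries only one gradient per round and feeds the base learners through linearized surrogate losses. The sketch below is a minimal illustration of those two ingredients only, not the paper's algorithm: it uses optimistic online gradient descent with the previous round's gradient as the optimism hint for the base learners and a single Hedge meta-layer over a grid of step sizes, omitting the cascaded corrections and the curvature-adaptive layers. All class names, step sizes, and the toy quadratic loss are illustrative assumptions.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

class OptimisticOGD:
    """Optimistic online gradient descent with last-gradient optimism;
    its regret scales with the variation of consecutive gradients."""
    def __init__(self, dim, lr):
        self.lr = lr
        self.x_aux = np.zeros(dim)      # auxiliary iterate (pre-optimism)
        self.last_grad = np.zeros(dim)  # optimism hint: previous gradient

    def predict(self):
        # play the optimistic step: descend along the previous gradient
        return project_ball(self.x_aux - self.lr * self.last_grad)

    def update(self, grad):
        # update the auxiliary iterate with the true gradient, keep it as the hint
        self.x_aux = project_ball(self.x_aux - self.lr * grad)
        self.last_grad = grad

class TwoLayerEnsemble:
    """Hedge meta-learner over base learners with a geometric grid of step sizes.
    One gradient is queried per round, at the combined decision; each base
    learner only sees the linearized surrogate loss <grad, x_i>."""
    def __init__(self, dim, n_base=5, meta_lr=0.5):
        lrs = [0.5 ** k for k in range(n_base)]
        self.base = [OptimisticOGD(dim, lr) for lr in lrs]
        self.log_w = np.zeros(n_base)
        self.meta_lr = meta_lr

    def predict(self):
        w = np.exp(self.log_w - self.log_w.max())
        self.w = w / w.sum()
        self.preds = np.stack([b.predict() for b in self.base])
        return self.w @ self.preds

    def update(self, grad):
        # surrogate losses: linearized loss of each base learner's prediction
        surrogate = self.preds @ grad
        self.log_w -= self.meta_lr * surrogate
        for b in self.base:
            b.update(grad)

# toy usage: online quadratic losses with a fixed target
rng = np.random.default_rng(0)
dim, T = 5, 200
learner = TwoLayerEnsemble(dim)
target = project_ball(rng.normal(size=dim))
total_loss = 0.0
for t in range(T):
    x = learner.predict()
    grad = x - target                      # gradient of 0.5 * ||x - target||^2
    total_loss += 0.5 * np.sum((x - target) ** 2)
    learner.update(grad)                   # single gradient query per round
print(f"average loss after {T} rounds: {total_loss / T:.4f}")
```

A faithful implementation would replace the toy quadratic with the actual online losses and add the correction terms and additional layers that the paper uses to keep the ensemble stable while preserving the gradient-variation bounds.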

Authors (3)
  1. Yu-Hu Yan (3 papers)
  2. Peng Zhao (162 papers)
  3. Zhi-Hua Zhou (126 papers)
Citations (7)

