
Risk-averse Learning with Non-Stationary Distributions (2404.02988v1)

Published 3 Apr 2024 in eess.SY, cs.LG, and cs.SY

Abstract: Accounting for non-stationary environments in online optimization enables a decision-maker to adapt effectively to changes and improve its performance over time. In such cases, it is favorable to adopt a strategy that minimizes the negative impact of change to avoid potentially risky situations. In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time. We minimize a risk-averse objective function using the Conditional Value at Risk (CVaR) as the risk measure. Because the exact CVaR gradient is difficult to obtain, we employ a zeroth-order optimization approach that queries the cost function values multiple times at each iteration and estimates the CVaR gradient from the sampled values. To facilitate the regret analysis, we use a variation metric based on the Wasserstein distance to capture time-varying distributions. Provided that the distribution variation is sub-linear in the total number of episodes, we show that our learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions. Moreover, the theoretical results suggest that increasing the number of samples reduces the dynamic regret bounds until the number of samples reaches a specific limit. Finally, we provide numerical experiments on dynamic pricing in a parking lot to illustrate the efficacy of the algorithm.
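The abstract describes estimating a CVaR gradient from sampled cost values alone. The sketch below illustrates that general idea and is not the paper's algorithm: it swaps in a standard two-point random-direction estimator, and it assumes CVaR at level alpha means the average of the worst alpha fraction of costs. The names `empirical_cvar`, `zeroth_order_cvar_gradient`, and `sample_cost`, as well as the quadratic cost with a drifting minimizer, are hypothetical choices made for illustration.

```python
import math
import random


def empirical_cvar(costs, alpha):
    """Average of the worst ceil(alpha * n) sampled costs (larger = worse).

    Assumption: this tail-average convention stands in for CVaR at level alpha.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    k = math.ceil(alpha * len(costs))
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere via normalized Gaussians."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def zeroth_order_cvar_gradient(sample_cost, x, alpha, delta, n_samples, rng):
    """Two-point zeroth-order estimate of the CVaR gradient at x.

    sample_cost(x, rng) is a hypothetical oracle returning one random cost.
    Only cost values are queried; no derivative of the cost is used.
    """
    dim = len(x)
    u = random_unit_vector(dim, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    c_plus = empirical_cvar(
        [sample_cost(x_plus, rng) for _ in range(n_samples)], alpha)
    c_minus = empirical_cvar(
        [sample_cost(x_minus, rng) for _ in range(n_samples)], alpha)
    scale = dim * (c_plus - c_minus) / (2.0 * delta)
    return [scale * ui for ui in u]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [2.0, -1.0]
    for t in range(200):
        target = 0.005 * t  # hypothetical slow drift of the cost distribution

        def sample_cost(point, r, target=target):
            # Noisy quadratic cost around the drifting target (illustrative only).
            return sum((p - target) ** 2 for p in point) + r.gauss(0.0, 0.1)

        grad = zeroth_order_cvar_gradient(
            sample_cost, x, alpha=0.2, delta=0.1, n_samples=50, rng=rng)
        x = [xi - 0.05 * gi for xi, gi in zip(x, grad)]
    print("final iterate:", x)
```

The loop at the bottom runs plain gradient descent with the estimated gradient while the minimizer of the hypothetical cost drifts slowly, which mimics the non-stationary setting in spirit only. The step size, smoothing radius `delta`, and sample count are arbitrary and are not taken from the paper.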
