
Robust Lipschitz Bandits to Adversarial Corruptions (2305.18543v2)

Published 29 May 2023 in cs.LG and stat.ML

Abstract: The Lipschitz bandit is a variant of stochastic bandits that deals with a continuous arm set defined on a metric space, where the reward function is subject to a Lipschitz constraint. In this paper, we introduce a new problem of Lipschitz bandits in the presence of adversarial corruptions, where an adaptive adversary corrupts the stochastic rewards up to a total budget $C$, measured as the sum of corruption levels across the time horizon $T$. We consider both weak and strong adversaries: the weak adversary is unaware of the current action before the attack, while the strong one can observe it. Our work presents the first line of robust Lipschitz bandit algorithms that achieve sub-linear regret under both types of adversary, even when the total corruption budget $C$ is not revealed to the agent. We provide a lower bound under each type of adversary and show that our algorithm is optimal under the strong one. Finally, we conduct experiments to illustrate the effectiveness of our algorithms against two classic kinds of attacks.
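The corruption model described in the abstract can be made concrete with a small sketch. Below is a minimal, illustrative Python environment (not the paper's algorithm) for a 1-D Lipschitz bandit on $[0,1]$ whose stochastic rewards an adversary corrupts subject to a total budget $C$. The mean-reward function, the per-round corruption cap, and the specific attack rules are hypothetical choices for the demo; the weak/strong distinction shows up only in whether the attack may depend on the pulled arm $x$.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_reward(x: float) -> float:
    """A 1-Lipschitz mean-reward function on [0, 1] (hypothetical choice)."""
    return 0.9 - abs(x - 0.3)

def pull(x: float, budget_left: float, strong: bool,
         per_round_cap: float = 0.5):
    """Pull arm x; return (corrupted reward, remaining corruption budget)."""
    reward = mean_reward(x) + rng.normal(scale=0.1)  # stochastic reward
    c = min(per_round_cap, budget_left)              # corruption level this round
    if strong:
        # Strong adversary observes x: e.g. push down near-optimal arms only.
        attack = -c if abs(x - 0.3) < 0.1 else 0.0
    else:
        # Weak adversary commits before seeing x: e.g. a fixed downward shift.
        attack = -c
    return reward + attack, budget_left - abs(attack)

T, C = 1000, 30.0          # horizon and total corruption budget
budget = C
for t in range(T):
    x = rng.uniform()       # placeholder policy; the paper designs robust ones
    r, budget = pull(x, budget, strong=True)
```

The sum of $|{\rm attack}|$ over all rounds never exceeds $C$, matching how the budget is measured in the abstract; a robust algorithm must keep its regret sub-linear in $T$ despite these perturbations.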
