Order-Optimal Regret in Distributed Kernel Bandits using Uniform Sampling with Shared Randomness (2402.13182v1)
Abstract: We consider distributed kernel bandits where $N$ agents aim to collaboratively maximize an unknown reward function that lies in a reproducing kernel Hilbert space. Each agent sequentially queries the function to obtain noisy observations at the query points. Agents can share information through a central server, with the objective of minimizing regret accumulated over time $T$ and aggregated across agents. We develop the first algorithm that achieves the optimal regret order (as defined by centralized learning) with a communication cost that is sublinear in both $N$ and $T$. The key features of the proposed algorithm are uniform exploration at the local agents and shared randomness between the agents and the central server. Together with a sparse approximation of the GP model, these two components make it possible to preserve the learning rate of the centralized setting while the communication rate diminishes.
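The shared-randomness idea in the abstract admits a compact illustration: if every agent and the server seed a common pseudo-random generator, the uniformly sampled query points can be regenerated on the server side, so only the noisy observations ever need to cross the channel. The sketch below is a minimal, hypothetical rendering of that mechanism only; the function and variable names (`uniform_batch`, the toy `reward`, the seeds and constants) are illustrative assumptions, and the paper's full protocol additionally involves the sparse GP approximation, which is not shown.

```python
import numpy as np

def uniform_batch(seed: int, n: int, dim: int) -> np.ndarray:
    """Uniform query points on [0, 1]^dim from an explicitly seeded generator.

    Any party holding `seed` reproduces exactly the same batch, so the
    query points themselves never need to be transmitted.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=(n, dim))

# Toy stand-in for the unknown RKHS reward function (an assumption for this demo).
def reward(x: np.ndarray) -> np.ndarray:
    return np.exp(-8.0 * np.sum((x - 0.3) ** 2, axis=1))

shared_seed = 2024            # agreed upon with the server ahead of time
n, dim, noise_sd = 8, 2, 0.1

# Agent side: generate queries locally, observe noisy rewards,
# and send only the scalar observations y to the server.
X_agent = uniform_batch(shared_seed, n, dim)
y = reward(X_agent) + np.random.default_rng(7).normal(0.0, noise_sd, n)

# Server side: reconstruct the identical query points from the shared seed alone.
X_server = uniform_batch(shared_seed, n, dim)
assert np.allclose(X_agent, X_server)   # zero bits spent communicating X
```

Under this sketch, the per-round communication from each agent is the observation vector rather than (point, observation) pairs, which is what lets uniform exploration pair with shared randomness to keep the communication cost sublinear.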