Quantum Bayesian Optimization (2310.05373v1)

Published 9 Oct 2023 in cs.LG and cs.AI

Abstract: Kernelized bandits, also known as Bayesian optimization (BO), are a prevalent method for optimizing complicated black-box reward functions. Various BO algorithms have been theoretically shown to enjoy upper bounds on their cumulative regret that are sub-linear in the number T of iterations, and a regret lower bound of Omega(sqrt(T)) has been derived, representing the unavoidable regret of any classical BO algorithm. Recent works on quantum bandits have shown that, with the aid of quantum computing, it is possible to achieve regret upper bounds tighter than the corresponding classical lower bounds. However, these works are restricted to either multi-armed or linear bandits and hence cannot solve sophisticated real-world problems with non-linear reward functions. To this end, we introduce the quantum-Gaussian process-upper confidence bound (Q-GP-UCB) algorithm. To the best of our knowledge, Q-GP-UCB is the first BO algorithm able to achieve a regret upper bound of O(polylog T), which is significantly smaller than the regret lower bound of Omega(sqrt(T)) in the classical setting. Moreover, thanks to our novel analysis of the confidence ellipsoid, our Q-GP-UCB with the linear kernel achieves a smaller regret than the quantum linear UCB algorithm from previous work. We use simulations, as well as an experiment on a real quantum computer, to verify that the theoretical quantum speedup achieved by our Q-GP-UCB is also potentially relevant in practice.
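The abstract describes Q-GP-UCB only at a high level. For orientation, below is a minimal sketch of the classical GP-UCB loop that Q-GP-UCB builds on, written in plain NumPy. The quantum variant differs by replacing repeated noisy reward observations with quantum amplitude estimation to obtain tighter confidence bounds, which is indicated here only as a comment. All function names, the RBF kernel choice, and the parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 * l^2)).
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-d2 / (2.0 * lengthscale**2))

def gp_posterior(X, y, Xq, noise=0.1):
    # GP posterior mean and standard deviation at query points Xq,
    # given noisy observations y at inputs X.
    K = rbf_kernel(X, X) + noise**2 * np.eye(len(X))
    Ks = rbf_kernel(X, Xq)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.diag(rbf_kernel(Xq, Xq)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def gp_ucb(f, domain, T=30, beta=2.0, noise=0.1, seed=0):
    # Classical GP-UCB: at each round, query the point maximizing
    # mu(x) + sqrt(beta) * sigma(x). A quantum version would instead
    # estimate each reward via quantum amplitude estimation, shrinking
    # the effective observation noise and hence the regret.
    rng = np.random.default_rng(seed)
    X = domain[rng.integers(len(domain))][None, :]          # random first query
    y = np.array([f(X[0]) + noise * rng.standard_normal()])
    for _ in range(T - 1):
        mu, sigma = gp_posterior(X, y, domain, noise)
        x_next = domain[np.argmax(mu + np.sqrt(beta) * sigma)]
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next) + noise * rng.standard_normal())
    return X, y

# Usage: maximize a 1-D black-box function over a finite grid.
grid = np.linspace(-2.0, 2.0, 200)[:, None]
X, y = gp_ucb(lambda x: -np.sin(3.0 * x[0]) - x[0]**2, grid)
print("best query:", X[np.argmax(y)], "best observed value:", y.max())
```

This sketch captures only the classical baseline; per the abstract, it is the paper's new confidence-ellipsoid analysis that converts the variance reduction from amplitude estimation into the O(polylog T) regret bound.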
