
On the Optimal Regret of Locally Private Linear Contextual Bandit (2404.09413v1)

Published 15 Apr 2024 in stat.ML, cs.CR, and cs.LG

Abstract: Contextual bandit with linear reward functions is one of the most extensively studied models in bandit and online learning research. Recently, there has been increasing interest in designing locally private linear contextual bandit algorithms, where sensitive information contained in contexts and rewards is protected against leakage to the general public. While the classical linear contextual bandit algorithm admits cumulative regret upper bounds of $\tilde O(\sqrt{T})$ via multiple alternative methods, it has remained open whether such regret bounds are attainable in the presence of local privacy constraints, with the state-of-the-art result being $\tilde O(T^{3/4})$. In this paper, we show that it is indeed possible to achieve an $\tilde O(\sqrt{T})$ regret upper bound for locally private linear contextual bandits. Our solution relies on several new algorithmic and analytical ideas, such as the analysis of mean absolute deviation errors and layered principal component regression in order to achieve small mean absolute deviation errors.
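
For readers unfamiliar with the two quantities the abstract compares, the standard textbook definitions can be sketched as follows. The notation here is an assumption made for illustration and is not taken from the paper: $x_{t,a}$ is the context of arm $a$ at round $t$, $a_t$ is the arm chosen, $\theta^*$ is the unknown parameter, and $Q$ is a randomized mechanism applied to each user's data before it leaves the user. The paper may use a different (for example, approximate) privacy notion.

```latex
% Cumulative regret over T rounds (assumed notation, standard definition):
R_T = \sum_{t=1}^{T} \Bigl( \max_{a \in \mathcal{A}_t} \langle x_{t,a}, \theta^* \rangle
      - \langle x_{t,a_t}, \theta^* \rangle \Bigr)

% Pure \varepsilon-local differential privacy for a mechanism Q (standard definition):
\Pr[\, Q(z) \in S \,] \le e^{\varepsilon} \, \Pr[\, Q(z') \in S \,]
\quad \text{for all inputs } z, z' \text{ and all measurable sets } S
```

To give a sense of the gap between the two rates, ignoring constants and logarithmic factors: at $T = 10^4$, $T^{3/4} = 1000$ while $\sqrt{T} = 100$.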

Authors (3)
  1. Jiachun Li
  2. David Simchi-Levi
  3. Yining Wang