
Differentially Private Linear Bandits with Partial Distributed Feedback (2207.05827v2)

Published 12 Jul 2022 in cs.LG, cs.CR, cs.NA, and math.NA

Abstract: In this paper, we study the problem of global reward maximization with only partial distributed feedback. This problem is motivated by several real-world applications (e.g., cellular network configuration, dynamic pricing, and policy selection) where an action taken by a central entity influences a large population that contributes to the global reward. However, collecting such reward feedback from the entire population not only incurs a prohibitively high cost but often leads to privacy concerns. To tackle this problem, we consider differentially private distributed linear bandits, where only a subset of users from the population are selected (called clients) to participate in the learning process and the central server learns the global model from such partial feedback by iteratively aggregating these clients' local feedback in a differentially private fashion. We then propose a unified algorithmic learning framework, called differentially private distributed phased elimination (DP-DPE), which can be naturally integrated with popular differential privacy (DP) models (including central DP, local DP, and shuffle DP). Furthermore, we prove that DP-DPE achieves both sublinear regret and sublinear communication cost. Interestingly, DP-DPE also achieves privacy protection "for free" in the sense that the additional cost due to privacy guarantees is a lower-order additive term. In addition, as a by-product of our techniques, the same results of "free" privacy can also be achieved for the standard differentially private linear bandits. Finally, we conduct simulations to corroborate our theoretical results and demonstrate the effectiveness of DP-DPE.
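The abstract describes a server that aggregates a subset of clients' local feedback privately and then eliminates actions phase by phase. The sketch below illustrates one such phase under the central DP model only. It is not the authors' DP-DPE algorithm: the function names (`gaussian_sigma`, `clip`, `private_mean`, `eliminate`), the norm clipping, the classical Gaussian mechanism, and the `2 * width` elimination rule are all assumptions chosen for illustration, and each client is assumed to send one local estimate vector.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (valid for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip(vec, clip_norm):
    """Scale a vector down so its Euclidean norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def private_mean(vectors, clip_norm, epsilon, delta, rng):
    """Assumed central-DP aggregation: clip, average, add Gaussian noise."""
    clipped = [clip(v, clip_norm) for v in vectors]
    n, d = len(clipped), len(clipped[0])
    # Replacing one client's vector moves the mean by at most 2 * clip_norm / n.
    sigma = gaussian_sigma(2.0 * clip_norm / n, epsilon, delta)
    return [sum(v[j] for v in clipped) / n + rng.gauss(0.0, sigma) for j in range(d)]


def eliminate(arms, active, theta_hat, width):
    """Keep active arms whose estimated reward is within 2 * width of the best."""
    est = {a: sum(x * t for x, t in zip(arms[a], theta_hat)) for a in active}
    best = max(est.values())
    return [a for a in active if est[a] >= best - 2.0 * width]


if __name__ == "__main__":
    rng = random.Random(0)
    arms = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    # Hypothetical local estimates reported by 200 sampled clients.
    reports = [[0.8 + rng.gauss(0, 0.1), 0.1 + rng.gauss(0, 0.1)] for _ in range(200)]
    theta_hat = private_mean(reports, clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
    print(eliminate(arms, [0, 1, 2], theta_hat, width=0.1))
```

In this sketch the noise scale shrinks as `1 / n` with the number of participating clients, which gives some intuition for why the privacy cost can become a lower-order term when enough clients take part in each phase.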
