Differentially Private Linear Bandits with Partial Distributed Feedback (2207.05827v2)
Abstract: In this paper, we study the problem of global reward maximization with only partial distributed feedback. This problem is motivated by several real-world applications (e.g., cellular network configuration, dynamic pricing, and policy selection) where an action taken by a central entity influences a large population that contributes to the global reward. However, collecting such reward feedback from the entire population not only incurs a prohibitively high cost but often leads to privacy concerns. To tackle this problem, we consider differentially private distributed linear bandits, where only a subset of users from the population (called clients) is selected to participate in the learning process, and the central server learns the global model from such partial feedback by iteratively aggregating these clients' local feedback in a differentially private fashion. We then propose a unified algorithmic learning framework, called differentially private distributed phased elimination (DP-DPE), which can be naturally integrated with popular differential privacy (DP) models (including central DP, local DP, and shuffle DP). Furthermore, we prove that DP-DPE achieves both sublinear regret and sublinear communication cost. Interestingly, DP-DPE also achieves privacy protection "for free" in the sense that the additional cost due to privacy guarantees is a lower-order additive term. In addition, as a by-product of our techniques, the same "free" privacy results can also be achieved for standard differentially private linear bandits. Finally, we conduct simulations to corroborate our theoretical results and demonstrate the effectiveness of DP-DPE.
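The private aggregation step mentioned in the abstract can be illustrated with a minimal sketch in the central DP model. This is not the paper's DP-DPE algorithm: it only shows a server averaging scalar client feedback and adding Gaussian noise via the classical Gaussian mechanism (valid for 0 < epsilon < 1). The function names `gaussian_sigma` and `private_mean`, the clipping bound `bound`, and the scalar feedback model are all assumptions made for illustration.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (requires 0 < epsilon < 1)."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("classical Gaussian mechanism needs 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


def private_mean(values, bound, epsilon, delta, rng):
    """Hypothetical server-side step: clip, average, then add Gaussian noise."""
    n = len(values)
    # Clip each client's feedback to [-bound, bound] (assumed known range).
    clipped = [max(-bound, min(bound, v)) for v in values]
    # Replacing one client's value moves the mean by at most 2 * bound / n.
    sensitivity = 2 * bound / n
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return sum(clipped) / n + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated local feedback from 1000 hypothetical clients.
    feedback = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    print(private_mean(feedback, bound=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

In this sketch the noise standard deviation shrinks in proportion to 1/n as the number of participating clients n grows. That gives a rough intuition, not a proof, for why the cost of privacy can end up as a lower-order term.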