Thompson Sampling under Bernoulli Rewards with Local Differential Privacy (2307.00863v1)

Published 3 Jul 2023 in cs.LG and cs.CR

Abstract: This paper investigates regret minimization for multi-armed bandit (MAB) problems under a local differential privacy (LDP) guarantee. Given a fixed privacy budget $\epsilon$, we consider three privatizing mechanisms in the Bernoulli setting: linear, quadratic, and exponential. Under each mechanism, we derive a stochastic regret bound for the Thompson Sampling algorithm. Finally, we run simulations to illustrate the convergence of the different mechanisms under different privacy budgets.
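
To make the setting in the abstract concrete, the following is a minimal sketch of Beta-Bernoulli Thompson Sampling in which each Bernoulli reward is privatized locally before the learner sees it. The abstract does not define the linear, quadratic, or exponential mechanisms, so this sketch substitutes standard randomized response (report the true bit with probability $e^{\epsilon}/(1+e^{\epsilon})$, otherwise flip it), which satisfies $\epsilon$-LDP for a single bit. The function names `randomized_response` and `ldp_thompson_sampling`, the uniform Beta(1, 1) prior, and all parameters are assumptions made for illustration, not the paper's construction.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Assumed stand-in mechanism: keep the bit with probability
    e^eps / (1 + e^eps), otherwise flip it (epsilon-LDP for one bit)."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep_prob else 1 - bit


def ldp_thompson_sampling(true_means, epsilon, horizon, seed=0):
    """Run Thompson Sampling on privatized Bernoulli rewards.

    The learner only ever observes the privatized bit. Returns the
    number of pulls of each arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # privatized 1s observed per arm
    failures = [0] * n_arms   # privatized 0s observed per arm
    pulls = [0] * n_arms

    for _ in range(horizon):
        # Sample a plausible mean for each arm from its Beta posterior.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=lambda a: samples[a])

        # The true reward is drawn on the user side and privatized there.
        reward = 1 if rng.random() < true_means[arm] else 0
        private_reward = randomized_response(reward, epsilon, rng)

        if private_reward == 1:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls
```

Under randomized response with $\epsilon > 0$, the mean of the privatized bit is an increasing affine function of the true mean, so the arm with the highest true mean also has the highest privatized mean. That is why this sketch can update the posterior on the privatized bits directly. A smaller $\epsilon$ compresses the gaps between arms, which would be expected to slow convergence.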

