FedDMF: Privacy-Preserving User Attribute Prediction using Deep Matrix Factorization (2312.15420v1)
Abstract: User attribute prediction is a crucial task in various industries. However, sharing user data across organizations is constrained by privacy concerns and by legal requirements regarding personally identifiable information. Regulations such as the General Data Protection Regulation (GDPR) in the European Union and the Personal Information Protection Law of the People's Republic of China restrict data sharing. To use features from multiple clients while complying with these requirements, federated learning algorithms have been proposed. These algorithms aim to predict user attributes without directly sharing the data. However, existing approaches typically rely on matching users across companies, which can let dishonest partners discover user lists or prevent the use of all available features. In this paper, we propose a novel algorithm for predicting user attributes without requiring user matching. Our approach trains deep matrix factorization models on different clients and shares only the item vectors, so user attributes can be predicted without the user vectors ever being shared. We evaluate the algorithm on the publicly available MovieLens dataset and demonstrate that it achieves performance similar to the FedAvg algorithm, reaching 96% of a single model's accuracy. The proposed algorithm is particularly well suited to improving customer targeting and enhancing the overall customer experience. It addresses a pressing privacy concern in user attribute prediction: the need to match users across organizations.
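The idea of sharing only item vectors can be illustrated with a minimal sketch. It is not the paper's implementation: plain (shallow) matrix factorization stands in for the deep model, the item vectors are combined by simple averaging, and the data are random synthetic ratings. The names `local_update`, `federated_round`, and `observed_rmse`, as well as all hyperparameters, are hypothetical and chosen only for illustration.

```python
import numpy as np

def local_update(R, U, V, lr=0.005, reg=0.01, epochs=5):
    """A few gradient steps on one client's rating matrix R.

    Assumption: entries equal to 0 are treated as unobserved.
    U holds this client's user vectors, V the shared item vectors.
    """
    mask = R > 0
    for _ in range(epochs):
        err = mask * (R - U @ V.T)      # error on observed entries only
        grad_U = err @ V - reg * U
        grad_V = err.T @ U - reg * V
        U = U + lr * grad_U
        V = V + lr * grad_V
    return U, V

def federated_round(clients, V_global):
    """Each client trains locally; only item vectors are sent back."""
    item_updates = []
    for client in clients:
        U_new, V_new = local_update(client["R"], client["U"], V_global.copy())
        client["U"] = U_new             # user vectors stay on the client
        item_updates.append(V_new)      # item vectors are the only shared state
    # Assumption: plain averaging as the aggregation rule.
    return np.mean(item_updates, axis=0)

def observed_rmse(clients, V):
    """RMSE over the observed ratings of all clients."""
    sq_err, count = 0.0, 0
    for client in clients:
        mask = client["R"] > 0
        diff = mask * (client["R"] - client["U"] @ V.T)
        sq_err += float((diff ** 2).sum())
        count += int(mask.sum())
    return (sq_err / count) ** 0.5

# Synthetic setup: two clients with different users but the same item catalogue.
rng = np.random.default_rng(0)
n_items, k = 15, 4
clients = []
for n_users in (20, 25):
    ratings = rng.integers(1, 6, size=(n_users, n_items))
    observed = rng.random((n_users, n_items)) < 0.5
    clients.append({
        "R": ratings * observed,
        "U": 0.1 * rng.standard_normal((n_users, k)),
    })
V_global = 0.1 * rng.standard_normal((n_items, k))

rmse_before = observed_rmse(clients, V_global)
for _ in range(100):
    V_global = federated_round(clients, V_global)
rmse_after = observed_rmse(clients, V_global)
print(f"RMSE before: {rmse_before:.3f}, after: {rmse_after:.3f}")
```

In this sketch no user identifiers or user vectors leave a client, so no user matching across clients is needed. Each client could then train its own attribute classifier on its local user vectors; that step is omitted here.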