Privacy-Preserving Distributed Nonnegative Matrix Factorization (2403.18326v1)
Abstract: Nonnegative matrix factorization (NMF) is an effective data representation tool with numerous applications in signal processing and machine learning. However, deploying NMF in a decentralized manner over ad-hoc networks raises privacy concerns, because the conventional approach shares raw data among network agents. To address this, we propose a privacy-preserving algorithm for fully distributed NMF that decomposes a large data matrix distributed across agents into left and right matrix factors while safeguarding the privacy of each agent's local data. The algorithm lets agents collaboratively estimate the left matrix factor and lets each agent estimate its own right factor without exposing raw data. To ensure data privacy, we secure the information exchanged between neighboring agents using the Paillier cryptosystem, a probabilistic asymmetric public-key scheme that allows computations on encrypted data without decryption. Simulation results on synthetic and real-world datasets demonstrate the effectiveness of the proposed algorithm in achieving privacy-preserving distributed NMF over ad-hoc networks.
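The property of the Paillier cryptosystem that the abstract relies on, computing on encrypted data without decrypting it, can be illustrated with a minimal sketch. This is a textbook toy version and not the paper's protocol: the helper names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `scale_encrypted`) and the tiny primes are assumptions chosen for illustration, and the key size is far too small for real use. Paillier operates on integers, so real-valued matrix entries would first need to be quantized, which this sketch leaves out.

```python
import math
import secrets

def keygen(p, q):
    """Build a Paillier key pair from two distinct primes (toy sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # common simple choice of generator
    mu = pow(lam, -1, n)      # with g = n + 1, L(g^lam mod n^2) = lam mod n
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover m using L(x) = (x - 1) // n."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Ciphertext product decrypts to the sum of the plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def scale_encrypted(pub, c, k):
    """Ciphertext raised to k decrypts to k times the plaintext (mod n)."""
    n, _ = pub
    return pow(c, k, n * n)

# Hypothetical toy parameters: two small primes, two small messages.
pub, priv = keygen(1009, 1013)
c_a, c_b = encrypt(pub, 12), encrypt(pub, 30)
total = decrypt(pub, priv, add_encrypted(pub, c_a, c_b))
scaled = decrypt(pub, priv, scale_encrypted(pub, c_a, 3))
print(total, scaled)
```

In this sketch, only the holder of the private key can read `total` and `scaled`; whoever combines the ciphertexts sees neither the inputs nor the results in the clear. Because `encrypt` draws fresh randomness on every call, encrypting the same value twice generally yields different ciphertexts, which is what "probabilistic" refers to.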