Federated Classification in Hyperbolic Spaces via Secure Aggregation of Convex Hulls (2308.06895v2)
Abstract: Hierarchical and tree-like data sets arise in many applications, including language processing, graph data mining, phylogeny, and genomics. It is known that tree-like data cannot be embedded into Euclidean spaces of finite dimension with small distortion. This problem can be mitigated through the use of hyperbolic spaces. When such data also has to be processed in a distributed and privatized setting, new federated learning methods tailored to hyperbolic spaces become necessary. As an initial step towards the development of the field of federated learning in hyperbolic spaces, we propose the first known approach to federated classification in hyperbolic spaces. Our contributions are as follows. First, we develop distributed versions of convex SVM classifiers for Poincaré discs. In this setting, the information conveyed from clients to the global classifier consists of convex hulls of clusters present in individual client data. Second, to avoid label switching issues, we introduce a number-theoretic approach for label recovery based on the so-called integer $B_h$ sequences. Third, we compute the complexity of the convex hulls in hyperbolic spaces to assess the extent of data leakage; at the same time, to limit the communication cost of the hulls, we propose a new quantization method for the Poincaré disc coupled with Reed-Solomon-like encoding. Fourth, at the server level, we introduce a new approach for aggregating convex hulls of the clients based on balanced graph partitioning. We test our method on a collection of diverse data sets, including hierarchical single-cell RNA-seq data from different patients distributed across different repositories that have stringent privacy constraints. The classification accuracy of our method is up to $\sim 11\%$ better than its Euclidean counterpart, demonstrating the importance of privacy-preserving learning in hyperbolic spaces.
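The abstract does not spell out how $B_h$ sequences are used for label recovery, so the following is only a minimal sketch of the defining property such a scheme could rely on: in a $B_h$ sequence, every sum of $h$ elements (repetition allowed) is distinct, so the summed value alone identifies which elements were added. The function names, the greedy construction, and the brute-force decoder are illustrative assumptions and are not taken from the paper.

```python
from itertools import combinations_with_replacement


def is_bh_sequence(seq, h):
    """Return True if all sums of h elements of seq (repetition allowed) are distinct."""
    seen = set()
    for combo in combinations_with_replacement(sorted(seq), h):
        total = sum(combo)
        if total in seen:
            return False
        seen.add(total)
    return True


def greedy_bh_sequence(n, h):
    """Greedily build n positive integers forming a B_h sequence.

    Illustrative only: the greedy choice is simple but not the densest construction.
    """
    seq = []
    candidate = 1
    while len(seq) < n:
        if is_bh_sequence(seq + [candidate], h):
            seq.append(candidate)
        candidate += 1
    return seq


def decode_sum(total, seq, h):
    """Recover the unique multiset of h elements of seq summing to total, or None."""
    for combo in combinations_with_replacement(sorted(seq), h):
        if sum(combo) == total:
            return combo
    return None


# Hypothetical usage: four integer labels, two of which are summed.
labels = greedy_bh_sequence(4, 2)       # a B_2 (Sidon) sequence
aggregate = labels[1] + labels[3]       # only the sum is revealed
recovered = decode_sum(aggregate, labels, 2)
print(labels, aggregate, recovered)
```

In this sketch the decoder enumerates all multisets, which is only practical for small alphabets; it is meant to show why distinct sums make the aggregated labels recoverable, not how an efficient decoder would work.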