Reliability, reproducibility, and scalability of bioinformatics federated learning methods

Determine the reliability, reproducibility, and scalability of federated learning methods designed specifically for bioinformatics when they are applied in real-world cross-silo consortia with heterogeneous clients and privacy-preserving constraints. The methods in question include federated implementations for proteomics and differential gene expression (for example, DEqMS and limma voom); federated genome-wide association studies using generalized linear mixed models and privacy-preserving relatedness estimation; federated single-cell RNA-seq cell type classifiers (including ACTINN, linear support vector machines, XGBoost, and GeneFormer); vertical federated multi-omics integration neural networks; and medical imaging federated segmentation and diagnosis protocols.

Background

The paper highlights several federated learning approaches tailored to bioinformatics domains, including proteomics and differential expression, GWAS, single-cell RNA sequencing, multi-omics integration, and medical imaging. These methods address privacy and data-sharing barriers via secure aggregation, homomorphic encryption, secure multiparty computation, and differential privacy, aiming to deliver results comparable to centralized analyses while mitigating legal and infrastructural constraints.
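To make the idea of secure aggregation concrete, the following is a minimal toy sketch of pairwise additive masking, assuming NumPy is available. It is not the protocol used by any of the tools named above: the function names (`pairwise_masks`, `secure_mean`) are hypothetical, a single seeded generator stands in for the secrets that pairs of clients would agree on privately, and real protocols typically work over finite fields and must handle client dropout. The sketch only illustrates why the aggregate can match the centralized result even though each individual contribution is obscured.

```python
import numpy as np

def pairwise_masks(n_clients, dim, seed=0):
    """Build one mask per client such that all masks sum to zero.

    Assumption: the seeded generator stands in for secrets that each
    pair of clients would derive privately in a real protocol.
    """
    rng = np.random.default_rng(seed)
    masks = np.zeros((n_clients, dim))
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            shared = rng.normal(size=dim)  # secret shared by clients i and j
            masks[i] += shared             # client i adds the shared mask
            masks[j] -= shared             # client j subtracts it
    return masks

def secure_mean(client_updates, seed=0):
    """Average client updates from masked contributions only."""
    updates = np.asarray(client_updates, dtype=float)
    masks = pairwise_masks(*updates.shape, seed=seed)
    masked = updates + masks               # what each client would transmit
    return masked.sum(axis=0) / len(updates)

# Three hypothetical clients, each holding a two-parameter local update.
updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = secure_mean(updates)
```

Because every shared mask is added once and subtracted once, the masks cancel in the sum, so `result` agrees with the plain average of the raw updates up to floating-point error. Whether this kind of equivalence holds up under heterogeneous clients and at consortium scale is part of what remains open.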

However, the authors emphasize that these domain-specific federated solutions are still at an early stage of development. As a consequence, it remains uncertain how reliably they perform across institutions, whether their results are reproducible under heterogeneous conditions, and how well they scale to large, multi-party, real-world consortia.

References

They are all in the early stages of development, so their reliability, reproducibility, and scalability are open questions.

— Technical Insights and Legal Considerations for Advancing Federated Learning in Bioinformatics (2503.09649 - Malpetti et al., 12 Mar 2025) in Section 4 (Federated learning in bioinformatics), opening paragraph
