
Distributed Harmonization: Federated Clustered Batch Effect Adjustment and Generalization (2405.15081v3)

Published 23 May 2024 in cs.LG

Abstract: Many data analysis and modeling techniques assume independent and identically distributed (i.i.d.) data. In the medical domain, data are inherently decentralized, so collecting data from multiple sites or institutions is a common strategy for obtaining sufficient clinical diversity. However, data from different sites are easily biased by the local environment or facilities, which violates the i.i.d. assumption. A common remedy is to harmonize away the site bias while retaining important biological information. ComBat is among the most popular harmonization approaches and has recently been extended to handle distributed sites. However, when new sites join training, or when data from unknown or unseen sites must be evaluated, ComBat lacks compatibility and requires retraining with data from all sites. This retraining incurs significant computational and logistical overhead that is usually prohibitive. In this work, we develop a novel Cluster ComBat harmonization algorithm, which leverages cluster patterns of the data across sites and greatly improves the usability of ComBat harmonization. We use extensive simulations and real medical imaging data from ADNI to demonstrate the superiority of the proposed approach. Our code is available at https://github.com/illidanlab/distributed-cluster-harmonization.
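The cluster idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repository linked above). It assumes a single feature, replaces ComBat's empirical Bayes model and covariates with a plain location-scale adjustment, and uses hypothetical helper names (`site_stats`, `kmeans`, `nearest`, `harmonize`). Each site shares only its summary statistics, the sites are clustered on those statistics, and an unseen site is harmonized with the parameters of its nearest cluster, so nothing is retrained.

```python
import math
import random
from statistics import mean, pstdev


def site_stats(values):
    """Summary a site would share: (mean, std) of one feature."""
    return mean(values), pstdev(values)


def nearest(point, centers):
    """Index of the cluster center closest to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, k, iters=50, seed=0):
    """Tiny k-means over the per-site (mean, std) pairs."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            tuple(mean(c) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def harmonize(values, center, target):
    """Location-scale adjustment from the cluster's (mu, sigma) to the target's.

    Assumes sigma > 0; a real implementation would need to guard this.
    """
    mu, sigma = center
    t_mu, t_sigma = target
    return [(v - mu) / sigma * t_sigma + t_mu for v in values]


# Hypothetical toy data: two low-valued sites and two high-valued sites.
sites = {
    "A": [9.0, 10.0, 11.0],
    "B": [9.5, 10.5, 11.5],
    "C": [18.0, 20.0, 22.0],
    "D": [19.0, 21.0, 23.0],
}
points = [site_stats(v) for v in sites.values()]
centers = kmeans(points, k=2)

# An unseen site is assigned to its nearest cluster; no refitting is needed.
unseen = [19.5, 20.5, 21.5]
cluster = nearest(site_stats(unseen), centers)
adjusted = harmonize(unseen, centers[cluster], target=(0.0, 1.0))
print(cluster, adjusted)
```

In this sketch, adding a site costs one nearest-center lookup, which is the usability gain the abstract attributes to clustering. How the paper estimates cluster parameters and handles covariates is not specified in this section.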

