
OpenFL: An open-source framework for Federated Learning (2105.06413v1)

Published 13 May 2021 in cs.LG and cs.DC

Abstract: Federated learning (FL) is a computational paradigm that enables organizations to collaborate on ML projects without sharing sensitive data, such as, patient records, financial data, or classified secrets. Open Federated Learning (OpenFL https://github.com/intel/openfl) is an open-source framework for training ML algorithms using the data-private collaborative learning paradigm of FL. OpenFL works with training pipelines built with both TensorFlow and PyTorch, and can be easily extended to other ML and deep learning frameworks. Here, we summarize the motivation and development characteristics of OpenFL, with the intention of facilitating its application to existing ML model training in a production environment. Finally, we describe the first use of the OpenFL framework to train consensus ML models in a consortium of international healthcare organizations, as well as how it facilitates the first computational competition on FL.

Authors (18)
  1. G Anthony Reina (6 papers)
  2. Alexey Gruzdev (6 papers)
  3. Patrick Foley (7 papers)
  4. Olga Perepelkina (1 paper)
  5. Mansi Sharma (20 papers)
  6. Igor Davidyuk (1 paper)
  7. Ilya Trushkin (1 paper)
  8. Maksim Radionov (2 papers)
  9. Aleksandr Mokrov (1 paper)
  10. Dmitry Agapov (1 paper)
  11. Jason Martin (13 papers)
  12. Brandon Edwards (8 papers)
  13. Micah J. Sheller (3 papers)
  14. Sarthak Pati (24 papers)
  15. Prakash Narayana Moorthy (1 paper)
  16. Prashant Shah (6 papers)
  17. Spyridon Bakas (55 papers)
  18. Shih-Han Wang (6 papers)
Citations (92)

Summary

OpenFL: An Open-Source Framework for Federated Learning

The paper "OpenFL: An Open-Source Framework for Federated Learning" presents a detailed exploration of Open Federated Learning (OpenFL), a framework developed to address the increasing demand for data privacy in collaborative ML and deep learning model training. This open-source project is the result of a collaborative effort between Intel Labs and the University of Pennsylvania and is designed to facilitate the deployment of federated learning (FL) across various industries, notably healthcare.

Framework Overview

OpenFL trains ML models on decentralized data using the federated learning paradigm: training occurs where the data resides, and only model updates leave each site. This differs from traditional centralized approaches, which require pooling data and therefore raise significant privacy concerns. Notably, OpenFL is compatible with TensorFlow and PyTorch and is extensible to other ML and deep learning frameworks. This adaptability aligns with its foundational philosophy of being industry- and use-case-agnostic.

Architecture and Workflow

The framework operates on a star-topology network comprising two types of nodes: collaborators and aggregators. Collaborator nodes host local datasets, enabling decentralized training, while aggregator nodes, trusted entities within the federation, combine updates from collaborator nodes into a global model. The federated learning process is coordinated through a federated learning plan, which defines tasks, parameters, and networking configurations.
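The aggregation step at the center of this star topology can be sketched as a weighted average of collaborator updates, in the spirit of federated averaging. The names below (`Update`, `aggregate`) are illustrative, not OpenFL's actual API:

```python
# Minimal sketch of aggregator-side model merging in a star-topology
# federation. Hypothetical names; OpenFL's real aggregation is configured
# through its federated learning plan.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Update:
    weights: Dict[str, List[float]]  # tensor name -> flattened values
    num_samples: int                 # size of the collaborator's local dataset

def aggregate(updates: List[Update]) -> Dict[str, List[float]]:
    """Combine collaborator updates into a global model, weighting each
    collaborator by its local dataset size."""
    total = sum(u.num_samples for u in updates)
    merged: Dict[str, List[float]] = {}
    for name, values in updates[0].weights.items():
        merged[name] = [
            sum(u.weights[name][i] * u.num_samples / total for u in updates)
            for i in range(len(values))
        ]
    return merged

# Two collaborators with different dataset sizes:
a = Update({"layer0": [1.0, 2.0]}, num_samples=30)
b = Update({"layer0": [3.0, 4.0]}, num_samples=10)
print(aggregate([a, b]))  # {'layer0': [1.5, 2.5]}
```

The weighting by `num_samples` reflects the common choice of giving data-rich collaborators proportionally more influence on the global model; other aggregation rules can be substituted via the plan's task definitions.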

Security Considerations

OpenFL is designed with a focus on security, anticipating threats such as model intellectual-property theft and data inference attacks. The framework employs Public Key Infrastructure (PKI) certificates for secure communication and supports integration with Trusted Execution Environments (TEEs), such as Intel SGX, to ensure execution confidentiality and integrity. These features are central to building trust among participants, especially in sectors with stringent data-protection requirements such as healthcare.
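The PKI-based trust model amounts to mutual TLS: the aggregator and every collaborator authenticate each other with certificates signed by the federation's certificate authority. A minimal sketch of that configuration using Python's standard `ssl` module (not OpenFL's actual implementation, which uses gRPC; the file paths in the comments are placeholders):

```python
# Illustrative only: a server-side TLS context that *requires* client
# certificates, mirroring the mutual-authentication idea behind OpenFL's
# PKI setup.
import ssl

def make_server_context() -> ssl.SSLContext:
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Reject any collaborator that does not present a certificate
    # signed by the federation's CA (mutual TLS).
    ctx.verify_mode = ssl.CERT_REQUIRED
    # A real deployment would also load the aggregator's key pair and
    # the federation CA certificate, e.g.:
    #   ctx.load_cert_chain("aggregator.crt", "aggregator.key")
    #   ctx.load_verify_locations("federation-ca.crt")
    return ctx

ctx = make_server_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```

With `CERT_REQUIRED` set on both sides, an endpoint without a CA-signed certificate cannot even complete the handshake, which is what keeps uninvited parties out of the federation.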

Installation and Usage

The paper offers comprehensive guidance on deploying OpenFL either on bare metal or via Docker, targeting both individual data scientists and large-scale production environments. Two interaction modes are provided: a Python API for prototyping and a Command Line Interface (CLI) for scalable production deployments. The step-by-step instructions and support for various deployment scenarios underscore the framework's accessibility and flexibility.

Empirical Applications

OpenFL has been practically applied in initiatives like the Federated Tumor Segmentation (FeTS), which involves multiple international healthcare institutions working collaboratively to enhance tumor boundary detection models. Furthermore, OpenFL facilitated the first computational competition on federated learning, demonstrating its capability to enable real-world evaluation of FL methods across distributed datasets.

Implications and Future Work

Federated learning presents significant potential for improving accuracy and reducing bias in AI models by allowing access to diverse datasets while maintaining privacy. OpenFL’s contributions highlight its utility in this area and its readiness for production-level tasks. Moreover, by being transparent and open-source, OpenFL invites further enhancement and adoption by the federated learning community.

As federated learning continues to evolve, frameworks like OpenFL may play a critical role in real-world applications across multiple sectors, including finance, healthcare, and beyond. The vision is for federations to become permanent ecosystems for continuous AI development and enhancement, effectively bridging the gap between data availability and privacy.

In conclusion, OpenFL sets a foundation for secure, privacy-preserving federated learning, offering a pathway for diverse organizations to collaborate effectively. This framework not only advances academic research but is also poised to influence industry practices significantly.
