
Python Testbed for Federated Learning

Updated 18 November 2025
  • Python Testbed for Federated Learning is a pure-Python, zero-dependency framework that simplifies prototyping and testing both centralized and decentralized federated learning algorithms, focusing on smart IoT and edge systems.
  • Its modular architecture abstracts away inter-process communication details by using callback functions, enabling developers to implement FL algorithms without managing low-level protocols.
  • Empirical validation and formal verification confirm rapid convergence (centralized protocols converge in 5 rounds, decentralized in 3) and establish deadlock-freedom for reliable operation.

The Python Testbed for Federated Learning Algorithms (PTB-FLA) is a pure-Python, zero-dependency framework designed for efficient prototyping, testing, and validating both centralized and decentralized federated learning (FL) routines, with a particular focus on smart Internet of Things (IoT) and edge system scenarios. PTB-FLA is architected to minimize the burden on the developer, requiring only the definition of callback functions that encapsulate the core logic of FL algorithms while abstracting away all details of distributed process and message orchestration. Its workflow is validated with canonical FL primitives and supported by formal verification methodologies.

1. System Architecture and Design

PTB-FLA is structured into three conceptual software layers:

  • Python Layer: The foundational layer leverages Python's multiprocessing and subprocess modules (specifically the Process, Listener, Client, and Queue objects) to implement robust inter-process communication on a single host.
  • PTB-FLA Layer: At this layer, the system provides (a) the launcher module, which spawns N processes using subprocess.Popen; (b) the mpapi module, providing an internal message-passing API built atop multiprocessing's Listener and Client; and (c) the ptbfla module, where the main PtbFla class implements the generic centralized (fl_centralized) and decentralized (fl_decentralized) FL protocols.
  • Application Layer: Here, user-written Python scripts import PtbFla, define callback functions for client and server roles, and invoke the testbed API. All FL algorithmic specialization is via callbacks; no subclassing or lower-level protocol knowledge is required.

Runtime Model: PTB-FLA's launcher (a CLI script) spawns N identical Python processes. Each process instantiates PtbFla(noNodes=N, nodeId=i, [flSrvId]) and executes either the centralized or the decentralized protocol. All message-passing details are managed internally and hidden from user scripts.

Centralized and Decentralized Protocols

  • Centralized (Star Topology): Node 0 (the default server) broadcasts its model $w^t$ to all clients. Each client computes an update $\Delta_i = \mathrm{clientcb}(w^t, p_i)$ and returns it. The server aggregates the updates via $\mathrm{servercb}(\{\Delta_i\}) \to w^{t+1}$.
  • Decentralized (Complete Graph): Each node $i$ sends its current model $w_i^t$ to all others. Upon receiving a peer's model, each node computes $\Delta_{i\leftarrow j} = \mathrm{clientcb}(w_i^t, p_i, w_j^t)$ and responds. After collecting all responses from its peers, node $i$ aggregates via $\mathrm{servercb}(\{\Delta_{k\leftarrow i}\}) \to w_i^{t+1}$.
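The two protocols can be sketched as plain Python, with direct function calls standing in for PTB-FLA's hidden message passing. All names and the exact call structure here are illustrative, not the framework's real API; the real system runs each node as a separate process.

```python
# Single-host sketch of the two generic protocols. A callback clientcb(w_in, w_local)
# plays the client role and servercb(list_of_updates) the aggregation role.

def centralized_round(server_w, client_ws, clientcb, servercb):
    """Star topology: the server broadcasts w^t, each client returns
    Delta_i = clientcb(w^t, w_i), and the server aggregates servercb({Delta_i})."""
    updates = [clientcb(server_w, w_i) for w_i in client_ws]
    return servercb(updates), updates  # (w^{t+1}, clients' new local states)

def decentralized_round(ws, clientcb, servercb):
    """Complete graph: node i sends w_i to every peer k, which responds with
    Delta_{k<-i} = clientcb(w_i, w_k); node i then aggregates the responses."""
    n = len(ws)
    return [servercb([clientcb(ws[i], ws[k]) for k in range(n) if k != i])
            for i in range(n)]

# Toy averaging callbacks (Examples 2 and 3 later in this article):
avg_client = lambda w_in, w_local: (w_local + w_in) / 2
avg_server = lambda deltas: sum(deltas) / len(deltas)
```

With these averaging callbacks, one centralized round with server value 1.0 and clients [2.0, 3.0] moves the server to 1.75; one decentralized round over [1.0, 2.0, 3.0] yields [1.75, 2.0, 2.25].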

2. API, Configuration, and Usage

The primary interaction point is the PtbFla class and its methods:

| Constructor / Method | Description |
| --- | --- |
| PtbFla(noNodes, nodeId, flSrvId=0) | Instantiates the PTB-FLA testbed in each process |
| fl_centralized(sfun, cfun, ldata, pdata, noIters=1) | Runs the centralized FedAvg-style protocol |
| fl_decentralized(sfun, cfun, ldata, pdata, noIters=1) | Runs the decentralized clique-based protocol |

Parameters:

  • noNodes: Total number of FL processes $N$.
  • nodeId: Integer in $0 \dots N-1$, unique per process.
  • flSrvId: Index of the server node (for centralized), default 0.
  • sfun: User-written server callback, signature $f(\mathrm{privateData}, \mathrm{listOfMessages}) \to \mathrm{updateData}$.
  • cfun: User-written client callback, signature $f(\mathrm{localData}, \mathrm{privateData}, \mathrm{msgFromServerOrPeer}) \to \mathrm{updateData}$.
  • ldata: Model state (Python object, must be picklable).
  • pdata: Local training data (remains private to client).
  • noIters: Number of federated rounds.
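A minimal callback pair matching the signatures listed above might look as follows; the bodies implement the toy scalar-averaging example and are illustrative, not taken from the PTB-FLA source.

```python
# The only FL-specific code a PTB-FLA application writes: one client callback
# and one server callback. The framework supplies the arguments and routes
# the return values between processes.

def cfun(localData, privateData, msgFromServerOrPeer):
    # Client step for the toy averaging example: move the local model
    # halfway toward the model received from the server (or a peer).
    return (localData + msgFromServerOrPeer) / 2

def sfun(privateData, listOfMessages):
    # Server step: aggregate the client updates by plain averaging
    # (one-shot FedAvg over scalar models).
    return sum(listOfMessages) / len(listOfMessages)
```

Both return values must be picklable, since they travel between processes as messages.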

Initialization & Launch Workflow:

  1. Clone PTB-FLA: git clone https://github.com/username/ptb-fla.git
  2. Install via pip: pip install .
  3. Write an application script that imports PtbFla, defines callbacks, and invokes one of the two main methods.
  4. Launch N processes via the CLI: ptb-launch --nodes 4 --script example.py

3. Canonical Algorithm Implementations

Example 1: Federated Map (Threshold Counting)

  • Server selects a threshold $\theta$; each client compares its reading and returns 1.0 (above threshold) or 0.0.
  • Server callback computes the average: $\text{score} = \frac{1}{N_\text{clients}} \sum_{i=1}^{N_\text{clients}} \mathbf{1}[x_i > \theta]$.
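The two callbacks above are a few lines each; the readings and threshold below are illustrative values, not from the paper.

```python
# Federated map (threshold counting): clients return an indicator,
# the server averages the indicators into a fraction-above-threshold score.

def client_indicator(reading, theta):
    # Client returns 1.0 if its reading exceeds the threshold, else 0.0.
    return 1.0 if reading > theta else 0.0

def server_score(indicators):
    # score = (1/N) * sum of indicators = fraction of clients above threshold
    return sum(indicators) / len(indicators)

readings = [0.2, 0.7, 0.9, 0.4]
score = server_score([client_indicator(x, theta=0.5) for x in readings])
# score == 0.5: two of the four readings exceed the threshold
```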

Example 2: Centralized Data Averaging (Toy FedAvg)

  • Each node holds a scalar $w_i$.
  • Client update: $\Delta_i = \frac{w_i^t + w_\text{server}^t}{2}$.
  • Server aggregates: $w^{t+1} = \frac{1}{K} \sum_{i=1}^{K} \Delta_i$.
  • Convergence is rapid: all values reach the server-biased mean within 5 rounds.
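The convergence claim is easy to check with a direct multi-round simulation of the update rules above; the initial values are illustrative.

```python
# Toy centralized averaging: clients keep Delta_i as their new state,
# the server keeps the mean of the Delta_i.

def run_centralized(w_server, clients, rounds):
    for _ in range(rounds):
        deltas = [(w_i + w_server) / 2 for w_i in clients]  # client updates
        clients = deltas                                    # clients adopt Delta_i
        w_server = sum(deltas) / len(deltas)                # server aggregates
    return w_server, clients

w, clients = run_centralized(1.0, [2.0, 3.0], rounds=5)
spread = max(clients + [w]) - min(clients + [w])
# The server settles at the server-biased mean 1.75 after one round,
# and the spread across all values shrinks from 2.0 to 0.03125 in 5 rounds.
```

The spread halves each round, which is why a handful of rounds suffices on this toy data.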

Example 3: Decentralized Data Averaging

  • Same callbacks as above, invoked via fl_decentralized.
  • Protocol ensures all nodes converge to unweighted mean in three rounds.
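The decentralized variant can be simulated the same way; node $i$ averages the responses $\Delta_{k\leftarrow i} = (w_k + w_i)/2$ from all peers. Initial values are illustrative.

```python
# Decentralized averaging on a complete graph: each round, every node
# aggregates one response from each peer.

def run_decentralized(ws, rounds):
    for _ in range(rounds):
        n = len(ws)
        # node i's new state is the mean of Delta_{k<-i} over all peers k
        ws = [sum((ws[k] + ws[i]) / 2 for k in range(n) if k != i) / (n - 1)
              for i in range(n)]
    return ws

ws = run_decentralized([1.0, 2.0, 3.0], rounds=3)
# The unweighted mean 2.0 is preserved exactly each round, and after
# 3 rounds every node is within 0.02 of it.
```

The update preserves the global mean exactly (each round is a doubly stochastic mixing step), so the nodes contract toward the unweighted mean rather than a server-biased one.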

Mathematical Foci

  • Centralized aggregation (FedAvg one-shot): $w^{t+1} = \frac{1}{K} \sum_{i=1}^{K} \Delta_i$.
  • Client update rule: $\Delta_i = \frac{w_i^t + w_\text{peer}^t}{2}$.

4. Extending and Customizing the Testbed

Researchers extend PTB-FLA by writing new callback pairs for the client and server roles. The architecture is intentionally minimal, supporting the following extension mechanisms:

  • Custom topologies: Fork the mpapi layer and adjust neighbor lists for non-star/non-clique communication graphs.
  • Algorithmic specialization: Subclass PtbFla or add methods (e.g., fl_custom) for custom protocols, message stubs, or scheduling policies.
  • Callback-driven design: Any algorithm expressible as a separation between local-compute and aggregation fits the PTB-FLA paradigm.

Development is further facilitated by a disciplined four-phase programming paradigm that translates sequential code stepwise into federated code with callbacks and, finally, into PTB-FLA orchestration (Popovic et al., 2023).

5. Empirical Validation and Formal Verification

PTB-FLA's empirical and formal properties are extensively addressed:

  • Validation: On synthetic scalar/vector data, centralized and decentralized routines exhibit expected convergence rates (5 and 3 rounds respectively). The threshold counting example always yields the correct ratio in a single round.
  • Communication Cost:
    • Centralized: $2(N - 1)$ messages per round, each of size $O(\lvert \text{model} \rvert)$.
    • Decentralized: $2(N - 1)N$ messages per round (full clique exchange).
  • IoT Simulation Constraints: Running all processes on a single host omits real-world heterogeneity, network latency, and packet loss, and limits scale to OS process and file-handle limits.
  • Formal Verification: The generic centralized and decentralized algorithms, as well as universal TDM peer exchange, have been formally modeled in CSP and verified in PAT for deadlock-freedom and termination properties (Popovic et al., 12 Nov 2025, Popovic et al., 8 Jun 2025).
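The communication-cost figures above count request/response pairs, not bytes, and follow directly from the topologies; as small helpers:

```python
# Per-round message counts for the two PTB-FLA protocols.

def msgs_centralized(n):
    # (N - 1) broadcasts out plus (N - 1) updates back
    return 2 * (n - 1)

def msgs_decentralized(n):
    # each of the N nodes exchanges a request/response pair with its N - 1 peers
    return 2 * (n - 1) * n

# For N = 4: 6 messages per centralized round, 24 per decentralized round.
```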

6. Significance, Limitations, and Evolving Use Cases

PTB-FLA achieves several unique objectives within the research and prototyping space:

  • Minimal Footprint: Pure Python with no dependencies beyond the standard library; readily deployable even in resource-constrained environments, and runs on Windows, Linux, and macOS.
  • Simplicity of Porting and Prototyping: By accepting only two core functions (callbacks), PTB-FLA supports rapid iteration, formal model extraction, and LLM-driven development workflows (Popovic et al., 2023).
  • Pathway to Edge and IoT Deployment: Provides a roadmap for migration to true edge/LAN/heterogeneous systems by decoupling logical protocol from execution substrate, as well as a MicroPython variant for actual device deployment (Popovic et al., 15 May 2024).

Limitations:

  • PTB-FLA does not support cross-host execution or simulate network pathologies (latency, drop, bandwidth limitations) in its base version.
  • All nodes are assumed to run synchronously; asynchronicity, partial participation, secure aggregation, and hardware heterogeneity require external simulation or further development.

Evolution: Subsequent frameworks extend PTB-FLA to multi-host and IoT-capable asynchronous execution (MicroPython Testbed for Federated Learning Algorithms (Popovic et al., 15 May 2024)), while formal verification efforts demonstrate that core routines meet safety and liveness properties essential for correctness under practical deployment constraints.

