EPFL-REMNet: Federated REM Mapping
- EPFL-REMNet is a federated learning framework that builds high-fidelity digital twins of heterogeneous 6G radio environments using a shared backbone and personalized heads.
- It employs advanced uplink sparsification and 8-bit quantization to reduce communication overhead by over 97% compared to traditional FedAvg.
- The architecture robustly handles Non-IID data distributions, enhancing accuracy and fairness for tail clients while accelerating convergence.
EPFL-REMNet is a federated learning framework developed for constructing high-fidelity digital twins of heterogeneous 6G radio environments, with a focus on efficient personalization under statistical heterogeneity and severe communication constraints (Li et al., 7 Nov 2025). It is characterized by a "shared backbone + lightweight personalized head" architecture, advanced uplink compression with Top-K sparsification and quantization, and robust convergence in Non-IID (non-independent and identically distributed) settings. In some earlier computer vision literature, "RemNet" refers to a distinct architecture for camera model identification (Rafi et al., 2019); however, EPFL-REMNet in the context of digital twins and wireless REM is specific to federated radio environment mapping.
1. Motivation and Problem Setting
The transition from homogeneous 5G to B5G/6G radio environments introduces drastic spatial and environmental heterogeneity in propagation (e.g., due to urban canyons, open terrain, and the resulting path-loss variation). Data for REM modeling is distributed among edge servers or devices, making centralized training infeasible because of privacy and uplink bandwidth constraints.
Classical federated learning, exemplified by FedAvg, exhibits two fundamental limitations in this setting:
- Statistical heterogeneity—the population of client data distributions is highly Non-IID, leading to poor generalization for the global model and substantial performance gaps, particularly for “long-tail” clients possessing rare or atypical propagation patterns.
- Communication overhead—repeatedly transmitting high-dimensional model updates becomes prohibitive over constrained uplinks, especially with large numbers of geographically partitioned clients.
EPFL-REMNet is designed to address these challenges by combining model personalization, update compression, and robust optimization under severe Non-IID regimes in federated REM construction.
2. Architecture: Shared Backbone and Personalized Heads
EPFL-REMNet implements a modular structure consisting of a shared feature extractor (the backbone) and client-specific adaptation heads:
- Shared Backbone: A three-layer Multi-Layer Perceptron (MLP) employing SiLU activations and LayerNorm, which maps each input sample (2D coordinates plus P auxiliary environmental features) into a 512-dimensional latent space. This encoder is synchronized and updated across all clients.
- Personalized Head for Client $k$: A lightweight network (either a single linear layer or a two-layer MLP with dropout) that projects the shared latent representation to the M-dimensional vector of cell-specific signal strength estimates. Each client updates only its own head parameters locally; these are not transmitted to the server.
This separation enables clients to adapt to their local propagation peculiarities without global synchronization, while the backbone accumulates globally transferable radio features. Empirical results demonstrate that this design substantially reduces Non-IID drift and boosts accuracy for atypical ("long-tail") clients.
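As a concrete illustration, a minimal PyTorch-style sketch of this split is given below; the 102-dimensional input and 512-dimensional latent size follow the description above, while the hidden widths, dropout rate, and head depth are illustrative assumptions rather than the paper's exact choices.

```python
# Sketch of the shared-backbone / personalized-head split.
# Hidden widths, dropout rate, and head depth are assumptions for illustration.
import torch
import torch.nn as nn

class SharedBackbone(nn.Module):
    """Three-layer MLP with SiLU activations and LayerNorm, mapping
    (2D coordinates + auxiliary features) into a 512-dim latent space."""
    def __init__(self, in_dim: int = 102, latent_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.SiLU(), nn.LayerNorm(512),
            nn.Linear(512, 512), nn.SiLU(), nn.LayerNorm(512),
            nn.Linear(512, latent_dim), nn.SiLU(), nn.LayerNorm(latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class PersonalizedHead(nn.Module):
    """Lightweight client-specific head: a two-layer MLP with dropout that maps
    the shared latent representation to M per-cell signal-strength estimates."""
    def __init__(self, latent_dim: int = 512, n_cells: int = 4, p_drop: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.SiLU(), nn.Dropout(p_drop),
            nn.Linear(128, n_cells),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

# Per client k: predictions = head_k(backbone(x)); only backbone updates are uploaded.
```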
3. Model Optimization and Uplink Compression
3.1 Optimization Objective
Let $\mathcal{D}_k$ denote the local dataset for client $k$, with $p_k = |\mathcal{D}_k| / \sum_j |\mathcal{D}_j|$ as the client's dataset weight. The optimization target is:

$$\min_{\theta,\,\{\phi_k\}}\;\sum_{k} p_k\,\mathbb{E}_{(x,y)\sim\mathcal{D}_k}\!\left[\ell_{\mathrm{Huber}}\big(h_{\phi_k}(f_\theta(x)),\,y\big)\right] \;+\; \lambda\,\mathcal{R}(\theta)$$

where $f_\theta$ is the shared backbone, $h_{\phi_k}$ the personalized head of client $k$, $\ell_{\mathrm{Huber}}$ the Huber loss, and $\mathcal{R}(\theta)$ an optional backbone regularizer with weight $\lambda$.
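A minimal sketch of the corresponding per-client training term, assuming PyTorch's built-in Huber loss; the function name and the regularizer weight are illustrative:

```python
import torch
import torch.nn.functional as F

def local_objective(backbone, head, x, y, reg_weight: float = 0.0):
    """Client-side loss: Huber loss between head(backbone(x)) and the measured
    per-cell signal strengths y, plus an optional L2 regularizer on the backbone."""
    pred = head(backbone(x))
    loss = F.huber_loss(pred, y)                 # robust regression loss
    if reg_weight > 0.0:
        reg = sum((p ** 2).sum() for p in backbone.parameters())
        loss = loss + reg_weight * reg           # optional backbone regularizer
    return loss
```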
3.2 Uplink Sparsification and Quantization
To minimize communication cost, only the backbone update is transmitted from each client, and it is drastically compressed. The steps are:
- Error-feedback accumulation: With the residual $e_k^{t-1}$ carried over from the previous round, client $k$ forms $u_k^t = \Delta\theta_k^t + e_k^{t-1}$ for round $t$, where $\Delta\theta_k^t$ is the local backbone parameter change.
- Top-K sparsification: Apply $\mathrm{Top}_K(u_k^t)$, retaining only the $K$ elements with the largest absolute values and zeroing the remainder; the discarded mass is folded into the new residual $e_k^t$.
- 8-bit symmetric quantization: The nonzero entries are quantized to 8 bits; only their indices, quantized values, and a shared scale factor are sent.
The effective reduction combines high sparsity (up to 98% of entries zeroed) with 8-bit value compression on the surviving elements, yielding over a 97% reduction in uplink bytes relative to FedAvg in all scenarios.
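The client-side compression path can be sketched as follows; the flattened-tensor representation and the exact encoding of indices and scale are assumptions for illustration:

```python
import torch

def compress_update(delta: torch.Tensor, error: torch.Tensor, sparsity: float = 0.98):
    """Error-feedback Top-K sparsification followed by 8-bit symmetric quantization.
    `delta` is the flattened local backbone update, `error` the residual carried
    from the previous round. Returns (indices, int8 values, scale, new_error)."""
    u = delta + error                              # error-feedback accumulation
    k = max(1, int((1.0 - sparsity) * u.numel()))
    idx = torch.topk(u.abs(), k).indices           # keep the k largest-magnitude entries
    vals = u[idx]

    scale = vals.abs().max().clamp_min(1e-12) / 127.0
    q = torch.clamp((vals / scale).round(), -127, 127).to(torch.int8)

    # Residual (dropped entries + quantization error) is fed back next round.
    decoded = torch.zeros_like(u)
    decoded[idx] = q.float() * scale
    new_error = u - decoded
    return idx, q, scale, new_error
```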
4. Handling Non-IID Data and Federated Training
4.1 Scenario Partitioning and Heterogeneity Quantification
A per-sample heterogeneity score is computed from the spread of the signal measurements received from the different base stations. The full dataset is partitioned into Light, Medium, and Heavy Non-IID splits using the 33rd and 66th percentiles of this score, reflecting environments that range from smooth (low spatial variation) to highly non-stationary (urban canyons).
Clients are organized into a spatial grid (90 clients per scenario), each predominantly sampling from its corresponding grid cell, inducing realistic Non-IID feature distributions.
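The percentile-based split can be illustrated with the sketch below; since the exact heterogeneity score is not reproduced here, the max minus min spread across base-station measurements is used as a stand-in, which is an assumption rather than the paper's definition.

```python
import numpy as np

def partition_by_heterogeneity(signals: np.ndarray):
    """Split samples into Light / Medium / Heavy Non-IID groups using the 33rd and
    66th percentiles of a per-sample heterogeneity score. `signals` has shape
    (n_samples, n_base_stations); the score below (max-min spread across base
    stations) is an illustrative choice."""
    score = signals.max(axis=1) - signals.min(axis=1)
    p33, p66 = np.percentile(score, [33, 66])
    labels = np.where(score <= p33, "Light",
                      np.where(score <= p66, "Medium", "Heavy"))
    return labels
```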
4.2 Federated Training Loop
Each communication round consists of:
- The server broadcasts the current backbone parameters $\theta^t$ to the participating clients.
- Each participating client $k$ locally updates both its single backbone copy and its personalized head over multiple local epochs, computes the backbone change $\Delta\theta_k^t$, and, every $\tau$ rounds, compresses and transmits this backbone update.
- The server decodes the received updates, averages them to produce $\theta^{t+1}$, and optionally applies an exponential moving average (EMA) for stability.
This alternating local fitting (with no head synchronization) and periodic backbone averaging is critical for scalability and fairness across Non-IID distributions.
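A sketch of the server-side step of this loop, assuming the compressed-update format (indices, 8-bit values, scale) from the uplink compression sketch above and an illustrative EMA coefficient:

```python
import torch

def server_aggregate(theta: torch.Tensor, updates, ema: float = 0.9) -> torch.Tensor:
    """Decode compressed client updates, average them, and apply the aggregated
    backbone change with an optional EMA for stability. `theta` is the flattened
    global backbone; `updates` is a list of (indices, int8 values, scale) tuples."""
    decoded = torch.zeros(len(updates), theta.numel())
    for i, (idx, q, scale) in enumerate(updates):
        decoded[i, idx] = q.float() * scale          # dequantize the sparse entries
    mean_delta = decoded.mean(dim=0)                 # average over received clients
    new_theta = theta + mean_delta
    return ema * theta + (1.0 - ema) * new_theta     # optional EMA smoothing
```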
5. Experimental Methodology and Evaluation Metrics
Experiments use the RadioMapSeer dataset, comprising rasterized path-loss and received-power maps. Each input vector is 102-dimensional (2D coordinates plus 100 auxiliary features), and each output is the vector of signal strengths to four base stations.
- Number of clients per scenario: 90
- Local update epochs: 5
- Communication interval $\tau$: 5 rounds
- Sparsity target: tuned to achieve 98% sparsification of the backbone update
- Quantization: 8-bit
- Training rounds: 100
Major metrics:
- Digital Twin Fidelity: Micro-averaged RMSE, Macro-averaged RMSE, and MAE, computed across all test samples and clients.
- Uplink Overhead: Total transmitted bytes for backbone updates.
- Fairness Gap: Maximum minus minimum per-base-station RMSE across clients.
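One possible reading of these metrics in code, operating on per-client prediction/target arrays; the per-base-station RMSE here is pooled over clients, which is an interpretation for illustration rather than the paper's exact definition:

```python
import numpy as np

def evaluate(preds, targets):
    """Metrics over per-client arrays of shape (n_samples_k, n_base_stations).
    Micro-RMSE pools all samples; macro-RMSE averages per-client RMSEs;
    the fairness gap is the max-min spread of per-base-station RMSE."""
    per_client_rmse = [np.sqrt(np.mean((p - t) ** 2)) for p, t in zip(preds, targets)]
    all_p, all_t = np.concatenate(preds), np.concatenate(targets)
    micro_rmse = float(np.sqrt(np.mean((all_p - all_t) ** 2)))
    macro_rmse = float(np.mean(per_client_rmse))
    mae = float(np.mean(np.abs(all_p - all_t)))
    per_bs_rmse = np.sqrt(np.mean((all_p - all_t) ** 2, axis=0))  # one value per base station
    fairness_gap = float(per_bs_rmse.max() - per_bs_rmse.min())
    return {"micro_rmse": micro_rmse, "macro_rmse": macro_rmse,
            "mae": mae, "fairness_gap": fairness_gap}
```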
6. Results, Ablation, and Comparative Analysis
6.1 Numerical Results
A summary of main results:
| Scenario | Macro RMSE, EPFL-REMNet (dB) | Macro RMSE, FedAvg (dB) | Uplink, EPFL-REMNet (MB) | Uplink, FedAvg (MB) |
|---|---|---|---|---|
| Light Non-IID | 3.75 | 6.63 | 13.4 | 86.7 |
| Medium Non-IID | 1.33 | 3.96 | 7.0 | 86.7 |
| Heavy Non-IID | 2.02 | 6.27 | 21.6 | 86.7 |
EPFL-REMNet yields a 97% reduction in communication relative to FedAvg, an 88% reduction relative to other baselines, and 60% lower error under heavy Non-IID conditions.
6.2 Convergence and Pareto Analysis
EPFL-REMNet demonstrates faster convergence towards low error plateaus and dominates the accuracy–overhead Pareto frontier, delivering best-in-class performance for a given uplink budget.
6.3 Fairness and Tail Performance
EPFL-REMNet consistently keeps the per-base-station RMSE range below 0.3 dB in all scenarios (vs. 2 dB for baselines), reducing both inter-client and tail-client error gaps.
6.4 Ablation Study
Omitting any key component (personalized heads, Top-K sparsification, quantization, periodic synchronization, EMA) results in up to 300% higher RMSE, confirming the necessity and synergy of all elements.
7. Implications, Applicability, and Potential Extensions
Modular decoupling of backbone and local heads enables robust generalization and uniform performance across highly non-uniform environments. Compression not only improves scalability but also regularizes backbone learning, filtering out update noise.
EPFL-REMNet’s architecture and pipeline are adaptable for deployment-specific constraints—parameters such as backbone/head sizes, compression rates, and communication intervals can be tuned for edge hardware, latency, and bandwidth budgets. Suggested future research directions include multi-modal fusion (e.g., integrating channel state and environmental imagery), dynamic meta-learning or head re-initialization for new deployment zones, and integration of differential privacy or secure aggregation to enhance data confidentiality.
A plausible implication is that this framework is extensible beyond wireless REM to any distributed system with mosaicked and highly Non-IID data distributions, when personalized modeling and communication efficiency are jointly prioritized.