Generalizable Federated Learning using Client Adaptive Focal Modulation (2508.10840v1)
Abstract: Federated learning (FL) has proven essential for privacy-preserving, collaborative training across distributed clients. Our prior work, TransFed, introduced a robust transformer-based FL framework that leverages a learn-to-adapt hypernetwork to generate personalized focal modulation layers per client, outperforming traditional methods in non-IID and cross-domain settings. In this extended version, we propose AdaptFED, where we deepen the investigation of focal modulation in generalizable FL by incorporating: (1) a refined adaptation strategy that integrates task-aware client embeddings to personalize modulation dynamics further, (2) enhanced theoretical bounds on adaptation performance, and (3) broader empirical validation across additional modalities, including time-series and multilingual data. We also introduce an efficient variant of TransFed that reduces server-client communication overhead via low-rank hypernetwork conditioning, enabling scalable deployment in resource-constrained environments. Extensive experiments on eight diverse datasets reaffirm the superiority of our method over state-of-the-art baselines, particularly in source-free and cross-task federated setups. Our findings not only extend the capabilities of focal modulation in FL but also pave the way for more adaptive, scalable, and generalizable transformer-based federated systems. The code is available at http://github.com/Tajamul21/TransFed
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.