Federated Prompt Learning (FPL)
- FPL is a decentralized paradigm that exchanges prompt vectors instead of full model weights, reducing communication costs and preserving privacy.
- It leverages prompt aggregation techniques like FedAvg and optimal transport to handle data heterogeneity and resource constraints.
- Empirical studies show that FPL matches full-model fine-tuning accuracy at a fraction of the communication and computational cost.
Federated Prompt Learning (FPL) is a model adaptation paradigm that integrates prompt-based tuning into federated learning, enabling decentralized, privacy-preserving, and communication-efficient adaptation of large pre-trained models, primarily vision-language models (VLMs) such as CLIP and multimodal LLMs. FPL replaces full-model weight updates with the collaborative learning and exchange of prompt (context) vectors, allowing efficient adaptation to diverse downstream tasks while preserving data privacy and minimizing communication overhead. Aggregating prompt parameters rather than model weights directly addresses the challenges of data heterogeneity, resource constraints, and privacy in federated environments. Recent research demonstrates FPL’s ability to match full-model fine-tuning accuracy at orders-of-magnitude lower communication and computation cost, while enabling robust generalization and supporting a wide spectrum of personalization, security, and continual learning strategies (Zhao et al., 2022, Guo et al., 2022, Liao et al., 28 Mar 2025).
1. Conceptual Foundation and Motivation
Federated Prompt Learning is motivated by limitations of classical federated learning, notably:
- prohibitive communication overhead when transmitting full model weights in large foundation models,
- overfitting and slow convergence on clients with limited or non-IID data,
- privacy risks from gradient inversion.
FPL addresses these issues by freezing the backbone (e.g., the entire encoder or transformer stack of CLIP or BERT) and introducing a small set of learnable prompt vectors/tokens. These prompts act as low-dimensional adapters superimposed on the shared model. Each client locally tunes only its prompt parameters on private data and shares the updated prompts with a central server, which performs prompt aggregation (typically FedAvg weighted by local data size; a minimal sketch follows). FPL drastically reduces per-round communication (often <0.1% of full model size) and achieves rapid convergence even under data scarcity and class/domain heterogeneity (Zhao et al., 2022, Guo et al., 2022, Liao et al., 28 Mar 2025, 2505.23024).
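A minimal sketch of the server-side aggregation step, assuming prompts are plain tensors of shape [num_tokens, dim] and clients are weighted by local dataset size; the function and variable names are illustrative rather than any specific paper's implementation:

```python
import torch

def aggregate_prompts(client_prompts, client_sizes):
    """FedAvg-style weighted averaging of prompt tensors.

    client_prompts: list of [num_tokens, dim] tensors, one per client
    client_sizes:   list of local dataset sizes used as aggregation weights
    """
    weights = torch.tensor(client_sizes, dtype=torch.float32)
    weights = weights / weights.sum()                      # normalize to sum to 1
    stacked = torch.stack(client_prompts)                  # [num_clients, num_tokens, dim]
    return (weights[:, None, None] * stacked).sum(dim=0)   # weighted average prompt

# Example: three clients, 16 prompt tokens of dimension 512
clients = [torch.randn(16, 512) for _ in range(3)]
global_prompt = aggregate_prompts(clients, client_sizes=[1200, 300, 500])
```

In a full round, the server would broadcast the aggregated prompt back to clients for the next local tuning step.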
2. FPL Architectures and Parameterizations
Prompt Types
- Soft textual prompts: Continuous vectors prepended to text encoder inputs, acting as tunable context (e.g., [p₁,…,p_m,class_name]); a construction sketch follows this list (Guo et al., 2022, Qiu et al., 2023).
- Visual prompts: Pixel-level or patch-level embeddings inserted into visual encoder inputs, capturing instance-level or domain-specific semantic cues (Liao et al., 28 Mar 2025, Li et al., 2023).
- Multimodal prompts: Joint text-visual or style-aware tokens, learned or generated by fusing multi-scale features and textual context (Prasad et al., 17 Aug 2025).
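To make the soft textual prompt parameterization concrete, here is a minimal CoOp-style construction sketch; the context length, embedding dimension, and class-name shapes are illustrative assumptions, and the frozen text encoder itself is omitted:

```python
import torch
import torch.nn as nn

class SoftTextPrompt(nn.Module):
    """Learnable context [p_1, ..., p_m] prepended to frozen class-name embeddings."""

    def __init__(self, num_ctx=16, dim=512):
        super().__init__()
        # Only these context parameters are trained and exchanged in FPL
        self.ctx = nn.Parameter(torch.randn(num_ctx, dim) * 0.02)

    def forward(self, class_embeds):
        # class_embeds: [num_classes, name_len, dim], from the frozen embedding layer
        ctx = self.ctx.unsqueeze(0).expand(class_embeds.size(0), -1, -1)
        return torch.cat([ctx, class_embeds], dim=1)  # [num_classes, num_ctx + name_len, dim]

# Example with hypothetical shapes: 10 classes, class names of 4 tokens, dim 512
prompts = SoftTextPrompt()(torch.randn(10, 4, 512))
```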
Aggregation Strategies
- FedAvg: Weighted averaging of prompt matrices across clients, preserving global alignment while allowing prompt specialization (Zhao et al., 2022, Guo et al., 2022, Liao et al., 28 Mar 2025).
- Probabilistic/EM alignment: EM-style assignment and matching of client prompt-sets to global prompt clusters, preventing destructive averaging under extreme heterogeneity (Weng et al., 27 Feb 2025).
- Optimal Transport: Unbalanced transport regularizes local/global prompt alignment at the patch level, focusing each prompt on semantically relevant regions (Li et al., 29 Feb 2024).
- Prompt portfolios: Weighted mixtures of global (shared) and local (personalized) prompts, with the mixing coefficient determined by data heterogeneity; see the mixing sketch below (Pan et al., 29 Sep 2024).
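A minimal sketch of the portfolio idea, assuming a scalar per-client mixing coefficient derived from a heterogeneity measure; the exact rule for choosing the coefficient varies across methods and is only stubbed here:

```python
import torch

def mix_prompts(global_prompt, local_prompt, heterogeneity):
    """Blend shared and personalized prompts into one effective client prompt.

    heterogeneity: scalar in [0, 1]; larger values mean the client's data deviates
    more from the global distribution, so more weight goes to the local prompt.
    This mapping to a mixing coefficient is an illustrative assumption.
    """
    lam = 1.0 - float(heterogeneity)          # weight on the global (shared) prompt
    return lam * global_prompt + (1.0 - lam) * local_prompt

client_prompt = mix_prompts(torch.randn(16, 512), torch.randn(16, 512), heterogeneity=0.3)
```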
Personalization and Continual Learning
- Local residuals: Clients maintain private residual prompt vectors (ΔP_i) or low-rank adaptation components, allowing fine-grained personalization atop a global prompt; a sketch of this pattern follows the list (Cui et al., 16 May 2024, Tran et al., 23 Jan 2025).
- Prototype-augmented prompts: Clients use fusion functions to combine task-specific prompts and leverage local/global prototypes for contrastive alignment and debiasing in continual, non-IID settings (He et al., 4 Nov 2024).
- Instance-wise Bayesian prompts: Semi-implicit variational inference yields per-instance prompt distributions for strong intra-client adaptation (Ye et al., 27 Aug 2025).
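The residual-personalization pattern can be sketched as follows: each client composes a shared global prompt with a private residual ΔP_i, and only the shared component is ever communicated. The parameter names and the low-rank factorization are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ResidualPrompt(nn.Module):
    """Effective prompt = shared global prompt + private client residual ΔP_i."""

    def __init__(self, num_tokens=16, dim=512, rank=4):
        super().__init__()
        self.shared = nn.Parameter(torch.zeros(num_tokens, dim))   # aggregated by the server
        # Low-rank residual kept private on the client (never transmitted)
        self.res_a = nn.Parameter(torch.randn(num_tokens, rank) * 0.02)
        self.res_b = nn.Parameter(torch.randn(rank, dim) * 0.02)

    def forward(self):
        return self.shared + self.res_a @ self.res_b

    def message_to_server(self):
        # Only the shared component leaves the device
        return self.shared.detach().clone()
```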
3. Communication and Privacy Efficiency
Prompt-based federated aggregation is highly communication-efficient. For representative large PLMs:
- Per-round transfer: typically 0.01–0.14% of the model weight size (e.g., 15–20 KB of prompt parameters versus a 110M-parameter backbone), enabling practical deployment across bandwidth-constrained, edge, or mobile devices (Zhao et al., 2022, Liao et al., 28 Mar 2025, Wang et al., 17 Jun 2025).
- Parameter-efficient personalization: Only prompt or residual parameters are tuned locally; backbone weights remain frozen, minimizing computation and memory footprint (Guo et al., 2022, Li et al., 2023, Cui et al., 16 May 2024).
- Privacy preservation: Exchange is limited to prompt vectors, which expose substantially less sensitive information than raw gradients or images. Differential privacy (DP) mechanisms can be applied efficiently to prompt updates, particularly to low-rank subspaces, limiting the degradation of the utility–privacy tradeoff; a minimal sketch follows this list (Tran et al., 23 Jan 2025, Zhao et al., 2022).
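As a rough illustration of the payload and privacy points above, the sketch below computes the size of a small fp32 prompt and applies a Gaussian-mechanism perturbation to a clipped prompt update; the prompt shape, clipping norm, and noise multiplier are placeholder values, not recommendations:

```python
import torch

# Payload arithmetic: an 8-token, 512-dim fp32 prompt is ~16 KB,
# versus ~440 MB for the weights of a 110M-parameter fp32 model.
prompt_bytes = 8 * 512 * 4
model_bytes = 110_000_000 * 4
print(prompt_bytes / 1024, "KB vs", model_bytes / 1e6, "MB")

def dp_prompt_update(delta, clip_norm=1.0, noise_multiplier=1.0):
    """Clip a prompt update and add Gaussian noise before sending it to the server."""
    scale = torch.clamp(clip_norm / (delta.norm() + 1e-12), max=1.0)  # L2 clipping
    noise = torch.randn_like(delta) * noise_multiplier * clip_norm
    return delta * scale + noise

noisy_update = dp_prompt_update(torch.randn(8, 512))
```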
4. Robustness to Data Heterogeneity and Generalization
FPL mitigates the adverse effects of label skew, domain shift, and client-specific bias:
- Multi-prompt/portfolio approaches: Joint learning of global and local prompts, possibly with OT or Bayesian regularization, balances generalization and personalization (Pan et al., 29 Sep 2024, Li et al., 29 Feb 2024, Weng et al., 27 Feb 2025).
- Geometry-guided calibration: Distributional shape (covariance/eigenvectors) is reconstructed centrally and transmitted as a prior, allowing local prompt updates to align with global data geometry (Luo et al., 8 Dec 2025).
- Style-aware prompt generation: Visual and style cues are fused with textual context via attention mechanisms, ensuring prompt tokens are context-adaptive and non-redundant, boosting generalization under domain and label heterogeneity (Prasad et al., 17 Aug 2025).
- Continual/prototype-fusion learning: Task-specific prompt freezing, prompt fusion, and contrastive prototype alignment prevent catastrophic forgetting without data rehearsal, with server-side debiasing correcting head drift; a contrastive alignment sketch follows this list (He et al., 4 Nov 2024).
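A minimal sketch of contrastive prototype alignment, assuming clients receive global class prototypes from the server and pull locally encoded, prompt-conditioned features toward the prototype of their ground-truth class; the temperature and feature shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def prototype_alignment_loss(features, labels, global_prototypes, temperature=0.07):
    """Cross-entropy over cosine similarities to global class prototypes.

    features:          [batch, dim] locally encoded (prompt-conditioned) features
    labels:            [batch] ground-truth class indices
    global_prototypes: [num_classes, dim] prototypes aggregated by the server
    """
    features = F.normalize(features, dim=-1)
    prototypes = F.normalize(global_prototypes, dim=-1)
    logits = features @ prototypes.t() / temperature
    return F.cross_entropy(logits, labels)

loss = prototype_alignment_loss(torch.randn(32, 512), torch.randint(0, 10, (32,)),
                                torch.randn(10, 512))
```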
5. Security Considerations and Threats
Prompt-level aggregation introduces new attack surfaces:
- Prompt-level backdoor attacks: Malicious clients can inject poisoned prompts via joint optimization of triggers and prompt embeddings, leading to universal backdoor activation with high attack success rates and minimal impact on standard accuracy (Zhang et al., 11 Aug 2025). Empirically, standard aggregation defenses (FedAvg, Multi-Krum, FoolsGold) only partially mitigate these risks, while heavy DP noise neutralizes the attacks at the cost of utility collapse.
- Defensive research directions: Prompt-space anomaly detection, certified prompt aggregation, and encoder-plus-prompt joint defenses are needed to secure future FPL deployments (Zhang et al., 11 Aug 2025, Zhao et al., 2022).
6. Benchmarking, Evaluation, and Practical Insights
FPL algorithms have been extensively benchmarked for accuracy, generalization, and resource consumption:
- FLIP framework: A comprehensive evaluation of eight state-of-the-art FPL methods across four federation protocols, six evaluation scenarios (global, personalized, novel-class, few-shot, cross-domain, and cost trade-offs), and twelve open datasets; it consistently shows that prompt learning matches or outperforms centralized baselines and generalizes robustly under data scarcity, unseen classes, and domain shift (Liao et al., 28 Mar 2025).
- Empirical guidelines: Visual prompts are preferred under label skew, text prompts under domain shift, and dual-prompt strategies in combined scenarios if resources permit. Weighted averaging is optimal under label heterogeneity; equal aggregation is preferable under domain shift (2505.23024).
- Resource scaling: Increasing prompt length or token count yields diminishing returns; ensemble or multi-prompt methods with optimal transport or regularization achieve near-maximal accuracy at tractable communication overhead (Liao et al., 28 Mar 2025, Li et al., 29 Feb 2024).
7. Future Directions and Open Challenges
Major lines for future FPL research include:
- Prompt-specific federated aggregation rules and dynamic adaptation to client/task heterogeneity (Liao et al., 28 Mar 2025, Pan et al., 29 Sep 2024).
- Security and fairness: scalable robust prompt aggregation, anomaly-aware client selection, and incentive-compatible protocols (Zhang et al., 11 Aug 2025, Wang et al., 17 Jun 2025).
- Multimodal, multi-task, and continual FPL in real-world federated settings (He et al., 4 Nov 2024, Ye et al., 27 Aug 2025).
- Domain-adaptive and style-aware prompt learning, including automatic prompt architecture discovery (Prasad et al., 17 Aug 2025, 2505.23024).
- Privacy-preserving and resource-aware learning leveraging low-rank, residual, and geometric calibration mechanisms (Tran et al., 23 Jan 2025, Luo et al., 8 Dec 2025).
Federated Prompt Learning is increasingly recognized as a principled foundation for collaborative model adaptation under strict privacy and efficiency constraints, with a rich design space for further theoretical, security, and application advances.