
ERPA: External-Referenced Prototype Alignment

Updated 28 January 2026
  • The paper introduces ERPA as a semantic anchoring strategy that leverages a minimal public dataset to stabilize cross-client prototype alignment.
  • Methodology distinguishes between covered and uncovered classes using uniform and data-weighted prototype aggregation to ensure consistent class anchors.
  • Empirical results show ERPA enhances classification accuracy and convergence in non-IID settings, outperforming baseline federated learning approaches.

External-Referenced Prototype Alignment (ERPA) is a semantic anchoring strategy designed for federated learning (FL) in contexts with non-IID data and strict communication constraints. Originating as a core component of the RefProtoFL framework, ERPA stabilizes class-wise representation learning across heterogeneous clients by leveraging a small, public, server-held dataset to construct shared prototype references. Through this mechanism, ERPA reduces prototype drift, enhances cross-client consistency, and empirically improves classification accuracy, particularly under severe label heterogeneity (Wu et al., 21 Jan 2026).

1. Conceptual Motivation and Integration in RefProtoFL

ERPA addresses the challenge of prototype inconsistency in prototype-based FL, where clients communicate class-wise feature prototypes—rather than entire model parameters—to mitigate bandwidth expenditure. In non-IID regimes, local prototypes tend to diverge, impairing global generalization. ERPA compensates by establishing “class anchors”: for classes present in the public dataset (“covered” classes), prototypes are induced from the public data and uniformly aggregated; for “uncovered” classes, anchors are formed by a data-weighted server aggregation of local client prototypes. By incorporating these external and global anchors into each client’s training loss, ERPA enforces semantic alignment without the requirement for direct data sharing.

Within RefProtoFL, ERPA operates alongside Adaptive Probabilistic Update Dropping (APUD), which provides update sparsification for communication efficiency. ERPA is positioned after adapter/parameter broadcast but before local optimization, ensuring that every local update is informed by the most recent global semantic references.

2. Mathematical Formulation

ERPA formalizes cross-client alignment using prototype aggregation and a prototype-alignment loss.

Notation:

  • $C$: label space (set of classes)
  • $D^{pub}_c$: public dataset for class $c$
  • $D_{k,c}$: data for client $k$, class $c$
  • $\theta^f, \theta^c$: shared adapter's feature and classifier parameters
  • $\theta^b_k$: client $k$'s private backbone
  • $F(x; \theta^b_k, \theta^f)$: client-internal feature extractor

External-reference prototypes (for covered classes, $|D^{pub}_c| > 0$):

  • Client computes: $p^{pub,t}_{k,c} = \frac{1}{|D^{pub}_c|}\sum_{x\in D^{pub}_c} F(x; \theta^b_k, \theta^f)$
  • Server aggregates (uniform over selected clients): $p^{ext,t}_c = \frac{1}{|S^t|} \sum_{k\in S^t} p^{pub,t}_{k,c}$

Global prototypes (for uncovered classes, $|D^{pub}_c| = 0$ but $|D_{k,c}| > 0$ for some client $k$):

  • Client computes: $p^{t}_{k,c} = \frac{1}{|D_{k,c}|} \sum_{x\in D_{k,c}} F(x; \theta^b_k, \theta^f)$
  • Server aggregates (data-weighted): $p^{g,t}_c = \sum_{k\in U^t_c} \frac{|D_{k,c}|}{\sum_{j\in U^t_c}|D_{j,c}|}\, p^{t}_{k,c}$, where $U^t_c = \{ k\in S^t : |D_{k,c}| > 0 \}$
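The two aggregation rules above can be sketched in a few lines of NumPy; the function names are illustrative, not from the paper:

```python
import numpy as np

def client_prototype(features):
    """Mean feature vector over a set of embeddings F(x; theta_k^b, theta^f)."""
    return features.mean(axis=0)

def aggregate_external(client_pub_protos):
    """Uniform server aggregation over selected clients (covered classes)."""
    return np.mean(client_pub_protos, axis=0)

def aggregate_global(client_protos, counts):
    """Data-weighted server aggregation (uncovered classes)."""
    counts = np.asarray(counts, dtype=float)
    weights = counts / counts.sum()            # |D_{k,c}| / sum_j |D_{j,c}|
    return weights @ np.stack(client_protos)   # weighted sum of prototypes

# Toy check: two clients with 2 and 1 samples of the same class
a = np.array([[1., 3.], [3., 5.]])
b = np.array([[5., 7.]])
pa, pb = client_prototype(a), client_prototype(b)
g = aggregate_global([pa, pb], [2, 1])
```

Note that the data-weighted rule reproduces the mean over the pooled samples, which is why it is used for classes absent from the public set.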

Prototype-alignment and combined loss:

  • Class anchor: $a^t_{k,c} = (1-\delta_c)\, p^{ext,t}_c + \delta_c\, p^{g,t}_c$, where $\delta_c = 0$ if class $c$ is covered and $\delta_c = 1$ otherwise
  • Cross-entropy loss: $L^{ce}_k = \sum_{(x,y)\in D_k} \ell_{ce}(g(F(x; \theta^b_k, \theta^f), \theta^c), y)$
  • Prototype-alignment loss: $L^{proto}_k = \sum_{c} \sum_{x\in D_{k,c}} \|F(x; \theta^b_k, \theta^f) - a^t_{k,c}\|^2_2$
  • Local objective: $L_k = L^{ce}_k + \lambda L^{proto}_k$, with $\lambda$ trading off classification against alignment.
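A minimal NumPy sketch of the local objective $L_k$, assuming features, logits, and per-class anchors have already been computed (the function signature is illustrative):

```python
import numpy as np

def local_objective(features, logits, labels, anchors, lam=1.0):
    """L_k = L_k^ce + lambda * L_k^proto, both summed over the local data.

    features: (N, d) outputs of F(x; theta_k^b, theta^f)
    logits:   (N, |C|) classifier outputs g(., theta^c)
    labels:   (N,) integer class indices
    anchors:  (|C|, d) class anchors a_{k,c}^t
    """
    # numerically stable log-softmax for the cross-entropy term
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].sum()
    # squared L2 distance of each feature to its class anchor
    proto = ((features - anchors[labels]) ** 2).sum()
    return ce + lam * proto
```

When features coincide with their anchors the alignment term vanishes and the objective reduces to plain cross-entropy, which matches the role of $L^{proto}_k$ as a pull toward the shared references.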

3. Algorithm and Communication Protocol

The ERPA-enabled RefProtoFL operates via the following protocol per communication round:

| Step | Entity | Operation |
| --- | --- | --- |
| Broadcast model and public set | Server | Transmits $(\theta^{a,t}, D^{pub})$ to the selected clients $S^t$ |
| Prototype computation and upload | Client | For $c \in C$: if $\lvert D^{pub}_c\rvert > 0$, compute $p^{pub,t}_{k,c}$; else if $\lvert D_{k,c}\rvert > 0$, compute $p^{t}_{k,c}$ |
| Prototype aggregation | Server | For each $c$: if $\delta_c = 0$, uniform $p^{ext,t}_c$; else data-weighted $p^{g,t}_c$ |
| Broadcast prototypes | Server | Sends $p^{ext,t}_c$ and $p^{g,t}_c$ to all clients |
| Training with alignment loss | Client | Forms anchors, initializes adapters, optimizes $L_k$ |
| Adapter sparsification and aggregation | Client/Server | Applies APUD to adapter updates, followed by server-side aggregation |
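The server-side half of a round can be illustrated with a toy simulation. The random per-client feature banks and the identity "embedding" of public samples are stand-ins for real backbones, so this sketches the aggregation logic only, not the training step:

```python
import numpy as np

rng = np.random.default_rng(0)
d, C = 8, 4                 # feature dimension, number of classes
covered = {0, 1}            # classes present in the public set

# hypothetical private data: client k -> {class c: (n_{k,c}, d) features}
clients = {
    k: {int(c): rng.normal(size=(int(rng.integers(1, 5)), d))
        for c in rng.choice(C, size=2, replace=False)}
    for k in range(3)
}
public = {c: rng.normal(size=(6, d)) for c in covered}

def one_round(selected):
    """Compute the class anchors the server would broadcast this round."""
    anchors = {}
    for c in range(C):
        if c in covered:
            # delta_c = 0: each client embeds the public samples; with the
            # identity embedding used here all clients agree exactly
            protos = [public[c].mean(axis=0) for _ in selected]
            anchors[c] = np.mean(protos, axis=0)
        else:
            # delta_c = 1: data-weighted aggregation over holders U_c^t
            holders = [k for k in selected if c in clients[k]]
            if not holders:
                continue
            counts = np.array([len(clients[k][c]) for k in holders], float)
            protos = np.stack([clients[k][c].mean(axis=0) for k in holders])
            anchors[c] = (counts / counts.sum()) @ protos
    return anchors

anchors = one_round([0, 1, 2])
```

Classes held by no selected client simply receive no anchor that round, matching the protocol's restriction of $p^{g,t}_c$ to $U^t_c$.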

ERPA thus enables every client, regardless of private data coverage, to align its representations to the same set of class anchors, mitigating the effects of non-IID label splits (Wu et al., 21 Jan 2026).

4. Handling of Covered and Uncovered Classes

ERPA explicitly distinguishes between covered and uncovered classes to ensure universal semantic alignment.

  • Covered classes ($\delta_c = 0$): All clients compute external-reference prototypes on the shared public set. The server aggregates these uniformly, broadcasting $p^{ext,t}_c$ as the anchor.
  • Uncovered classes ($\delta_c = 1$): Only clients with private data for class $c$ participate. Prototypes are aggregated via data-volume weighting to form $p^{g,t}_c$.
  • Anchor distribution: Both $p^{ext,t}_c$ and $p^{g,t}_c$ are broadcast to all clients in each round, ensuring consistency of the anchoring mechanism across the federation.

This approach guarantees that for any class, clients align their local feature spaces to the same reference, suppressing prototype divergence due to data distributional differences.

5. Theoretical Properties and Underlying Assumptions

ERPA is predicated on the existence of a small public dataset whose label classes partially overlap with the total label space. Notably, the embedding of public samples is performed exclusively at the client side—no forward passes on raw data occur at the server, preserving data privacy. The prototype aggregation strategies—uniform for $p^{ext}$ and data-size-weighted for $p^{g}$—result in unbiased estimators of class-wise feature means under reasonable sampling distributions.
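The unbiasedness of the data-weighted aggregation can be checked numerically: weighting per-client prototypes by data volume recovers exactly the empirical mean over the pooled samples, i.e. the estimator the server would form if it could see all the private data directly.

```python
import numpy as np

rng = np.random.default_rng(1)
# three clients holding different amounts of private data for one class c
shards = [rng.normal(size=(n, 16)) for n in (3, 10, 7)]

# per-client prototypes p_{k,c}^t and the data-weighted aggregate p_c^{g,t}
protos = np.stack([s.mean(axis=0) for s in shards])
counts = np.array([len(s) for s in shards], dtype=float)
p_global = (counts / counts.sum()) @ protos

# identical to the empirical mean over the pooled samples
pooled_mean = np.concatenate(shards).mean(axis=0)
```

A uniform aggregation would instead bias the anchor toward clients with few samples, which is why uniform weights are reserved for the shared public set, where every client embeds the same samples.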

While no formal proof of convergence is offered for ERPA in isolation, empirical evidence indicates stabilization of global features and accelerated convergence of the overall RefProtoFL framework in highly non-IID scenarios. The explicit prototype-alignment term acts analogously to a variance-reducing regularizer within the feature space.

6. Empirical Evaluation and Impact

Ablation studies on CIFAR-10 (Dirichlet $\alpha = 0.5$) demonstrate quantifiable benefits attributable to ERPA. Full RefProtoFL achieves 45.51% Top-1 accuracy; omitting ERPA reduces performance by 0.89% absolute (to 44.62%), exceeding the decrement observed when APUD is omitted (–0.14%). With both APUD and ERPA omitted, accuracy falls further to 44.54%. These results identify ERPA as the principal contributor to the performance gains, mitigating prototype drift and ensuring cross-client consistency.

On a broader benchmark covering MNIST, FashionMNIST, CIFAR-10, and CIFAR-100, RefProtoFL (with ERPA) attains an average accuracy of 60.63%, surpassing the leading prototype-based baseline FedProto at 60.11%. Gains are accentuated under severe heterogeneity (e.g., +1.18% on CIFAR-10 at $\alpha = 0.5$), indicating ERPA's efficacy on highly skewed client data distributions.

7. Significance and Implications

ERPA advances federated learning by introducing a lightweight, communication-efficient protocol for semantic alignment that does not require sharing model backbones or raw data. Its reliance on minimal public data distinguishes ERPA from prior approaches necessitating large shared corpora or full parameter aggregation. The method’s explicit partitioning of class references—external for public, global for private—enables robust alignment across a variety of data partitioning regimes.

A plausible implication is that ERPA's framework may be extensible to broader settings where public data only partially overlap the full task distribution or where privacy constraints are especially stringent. Its demonstrated stabilization of representations under non-IID conditions suggests utility in more general decentralized learning scenarios, although further theoretical characterization may be warranted (Wu et al., 21 Jan 2026).
