Mixture of Ego-Graphs Fusion (MoEGF)

Updated 13 November 2025
  • MoEGF is a fine-grained graph fusion mechanism that adaptively combines per-sample ego-graphs via a Mixture-of-Experts approach.
  • It constructs KNN-based ego-graphs from multiple views and employs a gating MLP to generate a fused adjacency matrix for GNN processing.
  • Empirical results show significant accuracy improvements over traditional view-level fusion, underscoring its practical impact on multi-view clustering.

Mixture of Ego-Graphs Fusion (MoEGF) is a fine-grained graph fusion mechanism designed for multi-view clustering within the Mixture of Ego-Graphs Contrastive Representation Learning (MoEGCL) framework. Diverging from traditional view-level fusion, MoEGF aggregates per-sample (ego-graph) structures from multiple data views using a Mixture-of-Experts (MoE) paradigm. This design enables adaptive, sample-specific fusion of ego-graphs to produce a fused adjacency matrix for downstream graph neural network (GNN) processing, substantially enhancing clustering performance by capturing localized multi-view interactions (Zhu et al., 8 Nov 2025).

1. Mathematical Formulation of MoEGF

Given $M$ data views, each sample $i$'s representation in view $m$ is encoded as $z_i^m = f^m(x_i^m) \in \mathbb{R}^{d_\psi}$. For each view, a $k$-nearest-neighbor (KNN) adjacency matrix $S^m \in \{0,1\}^{N \times N}$ is built:

$$S_{ij}^m = \begin{cases} 1 & \text{if } j \in K_i^m \\ 0 & \text{otherwise} \end{cases}$$

where $K_i^m$ denotes the set of the $k$ nearest neighbors of sample $i$ in view $m$.

The sample's ego-graph in view $m$ is the binary vector $V_i^m := (S_{i1}^m, S_{i2}^m, \dots, S_{iN}^m) \in \{0,1\}^N$.
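
As a concrete illustration, the following PyTorch sketch builds the binary matrix $S^m$ (whose rows are the ego-graphs $V_i^m$) for a single view; the function name, the Euclidean distance metric, and the exclusion of self-loops are assumptions made for illustration, not details reported in the paper.

```python
import torch

def knn_ego_graphs(z_m: torch.Tensor, k: int) -> torch.Tensor:
    """Build the binary ego-graph matrix S^m for one view from embeddings z_m of shape (N, d_psi)."""
    z_m = z_m.detach()                             # the binary graph itself is not differentiated through
    n = z_m.size(0)
    dist = torch.cdist(z_m, z_m)                   # (N, N) pairwise Euclidean distances
    dist.fill_diagonal_(float("inf"))              # exclude each sample from its own neighbour set
    knn_idx = dist.topk(k, largest=False).indices  # (N, k) indices forming K_i^m
    S_m = torch.zeros(n, n, device=z_m.device)
    S_m.scatter_(1, knn_idx, 1.0)                  # S^m_{ij} = 1 for j in K_i^m, else 0
    return S_m
```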

The concatenated embedding $z_i := [z_i^1; z_i^2; \ldots; z_i^M] \in \mathbb{R}^{M d_\psi}$ serves as the gating input to a two-layer MLP, yielding softmax weights $C_i \in \mathbb{R}^M$:

$$C_i = \mathrm{softmax}\big(\mathrm{mlp}^{(1)}(z_i)\big)$$

The fused ego-graph vector for sample $i$ is then the convex combination:

$$V_i = \sum_{m=1}^M C_i^m V_i^m$$

The stacked set of $V_i$ forms the fused adjacency matrix $S \in \mathbb{R}^{N \times N}$.
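
A minimal sketch of the gating and fusion steps is given below, assuming a hidden width of 256 and ReLU activation for the gating MLP (the paper specifies a two-layer MLP with dropout $p = 0.1$ but not these details); `EgoGraphGate` is an illustrative name, not the authors' module.

```python
import torch
import torch.nn as nn

class EgoGraphGate(nn.Module):
    """Sample-wise Mixture-of-Experts gating over M view-specific ego-graphs (illustrative sketch)."""

    def __init__(self, m_views: int, d_psi: int, d_hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(                  # two-layer gating MLP, mlp^(1)
            nn.Linear(m_views * d_psi, d_hidden),
            nn.ReLU(),
            nn.Dropout(p=0.1),
            nn.Linear(d_hidden, m_views),
        )

    def forward(self, z_cat: torch.Tensor, ego_graphs: torch.Tensor) -> torch.Tensor:
        # z_cat:      (N, M * d_psi) concatenated per-view embeddings z_i
        # ego_graphs: (M, N, N) stacked binary ego-graph matrices S^m
        C = torch.softmax(self.mlp(z_cat), dim=-1)     # (N, M) sample-wise mixture weights C_i
        # Fused row V_i = sum_m C_i^m V_i^m, stacked into the fused adjacency S of shape (N, N).
        return torch.einsum("nm,mnj->nj", C, ego_graphs)
```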

To incorporate feature information, a two-layer GCN is applied:

$$\tilde{S} = I_N + S, \qquad \tilde{D}_{ii} = \sum_j \tilde{S}_{ij}$$

$$\tilde{Z} = \sigma\left( \tilde{D}^{-1/2} \tilde{S} \tilde{D}^{-1/2} \, \sigma\left( \tilde{D}^{-1/2} \tilde{S} \tilde{D}^{-1/2} Z W^0 \right) W^1 \right)$$

where $Z = [z_1; \dots; z_N]$ and $W^0, W^1$ are learnable parameters.
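
The two-layer GCN over the fused adjacency can be sketched as follows; using ReLU for $\sigma$ and omitting bias terms are assumptions.

```python
import torch
import torch.nn as nn

class FusedGCN(nn.Module):
    """Two-layer GCN over the fused adjacency S (minimal sketch of the formula above)."""

    def __init__(self, d_in: int, d_hidden: int, d_out: int):
        super().__init__()
        self.w0 = nn.Linear(d_in, d_hidden, bias=False)   # plays the role of W^0
        self.w1 = nn.Linear(d_hidden, d_out, bias=False)  # plays the role of W^1

    @staticmethod
    def normalize(S: torch.Tensor) -> torch.Tensor:
        S_tilde = S + torch.eye(S.size(0), device=S.device)         # S~ = I_N + S
        d_inv_sqrt = S_tilde.sum(dim=1).clamp(min=1e-12).rsqrt()    # diagonal of D~^{-1/2}
        return d_inv_sqrt[:, None] * S_tilde * d_inv_sqrt[None, :]  # D~^{-1/2} S~ D~^{-1/2}

    def forward(self, Z: torch.Tensor, S: torch.Tensor) -> torch.Tensor:
        A = self.normalize(S)
        H = torch.relu(A @ self.w0(Z))     # inner layer: sigma(A Z W^0), with sigma = ReLU (assumption)
        return torch.relu(A @ self.w1(H))  # outer layer gives the structure-aware embeddings Z~
```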

2. Algorithmic Implementation

A typical training epoch for MoEGF within MoEGCL, assuming minibatch size $B$, comprises the following steps (a sketch tying them together appears after the list):

  1. Encoding: For each sample and each view, compute $z_i^m = f^m(x_i^m)$; form the concatenation $z_i$.
  2. KNN Construction: For each view, build $V_i^m$ as row $i$ of $S^m$.
  3. Gating: Apply $\mathrm{mlp}^{(1)}$ and softmax to $z_i$, outputting $C_i$.
  4. Fusion: Compute $V_i$ as the weighted sum of the $V_i^m$ using $C_i$.
  5. Adjacency Assembly: Assemble $\{V_i\}$ into the fused adjacency $S_\text{batch}$.
  6. GCN Forward: Apply the two-layer GCN to obtain $\tilde{Z}_\text{batch}$.
  7. Projection Heads: Apply separate MLPs to $\tilde{z}_i$ (yielding $\hat{h}_i$) and to $z_i^m$ (yielding $h_i^m$).
  8. Loss Computation: Compute the autoencoder reconstruction loss $\mathcal{L}_\text{Rec}$ and the ego-graph contrastive loss $\mathcal{L}_\text{Egc}$.
  9. Optimization: Form the total loss $\mathcal{L} = \mathcal{L}_\text{Rec} + \lambda \mathcal{L}_\text{Egc}$, backpropagate, and update all parameters.
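
The sketch below implements one such training iteration, reusing the `knn_ego_graphs`, `EgoGraphGate`, and `FusedGCN` sketches from Section 1 and the contrastive-loss sketch from Section 5; the linear encoders, decoders, projection heads, and dimensions are illustrative placeholders rather than the published architecture.

```python
import torch
import torch.nn as nn

# Illustrative components and dimensions (placeholders, not the published architecture).
M, B, d_x, d_psi, d_g, k, lam = 3, 256, 100, 512, 128, 8, 1.0
encoders = nn.ModuleList([nn.Linear(d_x, d_psi) for _ in range(M)])
decoders = nn.ModuleList([nn.Linear(d_psi, d_x) for _ in range(M)])
gate = EgoGraphGate(M, d_psi)                  # Section 1 sketch
gcn = FusedGCN(M * d_psi, 256, d_g)            # Section 1 sketch
fused_head = nn.Linear(d_g, d_g)               # projection producing h_hat_i
view_heads = nn.ModuleList([nn.Linear(d_psi, d_g) for _ in range(M)])
params = (list(encoders.parameters()) + list(decoders.parameters()) +
          list(gate.parameters()) + list(gcn.parameters()) +
          list(fused_head.parameters()) + list(view_heads.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

def train_step(x_views):                       # x_views: list of M tensors, each (B, d_x)
    z_views = [enc(x) for enc, x in zip(encoders, x_views)]        # step 1: encoding
    z_cat = torch.cat(z_views, dim=-1)                             # concatenated z_i
    ego = torch.stack([knn_ego_graphs(z, k) for z in z_views])     # step 2: (M, B, B)
    S_batch = gate(z_cat, ego)                                     # steps 3-5: gating + fusion
    Z_tilde = gcn(z_cat, S_batch)                                  # step 6: GCN forward
    h_hat = fused_head(Z_tilde)                                    # step 7: projection heads
    h_views = torch.stack([head(z) for head, z in zip(view_heads, z_views)])
    rec = sum(((dec(z) - x) ** 2).mean()                           # step 8: reconstruction loss
              for dec, z, x in zip(decoders, z_views, x_views))
    egc = ego_graph_contrastive_loss(h_hat, h_views, S_batch)      # Section 5 sketch
    loss = rec + lam * egc                                         # step 9: total loss and update
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```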

The dominant computational cost per batch is $O(B^2 M)$ for the fusion and $O(B\, d_\psi\, d_g)$ for the feature transformations.

3. Comparison to View-Level Fusion Paradigms

Traditional deep multi-view clustering approaches construct one graph per view and perform graph fusion at the view level, assigning global weights to views and yielding a single mixture shared by all samples. In contrast, MoEGF outputs sample-wise fusion coefficients $C_i$, enabling personalized graph structures per sample.
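
To make the distinction concrete, the toy snippet below contrasts global view-level weights with MoEGF-style per-sample mixture weights; the random tensors are placeholders rather than data from the paper.

```python
import torch

M, N = 3, 6
ego = torch.randint(0, 2, (M, N, N)).float()        # toy per-view ego-graph stacks

# View-level fusion: one global weight per view, shared by every sample.
w_global = torch.softmax(torch.randn(M), dim=0)     # (M,)
S_view_level = torch.einsum("m,mnj->nj", w_global, ego)

# MoEGF-style fusion: a separate weight vector C_i per sample.
C = torch.softmax(torch.randn(N, M), dim=-1)        # (N, M)
S_sample_level = torch.einsum("nm,mnj->nj", C, ego)

# Every row of S_view_level uses the same view mixture, whereas each row of
# S_sample_level can mix the views differently for its own ego-graph.
```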

Empirical results show substantial accuracy gains from this design. For example, removing MoEGF and instead concatenating the $z_i^m$ leads to absolute clustering accuracy (ACC) drops of 37.6% (Caltech5V), 6% (RGBD), and 41% (WebKB). MoEGF outperforms state-of-the-art multi-view clustering baselines by over 8% ACC on WebKB and by 4–7% on RGBD (Zhu et al., 8 Nov 2025).

4. Design Decisions, Hyperparameters, and Implementation Notes

Key implementation features and hyperparameter choices are summarized below:

| Component | Setting | Notes |
| --- | --- | --- |
| Number of Experts | $K = M$ | One per view |
| Gating Network ($\mathrm{mlp}^{(1)}$) | Two-layer MLP, softmax output, dropout $p = 0.1$ | Hidden dim not specified; $d_\psi = 512$ used |
| KNN Graph (per view) | $k$-nearest neighbors, $k \in [5, 10]$ typical | Binary adjacency |
| Embedding Dimensions | $d_\psi = 512$, $d_\phi = 128$ | |
| Batch Size | $b = 256$ | |
| Training Epochs | $T_p = 200$ (pre-train), $T_f = 300$ (fine-tune) | |
| MoEGF Mixture Weights | Dense softmax, no regularizer | |

Implementation is amenable to minibatch parallelism and scales as $O(B^2 M)$ with batch size $B$ and number of views $M$, dominated by fusion and GCN costs. KNN adjacency and gating can be precomputed or batched for efficiency.
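
As one way to realize this, the per-view KNN neighbour lists can be computed once (here with scikit-learn, an assumed tooling choice) and reused when assembling ego-graphs inside the training loop; if the graphs are built on learned embeddings rather than raw features, they would need periodic recomputation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def precompute_knn_indices(z_m: np.ndarray, k: int) -> np.ndarray:
    """Precompute k-NN neighbour indices for one view (N, k), excluding each point itself."""
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(z_m)
    _, idx = nbrs.kneighbors(z_m)   # (N, k+1); column 0 is typically the query point itself
    return idx[:, 1:]               # drop the self index
```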

5. Integration Within the MoEGCL Framework

MoEGF operates immediately after per-view autoencoder embedding. It delivers the fused adjacency $S$ to a two-layer GCN, producing structure-aware node embeddings $\tilde{Z}$. The subsequent Ego Graph Contrastive Learning (EGCL) module aligns fused and per-view representations via the loss

$$\mathcal{L}_{\mathrm{Egc}} = -\frac{1}{2N} \sum_{i=1}^N \sum_{m=1}^M \log \frac{ \exp\!\left( \cos( \hat{h}_i, h_i^m ) / \tau \right) }{ \sum_{j=1}^N \exp\!\left( (1 - S_{ij}) \cos( \hat{h}_i, h_j^m ) / \tau \right) - \exp(1/\tau) }$$

Gradients from $\mathcal{L}_{\mathrm{Egc}}$ propagate through the GNN layers and the MoEGF gating MLP, ensuring that the fused structure is optimized for cluster-aware representation learning.
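
A direct (unoptimized) rendering of this loss, under the assumption that $\cos$ is taken on L2-normalized projections and with an illustrative temperature $\tau = 0.5$, might look as follows; in practice a small epsilon may be needed to keep the denominator positive.

```python
import math
import torch
import torch.nn.functional as F

def ego_graph_contrastive_loss(h_hat: torch.Tensor,
                               h_views: torch.Tensor,
                               S: torch.Tensor,
                               tau: float = 0.5) -> torch.Tensor:
    """Sketch of L_Egc as written above (tau = 0.5 is an assumed temperature).

    h_hat:   (N, d)    projections of the fused, structure-aware embeddings
    h_views: (M, N, d) per-view projections h^m
    S:       (N, N)    fused adjacency, down-weighting graph neighbours in the denominator
    """
    N = h_hat.size(0)
    h_hat_n = F.normalize(h_hat, dim=-1)
    loss = h_hat.new_zeros(())
    for h_m in h_views:                                   # sum over views m
        cos = h_hat_n @ F.normalize(h_m, dim=-1).t()      # (N, N) cosine similarities
        pos = torch.exp(cos.diagonal() / tau)             # exp(cos(h_hat_i, h_i^m) / tau)
        neg = torch.exp((1.0 - S) * cos / tau).sum(dim=1) - math.exp(1.0 / tau)
        loss = loss - torch.log(pos / neg).sum()          # -log of the ratio, summed over i
    return loss / (2 * N)
```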

6. Relation to Other Mixture-of-Experts Graph Methods

MoEGF advances prior fusion mechanisms for multi-view graph data in deep clustering tasks by providing an alternative to global or view-level fusion. The approach is conceptually related to the class of Mixture-of-Experts graph methods, including MoG (Zhang et al., 23 May 2024), which extend MoE strategies to graph sparsification and subgraph selection via per-node adaptive fusion. MoEGF, however, is specifically tailored for sample-level ego-graph combination, with direct integration into a contrastive clustering framework. Both approaches share the use of per-node/per-sample gating and fusion, but differ in fusion domains (ego-adjacency in MoEGF, Grassmannian spectral fusion in MoG).

A plausible implication is that the fundamental Mixture-of-Experts paradigm, when applied locally to ego-centric structures, generalizes beyond clustering to other graph learning problems, including efficient sparsification, node classification, and adaptive edge selection.

7. Empirical Performance and Observed Impact

On benchmark datasets, MoEGF within MoEGCL results in pronounced accuracy improvements over both naive and coarse-grained fusion strategies, as evinced by substantial drops in ACC upon ablation. The empirical findings underscore the significance of fine-grained, per-sample fusion in capturing the mutual reinforcement and complementarity of multi-view graph signals. The method’s flexible, differentiable construction further allows direct end-to-end optimization with contrastive learning objectives, significantly advancing state-of-the-art performance in multi-view clustering settings (Zhu et al., 8 Nov 2025).
