MoEGCL: Mixture Ego-Graph Contrastive Learning
- The paper introduces MoEGCL, a method that constructs 5 distinct ego-graph views per node and uses contrastive losses to maximize mutual information.
- MoEGCL employs shared GCN encoders and a mixture-of-experts gating mechanism to fuse multi-view embeddings, enhancing local structure capture and feature expressivity.
- Empirical findings show MoEGCL outperforms existing methods in node classification, link prediction, and multi-view clustering, validating its innovative design.
Ego Graph Contrastive Learning (EGCL) encompasses a class of methods that employ node-centered ego-graph structures to realize contrastive representation learning in graph-based problems. Recent implementations, such as the MoEGCL frameworks (Li et al., 2023, Zhu et al., 8 Nov 2025), have formalized multi-subgraph sampling and mutual information maximization protocols to boost the expressivity and utility of node or sample representations for tasks ranging from self-supervised node classification to multi-view clustering.
1. Formal Definitions of Ego-Graphs and Their Role in Representation Learning
"Editor’s term": Ego-graph—refers to a node-centric subgraph capturing the local topology and semantics around a central node. In MoEGCL (Li et al., 2023), for an input graph with nodes and node features , each node is associated with ego-graphs providing different structural perspectives:
- Basic (core) subgraph: isolates .
- 1-hop neighborhood: , capturing immediate neighbors.
- Intimate subgraph: top- nodes by Personalized PageRank similarity.
- Communal subgraph: all nodes in the same cluster as , with cluster membership determined via differentiable K-means.
- Full subgraph: the entire graph, where an embedding mixture preserves individuality.
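A minimal construction sketch of these five views, assuming an undirected NetworkX graph, per-node PPR scores, and precomputed cluster assignments; the function and parameter names (`ego_graph_views`, `top_k`, `ppr_alpha`) are illustrative rather than taken from the paper.

```python
# Sketch of the five ego-graph views per node (illustrative, not the authors' code).
import networkx as nx


def ego_graph_views(G, node, cluster_of, top_k=10, ppr_alpha=0.85):
    """Return the five ego-graph views of `node` as subgraphs of `G`."""
    # 1. Basic (core) subgraph: the central node alone.
    core = {node}

    # 2. 1-hop neighborhood: the node plus its immediate neighbors.
    one_hop = {node} | set(G.neighbors(node))

    # 3. Intimate subgraph: top-k nodes ranked by PPR similarity to `node`.
    ppr = nx.pagerank(G, alpha=ppr_alpha, personalization={node: 1.0})
    intimate = {n for n, _ in sorted(ppr.items(), key=lambda x: -x[1])[:top_k]}

    # 4. Communal subgraph: all nodes sharing `node`'s cluster label
    #    (MoEGCL obtains the assignment from differentiable K-means).
    communal = {n for n in G.nodes if cluster_of[n] == cluster_of[node]}

    # 5. Full subgraph: the entire graph.
    full = set(G.nodes)

    return [G.subgraph(s) for s in (core, one_hop, intimate, communal, full)]
```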
In multi-view clustering scenarios (Zhu et al., 8 Nov 2025), the ego-graph of a sample in a given view is encoded as its adjacency (row) vector, extracted from a k-NN graph built on that view's learned features. This establishes a modular, instance-level graph context for each sample and view.
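A short sketch of this per-view construction, assuming each view's learned features are available as a NumPy array and using scikit-learn's kneighbors_graph; the function name and the symmetrization step are assumptions for illustration.

```python
# Per-view ego-graphs as rows of a k-NN adjacency matrix (illustrative sketch).
import numpy as np
from sklearn.neighbors import kneighbors_graph


def view_ego_adjacency(features_per_view, k=10):
    """Return one dense (n, n) adjacency per view; row i is the ego-graph
    adjacency vector of sample i in that view."""
    adjs = []
    for Z in features_per_view:                   # Z: (n, d_v) learned features
        A = kneighbors_graph(Z, n_neighbors=k,    # sparse 0/1 k-NN graph
                             mode="connectivity", include_self=True)
        A = A.toarray()
        adjs.append(np.maximum(A, A.T))           # symmetrize
    return adjs
```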
2. Construction and Fusion of Multiple Ego-Graphs
The fine-grained construction and fusion of ego-graphs are pivotal to EGCL's representational power:
- MoEGCL (Li et al., 2023): Constructs the five node-centered subgraphs per node, independently encodes each via a shared GCN, and pools the resulting embeddings.
- MoEGF (Zhu et al., 8 Nov 2025): For multi-view clustering, fuses per-view ego-graphs via a "mixture-of-experts" gating mechanism. Each sample's concatenated view embeddings are passed through an MLP to produce softmax gating weights over the views; the fused ego-graph adjacency vector of that sample is the gate-weighted sum of its per-view adjacency vectors, yielding a fused graph for downstream GCN encoding (see the sketch below).
This protocol bypasses coarse view-level fusion, enabling sample-level fusion that preserves heterogeneous local structures and relationships specific to each node or sample.
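A PyTorch sketch of this sample-level fusion under assumed tensor shapes; the class name, hidden size, and forward signature are placeholders, not the released implementation.

```python
# Mixture-of-experts fusion of per-view ego-graphs (illustrative sketch).
import torch
import torch.nn as nn


class EgoGraphFusion(nn.Module):
    def __init__(self, embed_dim, n_views, hidden=128):
        super().__init__()
        # Gating MLP: concatenated view embeddings -> one weight per view.
        self.gate = nn.Sequential(
            nn.Linear(n_views * embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_views),
        )

    def forward(self, view_embeds, view_adjs):
        # view_embeds: list of (n, d) tensors, one per view
        # view_adjs:   list of (n, n) ego-graph adjacency matrices, one per view
        h = torch.cat(view_embeds, dim=1)             # (n, V*d)
        w = torch.softmax(self.gate(h), dim=1)        # (n, V) per-sample gate weights
        A = torch.stack(view_adjs, dim=0)             # (V, n, n)
        # Fused adjacency row of sample i: gate-weighted sum of its per-view rows.
        return torch.einsum("nv,vnm->nm", w, A)       # (n, n) fused graph
```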
3. Contrastive Objectives and Mutual Information Maximization
Contrastive learning within EGCL frameworks aims to maximize the mutual information between different views of the same node or sample, while minimizing alignment with corruptions or negatives:
- MoEGCL (Li et al., 2023): Employs a readout function to pool each subgraph embedding alongside a corrupted (negative) counterpart. The contrastive losses implement either:
- Core-View (CV): Contrasts basic vs. all other subgraphs.
- Full-Graph (FG): Contrasts all pairs among the ego-graphs.
- Both utilize a bilinear discriminator and a binary cross-entropy loss (illustrated in the sketch below).
- EGCL module (Zhu et al., 8 Nov 2025): Projects both the fused GCN outputs and the view-specific features into a common latent space and measures agreement with cosine similarity; the EGCL loss contrasts fused and view-specific projections of the same sample against those of other samples and clusters.
This penalizes cases where fused and view-specific representations disagree for different clusters, thereby enforcing both instance-level and cluster-level alignment.
This suggests that EGCL frameworks structurally encourage embeddings to reflect both individual identity and shared cluster membership, enabled by ego-graph-aware loss formulations.
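Illustrative PyTorch sketches of both objectives: a DGI-style bilinear discriminator trained with binary cross-entropy, and an instance-level cosine-alignment term standing in for the EGCL loss (the paper's full loss additionally enforces cluster-level alignment). Shapes, names, and the InfoNCE-style formulation of the second term are assumptions.

```python
# Contrastive objectives (illustrative sketches, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class BilinearContrast(nn.Module):
    """Bilinear discriminator scoring (embedding, summary) pairs, trained with
    binary cross-entropy against corrupted negatives."""

    def __init__(self, dim):
        super().__init__()
        self.disc = nn.Bilinear(dim, dim, 1)
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, h_pos, h_neg, summary):
        # h_pos, h_neg: (n, d) clean / corrupted embeddings; summary: (d,) readout
        s = summary.expand_as(h_pos)
        pos = self.disc(h_pos, s).squeeze(-1)         # should score high
        neg = self.disc(h_neg, s).squeeze(-1)         # should score low
        labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
        return self.bce(torch.cat([pos, neg]), labels)


def egcl_instance_loss(z_fused, z_view, temperature=0.5):
    """Instance-level cosine alignment between fused and view-specific
    projections; positives are the two projections of the same sample."""
    z_fused = F.normalize(z_fused, dim=1)
    z_view = F.normalize(z_view, dim=1)
    sim = z_fused @ z_view.t() / temperature          # (n, n) cosine similarities
    targets = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, targets)
```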
4. Encoder Architectures and Model Implementation
EGCL models rely on GCN architectures for encoding subgraphs or fused graphs:
- MoEGCL (Li et al., 2023):
- Transductive tasks employ a single-layer GCN encoder (normalized-adjacency propagation of node features followed by a learned linear transform and nonlinearity).
- Inductive tasks use a residual GCN encoder, adding a skip connection from the transformed input features to the propagated output.
- MoEGCL (Zhu et al., 8 Nov 2025): After sample-level graph fusion, a two-layer GCN encodes the fused graph (see the sketch below).
In both cases, readout, projection, and gating MLPs are employed for view aggregation and alignment.
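A minimal sketch of these encoders over a dense (fused) adjacency matrix: a standard graph-convolution layer and a two-layer stack. This is an assumed reimplementation for illustration, not the released code.

```python
# GCN building blocks for encoding (fused) graphs (illustrative sketch).
import torch
import torch.nn as nn


def normalize_adj(A):
    """Symmetrically normalize an adjacency matrix after adding self-loops."""
    A_hat = A + torch.eye(A.size(0), device=A.device)
    d_inv_sqrt = torch.diag(A_hat.sum(dim=1).pow(-0.5))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt


class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, A_norm, H):
        return A_norm @ self.lin(H)                   # transform, then propagate


class TwoLayerGCN(nn.Module):
    """Two-layer GCN of the kind applied to the fused graph."""

    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hid_dim)
        self.gc2 = GCNLayer(hid_dim, out_dim)

    def forward(self, A, X):
        A_norm = normalize_adj(A)
        return self.gc2(A_norm, torch.relu(self.gc1(A_norm, X)))
```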
5. Training Algorithms, Hyperparameters, and Regularization
Training proceeds in epoch-wise cycles using Adam optimizers:
- MoEGCL (Li et al., 2023):
- Uses the five ego-graph views per node, with hyperparameters controlling the neighborhood range, the PPR ranking cutoff, the number of clusters, the PPR damping factor, and a mixing coefficient; early stopping and neighborhood sampling are applied for large-scale graphs.
- Corruption techniques include diffusion on the graph adjacency, feature shuffling, and graph swapping.
- MoEGCL (Zhu et al., 8 Nov 2025):
- Pre-trains MLP autoencoders per view, then fine-tunes the fusion and contrastive modules for 300 epochs (batch size 256, learning rate 3e-4, plus additional fixed hyperparameters).
- Employs a reconstruction loss for fidelity and the ego-graph contrastive loss for alignment; no adversarial or norm-based regularizers are used.
Clustering is performed post-training with k-means on projected fused representations.
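A hedged sketch of this fine-tuning-and-clustering procedure; the model's forward interface, loss callables, and data loader are placeholders, and only the optimizer, epoch count, batch size, and learning rate come from the description above.

```python
# Fine-tuning loop plus post-training k-means (illustrative sketch).
import torch
from sklearn.cluster import KMeans


def finetune_and_cluster(model, loader, recon_loss, egcl_loss,
                         n_clusters, epochs=300, lr=3e-4, device="cpu"):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for batch in loader:                          # e.g. batches of 256 samples
            batch = batch.to(device)
            # Placeholder forward pass: projected fused embedding, per-view
            # embeddings, and reconstructions.
            z_fused, z_views, recon = model(batch)
            loss = recon_loss(recon, batch) + egcl_loss(z_fused, z_views)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Clustering on the projected fused representations.
    with torch.no_grad():
        Z = torch.cat([model(b.to(device))[0] for b in loader]).cpu().numpy()
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(Z)
```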
6. Empirical Outcomes and Ablation Studies
EGCL and MoEGCL modules have demonstrated state-of-the-art performance across a range of tasks and datasets:
- Node classification (Li et al., 2023): On Cora, MNCSCL-CV reached 84.7% accuracy; MNCSCL-FG and MNCSCL-CV outperformed DGI, GMI, GIC, GRACE, MVGRL, and even matched supervised GCNs.
- Link prediction (Li et al., 2023): AUC/AP scores up to 94.8/94.2, surpassing GIC by 1.3–1.5 points.
- Multi-view clustering (Zhu et al., 8 Nov 2025): Achieved highest scores on Caltech5V, MNIST, LGG, WebKB, RGBD, and LandUse—e.g., MNIST ACC/NMI/PUR = 0.9920/0.9747/0.9920; Caltech5V ACC = 0.8207.
- Ablation (Zhu et al., 8 Nov 2025): Removing MoEGF, EGCL, MoE gating, or GCN causes significant drops (up to 24% in ACC), indicating the necessity of each component.
- A plausible implication is that the MoEGCL architecture's fine-grained fusion and contrastive alignment mechanisms are directly responsible for its empirical gains.
7. Significance, Implications, and Limitations
EGCL, as instantiated in MoEGCL, advances the paradigm of contrastive representation learning on graphs by:
- Enabling multiple ego-graph views per node or sample, rather than a single, ambiguously defined neighborhood.
- Allowing instance- and cluster-level contrastive objectives, yielding robust, transferable node/sample representations.
- Achieving fine-grained, sample-level information fusion critical for multi-view graph applications, markedly outperforming conventional view-level graph fusion.
- Avoiding adversarial or norm-based regularizers, relying instead on structural and semantic alignment via ego-graph contrastive losses.
Performance stability across hyperparameter choices is reported, and convergence occurs within 400 epochs in practice.
However, large-scale graphs and datasets require careful batching and neighborhood sampling to manage computational footprint. All gains depend crucially on methodical construction and fusion of ego-graphs, as demonstrated by extensive ablation.
In conclusion, Ego Graph Contrastive Learning establishes a foundation for sophisticated graph representation learning, leveraging multi-perspective local structures and contrastive mutual information objectives for superior performance in node-centric and clustering tasks, with continued research promising further generalizations and optimizations (Li et al., 2023, Zhu et al., 8 Nov 2025).