Contrastive Multi-View Representation Learning on Graphs (2006.05582v1)

Published 10 Jun 2020 in cs.LG and stat.ML

Abstract: We introduce a self-supervised approach for learning node and graph level representations by contrasting structural views of graphs. We show that unlike visual representation learning, increasing the number of views to more than two or contrasting multi-scale encodings do not improve performance, and the best performance is achieved by contrasting encodings from first-order neighbors and a graph diffusion. We achieve new state-of-the-art results in self-supervised learning on 8 out of 8 node and graph classification benchmarks under the linear evaluation protocol. For example, on Cora (node) and Reddit-Binary (graph) classification benchmarks, we achieve 86.8% and 84.5% accuracy, which are 5.5% and 2.4% relative improvements over previous state-of-the-art. When compared to supervised baselines, our approach outperforms them in 4 out of 8 benchmarks. Source code is released at: https://github.com/kavehhassani/mvgrl

Citations (1,176)

Summary

  • The paper introduces a contrastive framework that maximizes mutual information between dual graph views, enhancing both node and graph representations.
  • It employs tailored graph encoders and a discriminator, achieving 86.8% accuracy on Cora and notable improvements on other benchmarks.
  • The approach demonstrates robust generalization in unsupervised settings, reducing the dependency on costly labeled data for complex graph tasks.

Contrastive Multi-View Representation Learning on Graphs

In recent years, there has been a growing interest in self-supervised learning methods, especially on graphs, to reduce the dependency on extensive labeled datasets. "Contrastive Multi-View Representation Learning on Graphs," authored by Kaveh Hassani and Amir Hosein Khasahmadi, contributes to this line of work by proposing a novel contrastive framework tailored for graph representation learning. This framework employs multiple views of graph structures to learn embeddings for both node-level and graph-level tasks.

Problem Statement and Motivation

Graph Neural Networks (GNNs) have shown significant promise in learning representations from graph-structured data. However, these models typically require labeled data for training, which is a major bottleneck because annotating graph data is often more difficult and costly than annotating other modalities such as images or text. To address this, the authors explore self-supervised learning approaches that rely not on external labels but on the structural properties of the graphs themselves.

Methodology

The core idea of this work is to maximize the mutual information (MI) between node and graph level representations derived from different structural views of the same graph. The authors specifically examine:

  1. Augmentations: Generating multiple views of the graph structure via adjacency and diffusion matrices.
  2. Encoders: Learning high-dimensional representations using dedicated GNNs and Multi-Layer Perceptrons (MLPs).
  3. Discriminator: Contrasting node representations from one view with graph representations from another view using a discriminator network to compute the MI.
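
As an illustration of the augmentation step (item 1 above), the following is a minimal NumPy/SciPy sketch of the two diffusion operators the paper names, Personalized PageRank and the heat kernel. The function names and default hyperparameters are illustrative assumptions rather than values from the released code, and a dense adjacency matrix with no isolated nodes is assumed.

```python
import numpy as np
from scipy.linalg import expm

def ppr_diffusion(adj: np.ndarray, alpha: float = 0.15) -> np.ndarray:
    """Personalized PageRank diffusion: S = alpha * (I - (1 - alpha) * D^{-1/2} A D^{-1/2})^{-1}.

    Assumes a dense, symmetric adjacency matrix with no isolated nodes;
    alpha is the teleport probability (0.15 is a common default, an assumption here).
    """
    n = adj.shape[0]
    d_inv_sqrt = np.diag(1.0 / np.sqrt(adj.sum(axis=1)))
    a_norm = d_inv_sqrt @ adj @ d_inv_sqrt           # symmetrically normalized adjacency
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * a_norm)

def heat_diffusion(adj: np.ndarray, t: float = 5.0) -> np.ndarray:
    """Heat-kernel diffusion: S = exp(t * A D^{-1} - t), with t the diffusion time."""
    n = adj.shape[0]
    d_inv = np.diag(1.0 / adj.sum(axis=1))
    return expm(t * (adj @ d_inv) - t * np.eye(n))   # matrix exponential of the transition matrix
```

The adjacency view and whichever diffusion matrix is chosen then serve as the two inputs to the view-specific encoders described next.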

The methodology is organized into four main components:

  1. Augmentation Mechanism: The authors generate two structural views of the same graph: the original adjacency matrix, capturing first-order (local) neighborhoods, and a diffusion matrix computed from it via Personalized PageRank or the heat kernel, capturing global structure. Contrasting these views lets the model leverage both local and global graph information.
  2. Graph Encoders: For each view, separate encoder networks (GCNs) are used to learn node embeddings. Representations are projected through shared MLPs to ensure consistent feature dimensions.
  3. Graph Pooling Layer: Node embeddings from each encoder are aggregated into graph-level representations using a simple yet effective readout function. This readout, similar in spirit to JK-Net, was found to outperform hierarchical pooling methods such as DiffPool.
  4. Mutual Information Maximization: A discriminator contrasts representations from different views, optimizing the MI, thus facilitating the learning of richer representations.
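
A compact PyTorch sketch of how components 2-4 fit together is given below. It compresses the paper's multi-layer GCN encoders and JK-style readout into single-layer encoders with a simple mean readout, and the class and variable names are my own, so it should be read as a structural illustration under those assumptions rather than the released implementation.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer: H' = PReLU(A_norm @ H @ W)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.act = nn.PReLU()

    def forward(self, a_norm: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        return self.act(a_norm @ self.linear(h))

class TwoViewContrastSketch(nn.Module):
    """One encoder per view, a shared projection MLP, a simple readout,
    and a bilinear discriminator that scores (node, graph-summary) pairs."""
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.enc_adj = GCNLayer(in_dim, hid_dim)       # encoder for the adjacency view
        self.enc_diff = GCNLayer(in_dim, hid_dim)      # encoder for the diffusion view
        self.project = nn.Sequential(                  # shared MLP projection head
            nn.Linear(hid_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, hid_dim))
        self.disc = nn.Bilinear(hid_dim, hid_dim, 1)   # discriminator D(node, graph)

    def forward(self, x, a_norm, s_diff):
        h1 = self.project(self.enc_adj(a_norm, x))     # node embeddings, adjacency view
        h2 = self.project(self.enc_diff(s_diff, x))    # node embeddings, diffusion view
        g1 = torch.sigmoid(h1.mean(dim=0))             # graph summary of view 1 (simple readout)
        g2 = torch.sigmoid(h2.mean(dim=0))             # graph summary of view 2
        # Local-global contrast across views: nodes of one view vs. the summary of the other.
        score_1 = self.disc(h1, g2.expand_as(h1))
        score_2 = self.disc(h2, g1.expand_as(h2))
        return h1, h2, score_1, score_2
```

Positive pairs score node embeddings of a graph against the other view's summary of the same graph; negatives are typically obtained by scoring embeddings of corrupted inputs (e.g., row-shuffled node features) or of other graphs in the batch against the same summaries.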

Experimental Results

The authors' approach is rigorously tested on several node and graph classification benchmarks:

  • Node Classification: On datasets such as Cora, Citeseer, and Pubmed, their method achieves significant improvements. For instance, on Cora, they achieve 86.8% accuracy, which is a 5.5% relative improvement over previous state-of-the-art unsupervised models.
  • Graph Classification: Their approach also demonstrates superior performance on graph classification tasks across various datasets (MUTAG, PTC-MR, IMDB-Binary, IMDB-Multi, Reddit-Binary). On Reddit-Binary, they report an 84.5% accuracy, marking a 2.4% relative improvement.

Compared against strong supervised baselines such as GCN and GAT, their method outperforms them on 4 of the 8 benchmarks, highlighting the efficacy of the contrastive learning approach.
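
All of the unsupervised numbers above are reported under the linear evaluation protocol: the encoder is frozen and only a linear classifier is trained on the learned embeddings. A generic sketch of that protocol is shown below; the paper uses task-specific classifiers and splits per benchmark, so the cross-validated logistic regression here is a stand-in assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def linear_evaluation(embeddings: np.ndarray, labels: np.ndarray, folds: int = 5) -> float:
    """Accuracy of a linear classifier trained on frozen (pre-computed) embeddings."""
    clf = LogisticRegression(max_iter=1000)            # linear probe; the encoder stays fixed
    scores = cross_val_score(clf, embeddings, labels, cv=folds, scoring="accuracy")
    return float(scores.mean())
```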

Ablation Studies

The authors provide comprehensive ablation studies to underline the rationale behind their design choices:

  • Mutual Information Estimator: The Jensen-Shannon divergence (JSD) estimator was found to be consistently effective across most benchmark datasets.
  • Contrastive Modes: Local-global contrast (contrasting node representations with graph representations) outperformed global-global or multi-scale contrast methods.
  • Number of Views: Surprisingly, increasing the number of views beyond two did not improve performance, contrary to observations in visual representation learning.
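
For concreteness, the JSD-based objective used for such local-global contrast can be written as a binary-cross-entropy-style loss over discriminator scores. A minimal sketch follows; the positive and negative scores are assumed to come from a discriminator like the bilinear one sketched earlier, and the helper name is illustrative.

```python
import torch
import torch.nn.functional as F

def jsd_contrastive_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    """Negative Jensen-Shannon MI estimate: minimizing this maximizes
    E_pos[-softplus(-D)] - E_neg[softplus(D)]."""
    pos_term = F.softplus(-pos_scores).mean()   # equals -log(sigmoid(D)) on positive pairs
    neg_term = F.softplus(neg_scores).mean()    # equals -log(1 - sigmoid(D)) on negative pairs
    return pos_term + neg_term
```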

Implications and Future Directions

This work has several critical implications:

  1. Graph Representation Learning: It advances the state-of-the-art in self-supervised graph learning, proving that contrastive learning using structural graph views can yield rich node and graph representations.
  2. Generalization: The method's success across diverse graph datasets underscores its generalizability and robustness.
  3. Practical Utility: The technique is particularly useful for applications in domains where labeled data is scarce, such as biology and social network analysis.

For future work, the authors suggest exploring large-scale pre-training and transfer learning capabilities of their model, which could further enhance its applicability in real-world scenarios.

Conclusion

The paper "Contrastive Multi-View Representation Learning on Graphs" delivers a significant advancement in self-supervised graph representation learning. By leveraging contrastive learning with multiple structural views, the authors provide a scalable and effective approach that surpasses existing methods on numerous benchmarks. This work lays a solid foundation for future research in this domain, particularly in leveraging unsupervised techniques for complex graph-structured data.
