Augmentation-Free Self-Supervised Learning on Graphs
The paper "Augmentation-Free Self-Supervised Learning on Graphs" challenges the reliance on augmentation-based contrastive methods commonly used in self-supervised learning on graph-structured data. The authors highlight a key limitation of current methods: augmentations can arbitrarily alter a graph's semantics, with the outcome depending heavily on the chosen augmentation scheme and its hyperparameters. To address this, the paper presents an augmentation-free framework termed AFGRL (Augmentation-Free Graph Representation Learning).
Graph-structured data differs fundamentally from images: whereas augmentations such as rotation or cropping preserve an image's underlying semantics, even small perturbations to a graph can change its semantics in ways that are not easily discernible. Motivated by this challenge, AFGRL generates an alternative view of a graph from the structural and semantic cohesion within the graph itself, eliminating the need for manual augmentation.
Key Contributions
- Augmentation-Free Approach: AFGRL eliminates the dependency on hand-crafted augmentations by discovering related nodes through k-nearest-neighbor search in the latent space. These discovered relations retain semantic coherence because they are validated against local structural and global semantic information, alleviating the inherent arbitrariness of augmentation schemes.
- Local and Global Perspective Integration: AFGRL refines the set of nodes regarded as semantically similar from two perspectives: local adjacency (a node's immediate neighbors) and global semantics, obtained by clustering the latent space with k-means.
- Siamese Model Architecture: Adopting a non-contrastive learning framework inspired by BYOL, AFGRL employs a siamese network that learns only from identified positive samples and requires no negative samples, which avoids sampling bias and reduces computational cost.
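As a rough illustration of the positive-sample discovery described above, the sketch below combines k-nearest-neighbor search in the latent space with adjacency and cluster agreement. It is a NumPy-based sketch, not the authors' code: the function names, the data layout (an embedding matrix `Z`, adjacency as a list of sets, precomputed k-means labels), and the exact way local and global positives are merged are all illustrative assumptions.

```python
import numpy as np

def knn_indices(Z, k):
    """Indices of the k most cosine-similar nodes for each node (self excluded)."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Zn @ Zn.T
    np.fill_diagonal(sim, -np.inf)          # a node is never its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]

def discover_positives(Z, adj, labels, k):
    """Per-node positive sets: latent-space kNN hits kept only when supported
    by graph adjacency (local view) or a shared k-means cluster (global view).
    Illustrative sketch of the idea, not the paper's exact procedure."""
    B = knn_indices(Z, k)
    positives = []
    for i in range(len(Z)):
        local_hits = {j for j in B[i] if j in adj[i]}            # kNN ∩ neighbors
        global_hits = {j for j in B[i] if labels[j] == labels[i]}  # kNN ∩ cluster
        positives.append(local_hits | global_hits)
    return positives
```

Filtering the raw kNN set this way is what removes the arbitrariness: a nearest neighbor in embedding space only counts as a positive if the graph structure or the clustering corroborates it.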
Evaluation and Results
AFGRL's performance was evaluated on several node-level tasks including node classification, clustering, and similarity search. It showed competitive results against state-of-the-art augmentation-based methods on datasets such as WikiCS, Amazon-Computers, Amazon-Photo, Coauthor-CS, and Coauthor-Physics.
- Node Classification: AFGRL outperformed existing augmentation-dependent frameworks on most benchmark datasets while being less sensitive to hyperparameter choices.
- Node Clustering and Similarity Search: The integration of local structure and global semantics allowed AFGRL to deliver improved or comparable performance. Its ability to capture fine-grained semantic similarity was particularly evident in t-SNE visualizations, where its embeddings formed tight clusters that other models' embeddings left dispersed.
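A common way to run a similarity-search evaluation of this kind is to retrieve each node's k most cosine-similar embeddings and measure how often the retrieved nodes share the query node's class label. The helper below is an illustrative sketch under that assumption; the function name and the hit-ratio metric are generic choices, not necessarily the paper's exact protocol.

```python
import numpy as np

def knn_hit_ratio(Z, labels, k):
    """For each node, retrieve its k nearest embeddings (cosine similarity)
    and return the fraction of retrieved nodes sharing the query's label."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Zn @ Zn.T
    np.fill_diagonal(sim, -np.inf)          # never retrieve the query itself
    topk = np.argsort(-sim, axis=1)[:, :k]
    return float((labels[topk] == labels[:, None]).mean())
```

A higher hit ratio means the embedding space places same-class nodes closer together, which is the property the similarity-search results are meant to probe.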
Implications and Future Directions
AFGRL demonstrates that effective graph representation learning can be achieved without complex augmentations or extensive negative sampling, pointing toward more robust, hyperparameter-stable self-supervised learning paradigms for graphs.
Future work includes scaling the framework to larger and more diverse networks, exploring automated techniques for validating node clusters, and assessing applicability beyond graph-structured data. Integrating AFGRL with hybrid architectures that blend structural and feature-based learning could further improve generalization across real-world applications, with potential impact on network-centric domains such as social networks, bioinformatics, and recommendation systems.