
Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders (2404.15806v1)

Published 24 Apr 2024 in cs.LG

Abstract: Graph masked autoencoders (GMAE) have emerged as a significant advancement in self-supervised pre-training for graph-structured data. Previous GMAE models primarily utilize a straightforward random masking strategy for nodes or edges during training. However, this strategy fails to consider the varying significance of different nodes within the graph structure. In this paper, we investigate the potential of leveraging the graph's structural composition as a fundamental and unique prior in the masked pre-training process. To this end, we introduce a novel structure-guided masking strategy (i.e., StructMAE), designed to refine existing GMAE models. StructMAE involves two steps: 1) Structure-based Scoring: each node is evaluated and assigned a score reflecting its structural significance. Two distinct scoring schemes are proposed: predefined and learnable scoring. 2) Structure-guided Masking: with the obtained assessment scores, we develop an easy-to-hard masking strategy that gradually increases the structural awareness of the self-supervised reconstruction task. Specifically, the strategy begins with random masking and progresses to masking structure-informative nodes based on the assessment scores. This design gradually and effectively guides the model in learning graph structural information. Furthermore, extensive experiments consistently demonstrate that our StructMAE method outperforms existing state-of-the-art GMAE models in both unsupervised and transfer learning tasks. Code is available at https://github.com/LiuChuang0059/StructMAE.
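The abstract itself does not include pseudocode. As an illustration only, the sketch below shows one way an easy-to-hard, structure-guided masking schedule could be realized, using node degree as a hypothetical stand-in for the predefined structural score; the function names, the linear score-blending scheme, and the degree-based score are assumptions for exposition, not the authors' implementation (see the linked repository for the actual code).

```python
import numpy as np

def structural_scores(adj):
    # Hypothetical predefined structural score: node degree as a simple proxy
    # for structural significance (the paper also describes a learnable score).
    return adj.sum(axis=1)

def structure_guided_mask(adj, mask_ratio=0.5, progress=0.0, rng=None):
    """Select nodes to mask with an easy-to-hard schedule.

    progress in [0, 1]: 0 corresponds to purely random masking (easy),
    1 masks only the highest-scoring, structure-informative nodes (hard).
    """
    rng = rng or np.random.default_rng()
    n = adj.shape[0]
    n_mask = int(mask_ratio * n)

    scores = structural_scores(adj).astype(float)
    # Blend a random score with the normalized structural score; as `progress`
    # grows, the structural score dominates and informative nodes are masked first.
    blended = (1 - progress) * rng.random(n) + progress * (scores / (scores.max() + 1e-9))

    mask = np.zeros(n, dtype=bool)
    mask[np.argsort(-blended)[:n_mask]] = True
    return mask

# Toy usage: a 6-node path graph, masking 50% of the nodes late in training.
adj = np.zeros((6, 6))
for i in range(5):
    adj[i, i + 1] = adj[i + 1, i] = 1
print(structure_guided_mask(adj, mask_ratio=0.5, progress=0.8))
```

In this sketch, annealing `progress` from 0 to 1 over pre-training reproduces the described curriculum: early epochs behave like random masking, while later epochs increasingly hide the structurally informative nodes that are hardest to reconstruct.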
