A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer (2402.02464v3)

Published 4 Feb 2024 in cs.LG, cs.AI, and cs.SI

Abstract: Can we model Non-Euclidean graphs as pure language, or even as Euclidean vectors, while retaining their inherent information? The Non-Euclidean property has posed a long-standing challenge in graph modeling. Despite recent efforts by graph neural networks and graph transformers to encode graphs as Euclidean vectors, recovering the original graph from those vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring a Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words in the Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from Graph Words to ensure information equivalence. We pretrain GraphsGPT on $100$M molecules and obtain several interesting findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on $8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, demonstrated by its ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space, overcoming previously known Non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy in graph-domain tasks, excelling in both representation and generation. Code is available at \href{https://github.com/A4Bio/GraphsGPT}{GitHub}.
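Finding (3) lends itself to a small illustration. The sketch below is a conceptual example rather than the official GraphsGPT API: `encode_graph`, `decode_graph_words`, and the sizes `K = 8`, `D = 256` are hypothetical stand-ins for the pretrained Graph2Seq encoder, the GraphGPT decoder, and the actual Graph Word dimensions. Only the linear-interpolation step reflects the mixup idea described in the abstract.

```python
import numpy as np

K, D = 8, 256  # assumed: K Graph Words per graph, each D-dimensional

def encode_graph(graph) -> np.ndarray:
    """Stand-in for the pretrained Graph2Seq encoder: graph -> (K, D) Graph Words."""
    rng = np.random.default_rng(abs(hash(graph)) % (2**32))  # toy embedding, not the real model
    return rng.standard_normal((K, D))

def decode_graph_words(graph_words: np.ndarray):
    """Stand-in for the GraphGPT decoder: Graph Words -> reconstructed graph."""
    return {"graph_words": graph_words}  # placeholder; the real decoder emits atoms and bonds

def euclidean_graph_mixup(graph_a, graph_b, lam: float = 0.5):
    """Mix two graphs by linear interpolation in the Euclidean Graph Word space."""
    words_a = encode_graph(graph_a)
    words_b = encode_graph(graph_b)
    mixed = lam * words_a + (1.0 - lam) * words_b  # standard mixup interpolation
    return decode_graph_words(mixed)

if __name__ == "__main__":
    out = euclidean_graph_mixup("CCO", "c1ccccc1", lam=0.3)
    print(out["graph_words"].shape)  # (8, 256)
```

In the actual pipeline, the interpolated Graph Words would be decoded by the pretrained GraphGPT decoder into a valid molecular graph, which is what lets mixup operate in Euclidean space despite the Non-Euclidean inputs.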
