Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs (2407.15431v1)
Abstract: Text-attributed graphs (TAGs) are an important class of real-world graph-structured data in which each node is associated with raw text. For TAGs, traditional few-shot node classification methods train directly on pre-processed node features and do not consider the raw texts, so their performance depends heavily on the choice of feature pre-processing method. In this paper, we propose P2TAG, a framework designed for few-shot node classification on TAGs with graph pre-training and prompting. P2TAG first pre-trains a language model (LM) and a graph neural network (GNN) on TAGs with a self-supervised loss. To fully utilize the ability of language models, we adapt the masked language modeling objective for our framework. The pre-trained model is then used for few-shot node classification with a mixed prompt method, which considers both text and graph information simultaneously. We conduct experiments on six real-world TAGs, including paper citation networks and product co-purchasing networks. Experimental results demonstrate that our proposed framework outperforms existing graph few-shot learning methods on these datasets, with improvements of +18.98% to +35.98%.
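The pipeline described in the abstract (joint LM+GNN pre-training with a masked language modeling loss, then few-shot classification via a mixed text-and-graph prompt) can be illustrated with a minimal sketch. This is not the authors' released implementation; `TinyLM`, `TinyGNN`, the dense row-normalized adjacency `adj`, and the prototype-based few-shot classifier below are all illustrative assumptions standing in for the actual LM, GNN, and prompting modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyLM(nn.Module):
    """Stand-in for the pre-trained language model: token encoder + MLM head."""

    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.mlm_head = nn.Linear(dim, vocab_size)

    def forward(self, token_ids):                      # token_ids: (N, L)
        return self.encoder(self.embed(token_ids))     # hidden states: (N, L, dim)


class TinyGNN(nn.Module):
    """Stand-in for the GNN: one mean-aggregation message-passing layer."""

    def __init__(self, dim=64):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):                          # adj: row-normalized (N, N)
        return F.relu(self.lin(adj @ x))


def pretrain_step(lm, gnn, token_ids, masked_ids, mask, adj, optimizer):
    """One self-supervised step: the MLM loss is computed on graph-conditioned
    token states, so gradients flow into both the LM and the GNN."""
    h = lm(masked_ids)                                  # (N, L, dim), masked input
    node_emb = h.mean(dim=1)                            # per-node text embedding
    graph_emb = gnn(node_emb, adj)                      # neighborhood-aware embedding
    logits = lm.mlm_head(h + graph_emb.unsqueeze(1))    # (N, L, vocab_size)
    loss = F.cross_entropy(logits[mask], token_ids[mask])  # only masked positions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


@torch.no_grad()
def mixed_prompt_predict(lm, gnn, token_ids, adj, support_idx, support_y, query_idx):
    """Few-shot prediction: fuse text and graph embeddings ("mixed" representation),
    build class prototypes from the labeled support nodes, then assign each query
    node to its nearest prototype."""
    h = lm(token_ids)
    text_emb = h.mean(dim=1)
    graph_emb = gnn(text_emb, adj)
    z = F.normalize(text_emb + graph_emb, dim=-1)
    classes = support_y.unique()
    protos = torch.stack([z[support_idx[support_y == c]].mean(0) for c in classes])
    return classes[(z[query_idx] @ protos.t()).argmax(dim=-1)]


if __name__ == "__main__":
    N, L, V = 8, 12, 1000                               # toy graph: 8 nodes, 12 tokens each
    lm, gnn = TinyLM(V), TinyGNN()
    opt = torch.optim.AdamW(list(lm.parameters()) + list(gnn.parameters()), lr=1e-3)
    token_ids = torch.randint(0, V, (N, L))
    mask = torch.rand(N, L) < 0.15                      # mask ~15% of tokens
    masked_ids = token_ids.masked_fill(mask, 0)         # 0 acts as the [MASK] id here
    adj = torch.full((N, N), 1.0 / N)                   # dummy row-normalized adjacency
    print("MLM loss:", pretrain_step(lm, gnn, token_ids, masked_ids, mask, adj, opt))
    pred = mixed_prompt_predict(lm, gnn, token_ids, adj,
                                support_idx=torch.tensor([0, 1, 2, 3]),
                                support_y=torch.tensor([0, 0, 1, 1]),
                                query_idx=torch.tensor([4, 5, 6, 7]))
    print("query predictions:", pred)
```

The sketch mirrors the abstract in two ways: the masked-token loss is computed on graph-conditioned token states, so both encoders are trained by the same self-supervised objective, and the few-shot step fuses text and graph embeddings before building class prototypes, reflecting the idea of a mixed prompt that uses both modalities.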
Authors: Huanjing Zhao, Beining Yang, Yukuo Cen, Junyu Ren, Chenhui Zhang, Yuxiao Dong, Evgeny Kharlamov, Shu Zhao, Jie Tang