Distribution Consistency based Self-Training for Graph Neural Networks with Sparse Labels (2401.10394v1)
Abstract: Few-shot node classification poses a significant challenge for Graph Neural Networks (GNNs) due to insufficient supervision and potential distribution shifts between labeled and unlabeled nodes. Self-training has emerged as a popular framework for leveraging the abundance of unlabeled data: it expands the training set by assigning pseudo-labels to selected unlabeled nodes. Various selection strategies have been developed based on confidence, information gain, and similar criteria, but none of them accounts for the distribution shift between the training and testing node sets. Pseudo-labeling may amplify this shift and even introduce new ones, hindering the effectiveness of self-training. In this work, we therefore explore explicitly bridging the distribution shift between the expanded training set and the test set during self-training. To this end, we propose a novel Distribution-Consistent Graph Self-Training (DC-GST) framework that identifies pseudo-labeled nodes that are both informative and able to reduce the distribution discrepancy, and we formulate this selection as a differentiable optimization task. A distribution-shift-aware edge predictor further augments the graph, improving the model's generalizability when assigning pseudo-labels. Extensive experiments on four publicly available benchmark datasets demonstrate that our framework consistently outperforms state-of-the-art baselines.
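To make the selection criterion concrete, the sketch below illustrates the general idea in PyTorch: among confident unlabeled nodes, greedily pick those whose addition to the training set shrinks an embedding-space gap between the training and test distributions. This is a minimal illustration under simplifying assumptions, not the authors' DC-GST implementation: the first-moment (mean-embedding) gap stands in for the paper's distribution-discrepancy measure, the greedy loop replaces the differentiable optimization the abstract describes, and `emb` and `logits` are assumed to come from a GNN already trained on the labeled nodes.

```python
# Sketch only: distribution-consistency-aware pseudo-label selection.
# Assumes `emb` (N x d node embeddings) and `logits` (N x C class scores)
# are produced by a GNN already trained on the labeled nodes.
import torch
import torch.nn.functional as F

def mean_embedding_gap(train_emb, test_emb):
    # First-moment proxy for the train/test distribution discrepancy.
    return (train_emb.mean(dim=0) - test_emb.mean(dim=0)).norm(p=2)

def select_pseudo_nodes(emb, logits, train_idx, test_idx, unlabeled_idx,
                        k=50, conf_threshold=0.9):
    """Greedily add up to k confident unlabeled nodes whose inclusion
    reduces the embedding gap between the expanded train set and the test set."""
    probs = F.softmax(logits, dim=1)
    conf, pseudo_y = probs.max(dim=1)           # confidence and pseudo-label per node
    candidates = [i for i in unlabeled_idx.tolist()
                  if conf[i] >= conf_threshold]

    chosen, cur_train = [], train_idx.tolist()
    cur_gap = mean_embedding_gap(emb[cur_train], emb[test_idx])
    for _ in range(k):
        best, best_gap = None, cur_gap
        for i in candidates:                    # try each remaining candidate
            gap = mean_embedding_gap(emb[cur_train + [i]], emb[test_idx])
            if gap < best_gap:
                best, best_gap = i, gap
        if best is None:                        # no candidate shrinks the gap further
            break
        chosen.append(best)
        cur_train.append(best)
        candidates.remove(best)
        cur_gap = best_gap
    return torch.tensor(chosen, dtype=torch.long), pseudo_y
```

The selected nodes would then be added to the training set with their pseudo-labels and the GNN retrained, iterating as in standard self-training. DC-GST additionally augments the graph with a distribution-shift-aware edge predictor before assigning pseudo-labels, a step this sketch omits.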
Authors: Fali Wang, Tianxiang Zhao, Suhang Wang