
In-n-Out: Calibrating Graph Neural Networks for Link Prediction (2403.04605v2)

Published 7 Mar 2024 in cs.LG

Abstract: Deep neural networks are notoriously miscalibrated, i.e., their outputs do not reflect the true probability of the event we aim to predict. While networks for tabular or image data are usually overconfident, recent works have shown that graph neural networks (GNNs) show the opposite behavior for node-level classification. But what happens when we are predicting links? We show that, in this case, GNNs often exhibit a mixed behavior. More specifically, they may be overconfident in negative predictions while being underconfident in positive ones. Based on this observation, we propose IN-N-OUT, the first-ever method to calibrate GNNs for link prediction. IN-N-OUT is based on two simple intuitions: i) attributing true/false labels to an edge while respecting a GNN's prediction should cause but small fluctuations in that edge's embedding; and, conversely, ii) if we label that same edge contradicting our GNN, embeddings should change more substantially. An extensive experimental campaign shows that IN-N-OUT significantly improves the calibration of GNNs in link prediction, consistently outperforming the baselines available -- which are not designed for this specific task.


Summary

  • The paper presents IN-N-OUT, a novel calibration technique that adjusts GNN logits to better match predicted confidences with empirical link outcomes.
  • It employs a temperature-scaling approach that modulates predictions based on embedding variability from adding or removing links.
  • Extensive experiments across standard datasets show that IN-N-OUT reduces calibration errors compared to conventional methods like Isotonic Regression and Temperature Scaling.

Introduction to IN-N-OUT

Graph Neural Networks (GNNs) have established themselves as pivotal instruments for understanding complex relational data, providing insights into network-structured problems across many sectors. A common critique of these models, however, is their miscalibration: their predicted confidences do not reflect the actual likelihood of the events they predict. For example, among all links predicted with 90% confidence, roughly 90% should truly exist in a well-calibrated model. Such a mismatch poses significant risks, especially in decision-sensitive settings.

Addressing this concern for GNNs applied to link prediction, this paper introduces IN-N-OUT, a novel approach for the post-hoc calibration of GNNs. IN-N-OUT tackles the calibration challenge specific to link prediction by leveraging two key insights about the behavior of GNN predictions and the structural properties of graph embeddings.

Key Insights and Methodology

The paper first documents a mixed behavior of GNN predictions on link prediction tasks, observed across various datasets and models: GNNs tend to be overconfident in negative predictions while underconfident in positive ones. To mitigate this miscalibration, the paper presents IN-N-OUT, built on two main premises (made concrete in the sketch that follows the list):

  1. If an edge is labeled in agreement with the GNN's prediction (true for a predicted link, false for a predicted non-link), the edge's embedding should change only slightly.
  2. Conversely, if the edge is labeled in contradiction to the GNN's prediction, its embedding should change substantially.
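
To make the premises concrete, the sketch below shows how the two counterfactual edge embeddings could be computed with a standard PyTorch Geometric encoder. The encoder architecture, the element-wise-product edge readout, and the helper names (EdgeEmbedder, counterfactual_embeddings) are illustrative assumptions, not the paper's implementation.

```python
import torch
from torch_geometric.nn import GCNConv


class EdgeEmbedder(torch.nn.Module):
    """Hypothetical two-layer GCN encoder. The edge embedding is taken to be
    the element-wise product of the endpoint node embeddings (a common
    choice, not necessarily the one used in the paper)."""

    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, hid_dim)

    def forward(self, x, edge_index):
        h = self.conv1(x, edge_index).relu()
        return self.conv2(h, edge_index)


def edge_embedding(encoder, x, edge_index, u, v):
    z = encoder(x, edge_index)   # node embeddings
    return z[u] * z[v]           # embedding of edge (u, v)


def counterfactual_embeddings(encoder, x, edge_index, u, v):
    """Embedding of edge (u, v) with the edge present vs. absent in the
    message-passing graph; the gap between the two is the signal the
    premises above refer to."""
    keep = ~(((edge_index[0] == u) & (edge_index[1] == v)) |
             ((edge_index[0] == v) & (edge_index[1] == u)))
    without_edge = edge_index[:, keep]
    extra = torch.tensor([[u, v], [v, u]],
                         dtype=edge_index.dtype, device=edge_index.device)
    with_edge = torch.cat([without_edge, extra], dim=1)
    return (edge_embedding(encoder, x, with_edge, u, v),
            edge_embedding(encoder, x, without_edge, u, v))
```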

Grounded in these premises, IN-N-OUT employs a temperature-scaling approach that modulates the GNN logits according to the discrepancy between the edge embeddings computed with the link hypothetically present and with it removed, as sketched below.
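
Building on the counterfactual embeddings sketched above, one plausible way to wire up such a per-edge temperature is shown here; the MLP architecture and feature choice are assumptions made for exposition, not the authors' exact design.

```python
import torch
import torch.nn as nn


class InNOutStyleCalibrator(nn.Module):
    """Illustrative sketch only: rescale a link-prediction logit by a
    per-edge temperature predicted from how much the edge embedding shifts
    between the two counterfactual graphs."""

    def __init__(self, emb_dim: int, hidden: int = 32):
        super().__init__()
        # Small MLP mapping (both embeddings, size of their gap) to a
        # strictly positive temperature.
        self.temp_net = nn.Sequential(
            nn.Linear(2 * emb_dim + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Softplus(),
        )

    def forward(self, logit, z_with_edge, z_without_edge):
        gap = (z_with_edge - z_without_edge).norm(dim=-1, keepdim=True)
        feats = torch.cat([z_with_edge, z_without_edge, gap], dim=-1)
        temperature = self.temp_net(feats) + 1e-6  # avoid division by zero
        return logit / temperature                 # calibrated logit


# Usage sketch, continuing from counterfactual_embeddings above:
#   z_in, z_out = counterfactual_embeddings(encoder, x, edge_index, u, v)
#   calibrated_prob = torch.sigmoid(calibrator(raw_logit, z_in, z_out))
```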

Empirical Validation and Results

An extensive experimental campaign across multiple standard datasets and GNN architectures shows that IN-N-OUT markedly improves the calibration of GNNs for link prediction. The approach consistently outperforms conventional calibration methods such as Isotonic Regression and Temperature Scaling, achieving the lowest calibration errors in the majority of tested scenarios and thus a closer alignment between predicted probabilities and empirical outcomes.
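
Calibration quality in such comparisons is conventionally summarized by the expected calibration error (ECE), the bin-weighted gap between confidence and accuracy; a minimal implementation of the standard binned estimator (a generic metric, not code from the paper) is:

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Standard binned ECE for binary predictions: the bin-size-weighted
    average of |accuracy - confidence| over equal-width confidence bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    preds = (probs >= 0.5).astype(float)
    conf = np.where(preds == 1.0, probs, 1.0 - probs)  # confidence in predicted class
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            acc = float((preds[in_bin] == labels[in_bin]).mean())
            avg_conf = float(conf[in_bin].mean())
            ece += in_bin.mean() * abs(acc - avg_conf)
    return float(ece)
```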

Practical and Theoretical Implications

This paper not only exposes the calibration issues inherent in GNNs for link prediction but also provides a practical tool to mitigate them, thereby improving the reliability of GNN predictions in sensitive applications. On a theoretical level, the methodology advances our understanding of the interaction between graph embeddings and GNN predictions, potentially guiding future work on improving GNN architectures.

Towards Future Developments in AI

While focused on link prediction, the implications of this paper stretch further, hinting at the potential for similar calibration strategies in other graph-related tasks. The interplay between graph structure, embeddings, and predictive certainty uncovered here lays a foundation for future advances in calibration methods, fostering the development of more reliable and interpretable GNNs across varied applications.

Conclusion

IN-N-OUT presents a significant step forward in calibrating GNNs for link prediction, addressing the nuanced predictive behaviors of these models. Through careful experimentation and a simple but effective methodology, this work advances our understanding of GNN calibration and lays the groundwork for further innovations in graph neural networks. As we push the boundaries of what AI can achieve, ensuring the reliability of our models remains paramount, and IN-N-OUT marks a notable advancement toward this goal.
