
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning

Published 10 May 2024 in cs.LG, cs.AI, and stat.ML (arXiv:2405.06418v2)

Abstract: While a number of knowledge graph representation learning (KGRL) methods have been proposed over the past decade, very few theoretical analyses have been conducted on them. In this paper, we present the first PAC-Bayesian generalization bounds for KGRL methods. To analyze a broad class of KGRL models, we propose a generic framework named ReED (Relation-aware Encoder-Decoder), which consists of a relation-aware message passing encoder and a triplet classification decoder. Our ReED framework can express at least 15 different existing KGRL models, including not only graph neural network-based models such as R-GCN and CompGCN but also shallow-architecture models such as RotatE and ANALOGY. Our generalization bounds for the ReED framework provide theoretical grounds for the commonly used tricks in KGRL, e.g., parameter-sharing and weight normalization schemes, and guide desirable design choices for practical KGRL methods. We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs.
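The abstract describes ReED as a relation-aware message passing encoder followed by a triplet classification decoder, but does not give its equations. As a rough illustration only, the sketch below uses an R-GCN-style layer (per-relation message weights, mean aggregation) and a DistMult-style diagonal decoder; all names (`relation_aware_layer`, `score`, `W`, `d`) and these particular architectural choices are hypothetical stand-ins, not the paper's actual ReED definition.

```python
import numpy as np

rng = np.random.default_rng(0)

n_entities, n_relations, dim = 5, 2, 4
X = rng.normal(size=(n_entities, dim))        # initial entity features
W = rng.normal(size=(n_relations, dim, dim))  # per-relation message weights
W_self = rng.normal(size=(dim, dim))          # self-loop weights
d = rng.normal(size=(n_relations, dim))       # diagonal decoder parameters

# a toy knowledge graph as (head, relation, tail) triplets
triplets = [(0, 0, 1), (1, 1, 2), (2, 0, 3), (3, 1, 4)]

def relation_aware_layer(X, triplets):
    """One round of relation-aware message passing (R-GCN-like stand-in)."""
    H = X @ W_self                        # self-loop term
    msgs = np.zeros_like(X)
    deg = np.zeros(n_entities)
    for h, r, t in triplets:
        msgs[t] += X[h] @ W[r]            # relation-specific message head -> tail
        deg[t] += 1
    deg[deg == 0] = 1.0                   # avoid division by zero for isolated nodes
    H += msgs / deg[:, None]              # mean-aggregate incoming messages
    return np.tanh(H)                     # nonlinearity

def score(H, h, r, t):
    """DistMult-style triplet score <H[h], d[r], H[t]> for classification."""
    return float(np.sum(H[h] * d[r] * H[t]))

H = relation_aware_layer(X, triplets)
scores = {trip: score(H, *trip) for trip in triplets}
```

In a sketch like this, the design choices the abstract says the bounds speak to have concrete counterparts: parameter-sharing corresponds to tying the `W[r]` matrices across relations, and weight normalization to constraining their spectral norms.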

References (56)
  1. Weisfeiler and Leman go relational. In Proceedings of the 1st Learning on Graphs Conference, pp. 46:1–46:26, 2022.
  2. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.
  3. Spectrally-normalized margin bounds for neural networks. In Proceedings of the 31st Conference on Neural Information Processing Systems, pp. 6240–6249, 2017.
  4. Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks. Journal of Machine Learning Research, 20(63):1–17, 2019.
  5. PAC-Bayesian theory for transductive learning. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, pp. 105–113, 2014.
  6. Bodenreider, O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Research, 32(suppl_1):D267–D270, 2004.
  7. Translating embeddings for modeling multi-relational data. In Proceedings of the 27th Conference on Neural Information Processing Systems, pp. 2787–2795, 2013.
  8. How attentive are graph attention networks? In Proceedings of the 10th International Conference on Learning Representations, 2022.
  9. PairRE: Knowledge graph embeddings via paired relation vectors. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 4360–4369, 2021.
  10. Learning theory can (sometimes) explain generalisation in graph neural networks. In Proceedings of the 35th Conference on Neural Information Processing Systems, pp. 27043–27056, 2021.
  11. Generalization and representational limits of graph neural networks. In Proceedings of the 37th International Conference on Machine Learning, pp. 3419–3430, 2020.
  12. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning, pp. 1263–1272, 2017.
  13. A theory of link prediction via relational Weisfeiler-Leman on knowledge graphs. In Proceedings of the 37th Conference on Neural Information Processing Systems, 2023.
  14. Generalization in graph neural networks: Improved PAC-Bayesian bounds on graph diffusion. In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics, pp. 6314–6341, 2023.
  15. SimplE embedding for link prediction in knowledge graphs. In Proceedings of the 32nd Conference on Neural Information Processing Systems, pp. 4284–4295, 2018.
  16. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, 2015.
  17. Generalization bounds for knowledge graph embedding (trained by maximum likelihood). NeurIPS 2019 Workshop on Machine Learning with Guarantees, 2019.
  18. A PAC-Bayesian approach to generalization bounds for graph neural networks. In Proceedings of the 9th International Conference on Learning Representations, 2021.
  19. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 2181–2187, 2015.
  20. Analogical inference for multi-relational embeddings. In Proceedings of the 34th International Conference on Machine Learning, pp. 2168–2178, 2017.
  21. Assessing the effects of hyperparameters on knowledge graph embedding quality. Journal of Big Data, 10(1):59, 2023.
  22. Subgroup generalization and fairness of graph neural networks. In Proceedings of the 35th Conference on Neural Information Processing Systems, pp. 1048–1061, 2021.
  23. Generalization analysis of message passing neural networks on large random graphs. In Proceedings of the 36th Conference on Neural Information Processing Systems, pp. 4805–4817, 2022.
  24. McAllester, D. Simplified PAC-Bayesian margin bounds. In Proceedings of the 16th Annual Conference on Computational Learning Theory and 7th Kernel Workshop, pp. 203–215, 2003.
  25. McAllester, D. A. Some PAC-Bayesian theorems. In Proceedings of the 11th Annual Conference on Computational Learning Theory, pp. 230–234, 1998.
  26. WL meet VC. In Proceedings of the 40th International Conference on Machine Learning, pp. 25275–25302, 2023.
  27. Learning attention-based embeddings for relation prediction in knowledge graphs. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 4710–4723, 2019.
  28. Exploring generalization in deep learning. In Proceedings of the 31st Conference on Neural Information Processing Systems, pp. 5947–5956, 2017.
  29. A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks. In Proceedings of the 6th International Conference on Learning Representations, 2018.
  30. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning, pp. 809–816, 2011.
  31. Holographic embeddings of knowledge graphs. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, pp. 1955–1961, 2016.
  32. Optimization and generalization analysis of transduction through gradient boosting and application to multi-scale graph neural networks. In Proceedings of the 34th Conference on Neural Information Processing Systems, pp. 18917–18930, 2020a.
  33. Graph neural networks exponentially lose expressive power for node classification. In Proceedings of the 8th International Conference on Learning Representations, 2020b.
  34. CoDEx: A comprehensive knowledge graph completion benchmark. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pp. 8328–8350, 2020.
  35. The Vapnik-Chervonenkis dimension of graph and recursive neural networks. Neural Networks, 108:248–259, 2018.
  36. Modeling relational data with graph convolutional networks. In Proceedings of the 15th Extended Semantic Web Conference, pp. 593–607, 2018.
  37. End-to-end structure-aware convolutional networks for knowledge base completion. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, pp. 3060–3067, 2019.
  38. Reasoning with neural tensor networks for knowledge base completion. In Proceedings of the 27th Conference on Neural Information Processing Systems, pp. 926–934, 2013.
  39. RotatE: Knowledge graph embedding by relational rotation in complex space. In Proceedings of the 7th International Conference on Learning Representations, 2019.
  40. Inductive relation prediction by subgraph reasoning. In Proceedings of the 37th International Conference on Machine Learning, pp. 9448–9457, 2020.
  41. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality, pp. 57–66, 2015.
  42. Tropp, J. A. User-friendly tail bounds for sums of random matrices. Foundations of Computational Mathematics, 12(4):389–434, 2012.
  43. Complex embeddings for simple link prediction. In Proceedings of the 33rd International Conference on Machine Learning, pp. 2071–2080, 2016.
  44. Valiant, L. G. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.
  45. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and Its Applications, 16(2):264–280, 1971.
  46. Composition-based multi-relational graph convolutional networks. In Proceedings of the 8th International Conference on Learning Representations, 2020.
  47. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations, 2018.
  48. Lipschitz regularity of deep neural networks: analysis and efficient estimation. In Proceedings of the 32nd Conference on Neural Information Processing Systems, pp. 3835–3844, 2018.
  49. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29(12):2724–2743, 2017.
  50. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, pp. 1112–1119, 2014.
  51. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsia, 2(9):12–16, 1968.
  52. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 32(1):4–24, 2021.
  53. How powerful are graph neural networks? In Proceedings of the 7th International Conference on Learning Representations, 2019.
  54. Embedding entities and relations for learning and inference in knowledge bases. In Proceedings of the 3rd International Conference on Learning Representations, 2015.
  55. Quaternion knowledge graph embeddings. In Proceedings of the 33rd Conference on Neural Information Processing Systems, pp. 2735–2745, 2019.
  56. OOD link prediction generalization capabilities of message-passing GNNs in larger test graphs. In Proceedings of the 36th Conference on Neural Information Processing Systems, pp. 20257–20272, 2022.