Negative Sampling in Knowledge Graph Representation Learning: A Review (2402.19195v2)
Abstract: Knowledge Graph Representation Learning (KGRL), also known as Knowledge Graph Embedding (KGE), is essential for AI applications such as knowledge construction and information retrieval. KGE models encode entities and relations as low-dimensional vectors, supporting tasks such as link prediction and recommendation. Training these models relies on both positive and negative samples, but generating high-quality negative samples from existing knowledge graphs is challenging, and the quality of these samples significantly affects model accuracy. This survey systematically reviews negative sampling (NS) methods and their contributions to the success of KGRL. It groups existing NS methods into six categories and outlines the advantages and disadvantages of each. It also identifies open research questions that serve as potential directions for future work. By generalizing and aligning fundamental NS concepts, the survey offers guidance for designing effective NS methods for KGRL and aims to motivate further advances in the field.
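To make the role of negative samples concrete, the following is a minimal sketch of the simplest baseline, uniform random corruption, in which the head or tail of a known triple is replaced by a randomly chosen entity. It is an illustration only, not one of the specific methods covered by the survey. The function name `corrupt_triple`, the toy entities, and the `max_tries` limit are all hypothetical choices made for this example.

```python
import random


def corrupt_triple(triple, entities, known_triples, rng, max_tries=100):
    """Build one negative triple by replacing the head or the tail.

    Candidates that already appear in `known_triples` are rejected, so the
    result is never a known positive (it may still be a true but unobserved
    fact, which is the false-negative problem).
    """
    head, relation, tail = triple
    for _ in range(max_tries):
        candidate = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        if rng.random() < 0.5:
            negative = (candidate, relation, tail)
        else:
            negative = (head, relation, candidate)
        if negative not in known_triples:
            return negative
    raise RuntimeError("no negative triple found within max_tries")


# Toy knowledge graph (hypothetical data for illustration).
entities = ["Tokyo", "Japan", "Paris", "France"]
positives = {
    ("Tokyo", "capital_of", "Japan"),
    ("Paris", "capital_of", "France"),
}

rng = random.Random(0)  # fixed seed so runs are repeatable
positive = ("Tokyo", "capital_of", "Japan")
negative = corrupt_triple(positive, entities, positives, rng)
print(negative)
```

Uniform corruption is cheap, but many of the negatives it produces are trivially implausible, so the model learns little from them. That limitation is one reason negative sample quality is treated as a research problem in its own right.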
Authors: Tiroshan Madushanka, Ryutaro Ichise