GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding (2402.06764v3)

Published 9 Feb 2024 in cs.AI

Abstract: Integrating LLMs with knowledge graphs derived from domain-specific data represents an important advancement towards more powerful and factual reasoning. As these models grow more capable, it is crucial to enable them to perform multi-step inferences over real-world knowledge graphs while minimizing hallucination. While LLMs excel at conversation and text generation, their ability to reason over domain-specialized graphs of interconnected entities remains limited. For example, can we query an LLM to identify the optimal contact in a professional network for a specific goal, based on relationships and attributes in a private database? The answer is no: such capabilities lie beyond current methods. However, this question underscores a critical technical gap that must be addressed. Many high-value applications in areas such as science, security, and e-commerce rely on proprietary knowledge graphs encoding unique structures, relationships, and logical constraints. We introduce a fine-tuning framework for developing Graph-aligned LLMs (GLaM) that transforms a knowledge graph into an alternate text representation with labeled question-answer pairs. We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning. Our methodology leverages the LLM's generative capabilities to create the dataset and offers an efficient alternative to retrieval-augmented generation style methods.
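To make the abstract's pipeline concrete, here is a minimal sketch of the general idea: partition a knowledge graph into entity-centered neighborhoods, serialize each subgraph as text, and emit labeled question-answer pairs suitable for supervised fine-tuning. The triple format, question templates, and output field names are illustrative assumptions, not the paper's actual encoding or dataset schema.

```python
# Hypothetical sketch of neighborhood partitioning + subgraph-to-text encoding
# for building fine-tuning data; formats below are assumptions, not GLaM's exact method.

import json
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples.
triples = [
    ("Alice", "works_at", "LabX"),
    ("Bob", "works_at", "LabX"),
    ("Alice", "coauthored_with", "Bob"),
    ("Bob", "expert_in", "graph neural networks"),
]

# Neighborhood partitioning: group triples by the entity they touch (1-hop).
neighborhoods = defaultdict(list)
for h, r, t in triples:
    neighborhoods[h].append((h, r, t))
    neighborhoods[t].append((h, r, t))

def encode_subgraph(entity, edges):
    """Serialize an entity's neighborhood subgraph as plain text."""
    facts = [f"{h} {r.replace('_', ' ')} {t}." for h, r, t in edges]
    return f"Facts about {entity}: " + " ".join(facts)

# Generate labeled QA pairs grounded in each neighborhood (fine-tuning examples).
examples = []
for entity, edges in neighborhoods.items():
    context = encode_subgraph(entity, edges)
    for h, r, t in edges:
        examples.append({
            "instruction": f"{context}\nQuestion: What is the '{r}' of {h}?",
            "response": t,
        })

# Write instruction-tuning data in JSON-lines form for a standard SFT trainer.
with open("glam_sft_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

print(f"Wrote {len(examples)} fine-tuning examples.")
```

In the paper's framework, the question-answer pairs are generated with an LLM rather than fixed templates, which is what lets the resulting dataset cover multi-step, structure-based queries over the graph.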

Authors (5)
  1. Stefan Dernbach (5 papers)
  2. Khushbu Agarwal (13 papers)
  3. Alejandro Zuniga (1 paper)
  4. Michael Henry (1 paper)
  5. Sutanay Choudhury (36 papers)
Citations (4)