GreaseLM: Graph REASoning Enhanced Language Models for Question Answering (2201.08860v1)

Published 21 Jan 2022 in cs.CL and cs.LG

Abstract: Answering complex questions about textual narratives requires reasoning over both stated context and the world knowledge that underlies it. However, pretrained language models (LM), the foundation of most modern QA systems, do not robustly represent latent relationships between concepts, which is necessary for reasoning. While knowledge graphs (KG) are often used to augment LMs with structured representations of world knowledge, it remains an open question how to effectively fuse and reason over the KG representations and the language context, which provides situational constraints and nuances. In this work, we propose GreaseLM, a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations. Information from both modalities propagates to the other, allowing language context representations to be grounded by structured world knowledge, and allowing linguistic nuances (e.g., negation, hedging) in the context to inform the graph representations of knowledge. Our results on three benchmarks in the commonsense reasoning (i.e., CommonsenseQA, OpenbookQA) and medical question answering (i.e., MedQA-USMLE) domains demonstrate that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.

Overview of GreaseLM: A Fusion of Language Models and Knowledge Graphs for Question Answering

The paper introduces GreaseLM, a model that integrates pretrained language models (LMs) with graph neural networks (GNNs) to enhance question-answering (QA) capabilities. The core innovation of GreaseLM lies in its ability to effectively fuse and reason over both language representations and knowledge graph (KG) embeddings through multiple layers of modality interaction operations. This layered exchange allows a deeper integration and synthesis of information from both LMs and KGs, surpassing prior approaches that rely on shallow or one-way interactions between the two modalities.
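For orientation, the overall computation can be summarized with a short pseudocode-style sketch. All names here (greaselm_forward, unimodal_lm_layers, cross_modal_layers) are illustrative assumptions, not the authors' API; a sketch of the inside of one cross-modal layer appears later in the Proposed Approach section.

```python
# High-level sketch of GreaseLM's forward pass (hypothetical names only).
# The question context is first encoded by N ordinary LM layers; the remaining
# M layers then jointly update token states and KG node states.
def greaselm_forward(token_embeddings, kg_node_embeddings,
                     unimodal_lm_layers, cross_modal_layers):
    h_tokens = token_embeddings
    for layer in unimodal_lm_layers:        # N standard transformer layers
        h_tokens = layer(h_tokens)

    h_nodes = kg_node_embeddings            # retrieved KG subgraph for the question
    for layer in cross_modal_layers:        # M GreaseLM layers (LM block + GNN + interaction)
        h_tokens, h_nodes = layer(h_tokens, h_nodes)

    # Answer scoring combines the final language and graph representations
    # (details omitted in this sketch).
    return h_tokens, h_nodes
```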

Background and Motivation

Traditional question-answering systems typically rely on large pretrained language models, which have demonstrated broad success across a variety of NLP tasks. However, these models often struggle with questions that require understanding and reasoning over implicit knowledge not directly stated in the text. Knowledge graphs, by contrast, encode structured knowledge as explicit relationships between entities, which can support such reasoning. Yet integrating this structured knowledge into language models remains challenging: prior methods that combine LMs and KGs typically do so in a limited manner that restricts interactive reasoning across the two knowledge sources.

Proposed Approach: GreaseLM

GreaseLM addresses the aforementioned limitations by leveraging a multi-layer approach to integrate LMs and KGs:

  • Architecture: The model comprises two stages: unimodal encoding layers from a pretrained LM, followed by cross-modal GreaseLM layers, each of which pairs an LM transformer layer with a GNN layer. This design lets language and graph representations be updated with contributions from each other across multiple layers.
  • Modality Interaction: At the heart of GreaseLM is a modality interaction mechanism, mediated through a special interaction token on the LM side and a special interaction node on the GNN side. These special representations act as conduits for rich two-way information flow between textual tokens and graph nodes, so that language context representations are grounded in structured world knowledge and linguistic nuances inform the interpretation of graph-based knowledge (a minimal sketch of this interaction follows the list).
  • Fine-Grained Reasoning: The deep integration facilitates nuanced reasoning over situational constraints presented in text and the structured knowledge from graphs, addressing the intricacies of questions that require complex reasoning.
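The following PyTorch-style sketch illustrates one such cross-modal layer. It is a minimal illustration under assumptions, not the authors' implementation: the class and attribute names (GreaseLMLayer, mint), the use of nn.TransformerEncoderLayer as the LM block, and the linear stand-in for the paper's GAT-style GNN update are all hypothetical simplifications.

```python
import torch
import torch.nn as nn


class GreaseLMLayer(nn.Module):
    """Sketch of one cross-modal layer: an LM block over tokens, a GNN block over
    KG nodes, and a two-way modality interaction between the special interaction
    token (tokens[:, 0]) and the interaction node (nodes[:, 0])."""

    def __init__(self, hidden_dim: int = 768, node_dim: int = 200):
        super().__init__()
        # Stand-in for one pretrained transformer layer (hidden_dim divisible by nhead).
        self.lm_block = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=8, batch_first=True
        )
        # Placeholder for the GAT-style message-passing update used in the paper.
        self.gnn_block = nn.Linear(node_dim, node_dim)
        # Modality interaction: mix the interaction token and node, then split back.
        self.mint = nn.Sequential(
            nn.Linear(hidden_dim + node_dim, hidden_dim + node_dim),
            nn.GELU(),
            nn.Linear(hidden_dim + node_dim, hidden_dim + node_dim),
        )
        self.hidden_dim = hidden_dim

    def forward(self, tokens: torch.Tensor, nodes: torch.Tensor):
        # tokens: (batch, seq_len, hidden_dim); nodes: (batch, num_nodes, node_dim)
        tokens = self.lm_block(tokens)               # update language representations
        nodes = torch.relu(self.gnn_block(nodes))    # update graph representations (stub)

        # Two-way information exchange through the special token/node pair.
        joint = self.mint(torch.cat([tokens[:, 0], nodes[:, 0]], dim=-1))
        tokens = torch.cat(
            [joint[:, : self.hidden_dim].unsqueeze(1), tokens[:, 1:]], dim=1
        )
        nodes = torch.cat(
            [joint[:, self.hidden_dim :].unsqueeze(1), nodes[:, 1:]], dim=1
        )
        return tokens, nodes


# Example usage with illustrative sizes:
# layer = GreaseLMLayer(hidden_dim=768, node_dim=200)
# toks, nds = layer(torch.randn(2, 64, 768), torch.randn(2, 40, 200))
```

Stacking several such layers, so that each exchange is followed by further within-modality processing, is what lets textual nuances and graph structure repeatedly refine one another rather than being fused only once.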

Empirical Evaluation

GreaseLM's performance was evaluated on multiple QA benchmarks spanning commonsense reasoning (CommonsenseQA, OpenbookQA) and the biomedical domain (MedQA-USMLE):

  • CommonsenseQA and OpenbookQA: GreaseLM outperformed both state-of-the-art vanilla LMs and existing LM+KG models. Notably, it achieved a 5.5% improvement over fine-tuned RoBERTa-Large on CommonsenseQA and a 6.4% improvement over AristoRoBERTa on OpenbookQA.
  • MedQA-USMLE: A more domain-specific evaluation was conducted on medical exam questions, where GreaseLM also improved over contemporary biomedical LMs such as SapBERT, indicating its utility in specialized settings.

Implications and Future Directions

This research underscores the potential of GreaseLM to enhance reasoning and QA performance by marrying the representational depth of pretrained language models with the explicit structure of knowledge graphs. By providing a framework in which both modalities interactively refine and ground one another, GreaseLM paves the way for more robust and versatile reasoning models.

Future work may explore extending GreaseLM's techniques to other tasks beyond QA, improving its efficiency and scalability for real-world applications, and enriching its capabilities with additional types of external knowledge sources. Moreover, addressing potential biases inherent in both LMs and KGs remains a critical consideration for ensuring ethical deployment. As the field advances, developing models like GreaseLM that can integrate multiple knowledge modalities will be crucial for achieving more comprehensive AI systems capable of deep and flexible reasoning.

Authors (7)
  1. Xikun Zhang (23 papers)
  2. Antoine Bosselut (85 papers)
  3. Michihiro Yasunaga (48 papers)
  4. Hongyu Ren (31 papers)
  5. Percy Liang (239 papers)
  6. Christopher D. Manning (169 papers)
  7. Jure Leskovec (233 papers)
Citations (188)