Knowledge Graph Structure as Prompt: Enhancing Small LLMs for Knowledge-based Causal Discovery
The paper "Knowledge Graph Structure as Prompt: Improving Small LLMs Capabilities for Knowledge-based Causal Discovery" by Yuni Susanti and Michael Färber presents a novel approach designed to augment the performance of Small LLMs (SLMs) in knowledge-based causal discovery tasks. The authors propose a method known as "KG Structure as Prompt," which incorporates structural information from knowledge graphs (KGs) into prompt-based learning to enhance the reasoning capabilities of SLMs.
Introduction to the Task
Causal discovery aims to uncover causal relationships between variables using observational data, resulting in a causal graph where nodes represent variables and edges represent causal relationships. Traditional methods, such as covariance-based causal discovery, infer these relationships based on data values. Recent advancements in LLMs have introduced metadata-based approaches, focusing on variables' metadata rather than their data values for causal reasoning. This paper extends this concept to SLMs, defined as LLMs with fewer than 1 billion parameters.
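To make the metadata-based formulation concrete, below is a minimal sketch of how a variable pair and its textual context could be framed as a classification input; the class and function names are illustrative assumptions, not from the paper.

```python
# Minimal sketch of the metadata-based formulation: causal discovery as
# pairwise classification over variable metadata rather than data values.
# All names here are illustrative, not taken from the paper.

from dataclasses import dataclass

@dataclass
class VariablePair:
    head: str      # e.g., a gene name
    tail: str      # e.g., a disease name
    context: str   # text mentioning both variables

def to_classification_input(pair: VariablePair) -> str:
    """Frame the pair as a text input; a model then predicts whether
    a causal relation holds (1) or not (0)."""
    return f"{pair.context} Does {pair.head} cause {pair.tail}?"

example = VariablePair(
    "BRCA1", "breast cancer",
    "Mutations in BRCA1 are frequently observed in patients.",
)
print(to_classification_input(example))
```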
Methodology
The authors present a structured methodology leveraging KGs such as Wikidata and Hetionet. The innovative aspect of their approach lies in transforming KG structural information into natural language prompts that can be understood and processed by SLMs. They explore three types of KG structural information: neighbor nodes, common neighbor nodes, and metapaths. Each type of structural information offers a different dimension of relational context that can aid SLMs in causal inference.
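As an illustration (not the authors' code), the sketch below extracts the three structure types from a toy graph using networkx as a stand-in for a real KG backend such as Wikidata or Hetionet. Note that metapaths in Hetionet are typed node/edge sequences; this untyped simplification ignores that.

```python
# Toy extraction of the three KG structure types (illustrative only).

import networkx as nx

kg = nx.Graph()
kg.add_edges_from([
    ("BRCA1", "DNA repair"),
    ("BRCA1", "breast cancer"),
    ("DNA repair", "breast cancer"),
    ("TP53", "breast cancer"),
])

def neighbor_nodes(g, node):
    """Nodes directly connected to `node`."""
    return set(g.neighbors(node))

def common_neighbor_nodes(g, u, v):
    """Nodes connected to both `u` and `v`."""
    return set(g.neighbors(u)) & set(g.neighbors(v))

def metapaths(g, u, v, cutoff=3):
    """Simple paths between `u` and `v`, bounded by `cutoff` hops.
    (Real metapaths are typed; this treats them as plain paths.)"""
    return list(nx.all_simple_paths(g, u, v, cutoff=cutoff))

print(neighbor_nodes(kg, "BRCA1"))
print(common_neighbor_nodes(kg, "BRCA1", "TP53"))
print(metapaths(kg, "BRCA1", "breast cancer"))
```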
Prompt Design
The design of the prompt is integral to the success of this method. The authors integrate KG-derived context into the prompt-based learning framework: a prompt combines the input text sequence, the KG-derived graph context, and the target variable pair, together with few-shot examples and task-specific instructions. This multi-faceted prompt design enables SLMs to leverage external knowledge effectively, thereby enhancing their inference capabilities.
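A hedged sketch of how such a prompt could be assembled from the ingredients named above; the template wording and helper names are assumptions for illustration, and the paper's exact templates may differ.

```python
# Illustrative prompt assembly: input text + verbalized KG context +
# target variable pair. Template wording is assumed, not the paper's.

def verbalize_graph_context(common_neighbors: set[str]) -> str:
    """Turn a set of common neighbor nodes into a natural-language hint."""
    if not common_neighbors:
        return ""
    return "Both terms are connected to: " + ", ".join(sorted(common_neighbors)) + "."

def build_prompt(text: str, head: str, tail: str, graph_context: str) -> str:
    return (
        f"{text}\n"
        f"{graph_context}\n"
        f"Question: does {head} cause {tail}? The answer is [MASK]."
    )

prompt = build_prompt(
    "Mutations in BRCA1 are frequently observed in breast cancer patients.",
    "BRCA1", "breast cancer",
    verbalize_graph_context({"DNA repair", "tumor suppression"}),
)
print(prompt)
```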
Experimental Framework
The paper evaluates the proposed method on three biomedical datasets (GENEC, DDI, COMAGC) and an open-domain dataset (SemEval-2010 Task 8). The experiments compare the proposed approach with several baselines, including traditional fine-tuning, prompt tuning without graph context, and in-context learning (ICL) with a much larger model (GPT-3.5-turbo). The evaluation is conducted under few-shot settings using precision, recall, and F1 score.
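For reference, the reported metrics can be computed with scikit-learn as below; the labels and predictions here are dummy placeholders, not results from the paper.

```python
# Computing precision, recall, and F1 for pairwise causal predictions.

from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0]   # gold causal labels for variable pairs (dummy)
y_pred = [1, 0, 1, 0, 0]   # model predictions (dummy)

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```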
Results
The experimental results are compelling, showcasing the effectiveness of KG Structure as Prompt:
- Performance Improvement: The proposed method consistently outperformed baselines without graph context, achieving up to a 15.1-point increase in F1 score on the biomedical datasets and a 6.8-point improvement on the open-domain dataset.
- Comparison with Full Training: Even with limited training samples, the proposed approach achieved performance close to, and sometimes surpassing, models trained on full datasets.
- SLMs vs. LLMs: The proposed approach demonstrated that SLMs, when combined with prompt-based learning and KGs, can surpass larger LLMs like GPT-3.5-turbo in causal discovery tasks.
Discussion
The paper presents several insightful findings:
- Structural Information: Metapaths generally provided the best performance among the different types of KG structures, although the effectiveness varied with the dataset's characteristics.
- Model Architecture: Models with a masked language model (MLM) architecture typically performed best on these classification tasks, followed by sequence-to-sequence (Seq2SeqLM) and causal language model (CLM) architectures; a minimal sketch of MLM-style prompting follows this list.
- KG Selection: Domain-specific KGs like Hetionet generally provided better results for biomedical datasets compared to general-domain KGs like Wikidata.
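As promised above, here is a minimal sketch of MLM-style prompt classification, the architecture family reported to perform best: a masked token is scored against a yes/no verbalizer. The model choice and verbalizer words are assumptions for illustration, not the paper's exact setup.

```python
# Scoring a causal question with an MLM via a yes/no verbalizer.
# Model and verbalizer are illustrative assumptions.

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

prompt = (
    "Mutations in BRCA1 are frequently observed in breast cancer patients. "
    "Does BRCA1 cause breast cancer? [MASK]."
)

# Score only the two verbalizer tokens and pick the higher-scoring one.
results = fill_mask(prompt, targets=["yes", "no"])
best = max(results, key=lambda r: r["score"])
label = "causal" if best["token_str"].strip() == "yes" else "not causal"
print(label, best["score"])
```

Restricting the fill-mask scores to the verbalizer tokens, rather than taking the model's top prediction over the whole vocabulary, is what turns the MLM head into a classifier without any extra parameters.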
Implications and Future Work
The findings underscore the potential of integrating external knowledge from KGs to enhance the capabilities of SLMs in specialized tasks such as causal discovery. The implications are significant, suggesting that SLMs, with appropriate contextual enhancements, can achieve high performance levels traditionally associated with more resource-intensive LLMs. Future research could extend this approach to more complex causal graphs involving multiple interconnected variables, further enriching the understanding of causal relationships.
In conclusion, "KG Structure as Prompt" offers a robust and flexible framework for leveraging knowledge graphs to augment the reasoning capabilities of Small LLMs, setting a new direction for efficient and cost-effective AI models in causal inference.