
Causal Order: The Key to Leveraging Imperfect Experts in Causal Inference (2310.15117v2)

Published 23 Oct 2023 in cs.AI and cs.CL

Abstract: LLMs have been used as experts to infer causal graphs, often by repeatedly applying a pairwise prompt that asks about the causal relationship of each variable pair. However, such experts, including human domain experts, cannot distinguish between direct and indirect effects given a pairwise prompt. Therefore, instead of the graph, we propose that causal order be used as a more stable output interface for utilizing expert knowledge. Even when querying a perfect expert with a pairwise prompt, we show that the inferred graph can have significant errors whereas the causal order is always correct. In practice, however, LLMs are imperfect experts and we find that pairwise prompts lead to multiple cycles. Hence, we propose the triplet method, a novel querying strategy that introduces an auxiliary variable for every variable pair and instructs the LLM to avoid cycles within this triplet. It then uses a voting-based ensemble method that results in higher accuracy and fewer cycles while ensuring cost efficiency. Across multiple real-world graphs, such a triplet-based method yields a more accurate order than the pairwise prompt, using both LLMs and human annotators. The triplet method enhances robustness by repeatedly querying an expert with different auxiliary variables, enabling smaller models like Phi-3 and Llama-3 8B Instruct to surpass GPT-4 with pairwise prompting. For practical usage, we show how the expert-provided causal order from the triplet method can be used to reduce error in downstream graph discovery and effect inference tasks.

Citations (31)

Summary

  • The paper demonstrates that deriving a topological order is sufficient for accurate causal inference, eliminating the need for complete causal graphs.
  • The study leverages LLMs as virtual domain experts with robust prompting techniques to reliably elicit causal order from complex datasets.
  • The integration of LLM outputs with traditional constraint-based and score-based methods leads to significant improvements in causal ordering accuracy.

An Expert Overview of "Causal Inference Using LLM-Guided Discovery"

In the field of causal inference, deriving reliable causal graphs from observational data is a significant challenge. Traditional methods hinge on constructing a causal graph, and when that graph is erroneous, it can obscure true causal relationships and lead to flawed conclusions. The paper "Causal Inference Using LLM-Guided Discovery" introduces a paradigm that shifts the focus from recovering the complete causal graph to obtaining a topological order over the graph's variables. This topological order proves sufficient for causal effect inference via the backdoor criterion, eliminating the need to specify the full graph structure.
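To make the sufficiency claim concrete: given a correct causal (topological) order, every variable preceding the treatment is a non-descendant of it and the set of predecessors contains all of the treatment's parents, so the predecessors form a valid backdoor adjustment set. The sketch below illustrates this with hypothetical variable names (the function and example are illustrative, not from the paper).

```python
def backdoor_set_from_order(order, treatment):
    """Return a valid backdoor adjustment set given a correct causal order.

    All variables preceding the treatment in a topological order are
    non-descendants of the treatment and include all of its parents,
    so adjusting for them satisfies the backdoor criterion.
    """
    idx = order.index(treatment)
    return set(order[:idx])

# Hypothetical causal order for a classic smoking example.
order = ["smoking_gene", "smoking", "tar", "cancer"]
print(backdoor_set_from_order(order, "smoking"))  # {'smoking_gene'}
```

Note that this set may be larger than necessary, but validity (not minimality) is what matters for identification.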

The authors propose a methodology in which LLMs such as GPT-3.5-turbo and GPT-4 serve as virtual domain experts for determining the causal order. This approach capitalizes on the relative ease of eliciting causal order from domain experts, in contrast to specifying every graph edge, which can require intricate knowledge of dependencies among many variables. This insight significantly simplifies the causal inference pipeline, and it is borne out by the authors' empirical investigations.
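The abstract's triplet method can be sketched as follows: for a pair (A, B), the expert is queried once per auxiliary variable C, asked to orient the triplet without cycles, and the A-B direction is decided by majority vote. The code below is a minimal illustration with a stubbed expert standing in for an LLM call; variable names and the stub are assumptions, not the paper's implementation.

```python
from collections import Counter

def triplet_vote(a, b, variables, query_expert):
    """Decide the direction between a and b by majority vote over
    triplet queries, one per auxiliary variable c.

    `query_expert(a, b, c)` stands in for an LLM prompt that asks the
    expert to orient the triplet (a, b, c) while avoiding cycles, and
    returns the resulting a-b direction as a string.
    """
    votes = [query_expert(a, b, c) for c in variables if c not in (a, b)]
    return Counter(votes).most_common(1)[0][0]

# Stub expert for illustration: always answers a -> b.
def stub_expert(a, b, c):
    return f"{a}->{b}"

variables = ["altitude", "temperature", "latitude", "season"]
print(triplet_vote("altitude", "temperature", variables, stub_expert))
# altitude->temperature
```

Repeating the query with different auxiliary variables is what gives the ensemble its robustness: a single noisy answer is outvoted rather than propagated into the final order.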

A key contribution of the paper is the development of techniques to integrate LLM outputs with existing causal discovery algorithms, namely constraint-based and score-based methods. The authors designed robust prompting strategies and contextual cues to derive causal order from LLMs reliably. Through extensive experimentation, the paper demonstrates how this integration can substantially enhance causal ordering accuracy over standard discovery algorithms alone. The findings underscore the potential of LLMs to facilitate accurate and efficient causal inference across various domains.

Methodological Insights

  1. Causal Order Sufficiency: The authors establish that a topological order among graph variables suffices for determining causal effects from observational data, as opposed to requiring a fully specified graph. This result is underpinned by their theoretical proof that an accurate causal order alone enables correct identification of the backdoor adjustment set needed for causal effect computation.
  2. LLMs as Virtual Experts: By employing LLMs to ascertain causal order, the paper presents a novel use case of AI in automating domain expert tasks. The prompting strategies are meticulously crafted to maximize the LLMs' ability to discern causal relationships.
  3. Algorithmic Integration: Supplementary algorithms are developed to combine LLM-inferred causal orders with traditional graph discovery methods. For instance, the causal order derived from LLMs aids in post-hoc orientation of edges in the constraint-based PC algorithm's output, yielding more precise causal graphs.
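The post-hoc orientation step in item 3 can be sketched simply: any edge left undirected in the PC skeleton is oriented from the variable that appears earlier in the expert-provided order to the one that appears later. The helper and example graph below are illustrative assumptions, not the paper's code.

```python
def orient_edges(undirected_edges, causal_order):
    """Orient each undirected edge from the earlier variable to the
    later variable in the expert-provided causal order."""
    pos = {v: i for i, v in enumerate(causal_order)}
    return [(u, v) if pos[u] < pos[v] else (v, u)
            for u, v in undirected_edges]

# Hypothetical sprinkler example: PC leaves these edges undirected.
order = ["cloudy", "rain", "sprinkler", "wet_grass"]
edges = [("rain", "cloudy"), ("wet_grass", "rain"), ("sprinkler", "wet_grass")]
print(orient_edges(edges, order))
# [('cloudy', 'rain'), ('rain', 'wet_grass'), ('sprinkler', 'wet_grass')]
```

Because the expert supplies only an order, this step never introduces a cycle: every edge points forward in the order by construction.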

Numerical Results and Implications

The paper's experimental results validate the proposed approach, showing substantial improvements in causal ordering accuracy across benchmark datasets compared to baseline graphs derived using discovery algorithms without LLM-aided enhancements. This suggests that LLMs can play a transformative role in the practice of causal inference, particularly in contexts with limited data or complex variable interactions.

Future Directions

The theoretical and practical advancements presented in this research suggest several exciting avenues for future exploration. Expanding this framework to handle larger and more diverse datasets could further demonstrate its scalability and applicability. Additionally, refining LLM prompting techniques and exploring integrations with other forms of causal reasoning models may yield richer insights and enhance robustness.

In summary, "Causal Inference Using LLM-Guided Discovery" offers a compelling vision of how contemporary AI tools can complement and enhance traditional domain-specific tasks in causal inference. By simplifying the requirements for effective causal discovery, this approach not only bolsters the precision of scientific investigations but also broadens access to causal inference tools across disciplines with varying degrees of methodological expertise.
