
Efficient Causal Graph Discovery Using Large Language Models (2402.01207v4)

Published 2 Feb 2024 in cs.LG, cs.AI, and stat.ME

Abstract: We propose a novel framework that leverages LLMs for full causal graph discovery. While previous LLM-based methods have used a pairwise query approach, this requires a quadratic number of queries which quickly becomes impractical for larger causal graphs. In contrast, the proposed framework uses a breadth-first search (BFS) approach which allows it to use only a linear number of queries. We also show that the proposed method can easily incorporate observational data when available, to improve performance. In addition to being more time and data-efficient, the proposed framework achieves state-of-the-art results on real-world causal graphs of varying sizes. The results demonstrate the effectiveness and efficiency of the proposed method in discovering causal relationships, showcasing its potential for broad applicability in causal graph discovery tasks across different domains.

References (28)
  1. DAGMA: Learning DAGs via M-matrices and a log-determinant acyclicity characterization, 2023.
  2. LMPriors: Pre-trained language models as task-specific priors. arXiv preprint arXiv:2210.12530, 2022.
  3. Large language models are not strong abstract reasoners. arXiv preprint arXiv:2305.19555, 2023.
  4. MathPrompter: Mathematical reasoning using large language models. Annual Meeting of the Association for Computational Linguistics, 2023. doi: 10.48550/arXiv.2303.05398.
  5. Causal reasoning and large language models: Opening a new frontier for causality. arXiv preprint arXiv:2305.00050, 2023.
  6. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society. Series B (Methodological), 50(2):157–224, 1988. ISSN 00359246. URL http://www.jstor.org/stable/2345762.
  7. Large language models as counterfactual generator: Strengths and weaknesses. arXiv preprint arXiv:2305.14791, 2023.
  8. Causal discovery with language models as imperfect experts, 2023a.
  9. Can large language models build causal graphs? arXiv preprint arXiv:2303.05279, 2023b.
  10. Meek, C. Graphical Models: Selecting causal and statistical models. April 2023. doi: 10.1184/R1/22696393.v1. URL https://kilthub.cmu.edu/articles/thesis/Graphical_Models_Selecting_causal_and_statistical_models/22696393.
  11. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
  12. Pearl, J. Causality. Cambridge University Press, Cambridge, UK, 2nd edition, 2009. ISBN 978-0-521-89560-6. doi: 10.1017/CBO9780511803161.
  13. Elements of Causal Inference: Foundations and Learning Algorithms. The MIT Press, 2017. ISBN 0262037319.
  14. Code Llama: Open foundation models for code. arXiv preprint arXiv:2308.12950, 2023.
  15. Schwarz, G. Estimating the dimension of a model. The Annals of Statistics, 6(2):461–464, 1978. ISSN 00905364. URL http://www.jstor.org/stable/2958889.
  16. Scutari, M. Learning Bayesian networks with the bnlearn R package. Journal of Statistical Software, 35(3):1–22, 2010. doi: 10.18637/jss.v035.i03.
  17. Bayesian analysis in expert systems. Statistical Science, 8(3):219–247, 1993. ISSN 08834237. URL http://www.jstor.org/stable/2245959.
  18. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9(1):62–72, 1991. doi: 10.1177/089443939100900106. URL https://doi.org/10.1177/089443939100900106.
  19. Causation, Prediction, and Search, volume 81. Springer, 1993. ISBN 978-1-4612-7650-0. doi: 10.1007/978-1-4612-2748-9.
  20. Neuropathic pain diagnosis simulator for causal discovery algorithm evaluation. Neural Information Processing Systems, 2019.
  21. PINTO: Faithful language reasoning using prompt-generated rationales. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=WBXbRs63oVu.
  22. Chain-of-thought prompting elicits reasoning in large language models. arXiv preprint arXiv:2201.11903, 2022.
  23. LLMs and the Abstraction and Reasoning Corpus: Successes, failures, and the importance of object-based representations. arXiv preprint arXiv:2305.18354, 2023.
  24. Tree of Thoughts: Deliberate problem solving with large language models, 2023.
  25. A survey on causal discovery: Theory and practice, 2023.
  26. Large language models as commonsense knowledge for large-scale task planning. arXiv preprint arXiv:2305.14078, 2023.
  27. DAGs with NO TEARS: Continuous optimization for structure learning, 2018.
  28. Causal-learn: Causal discovery in Python. arXiv preprint arXiv:2307.16405, 2023.

Summary

  • The paper introduces a BFS method leveraging LLMs to reduce query complexity from quadratic to linear for causal graph discovery.
  • It employs a three-stage process—initialization, expansion, and insertion—to construct directed acyclic graphs using domain expertise.
  • Experimental results on diverse causal graphs demonstrate high F-scores and low Normalized Hamming Distances, validating its robustness.

Efficient Causal Graph Discovery Using LLMs

The paper presents a framework that leverages LLMs for the discovery of full causal graphs. The authors address a significant limitation of prior LLM-based causal discovery methods by replacing the pairwise query approach, which requires a number of queries quadratic in the number of variables, with a breadth-first search (BFS) strategy that needs only a linear number of queries.
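To make the scaling gap concrete, the toy calculation below (an illustrative sketch, not code from the paper) counts queries for both strategies under the simplifying assumption of one query per unordered variable pair for the pairwise approach and roughly one query per variable for the BFS approach.

```python
def pairwise_queries(n: int) -> int:
    """Queries needed if every unordered pair of variables is asked about."""
    return n * (n - 1) // 2


def bfs_queries(n: int) -> int:
    """Order-of-magnitude query count for a BFS-style traversal: O(n)."""
    return n


for n in (8, 20, 50, 200):
    print(f"n={n:3d}  pairwise={pairwise_queries(n):6d}  bfs~{bfs_queries(n):4d}")
```

Even at 50 variables the pairwise approach already needs over a thousand queries, while a node-by-node traversal stays in the tens.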

Methodological Insights

The proposed framework is distinct in its approach, employing LLMs to perform causal discovery without relying on numerical observational data. This process aligns more closely with how human experts apply domain knowledge in causal reasoning. Particularly notable is the use of a BFS strategy for causal graph construction, which ensures the result remains a directed acyclic graph (DAG) [Pearl09]. The strategy involves three stages:

  1. Initialization: A specially crafted prompt asks the LLM to identify the variables that are not affected by any other variable; these serve as the starting nodes of the search.
  2. Expansion: For each node visited during the BFS traversal, the LLM is prompted to propose the variables that node causally influences.
  3. Insertion: Each proposed causal relation is checked so that it does not introduce a cycle before it is inserted into the growing graph; a sketch of the full loop follows this list.
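The following is a minimal sketch of how these three stages could fit together. It is not the authors' implementation: the callbacks `query_llm_for_roots` and `query_llm_for_children` are hypothetical placeholders for the paper's prompts, and `networkx` is used only for the acyclicity check.

```python
from collections import deque

import networkx as nx  # used here only for the DAG (cycle) check


def discover_causal_graph(variables, query_llm_for_roots, query_llm_for_children):
    """Hypothetical sketch of the BFS-based discovery loop described above.

    query_llm_for_roots(variables) -> iterable of variables the LLM judges to be
        unaffected by any other variable (initialization stage).
    query_llm_for_children(node, variables) -> iterable of variables the LLM
        judges to be directly caused by `node` (expansion stage).
    """
    graph = nx.DiGraph()
    graph.add_nodes_from(variables)

    # 1. Initialization: seed the BFS frontier with the LLM's root candidates.
    roots = list(query_llm_for_roots(variables))
    frontier = deque(roots)
    visited = set(roots)

    while frontier:
        node = frontier.popleft()

        # 2. Expansion: ask the LLM which variables this node causally influences.
        for child in query_llm_for_children(node, variables):
            if child == node:
                continue

            # 3. Insertion: keep the edge only if it does not create a cycle,
            #    so the growing graph stays a DAG.
            graph.add_edge(node, child)
            if not nx.is_directed_acyclic_graph(graph):
                graph.remove_edge(node, child)
                continue

            if child not in visited:
                visited.add(child)
                frontier.append(child)

    return graph
```

Because each LLM call covers one node rather than one pair of variables, the number of queries grows linearly with the number of variables reached by the traversal.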

Observational data, when available, is integrated into the framework as a supplementary signal that improves performance, but it is not required for the method to operate.
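As one illustration of how observational data might serve as such a supplementary check (the paper's exact mechanism may differ), an LLM-proposed edge can be cross-validated against the data before insertion; the marginal correlation test below is a deliberately crude placeholder for the conditional independence tests or score-based criteria a full system would use.

```python
from scipy.stats import pearsonr


def edge_supported_by_data(data, x, y, alpha=0.05):
    """Keep an LLM-proposed edge x -> y only if x and y are correlated in the
    observational data. `data` maps variable names to 1-D numeric arrays.
    This marginal test is a stand-in, not the paper's actual procedure.
    """
    _, p_value = pearsonr(data[x], data[y])
    return p_value < alpha
```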

Experimental Validation

The framework's efficacy is demonstrated across causal graphs of varying scales: the small Asia graph, the medium-sized Child graph, and the large Neuropathic Pain graph. On smaller graphs, the method outperforms or matches both statistical methods (GES, PC, NOTEARS, and DAGMA) and pairwise LLM-based methods. The Neuropathic Pain graph, due to its size and complexity, presents a unique challenge, illustrating the framework's robustness where other methods fall short.

Quantitatively, the proposed BFS method with LLMs significantly outperforms its competitors, reflected in high F-scores and low Normalized Hamming Distance (NHD) ratios. The computational efficiency of the BFS approach also allows exploration of larger causal structures, which are otherwise impractical for pairwise querying due to the quadratic growth in the number of queries.
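For reference, both evaluation metrics can be computed directly from binary adjacency matrices. The sketch below assumes the common convention of normalizing the Hamming distance by the total number of matrix entries, which may differ from the paper's exact definition.

```python
import numpy as np


def f_score(pred, true):
    """F1 score over directed edges; `pred` and `true` are 0/1 adjacency matrices."""
    tp = np.sum((pred == 1) & (true == 1))
    fp = np.sum((pred == 1) & (true == 0))
    fn = np.sum((pred == 0) & (true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def normalized_hamming_distance(pred, true):
    """Fraction of adjacency-matrix entries where prediction and ground truth disagree."""
    n = true.shape[0]
    return np.sum(pred != true) / (n * n)
```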

Practical and Theoretical Implications

The proposed method opens new avenues for efficient causal graph discovery without the need for exhaustive observational datasets. This is particularly relevant in domains where data collection is costly or impractical. It exemplifies how advancements in LLMs can be harnessed for complex reasoning tasks in causal inference, suggesting broad applicability across varying domains, including medicine, biology, and social sciences.

From a theoretical standpoint, the paper underscores the need for integrative approaches in AI research—combining the interpretative power of LLMs with efficient algorithmic strategies to solve traditionally hard problems in machine learning.

Future Directions

Future work may focus on expanding this framework by fusing traditional statistical methods with LLM capabilities, thereby harnessing both structured data and contextual knowledge. Additionally, exploring the effects of different LLM architectures and scales on causal discovery efficacy could provide deeper insights into model capabilities. Advanced prompting techniques, such as Tree of Thoughts, present promising opportunities for further refinement.

Overall, this paper contributes a significant methodological advancement in causal graph discovery, revealing potential pathways for future research and application in artificial intelligence. The implications of deploying such efficient frameworks are profound, especially in promoting understanding and innovation in complex systems where causal reasoning is essential.