
LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers (2310.15164v2)

Published 23 Oct 2023 in cs.CL and cs.AI

Abstract: Logical reasoning, i.e., deductively inferring the truth value of a conclusion from a set of premises, is an important task for artificial intelligence with wide potential impacts on science, mathematics, and society. While many prompting-based strategies have been proposed to enable LLMs to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation. In LINC, the LLM acts as a semantic parser, translating premises and conclusions from natural language to expressions in first-order logic. These expressions are then offloaded to an external theorem prover, which symbolically performs deductive inference. Leveraging this approach, we observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate. On ProofWriter, augmenting the comparatively small open-source StarCoder+ (15.5B parameters) with LINC even outperforms GPT-3.5 and GPT-4 with Chain-of-Thought (CoT) prompting by an absolute 38% and 10%, respectively. When used with GPT-4, LINC scores 26% higher than CoT on ProofWriter while performing comparatively on FOLIO. Further analysis reveals that although both methods on average succeed roughly equally often on this dataset, they exhibit distinct and complementary failure modes. We thus provide promising evidence for how logical reasoning over natural language can be tackled through jointly leveraging LLMs alongside symbolic provers. All corresponding code is publicly available at https://github.com/benlipkin/linc

LINC: A Neurosymbolic Approach to Logical Reasoning

The paper addresses logical reasoning in artificial intelligence through a novel method named LINC, short for Logical Inference via Neurosymbolic Computation. The authors propose combining LLMs with first-order logic provers to enhance deductive reasoning capabilities. LINC sidesteps known weaknesses of LLMs in logical reasoning, offering a perspective that blends neural and symbolic computation.

Core Contributions and Methodology

LINC addresses the task of deductively inferring the truth value of a conclusion from a set of premises. It employs a two-step process in which an LLM is coupled with an external theorem prover: the LLM converts natural language premises and conclusions into first-order logic expressions, which the theorem prover then evaluates deductively. This approach delegates the complex and often unreliable reasoning traditionally assigned to LLMs to a logic solver, while the LLM focuses solely on semantic parsing.
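The two-step pipeline can be sketched as follows. This is a minimal, hypothetical illustration: the LLM semantic-parsing step is stubbed with a hard-coded translation, and a toy ground forward-chaining routine stands in for the external first-order theorem prover the paper actually uses. None of the function names below come from the authors' code.

```python
def llm_translate(nl_premises, nl_conclusion):
    """Stand-in for the LLM acting as a semantic parser (NL -> logic).

    Hypothetical translation of:
      "All humans are mortal. Socrates is a human." |- "Socrates is mortal."
    """
    rules = [({"human(socrates)"}, "mortal(socrates)")]  # body -> head
    facts = {"human(socrates)"}
    goal = "mortal(socrates)"
    return facts, rules, goal

def forward_chain(facts, rules):
    """Tiny ground forward-chaining 'prover' (stands in for a real FOL prover)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

def linc(nl_premises, nl_conclusion):
    """Pipeline: semantic parsing by the 'LLM', then symbolic deduction."""
    facts, rules, goal = llm_translate(nl_premises, nl_conclusion)
    return "True" if goal in forward_chain(facts, rules) else "Uncertain"

print(linc(["All humans are mortal.", "Socrates is a human."],
           "Socrates is mortal."))  # -> True
```

The design point this sketch captures is the separation of concerns: the neural component only produces formulas, and every inference step is carried out symbolically, so the final verdict is auditable.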

The methodology is evaluated on two datasets, FOLIO, a manually curated dataset, and ProofWriter, a synthetically generated one. Both datasets present challenging logical inference tasks. LINC demonstrated substantial performance improvements compared to purely LLM-based methods such as Chain-of-Thought (CoT) prompting, particularly on the ProofWriter dataset, which includes more extensive and complex premises.

Experimental Findings

The experimental results show LINC outperforming prompting-based approaches such as CoT in nearly all conditions tested. Notably, StarCoder+, a comparatively small open-source model (15.5B parameters), when augmented with LINC, outperformed GPT-3.5 and GPT-4 with CoT on ProofWriter by an absolute 38% and 10%, respectively. The neurosymbolic approach also exhibited failure modes distinct from and complementary to those of CoT with LLMs like GPT-4, suggesting potential for synergy between the two.

Interestingly, LINC's partitioning of semantic translation and deductive reasoning allows it to maintain high precision in logical inference: the prover never asserts a conclusion it cannot formally derive. Recall suffers, however, when parsing complex semantic relations into FOL loses information, leaving the prover unable to derive conclusions that do follow from the original premises. This trade-off between semantic fidelity and symbolic rigor is an essential consideration for future development.

Theoretical and Practical Implications

This paper highlights several implications for both the theoretical understanding and practical application of AI in deductive reasoning. The neurosymbolic approach underscores the potential for integrating LLMs with symbolic reasoning systems, a method that could mitigate some traditional challenges in natural language understanding and reasoning.

From a practical standpoint, the added reliability and transparency offered by LINC could benefit AI applications requiring logical consistency, such as automated theorem proving, conversational agents, and educational technologies. This separation-of-concerns strategy positions LINC to advance both the precision and scalability of AI systems performing complex reasoning tasks.

Future Directions

Given the promising results, future research could explore extensions to other logical frameworks beyond first-order logic, expanding the model's relevance and applicability. Research might also consider enhancing the semantic parsing capabilities of LLMs to improve recall rates, potentially through techniques such as back-translation or refining intermediate representations.

Moreover, the broader neurosymbolic paradigm invites exploration into diverse domains that benefit from rigorous reasoning abilities combined with the pattern recognition strengths of LLMs. Through such initiatives, AI could achieve new levels of robustness and interpretability in tasks previously dominated by purely statistical or handcrafted symbolic approaches.

In summary, the authors' contribution through LINC offers a compelling perspective on converging neural and symbolic computation, marking a significant step toward enriched logical reasoning abilities in AI systems.

Authors (7)
  1. Theo X. Olausson (5 papers)
  2. Alex Gu (20 papers)
  3. Benjamin Lipkin (5 papers)
  4. Cedegao E. Zhang (8 papers)
  5. Armando Solar-Lezama (65 papers)
  6. Joshua B. Tenenbaum (257 papers)
  7. Roger Levy (43 papers)
Citations (69)