Leveraging LLMs for Hypothetical Deduction in Logical Inference: A Neuro-Symbolic Approach (2410.21779v1)

Published 29 Oct 2024 in cs.CL

Abstract: LLMs have exhibited remarkable potential across a wide array of reasoning tasks, including logical reasoning. Although massive efforts have been made to empower the logical reasoning ability of LLMs via external symbolic solvers, two crucial challenges of solver-driven approaches remain unresolved: poor generalization to questions with different features and inevitable loss of question information. To mitigate these issues, we introduce LINA, an LLM-driven neuro-symbolic approach for faithful logical reasoning. By enabling an LLM to autonomously perform the transition from propositional logic extraction to sophisticated logical reasoning, LINA not only bolsters the resilience of the reasoning process but also eliminates the dependency on external solvers. Additionally, by adopting a hypothetical-deductive reasoning paradigm, LINA effectively circumvents the expansive search space that plagues traditional forward reasoning methods. Empirical evaluations demonstrate that LINA substantially outperforms both established propositional logic frameworks and conventional prompting techniques across five logical reasoning tasks. Specifically, LINA achieves an improvement of 24.34% over LINC on the FOLIO dataset, while also surpassing prompting strategies such as CoT and CoT-SC by up to 24.02%. Our code is available at https://github.com/wufeiwuwoshihua/nshy.
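
The pipeline the abstract describes (extract propositions from the question, then test candidate answers as hypotheses rather than searching forward from the premises) can be illustrated with a short sketch. This is a minimal illustration, not the LINA implementation from the linked repository; the prompts, the generic llm callable, and every helper name below are assumptions made for illustration only.

# Minimal sketch of LLM-driven hypothetical-deductive reasoning, in the spirit of the
# abstract above. NOT the authors' code; prompts and names are illustrative assumptions.
from typing import Callable, List

def extract_propositions(llm: Callable[[str], str], context: str) -> List[str]:
    """Ask the model to restate the passage as explicit propositional facts and rules."""
    reply = llm(
        "Rewrite the following passage as a numbered list of logical propositions "
        "(facts and rules), one per line:\n" + context
    )
    return [line.strip() for line in reply.splitlines() if line.strip()]

def hypothesis_entailed(llm: Callable[[str], str], propositions: List[str], hypothesis: str) -> bool:
    """Assume one candidate answer as a hypothesis and ask the model whether the
    extracted propositions entail it."""
    prompt = (
        "Premises:\n" + "\n".join(propositions)
        + "\n\nHypothesis: " + hypothesis
        + "\nAssume the hypothesis and reason step by step about whether the premises "
          "entail it. Answer 'True' or 'False' on the last line."
    )
    lines = llm(prompt).strip().splitlines()
    return bool(lines) and lines[-1].lower().startswith("true")

def hypothetical_deduction(llm: Callable[[str], str], context: str, options: List[str]) -> str:
    """Test each candidate answer as a hypothesis instead of searching forward from the
    premises, which is how the abstract motivates avoiding a large search space."""
    propositions = extract_propositions(llm, context)
    for option in options:
        if hypothesis_entailed(llm, propositions, option):
            return option
    return "Unknown"

Here llm stands for any text-in, text-out model call (an API wrapper, a local model, or a stub for testing); tie-breaking among multiple entailed options, contradiction handling, and the actual LINA prompting strategy are left to the paper and repository.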

References (44)
  1. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
  2. PaLM 2 technical report. arXiv preprint arXiv:2305.10403, 2023.
  3. Konstantine Arkoudas. GPT-4 can’t reason. arXiv preprint arXiv:2308.03762, 2023.
  4. LLMs with Chain-of-Thought Are Non-Causal Reasoners. CoRR, abs/2402.16048, 2024a. doi: 10.48550/ARXIV.2402.16048. URL https://doi.org/10.48550/arXiv.2402.16048. arXiv: 2402.16048.
  5. Abstract Meaning Representation-based logic-driven data augmentation for logical reasoning. In Lun-Wei Ku, Andre Martins, and Vivek Srikumar (eds.), Findings of the Association for Computational Linguistics ACL 2024, pp.  5914–5934, Bangkok, Thailand and virtual meeting, August 2024b. Association for Computational Linguistics. URL https://aclanthology.org/2024.findings-acl.353.
  6. Graph of Thoughts: Solving Elaborate Problems with Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16):17682–17690, March 2024a. ISSN 2374-3468. doi: 10.1609/aaai.v38i16.29720. URL https://ojs.aaai.org/index.php/AAAI/article/view/29720. Number: 16.
  7. Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts. CoRR, abs/2401.14295, 2024b. doi: 10.48550/ARXIV.2401.14295. URL https://doi.org/10.48550/arXiv.2401.14295. arXiv: 2401.14295.
  8. John P Burgess. Philosophical logic. Princeton University Press, 2009.
  9. Premise Order Matters in Reasoning with Large Language Models. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024. URL https://openreview.net/forum?id=4zAHgkiCQg.
  10. Towards personalized evaluation of large language models with an anonymous crowd-sourcing platform. In Companion Proceedings of the ACM on Web Conference 2024, pp.  1035–1038, 2024.
  11. Transformers as soft reasoners over language. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, pp.  3882–3890, 2021.
  12. A divide-conquer-reasoning approach to consistency evaluation and improvement in blackbox large language models. In Socially Responsible Language Modelling Research, 2023. URL https://openreview.net/forum?id=WcGXAxhC81.
  13. FOLIO: Natural language reasoning with first-order logic. arXiv preprint arXiv:2209.00840, 2022.
  14. James Higginbotham. On higher-order logic and natural language. In Proceedings of the British Academy, volume 95, pp. 1–27, 1998.
  15. LAMBADA: Backward Chaining for Automated Reasoning in Natural Language. In Anna Rogers, Jordan L. Boyd-Graber, and Naoaki Okazaki (eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pp.  6547–6568. Association for Computational Linguistics, 2023. doi: 10.18653/V1/2023.ACL-LONG.361. URL https://doi.org/10.18653/v1/2023.acl-long.361.
  16. Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, 35:22199–22213, 2022.
  17. Measuring faithfulness in chain-of-thought reasoning. arXiv preprint arXiv:2307.13702, 2023.
  18. Evaluating the logical reasoning ability of ChatGPT and GPT-4. arXiv preprint arXiv:2304.03439, 2023.
  19. LogiQA: A challenge dataset for machine reading comprehension with logical reasoning. arXiv preprint arXiv:2007.08124, 2020.
  20. Faithful chain-of-thought reasoning. arXiv preprint arXiv:2301.13379, 2023.
  21. Some uses of higher-order logic in computational linguistics. In 24th Annual Meeting of the Association for Computational Linguistics, pp.  247–256, 1986.
  22. Beyond accuracy: Evaluating the reasoning behavior of large language models - A survey. CoRR, abs/2404.01869, 2024. doi: 10.48550/ARXIV.2404.01869. URL https://doi.org/10.48550/arXiv.2404.01869.
  23. Show your work: Scratchpads for intermediate computation with language models. arXiv preprint arXiv:2112.00114, 2021.
  24. LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers. In Houda Bouamor, Juan Pino, and Kalika Bali (eds.), Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, pp.  5153–5176. Association for Computational Linguistics, 2023. doi: 10.18653/V1/2023.EMNLP-MAIN.313. URL https://doi.org/10.18653/v1/2023.emnlp-main.313.
  25. Revisiting the solution of Meta KDD Cup 2024: CRAG. arXiv preprint arXiv:2409.15337, 2024.
  26. Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning. In Houda Bouamor, Juan Pino, and Kalika Bali (eds.), Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pp.  3806–3824. Association for Computational Linguistics, 2023. doi: 10.18653/V1/2023.FINDINGS-EMNLP.248. URL https://doi.org/10.18653/v1/2023.findings-emnlp.248.
  27. Graham Priest. An introduction to non-classical logic: From if to is. Cambridge University Press, 2008.
  28. Reasoning with Language Model Prompting: A Survey. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki (eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.  5368–5393, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.acl-long.294. URL https://aclanthology.org/2023.acl-long.294.
  29. To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning. 2024. URL https://arxiv.org/abs/2409.12183.
  30. A survey of reasoning with foundation models. CoRR, abs/2312.11562, 2023. doi: 10.48550/ARXIV.2312.11562. URL https://doi.org/10.48550/arXiv.2312.11562.
  31. ProofWriter: Generating implications, proofs, and abductive statements over natural language. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp. 3621–3634, 2021.
  32. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.
  33. A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models. CoRR, abs/2401.00757, 2024. doi: 10.48550/ARXIV.2401.00757. URL https://doi.org/10.48550/arXiv.2401.00757. arXiv: 2401.00757.
  34. Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (eds.), Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, May 22-27, 2022, pp.  1619–1629. Association for Computational Linguistics, 2022. doi: 10.18653/V1/2022.FINDINGS-ACL.127. URL https://doi.org/10.18653/v1/2022.findings-acl.127.
  35. Self-Consistency Improves Chain of Thought Reasoning in Language Models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL https://openreview.net/forum?id=1PL1NIMMrw.
  36. Chain-of-thought prompting elicits reasoning in large language models. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS ’22, pp.  24824–24837, Red Hook, NY, USA, April 2024. Curran Associates Inc. ISBN 978-1-71387-108-8.
  37. ReAct: Synergizing Reasoning and Acting in Language Models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL https://openreview.net/forum?id=WE_vluYUL-X.
  38. Tree of thoughts: deliberate problem solving with large language models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NIPS ’23, pp.  11809–11822, Red Hook, NY, USA, May 2024. Curran Associates Inc.
  39. SatLM: Satisfiability-Aided Language Models Using Declarative… URL https://openreview.net/forum?id=TqW5PL1Poi&noteId=OZMTWB3pzq.
  40. A knowledge-centric benchmarking framework and empirical study for retrieval-augmented generation. arXiv preprint arXiv:2409.13694, 2024.
  41. ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020. URL https://openreview.net/forum?id=HJgJtT4tvB.
  42. Cumulative Reasoning with Large Language Models. CoRR, abs/2308.04371, 2023. doi: 10.48550/ARXIV.2308.04371. URL https://doi.org/10.48550/arXiv.2308.04371. arXiv: 2308.04371.
  43. An examination on the effectiveness of divide-and-conquer prompting in large language models. 2024. URL https://arxiv.org/abs/2402.05359.
  44. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL https://openreview.net/forum?id=WZH7099tgfM.
Authors (7)
  1. Qingchuan Li (2 papers)
  2. Jiatong Li (47 papers)
  3. Tongxuan Liu (12 papers)
  4. Yuting Zeng (9 papers)
  5. Mingyue Cheng (45 papers)
  6. Weizhe Huang (8 papers)
  7. Qi Liu (485 papers)
