
Chain of Logic: Rule-Based Reasoning with Large Language Models (2402.10400v2)

Published 16 Feb 2024 in cs.CL
Abstract: Rule-based reasoning, a fundamental type of legal reasoning, enables us to draw conclusions by accurately applying a rule to a set of facts. We explore causal LLMs as rule-based reasoners, specifically with respect to compositional rules - rules consisting of multiple elements which form a complex logical expression. Reasoning about compositional rules is challenging because it requires multiple reasoning steps, and attending to the logical relationships between elements. We introduce a new prompting method, Chain of Logic, which elicits rule-based reasoning through decomposition (solving elements as independent threads of logic), and recomposition (recombining these sub-answers to resolve the underlying logical expression). This method was inspired by the IRAC (Issue, Rule, Application, Conclusion) framework, a sequential reasoning approach used by lawyers. We evaluate chain of logic across eight rule-based reasoning tasks involving three distinct compositional rules from the LegalBench benchmark and demonstrate it consistently outperforms other prompting methods, including chain of thought and self-ask, using open-source and commercial LLMs.

Enhancing Rule-Based Reasoning in LLMs with Chain of Logic

Introduction to Chain of Logic

This article examines advances in rule-based reasoning with large language models (LMs), viewed through the lens of legal reasoning. Rule-based reasoning over compositional rules is especially challenging because it requires resolving a complex logical expression across multiple reasoning steps. The paper introduces "Chain of Logic," a prompting method inspired by the IRAC (Issue, Rule, Application, Conclusion) framework used by lawyers, designed to help LMs navigate such reasoning tasks effectively.
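To make the decompose-recompose idea concrete, the sketch below assembles a chain-of-logic style prompt. This is an illustrative reconstruction, not the authors' exact template: the battery rule, the element names (`E1`-`E3`), and the facts are hypothetical and chosen only to show the structure (state the rule, answer each element independently, then resolve the logical expression).

```python
# Illustrative sketch of a chain-of-logic prompt (not the paper's exact
# template). The rule, facts, and element names below are hypothetical.

RULE = (
    "A person is liable for battery if (1) they act, "
    "(2) intending to cause harmful contact, and "
    "(3) harmful contact results."
)
FACTS = "Dana swung a bat at Sam, hoping to hit him, and struck his arm."

# Decomposition: each rule element becomes an independent yes/no question.
ELEMENTS = {
    "E1": "Did the person act?",
    "E2": "Did they intend to cause harmful contact?",
    "E3": "Did harmful contact result?",
}
# Recomposition target: the logical expression relating the elements.
EXPRESSION = "E1 AND E2 AND E3"

def build_prompt(rule: str, facts: str) -> str:
    """Assemble a decompose-then-recompose prompt for a causal LM."""
    lines = [f"Rule: {rule}", f"Facts: {facts}", "",
             "Answer each element independently (yes/no):"]
    for name, question in ELEMENTS.items():
        lines.append(f"{name}: {question}")
    lines.append(f"Then resolve the expression: {EXPRESSION}")
    lines.append("Final answer (yes/no):")
    return "\n".join(lines)

print(build_prompt(RULE, FACTS))
```

The key design point is that each element is posed as its own thread of logic before the model is asked to combine the sub-answers, rather than asking for a single holistic judgment.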

Evaluation: Benchmarks and Results

The evaluation encompassed eight rule-based reasoning tasks using the LegalBench benchmark, focusing on three distinct compositional rules. This methodology leverages a decompose-recompose approach, breaking down rules into manageable components before synthesizing the logical expression formed by these components. The chain of logic method is compared against existing prompting paradigms, such as chain of thought and self-ask, across a variety of LMs, including both open-source and commercial platforms.
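The recomposition step described above can be sketched as ordinary boolean evaluation: once each element has a yes/no sub-answer, the rule's logical expression determines the final conclusion. The element names and expression below are hypothetical examples, not drawn from LegalBench.

```python
# Sketch of the recomposition step: combine per-element sub-answers
# according to the rule's logical expression. Element names and the
# expression are hypothetical. Uses the ast module to evaluate the
# boolean expression safely (no eval of arbitrary code).
import ast

def recompose(expression: str, sub_answers: dict[str, bool]) -> bool:
    """Evaluate a boolean expression like '(E1 and E2) or E3' over
    the sub-answers produced for each rule element."""
    tree = ast.parse(expression, mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BoolOp):  # 'and' / 'or' over operands
            vals = [walk(v) for v in node.values]
            return all(vals) if isinstance(node.op, ast.And) else any(vals)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if isinstance(node, ast.Name):  # an element like 'E1'
            return sub_answers[node.id]
        raise ValueError(f"unsupported expression node: {node!r}")

    return walk(tree)

answers = {"E1": True, "E2": True, "E3": False}
print(recompose("(E1 and E2) and not E3", answers))  # prints True
```

This separation is what makes the reasoning path inspectable: an incorrect final answer can be traced either to a wrong element-level sub-answer or to a mis-stated logical expression.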

Notable findings from this research include:

  • Consistent outperformance: Across all tasks, the chain of logic approach consistently surpassed other prompting methodologies in terms of rule-based reasoning performance.
  • Generalization capabilities: Models prompted with a single instance of chain of logic effectively generalized this reasoning approach to different rule sets and fact patterns, enhancing their rule-based reasoning capacity.
  • Transparency and explainability: The stepwise reasoning path elucidated by chain of logic not only aids in correct conclusion derivation but also ensures that the decision-making process remains transparent and interpretable.

Theoretical and Practical Implications

The theoretical significance of this work lies in its potential to deepen our understanding of how LMs can be more effectively utilized in domains requiring sophisticated reasoning capabilities, such as law. From a practical standpoint, improving LMs' performance on rule-based reasoning tasks could significantly benefit the legal industry by enhancing the efficiency and accuracy of legal services. Moreover, this methodology's emphasis on in-context learning greatly reduces dependency on extensive annotated datasets, which are often scarce in specialized domains like law.

Future Directions in LLM Research

While the chain of logic method marks a significant step forward in leveraging LMs for complex reasoning tasks, several avenues remain open for future research. These include exploring multi-pass reasoning strategies, dynamic reasoning path generation, and the integration of retrieval-augmented generation to access external knowledge sources. Moreover, extending this approach to tackle rules with more complicated consequents, beyond simple true/false outcomes, could further broaden the applicability of LMs in rule-based reasoning.

Conclusion

In sum, the chain of logic approach represents a promising advancement in enhancing rule-based reasoning capabilities of LLMs, particularly within the context of legal reasoning. By introducing a method that systematically deconstructs complex rules into comprehensible elements before synthesizing the overarching logical expression, this work sets the stage for future innovations in generative AI and its application in domains requiring sophisticated reasoning.

Authors (4)
  1. Sergio Servantez
  2. Joe Barrow
  3. Kristian Hammond
  4. Rajiv Jain