Chain of Logic: Rule-Based Reasoning with Large Language Models
Abstract: Rule-based reasoning, a fundamental type of legal reasoning, enables us to draw conclusions by accurately applying a rule to a set of facts. We explore causal LLMs as rule-based reasoners, specifically with respect to compositional rules, i.e., rules consisting of multiple elements that together form a complex logical expression. Reasoning about compositional rules is challenging because it requires multiple reasoning steps and attention to the logical relationships between elements. We introduce a new prompting method, Chain of Logic, which elicits rule-based reasoning through decomposition (solving elements as independent threads of logic) and recomposition (recombining these sub-answers to resolve the underlying logical expression). The method is inspired by the IRAC (Issue, Rule, Application, Conclusion) framework, a sequential reasoning approach used by lawyers. We evaluate Chain of Logic across eight rule-based reasoning tasks involving three distinct compositional rules from the LegalBench benchmark and demonstrate that it consistently outperforms other prompting methods, including chain of thought and self-ask, using open-source and commercial LLMs.
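The decomposition/recomposition idea in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's implementation: the names (`Element`, `evaluate_element`, `resolve_rule`), the example rule, and the hard-coded element answers (standing in for an LLM's per-element responses) are all assumptions for the sake of illustration.

```python
from dataclasses import dataclass

@dataclass
class Element:
    label: str      # short identifier used in the logical expression
    question: str   # yes/no question answered for this element in isolation

def evaluate_element(element: Element, facts: str) -> bool:
    """Placeholder for an LLM call that answers one element as an
    independent thread of logic. Here we stub fixed answers."""
    stub_answers = {"A": True, "B": True, "C": False}
    return stub_answers[element.label]

def resolve_rule(expression: str, elements: list[Element], facts: str) -> bool:
    """Decompose: answer each element separately.
    Recompose: evaluate the rule's logical expression over the sub-answers."""
    answers = {e.label: evaluate_element(e, facts) for e in elements}
    # The expression combines sub-answers with and/or/not, e.g. "A and (B or C)".
    return bool(eval(expression, {"__builtins__": {}}, answers))

# A hypothetical three-element compositional rule:
elements = [
    Element("A", "Does the defendant reside in the forum state?"),
    Element("B", "Did the claim arise from in-state conduct?"),
    Element("C", "Did the defendant consent to jurisdiction?"),
]
print(resolve_rule("A and (B or C)", elements, facts="..."))  # prints True
```

The point of the sketch is the separation of concerns: each element is resolved without seeing the others, and only the final step reasons about the logical structure connecting them.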
- ClauseRec: A clause recommendation framework for AI-aided contract authoring. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8770–8776, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Michael Bommarito II and Daniel Martin Katz. 2022. GPT takes the bar exam.
- BLT: Can large language models handle basic legal text?
- Can GPT-3 perform statutory reasoning? In Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law, ICAIL ’23, page 22–31, New York, NY, USA. Association for Computing Machinery.
- Language models are few-shot learners.
- Neural legal judgment prediction in english.
- Large legal fictions: Profiling legal hallucinations in large language models.
- Alex Graves. 2017. Adaptive computation time for recurrent neural networks.
- LegalBench: A collaboratively built benchmark for measuring legal reasoning in large language models.
- CUAD: An expert-annotated NLP dataset for legal contract review.
- Nils Holzenberger and Benjamin Van Durme. 2021. Factoring statutory reasoning as language understanding challenges. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2742–2758, Online. Association for Computational Linguistics.
- Mistral 7B.
- Cong Jiang and Xiaolei Yang. 2023. Legal syllogism prompting: Teaching large language models for legal judgment prediction.
- Text modular networks: Learning to decompose tasks in the language of existing models.
- Large language models are zero-shot reasoners.
- Retrieval-augmented generation for knowledge-intensive NLP tasks.
- Program induction by rationale generation: Learning to solve and explain algebraic word problems.
- Laura Manor and Junyi Jessy Li. 2019. Plain English summarization of contracts. In Proceedings of the Natural Legal Language Processing Workshop 2019, pages 1–11, Minneapolis, Minnesota. Association for Computational Linguistics.
- Multi-hop reading comprehension through question decomposition and rescoring.
- Reframing instructional prompts to GPTk’s language.
- Orca: Progressive learning from complex explanation traces of GPT-4. arXiv preprint arXiv:2306.02707.
- Show your work: Scratchpads for intermediate computation with language models.
- OpenAI. 2023. GPT-4 technical report.
- Measuring and narrowing the compositionality gap in language models.
- Answering complex open-domain questions through iterative query generation.
- Martha T. Roth. 1995. Law collections from Mesopotamia and Asia Minor, volume 6. Scholars Press.
- Interpretation of natural language rules in conversational machine reading.
- Computable contracts by extracting obligation logic graphs. In Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law, ICAIL ’23, page 267–276, New York, NY, USA. Association for Computing Machinery.
- Alon Talmor and Jonathan Berant. 2018. The web as a knowledge-base for answering complex questions.
- Llama 2: Open foundation and fine-tuned chat models.
- Self-consistency improves chain of thought reasoning in language models.
- Chain-of-thought prompting elicits reasoning in large language models.
- Jason Weston and Sainbayar Sukhbaatar. 2023. System 2 attention (is something you might need too).
- Break it down: A question understanding benchmark.
- Legal prompting: Teaching a language model to think like a lawyer.
- JEC-QA: A legal-domain question answering dataset.
- Least-to-most prompting enables complex reasoning in large language models.