UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities (2310.01441v2)
Abstract: Large language models (LLMs) have demonstrated impressive inferential capabilities, and numerous research efforts have been devoted to enhancing this capacity through prompting. Despite these efforts, a unified epistemological foundation is still conspicuously absent. Drawing inspiration from Kant's a priori philosophy, we propose the UPAR prompting framework, designed to emulate the structure of human cognition within LLMs. The UPAR framework comprises four phases: "Understand", "Plan", "Act", and "Reflect", enabling the extraction of structured information from complex contexts, advance planning of solutions, execution according to plan, and self-reflection. This structure significantly improves the explainability and accuracy of LLM inference, producing a human-understandable and inspectable inferential trajectory. Furthermore, our work offers an epistemological foundation for existing prompting techniques, opening the way to a systematic integration of these methods. With GPT-4, our approach raises accuracy from the chain-of-thought (CoT) baseline of 22.92% to 58.33% on a challenging subset of GSM8K, and from 67.91% to 75.40% on the causal judgment task. Without using few-shot examples or external tools, UPAR significantly outperforms existing prompting methods on SCIBENCH, a challenging dataset of collegiate-level mathematics, chemistry, and physics problems.
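The abstract does not reproduce the paper's prompts, but the four phases suggest a straightforward zero-shot scaffold. The sketch below is one possible reading in Python: the phase wording is paraphrased from the abstract rather than taken from the paper, and `call_llm` is a hypothetical single-turn completion client supplied by the caller, not an API the paper defines.

```python
# A minimal sketch of a single-pass UPAR-style prompt. Assumptions: the phase
# wording below is paraphrased from the abstract, and `call_llm` is a
# hypothetical completion function (prompt in, text out) provided by the user.

UPAR_TEMPLATE = """You are solving the following problem.

Problem: {question}

Work through four labeled phases:
1. Understand: extract the relevant entities, quantities, and relations
   stated in the problem, ignoring irrelevant context.
2. Plan: outline the solution steps before carrying any of them out.
3. Act: execute the plan step by step, showing intermediate results.
4. Reflect: check the execution against the plan and the original problem,
   then state the final answer on a line beginning with "Answer:"."""


def upar_answer(question: str, call_llm) -> str:
    """Run one UPAR pass; the returned text is the full, inspectable trajectory."""
    return call_llm(UPAR_TEMPLATE.format(question=question))
```

Because all four phases live in one prompt, the model's output is itself the inspectable inferential trajectory the abstract describes; a stricter variant could issue one model call per phase, feeding each phase's output into the next.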
- Henry E Allison. Kant’s transcendental idealism. Yale University Press, 2004.
- A multitask, multilingual, multimodal evaluation of chatgpt on reasoning, hallucination, and interactivity. arXiv preprint arXiv:2302.04023, 2023.
- Graph of thoughts: Solving elaborate problems with large language models, 2023.
- Graham Bird. The revolutionary Kant: A commentary on the critique of pure reason. Open Court, 2013.
- Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.
- Sparks of artificial general intelligence: Early experiments with gpt-4, 2023.
- Large language models as tool makers, 2023.
- Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks, 2022.
- Factool: Factuality detection in generative ai – a tool augmented framework for multi-task and multi-domain scenarios, 2023.
- Dola: Decoding by contrasting layers improves factuality in large language models, 2023.
- Training verifiers to solve math word problems, 2021.
- Selection-inference: Exploiting large language models for interpretable logical reasoning, 2022.
- Language model cascades, 2022.
- A survey on in-context learning. arXiv preprint arXiv:2301.00234, 2023.
- Faith and fate: Limits of transformers on compositionality, 2023.
- Complexity-based prompting for multi-step reasoning, 2023.
- Pal: Program-aided language models, 2023a.
- Enabling large language models to generate text with citations, 2023b.
- Did Aristotle use a laptop? A question answering benchmark with implicit reasoning strategies. Transactions of the Association for Computational Linguistics, 9:346–361, 2021. doi: 10.1162/tacl_a_00370. URL https://aclanthology.org/2021.tacl-1.21.
- Bertrand Russell. History of western philosophy: Collectors edition. Routledge, 2013.
- Georg Wilhelm Friedrich Hegel. Georg Wilhelm Friedrich Hegel: the science of logic. Cambridge University Press, 2010.
- Investigating causal understanding in llms. In NeurIPS ML Safety Workshop, 2022.
- David Hume. The Philosophical Works of David Hume: In Four Volumes. Essays moral, political, and literary; 1, volume 3. Longmans, Green, 1875.
- Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12):1–38, March 2023. doi: 10.1145/3571730. URL https://doi.org/10.1145/3571730.
- Maieutic prompting: Logically consistent reasoning with recursive explanations, 2022.
- Language models (mostly) know what they know, 2022.
- Daniel Kahneman. Thinking, fast and slow. Macmillan, 2011.
- Immanuel Kant. Critique of pure reason. J.M. Dent, London, 1934.
- Large language models are zero-shot reasoners. In Advances in Neural Information Processing Systems, volume 35, pp. 22199–22213, 2022.
- Causal reasoning and large language models: Opening a new frontier for causality, 2023.
- Deep learning. Nature, 521(7553):436–444, 2015.
- Halueval: A large-scale hallucination evaluation benchmark for large language models. arXiv preprint arXiv:2305.11747, 2023.
- Holistic evaluation of language models, 2022.
- Program induction by rationale generation: Learning to solve and explain algebraic word problems. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 158–167, Vancouver, Canada, July 2017. Association for Computational Linguistics. doi: 10.18653/v1/P17-1015. URL https://aclanthology.org/P17-1015.
- Generated knowledge prompting for commonsense reasoning, 2022.
- Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing, 2021.
- Chameleon: Plug-and-play compositional reasoning with large language models, 2023a.
- A survey of deep learning for mathematical reasoning, 2023b.
- George F Luger. Artificial intelligence: structures and strategies for complex problem solving. Pearson Education, 2005.
- Faithful chain-of-thought reasoning. arXiv preprint arXiv:2301.13379, 2023.
- Self-refine: Iterative refinement with self-feedback, 2023.
- Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
- Peter Markie. Rationalism vs. empiricism. Stanford Encyclopedia of Philosophy, 2004.
- Sources of hallucination by large language models on inference tasks, 2023.
- Augmented language models: a survey, 2023.
- Nils J Nilsson. Principles of artificial intelligence. Springer Science & Business Media, 1982.
- OpenAI. Gpt-4 technical report, 2023.
- Prompting contrastive explanations for commonsense reasoning tasks, 2021.
- Art: Automatic multi-step reasoning and tool-use for large language models, 2023.
- Reasoning with language model prompting: A survey, 2023.
- A survey of hallucination in large foundation models, 2023.
- Investigating the factual knowledge boundary of large language models with retrieval augmentation, 2023.
- Toolformer: Language models can teach themselves to use tools, 2023.
- Algorithm of thoughts: Enhancing exploration of ideas in large language models, 2023.
- Wilfrid Sellars. Empiricism and the philosophy of mind. Minnesota studies in the philosophy of science, 1(19):253–329, 1956.
- Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face, 2023.
- Large language models can be easily distracted by irrelevant context, 2023.
- Unsupervised commonsense question answering with self-talk, 2020.
- P. F. Strawson. The bounds of sense: An essay on Kant's critique of pure reason. Routledge, 2018.
- Challenging big-bench tasks and whether chain-of-thought can solve them, 2022.
- CommonsenseQA: A question answering challenge targeting commonsense knowledge. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4149–4158, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1421. URL https://aclanthology.org/N19-1421.
- Scibench: Evaluating college-level scientific problem-solving abilities of large language models, 2023a.
- Self-consistency improves chain of thought reasoning in language models, 2023b.
- Emergent abilities of large language models. arXiv preprint arXiv:2206.07682, 2022.
- Chain-of-thought prompting elicits reasoning in large language models, 2023.
- Can foundation models talk causality?, 2022.
- From word models to world models: Translating from natural language to the probabilistic language of thought, 2023.
- Decomposition enhances reasoning via self-evaluation guided decoding, 2023.
- Large language models as optimizers, 2023a.
- Seqzero: Few-shot compositional semantic parsing with sequential prompts and zero-shot models, 2022.
- Mm-react: Prompting chatgpt for multimodal reasoning and action, 2023b.
- Tree of thoughts: Deliberate problem solving with large language models, 2023a.
- React: Synergizing reasoning and acting in language models, 2023b.
- Causal parrots: Large language models may talk causality but are not causal, 2023.
- Automatic chain of thought prompting in large language models, 2022.
- Explainability for large language models: A survey, 2023a.
- A survey of large language models. arXiv preprint arXiv:2303.18223, 2023b.
- Least-to-most prompting enables complex reasoning in large language models, 2023a.
- Large language models are human-level prompt engineers, 2023b.