Emergent Mind

Abstract

LLMs have shown impressive performance on reasoning benchmarks with the emergence of Chain-of-Thought (CoT), particularly on multiple-choice questions (MCQs). However, current works resolve all questions uniformly regardless of problem-solving difficulty, devoting excessive effort to simple items while giving insufficient attention to intricate ones. To address this challenge, we propose a simple yet effective strategy, Divide and Conquer Reasoning (DCR), to enhance the reasoning capability of LLMs for MCQs, inspired by how humans use heuristics to first categorize tasks and then handle them separately. Specifically, we first divide questions into two subsets based on a confidence score ($\mathcal{CS}$), estimated from the statistical frequency of generated answers. We then propose Filter Choices based Reasoning (FCR) to improve model performance on MCQs with low $\mathcal{CS}$. Our experiments demonstrate that the proposed strategy incurs only 85% of the cost of the SOTA method while achieving an average accuracy improvement of 1.56% across nine datasets covering arithmetic, commonsense, and logical reasoning tasks. The code is available at https://github.com/AiMijie/Divide-and-Conquer
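
A minimal sketch of the routing the abstract describes, assuming `sample_answers` and `filter_choices` are hypothetical callbacks wrapping an LLM (neither name comes from the paper), and with an illustrative confidence threshold and sample count rather than the authors' settings:

```python
import collections

def confidence_score(answers):
    """Estimate CS as the relative frequency of the most common sampled answer."""
    counts = collections.Counter(answers)
    most_common_answer, freq = counts.most_common(1)[0]
    return most_common_answer, freq / len(answers)

def dcr_answer(question, choices, sample_answers, filter_choices,
               threshold=0.8, k=5):
    """Divide-and-conquer routing: keep the majority-vote answer when CS is
    high; otherwise re-reason over a filtered choice set (FCR-style)."""
    # Sample k reasoning-based answers for the original question.
    answers = sample_answers(question, choices, n=k)
    answer, cs = confidence_score(answers)
    if cs >= threshold:
        # High-confidence subset: accept the majority-vote answer as-is.
        return answer
    # Low-confidence subset: drop implausible options, then reason again.
    reduced_choices = filter_choices(question, choices)
    answers = sample_answers(question, reduced_choices, n=k)
    answer, _ = confidence_score(answers)
    return answer
```

In this reading, majority voting over the sampled answers serves both as the prediction and as the frequency-based confidence estimate, so high-$\mathcal{CS}$ questions incur no additional reasoning passes.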
