Blind Spot Navigation in LLM Reasoning with Thought Space Explorer (2410.24155v2)
Abstract: Recent advances in LLMs have demonstrated their potential in handling complex reasoning tasks, which are usually achieved by constructing a thought chain that guides the model to solve the problem with multi-step thinking. However, existing methods often remain confined to previously explored solution spaces and thus overlook critical blind spots within the LLM's cognitive range. To address this issue, we design the Thought Space Explorer (TSE), a novel framework that expands and optimizes thought structures to guide LLMs to explore their blind spots of thinking. By generating new reasoning steps and branches from the original thought structure with various designed strategies, TSE broadens the thought space and alleviates the impact of blind spots on LLM reasoning. Experimental results on reasoning tasks of multiple difficulty levels demonstrate the efficacy of TSE. We also conduct extensive analysis to understand how structured and expansive thought can help unleash the potential of LLM reasoning capabilities.
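The core idea of expanding an existing thought structure with new branches can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual algorithm: the names (`ThoughtNode`, `expand_blind_spots`, `llm_propose`) and the leaf-expansion heuristic are assumptions, and the LLM call is stubbed out with placeholder text.

```python
# Illustrative sketch of thought-structure expansion in the spirit of TSE.
# All identifiers here are hypothetical; the abstract does not specify the
# paper's concrete expansion strategies.

from dataclasses import dataclass, field
from typing import List


@dataclass
class ThoughtNode:
    """One reasoning step; children are alternative continuations."""
    text: str
    children: List["ThoughtNode"] = field(default_factory=list)


def llm_propose(prefix: List[str], n: int) -> List[str]:
    """Stand-in for an LLM call proposing n candidate next steps given the
    reasoning prefix. Here it just returns placeholder variants."""
    return [f"alt step {i} after '{prefix[-1]}'" for i in range(n)]


def expand_blind_spots(root: ThoughtNode, branch_factor: int = 2) -> int:
    """Walk the existing thought tree; at each leaf (a point where
    exploration stopped), ask the model for additional branches.
    Returns the number of new nodes added."""
    added = 0
    stack = [(root, [root.text])]
    while stack:
        node, prefix = stack.pop()
        if not node.children:  # leaf: a candidate blind spot
            for step in llm_propose(prefix, branch_factor):
                node.children.append(ThoughtNode(step))
                added += 1
        else:
            for child in node.children:
                stack.append((child, prefix + [child.text]))
    return added
```

In a real system, `llm_propose` would prompt the model with the reasoning prefix and instructions to diverge from already-explored branches; a scoring step would then prune the expanded tree.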