Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication (2402.18439v3)
Abstract: Natural language (NL) has long been the predominant format for human cognition and communication and, by extension, has been similarly pivotal in the development and application of LLMs. Yet, besides NL, LLMs have seen various non-NL formats during pre-training, such as code and logical expressions. NL's status as the optimal format for LLMs, particularly in single-LLM reasoning and multi-agent communication, has not been thoroughly examined. In this work, we challenge the default use of NL by exploring the utility of non-NL formats in these contexts. We show that allowing LLMs to autonomously select the most suitable format before reasoning or communicating leads to a 3.3 to 5.7% improvement in reasoning efficiency for different LLMs, and up to a 72.7% reduction in token usage in multi-agent communication, all while maintaining communicative effectiveness. Our comprehensive analysis further reveals that LLMs can devise a format from limited task instructions and that the devised format is effectively transferable across different LLMs. Intriguingly, the structured communication format decided by LLMs exhibits notable parallels with established agent communication languages, suggesting a natural evolution towards efficient, structured communication among agents. Our code is released at https://github.com/thunlp/AutoForm.
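The idea of letting a model pick its own format before reasoning can be sketched as a thin prompt wrapper. This is a minimal illustration under stated assumptions, not the prompts or code from the AutoForm repository: the instruction text, `build_prompt`, `solve`, and `token_reduction` are hypothetical names introduced here, and `llm` stands for any text-in, text-out callable.

```python
from typing import Callable

# Hypothetical instruction text (an assumption for illustration);
# the paper's actual prompts are not reproduced here.
FORMAT_INSTRUCTION = (
    "Before answering, choose the format best suited to this task "
    "(for example natural language, code, a table, or logical expressions), "
    "state your choice, then reason in that format."
)


def build_prompt(task: str) -> str:
    """Prepend the format-selection instruction to a task description."""
    return f"{FORMAT_INSTRUCTION}\n\nTask: {task}"


def solve(task: str, llm: Callable[[str], str]) -> str:
    """Send the format-aware prompt to any text-in, text-out model."""
    return llm(build_prompt(task))


def token_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Percentage of tokens saved relative to a natural-language baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100.0 * (baseline_tokens - new_tokens) / baseline_tokens
```

Passing a stub such as `lambda prompt: prompt` as `llm` makes the wrapper testable without a model. `token_reduction` only shows how a percentage such as the reported 72.7% would be computed from two token counts; the counts themselves would have to come from real runs.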
- Weize Chen
- Chenfei Yuan
- Jiarui Yuan
- Yusheng Su
- Chen Qian
- Cheng Yang
- Ruobing Xie
- Zhiyuan Liu
- Maosong Sun