Planning and Editing What You Retrieve for Enhanced Tool Learning (2404.00450v2)
Abstract: Recent advancements in integrating external tools with LLMs have opened new frontiers, with applications in mathematical reasoning, code generation, and smart assistants. However, existing methods rely on simple one-time retrieval strategies and fall short of effectively and accurately shortlisting relevant tools. This paper introduces PLUTO (Planning, Learning, and Understanding for TOols), a novel approach encompassing the Plan-and-Retrieve (P&R) and Edit-and-Ground (E&G) paradigms. The P&R paradigm pairs a neural retrieval module for shortlisting relevant tools with an LLM-based query planner that decomposes complex queries into actionable tasks, enhancing the effectiveness of tool utilization. The E&G paradigm uses LLMs to enrich tool descriptions based on user scenarios, bridging the gap between user queries and tool functionalities. Experimental results demonstrate that these paradigms significantly improve recall and NDCG in tool retrieval tasks, surpassing current state-of-the-art models.